From-Scratch Build · Distributed Systems
A control plane for a fleet of small networked devices: each one runs a lightweight agent that announces itself, reports its health, and obeys remote commands over MQTT — while a central console installs, configures and watches them all. Built from scratch to learn how device fleets are actually managed.
What it is
The platform has two halves. On each device runs an agent — a small long-lived process that connects to a message broker, publishes a retained description of itself (metadata, status, current state, configuration), and listens for commands. On the network sits a central command plane that discovers every agent, pushes configuration, and triggers installs, restarts and upgrades.
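A minimal sketch of what that retained self-description could look like as messages. The four facet names come from the text above; the `fleet/<agent_id>/<facet>` topic layout, the `Message` type, the `describe_agent` helper and the sample values are all assumptions made for illustration, not the project's actual scheme.

```python
import json
from dataclasses import dataclass

# The four facets named in the text. The topic prefix below is hypothetical.
FACETS = ("metadata", "status", "state", "cfg")


@dataclass(frozen=True)
class Message:
    topic: str
    payload: str
    retain: bool


def describe_agent(agent_id: str, facets: dict) -> list[Message]:
    """Build one retained message per facet of the agent's self-description."""
    missing = [name for name in FACETS if name not in facets]
    if missing:
        raise ValueError(f"missing facets: {missing}")
    return [
        Message(
            topic=f"fleet/{agent_id}/{name}",  # assumed topic layout
            payload=json.dumps(facets[name], sort_keys=True),
            retain=True,  # retained so late subscribers still receive it
        )
        for name in FACETS
    ]


# Illustrative values only.
messages = describe_agent(
    "pi-kitchen",
    {
        "metadata": {"hostname": "pi-kitchen", "version": "0.1.0"},
        "status": {"online": True},
        "state": {"components": []},
        "cfg": {"telemetry": {"cpu": True}},
    },
)
for msg in messages:
    print(msg.topic, msg.payload)
```

Keeping each facet on its own topic means a change to one of them, such as status, can be republished without resending the rest.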
I built this to answer a question I kept hitting in hobby hardware projects: once you have more than one or two devices, how do you manage them without SSH-ing into each box by hand? The answer turned out to be a publish/subscribe contract and a registry of installable capabilities.
The core idea I wanted to learn: a device fleet becomes manageable the moment every node speaks the same MQTT contract. Retained messages mean a freshly-connected console instantly knows the full state of the fleet — no polling, no central database of truth, just the broker.
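To make the retained-message behaviour concrete, here is a toy in-memory stand-in for a broker. It is only a sketch: `TinyBroker` is a hypothetical class, it matches topics exactly (no MQTT wildcards), and the topic names are assumed. A real deployment would use an actual broker and an MQTT client library.

```python
from typing import Callable

Handler = Callable[[str, str], None]


class TinyBroker:
    """In-memory stand-in for an MQTT broker; exact-match topics only."""

    def __init__(self) -> None:
        self._retained: dict[str, str] = {}
        self._subscribers: dict[str, list[Handler]] = {}

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            if payload:
                self._retained[topic] = payload
            else:
                # As in MQTT, an empty retained payload clears the stored message.
                self._retained.pop(topic, None)
        for handler in self._subscribers.get(topic, []):
            handler(topic, payload)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers.setdefault(topic, []).append(handler)
        # A new subscriber immediately receives the last retained message.
        if topic in self._retained:
            handler(topic, self._retained[topic])


broker = TinyBroker()
# The agent publishes before any console exists.
broker.publish("fleet/pi-kitchen/status", '{"online": true}', retain=True)
broker.publish("fleet/pi-kitchen/telemetry/cpu", "12.5")  # not retained

seen: dict[str, str] = {}


def remember(topic: str, payload: str) -> None:
    seen[topic] = payload


# The console connects later and still learns the retained status.
broker.subscribe("fleet/pi-kitchen/status", remember)
broker.subscribe("fleet/pi-kitchen/telemetry/cpu", remember)
print(seen)
```

The late subscriber receives the retained status but not the earlier telemetry sample, which is the distinction that lets the broker itself serve as the record of fleet state.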
The stack
The whole point of this rebuild was the toolchain that makes a self-describing agent possible. Here is what each piece actually does.
MQTT broker: a lightweight publish/subscribe broker. Every agent and the console meet here; nobody connects to anybody directly.
Agent: the process on each device. Connects, publishes retained metadata/status/state/cfg, and runs the command handlers.
Console: the operator-facing plane that lists agents, edits their config and issues fleet-wide commands.
Component loader: lets an agent install and run pluggable capabilities at runtime instead of baking everything into one binary.
Packaging: each piece is its own Python package; the agent ships as a systemd service so it survives reboots.
Telemetry and logs: CPU, memory and disk percentages stream as telemetry topics; logs batch back over MQTT, each gated by config.
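The config gating on telemetry and logs might look roughly like this. It is a sketch under assumptions: the config keys, the topic layout and the `LogBatcher` class are hypothetical, and the CPU and memory figures are passed in as plain numbers because the standard library has no portable call for them (a library such as psutil could supply them). Only the disk figure is measured here, via `shutil.disk_usage`.

```python
import shutil


def disk_percent(path: str = "/") -> float:
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return round(100.0 * usage.used / usage.total, 1)


def telemetry_messages(agent_id: str, readings: dict, cfg: dict) -> list:
    """Return (topic, payload) pairs for the metrics that cfg enables."""
    enabled = cfg.get("telemetry", {})
    return [
        (f"fleet/{agent_id}/telemetry/{name}", f"{value:.1f}")  # assumed topics
        for name, value in readings.items()
        if enabled.get(name, False)
    ]


class LogBatcher:
    """Collect log lines and release them once `size` lines have accumulated."""

    def __init__(self, size: int) -> None:
        self.size = size
        self._lines: list = []

    def add(self, line: str) -> list:
        """Return a full batch when ready, otherwise an empty list."""
        self._lines.append(line)
        if len(self._lines) < self.size:
            return []
        batch, self._lines = self._lines, []
        return batch


# Hypothetical config: memory reporting is switched off.
cfg = {
    "telemetry": {"cpu": True, "memory": False, "disk": True},
    "logs": {"enabled": True, "batch_size": 2},
}
readings = {"cpu": 12.5, "memory": 41.0, "disk": disk_percent("/")}
outgoing = telemetry_messages("pi-kitchen", readings, cfg)
for topic, payload in outgoing:
    print(topic, payload)

if cfg["logs"]["enabled"]:
    batcher = LogBatcher(cfg["logs"]["batch_size"])
    for line in ("agent started", "connected to broker"):
        batch = batcher.add(line)
        if batch:
            print("log batch:", batch)
```

Batching trades a little latency for fewer messages, which matters on small devices sharing one broker.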
Architecture
The agent is deliberately boring — its job is connectivity, lifecycle and command handling. The interesting behaviour lives in components that the agent can install on demand. To prove the model I built a few against a shared base contract.
The shared contract every component implements — lifecycle hooks, context, and MQTT-aware logging.
Drives an addressable LED strip with a library of effects: fades, wipes, rainbows, sparkle, theatre chase.
Controls a projector over a serial connection — power and input switching through the same agent.
Bridges the agent to a robotics middleware bus so robot nodes can be driven as fleet components.
A component that surfaces what the fleet is doing, for debugging the live system.
A sample mobile-robot behaviour, included to exercise the platform end-to-end rather than as core infrastructure.
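One way the shared contract and on-demand install could fit together is sketched below. Every name here (`Context`, `Component`, `Loader`, `REGISTRY`, the hook names `start` and `stop`, and the log topic) is an assumption made for illustration; the source only says that components share lifecycle hooks, a context, and MQTT-aware logging. The LED example is a toy theatre chase that computes which pixels are lit without touching hardware.

```python
from abc import ABC, abstractmethod
from typing import Callable

Publish = Callable[[str, str], None]


class Context:
    """What the agent hands each component: identity, config, a publish hook."""

    def __init__(self, agent_id: str, cfg: dict, publish: Publish) -> None:
        self.agent_id = agent_id
        self.cfg = cfg
        self.publish = publish


class Component(ABC):
    """Hypothetical base contract: lifecycle hooks plus MQTT-aware logging."""

    name = "component"

    def __init__(self, ctx: Context) -> None:
        self.ctx = ctx

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    def log(self, text: str) -> None:
        # Log lines leave through the same publish hook as everything else.
        self.ctx.publish(f"fleet/{self.ctx.agent_id}/log/{self.name}", text)


class TheatreChase(Component):
    """Toy LED effect: every third pixel lit, shifted by one each frame."""

    name = "led-strip"

    def start(self) -> None:
        self.step = 0
        self.log("started")

    def stop(self) -> None:
        self.log("stopped")

    def frame(self) -> list:
        pixels = self.ctx.cfg.get("pixels", 6)
        lit = [(i + self.step) % 3 == 0 for i in range(pixels)]
        self.step += 1
        return lit


# Registry of installable capabilities, keyed by component name.
REGISTRY = {TheatreChase.name: TheatreChase}


class Loader:
    """Installs and removes components at runtime by registry name."""

    def __init__(self, ctx: Context) -> None:
        self.ctx = ctx
        self.running: dict = {}

    def install(self, name: str) -> Component:
        component = REGISTRY[name](self.ctx)
        component.start()
        self.running[name] = component
        return component

    def uninstall(self, name: str) -> None:
        self.running.pop(name).stop()


published: list = []
ctx = Context("pi-kitchen", {"pixels": 6}, lambda t, p: published.append((t, p)))
loader = Loader(ctx)
strip = loader.install("led-strip")
first, second = strip.frame(), strip.frame()
loader.uninstall("led-strip")
print(first, second, published)
```

Because the agent only ever calls the hooks on the base class, it stays unaware of what any particular component does.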
How it runs
Bringing a device online follows a fixed path. The MQTT contract is the same for every node, which is what makes the fleet uniform:
The agent publishes retained metadata, status, state and cfg so the console sees it immediately.
It then accepts the commands ping, restart, install, uninstall, upgrade and refresh.
In my rebuild I focused on getting one agent fully self-describing and remotely commandable before adding a second: the contract is what scales, not the count.
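The command half of the contract can be sketched as a dispatch table. The six command names are the ones listed above; the `fleet/<agent_id>/cmd/<name>` topic shape, the JSON payloads, the reply format and the `CommandRouter` class are assumptions for illustration.

```python
import json

COMMANDS = ("ping", "restart", "install", "uninstall", "upgrade", "refresh")


class CommandRouter:
    """Maps command names to handler callables and builds a reply dict."""

    def __init__(self) -> None:
        self._handlers: dict = {}

    def register(self, name: str, handler) -> None:
        if name not in COMMANDS:
            raise ValueError(f"unknown command: {name}")
        self._handlers[name] = handler

    def dispatch(self, topic: str, payload: str) -> dict:
        """Route a message on `.../cmd/<name>` to its handler."""
        name = topic.rsplit("/", 1)[-1]
        handler = self._handlers.get(name)
        if handler is None:
            return {"command": name, "ok": False, "error": "unsupported"}
        try:
            args = json.loads(payload) if payload else {}
            return {"command": name, "ok": True, "result": handler(args)}
        except Exception as exc:
            # A failing handler must not take the agent down with it.
            return {"command": name, "ok": False, "error": str(exc)}


router = CommandRouter()
router.register("ping", lambda args: "pong")
router.register("install", lambda args: f"installed {args['component']}")

print(router.dispatch("fleet/pi-kitchen/cmd/ping", ""))
print(router.dispatch("fleet/pi-kitchen/cmd/install", '{"component": "led-strip"}'))
print(router.dispatch("fleet/pi-kitchen/cmd/restart", ""))
```

Returning a structured reply for unsupported or failing commands gives the console something definite to display instead of silence.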
Reflection