
From-Scratch Build · Distributed Systems

Distributed Device Agent Platform

A control plane for a fleet of small networked devices: each one runs a lightweight agent that announces itself, reports its health, and obeys remote commands over MQTT — while a central console installs, configures and watches them all. Built from scratch to learn how device fleets are actually managed.

Python · MQTT · Agents · Pluggable components · systemd

What it is

An agent on every device

The platform has two halves. On each device runs an agent — a small long-lived process that connects to a message broker, publishes a retained description of itself (metadata, status, current state, configuration), and listens for commands. On the network sits a central command plane that discovers every agent, pushes configuration, and triggers installs, restarts and upgrades.
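
A minimal sketch of what that retained self-description could look like. The topic layout (`fleet/<agent_id>/<facet>`), the field names and the `build_announcement` helper are assumptions made for illustration, not the platform's actual contract.

```python
import json
from typing import NamedTuple


class Message(NamedTuple):
    topic: str
    payload: str
    retain: bool


# Hypothetical topic layout: fleet/<agent_id>/<facet>
FACETS = ("metadata", "status", "state", "cfg")


def build_announcement(agent_id: str, description: dict) -> list[Message]:
    """Return one retained message per facet of the agent's self-description."""
    messages = []
    for facet in FACETS:
        body = description.get(facet, {})
        messages.append(
            Message(
                topic=f"fleet/{agent_id}/{facet}",
                payload=json.dumps(body, sort_keys=True),
                retain=True,  # ask the broker to keep the last value per topic
            )
        )
    return messages


# Illustrative values only; the real fields are not specified here.
announcement = build_announcement(
    "pi-kitchen",
    {
        "metadata": {"hostname": "pi-kitchen", "agent_version": "0.1.0"},
        "status": {"online": True},
        "state": {"components": []},
        "cfg": {"telemetry": {"enabled": True}},
    },
)
for msg in announcement:
    print(msg.topic, msg.payload)
```

With a real client such as paho-mqtt, each message would presumably go out through `client.publish(topic, payload, retain=True)`; the sketch stops short of the network so it runs without a broker.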

I built this to answer a question I kept hitting in hobby hardware projects: once you have more than one or two devices, how do you manage them without SSH-ing into each box by hand? The answer turned out to be a publish/subscribe contract and a registry of installable capabilities.

The core idea I wanted to learn: a device fleet becomes manageable the moment every node speaks the same MQTT contract. Because the broker replays the last retained message on each topic to every new subscriber, a freshly connected console learns the full state of the fleet as soon as it subscribes. There is no polling and no separate database of truth, just the broker.
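
To make the retained-message behaviour concrete, here is a toy in-memory broker. It is a stand-in written for this page, not a real MQTT implementation: `ToyBroker` and `topic_matches` are hypothetical names, and the model covers only retained delivery and the `+` and `#` wildcards.

```python
from typing import Callable

Handler = Callable[[str, str], None]


def topic_matches(pattern: str, topic: str) -> bool:
    """MQTT-style match: '+' is exactly one level, '#' is everything after."""
    p_parts, t_parts = pattern.split("/"), topic.split("/")
    for i, part in enumerate(p_parts):
        if part == "#":
            return True
        if i >= len(t_parts):
            return False
        if part != "+" and part != t_parts[i]:
            return False
    return len(p_parts) == len(t_parts)


class ToyBroker:
    """In-memory stand-in for a broker; models retained delivery only."""

    def __init__(self) -> None:
        self._retained: dict[str, str] = {}
        self._subs: list[tuple[str, Handler]] = []

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            if payload:
                self._retained[topic] = payload  # last value wins
            else:
                self._retained.pop(topic, None)  # empty payload clears it
        for pattern, handler in self._subs:
            if topic_matches(pattern, topic):
                handler(topic, payload)

    def subscribe(self, pattern: str, handler: Handler) -> None:
        self._subs.append((pattern, handler))
        # Replay retained messages so a late subscriber catches up at once.
        for topic, payload in self._retained.items():
            if topic_matches(pattern, topic):
                handler(topic, payload)


broker = ToyBroker()
# Two agents announce themselves before any console exists.
broker.publish("fleet/pi-kitchen/status", '{"online": true}', retain=True)
broker.publish("fleet/pi-garage/status", '{"online": false}', retain=True)

# The console connects later and still sees both agents immediately.
fleet_view: dict[str, str] = {}
broker.subscribe("fleet/+/status", lambda t, p: fleet_view.update({t: p}))
print(fleet_view)
```

The console never asks any agent for its state: subscribing is enough, which is the property the contract relies on.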

The stack

Tools under the hood

The whole point of this rebuild was the toolchain that makes a self-describing agent possible. Here is what each piece actually does.

transport

MQTT

A lightweight publish/subscribe protocol routed through a broker. Every agent and the console meet at the broker; nobody connects to anybody directly.

runtime

Agent core

The process on each device. Connects, publishes retained metadata/status/state/cfg, and runs the command handlers.

control

Central command

The operator-facing plane that lists agents, edits their config and issues fleet-wide commands.

extensibility

Component registry

A loader that lets an agent install and run pluggable capabilities at runtime instead of baking everything into one binary.

packaging

pyproject + systemd

Each piece is its own Python package; the agent ships as a systemd service so it survives reboots.

telemetry

Metrics + logs

CPU, memory and disk percentages stream as telemetry topics; logs batch back over MQTT, each gated by config.
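
A sketch of how config-gated telemetry might be assembled. The config shape, the topic names and the `collect` helper are assumptions. Only the disk reading is real here (via `shutil.disk_usage`); the CPU and memory readers are stubs, since the standard library has no portable call for them and a real agent would likely use something like psutil.

```python
import json
import shutil
from typing import Callable


def disk_percent(path: str = "/") -> float:
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return round(100.0 * usage.used / usage.total, 1)


def collect(
    agent_id: str,
    cfg: dict,
    collectors: dict[str, Callable[[], float]],
) -> list[tuple[str, str]]:
    """Return (topic, payload) pairs for every metric the config enables."""
    enabled = cfg.get("telemetry", {})
    samples = []
    for name, read in collectors.items():
        if not enabled.get(name, False):
            continue  # metric is switched off in config, so never sampled
        topic = f"fleet/{agent_id}/telemetry/{name}"  # hypothetical layout
        samples.append((topic, json.dumps({"percent": read()})))
    return samples


collectors = {
    "cpu": lambda: 12.5,     # stub value standing in for a real CPU reading
    "memory": lambda: 40.0,  # stub value standing in for a real memory reading
    "disk": disk_percent,
}
cfg = {"telemetry": {"cpu": True, "memory": False, "disk": True}}
samples = collect("pi-kitchen", cfg, collectors)
for topic, payload in samples:
    print(topic, payload)
```

Treating a missing config key as "off" means a new metric stays silent until an operator enables it, which seemed the safer default for a sketch.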

Architecture

Plug-in components

The agent is deliberately boring — its job is connectivity, lifecycle and command handling. The interesting behaviour lives in components that the agent can install on demand. To prove the model I built a few against a shared base contract.

  1. Component base (live)

    The shared contract every component implements — lifecycle hooks, context, and MQTT-aware logging.

  2. LED strip (live)

    Drives an addressable LED strip with a library of effects: fades, wipes, rainbows, sparkle, theatre chase.

  3. Projector (live)

    Controls a projector over a serial connection — power and input switching through the same agent.

  4. ROS bridge (live)

    Bridges the agent to a robotics middleware bus so robot nodes can be driven as fleet components.

  5. Visualisation (live)

    A component that surfaces what the fleet is doing, for debugging the live system.

  6. Foraging behaviour (demo)

    A sample mobile-robot behaviour, included to exercise the platform end-to-end rather than as core infrastructure.
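
The base contract and the registry behind the components above could be sketched roughly like this. Every name (`Context`, `Component`, `REGISTRY`, `register`, `install`) and the log topic layout are assumptions. The sketch registers classes already in the process and does not attempt the runtime package installation the real loader performs.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Context:
    """What the agent hands to each component (hypothetical shape)."""
    agent_id: str
    publish: Callable[[str, str], None]
    config: dict = field(default_factory=dict)


class Component(ABC):
    """Shared contract: lifecycle hooks, a context, and MQTT-aware logging."""
    name = "component"

    def __init__(self, ctx: Context) -> None:
        self.ctx = ctx

    def log(self, message: str) -> None:
        # Log lines travel over the same transport as everything else.
        self.ctx.publish(f"fleet/{self.ctx.agent_id}/log/{self.name}", message)

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...


REGISTRY: dict[str, type[Component]] = {}


def register(cls: type[Component]) -> type[Component]:
    """Class decorator that makes a component installable by name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class LedStrip(Component):
    name = "led_strip"

    def start(self) -> None:
        effect = self.ctx.config.get("effect", "rainbow")
        self.log(f"starting effect {effect}")

    def stop(self) -> None:
        self.log("stopped")


def install(name: str, ctx: Context) -> Component:
    """Look up a component by name, build it, and run its start hook."""
    component = REGISTRY[name](ctx)
    component.start()
    return component


sent: list[tuple[str, str]] = []
ctx = Context("pi-kitchen", lambda t, p: sent.append((t, p)), {"effect": "sparkle"})
strip = install("led_strip", ctx)
strip.stop()
print(sent)
```

Passing `publish` in through the context keeps components unaware of the MQTT client itself, so they can be exercised with a plain list as shown.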

How it runs

The lifecycle of one agent

Bringing a device online follows a fixed path. The MQTT contract is the same for every node, which is what makes the fleet uniform.

In my rebuild I focused on getting one agent fully self-describing and remotely commandable before adding a second — the contract is what scales, not the count.
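
The "remotely commandable" half can be sketched as a small router that maps command topics to handlers. The `fleet/<agent_id>/cmd/<name>` topic shape, the JSON payloads and the reply format are assumptions for illustration, not the contract the platform actually uses.

```python
import json
from typing import Callable

CommandHandler = Callable[[dict], dict]


class CommandRouter:
    """Maps command names to handler functions for one agent."""

    def __init__(self, agent_id: str) -> None:
        self.agent_id = agent_id
        self._handlers: dict[str, CommandHandler] = {}

    def command(self, name: str):
        """Decorator that registers a handler under a command name."""
        def decorator(fn: CommandHandler) -> CommandHandler:
            self._handlers[name] = fn
            return fn
        return decorator

    def handle(self, topic: str, payload: str) -> dict:
        """Dispatch one incoming message and return a reply dict."""
        prefix = f"fleet/{self.agent_id}/cmd/"  # hypothetical topic shape
        if not topic.startswith(prefix):
            return {"ok": False, "error": "not addressed to this agent"}
        name = topic[len(prefix):]
        handler = self._handlers.get(name)
        if handler is None:
            return {"ok": False, "error": f"unknown command: {name}"}
        try:
            args = json.loads(payload) if payload else {}
        except json.JSONDecodeError:
            return {"ok": False, "error": "payload is not valid JSON"}
        if not isinstance(args, dict):
            return {"ok": False, "error": "payload must be a JSON object"}
        return {"ok": True, "result": handler(args)}


router = CommandRouter("pi-kitchen")


@router.command("restart")
def restart(args: dict) -> dict:
    # A real handler would restart something; this one only reports intent.
    return {"restarting": args.get("component", "agent")}


reply = router.handle("fleet/pi-kitchen/cmd/restart", '{"component": "led_strip"}')
print(reply)
```

Returning an error dict instead of raising keeps a malformed command from taking the agent down, which matters for a process that is meant to be long-lived.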

Reflection

What rebuilding it taught me