Self-Hosted MQTT Broker — Built From Scratch

What it is

One broker, many devices

MQTT is a lightweight publish/subscribe protocol built for machines that have little power and flaky connections — exactly the situation a room full of sensors finds itself in. Instead of every device calling every other device directly, they all connect to a single broker. A device publishes a message to a named topic (say floor2/lab/temp); any device subscribed to that topic receives it. Publishers and subscribers never need to know each other exist.
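
To make the topic idea concrete, here is a minimal Python sketch of how a subscription filter can be matched against a published topic, plus a toy in-memory fan-out. It follows the MQTT wildcard rules (+ for one level, # for everything below) but ignores the special handling of $-prefixed topics, and it is not how EMQX implements routing. The names topic_matches and ToyBroker are illustrative assumptions.

```python
from typing import Callable, List, Tuple

Callback = Callable[[str, str], None]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, and it
            # matches the parent level too ("floor2/#" matches "floor2").
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class ToyBroker:
    """In-memory stand-in for a broker: stores filters, fans messages out."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Callback]] = []

    def subscribe(self, topic_filter: str, callback: Callback) -> None:
        self._subscriptions.append((topic_filter, callback))

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to every matching subscriber; return the count."""
        delivered = 0
        for topic_filter, callback in self._subscriptions:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)
                delivered += 1
        return delivered
```

The publisher only ever names a topic. Which subscribers exist, and how many, is entirely the broker's concern.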

I built this to understand the backbone of almost every IoT system. The broker here is EMQX, a production-grade MQTT server, paired with a PostgreSQL database that acts as a registry — a single source of truth describing which devices exist, what they can do, and which topics carry their state and commands.

The core idea I wanted to learn: a good messaging layer decouples everything. Add a new sensor and it just starts publishing; nothing else has to change. The broker handles fan-out, the registry handles identity, and the devices stay blissfully ignorant of one another.

The stack

Tools under the hood

Two containers do all the work. Here is what each one actually contributes.

broker

EMQX 5

A scalable MQTT broker. It accepts connections, routes published messages to every matching subscriber, and ships a web dashboard for watching live traffic.

transport

MQTT + TLS

Plain MQTT on port 1883 for the LAN, and an encrypted TLS listener on 8883 so that traffic from devices crossing untrusted networks stays private.
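
On the client side, talking to the 8883 listener means building a TLS context that verifies the broker's certificate. The sketch below uses only Python's standard ssl module. The function name, the LISTENER_PORTS mapping and the optional ca_file parameter (for a private CA) are assumptions for illustration; an MQTT client library would then be handed this context.

```python
import ssl
from typing import Optional

# Port numbers as described above; the mapping itself is illustrative.
LISTENER_PORTS = {"plain": 1883, "tls": 8883}


def make_client_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the broker certificate.

    ca_file is a hypothetical path to a private CA bundle; when it is None
    the system's default trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables hostname checking and
    # certificate verification; here we only raise the protocol floor.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```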

registry

PostgreSQL 16

The catalogue of devices, capabilities and topic bindings. It answers "what is this device and how do I talk to it?" — something MQTT itself never stores.

schema

init.sql

Seeds the database on first boot: tables for devices, capabilities, routing rules, topic bindings and last-known state.

packaging

Docker Compose

Brings up the broker and the database on a shared bridge network with a single command; ports and secrets are driven by environment variables.

dashboard

EMQX Console

A built-in web UI on port 18083 to inspect connected clients, subscriptions and message rates in real time.

The registry

What the database remembers

MQTT deliberately keeps almost no history: apart from retained messages and messages queued for persistent sessions, the broker drops a message once it has been delivered. To run a real fleet you need somewhere to record what exists. My Postgres schema models the device world in a handful of tables:

device
Every physical thing — its name, area, floor, kind and an online flag, plus a free-form JSON meta field for anything unusual.
capability
A named thing a device can do (e.g. "read temperature", "toggle relay"), each with a JSON schema describing its payload.
device_capability
The join table — which devices have which capabilities. A many-to-many map of the fleet's abilities.
topic_binding
For each device, the exact state_topic it publishes to and the command_topic it listens on. This is the bridge between the registry and live MQTT traffic.
routing_rule
Higher-level intent: "send this capability to any device matching this selector," with a strategy and priority — the seed of orchestration.
device_state
The last-seen timestamp and last-known JSON state per device, so the system has a memory even when a device goes quiet.
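
The sketch below shows how a few of these tables might fit together and how the registry answers "how do I talk to this device?". It is not the project's init.sql: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL so it runs anywhere, and every column name beyond those mentioned above (id, payload_schema and so on), along with the sample device, is an assumption.

```python
import json
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE device (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE,
    area   TEXT,
    floor  INTEGER,
    kind   TEXT,
    online INTEGER NOT NULL DEFAULT 0,
    meta   TEXT NOT NULL DEFAULT '{}'  -- JSON kept as text in SQLite
);
CREATE TABLE capability (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE,
    payload_schema TEXT NOT NULL DEFAULT '{}'
);
CREATE TABLE device_capability (
    device_id     INTEGER NOT NULL REFERENCES device(id),
    capability_id INTEGER NOT NULL REFERENCES capability(id),
    PRIMARY KEY (device_id, capability_id)
);
CREATE TABLE topic_binding (
    device_id     INTEGER PRIMARY KEY REFERENCES device(id),
    state_topic   TEXT NOT NULL,
    command_topic TEXT NOT NULL
);
"""


def build_demo_registry() -> sqlite3.Connection:
    """Create an in-memory registry holding one made-up sensor."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    device_id = conn.execute(
        "INSERT INTO device (name, area, floor, kind, meta) VALUES (?, ?, ?, ?, ?)",
        ("lab-temp-1", "lab", 2, "sensor", json.dumps({"vendor": "example"})),
    ).lastrowid
    capability_id = conn.execute(
        "INSERT INTO capability (name, payload_schema) VALUES (?, ?)",
        ("read temperature", json.dumps({"type": "number"})),
    ).lastrowid
    conn.execute("INSERT INTO device_capability VALUES (?, ?)", (device_id, capability_id))
    conn.execute(
        "INSERT INTO topic_binding VALUES (?, ?, ?)",
        (device_id, "floor2/lab/temp", "floor2/lab/temp/set"),
    )
    return conn


def how_to_talk_to(conn: sqlite3.Connection, device_name: str) -> Optional[dict]:
    """Return a device's kind, topics and capabilities, or None if unknown."""
    row = conn.execute(
        "SELECT d.kind, b.state_topic, b.command_topic "
        "FROM device AS d JOIN topic_binding AS b ON b.device_id = d.id "
        "WHERE d.name = ?",
        (device_name,),
    ).fetchone()
    if row is None:
        return None
    capabilities = [
        r[0]
        for r in conn.execute(
            "SELECT c.name FROM capability AS c "
            "JOIN device_capability AS dc ON dc.capability_id = c.id "
            "JOIN device AS d ON d.id = dc.device_id "
            "WHERE d.name = ? ORDER BY c.name",
            (device_name,),
        )
    ]
    return {
        "kind": row[0],
        "state_topic": row[1],
        "command_topic": row[2],
        "capabilities": capabilities,
    }
```

Calling how_to_talk_to(build_demo_registry(), "lab-temp-1") returns the sensor's kind, its two topics and its capability list, which is exactly the information MQTT itself does not carry.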

How it runs

From compose-up to first message

The whole stack is configuration, not code. Bringing it to life is a short sequence:

Set the environment: ports, the database name/user/password and the dashboard admin password all come from environment variables, so no secrets live in the compose file.
Bring it up: a single docker compose up command starts EMQX and Postgres on a shared bridge network named labnet; the schema seeds itself from init.sql on first boot.
Publish and subscribe: a sensor publishes to its state_topic, and every client subscribed to that topic receives the message as soon as the broker routes it. Commands flow back down the device's command_topic.
Watch it live: the EMQX dashboard shows every connected client and subscription, which made debugging the message flow far easier than reading logs.
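
The "set the environment" step can be sketched from the point of view of a helper script that reads the same variables the compose file would. The variable names below (MQTT_PORT, POSTGRES_DB, DASHBOARD_ADMIN_PASSWORD and so on) are hypothetical, not taken from the project. The default ports are the ones described earlier, and secrets are deliberately given no default so a missing value fails loudly.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Hypothetical variable names; secrets must always be supplied explicitly.
REQUIRED = (
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "DASHBOARD_ADMIN_PASSWORD",
)


@dataclass(frozen=True)
class StackConfig:
    mqtt_port: int
    mqtt_tls_port: int
    dashboard_port: int
    db_name: str
    db_user: str
    db_password: str
    dashboard_password: str


def load_config(env: Mapping[str, str] = os.environ) -> StackConfig:
    """Read ports and secrets from the environment, failing on missing secrets."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
    return StackConfig(
        mqtt_port=int(env.get("MQTT_PORT", "1883")),
        mqtt_tls_port=int(env.get("MQTT_TLS_PORT", "8883")),
        dashboard_port=int(env.get("DASHBOARD_PORT", "18083")),
        db_name=env["POSTGRES_DB"],
        db_user=env["POSTGRES_USER"],
        db_password=env["POSTGRES_PASSWORD"],
        dashboard_password=env["DASHBOARD_ADMIN_PASSWORD"],
    )
```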

In my rebuild I focused on getting the broker and registry talking cleanly — a device row in Postgres, a matching topic binding, and a message arriving on the right subscription.
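
One way to picture that last hop, a message landing on a state_topic and the registry remembering it, is the sketch below. It is an assumption about how such glue could look, not code from the project: sqlite3 again stands in for PostgreSQL, the table columns and the sample device are made up, and record_state is a hypothetical handler that a subscriber would call for each incoming message.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def open_registry() -> sqlite3.Connection:
    """In-memory stand-in for the registry with one made-up device."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE device (id INTEGER PRIMARY KEY, name TEXT, online INTEGER DEFAULT 0);
        CREATE TABLE topic_binding (
            device_id INTEGER PRIMARY KEY REFERENCES device(id),
            state_topic TEXT NOT NULL,
            command_topic TEXT NOT NULL
        );
        CREATE TABLE device_state (
            device_id INTEGER PRIMARY KEY REFERENCES device(id),
            last_seen TEXT NOT NULL,
            state TEXT NOT NULL
        );
        INSERT INTO device (id, name) VALUES (1, 'lab-temp-1');
        INSERT INTO topic_binding VALUES (1, 'floor2/lab/temp', 'floor2/lab/temp/set');
        """
    )
    return conn


def record_state(
    conn: sqlite3.Connection,
    topic: str,
    payload: bytes,
    now: Optional[datetime] = None,
) -> bool:
    """Store the latest state for the device bound to topic.

    Returns False when no topic_binding row claims the topic.
    """
    row = conn.execute(
        "SELECT device_id FROM topic_binding WHERE state_topic = ?", (topic,)
    ).fetchone()
    if row is None:
        return False
    state = json.loads(payload)  # raises ValueError on malformed JSON
    seen = (now or datetime.now(timezone.utc)).isoformat()
    # Upsert: one row per device, overwritten by each new message.
    conn.execute(
        """
        INSERT INTO device_state (device_id, last_seen, state)
        VALUES (?, ?, ?)
        ON CONFLICT(device_id) DO UPDATE SET
            last_seen = excluded.last_seen,
            state = excluded.state
        """,
        (row[0], seen, json.dumps(state)),
    )
    conn.execute("UPDATE device SET online = 1 WHERE id = ?", (row[0],))
    return True
```

Messages on topics with no binding are simply ignored here; a real deployment might log them instead, since they usually mean a device is publishing before it has been registered.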

Reflection

What rebuilding it taught me

Pub/sub is the whole trick. Once devices talk through topics instead of addresses, adding the hundredth sensor is exactly as easy as adding the first.
The broker keeps almost no state on purpose. Keeping the registry in a separate database, not in the broker, lets each tool do one job well.
Topics are an API. Designing a clean topic hierarchy (area/device/capability) is as important as designing a REST URL scheme.
Compose plus env-vars is a real deployment. No build step, no hand-installed services — the entire system is two images and a handful of variables.
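
Treating topics as an API suggests building and parsing them in one place rather than concatenating strings all over the codebase. A minimal sketch, assuming the three-level area/device/capability layout mentioned above; the helper names and the example levels are illustrative.

```python
from typing import NamedTuple

# Characters that may not appear inside a single topic level:
# the level separator and the two MQTT wildcards.
FORBIDDEN = set("/+#")


class TopicParts(NamedTuple):
    area: str
    device: str
    capability: str


def build_topic(area: str, device: str, capability: str) -> str:
    """Join three levels into a topic, rejecting empty or unsafe levels."""
    for level in (area, device, capability):
        if not level or FORBIDDEN & set(level):
            raise ValueError(f"invalid topic level: {level!r}")
    return "/".join((area, device, capability))


def parse_topic(topic: str) -> TopicParts:
    """Split a topic back into its three named levels."""
    levels = topic.split("/")
    if len(levels) != 3 or not all(levels):
        raise ValueError(f"expected area/device/capability, got {topic!r}")
    return TopicParts(*levels)
```

Rejecting wildcards at build time matters because + and # are only legal in subscription filters, never in a topic a device publishes to.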
