From-Scratch Build · Human–Robot Interaction
A pair of sensor gloves and a haptic vest that let one person command a swarm of mobile robots by hand. A hand posture picks a mode, finger taps issue orders, and the vest buzzes feedback to the operator. Built from scratch to learn how the body becomes a controller.
What it is
The system turns gestures into swarm commands. The left glove selects the active mode by posture — an open hand means drive, a point means select, a fist means formation. The right glove reads finger-pressure taps that pick the specific command. A vest collects the wireless glove packets, forwards them to a small onboard computer, and drives six haptic motors so the operator feels confirmations and warnings.
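As a rough illustration of the posture-to-mode mapping described above, here is a minimal Python sketch. It assumes five flex readings normalised to 0 (straight) through 1 (fully bent), ordered thumb to little finger, and a single bend threshold. The names `Mode`, `BENT` and `classify_posture` and the threshold value are hypothetical, not taken from the build.

```python
from enum import Enum


class Mode(Enum):
    DRIVE = "drive"          # open hand
    SELECT = "select"        # pointing
    FORMATION = "formation"  # fist


# Hypothetical threshold on a normalised flex value (0 = straight, 1 = fully bent).
BENT = 0.6


def classify_posture(flex):
    """Map five flex readings (thumb to little finger) to a Mode, or None if ambiguous."""
    if len(flex) != 5:
        raise ValueError("expected five flex readings")
    bent = [value >= BENT for value in flex]
    fingers = bent[1:]  # the thumb is ignored in this sketch
    if not any(fingers):
        return Mode.DRIVE
    if all(fingers):
        return Mode.FORMATION
    if not fingers[0] and all(fingers[1:]):
        return Mode.SELECT  # index extended, the other fingers curled
    return None  # ambiguous posture: the caller can keep the previous mode


print(classify_posture([0.1, 0.1, 0.9, 0.9, 0.9]))
```

Returning `None` for an ambiguous hand shape, rather than guessing, lets the caller hold the last confirmed mode while the hand is in transition.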
I built this because I wanted to understand the full chain from a finger bend to a robot moving: how a wearable sensor becomes a clean, safe intent that many robots can act on at once — without the operator ever looking at a screen.
The core idea I wanted to learn: controlling a swarm isn't about steering each robot — it's about expressing intent ("form a wedge", "patrol the perimeter") and letting decentralised robots work out their own paths. The wearable just has to turn the body into a reliable, gated source of those intents.
The stack
This rebuild spans firmware, wireless links, robotics middleware and motion capture. Here is what each layer actually does.
Flex sensors and an IMU read hand posture and motion; force-sensitive resistors on the fingers register discrete taps.
ESP-NOW, a low-latency ESP32 radio protocol. Gloves broadcast packets to the vest with no Wi-Fi network in the loop.
The vest forwards validated packets over USB serial to a Pi, which publishes them into the robotics graph.
The robotics middleware carries raw input, recognised gestures, swarm intent and per-robot velocity commands between every node.
A motion-capture system supplies shared world-frame robot poses, so the swarm can hold real formations.
Six motors translate system state back into touch — confirmations, mode changes and stop signals you can feel.
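To make "validated packets" concrete, the sketch below shows one way a glove packet could be framed and checked before it is forwarded. The layout (glove id, sequence number, five flex values, four pressure values, trailing CRC32) is an assumption for illustration only; the real packet format is not described here. It uses only Python's standard `struct` and `zlib` modules.

```python
import struct
import zlib

# Hypothetical layout, little-endian: glove id (u8), sequence (u16),
# five flex values (u16 each), four FSR values (u16 each).
BODY = struct.Struct("<BH5H4H")
CRC = struct.Struct("<I")  # CRC32 of the body, appended at the end


def encode_packet(glove_id, seq, flex, fsr):
    """Pack one sensor frame and append a CRC32 over the body."""
    body = BODY.pack(glove_id, seq & 0xFFFF, *flex, *fsr)
    return body + CRC.pack(zlib.crc32(body))


def decode_packet(raw):
    """Return the decoded fields, or None if the length or CRC is wrong."""
    if len(raw) != BODY.size + CRC.size:
        return None
    body, tail = raw[:BODY.size], raw[BODY.size:]
    (crc,) = CRC.unpack(tail)
    if crc != zlib.crc32(body):
        return None
    fields = BODY.unpack(body)
    return {
        "glove_id": fields[0],
        "seq": fields[1],
        "flex": list(fields[2:7]),
        "fsr": list(fields[7:11]),
    }


packet = encode_packet(1, 42, [100, 200, 300, 400, 500], [0, 0, 900, 0])
print(decode_packet(packet))
```

Dropping anything that fails the length or checksum test at the vest means the downstream stages only ever see well-formed frames.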
Architecture
A command travels through a fixed chain of nodes. Each stage has one job, which keeps the safety-critical parts isolated and testable.
Left and right ESP32s read sensors and broadcast packets over ESP-NOW.
The vest receives packets from both gloves, validates them, and bridges them to the Pi over serial.
The gesture stage classifies posture and finger taps into a mode and a discrete command.
A deadman and shake-to-stop layer gates all motion before any intent is published.
The swarm stage turns intent into formations and behaviours, using shared robot poses.
Each robot consumes swarm intent and computes its own local velocity command.
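The last stage of the chain, where each robot turns a shared intent into its own velocity command, can be sketched as follows. This is an illustration under stated assumptions, not the controller used in the build: the wedge slot layout, the proportional gain, the speed cap, and the function names `wedge_slot` and `local_velocity` are all hypothetical, and the command is expressed in the world frame as if the robot could move in any direction.

```python
import math


def wedge_slot(index, spacing=0.5):
    """Offset (x forward, y left) of robot `index` in the formation frame; index 0 leads."""
    if index == 0:
        return (0.0, 0.0)
    rank = (index + 1) // 2                   # how far back in the wedge
    side = 1.0 if index % 2 == 1 else -1.0    # odd indices go left, even go right
    return (-rank * spacing, side * rank * spacing)


def local_velocity(pose, index, anchor, heading, gain=1.0, v_max=0.3):
    """Proportional step toward this robot's own slot, capped at v_max.

    pose and anchor are (x, y) in the shared world frame; heading is the
    formation heading in radians.
    """
    sx, sy = wedge_slot(index)
    c, s = math.cos(heading), math.sin(heading)
    goal_x = anchor[0] + c * sx - s * sy      # rotate the slot into the world frame
    goal_y = anchor[1] + s * sx + c * sy
    vx = gain * (goal_x - pose[0])
    vy = gain * (goal_y - pose[1])
    speed = math.hypot(vx, vy)
    if speed > v_max:
        vx, vy = vx * v_max / speed, vy * v_max / speed
    return (vx, vy)


print(local_velocity(pose=(0.0, 0.0), index=1, anchor=(2.0, 0.0), heading=0.0))
```

Every robot runs the same function with its own `index` and pose, so the intent only has to carry the formation type, anchor and heading; no per-robot path is sent.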
The vocabulary
The control language is small and deliberate: modes are selected by posture (an open hand for drive, a point for select, a fist for formation), then orders are chosen by finger taps.
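For the finger-tap half of this vocabulary, a tap has to be separated from a resting touch or a long squeeze. One common approach, shown here as a sketch and not as the build's own detector, is a pair of press and release thresholds (hysteresis) plus a maximum hold time. The class name `TapDetector` and all threshold values are assumptions, and the pressure is taken to be normalised to the range 0 to 1.

```python
class TapDetector:
    """Turn a stream of normalised pressure samples into discrete tap events."""

    def __init__(self, press=0.6, release=0.4, max_hold=0.4):
        # Illustrative values: press/release thresholds and max hold in seconds.
        self.press = press
        self.release = release
        self.max_hold = max_hold
        self.pressed_at = None  # time the current press started, if any

    def update(self, now, pressure):
        """Feed one sample; return True exactly once per completed tap."""
        if self.pressed_at is None:
            if pressure >= self.press:
                self.pressed_at = now
            return False
        if pressure <= self.release:
            held = now - self.pressed_at
            self.pressed_at = None
            return held <= self.max_hold  # long presses are not taps
        return False


detector = TapDetector()
samples = [(0.00, 0.1), (0.05, 0.8), (0.10, 0.7), (0.15, 0.2)]
print([detector.update(t, p) for t, p in samples])
```

The gap between the two thresholds keeps a finger hovering near the trigger point from producing a burst of false taps.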
In my rebuild I treated safety as the first feature, not the last — every intent passes the deadman and stop layer before a single robot moves.
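A minimal sketch of such a gate is shown below, assuming the deadman arrives as a boolean that must be refreshed within a short timeout and that a shake is detected as an acceleration magnitude above a threshold. The class name `SafetyGate`, the `"STOP"` sentinel and both numeric values are illustrative assumptions, not the tuned values from the build.

```python
import math


class SafetyGate:
    """Pass an intent through only while the deadman is fresh and no stop is latched."""

    def __init__(self, deadman_timeout=0.25, shake_threshold=25.0):
        self.deadman_timeout = deadman_timeout  # seconds (assumed value)
        self.shake_threshold = shake_threshold  # m/s^2 (assumed value)
        self.last_deadman = None
        self.stopped = False

    def update(self, now, deadman_pressed, accel):
        """Record the latest deadman state and check the IMU sample for a shake."""
        if deadman_pressed:
            self.last_deadman = now
        magnitude = math.sqrt(sum(a * a for a in accel))
        if magnitude > self.shake_threshold:
            self.stopped = True  # latched until reset() is called

    def reset(self):
        """Explicitly clear a latched stop."""
        self.stopped = False

    def filter(self, now, intent):
        """Return the intent if motion is allowed, otherwise "STOP"."""
        if self.stopped:
            return "STOP"
        if self.last_deadman is None:
            return "STOP"
        if now - self.last_deadman > self.deadman_timeout:
            return "STOP"
        return intent


gate = SafetyGate()
gate.update(1.0, True, (0.0, 0.0, 9.8))
print(gate.filter(1.1, "wedge"), gate.filter(2.0, "wedge"))
```

The gate fails closed: with no deadman signal, a stale one, or a latched shake, the only thing it will emit is a stop.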
Reflection