From-Scratch Build · Tangible Interfaces
Put a physical object on a table, move it around, and watch a live digital graphic follow it underneath — drawn straight back onto the surface by a projector. I built this from scratch to understand how motion capture, real-time messaging and projection mapping stitch together.
What it is
A tangible interface erases the screen: instead of clicking a mouse, you move real objects, and the computer responds in the same physical space. Here a ceiling-mounted motion-capture system watches small tracked markers on the table. Each tracked object becomes a rigid body with a known position and orientation, and a short-throw projector paints digital graphics onto the tabletop so they appear to live under the objects you hold.
I built this as my entry point into spatial computing. The hard part isn't the tracking — it's getting the projected image to line up exactly with the real surface, so a marker at one corner of the table draws a graphic at that same corner.
The core idea I wanted to learn: a tangible interface is really a coordinate-translation problem. Motion capture speaks in metres of room space; the projector speaks in pixels on a warped surface. Everything interesting happens in the maths that maps one to the other.
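To make the coordinate translation concrete, here is a minimal sketch of a per-axis linear map from capture-space metres to surface coordinates, fitted by ordinary least squares. The function names (`fit_axis`, `to_surface`) and the sample values are hypothetical and chosen for illustration, not taken from my build.

```python
from statistics import mean


def fit_axis(capture_m, surface):
    """Least-squares slope and offset so that surface ~= slope * capture + offset."""
    if len(capture_m) != len(surface) or len(capture_m) < 2:
        raise ValueError("need at least two matched samples per axis")
    c_mean, s_mean = mean(capture_m), mean(surface)
    variance = sum((c - c_mean) ** 2 for c in capture_m)
    if variance == 0:
        raise ValueError("capture samples must not all be identical")
    covariance = sum((c - c_mean) * (s - s_mean) for c, s in zip(capture_m, surface))
    slope = covariance / variance
    return slope, s_mean - slope * c_mean


def to_surface(x_m, y_m, fit_x, fit_y):
    """Apply the two independent (slope, offset) pairs to one capture-space point."""
    return fit_x[0] * x_m + fit_x[1], fit_y[0] * y_m + fit_y[1]


# Hypothetical calibration samples: a marker placed at known surface positions
# (arbitrary grid units) and the capture-space reading at each one (metres).
fit_x = fit_axis([-0.5, 0.0, 0.5], [0.0, 50.0, 100.0])
fit_y = fit_axis([-0.3, 0.3], [0.0, 60.0])

grid_x, grid_y = to_surface(0.25, 0.0, fit_x, fit_y)
```

Fitting each axis on its own assumes the capture axes are already parallel to the table edges. If the table is rotated in the room, a full 2D affine fit would be needed instead.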
The stack
The point of this rebuild was the toolchain. Here is what each piece actually does in the system.
A multi-camera motion-capture system tracks retro-reflective markers and reports the position and orientation of each rigid body many times a second.
ROS acts as a publish/subscribe message bus: a node reads the tracking stream and republishes each body's pose on its own topic so the rest of the system can listen.
Pose data is forwarded over UDP to the visualisation engine. UDP is fast and connectionless, and for real-time work it is acceptable to lose the occasional frame.
An agent-simulation platform receives each pose and draws a graphic for it, which is what gets projected back onto the table.
The four corners of the projected grid are dragged to match the real surface, cancelling the distortion a projector introduces at an angle.
A small linear fit (slope + offset per axis) converts capture-space metres into surface coordinates so objects and graphics align.
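The corner-dragging step is commonly modelled as a planar homography: a 3x3 projective transform fixed by four point pairs. The sketch below is one way to compute it in plain Python. It is an illustration under my own assumptions (the helper names and the pixel values are hypothetical), not the code the visualisation engine runs internally.

```python
def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner layout")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def homography(src, dst):
    """Nine row-major coefficients mapping four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve(rows, rhs) + [1.0]  # bottom-right coefficient fixed at 1


def warp(h, x, y):
    """Apply the homography to one point, including the perspective divide."""
    w = h[6] * x + h[7] * y + h[8]
    return (h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w


# Hypothetical values: the unit grid's corners and the projector pixels where
# each corner ended up after being dragged onto the real table corners.
GRID = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
PIXELS = [(100.0, 80.0), (1180.0, 120.0), (1100.0, 700.0), (160.0, 660.0)]

H = homography(GRID, PIXELS)
centre_px = warp(H, 0.5, 0.5)  # where the middle of the grid lands in pixels
```

Four corners give eight equations for the eight unknown coefficients, which is why dragging exactly four points is enough to cancel the projector's angle.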
Pipeline
Every frame travels the same short journey, from a physical object on the table to a graphic drawn underneath it.
Cameras locate the markers and the system reports each rigid body's pose.
A ROS node turns the raw capture stream into a clean per-object pose topic.
Poses are pushed over UDP to the visualisation engine with minimal latency.
Capture-space coordinates are converted into table-surface coordinates.
The engine renders a graphic for each object at its mapped location.
Keystone-corrected output is projected onto the table, completing the loop.
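The UDP hop in the pipeline above can be sketched with Python's standard `socket` and `struct` modules. The wire format here (a body id followed by position and an orientation quaternion, packed as little-endian 32-bit values) is an assumption made for illustration. The original build's packet layout is not described, and the function names are hypothetical.

```python
import socket
import struct

# Assumed packet layout (hypothetical): unsigned 32-bit body id, then
# x, y, z in metres and quaternion qx, qy, qz, qw as 32-bit floats.
POSE_FORMAT = "<I7f"
POSE_SIZE = struct.calcsize(POSE_FORMAT)


def encode_pose(body_id, position, quaternion):
    """Pack one rigid-body pose into a fixed-size datagram payload."""
    return struct.pack(POSE_FORMAT, body_id, *position, *quaternion)


def decode_pose(packet):
    """Unpack a payload produced by encode_pose; reject anything else."""
    if len(packet) != POSE_SIZE:
        raise ValueError(f"expected {POSE_SIZE} bytes, got {len(packet)}")
    body_id, x, y, z, qx, qy, qz, qw = struct.unpack(POSE_FORMAT, packet)
    return body_id, (x, y, z), (qx, qy, qz, qw)


def send_and_receive_once(body_id, position, quaternion):
    """Loopback round trip: one socket plays the bridge, the other the engine."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(1.0)  # UDP gives no delivery guarantee, so never block forever
        sender.sendto(encode_pose(body_id, position, quaternion), receiver.getsockname())
        packet, _ = receiver.recvfrom(POSE_SIZE)
        return decode_pose(packet)
    finally:
        sender.close()
        receiver.close()
```

A fixed-size binary payload keeps every datagram small and self-contained, so a lost frame is simply replaced by the next pose rather than stalling the stream.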
The tricky part
Two calibration problems took most of my time, and solving them is what turns a noisy demo into something that feels like magic.
In my build the whole loop runs from a single launch: bring up the tracking bridge, start the per-object publishers, then run the display script that listens and draws.
Reflection