Tangible Tabletop Tracking Interface

What it is

Objects you can pick up, graphics that follow

A tangible interface erases the screen: instead of clicking a mouse, you move real objects, and the computer responds in the same physical space. Here a ceiling-mounted motion-capture system watches small tracked markers on the table. Each tracked object becomes a rigid body with a known position and orientation, and a short-throw projector paints digital graphics onto the tabletop so they appear to live under the objects you hold.

I built this as my entry point into spatial computing. The hard part isn't the tracking — it's getting the projected image to line up exactly with the real surface, so a marker at one corner of the table draws a graphic at that same corner.

The core idea I wanted to learn: a tangible interface is really a coordinate-translation problem. Motion capture speaks in metres of room space; the projector speaks in pixels on a warped surface. Everything interesting happens in the maths that maps one to the other.

The stack

Tools under the hood

The point of this rebuild was the toolchain. Here is what each piece actually does in the system.

sensing

Motion capture

A multi-camera system that tracks retro-reflective markers and reports the position and orientation of each rigid body many times a second.

middleware

ROS

A publish/subscribe message bus. A node reads the tracking stream and republishes each body's pose on its own topic so the rest of the system can listen.
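
As a rough illustration of the one-topic-per-body idea, here is a minimal sketch assuming ROS 1 with rospy and the standard geometry_msgs/PoseStamped message. The "/mocap" namespace, the "mocap" frame id and the helper names (pose_topic, make_publishers, publish_pose) are hypothetical, not taken from my build, and the code assumes rospy.init_node has already been called elsewhere.

```python
import re


def pose_topic(body_name, namespace="/mocap"):
    """Build a per-body topic name, e.g. '/mocap/cup_1/pose' (naming is an assumption)."""
    # Replace anything other than letters, digits and underscores.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", body_name.strip())
    return f"{namespace}/{safe}/pose"


def make_publishers(body_names):
    """Create one PoseStamped publisher per rigid body."""
    # Imported lazily so pose_topic() can be used without a ROS install.
    import rospy
    from geometry_msgs.msg import PoseStamped

    return {
        name: rospy.Publisher(pose_topic(name), PoseStamped, queue_size=1)
        for name in body_names
    }


def publish_pose(publishers, name, position, quaternion):
    """Publish one pose; position is (x, y, z), quaternion is (x, y, z, w)."""
    import rospy
    from geometry_msgs.msg import PoseStamped

    msg = PoseStamped()
    msg.header.stamp = rospy.Time.now()
    msg.header.frame_id = "mocap"  # hypothetical frame name
    msg.pose.position.x, msg.pose.position.y, msg.pose.position.z = position
    (msg.pose.orientation.x, msg.pose.orientation.y,
     msg.pose.orientation.z, msg.pose.orientation.w) = quaternion
    publishers[name].publish(msg)
```

A queue size of 1 is a deliberate choice in this sketch: a listener that falls behind gets the newest pose rather than a backlog of stale ones.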

transport

UDP bridge

Pose data is forwarded over UDP to the visualisation engine: fast, connectionless, and tolerant of the occasional dropped frame, which is an acceptable trade for real-time work.
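
To make the bridge concrete, here is a minimal sketch using only Python's standard library. The JSON datagram layout (seq, id, x, y, yaw) and the function names are assumptions for illustration, not the wire format my build actually uses. The receiver drains whatever is queued and keeps only the newest pose.

```python
import json
import select
import socket


def encode_pose(seq, name, x, y, yaw):
    """Serialise one pose as a small JSON datagram (layout is an assumption)."""
    return json.dumps(
        {"seq": seq, "id": name, "x": x, "y": y, "yaw": yaw}
    ).encode("utf-8")


def send_pose(sock, addr, payload):
    """Fire and forget: no connection, no acknowledgement, no retry."""
    sock.sendto(payload, addr)


def latest_pose(sock, wait=0.0):
    """Wait up to `wait` seconds, then drain the socket and return the newest pose.

    Returns None if nothing arrived. Older datagrams are discarded on purpose.
    """
    sock.setblocking(False)
    readable, _, _ = select.select([sock], [], [], wait)
    if not readable:
        return None
    newest = None
    while True:
        try:
            data, _addr = sock.recvfrom(4096)
        except BlockingIOError:
            break  # queue is empty
        pose = json.loads(data.decode("utf-8"))
        if newest is None or pose["seq"] > newest["seq"]:
            newest = pose
    return newest
```

The sequence number is what lets the receiver ignore a datagram that arrives late or out of order, since UDP itself guarantees neither delivery nor ordering.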

visuals

Agent-based display

An agent-simulation platform receives each pose and draws a graphic for it, which is what gets projected back onto the table.

calibration

Keystone correction

The four corners of the projected grid are dragged to match the real surface, cancelling the distortion a projector introduces at an angle.
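
Mathematically, a four-corner drag like this can be described by a projective transform (a homography) between the ideal rectangle and the dragged corners. The sketch below shows that idea in plain Python. It is an assumption about what such a tool does internally, not the calibration code of the platform I used, and the function names are illustrative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner layout")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 matrix mapping four src corners (x, y) onto four dst corners (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp(h, x, y):
    """Apply the homography to one point, including the perspective divide."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, with src as the unit square and dst as the four dragged corner positions in projector pixels, warp(h, 0.5, 0.5) gives the pixel where the centre of the grid should be drawn.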

maths

Coordinate mapping

A small linear fit (slope + offset per axis) converts capture-space metres into surface coordinates so objects and graphics align.
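
A minimal sketch of such a fit, using ordinary least squares on a handful of sampled pairs. The function names and the pair layout are illustrative assumptions; the only thing taken from my build is the model itself, one slope and one offset per axis.

```python
def fit_axis(capture, surface):
    """Least-squares slope and offset so that surface ~= slope * capture + offset."""
    n = len(capture)
    if n < 2 or n != len(surface):
        raise ValueError("need at least two matching samples")
    mean_c = sum(capture) / n
    mean_s = sum(surface) / n
    spread = sum((c - mean_c) ** 2 for c in capture)
    if spread == 0:
        raise ValueError("samples must differ along this axis")
    slope = sum((c - mean_c) * (s - mean_s)
                for c, s in zip(capture, surface)) / spread
    return slope, mean_s - slope * mean_c


def fit_mapping(pairs):
    """pairs: list of ((capture_x, capture_y), (surface_x, surface_y))."""
    fit_x = fit_axis([c[0] for c, _ in pairs], [s[0] for _, s in pairs])
    fit_y = fit_axis([c[1] for c, _ in pairs], [s[1] for _, s in pairs])
    return fit_x, fit_y


def to_surface(mapping, cx, cy):
    """Convert one capture-space point (metres) into surface coordinates."""
    (slope_x, offset_x), (slope_y, offset_y) = mapping
    return slope_x * cx + offset_x, slope_y * cy + offset_y
```

Because each axis is fitted independently, this model cannot absorb a rotation between the capture frame and the table. It assumes the two are already axis-aligned, which is the situation a per-axis slope and offset implies.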

Pipeline

From marker to projected pixel

Every frame travels the same short journey, from a physical object on the table to a graphic drawn underneath it.

Track: cameras locate the markers and the system reports each rigid body's pose.
Publish: a ROS node turns the raw capture stream into a clean per-object pose topic.
Stream: poses are pushed over UDP to the visualisation engine with minimal latency.
Map: capture-space coordinates are converted into table-surface coordinates.
Draw: the engine renders a graphic for each object at its mapped location.
Project: keystone-corrected output is projected onto the table, completing the loop.

The tricky part

Making the projection line up

Two calibration problems took most of my time, and solving them is what turns a noisy demo into something that feels like magic:

Keystone: a projector mounted off-axis throws a trapezoid, not a rectangle. Dragging the four corners of the virtual grid onto the real table corners pre-warps the image so it lands square.
Axis mapping: because I only project onto a flat 2D plane, the object's height has nothing to drive, so I use rotation about the height axis instead, and a graphic can spin in place as you twist the real object.
Re-fitting on the move: change the surface and the numbers drift. Sampling a few known coordinate pairs and re-fitting the slope and offset re-aligns everything in minutes.
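
For the axis-mapping item above, reading it as "keep x and y, swap height for heading", the reduction from a full 3D pose to a flat one can be sketched like this. It assumes the orientation arrives as a quaternion in (x, y, z, w) order and that z is the height axis. Both are assumptions: some capture systems are y-up, in which case the axes need swapping first. The function names are illustrative.

```python
import math


def yaw_from_quaternion(x, y, z, w):
    """Rotation about the z axis in radians, assuming z is the height axis."""
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))


def flatten_pose(position, quaternion):
    """Reduce a 3D pose to (x, y, heading in degrees) for a flat tabletop."""
    x, y, _height = position  # height is dropped: the projection is 2D
    return x, y, math.degrees(yaw_from_quaternion(*quaternion))
```

Twisting an object a quarter turn on the table while leaving it flat changes only the third value, which is what lets the graphic spin in place.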

In my build the whole loop runs from a single launch: bring up the tracking bridge, start the per-object publishers, then run the display script that listens and draws.

Reflection

What rebuilding it taught me

Tangible computing is calibration, not novelty. The interaction feels effortless only because the coordinate maths is invisible and correct.
UDP is the right kind of lazy. For pose streaming, dropping a late frame beats waiting for it — real-time wants freshness over completeness.
ROS topics keep things honest. Splitting each object onto its own topic made the system easy to reason about and extend to many markers at once.
A projector is an output device with opinions. Until you correct for its geometry, every graphic lands in the wrong place.
