Smart Cooking Projection Interface

What it is

The recipe lives on the counter

This is a tangible user interface for cooking. A tracking system knows where your tools and ingredients are, an agent-based program holds the recipe state, and a projector draws instructions, timers and highlights straight onto the same surface you're working on. A vision language model interprets what the camera sees in the kitchen, so the interface can react to the real scene rather than a fixed script.

The premise that drew me in: text recipes force you to context-switch constantly — read, look away, cook, look back. Projecting the guidance into the workspace removes that gap. I wanted to build the plumbing that makes the counter itself the screen.

The core idea I wanted to learn: ambient interfaces win by being where your attention already is. The technical challenge is making the projected layer agree with the physical layer — the recipe step has to land next to the bowl it's talking about.

The stack

Tools under the hood

The point of this rebuild was the toolchain. Here is what each piece actually does in the system.

tracking

Motion capture

Locates tangible objects on the counter — tools, containers, markers — and reports their positions to the rest of the system.

middleware

ROS bridge

Carries tracking data between components and forwards it over UDP to the projection engine.
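
To make the forwarding step concrete, here is a minimal sketch of how one tracked position could travel over UDP. It assumes each update is sent as a single JSON datagram. The `TrackedObject` fields, the helper names and the loopback address are all hypothetical, not the wire format the real system uses, and the ROS subscription side is left out so the example runs on the standard library alone.

```python
import json
import socket
from dataclasses import asdict, dataclass


@dataclass
class TrackedObject:
    """Hypothetical pose message: object name plus counter position in metres."""
    name: str
    x: float
    y: float


def encode(obj: TrackedObject) -> bytes:
    """Serialize one tracked object as a UTF-8 JSON datagram payload."""
    return json.dumps(asdict(obj)).encode("utf-8")


def decode(payload: bytes) -> TrackedObject:
    """Rebuild the tracked object on the projection-engine side."""
    return TrackedObject(**json.loads(payload.decode("utf-8")))


def send_pose(sock: socket.socket, obj: TrackedObject, addr) -> None:
    """Fire-and-forget send: UDP gives no delivery guarantee."""
    sock.sendto(encode(obj), addr)


def loopback_demo() -> TrackedObject:
    """Send one pose to a receiver on localhost and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(1.0)
        send_pose(sender, TrackedObject("bowl", 0.42, 0.17), receiver.getsockname())
        data, _ = receiver.recvfrom(4096)
        return decode(data)
    finally:
        sender.close()
        receiver.close()
```

Calling `loopback_demo()` sends one pose through a real socket and returns the decoded object. Because UDP can drop or reorder packets, a design like this suits position updates, where a newer reading replaces a lost one.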

logic

Agent-based program

Holds the recipe state machine — which step you're on — and decides what the projector should show next.
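
A recipe state machine can be very small. The sketch below is one way to hold "which step you're on", assuming a strictly linear recipe. The `Step` fields and the `RecipeState` class are hypothetical names for illustration, not the agent program's actual structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One recipe step (hypothetical shape)."""
    instruction: str   # text the projector should show
    target: str        # tracked object the text should appear next to
    timer_s: int = 0   # 0 means the step has no timer


@dataclass
class RecipeState:
    """Linear state machine: the state is just an index into the step list."""
    steps: List[Step]
    index: int = 0

    @property
    def current(self) -> Optional[Step]:
        """Step to display now, or None once the recipe is finished."""
        return self.steps[self.index] if self.index < len(self.steps) else None

    @property
    def done(self) -> bool:
        return self.index >= len(self.steps)

    def advance(self) -> Optional[Step]:
        """Move to the next step and return it; stays put once finished."""
        if not self.done:
            self.index += 1
        return self.current


recipe = RecipeState([
    Step("Crack two eggs into the bowl", target="bowl"),
    Step("Whisk until smooth", target="bowl", timer_s=30),
])
```

Here `recipe.current` is the step to project, and `recipe.advance()` is what the logic layer would call once the step is judged complete. A recipe with branches or parallel steps would need more than an index.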

perception

Vision language model

Interprets the camera view of the kitchen so the interface can read and reason about the real cooking scene.

display

Projector

Paints the recipe steps, timers and highlights onto the cooking surface, aligned with the physical workspace.
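
Keeping the projection aligned with the workspace comes down to mapping counter coordinates to projector pixels. One common way to do that for a flat surface is a planar homography fitted from four calibration points. The sketch below assumes that approach; the source does not say which calibration method the system uses, and the function names and sample coordinates are hypothetical. It uses plain Python to stay dependency-free, where a real build would more likely lean on NumPy or OpenCV.

```python
def solve_linear(a, b):
    """Solve a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate calibration points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(counter_pts, projector_pts):
    """Fit the 8 homography parameters (h33 fixed to 1) from 4 point pairs."""
    if len(counter_pts) != 4 or len(projector_pts) != 4:
        raise ValueError("need exactly four point pairs")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(counter_pts, projector_pts):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1), same rearrangement
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def counter_to_pixels(h, x, y):
    """Map a counter position (metres) to projector pixel coordinates."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical calibration: counter corners (metres) and where they land in pixels.
counter_corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
pixel_corners = [(100.0, 50.0), (1820.0, 80.0), (1780.0, 1000.0), (140.0, 1030.0)]
H = fit_homography(counter_corners, pixel_corners)
```

With `H` fitted once at setup, `counter_to_pixels(H, x, y)` turns any tracked position into the pixel where its label should be drawn. The fit only holds while the projector and counter stay fixed relative to each other, and it ignores object height above the surface.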

feedback

Audio cues

Spoken and tonal feedback complements the projection, so you get a nudge even when you're not looking down.

Pipeline

From the counter to a projected step

Each moment of cooking flows through the same sense-interpret-project loop.

Track: The tracking system reports where objects are on the counter.
See: The vision model interprets the camera view to understand the current scene.
Decide: The agent program advances the recipe state and picks the next instruction.
Project: The relevant step, timer or highlight is projected onto the workspace.
Speak: Audio feedback reinforces the visual cue for hands-busy moments.
Advance: As you finish a step, the loop repeats with the next one.
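
The six stages above can be sketched as one loop. This is a minimal outline under stated assumptions: every stage is passed in as a plain callable, so `track`, `see`, `project` and `speak` are hypothetical stand-ins for the real tracking, vision, projection and audio components, and `see` is assumed to answer one question only: does the scene show the current step as finished?

```python
from typing import Callable, Dict, List, Tuple

Positions = Dict[str, Tuple[float, float]]  # object name -> (x, y) on the counter


def run_loop(
    steps: List[str],
    track: Callable[[], Positions],
    see: Callable[[str, Positions], bool],
    project: Callable[[str, Positions], None],
    speak: Callable[[str], None],
    max_ticks: int = 1000,
) -> int:
    """Run the sense-interpret-project loop; return how many steps completed."""
    index = 0          # recipe state: which step we are on
    spoken = -1        # last step index that was announced aloud
    for _ in range(max_ticks):
        if index >= len(steps):
            break
        positions = track()                 # Track: where are the objects?
        finished = see(steps[index], positions)  # See: is this step complete?
        step = steps[index]                 # Decide: instruction to show now
        project(step, positions)            # Project: draw it on the counter
        if spoken != index:                 # Speak: announce each step once
            speak(step)
            spoken = index
        if finished:                        # Advance: move on to the next step
            index += 1
    return index


# Usage with scripted stand-ins instead of real hardware.
projected: List[str] = []
announced: List[str] = []
answers = iter([False, True, False, True])  # scripted "is it finished?" replies

completed = run_loop(
    steps=["Crack two eggs", "Whisk until smooth"],
    track=lambda: {"bowl": (0.4, 0.2)},
    see=lambda step, positions: next(answers),
    project=lambda step, positions: projected.append(step),
    speak=announced.append,
)
```

Passing the stages in as callables keeps the loop testable without a projector or camera attached. The `max_ticks` cap is only there so the sketch cannot spin forever if a step never completes.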

Why it works

What projecting the recipe actually changes

The interesting claim behind this interface is that moving the recipe off a screen and onto the counter measurably helps people cook:

Fewer interruptions: the next step is already in your field of view, so you stop less often to check what's next.
More confidence: seeing guidance anchored to your real ingredients makes it easier to trust you're doing the right thing.
Better focus: without the read-look-away cycle, attention stays on the food instead of the device.

In my rebuild I concentrated on the tracking-to-projection loop, then layered the vision model on top so the interface could respond to the real scene rather than a fixed sequence.

Reflection

What rebuilding it taught me

The best interface disappears. Projecting into the workspace beats an app on a phone or tablet because there is no device to pick up, unlock or glance away to.
Vision models are glue, not magic. Used as a perception layer, an LLM that can see turns a rigid script into something that reacts to reality.
Alignment is everything. A step projected an inch off from the bowl it describes breaks the illusion instantly.
Multimodal feedback covers blind spots. Audio carries the message in the moments your eyes are on the knife, not the counter.
