Interactive Projector-Lamp Robot

What it is

A lamp you can talk to with your hands

Most desk robots are arms that grab things. This one is an arm that looks and shows. Its end effector isn't a gripper — it's a small projector, so the robot's job is to point a beam of useful information at exactly the right spot on the desk in front of you.

The interaction is deliberately physical. You don't type or click; you point. A camera watches the table, detects which object your finger is aimed at, and the system picks a mode — explain it, do the homework on it, generate a picture of it, or draw on the surface near it. The arm then orients the projector to display the result.

The core idea I wanted to explore: a robot becomes far more approachable when the interface is the room itself. No screen, no keyboard, just a pointing gesture and a projected reply. Building that meant wiring a perception pipeline straight into arm motion.

The stack

Tools under the hood

Each piece of this was new to me. Here is what each one actually does in the system.

middleware

ROS

The messaging bus. Vision nodes, the mode selector and the motion node all publish and subscribe to topics, so each stays a small independent program.

perception

Pointing detection

Computer vision that locates the hand, follows the finger ray and decides which object on the table is being pointed at.
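One way to picture the geometry, as a rough sketch rather than the code that runs on the robot: treat the finger as a ray from knuckle to fingertip, extend it until it meets the table plane, and pick the known object closest to that hit point. The names (`TableObject`, `ray_hits_table`, `pointed_object`), the flat table at z = 0 and the 10 cm tolerance are all assumptions made for illustration; the 3D hand keypoints would come from whatever hand tracker the vision node uses.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


@dataclass(frozen=True)
class TableObject:
    """Hypothetical record for an object already detected on the table."""
    name: str
    x: float  # metres, in the table plane
    y: float


def ray_hits_table(knuckle: Vec3, fingertip: Vec3,
                   table_z: float = 0.0) -> Optional[Point2]:
    """Extend the knuckle -> fingertip ray until it meets the plane z = table_z."""
    dx, dy, dz = (f - k for f, k in zip(fingertip, knuckle))
    if abs(dz) < 1e-9:
        return None  # finger is parallel to the table, no intersection
    t = (table_z - fingertip[2]) / dz
    if t < 0:
        return None  # finger points away from the table
    return (fingertip[0] + t * dx, fingertip[1] + t * dy)


def pointed_object(hit: Optional[Point2], objects: Sequence[TableObject],
                   max_dist: float = 0.10) -> Optional[TableObject]:
    """Return the object nearest the hit point, or None if nothing is close enough."""
    if hit is None or not objects:
        return None
    nearest = min(objects, key=lambda o: hypot(o.x - hit[0], o.y - hit[1]))
    if hypot(nearest.x - hit[0], nearest.y - hit[1]) > max_dist:
        return None
    return nearest
```

The distance cutoff matters in practice: without it, a finger waved at an empty patch of desk would still select whichever object happens to be least far away.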

motion

6-axis arm control

Turns "aim the projector here" into safe joint angles, broadcasting markers so the arm and target stay in one shared coordinate frame.
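To make the shared-frame idea concrete, here is a minimal sketch of moving a point from the camera's frame into the arm's frame with a homogeneous transform. It uses plain Python lists instead of a ROS transform library, and the mounting pose (a yaw rotation plus an offset) is invented for the example; `make_transform` and `apply_transform` are hypothetical helper names.

```python
from math import cos, sin, pi
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Matrix4 = List[List[float]]


def make_transform(yaw: float, tx: float, ty: float, tz: float) -> Matrix4:
    """4x4 transform: rotate by `yaw` radians about z, then translate."""
    c, s = cos(yaw), sin(yaw)
    return [
        [c, -s, 0.0, tx],
        [s,  c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_transform(transform: Matrix4, point: Vec3) -> Vec3:
    """Map a point expressed in the source frame into the target frame."""
    v = (point[0], point[1], point[2], 1.0)  # homogeneous coordinates
    x, y, z = (sum(transform[row][col] * v[col] for col in range(4))
               for row in range(3))
    return (x, y, z)


# Assumed mounting: camera rotated 90 degrees about z and offset from the arm base.
ARM_FROM_CAMERA = make_transform(pi / 2, 0.5, 0.0, 0.2)

# A target the camera sees 1 m along its own x axis, expressed in the arm frame.
target_in_arm = apply_transform(ARM_FROM_CAMERA, (1.0, 0.0, 0.0))
```

With these made-up numbers the target lands at roughly (0.5, 1.0, 0.2) in the arm frame. If the transform is wrong by even a few degrees, the projected image drifts off the object, which is why the calibration between the two frames deserves care.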

output

Mini projector

The end effector. Instead of grabbing, the robot projects images, text and answers directly onto the desk surface.

generation

Image + answer generation

A generative layer that produces a picture or response from what was pointed at, ready to be projected back.

modes

Mode switcher

Custom ROS messages signal the active mode — think, do homework, generate image, draw, open links — so the right behaviour fires.
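As an illustration of the pattern, not the project's real message definitions, the sketch below stands in for a mode message with a Python dataclass and routes it to a handler. In ROS the message would be declared in a `.msg` file; the field names, the `ModeMsg` type and the handler strings here are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Mode(Enum):
    """The five modes named above."""
    THINK = "think"
    HOMEWORK = "homework"
    GENERATE_IMAGE = "generate_image"
    DRAW = "draw"
    OPEN_LINKS = "open_links"


@dataclass(frozen=True)
class ModeMsg:
    """Hypothetical stand-in for a custom mode message."""
    mode: Mode
    target: str  # name of the object being pointed at


# Each handler returns a description of what the mode would do; a real node
# would call the reasoning or generation layer and publish the result instead.
HANDLERS: Dict[Mode, Callable[[str], str]] = {
    Mode.THINK: lambda target: f"explain {target}",
    Mode.HOMEWORK: lambda target: f"solve the problem on {target}",
    Mode.GENERATE_IMAGE: lambda target: f"generate an image of {target}",
    Mode.DRAW: lambda target: f"draw near {target}",
    Mode.OPEN_LINKS: lambda target: f"open links about {target}",
}


def dispatch(msg: ModeMsg) -> str:
    """Fire the behaviour that matches the active mode."""
    return HANDLERS[msg.mode](msg.target)
```

Keeping the mode as an enum rather than a free-form string means a typo in one node fails loudly instead of silently selecting no behaviour.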

Architecture

From a finger to a projected answer

The behaviour is a chain of small ROS nodes. Each does one job and hands off to the next over a topic.

Publish mode: a vision node reads the scene and decides which interaction mode the user is asking for.
Detect pointing: follows the finger to work out which object on the table is the target.
Open the mode: activates the chosen behaviour and gathers whatever input it needs.
Process the image: generates or processes the visual that will be shown back to the user.
Broadcast marker: places the target in the arm's coordinate frame so motion is aimed accurately.
Move the robot: drives the arm to orient the projector and display the result on the desk.
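The hand-off between those steps can be mimicked without ROS at all. The sketch below is a toy, synchronous topic bus in plain Python, not rospy, with each step subscribing to one topic and publishing to the next. The topic names (`/mode`, `/target`, `/request`, `/image`, `/marker`) and the `trace` field are invented for the example.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Message = Dict[str, Any]
Callback = Callable[[Message], None]


class TopicBus:
    """Toy stand-in for a publish/subscribe bus; delivery is immediate and in-process."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Message) -> None:
        for callback in list(self._subscribers[topic]):
            callback(message)


def make_stage(bus: TopicBus, name: str, out_topic: str) -> Callback:
    """Build a node that records its name on the message and passes it on."""
    def stage(message: Message) -> None:
        trace = list(message["trace"]) + [name]
        bus.publish(out_topic, {**message, "trace": trace})
    return stage


# (node name, topic it listens on, topic it publishes to), all hypothetical.
STAGES = [
    ("detect_pointing", "/mode", "/target"),
    ("open_mode", "/target", "/request"),
    ("process_image", "/request", "/image"),
    ("broadcast_marker", "/image", "/marker"),
]


def build_pipeline(bus: TopicBus) -> List[Message]:
    """Wire the stages together; returns the list the final node appends to."""
    for name, in_topic, out_topic in STAGES:
        bus.subscribe(in_topic, make_stage(bus, name, out_topic))
    arrived: List[Message] = []

    def move_robot(message: Message) -> None:
        arrived.append({**message, "trace": list(message["trace"]) + ["move_robot"]})

    bus.subscribe("/marker", move_robot)
    return arrived
```

Publishing `{"mode": "think", "trace": ["publish_mode"]}` on `/mode` walks the message through every stage in order. No stage knows about any other; each only knows its input and output topic, which is what lets the real nodes stay small independent programs.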

How it runs

One launch, many behaviours

The whole system comes up from a single launch description that starts every node at once. From there the modes are interactive:

Explain (the "think" mode): point at an object and the robot reasons about it, then projects a short answer.
Homework: point at a problem on paper and have it worked through and shown back.
Generate: ask for an image of what you pointed at; the generative layer produces it and the projector displays it.
Draw / open: sketch onto the surface near the target, or open related links — the arm repositions for each.

In my rebuild I leaned on the pointing-to-motion path: detect the gesture, place a marker in the arm frame, and let one roslaunch bring the perception and control nodes up together.
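The last hop of that path, going from a marker position to an aim direction, can be sketched as below. This deliberately reduces the problem to a pan and a tilt angle with simple limit clamping; a real 6-axis arm would go through inverse kinematics and a motion planner. The function names and the joint limits are assumptions chosen for the example.

```python
from math import atan2, degrees, hypot
from typing import Tuple

Vec3 = Tuple[float, float, float]
Limits = Tuple[float, float]


def aim_angles(projector: Vec3, target: Vec3) -> Tuple[float, float]:
    """Pan and tilt (degrees) that point from the projector at the target.

    Both positions are in the arm frame. Tilt is negative when the target
    is below the projector.
    """
    dx = target[0] - projector[0]
    dy = target[1] - projector[1]
    dz = target[2] - projector[2]
    pan = degrees(atan2(dy, dx))
    tilt = degrees(atan2(dz, hypot(dx, dy)))
    return pan, tilt


def clamp(value: float, low: float, high: float) -> float:
    """Keep a value inside [low, high]."""
    return max(low, min(high, value))


def safe_aim(projector: Vec3, target: Vec3,
             pan_limits: Limits = (-90.0, 90.0),
             tilt_limits: Limits = (-80.0, 10.0)) -> Tuple[float, float]:
    """Aim at the target, but never command angles outside the assumed limits."""
    pan, tilt = aim_angles(projector, target)
    return clamp(pan, *pan_limits), clamp(tilt, *tilt_limits)
```

For a projector 0.5 m above the desk and a target 0.5 m in front of it, `safe_aim((0.0, 0.0, 0.5), (0.5, 0.0, 0.0))` gives a pan of 0 degrees and a tilt of about -45 degrees. A target behind the arm is clamped to the pan limit rather than swinging the projector round.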

Reflection

What rebuilding it taught me

The interface can be the desk. Swapping a gripper for a projector reframes the whole robot — output becomes light on a surface, and pointing becomes the input.
Custom messages are how intent travels. Defining my own ROS message types for mode, pointing target and URL made the pipeline readable — each node speaks a tiny, purpose-built language.
Coordinate frames are everything. Aiming a beam accurately means the camera's world and the arm's world have to agree, which is why a marker broadcaster sits in the middle.
Generative output and robotics mix well. The arm doesn't need to be smart; it just needs to point a smart, generated result at the right place.
