From-Scratch Build · Interactive Robotics

Interactive Projector-Lamp Robot

A desk lamp that thinks. A six-axis robot arm carries a tiny projector instead of a bulb — point your finger at something on the table and it watches, understands what you want, and projects an answer back onto the surface. Rebuilt from scratch to learn how vision, language and motion stitch together.

ROS · Computer Vision · 6-axis arm · Mini projector · Generative AI

What it is

A lamp you can talk to with your hands

Most desk robots are arms that grab things. This one is an arm that looks and shows. Its end effector isn't a gripper — it's a small projector, so the robot's job is to point a beam of useful information at exactly the right spot on the desk in front of you.

The interaction is deliberately physical. You don't type or click; you point. A camera watches the table, detects which object your finger is aimed at, and the system picks a mode — explain it, do the homework on it, generate a picture of it, or draw on the surface near it. The arm then orients the projector to display the result.

The core idea I wanted to learn: a robot becomes far more approachable when the interface is the room itself. No screen, no keyboard — just a pointing gesture and a projected reply. Building that meant wiring a perception pipeline straight into arm motion.

The stack

Tools under the hood

Each piece of this was new to me. Here is what each one actually does in the system.

middleware

ROS

The messaging bus. Vision nodes, the mode selector and the motion node all publish and subscribe to topics, so each stays a small independent program.

perception

Pointing detection

Computer vision that locates the hand, follows the finger ray and decides which object on the table is being pointed at.
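One way to picture the "follow the finger ray" step is as a line-plane intersection. The sketch below is not the project's code: it assumes 3D positions for a knuckle and the fingertip are already available in a table-aligned frame (for example from a depth camera), that the table is the plane z = 0, and that object positions are known. The names `TableObject`, `ray_table_hit`, `pointed_object` and the 10 cm threshold are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point3 = Tuple[float, float, float]


@dataclass(frozen=True)
class TableObject:
    """Hypothetical detected object, with its position on the table in metres."""
    name: str
    x: float
    y: float


def ray_table_hit(knuckle: Point3, fingertip: Point3,
                  table_z: float = 0.0) -> Optional[Tuple[float, float]]:
    """Extend the knuckle-to-fingertip ray until it meets the plane z = table_z."""
    dx, dy, dz = (f - k for f, k in zip(fingertip, knuckle))
    if dz >= 0:
        return None  # finger is level or pointing upward: it never reaches the table
    t = (table_z - fingertip[2]) / dz
    if t < 0:
        return None  # fingertip is already below the table plane
    return (fingertip[0] + t * dx, fingertip[1] + t * dy)


def pointed_object(knuckle: Point3, fingertip: Point3,
                   objects: Sequence[TableObject],
                   max_dist: float = 0.10) -> Optional[TableObject]:
    """Return the object closest to where the finger ray lands, if close enough."""
    hit = ray_table_hit(knuckle, fingertip)
    if hit is None or not objects:
        return None

    def dist(obj: TableObject) -> float:
        return math.hypot(obj.x - hit[0], obj.y - hit[1])

    nearest = min(objects, key=dist)
    return nearest if dist(nearest) <= max_dist else None
```

The distance threshold is there so that pointing at an empty patch of table selects nothing rather than snapping to whatever object happens to be nearest.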

motion

6-axis arm control

Turns "aim the projector here" into safe joint angles, broadcasting markers so the arm and target stay in one shared coordinate frame.
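The shared-frame idea can be sketched with plain geometry. The example below assumes the camera-to-arm-base transform is known as a 4x4 homogeneous matrix and reduces aiming to a pan and a tilt angle. It is not inverse kinematics for a six-axis arm, and it does no joint-limit or collision checking; the function names and the example transform are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point3 = Tuple[float, float, float]


def transform_point(T: Sequence[Sequence[float]], p: Point3) -> Point3:
    """Apply a 4x4 homogeneous transform (nested rows) to a 3D point."""
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
                 for r in range(3))


def aim_angles(projector_pos: Point3, target: Point3) -> Tuple[float, float]:
    """Pan about the base z-axis and tilt below horizontal, both in radians."""
    dx = target[0] - projector_pos[0]
    dy = target[1] - projector_pos[1]
    dz = target[2] - projector_pos[2]
    pan = math.atan2(dy, dx)
    tilt = math.atan2(-dz, math.hypot(dx, dy))  # positive means aiming downward
    return pan, tilt


# Assumed example: a camera 0.8 m above the desk, looking straight down,
# offset 0.2 m along x from the arm base (rotation of 180 degrees about x).
CAMERA_TO_BASE = [
    [1.0, 0.0, 0.0, 0.2],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.8],
    [0.0, 0.0, 0.0, 1.0],
]

target_in_base = transform_point(CAMERA_TO_BASE, (0.1, 0.0, 0.8))
pan, tilt = aim_angles((0.0, 0.0, 0.3), target_in_base)
```

Expressing the target in the arm's base frame first keeps the aiming math independent of where the camera is mounted: only the transform changes if the camera moves.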

output

Mini projector

The end effector. Instead of grabbing, the robot projects images, text and answers directly onto the desk surface.

generation

Image + answer generation

A generative layer that produces a picture or response from what was pointed at, ready to be projected back.

modes

Mode switcher

Custom ROS messages signal the active mode — think, do homework, generate image, draw, open links — so the right behaviour fires.
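As a rough illustration of how a mode signal might select a behaviour, here is a small dispatch sketch in plain Python. In the real system the mode travels in a custom ROS message whose definition is not shown in this write-up, so the string values, the `Mode` enum and the handler functions below are all assumptions made for the example.

```python
from enum import Enum
from typing import Callable, Dict


class Mode(Enum):
    """The five modes named above; the string values are placeholders."""
    THINK = "think"
    HOMEWORK = "do_homework"
    GENERATE_IMAGE = "generate_image"
    DRAW = "draw"
    OPEN_LINKS = "open_links"


Handler = Callable[[str], str]


def make_dispatcher(handlers: Dict[Mode, Handler]) -> Callable[[str, str], str]:
    """Build a function that routes a raw mode string to its handler."""
    def dispatch(raw_mode: str, target: str) -> str:
        try:
            mode = Mode(raw_mode)
        except ValueError:
            return f"ignored unknown mode {raw_mode!r}"
        handler = handlers.get(mode)
        if handler is None:
            return f"no handler registered for {mode.value}"
        return handler(target)
    return dispatch


# Placeholder behaviours standing in for the real generation and drawing code.
dispatch = make_dispatcher({
    Mode.THINK: lambda target: f"explain {target}",
    Mode.GENERATE_IMAGE: lambda target: f"generate a picture of {target}",
})
```

Rejecting unknown mode strings at the boundary means a malformed message is ignored rather than triggering the wrong behaviour.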

Architecture

From a finger to a projected answer

The behaviour is a chain of small ROS nodes. Each does one job and hands off to the next over a topic.

  1. Publish mode

    A vision node reads the scene and decides which interaction mode the user is asking for.

  2. Detect pointing

    Follows the finger to work out which object on the table is the target.

  3. Open the mode

    Activates the chosen behaviour and gathers whatever input it needs.

  4. Process the image

    Generates or processes the visual that will be shown back to the user.

  5. Broadcast marker

    Places the target in the arm's coordinate frame so motion is aimed accurately.

  6. Move the robot

    Drives the arm to orient the projector and display the result on the desk.
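The hand-off pattern in the six steps above can be mimicked without ROS. The sketch below uses a tiny in-process publish/subscribe bus as a stand-in for ROS topics, purely to show how each node only knows its input and output topic. The `TopicBus` class, the topic names and the list-based messages are hypothetical; the real nodes would use ROS publishers and subscribers and run as separate processes.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Optional

Message = List[str]


class TopicBus:
    """Minimal synchronous stand-in for a topic-based message bus."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Message], None]]] = \
            defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Message], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, msg: Message) -> None:
        # Deliver to a copy of the list so callbacks may subscribe safely.
        for callback in list(self._subscribers[topic]):
            callback(msg)


def wire_nodes(bus: TopicBus, trace: List[str]) -> None:
    """Register one stand-in node per step; each forwards to the next topic."""

    def node(name: str, in_topic: str, out_topic: Optional[str] = None) -> None:
        def callback(msg: Message) -> None:
            trace.append(name)  # record that this node ran
            if out_topic is not None:
                bus.publish(out_topic, msg + [name])
        bus.subscribe(in_topic, callback)

    node("detect_pointing", "/mode", "/target")
    node("open_mode", "/target", "/request")
    node("process_image", "/request", "/image")
    node("broadcast_marker", "/image", "/marker")
    node("move_robot", "/marker")  # last node: nothing further to publish


bus = TopicBus()
trace: List[str] = []
wire_nodes(bus, trace)
bus.publish("/mode", ["publish_mode"])  # step 1 kicks off the chain
```

Because each node depends only on topic names, any one of them can be replaced or tested in isolation, which is the property the chain of small nodes is after.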

How it runs

One launch, many behaviours

The whole system comes up from a single launch description that starts every node at once. From there, the modes are interactive.

In my rebuild I focused on the pointing-to-motion path: detect the gesture, place a marker in the arm frame, and let a single roslaunch file bring the perception and control nodes up together.

Reflection

What rebuilding it taught me