HoloVinyl — Built From Scratch

What it is

A record sleeve that comes alive

Anything drawn on top of the record has to stick to it as it tilts and rotates — if the overlay slides, the illusion breaks. So before any visual flourish, HoloVinyl solves a pure vision problem: find the sleeve in the photo, confirm which album it is, and recover the homography that says exactly how it's positioned. That same matrix then warps the overlay into correct perspective so it lands glued to the surface.

The engine is real and runs in holovinyl/recognize.py. Album art is copyrighted, so the database is six original abstract covers, generated procedurally and committed to the repo. Queries are synthesized by photograph-style augmentation (perspective warp, cluttered background, brightness jitter, and blur), so recognition is tested on deliberately degraded inputs, not pristine scans.

100% match accuracy across 72 photograph-style queries (12 per cover), with a mean of 201 RANSAC inliers and 0 false positives on 24 random non-cover images. These are real numbers from demo.py.

The pipeline

From a sleeve in frame to an overlay on it

Every step below is real OpenCV in holovinyl/recognize.py and overlay.py.

cv2.ORB

ORB features

Detect up to 1500 keypoints and their binary descriptors on the query photo.

BFMatcher

Descriptor matching

kNN-match query descriptors against each enrolled cover with a Hamming brute-force matcher.
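To show what a Hamming brute-force kNN match actually computes, here is a small NumPy sketch. It is an illustration only: in OpenCV the same job is normally done with cv2.BFMatcher(cv2.NORM_HAMMING) and knnMatch(..., k=2), and the helper name hamming_knn is hypothetical. This version builds the full distance table, so it is only sensible for small arrays.

```python
import numpy as np

def hamming_knn(query_desc, cover_desc, k=2):
    """For each query descriptor, return the k nearest cover indices and distances."""
    # XOR marks the differing bits; unpackbits + sum counts them (Hamming distance).
    xor = query_desc[:, None, :] ^ cover_desc[None, :, :]
    dist = np.unpackbits(xor, axis=2).sum(axis=2)
    order = np.argsort(dist, axis=1)[:, :k]
    return order, np.take_along_axis(dist, order, axis=1)

# Toy 2-byte descriptors (real ORB descriptors are 32 bytes).
query = np.array([[0b00001111, 0b00000000]], dtype=np.uint8)
covers = np.array([[0b00001111, 0b00000001],   # 1 bit away from the query
                   [0b11110000, 0b00000000],   # 8 bits away
                   [0b00001110, 0b00000011]],  # 3 bits away
                  dtype=np.uint8)

idx, dist = hamming_knn(query, covers)
print(idx.tolist(), dist.tolist())  # [[0, 2]] [[1, 3]]
```

Asking for the two nearest neighbours is what makes the ratio test possible: the second distance is the yardstick for the first.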

ratio test

Lowe ratio test

Keep a match only when the best neighbour's distance is below 0.75× the second-best neighbour's distance, dropping ambiguous pairs.
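The test itself is a one-line comparison. The sketch below works on plain (best, second) distance pairs so that it runs without OpenCV; the function name ratio_test and the sample distances are invented for illustration.

```python
def ratio_test(knn_distances, ratio=0.75):
    """Return indices of matches whose best distance is < ratio * second-best."""
    kept = []
    for i, pair in enumerate(knn_distances):
        if len(pair) < 2:
            continue  # no second neighbour to compare against
        best, second = pair[0], pair[1]
        if best < ratio * second:
            kept.append(i)
    return kept

# Made-up Hamming distances: (best, second-best) per query descriptor.
pairs = [(20, 60), (40, 45), (30, 40), (55,)]
print(ratio_test(pairs))  # [0]
```

Only the first pair survives: 20 is well under 0.75 × 60, while 40 vs 45 is ambiguous and 30 vs 40 sits exactly on the boundary, which the strict comparison rejects.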

findHomography

RANSAC homography

Fit a cover→frame transform robustly and count geometric inliers; the cover with the most inliers wins, and it needs at least 15 to be accepted.

warpPerspective

Pose → overlay

Push an overlay through that same homography so it lands warped exactly onto the tilted sleeve.

reject

No-match guard

Too few inliers ⇒ no match, so random clutter is never hallucinated into an album.
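The guard can be as small as a threshold on the winning inlier count. In this sketch the function name pick_cover, the cover names and the inlier counts are all invented; the threshold of 15 is the one quoted in the pipeline.

```python
MIN_INLIERS = 15  # acceptance threshold from the RANSAC step

def pick_cover(inliers_by_cover, min_inliers=MIN_INLIERS):
    """Return the best-scoring cover name, or None when nothing clears the threshold."""
    if not inliers_by_cover:
        return None
    name, count = max(inliers_by_cover.items(), key=lambda item: item[1])
    return name if count >= min_inliers else None

print(pick_cover({"cover_a": 180, "cover_b": 9}))  # cover_a
print(pick_cover({"cover_a": 7, "cover_b": 4}))    # None
```

Returning None instead of the "least bad" cover is what keeps random clutter from being labelled as an album.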

Architecture

How a photo becomes a pose

One pass of the recognition pipeline, exactly as the Python core runs it:

Features: ORB keypoints + binary descriptors on the query image.
Match + ratio test: kNN-match against every enrolled cover; keep only confident pairs via Lowe's 0.75 ratio test.
RANSAC homography: Fit a cover→frame homography and count inliers; the cover with the most inliers (≥15) is the match.
Project the pose: Map the cover's corners through the homography to get the sleeve's outline in the photo.
Warp the overlay: Push the overlay through the same matrix so it sits in correct perspective on the sleeve.
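The "project the pose" step is plain homogeneous-coordinate arithmetic. The NumPy sketch below spells it out; the function name project_corners and the example matrix are assumptions, and cv2.perspectiveTransform performs the same mapping in OpenCV.

```python
import numpy as np

def project_corners(H, width, height):
    """Map the four corners of a width x height cover through homography H."""
    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]], dtype=float)
    homogeneous = np.hstack([corners, np.ones((4, 1))])  # rows of (x, y, 1)
    mapped = homogeneous @ H.T                           # rows of (x', y', w')
    return mapped[:, :2] / mapped[:, 2:3]                # divide by w'

# Example matrix (made up): scale by 2, then shift by (10, 20).
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 20.0],
              [0.0, 0.0, 1.0]])
outline = project_corners(H, 100, 50)
print(outline.tolist())
# [[10.0, 20.0], [210.0, 20.0], [210.0, 120.0], [10.0, 120.0]]
```

The division by the third coordinate is what produces perspective foreshortening once the bottom row of H is no longer (0, 0, 1).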

Try it

The overlay step, live in the browser

Drag the four corners of the sleeve to tilt it. The disc overlay is warped onto the quad through a homography computed in JavaScript — the same cv2.warpPerspective idea the Python engine uses, just rendered here so you can feel it. The recognition that finds the sleeve and recovers this pose from a real photo is the OpenCV core in the repo.
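Four corner correspondences determine a homography exactly, which is why dragging four handles is enough. The sketch below shows the underlying linear solve in Python (the page itself does this in JavaScript, and its code may be structured differently); the function name homography_from_quad and the handle positions are assumptions.

```python
import numpy as np

def homography_from_quad(src, dst):
    """Solve for H (with H[2, 2] fixed to 1) mapping four src points onto four dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

# Overlay corners and four dragged handle positions (made-up values).
src = [(0, 0), (200, 0), (200, 200), (0, 200)]
dst = [(50, 40), (260, 70), (240, 250), (30, 220)]
H = homography_from_quad(src, dst)

# Sanity check: the third overlay corner should land on the third handle.
p = H @ np.array([200.0, 200.0, 1.0])
print(p[:2] / p[2])
```

Eight unknowns and eight equations (two per corner) give a square system, so no least-squares or RANSAC is needed when the four points are known exactly.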

Reflection

What rebuilding it taught me

AR is a tracking problem wearing a costume. The visuals are the easy part; the believable part is knowing exactly where the object is, every frame.
Homographies are the bridge. One matrix maps a known flat image to its place in the scene — and unlocks the entire overlay.
Jitter kills immersion. A pose that wobbles frame to frame reads as fake instantly; smoothing the tracking is half the craft.
Feature matching has to be robust. Glare, motion blur and partial occlusion are where naive recognition falls apart.
Latency is the experience. A correct overlay that arrives a beat late still feels broken — real-time isn't optional here.
