Reel — Built From Scratch

What it is

Discovery by closeness, not category

Reel learns a vector for every film by factorising the user-by-film ratings matrix with stochastic gradient descent — written from scratch in NumPy, no ML library doing the fitting. Films that people rate the same way land near each other, so discovery becomes geometry: name a film you love and Reel returns its nearest neighbours in taste-space. The matches cut across genres, because closeness captures the feel a label can't.

Trained on MovieLens latest-small — 610 users, 9,724 films, 100,836 ratings. The honest catch I hit: a rating-accurate model is a poor ranker, so top-N recommendations use a second set of embeddings trained on a ranking objective (BPR). Every number on this page is from a real run.

cos θ

cosine similarity between learned film embeddings — the single measure of "these feel alike" that powers every "films like X" lookup.
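A minimal sketch of that lookup, assuming the learned film embeddings sit in a NumPy array Q with one row per film. The function name similar_items and its arguments are hypothetical, chosen for illustration rather than taken from the project's code.

```python
import numpy as np

def similar_items(Q, item, k=5):
    """Return the k films closest to `item` by cosine similarity."""
    Q = np.asarray(Q, dtype=float)
    # Normalise each row to unit length so a dot product equals cos(theta).
    norms = np.linalg.norm(Q, axis=1, keepdims=True)
    unit = Q / np.maximum(norms, 1e-12)  # guard against all-zero rows
    sims = unit @ unit[item]
    sims[item] = -np.inf  # never return the query film itself
    top = np.argsort(-sims)[:k]
    return [(int(j), float(sims[j])) for j in top]
```

Because the rows are normalised first, a film's vector length does not affect the ranking; only its direction does.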

The stack

From a film you love to ones you'll love

Two from-scratch SGD models, each doing the job it's good at.

data

MovieLens ratings

A 610 × 9,724 user-by-film matrix with 100,836 observed ratings, fetched on demand. A labelled synthetic fallback backs the tests.

factorise

Rating MF (SGD)

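r̂ = μ + b_u + b_i + p_u·q_i, fit by minimising L2-regularised squared error over the observed ratings. Here μ is the global mean rating, b_u and b_i are user and film biases, and p_u and q_i are the latent vectors. Hand-written SGD updates in NumPy.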
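One epoch of those updates could look like the sketch below. It assumes ratings arrive as rows of (user index, film index, rating) with dense indices. The learning rate, regularisation strength and the name sgd_epoch are illustrative assumptions, not the values or names used in the actual run.

```python
import numpy as np

def sgd_epoch(ratings, mu, b_u, b_i, P, Q, lr=0.01, reg=0.05, rng=None):
    """One shuffled SGD pass over observed ratings; updates arrays in place."""
    if rng is None:
        rng = np.random.default_rng(0)
    for idx in rng.permutation(len(ratings)):
        u, i, r = ratings[idx]
        u, i = int(u), int(i)
        # Prediction error for r_hat = mu + b_u + b_i + p_u . q_i
        err = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
        # Gradient steps on the L2-regularised squared error.
        b_u[u] += lr * (err - reg * b_u[u])
        b_i[i] += lr * (err - reg * b_i[i])
        p_old = P[u].copy()  # use the pre-update user vector for q_i's step
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * p_old - reg * Q[i])
```

Calling sgd_epoch repeatedly with the same arrays trains the model; μ stays fixed at the mean of the training ratings.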

similar

Cosine neighbours

"Films like X" = nearest item embeddings q_i by cosine. The matches cross genres.

rank

BPR ranking model

A second embedding set trained to put liked films above unliked ones — what powers top-N.
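For illustration, here is a sketch of the standard BPR update under simplifying assumptions: no bias terms, one uniformly sampled unliked film per liked one, and every user has at least one film they have not liked. The names bpr_step and train_bpr and all hyperparameters are hypothetical, not the project's.

```python
import numpy as np

def bpr_step(P, Q, u, i, j, lr=0.05, reg=0.01):
    """One BPR update: push liked film i above unliked film j for user u."""
    p, qi, qj = P[u].copy(), Q[i].copy(), Q[j].copy()
    x = p @ (qi - qj)             # score margin between i and j
    g = 1.0 / (1.0 + np.exp(x))   # sigmoid(-x): gradient of ln sigmoid(x)
    P[u] += lr * (g * (qi - qj) - reg * p)
    Q[i] += lr * (g * p - reg * qi)
    Q[j] += lr * (-g * p - reg * qj)

def train_bpr(positives, n_users, n_items, dim=8, epochs=20, seed=0):
    """positives maps each user index to the set of film indices they liked."""
    rng = np.random.default_rng(seed)
    P = rng.normal(0, 0.1, (n_users, dim))
    Q = rng.normal(0, 0.1, (n_items, dim))
    pairs = [(u, i) for u, items in positives.items() for i in items]
    for _ in range(epochs):
        for idx in rng.permutation(len(pairs)):
            u, i = pairs[idx]
            # Resample until we draw a film the user has not liked.
            j = int(rng.integers(n_items))
            while j in positives[u]:
                j = int(rng.integers(n_items))
            bpr_step(P, Q, u, i, j)
    return P, Q
```

The loss only looks at the difference between two scores, so the model is rewarded for getting the order right rather than for predicting a rating value.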

recommend

Top-N for a user

Score every unseen film with the ranking model, return the highest.
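As a sketch, assuming the ranking model's user and film embeddings are NumPy arrays P and Q and that the set of films a user has already rated is available. The function name recommend_top_n is hypothetical.

```python
import numpy as np

def recommend_top_n(P, Q, user, seen, n=10):
    """Rank all films by dot-product score and drop those already seen."""
    scores = Q @ P[user]            # one score per film
    seen = set(seen)
    ranked = np.argsort(-scores)    # highest score first
    return [int(i) for i in ranked if int(i) not in seen][:n]
```

A full sort is cheap at this catalogue size; a partial sort would only start to matter with far more films.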

evaluate

RMSE + Recall@K

Held-out RMSE for the MF; Recall@10 for top-N, measured against a popularity baseline.
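Both metrics are short to write down. The sketch below assumes one common definition of Recall@K: the share of a user's held-out films that appear in their top K, averaged over users with at least one held-out film. The page does not spell out which variant the run uses, so treat this definition and the function names as assumptions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def recall_at_k(recommended, held_out, k=10):
    """recommended: user -> ranked film list; held_out: user -> set of films."""
    recalls = []
    for user, relevant in held_out.items():
        if not relevant:
            continue  # users with nothing held out do not count
        hits = len(set(recommended.get(user, [])[:k]) & set(relevant))
        recalls.append(hits / len(relevant))
    return float(np.mean(recalls)) if recalls else 0.0
```

Under this definition, a popularity baseline is scored by passing the same most-rated list (minus seen films) for every user.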

Architecture

How a film is discovered

The pipeline, end to end:

Load
Pull MovieLens ratings; remap sparse user/film ids to dense indices.
Factorise
SGD over observed ratings learns biases + a latent vector for every user and film.
Similar
"Films like X" = cosine nearest neighbours among the learned film vectors.
Rank
A second BPR-trained embedding set scores unseen films per user for top-N.
Evaluate
Held-out RMSE for the MF; Recall@10 for top-N vs a popularity baseline.
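Two small pieces of the pipeline above, sketched under assumptions: the id remapping from the Load step, and a seeded random split to produce the held-out ratings used in Evaluate. Splitting per rating rather than per user is an assumption, and to_dense and split_indices are hypothetical names.

```python
import numpy as np

def to_dense(raw_ids):
    """Map sparse ids (e.g. MovieLens movieId) to indices 0..n-1."""
    raw_ids = np.asarray(raw_ids)
    # originals[dense] reconstructs raw_ids, so the mapping is reversible.
    originals, dense = np.unique(raw_ids, return_inverse=True)
    return dense, originals

def split_indices(n_ratings, test_frac=0.2, seed=0):
    """Seeded random split of rating row indices into train and test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_ratings)
    n_test = int(round(n_ratings * test_frac))
    return order[n_test:], order[:n_test]
```

Keeping the originals array around is what lets a dense film index be turned back into a title when showing results.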

Results

Real numbers from a real run

From python demo.py on MovieLens latest-small, with an 80/20 train/test split and fixed seeds so the numbers reproduce.

Metric	Value
Rating model — held-out RMSE	0.8559
BPR ranking model — Recall@10	0.1464
Popularity baseline — Recall@10	0.0988
BPR lift over popularity	+48.2%
Rating model (RMSE-tuned) — Recall@10	0.0184

The last row is the honest catch: the RMSE-tuned rating model reaches only 0.0184 Recall@10 against 0.0988 for popularity, which is exactly why top-N uses BPR instead.
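The lift figure is relative improvement over the baseline, which can be checked from the Recall@10 values in the table. The helper name relative_lift is just for illustration.

```python
def relative_lift(model, baseline):
    """Relative change of `model` over `baseline`, as a fraction."""
    return (model - baseline) / baseline

# Recall@10 values copied from the results table.
print(f"{relative_lift(0.1464, 0.0988):+.1%}")  # BPR vs popularity: +48.2%
print(f"{relative_lift(0.0184, 0.0988):+.1%}")  # rating model vs popularity: -81.4%
```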

Cross-genre

Neighbours that share no genre

"Films like X" from the learned embeddings, picking matches that wear a different genre label — the kind a genre filter could never surface. Straight from the run:

Toy Story (1995) Animation / Children
→ Steve Jobs (2015) · The Woman in Black (2012)
The Matrix (1999) Action / Sci-Fi
→ Swingers (1996), a Comedy/Drama
The Shining (1980) Horror
→ No Country for Old Men (2007) · Donnie Darko (2001) · Requiem for a Dream (2000)
Pulp Fiction (1994) Crime / Drama
→ Fight Club · Memento · Seven Samurai — neighbours by feel, not tag

Reflection

What building it taught me

Embeddings beat genres. A vector learned from behaviour captures texture a category tag throws away — and the neighbours it returns genuinely cross genre lines.
Recommendation is retrieval. Once every film is a vector, "what should I watch" becomes a nearest-neighbour query.
RMSE-good ≠ ranking-good. The biggest surprise: my accurate rating model ranked worse than recommending whatever's popular. Top-N needs an objective that optimises order, not error — hence BPR.
Popularity is a brutal baseline. On MovieLens it's hard to beat, because popular films dominate the held-out set too. Beating it by 48% on Recall@10 meant changing the loss, not tuning the old one.
Writing the SGD by hand is the point. Deriving and coding the gradient updates for both squared-error MF and BPR is where the model stopped being a black box.