From-Scratch Build · Recommendation Systems

Short-Video Recommender

The endless feed is judged by one thing: how long you lingered before the swipe. This build turns those split-second watch signals into the next pick — an implicit-feedback recommender with a confidence-weighted matrix factorization I wrote from scratch in NumPy/SciPy. No stars, no reviews: behaviour is the only label.

Python · Implicit feedback · watch_ratio signal · Matrix factorization · Confidence-weighted ALS · From scratch

What it is

One feed, judged in seconds

Unlike a store of products you browse, a short-video feed shows you exactly one thing and watches what you do. There are no stars and no reviews — just behaviour: did you watch to the end, or flick it away in half a second? That single number, watch_ratio = watch_time / duration, is the entire training signal.

The model turns it into implicit feedback in the style of Hu, Koren & Volinsky: any video you watched at all is a positive with preference 1, and how long you watched sets the confidence 1 + α·watch_ratio, so a completed watch counts far more than a half-second glance. A confidence-weighted matrix factorization, trained by alternating least squares, learns user and video factors from these weighted preferences, and next_videos(user, k) ranks unseen videos by predicted preference.
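To make the weighting concrete, here is a minimal sketch of turning raw watch events into a preference matrix and a confidence matrix. The function name build_confidence, the argument layout, and the default α are illustrative assumptions, not the project's actual code; it also assumes at most one event per (user, video) pair, since SciPy sums duplicates.

```python
import numpy as np
from scipy.sparse import csr_matrix

def build_confidence(users, videos, watch_ratio, n_users, n_videos, alpha=10.0):
    """Return (P, extra) as sparse matrices.

    P[u, i] = 1 for every watched pair (the binary preference).
    extra[u, i] = alpha * watch_ratio, i.e. confidence minus 1, so that
    unobserved pairs implicitly keep the baseline confidence of 1.
    alpha=10.0 is a placeholder value (assumption), not a tuned setting.
    """
    users = np.asarray(users)
    videos = np.asarray(videos)
    # Clip to [0, 1], the range the signal is defined on.
    r = np.clip(np.asarray(watch_ratio, dtype=float), 0.0, 1.0)
    shape = (n_users, n_videos)
    P = csr_matrix((np.ones_like(r), (users, videos)), shape=shape)
    extra = csr_matrix((alpha * r, (users, videos)), shape=shape)
    return P, extra

# Three watch events: a completed watch, a half watch, a quick glance.
P, extra = build_confidence([0, 0, 1], [0, 2, 1], [1.0, 0.5, 0.1],
                            n_users=2, n_videos=3)
confidence_full_watch = 1.0 + extra[0, 0]   # 1 + 10 * 1.0
confidence_glance = 1.0 + extra[1, 1]       # 1 + 10 * 0.1
```

Storing confidence minus 1 keeps the matrix sparse: only watched pairs carry a value, and everything else falls back to confidence 1.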

2.72×
held-out Recall@10 of the learned model over a popularity baseline, on planted-structure synthetic data — a real number from demo.py.

The pipeline

From a watch_ratio to the next video

Behaviour in, an ordered list of unseen videos out.

signal

watch_ratio

watch_time ÷ duration, in [0, 1]. The only label a short-video feed gives you.

model

Implicit feedback

preference = 1 on any watch; confidence = 1 + α·watch_ratio weights it by how long you watched.

train

Confidence-weighted ALS

Hand-rolled alternating least squares learns user + video factors from the weighted signal.

rank

next_videos(user, k)

Score every unseen video by x·y, return the top-k. The actual recommendation.

online

Single-watch update

After a fresh completed watch, that user's factor row is re-solved, which nudges the next pick.

evaluate

Recall@K / NDCG@K

Held-out ranking quality against a popularity baseline — real numbers, not claims.
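For reference, the two ranking metrics can be sketched as below for a single user with binary relevance. This is one common formulation, not necessarily the one demo.py uses: in particular, dividing recall by the full number of relevant videos (rather than by min(k, number relevant)) is an assumption.

```python
import numpy as np

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant videos that appear in the top-k of `ranked`."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = sum(1 for v in ranked[:k] if v in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain of the hits over the ideal ordering."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    # A hit at 0-based position pos contributes 1 / log2(pos + 2).
    dcg = sum(1.0 / np.log2(pos + 2)
              for pos, v in enumerate(ranked[:k]) if v in relevant)
    # Ideal case: every relevant video (up to k) sits at the top of the list.
    ideal = sum(1.0 / np.log2(pos + 2)
                for pos in range(min(k, len(relevant))))
    return dcg / ideal

ranked = [5, 2, 9, 1]        # model's ordering for one user
relevant = {2, 1, 7}         # held-out videos counted as relevant
r3 = recall_at_k(ranked, relevant, k=3)   # only video 2 is in the top 3
n3 = ndcg_at_k(ranked, relevant, k=3)
```

Averaging these per-user values over all evaluated users gives the Recall@K and NDCG@K figures for a model.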

How it works

How the next video is chosen

Each watch feeds the same loop:

  1. Observe

    Record the watch_ratio for the video just shown.

  2. Weight

    Turn it into preference 1 with confidence 1 + α·watch_ratio — long watches count more.

  3. Factor

    Confidence-weighted ALS learns a latent vector for the user and for every video.

  4. Rank

    Score all unseen videos by the dot product of those vectors; take the top-k.

  5. Update

    After a fresh completed watch, the user's factor row is re-solved, so the next pick reflects it.
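The Factor, Rank and Update steps can be sketched in a few dozen lines of NumPy. This is a simplified illustration on dense matrices, not the project's implementation: only next_videos(user, k) is named in the text, and here it takes the factor and preference matrices as extra arguments; solve_row, half_step, fit, update_user and every hyperparameter value are hypothetical. Each row solve is the closed form from Hu, Koren & Volinsky, (YᵀY + Yᵀ(Cᵤ − I)Y + λI) xᵤ = YᵀCᵤpᵤ.

```python
import numpy as np

def solve_row(Y, YtY, idx, conf, reg):
    """Closed-form solve for one factor row, with the other side held fixed.

    idx  : indices of the observed entries (preference 1)
    conf : their confidences, 1 + alpha * watch_ratio
    """
    Yo = Y[idx]                                   # factors of observed entries
    A = YtY + (Yo.T * (conf - 1.0)) @ Yo + reg * np.eye(Y.shape[1])
    return np.linalg.solve(A, Yo.T @ conf)        # right side: Yo^T C p, p = 1

def half_step(C, P, Y, reg):
    """Re-solve every row factor against the fixed factors Y."""
    YtY = Y.T @ Y                                 # shared across all rows
    X = np.empty((P.shape[0], Y.shape[1]))
    for u in range(P.shape[0]):
        idx = np.flatnonzero(P[u])
        X[u] = solve_row(Y, YtY, idx, C[u, idx], reg)
    return X

def fit(C, P, factors=8, reg=0.1, iters=15, seed=0):
    """Alternate user and video solves; C and P are dense (users x videos)."""
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=0.1, size=(P.shape[1], factors))
    for _ in range(iters):
        X = half_step(C, P, Y, reg)               # users, videos fixed
        Y = half_step(C.T, P.T, X, reg)           # videos, users fixed
    return X, Y

def next_videos(X, Y, P, user, k):
    """Top-k unseen videos for `user`, ranked by the dot product x . y."""
    scores = Y @ X[user]
    unseen = np.flatnonzero(P[user] == 0)
    return unseen[np.argsort(-scores[unseen])][:k]

def update_user(X, Y, C, P, user, video, watch_ratio, alpha, reg):
    """Record one new watch (in place) and re-solve only that user's row."""
    P[user, video] = 1.0
    C[user, video] = 1.0 + alpha * watch_ratio
    idx = np.flatnonzero(P[user])
    X[user] = solve_row(Y, Y.T @ Y, idx, C[user, idx], reg)

# Toy data: two interest groups of 4 users x 4 videos, all watched in full.
P = np.zeros((8, 8))
P[:4, :4] = 1.0
P[4:, 4:] = 1.0
P[0, 3] = 0.0                      # user 0 has not yet seen video 3
C = 1.0 + 10.0 * P                 # alpha = 10 (assumed), watch_ratio = 1
X, Y = fit(C, P, factors=2)
first_pick = next_videos(X, Y, P, user=0, k=1)
update_user(X, Y, C, P, user=0, video=3, watch_ratio=1.0, alpha=10.0, reg=0.1)
later_picks = next_videos(X, Y, P, user=0, k=3)
```

In this toy setup user 0 shares a group with users who all watched video 3, so it should surface as the first pick; once the watch is recorded, it drops out of the unseen candidates. A real implementation would use sparse matrices and avoid the dense per-row indexing shown here.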

Real results

It beats popularity — measured

On synthetic data with planted interests (300 users, 400 videos, 6 topics, 18,000 interactions; a held-out video counts as relevant if it was watched ≥ 60%), the learned model is run against a non-personalised popularity baseline. These are the numbers demo.py prints:
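The two ingredients of that comparison, the relevance rule and the popularity baseline, could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, popularity is taken to mean the number of training users who watched a video, and the baseline returns one shared list without filtering already-seen videos; demo.py may differ on each point.

```python
import numpy as np

def relevant_set(heldout_videos, heldout_ratios, threshold=0.6):
    """Held-out videos that count as relevant: watched at least `threshold`."""
    return {int(v) for v, r in zip(heldout_videos, heldout_ratios)
            if r >= threshold}

def popularity_top_k(train_P, k):
    """The same top-k list for every user: videos with the most training watches."""
    counts = np.asarray(train_P).sum(axis=0)
    # Stable sort so ties keep their original video order.
    return np.argsort(-counts, kind="stable")[:k]

train_P = np.array([[1, 0, 1],
                    [1, 1, 0],
                    [1, 0, 0]])
baseline = popularity_top_k(train_P, k=2)                 # video 0 is most watched
relevant = relevant_set([4, 7, 9], [0.9, 0.3, 0.6])       # 7 falls below 60%
```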

implicit-MF

Recall@10 · 0.141

NDCG@10 · 0.093 — the confidence-weighted matrix factorization.

popularity baseline

Recall@10 · 0.052

NDCG@10 · 0.026 — same top-k for everyone, no personalisation.

lift

2.72× Recall@10

Over 230 held-out users. The watch signal carries real, recoverable preference.

The data is synthetic and I say so plainly — but it is generated so watch behaviour actually reflects interest, which is what makes the structure recoverable in the first place.

Reflection

What building it taught me