Press start and grant camera access. A face-detection model loads from a CDN and runs locally, so no frames ever leave your device. The status panel updates in real time when no one is in frame, when multiple people appear, or when attention drifts off-screen.
Face detection by Google MediaPipe Tasks (BlazeFace), running on WebGL via TensorFlow Lite. All inference is local in your browser — no upload, no recording, no storage.
MediaPipe's BlazeFace runs on every video frame. Each detection comes back with a bounding box, six facial keypoints, and a confidence score.
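The shape of one detection can be sketched as a small TypeScript type. The field names here (`box`, `keypoints`, `score`) and the `keepConfident` helper are illustrative assumptions, not the actual MediaPipe types, and the 0.5 cut-off is an arbitrary default.

```typescript
// Hypothetical, simplified shape of one face detection.
// Field names are illustrative and do not mirror the MediaPipe API exactly.
interface Keypoint {
  x: number; // horizontal position, in the same units as the box
  y: number; // vertical position
}

interface FaceBox {
  x: number; // left edge
  y: number; // top edge
  width: number;
  height: number;
}

interface FaceDetection {
  box: FaceBox;
  keypoints: Keypoint[]; // six points per face for BlazeFace
  score: number; // confidence in [0, 1]
}

// Drop detections below a confidence cut-off (0.5 is an assumed default).
function keepConfident(
  detections: FaceDetection[],
  minScore = 0.5,
): FaceDetection[] {
  return detections.filter((d) => d.score >= minScore);
}
```

Filtering by confidence before any further logic keeps weak, flickering detections from triggering events.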
Counting detections is the simplest cheat-flag: more than one face in the frame is one of the earliest signals a proctoring system uses.
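A minimal sketch of that check, assuming the detector has already returned a face count for the current frame. The status names and the `classifyPresence` function are hypothetical, chosen only for illustration.

```typescript
// Hypothetical status labels for the face-count check.
type PresenceStatus = "no_face" | "single_face" | "multiple_faces";

// Map the number of detected faces in one frame to a status.
function classifyPresence(faceCount: number): PresenceStatus {
  if (!Number.isInteger(faceCount) || faceCount < 0) {
    throw new RangeError("faceCount must be a non-negative integer");
  }
  if (faceCount === 0) return "no_face";
  if (faceCount === 1) return "single_face";
  return "multiple_faces";
}
```

In practice a proctor would probably require the status to persist for a few frames before logging it, so that a single missed detection does not raise a flag.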
Nose-keypoint position relative to the face bounding-box centre is a cheap proxy for head rotation. When the nose drifts past a threshold for several frames in a row, the proctor fires a "looking away" event.
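The nose-offset heuristic could look roughly like the sketch below. It only checks the horizontal offset (a vertical check would work the same way), and it assumes the nose point and the box share one coordinate space. The class name, the 0.35 threshold, and the 5-frame window are assumptions, not values taken from this demo.

```typescript
interface NosePoint {
  x: number;
  y: number;
}

interface FaceRect {
  x: number; // left edge
  y: number; // top edge
  width: number;
  height: number;
}

// Horizontal nose offset from the box centre, scaled so that
// -1 is the left edge of the box and +1 is the right edge.
function horizontalNoseOffset(nose: NosePoint, box: FaceRect): number {
  const centreX = box.x + box.width / 2;
  return (nose.x - centreX) / (box.width / 2);
}

// Fires once when the offset has stayed past the threshold
// for `minFrames` consecutive frames.
class LookAwayDetector {
  private streak = 0;
  private readonly threshold: number;
  private readonly minFrames: number;

  constructor(threshold = 0.35, minFrames = 5) {
    this.threshold = threshold;
    this.minFrames = minFrames;
  }

  // Call once per frame; returns true only on the frame the event fires.
  update(nose: NosePoint, box: FaceRect): boolean {
    const away = Math.abs(horizontalNoseOffset(nose, box)) > this.threshold;
    this.streak = away ? this.streak + 1 : 0;
    return this.streak === this.minFrames;
  }
}
```

Requiring several consecutive frames filters out brief glances and detector jitter; any frame back inside the threshold resets the streak.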
The face bounding box's height relative to the frame height tells you whether the user is at a sensible distance from the camera: too small means they've moved too far back, too large means they've moved up close.
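As a rough sketch, the distance check reduces to one ratio and two cut-offs. The function name, the status labels, and the 0.2 and 0.7 bounds are assumptions for illustration and would need tuning against a real camera.

```typescript
// Hypothetical status labels for the distance check.
type DistanceStatus = "too_far" | "ok" | "too_close";

// Compare face-box height to frame height (both in pixels).
// minRatio and maxRatio are assumed, untuned defaults.
function classifyDistance(
  boxHeight: number,
  frameHeight: number,
  minRatio = 0.2,
  maxRatio = 0.7,
): DistanceStatus {
  if (frameHeight <= 0) {
    throw new RangeError("frameHeight must be positive");
  }
  const ratio = boxHeight / frameHeight;
  if (ratio < minRatio) return "too_far";
  if (ratio > maxRatio) return "too_close";
  return "ok";
}
```

For a 480-pixel-high frame with these defaults, a face box shorter than 96 pixels counts as too far and one taller than 336 pixels counts as too close.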
The original project layers a YOLOv8 object detector on top (phones, books) and a pose estimator for body posture. This demo focuses on the face-detection layer so it runs everywhere without downloading 200 MB of model weights.
Nothing leaves your device. The camera stream is rendered to a <video> element in your browser. Each frame is handed to a face detector that has been loaded into a WebAssembly + WebGL runtime. Only the model file is fetched from Google's CDN; no frames are sent back.
The proctoring logic, the overlay drawing, the event log: all of it stays in this tab.
Closing the tab or pressing Stop camera releases the camera and frees the video stream. There is no server-side component to this site. There is no analytics, no recording, no upload.
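Releasing the camera typically comes down to stopping every track on the stream and detaching it from the video element. The sketch below is an assumption about how this could be done, not this site's actual code. The small interfaces are hypothetical stand-ins that a browser `MediaStream` and `HTMLVideoElement` satisfy structurally, which also lets the function run outside a browser.

```typescript
// Minimal structural stand-ins for MediaStreamTrack, MediaStream,
// and HTMLVideoElement; only the members used here are declared.
interface StoppableTrack {
  stop(): void;
}

interface TrackSource {
  getTracks(): StoppableTrack[];
}

interface VideoSink {
  srcObject: unknown;
}

// Stop every track so the browser releases the camera,
// then detach the stream from the video element.
function releaseCamera(stream: TrackSource, video: VideoSink): void {
  for (const track of stream.getTracks()) {
    track.stop();
  }
  video.srcObject = null;
}
```

In a browser this would be called with the stream returned by `navigator.mediaDevices.getUserMedia` and the page's video element.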