From-Scratch Build · Beyond Coursework

compl.ai

"Is this AI system allowed?" is fast becoming a question with legal teeth. compl.ai is a compliance rule engine that audits an AI system — described by a structured, model-card-like spec — against a catalogue of governance requirements drawn from the EU AI Act and the NIST AI RMF, and reports a risk-weighted score plus exactly where it passes, fails or falls short.

Python · Rule engine · 15 requirements · Risk tiers · Risk-weighted score · EU AI Act · NIST AI RMF

Try it

Audit an example system, live

The real engine is Python (complai/ — a YAML requirements framework, a rule engine, text/JSON reports, pytest-covered). The widget below is a faithful re-implementation of the same rules and scoring in JavaScript so you can see an audit happen in the browser. Pick a risk tier, toggle the safeguards a system has in place, and watch the report update.

ID | Severity | Status | Requirement / evidence

Score = 100 × (earned severity weight) ÷ (total applicable weight). Pass earns full weight, partial half, fail zero. Severity weights: low 1, medium 3, high 6, critical 10.
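The formula above can be sketched in a few lines of Python. This is an illustrative reading of the stated rule, not the project's actual code: the function name risk_weighted_score and the (severity, status) pair layout are assumptions, and so is the choice to return 100 when no requirement applies.

```python
# Weights and credits as stated in the scoring rule above.
SEVERITY_WEIGHT = {"low": 1, "medium": 3, "high": 6, "critical": 10}
STATUS_CREDIT = {"pass": 1.0, "partial": 0.5, "fail": 0.0}


def risk_weighted_score(results):
    """Return a 0-100 score from (severity, status) pairs.

    `results` should contain only the requirements that apply to the
    system's risk tier. The name and input shape are hypothetical.
    """
    results = list(results)
    total = sum(SEVERITY_WEIGHT[sev] for sev, _ in results)
    if total == 0:
        # Assumption: with nothing applicable, report a full score.
        return 100.0
    earned = sum(SEVERITY_WEIGHT[sev] * STATUS_CREDIT[status]
                 for sev, status in results)
    return 100.0 * earned / total


example = [
    ("critical", "pass"),   # earns 10 of 10
    ("high", "partial"),    # earns 3 of 6
    ("medium", "fail"),     # earns 0 of 3
    ("low", "pass"),        # earns 1 of 1
]
print(risk_weighted_score(example))  # 100 * 14 / 20 -> 70.0
```

Because a critical requirement weighs ten times a low one, a single critical failure moves the score far more than several low-severity gaps.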

Honest scope. compl.ai is a structured self-assessment checklist, not legal advice and not certification. Requirement wording is a plain-language paraphrase of public frameworks for engineering self-review — verify against the current regulation.

What it is

Regulation, turned into a checklist that runs

AI regulation arrives as dense prose — obligations about transparency, data quality, documentation, human oversight and risk management. compl.ai translates that prose into a catalogue of discrete, checkable requirements, then evaluates a given system against each one. The output isn't a vibe; it's a report: this requirement met, that one failed, this one needs evidence — with a risk-weighted score on top.

The grounding is real. Each requirement cites a principle from the EU AI Act (Reg. (EU) 2024/1689 — risk management, data governance, documentation, transparency, human oversight, accuracy & robustness) or the NIST AI Risk Management Framework 1.0. Systems are classified by risk tier, because obligations scale with how much harm a system could do.

The stack

From a regulation to an audit report

Model the rules, classify the system, evaluate, score, report.

model

Requirements catalogue

15 requirements in YAML, each with id, severity, an applicable-tier list, a declarative check and a source citation.

classify

Risk tiering

minimal / limited / high. High-risk systems trigger stricter requirements that lower tiers skip.

intake

System spec

A model-card-like YAML/JSON document: intended use, datasets, oversight, accuracy, logging.

evaluate

Rule engine

Runs each check (field_true / present / gte / in) and resolves pass / fail / partial with evidence.

score

Risk-weighted score

Severity-weighted 0–100 score plus a risk-exposure total so the worst gaps surface first.

report

Report (text + JSON)

An itemised result with evidence, source and remediation for every gap.
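The model, classify and evaluate stages above can be sketched roughly as follows. Everything named here is hypothetical: the Requirement dataclass, the dotted field paths, the HYP-* ids, the 0.9 accuracy threshold and the spec keys are illustration only and are not taken from the real catalogue. The sketch resolves only pass and fail; how the engine decides on partial is not described above, so it is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    id: str
    severity: str   # low / medium / high / critical
    tiers: tuple    # risk tiers the requirement applies to
    check: dict     # declarative check: {"op": ..., "field": ..., "value": ...}
    source: str     # citation of the underlying principle


def lookup(spec, dotted):
    """Resolve a dotted path such as 'oversight.human_in_the_loop'."""
    node = spec
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def run_check(check, spec):
    """Evaluate one declarative check; return (status, evidence)."""
    value = lookup(spec, check["field"])
    op = check["op"]
    if op == "field_true":
        ok = value is True
    elif op == "present":
        ok = value not in (None, "", [], {})
    elif op == "gte":
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        ok = is_number and value >= check["value"]
    elif op == "in":
        ok = value in check["value"]
    else:
        raise ValueError(f"unknown check op: {op!r}")
    return ("pass" if ok else "fail"), f"{check['field']}={value!r}"


def audit(requirements, spec):
    """Run every requirement that applies to the spec's risk tier."""
    rows = []
    for req in requirements:
        if spec["risk_tier"] not in req.tiers:
            continue  # lower tiers skip stricter requirements
        status, evidence = run_check(req.check, spec)
        rows.append({"id": req.id, "severity": req.severity,
                     "status": status, "evidence": evidence,
                     "source": req.source})
    return rows


# Hypothetical catalogue entries and spec, for illustration only.
catalogue = [
    Requirement("HYP-01", "critical", ("high",),
                {"op": "field_true", "field": "oversight.human_in_the_loop"},
                "EU AI Act: human oversight"),
    Requirement("HYP-02", "high", ("high",),
                {"op": "gte", "field": "accuracy.reported", "value": 0.9},
                "EU AI Act: accuracy & robustness"),
    Requirement("HYP-03", "medium", ("minimal", "limited", "high"),
                {"op": "present", "field": "intended_use"},
                "NIST AI RMF 1.0"),
]
spec = {
    "risk_tier": "high",
    "intended_use": "Support-ticket triage",
    "oversight": {"human_in_the_loop": False},
}
for row in audit(catalogue, spec):
    print(row["id"], row["severity"], row["status"], row["evidence"])
```

Keeping each check as plain data (an operator, a field path and an optional value) is what lets a catalogue live in YAML: adding a requirement means adding an entry, not writing new engine code.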

Real example audits

Two systems, run through the Python engine

From python demo.py over the two bundled specs — these are real outputs, not mock-ups:

Reflection

What building it taught me