Python · ddmin delta debugging

Bugpoint

A bug report is usually buried in noise — a thousand lines of input when four would do. Bugpoint takes a failing case and a way to test "does this still fail?", and automatically shrinks the input to a minimal, 1-minimal reproduction. The engine is the classic ddmin delta-debugging algorithm (Zeller & Hildebrandt, 2002): a smart bisection over the input that drops chunks, checks the failure survives, and increases granularity until nothing more can go.

Python (stdlib only)ddminLine + char minimizer CLI + shell oracle15 pytest testsMIT

What it is

Shrinking a bug to its core

When something crashes on a huge input, the first real task is always the same: find the minimal reproduction. Bugpoint automates it. Given a failing case and a way to test "does this still fail?", it systematically removes pieces of the input and keeps whatever still triggers the bug — converging on a tiny reproducer that points straight at the cause.

It turns hours of manual "comment-it-out-and-rerun" into an automatic search — and the minimal case it spits out is often half the debugging battle won. The result is guaranteed 1-minimal: removing any single remaining element makes the failure disappear. Identical candidates are memoized so the same subset is never re-tested.

1000 → 4
a 1000-element input with a hidden 4-element trigger, shrunk to exactly those 4 in 190 test evaluations (178 duplicate tests skipped by caching).

The stack

From a messy failure to a minimal reproducer

A reduction engine wrapped around a simple pass/fail oracle.

input

Failing case

The large input that triggers the bug — the raw material to be shrunk.

oracle

Interestingness test

A check that answers one question: does this reduced candidate still reproduce the failure?

core

Delta debugging

Bisect the input, drop chunks, keep what still fails — recurse until nothing more can go.

strategy

Granularity steps

Start with big chunks, then finer ones, balancing speed against how small the result gets.

safety

Validity guards

Keep reductions well-formed enough to actually run, so failures stay genuine.

output

Minimal reproducer

The smallest input that still breaks — handed back ready to debug.

Architecture

How a case is minimised

Reduction is a guided search, each step validated by the oracle:

  1. Confirm

    Verify the original input really does reproduce the failure.

  2. Split

    Divide the input into chunks at the current granularity.

  3. Test

    Try removing chunks and ask the oracle if it still fails.

  4. Keep

    Retain any reduction that preserves the failure; discard the rest.

  5. Refine

    Increase granularity and recurse until the case can't shrink further.

The API

Three calls

The core is ddmin(elements, interesting) over any list; minimize_lines and minimize_chars wrap it for source text. The predicate returns True when a candidate still reproduces the failure, and must be True for the full input.

from bugpoint import ddmin, minimize_lines

elements = list(range(1000))
required = {17, 256, 743, 999}            # the hidden trigger
ddmin(elements, lambda s: required.issubset(set(s)))
# -> [17, 256, 743, 999]   (1-minimal)

minimize_lines(program_text, still_crashes)   # minimal failing lines

The CLI

A file and a shell command

The interestingness test is any external command — Bugpoint writes each candidate to a temp file and treats a non-zero exit code as "still failing", the same convention as a crashing program. It reports the real number of test evaluations.

$ printf 'alpha\nbeta\nCRASH\ngamma\ndelta\n' > in.txt

$ python -m bugpoint --input in.txt \
    --test "... exit 1 if 'CRASH' in file ..." --output in.min

bugpoint: 6 -> 1 lines in 4 test evaluations. Wrote in.min
# in.min now contains exactly:  CRASH