Tactics of Persuasion — Built From Scratch

What it is

Naming the moves in a message

Persuasion techniques are a known catalogue: propaganda analysts have classified them for decades. This system learns eight of them (a subset of the SemEval-2020 propaganda taxonomy) and detects them automatically: hand it a sentence and it returns which tactics appear and which words triggered each. A single sentence often stacks several at once, so it's a multi-label problem, not a tidy one-answer classification.

The value isn't to censor or score — it's transparency. Surfacing that a sentence leans on fear plus a loaded epithet makes the machinery of the message visible, and every call traces back to explicit cue phrases and TF-IDF weights. That's media literacy, made legible.

8 tactics

loaded language, appeal to fear, name-calling, exaggeration, flag-waving, doubt, slogans, whataboutism — and they co-occur, so each sentence gets a set of labels.
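To make "a set of labels" concrete, here is a minimal sketch of how a label set could be encoded as one 0/1 flag per tactic. The snake_case names follow the identifiers visible in the demo output; `flag_waving` and `slogans` are my guess at the same pattern, and the helper names are hypothetical, not taken from the project code.

```python
# The eight tactics in the order listed above. The snake_case spelling is an
# assumption based on the identifiers printed by the demo.
TACTICS = [
    "loaded_language", "appeal_to_fear", "name_calling", "exaggeration",
    "flag_waving", "doubt", "slogans", "whataboutism",
]


def to_multi_hot(labels):
    """Encode a set of tactic names as a list of 0/1 flags, one per tactic."""
    unknown = set(labels) - set(TACTICS)
    if unknown:
        raise ValueError(f"unknown tactics: {sorted(unknown)}")
    return [1 if tactic in labels else 0 for tactic in TACTICS]


def from_multi_hot(flags):
    """Decode a list of 0/1 flags back into a set of tactic names."""
    return {tactic for tactic, flag in zip(TACTICS, flags) if flag}


# A sentence that stacks two tactics gets two flags set.
print(to_multi_hot({"name_calling", "loaded_language"}))  # [1, 0, 1, 0, 0, 0, 0, 0]
# A neutral sentence is simply the all-zero row.
print(to_multi_hot(set()))  # [0, 0, 0, 0, 0, 0, 0, 0]
```

A neutral row needs no special "none" class in this framing: it is just the row where no flag is set.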

The stack

From paragraph to labelled tactics

A classical, fully transparent multi-label pipeline grounded in a defined rhetoric taxonomy.

taxonomy

Tactic catalogue

Eight persuasion techniques from the SemEval-2020 propaganda taxonomy: the label space the model learns.

data

Seed dataset

114 original political-style sentences, hand-labelled — including multi-label and neutral rows.

features

TF-IDF + lexicons

Word and character n-gram TF-IDF, plus small per-tactic cue lexicons as extra features.
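A rough sketch of what this feature block might look like with scikit-learn and SciPy. The n-gram ranges, the `char_wb` analyzer, and the two tiny lexicons are illustrative assumptions, not the project's actual settings.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical cue lexicons: a few illustrative phrases, not the project's lists.
LEXICONS = {
    "appeal_to_fear": ["too late", "coming for"],
    "whataboutism": ["what about"],
}


def lexicon_counts(texts):
    """One column per tactic: number of cue-phrase occurrences in each text."""
    rows = [
        [sum(text.lower().count(cue) for cue in cues) for cues in LEXICONS.values()]
        for text in texts
    ]
    return csr_matrix(np.array(rows, dtype=float))


texts = [
    "If we do nothing, it will be too late.",
    "But what about the money they wasted?",
    "The committee will publish its map next Tuesday.",
]

# Word-level and character-level TF-IDF; both n-gram ranges are assumptions.
word_tfidf = TfidfVectorizer(analyzer="word", ngram_range=(1, 2))
char_tfidf = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))

# Concatenate the three feature blocks column-wise into one sparse matrix.
X = hstack([
    word_tfidf.fit_transform(texts),
    char_tfidf.fit_transform(texts),
    lexicon_counts(texts),
]).tocsr()

print(X.shape)
print(lexicon_counts(texts).toarray())
```

The lexicon columns are plain counts sitting next to TF-IDF weights, so on real data they may need scaling; that choice is left open here.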

model

One-vs-rest logistic

One balanced logistic-regression head per tactic, so each fires independently for multi-label output.
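One way to get "one balanced head per tactic" in scikit-learn is `OneVsRestClassifier` wrapped around `LogisticRegression(class_weight="balanced")`. The sketch below uses six toy sentences, two tactics, and plain word TF-IDF to stay short; the data and the `max_iter` value are assumptions for illustration, not the seed set or the project's configuration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

LABELS = ["appeal_to_fear", "name_calling"]

# Toy training rows (invented for this sketch), including multi-label and neutral ones.
texts = [
    "If we do nothing it will be too late.",
    "They are coming for your jobs.",
    "My opponent is a coward and a fraud.",
    "That fraud is coming for your savings.",
    "The committee meets on Tuesday.",
    "The report will be published next week.",
]
# One column per label; a row may have several 1s or none.
Y = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 0],
    [0, 0],
])

model = make_pipeline(
    TfidfVectorizer(),
    # One independent binary classifier per column of Y.
    OneVsRestClassifier(LogisticRegression(class_weight="balanced", max_iter=1000)),
)
model.fit(texts, Y)

# predict_proba returns one probability per label; rows need not sum to 1.
probs = model.predict_proba(["That coward is coming for your jobs."])
for label, p in zip(LABELS, probs[0]):
    print(f"{label:16s} {p:.2f}")
```

Because each head is fitted separately, a sentence can score high on both labels, or on neither, which is what the multi-label framing needs.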

evidence

Trigger phrases

Each prediction reports the cue phrases in the text that justify it — not just a score.
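A minimal sketch of cue-phrase lookup, assuming the evidence step is a case-insensitive phrase match against a per-tactic lexicon. The phrases are copied from the demo output further down; the function name and the matching rule are hypothetical and may differ from what demo.py does.

```python
import re

# Hypothetical cue lexicons, seeded with phrases that appear in the demo output.
CUES = {
    "appeal_to_fear": ["coming for", "if we do nothing", "too late"],
    "whataboutism": ["but what about", "what about"],
}


def find_triggers(text, cues=CUES):
    """Return {tactic: [matched phrases]} for every tactic with at least one hit."""
    hits = {}
    for tactic, phrases in cues.items():
        found = []
        for phrase in phrases:
            # Whole-word, case-insensitive match; keep the text as it was written.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            match = re.search(pattern, text, flags=re.IGNORECASE)
            if match:
                found.append(match.group(0))
        if found:
            hits[tactic] = found
    return hits


sentence = ("These radical extremists are coming for your jobs, "
            "and if we do nothing, it will be too late.")
print(find_triggers(sentence))
# {'appeal_to_fear': ['coming for', 'if we do nothing', 'too late']}
```

A lookup like this only explains lexicon-backed predictions. When a head fires on learned TF-IDF weights alone, there is no phrase to report, which is presumably why the demo prints "(learned weights)" in that case.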

evaluate

Per-tactic F1

Stratified held-out split, macro- and micro-F1, and a per-label report, because rare tactics are easy to miss in aggregate scores.
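To see why both averages matter, here is a small hand-rolled sketch of macro- and micro-F1 on multi-hot rows. The four rows are invented for illustration, and treating an undefined F1 as 0.0 is an assumption of this sketch.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when undefined (an assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred):
    """Return (per-label F1 list, macro-F1, micro-F1) for multi-hot rows."""
    n_labels = len(y_true[0])
    per_label = []
    total_tp = total_fp = total_fn = 0
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        per_label.append(f1(tp, fp, fn))
        total_tp, total_fp, total_fn = total_tp + tp, total_fp + fp, total_fn + fn
    macro = sum(per_label) / n_labels            # every label counts equally
    micro = f1(total_tp, total_fp, total_fn)     # every decision counts equally
    return per_label, macro, micro


# Label 0 is common and always caught; label 1 is rare and always missed.
y_true = [[1, 0], [1, 0], [1, 0], [1, 1]]
y_pred = [[1, 0], [1, 0], [1, 0], [1, 0]]

per_label, macro, micro = macro_micro_f1(y_true, y_pred)
print(per_label)        # [1.0, 0.0]
print(macro)            # 0.5
print(round(micro, 3))  # 0.889
```

Micro-F1 stays near 0.89 even though one tactic is never detected; macro-F1 and the per-label list are what expose the miss.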

Architecture

How tactics are detected

Each sentence flows through the same read-and-label pipeline:

Define
Fix the taxonomy of eight persuasion tactics to detect.
Vectorise
Word + character TF-IDF, concatenated with per-tactic lexicon counts.
Predict
One logistic head per tactic; threshold each probability into a label set.
Explain
Surface the cue phrases in the text that triggered each detected tactic.
Score
Held-out macro-F1 0.91 on the seed set; inspect per-tactic recall.
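The Predict step's thresholding can be sketched as a small function. The 0.5 default cut-off and the option of per-tactic thresholds are assumptions (the write-up does not state which threshold is used), and `to_label_set` is a hypothetical name.

```python
def to_label_set(probs, thresholds=0.5):
    """Keep every tactic whose probability reaches its threshold.

    probs:      {tactic: probability} from the per-tactic heads.
    thresholds: a single float for all tactics, or a {tactic: float} dict;
                tactics missing from the dict fall back to 0.5.
    """
    selected = {}
    for tactic, p in probs.items():
        if isinstance(thresholds, dict):
            cut = thresholds.get(tactic, 0.5)
        else:
            cut = thresholds
        if p >= cut:
            selected[tactic] = p
    return selected


# Probabilities shaped like the demo output, with an extra low-scoring head.
probs = {"name_calling": 1.00, "loaded_language": 0.80, "doubt": 0.12}

print(to_label_set(probs))
# {'name_calling': 1.0, 'loaded_language': 0.8}

# A stricter cut-off for one tactic drops it from the label set.
print(to_label_set(probs, {"loaded_language": 0.9}))
# {'name_calling': 1.0}
```

An empty result corresponds to the "no tactics detected" case: no head cleared its threshold.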

Real output

What it actually returns

Verbatim from python demo.py — trained on the 114-sentence seed set.

> These radical extremists are coming for your jobs, and if we do nothing, it will be too late.
    appeal_to_fear   100%   triggers: 'coming for', 'if we do nothing', 'too late'

> My opponent is a spineless coward, a fraud who has lied to you for years.
    name_calling     100%   triggers: 'spineless', 'coward', 'fraud'
    loaded_language   80%   triggers: (learned weights)

> We will build the greatest economy the world has ever seen, the biggest boom in all of history.
    exaggeration     100%   triggers: 'the greatest', 'ever', 'ever seen', 'the biggest'

> Can we really trust a single promise from the same people who failed us every time before?
    doubt             97%   triggers: 'Can we really trust'

> They lecture us about the deficit, but what about the trillions they wasted when they held power?
    whataboutism      99%   triggers: 'but what about', 'what about'

> The redistricting committee will publish its proposed map next Tuesday.
    no tactics detected

Held-out evaluation (70/30 stratified split): macro-F1 0.909, micro-F1 0.896. A teaching-scale model on a small, clean seed set — on noisy real text, expect lower.

Reflection

What rebuilding it taught me

Simple models go far on clear signal. TF-IDF plus a handful of cue lexicons separates these tactics surprisingly well — no transformer required for a teaching-scale task.
Multi-label is the honest framing. Real persuasion stacks techniques; forcing one label per passage throws away the truth.
The taxonomy is a design choice. What counts as a "tactic" shapes everything — defining the label space is half the project.
Rare tactics hide. Aggregate accuracy flatters; per-label recall reveals the techniques the model quietly never catches.
Detection is for transparency. The goal is to make rhetoric visible, not to judge it — a distinction worth holding onto.
