Adversarial Search
in Three Games

Quoridor · Ghosts · UNO

AI Reasoning & Problem Solving · 2025

Playable website: ../website/index.html

Agenda

How we organised the work
Quoridor · algorithm · heuristic · class concepts · live demo
Ghosts · algorithm · heuristic · class concepts · live demo
UNO · algorithm · heuristic · class concepts · live demo
Recap & what we'd do next

Total time ≤ 10 min talk + 5 min Q&A (per rubric)

5pts · Organising the work

How we organised the work

Per-game owner

One person primarily responsible for each my_agent()
Shared play_game driver - identical across all three notebooks
Pair review before submission to keep the eval functions readable

Shared scaffolding

Same GameClient + play_game across games
Each game class exposes the same kind of helper API: get_valid_moves, simulate_move, is_terminal
Same coding patterns (minimax skeleton, move scoring) re-used

Game 1 - Quoridor

Rules (from notebook)

9×9 board, pawns on opposite mid-edges
Each turn: move pawn OR place wall
10 walls per player; a wall may not cut off a player's last path to their goal
Jump over adjacent opponent; diagonal step if blocked
Win: reach the opposite row

Game-theoretic profile

Two-player, zero-sum
Perfect information
Deterministic
Branching factor > 100 early on (up to 128 wall placements plus pawn moves on an open board)

3pts · Algorithm

Quoridor - Algorithm

Depth-limited alpha-beta minimax

Frontier: all pawn moves (typically ≤4; a blocked jump can open a fifth) + a capped set of wall placements (top-K sorted by proximity to opponent)
Depth: 2 plies; deep enough to spot 1-move tactics, shallow enough to stay responsive
Pruning: alpha-beta in negamax form
Move legality: before a wall is placed, shortest_path_length() confirms that each player still has a path to their goal (matches the rule "walls cannot fully block any player")

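def negamax(state, depth, alpha, beta):
    # Value of `state` for the player to move; eval() must score from that side.
    if depth == 0 or state.is_terminal():
        return eval(state)
    for m in order_moves(state):       # walls near opponent first
        # The child is scored for the opponent, so negate and flip the window.
        v = -negamax(apply(state, m), depth - 1, -beta, -alpha)
        alpha = max(alpha, v)
        if alpha >= beta:              # cut-off: the opponent will avoid this line
            break
    return alpha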
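The frontier cap from the "Frontier" bullet can be sketched as below. This is only an illustration under stated assumptions: a wall is represented as ((row, col), orientation), proximity is Manhattan distance from the wall's anchor cell to the opponent pawn, and both the helper name top_k_walls and the default k=12 are hypothetical (the slides do not state the value of K).

```python
def top_k_walls(candidates, opp_pos, k=12):
    """Keep the k wall placements closest to the opponent pawn.

    `candidates` is a list of ((row, col), orientation) tuples
    (an assumed representation); `opp_pos` is the opponent's (row, col).
    """
    def distance(wall):
        (row, col), _orientation = wall
        return abs(row - opp_pos[0]) + abs(col - opp_pos[1])

    # sorted() is stable, so equally distant walls keep their original order
    return sorted(candidates, key=distance)[:k]


# Usage: all 128 wall slots on an empty 9x9 board, opponent pawn at (8, 4)
all_walls = [((r, c), o) for r in range(8) for c in range(8) for o in ("H", "V")]
frontier_walls = top_k_walls(all_walls, opp_pos=(8, 4))
```

Capping the list this way trades completeness for speed: a strong wall far from the opponent is never considered, which is the price of keeping depth-2 search responsive.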

2pts · Heuristic

Quoridor - Heuristic

Path-length differential

The notebook tip is exactly this:

eval(state) = 10 * (opponent_path - my_path)
            + (my_walls_left - opp_walls_left)

opponent_path and my_path are each computed via BFS (already provided as shortest_path_length)
Walls-in-hand kept as a small tie-breaker - preserves flexibility
Why it works: the winner is whoever reaches their goal row first, so what matters is being closer to your goal than the opponent is to theirs; the BFS distance is an admissible-style proxy for "moves to win" (future walls can only lengthen a path)
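A minimal sketch of the path-length differential, for illustration only. It assumes walls are stored as a set of blocked edges between neighbouring cells and ignores pawn jumps; the shortest_path_length below is a stand-in written for this example, not the notebook's provided helper, and evaluate and its parameters are hypothetical names.

```python
from collections import deque

SIZE = 9  # 9x9 board, rows and columns indexed 0..8

def shortest_path_length(start, goal_row, blocked):
    """BFS distance from `start` to any cell on `goal_row`.

    `blocked` is a set of frozenset({cell_a, cell_b}) edges cut by walls
    (an assumed representation). Returns None if no path exists.
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if r == goal_row:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE):
                continue
            if nxt in seen or frozenset({(r, c), nxt}) in blocked:
                continue
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None

def evaluate(my_pos, my_goal, opp_pos, opp_goal, my_walls, opp_walls, blocked):
    """10 * (opponent_path - my_path) + (my_walls_left - opp_walls_left)."""
    my_path = shortest_path_length(my_pos, my_goal, blocked)
    opp_path = shortest_path_length(opp_pos, opp_goal, blocked)
    return 10 * (opp_path - my_path) + (my_walls - opp_walls)

# Usage: one step ahead of the opponent and one wall up
print(evaluate((1, 4), 8, (8, 4), 0, 10, 9, set()))  # prints 11
```

The same BFS doubles as the wall-legality test: a candidate wall is illegal if it makes shortest_path_length return None for either player.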

5pts · Mapping to class concepts

Quoridor - Class concepts

Concept	Where it appears
Uninformed search (BFS)	Shortest-path distance computation
Adversarial search / minimax	Move selection in `my_agent`
Alpha-beta pruning	Negamax cut-offs on the frontier
Heuristic evaluation	Path-difference + wall-count combination
Move ordering	Walls sorted by distance to opponent before alpha-beta
Branching-factor management	Capping the wall-move candidate list (top-K)

Live demo · Quoridor · You vs AI · depth-2 α-β minimax

Game 2 - Ghosts

Rules (from notebook)

6×6 board; exits at the four corners
8 ghosts each: 4 good + 4 evil, types hidden
Setup: own two rows, columns 1-4 only
Move 1 square orthogonal; capture = move onto enemy
Capturing reveals both ghosts' types

Win / lose conditions

Win: capture all opponent's good ghosts OR move a good ghost to an opponent-side exit corner
Lose: all your good ghosts captured OR you capture all of opponent's evil ghosts

The second loss condition makes this a true bluffing game.

3pts · Algorithm

Ghosts - Algorithm

Expectiminimax-flavoured 2-ply search

From the current state, enumerate own moves (≤32)
For each, apply the move, then take the opponent response that is worst for us among their top-K replies (K = 16 to keep latency low)
Opponent piece types are unknown - they enter the eval as expected threat (see next slide)
For setup, we don't search - we use a fixed strategy: good ghosts on the outer columns so they're 1 step from a corner exit

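# Score each candidate move, then play the best one
def score(move):
    child = apply(state, move)
    if child.winner:                    # game already decided by our move
        return eval(child)
    # opp_moves: the opponent's legal replies in child, ordered best-first
    return min(eval(apply(child, om)) for om in opp_moves[:16])

best_move = max(moves, key=score)       # argmax over our candidate moves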
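The fixed setup from the last bullet (good ghosts on the outer setup columns) could look like the sketch below. The row indices, the function name fixed_setup, and the dict-of-squares return shape are assumptions made for illustration; the notebook's actual setup API may differ.

```python
def fixed_setup(own_rows=(4, 5), cols=(1, 2, 3, 4)):
    """Map each setup square to a ghost type.

    Goods go on the outermost setup columns, evils fill the middle.
    With two rows and four columns this yields exactly 4 good + 4 evil.
    """
    outer = {cols[0], cols[-1]}
    return {
        (row, col): ("good" if col in outer else "evil")
        for row in own_rows
        for col in cols
    }


# Usage
layout = fixed_setup()
print(layout[(5, 1)], layout[(5, 2)])  # prints: good evil
```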

2pts · Heuristic

Ghosts - Heuristic

Multi-term evaluation

+60 × (own goods - opp goods) - good-ghost differential
+25 × own evils remaining - keeps evils in play (by the rule above, it is the opponent who loses if they capture all of them)
-5 × opp evils remaining - mild incentive to capture opp evils (small on purpose: capturing all of them loses the game)
-4 × distance(own goods → opponent exit row), +10 if on outer column
-threat for opponent pieces moving toward our exit rows; revealed-evil pieces count for ~20% of the threat of unknown or revealed-good pieces

Belief reasoning is encoded in that last bullet - hidden pieces are penalised at full weight, revealed-evil pieces at reduced weight. This is the imperfect-information "trick" the notebook hints at.
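A minimal sketch of how the five terms could be combined. The weights come from the bullets above; everything else is an assumption for illustration: the Ghost dataclass, the parameter names, measuring advancement by row distance, and in particular the shape of the threat term (the slides give only the 20% ratio, not the formula).

```python
from dataclasses import dataclass

BOARD = 6  # 6x6 board, rows and columns indexed 0..5

@dataclass
class Ghost:
    row: int
    col: int
    kind: str = "unknown"  # "good", "evil", or "unknown" (hidden opponent piece)

def evaluate(own, opp, opp_goods_left, opp_evils_left, home_row, target_row):
    """Multi-term score from our point of view (higher is better for us)."""
    own_goods = [g for g in own if g.kind == "good"]
    own_evils = [g for g in own if g.kind == "evil"]

    score = 60 * (len(own_goods) - opp_goods_left)   # good-ghost differential
    score += 25 * len(own_evils)                     # own evils still in play
    score -= 5 * opp_evils_left                      # small on purpose

    for g in own_goods:
        score -= 4 * abs(g.row - target_row)         # advance toward the exits
        if g.col in (0, BOARD - 1):
            score += 10                              # outer-column bonus

    for g in opp:
        # Assumed threat shape: grows as the piece nears our home row.
        closeness = (BOARD - 1) - abs(g.row - home_row)
        weight = 0.2 if g.kind == "evil" else 1.0    # revealed evils count ~20%
        score -= weight * closeness
    return score
```

With this shape, the same opponent piece one row from our home row costs 4 points while hidden but only 0.8 once it is known to be evil, which is the belief effect described above.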

5pts · Mapping to class concepts

Ghosts - Class concepts

Concept	Where it appears
Adversarial search	2-ply min over opponent responses
Imperfect information / belief state	Threat weighting based on whether opponent piece type is revealed
Expectiminimax-style averaging	Treating hidden pieces as a weighted mixture
Heuristic design with multiple objectives	Material + advancement + threat - balanced via weights
Game decomposition (setup vs play)	Separate strategy phase: fixed setup, search-based play
"Reverse" terminal states	Captured-all-evils-loses → forces non-greedy capture policy

Live demo · Ghosts · Hidden-info · setup + 2-ply search

Game 3 - UNO

Rules (from notebook)

4 players, 108-card deck
Cards have color, value, type (normal/action/wild)
Match by color, value, or play a Wild
Skip / Reverse / Draw 2 / Wild / Wild +4 special effects
First empty hand wins

Game-theoretic profile

Multi-agent (4 players)
Imperfect information (don't see hands)
Stochastic (draws from shuffled deck)
State space huge; minimax tree intractable

3pts · Algorithm

UNO - Algorithm

Greedy / rule-based policy

Enumerate get_valid_moves() (legal plays + draw)
Score each candidate play with a hand-tuned heuristic (next slide)
Pick the highest-scoring play; if no play, draw
For a Wild, pick the color most common in our remaining hand

We deliberately did not use deep search here. Hidden hands plus a stochastic 108-card deck make minimax a poor fit; for this assignment, a thoughtful greedy policy with the right features is the more practical choice within the per-move time budget.

def pick_move(valid_moves):
    plays = [m for m in valid_moves if m['type'] == 'play']
    if not plays:                      # nothing playable: draw one card
        return {'type': 'draw', 'count': 1}
    return max(plays, key=score)       # highest heuristic score wins
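The Wild colour rule from the bullets above can be sketched in a few lines. The card shape (a dict with 'color' and 'type' keys), the function name choose_wild_color, and the fallback colour are assumptions for illustration.

```python
from collections import Counter

def choose_wild_color(hand, default="red"):
    """Pick the color we hold most of, ignoring wilds (which have no color)."""
    colors = Counter(card["color"] for card in hand if card["type"] != "wild")
    if not colors:
        return default  # only wilds left: any color will do
    return colors.most_common(1)[0][0]


# Usage: `hand` is what remains after the Wild itself has been played
hand = [
    {"color": "blue", "type": "normal", "value": 3},
    {"color": "blue", "type": "action", "value": "skip"},
    {"color": "red", "type": "normal", "value": 7},
    {"color": None, "type": "wild", "value": "wild4"},
]
print(choose_wild_color(hand))  # prints: blue
```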

2pts · Heuristic

UNO - Heuristic

Card scoring (features)

Feature	Weight
Card is normal numeric	+ card value (dump high cards first)
Card is Draw 2	+30
Card is Skip	+20
Card is Reverse	+15
Card is Wild	-50 (save them)
Card is Wild +4	-80 (strongest card; hold it even longer)
Card matches top by color (not just value)	+5 (keep flexibility)
Some opponent has ≤2 cards AND we have an aggressive card	+40

Information used: get_hand_sizes() tells us when to switch to aggression; get_discard_pile() is available for full card counting (future work).
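The feature table can be read as a single scoring function, sketched below. The weights are taken from the table; the card shape, the value strings ("draw2", "wild4", ...), the set of cards treated as "aggressive", and applying the +40 bonus to the aggressive card being scored are all assumptions made for illustration.

```python
ACTION_BONUS = {"draw2": 30, "skip": 20, "reverse": 15}
WILD_PENALTY = {"wild": -50, "wild4": -80}
AGGRESSIVE = {"draw2", "skip", "wild4"}  # assumed definition of "aggressive"

def score(card, top_card, opponent_hand_sizes):
    """Heuristic value of playing `card` on `top_card` (higher = play sooner)."""
    value = card["value"]
    if card["type"] == "normal":
        total = int(value)                 # dump high numbers first
    elif card["type"] == "action":
        total = ACTION_BONUS[value]
    else:
        total = WILD_PENALTY[value]        # save wilds for later

    if card["color"] is not None and card["color"] == top_card["color"]:
        total += 5                         # color match keeps flexibility

    # Someone is about to go out: switch to aggression
    if min(opponent_hand_sizes) <= 2 and value in AGGRESSIVE:
        total += 40
    return total


# Usage
top = {"type": "normal", "color": "blue", "value": 3}
draw2 = {"type": "action", "color": "blue", "value": "draw2"}
print(score(draw2, top, [2, 5, 5]))  # prints 75
```

To use it as a sort key with a single argument, the top card and hand sizes can be bound first, for example with functools.partial.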

5pts · Mapping to class concepts

UNO - Class concepts

Concept	Where it appears
Imperfect-information game	We can't see opponents' hands - have to reason from hand sizes and discard pile
Stochastic environment	Drawing from a shuffled deck = chance nodes
Heuristic / rule-based policy	Why we picked heuristics over minimax when the state space is huge
Card counting / belief tracking	`get_discard_pile()` exposes the played history
Aggression switching	Threshold rule on opponent hand size (greedy + threshold = simple form of game-state policy)
Trade-off: complexity vs. responsiveness	Same reason we capped Quoridor's wall search

Live demo · UNO (4-player) · Greedy policy · action-card valuation

One-slide summary

	Quoridor	Ghosts	UNO
Info	Perfect	Hidden types	Hidden hands
Chance	None	None	Deck draws
Players	2	2	4
Algorithm	α-β minimax (d=2)	2-ply min over opp responses	Greedy policy
Heuristic	Path-diff + walls	Material + advancement + belief threat	Action-card valuation
Key class concept	BFS + α-β	Belief state	Heuristic under uncertainty

What we'd do next

Quoridor: iterative deepening so depth scales with available time; transposition table on (pawns, wall-set) keys
Ghosts: particle-filter belief over hidden-type assignments and proper expectiminimax over the belief
UNO: Monte-Carlo determinization - sample opponent hands consistent with the discard pile and average rollouts; this is a standard technique for hidden-information games
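As a rough idea of what the sampling step of determinization involves, here is a sketch that deals random opponent hands from the cards we have not seen. It is not part of the current agents. Assumptions: cards are (color, value) tuples, the deck is the standard 108-card UNO deck, and the names full_deck and sample_opponent_hands are hypothetical. The rollout and averaging steps are left out.

```python
import random
from collections import Counter

COLORS = ("red", "yellow", "green", "blue")

def full_deck():
    """Standard 108-card UNO deck as (color, value) tuples."""
    deck = []
    for color in COLORS:
        deck.append((color, "0"))                       # one 0 per color
        for value in list("123456789") + ["skip", "reverse", "draw2"]:
            deck += [(color, value)] * 2                # two of everything else
    deck += [(None, "wild")] * 4 + [(None, "wild4")] * 4
    return deck

def sample_opponent_hands(my_hand, discard_pile, hand_sizes, rng=random):
    """Deal one random set of opponent hands from the unseen cards."""
    unseen = Counter(full_deck())
    unseen.subtract(Counter(my_hand))
    unseen.subtract(Counter(discard_pile))
    pool = list(unseen.elements())      # elements() skips counts <= 0
    rng.shuffle(pool)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(pool[start:start + size])
        start += size
    return hands


# Usage: three opponents holding 7, 5 and 2 cards
mine = [("red", "5"), (None, "wild")]
discards = [("blue", "3"), ("blue", "skip")]
worlds = [sample_opponent_hands(mine, discards, [7, 5, 2]) for _ in range(100)]
```

Each sampled "world" is a perfect-information guess; scoring a candidate play in many such worlds and averaging the results is the step the bullet above refers to.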

Thank you

Questions?

Source notebooks: Quoridor_Assignment_Standalone.ipynb, Ghosts_Assignment_Standalone.ipynb, Uno_Assignment_Standalone.ipynb

Playable website: ../website/index.html
