Adversarial Search in Three Games

Quoridor · Ghosts · UNO

AI Reasoning & Problem Solving · 2025

Playable website: ../website/index.html

Agenda

  1. How we organised the work
  2. Quoridor · algorithm · heuristic · class concepts · live demo
  3. Ghosts · algorithm · heuristic · class concepts · live demo
  4. UNO · algorithm · heuristic · class concepts · live demo
  5. Recap & what we'd do next

Total time ≤ 10 min talk + 5 min Q&A (per rubric)

5pts · Organising the work

How we organised the work

Per-game owner

  • One person primarily responsible for each my_agent()
  • Shared play_game driver - identical across all three notebooks
  • Pair review before submission to keep the eval functions readable

Shared scaffolding

  • Same GameClient + play_game across games
  • Each game class exposes the same kind of helper API: get_valid_moves, simulate_move, is_terminal
  • Same coding patterns (minimax skeleton, move scoring) re-used

Game 1 - Quoridor

Rules (from notebook)

  • 9×9 board, pawns on opposite mid-edges
  • Each turn: move pawn OR place wall
  • 10 walls per player; a wall may never cut off a player's last path to their goal row
  • Jump over an adjacent opponent; step diagonally if the jump is blocked
  • Win: reach the opposite row

Game-theoretic profile

  • Two-player, zero-sum
  • Perfect information
  • Deterministic
  • Branching factor > 100 once wall placements are counted (up to 128 wall slots on an empty board)

3pts · Algorithm

Quoridor - Algorithm

Depth-limited alpha-beta minimax

  • Frontier: all pawn moves (≤4 after jump rule) + a capped set of wall placements (top-K sorted by proximity to opponent)
  • Depth: 2 plies; deep enough to spot 1-move tactics, shallow enough to stay responsive
  • Pruning: alpha-beta in negamax form
  • Move legality: before placing a wall, uses shortest_path_length() to check that every player still has a path to their goal (matches the rule "walls cannot fully block any player")
def negamax(state, depth, alpha, beta):
    # eval() is our heuristic (next slide), scored for the side to move
    if depth == 0 or state.is_terminal():
        return eval(state)
    for m in order_moves(state):       # walls near opponent first
        v = -negamax(apply(state, m), depth-1, -beta, -alpha)
        alpha = max(alpha, v)
        if alpha >= beta:              # cut-off: the opponent will avoid this line
            break
    return alpha
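The capped frontier from the first bullet can be sketched as below. This is a minimal illustration under stated assumptions, not the notebook's code: the `Wall` encoding, the function names, the Manhattan-distance ordering and the value `k=8` are all hypothetical.

```python
from typing import NamedTuple

class Wall(NamedTuple):
    # Hypothetical encoding: top-left cell the wall touches plus its orientation
    row: int
    col: int
    horizontal: bool

def top_k_walls(walls, opp_pos, k=8):
    """Keep the k wall placements closest to the opponent pawn (Manhattan distance)."""
    def dist(w):
        return abs(w.row - opp_pos[0]) + abs(w.col - opp_pos[1])
    return sorted(walls, key=dist)[:k]

def candidate_moves(pawn_moves, walls, opp_pos, k=8):
    """Search frontier: every pawn move plus the capped, ordered wall list."""
    return list(pawn_moves) + top_k_walls(walls, opp_pos, k)

# Usage: 8 x 8 slots x 2 orientations = 128 wall placements on an empty board
walls = [Wall(r, c, h) for r in range(8) for c in range(8) for h in (True, False)]
moves = candidate_moves([(7, 4), (8, 3), (8, 5)], walls, opp_pos=(0, 4), k=8)
print(len(walls), len(moves))  # 128 11
```

Capping before the search keeps the per-node branching factor near a dozen instead of over a hundred, which is what makes depth 2 feel responsive.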
2pts · Heuristic

Quoridor - Heuristic

Path-length differential

The notebook tip is exactly this:

eval(state) = 10 * (opponent_path - my_path)
            + (my_walls_left - opp_walls_left)
  • opponent_path - my_path is computed via BFS (already provided as shortest_path_length)
  • Walls-in-hand kept as a small tie-breaker - preserves flexibility
  • Why it works: winning = being closer to your goal than the opponent is to theirs; the BFS distance is a cheap, roughly optimistic estimate of "moves to win", since walls placed later can only lengthen a path
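A self-contained sketch of the evaluation above. The notebook already provides shortest_path_length; the stand-in `bfs_path_length` below, its signature, and the `blocked` set of wall-separated cell pairs are assumptions made only so the example runs on its own.

```python
from collections import deque

def bfs_path_length(start, goal_row, blocked, size=9):
    """BFS over cells. `blocked` holds frozenset({a, b}) for cell pairs a wall separates."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (r, c), d = queue.popleft()
        if r == goal_row:
            return d
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
                continue
            if nxt in seen or frozenset({(r, c), nxt}) in blocked:
                continue
            seen.add(nxt)
            queue.append((nxt, d + 1))
    return None  # no path at all: such a wall layout would be illegal

def evaluate(my_pos, my_goal, opp_pos, opp_goal, blocked, my_walls, opp_walls):
    """Path-length differential with walls-in-hand as a small tie-breaker."""
    my_path = bfs_path_length(my_pos, my_goal, blocked)
    opp_path = bfs_path_length(opp_pos, opp_goal, blocked)
    return 10 * (opp_path - my_path) + (my_walls - opp_walls)

# Usage: a two-cell wall directly in front of our pawn costs us one extra move
wall = {frozenset({(8, 4), (7, 4)}), frozenset({(8, 3), (7, 3)})}
print(evaluate((8, 4), 0, (0, 0), 8, wall, 10, 10))  # -10
```

The same BFS doubles as the legality check: a wall placement that makes it return None for either player is rejected.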
5pts · Mapping to class concepts

Quoridor - Class concepts

Concept | Where it appears
Uninformed search (BFS) | Shortest-path distance computation
Adversarial search / minimax | Move selection in my_agent
Alpha-beta pruning | Negamax cut-offs on the frontier
Heuristic evaluation | Path-difference + wall-count combination
Move ordering | Walls sorted by distance to opponent before alpha-beta
Branching-factor management | Capping the wall-move candidate list (top-K)
Live demo · Quoridor · You vs AI · depth-2 α-β minimax

Game 2 - Ghosts

Rules (from notebook)

  • 6×6 board; exits at the four corners
  • 8 ghosts each: 4 good + 4 evil, types hidden
  • Setup: own two rows, columns 1-4 only
  • Move 1 square orthogonally; capture = move onto an enemy ghost
  • Capturing reveals both ghosts' types

Win / lose conditions

  • Win: capture all opponent's good ghosts OR move a good ghost to an opponent-side exit corner
  • Lose: all your good ghosts captured OR you capture all of opponent's evil ghosts

The second loss condition makes this a true bluffing game.

3pts · Algorithm

Ghosts - Algorithm

Expectiminimax-flavoured 2-ply search

  • From the current state, enumerate own moves (≤32)
  • For each, apply the move, then take the minimum score over the opponent's top-K responses (K = 16, to keep latency low)
  • Opponent piece types are unknown - they enter the eval as expected threat (see next slide)
  • For setup, we don't search - we use a fixed strategy: good ghosts on the outer setup columns (1 and 4), one step from the corner columns that lead to the exits
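# Inside my_agent(): score each candidate by the opponent's worst-case reply
def score(move):
    child = apply(state, move)
    if child.winner:                   # game decided: score the terminal state
        return eval(child)
    # opp_moves: the opponent's replies in child, best-first
    return min(eval(apply(child, om)) for om in opp_moves[:16])

return max(moves, key=score)

2pts · Heuristic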

Ghosts - Heuristic

Multi-term evaluation

  • +60 × (own goods - opp goods) - good-ghost differential
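  • +25 × own evils remaining - keeps evils on the board to block and bluff (losing all four is not a loss for us: per the rules above, the side that captures every evil ghost loses)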
  • -5 × opp evils remaining - mild incentive to capture opp evils (with a small weight on purpose)
  • -4 × distance(own goods → opponent exit row), +10 if on outer column
  • -threat for opponent pieces moving toward our exit rows; revealed-evil pieces count for ~20% of the threat of unknown or revealed-good pieces

Belief reasoning is encoded in that last bullet - hidden pieces are penalised at full weight, revealed-evil pieces at reduced weight. This is the imperfect-information "trick" the notebook hints at.
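The threat term can be sketched as follows. Only the ~20% factor for revealed-evil pieces comes from the slide; the `OppGhost` class, the `base / (1 + distance)` shape and the `base` value are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OppGhost:
    # Hypothetical representation of an opponent piece
    row: int
    col: int
    revealed: Optional[str] = None   # None = hidden, otherwise "good" or "evil"

def threat(ghosts, our_exits, base=10.0, evil_factor=0.2):
    """Sum of per-piece threats: closer to one of our exits means more dangerous."""
    total = 0.0
    for g in ghosts:
        dist = min(abs(g.row - r) + abs(g.col - c) for r, c in our_exits)
        # Hidden and revealed-good pieces count in full; revealed-evil ones at ~20%
        weight = evil_factor if g.revealed == "evil" else 1.0
        total += weight * base / (1 + dist)
    return total

# Usage: the same square is five times less worrying once the piece is known evil
exits = [(5, 0), (5, 5)]
print(threat([OppGhost(4, 0)], exits), threat([OppGhost(4, 0, "evil")], exits))
```

Treating every hidden piece as if it could be good is the pessimistic end of a belief state; a fuller version would weight each hidden piece by the probability that it is good.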

5pts · Mapping to class concepts

Ghosts - Class concepts

Concept | Where it appears
Adversarial search | 2-ply min over opponent responses
Imperfect information / belief state | Threat weighting based on whether opponent piece type is revealed
Expectiminimax-style averaging | Treating hidden pieces as a weighted mixture
Heuristic design with multiple objectives | Material + advancement + threat - balanced via weights
Game decomposition (setup vs play) | Separate strategy phase: fixed setup, search-based play
"Reverse" terminal states | Captured-all-evils-loses → forces non-greedy capture policy
Live demo · Ghosts · Hidden-info · setup + 2-ply search

Game 3 - UNO

Rules (from notebook)

  • 4 players, 108-card deck
  • Cards have color, value, type (normal/action/wild)
  • Match by color, value, or play a Wild
  • Skip / Reverse / Draw 2 / Wild / Wild +4 special effects
  • First empty hand wins

Game-theoretic profile

  • Multi-agent (4 players)
  • Imperfect information (don't see hands)
  • Stochastic (draws from shuffled deck)
  • State space huge; minimax tree intractable

3pts · Algorithm

UNO - Algorithm

Greedy / rule-based policy

  • Enumerate get_valid_moves() (legal plays + draw)
  • Score each candidate play with a hand-tuned heuristic (next slide)
  • Pick the highest-scoring play; if no play, draw
  • For a Wild, pick the color most common in our remaining hand
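The Wild color rule in the last bullet can be sketched like this. The card dictionaries with a 'color' key, the color names and the fallback color are assumptions about the notebook's data layout, not its confirmed API.

```python
from collections import Counter

# Assumed color names; wild cards are assumed to carry no regular color
COLORS = ("red", "yellow", "green", "blue")

def pick_wild_color(hand, default="red"):
    """Return the color we hold most of, ignoring cards without a regular color."""
    counts = Counter(card["color"] for card in hand if card.get("color") in COLORS)
    if not counts:
        return default              # only wilds left: any color is as good as another
    return counts.most_common(1)[0][0]

# Usage
hand = [
    {"color": "blue", "value": 3},
    {"color": "blue", "value": "skip"},
    {"color": "red", "value": 9},
    {"color": None, "value": "wild"},
]
print(pick_wild_color(hand))  # blue
```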

We deliberately did not use deep search here. Hidden hands plus a shuffled 108-card deck make minimax a poor fit; a greedy policy with well-chosen features answers instantly and is the more practical choice for this assignment.

# Inside my_agent(): play the best-scoring card, or draw if nothing is playable
plays = [m for m in valid_moves if m['type'] == 'play']
if not plays:
    return {'type': 'draw', 'count': 1}
return max(plays, key=score)           # highest-scoring play

2pts · Heuristic

UNO - Heuristic

Card scoring (features)

Feature | Weight
Card is normal numeric | + card value (dump high cards first)
Card is Draw 2 | +30
Card is Skip | +20
Card is Reverse | +15
Card is Wild | -50 (save them)
Card is Wild +4 | -80 (rarer still)
Card matches top by color (not just value) | +5 (keep flexibility)
Some opponent has ≤2 cards AND we have an aggressive card | +40

Information used: get_hand_sizes() tells us when to switch to aggression; get_discard_pile() is available for full card counting (future work).
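A sketch of the scoring table as a function. The weights are the ones in the table; the card encoding ('type', 'value', 'color' keys and the value strings) and the choice of which cards count as "aggressive" are assumptions made for this example.

```python
ACTION_BONUS = {"draw2": 30, "skip": 20, "reverse": 15}
WILD_PENALTY = {"wild": -50, "wild4": -80}
AGGRESSIVE = {"draw2", "skip", "wild4"}   # assumption: cards that hurt the next player

def score_card(card, top_color, opp_hand_sizes):
    """Score one playable card; higher means play it sooner."""
    value = card["value"]
    if card["type"] == "normal":
        score = int(value)                 # dump high numbers first
    elif card["type"] == "action":
        score = ACTION_BONUS[value]
    else:
        score = WILD_PENALTY[value]        # hold wilds back
    if card["color"] == top_color:
        score += 5                         # color match keeps more options open
    if value in AGGRESSIVE and min(opp_hand_sizes) <= 2:
        score += 40                        # someone is close to winning: attack
    return score

# Usage: a Draw 2 jumps to the front once an opponent is down to two cards
draw2 = {"type": "action", "value": "draw2", "color": "blue"}
seven = {"type": "normal", "value": 7, "color": "red"}
print(score_card(seven, "red", [5, 4, 6]), score_card(draw2, "red", [2, 5, 7]))  # 12 70
```

With opponent hand sizes taken from get_hand_sizes(), this function can serve as the key when picking the best play.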

5pts · Mapping to class concepts

UNO - Class concepts

Concept | Where it appears
Imperfect-information game | We can't see opponents' hands - have to reason from hand sizes and discard pile
Stochastic environment | Drawing from a shuffled deck = chance nodes
Heuristic / rule-based policy | Why we picked heuristics over minimax when the state space is huge
Card counting / belief tracking | get_discard_pile() exposes the played history
Aggression switching | Threshold rule on opponent hand size (greedy + threshold = simple form of game-state policy)
Trade-off: complexity vs. responsiveness | Same reason we capped Quoridor's wall search
Live demo · UNO (4-player) · Greedy policy · action-card valuation

One-slide summary

Aspect | Quoridor | Ghosts | UNO
Info | Perfect | Hidden types | Hidden hands
Chance | None | None | Deck draws
Players | 2 | 2 | 4
Algorithm | α-β minimax (d=2) | 2-ply min over opp responses | Greedy policy
Heuristic | Path-diff + walls | Material + advancement + belief threat | Action-card valuation
Key class concept | BFS + α-β | Belief state | Heuristic under uncertainty

What we'd do next

  • Quoridor: iterative deepening so depth scales with available time; transposition table on (pawns, wall-set) keys
  • Ghosts: particle-filter belief over hidden-type assignments and proper expectiminimax over the belief
  • UNO: Monte-Carlo determinization - sample opponent hands consistent with the discard pile and average rollouts; a standard technique for hidden-information games
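The sampling half of the UNO idea could look like the sketch below. None of this is implemented in our agent: the function name, the tuple card encoding and the arguments are hypothetical, and the rollout and averaging step is left out.

```python
import random
from collections import Counter

def sample_opponent_hands(full_deck, my_hand, discard_pile, hand_sizes, rng=random):
    """Deal one random set of opponent hands from the cards we have not seen."""
    # Multiset difference: whatever is not in our hand or the discard pile is unseen
    unseen = Counter(full_deck) - Counter(my_hand) - Counter(discard_pile)
    pool = list(unseen.elements())
    if sum(hand_sizes) > len(pool):
        raise ValueError("hand sizes exceed the number of unseen cards")
    rng.shuffle(pool)
    hands, start = [], 0
    for n in hand_sizes:
        hands.append(pool[start:start + n])
        start += n
    return hands

# Usage with a toy 10-card deck; cards are (color, value) tuples
deck = [("red", n) for n in range(5)] + [("blue", n) for n in range(5)]
mine = [("red", 0), ("blue", 1)]
discard = [("red", 4)]
hands = sample_opponent_hands(deck, mine, discard, [3, 2], rng=random.Random(0))
print([len(h) for h in hands])  # [3, 2]
```

Each sampled deal turns the hidden-information game into a perfect-information one that a rollout policy can play out; averaging scores over many deals approximates the expected value of a move.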

Thank you

Questions?

Source notebooks: Quoridor_Assignment_Standalone.ipynb, Ghosts_Assignment_Standalone.ipynb, Uno_Assignment_Standalone.ipynb

Playable website: ../website/index.html