Adversarial Search – A Detailed Guide For Beginners

Welcome to adversarial search – where we’re not just searching through a static world, but facing an opponent who actively tries to defeat us! This is game playing AI, and it’s one of the most exciting topics in the course.


PART 1: WHY GAMES MATTER IN AI

Games as Multi-Agent Environments

Until now, we’ve been in single-agent environments:

  • The world doesn’t fight back
  • We just need to find a path to the goal
  • The only challenge is the size of the state space

But in games, we face other agents:

  • They have their own goals
  • They actively try to stop us
  • They respond to our moves
  • We need to anticipate their actions

Types of Multi-Agent Environments

Multi-Agent Environments
        ├── Cooperative (agents work together)
        │    ├ Example: Robots cleaning a house together
        │    └ Example: Teammates in soccer
        │
        └── Competitive (agents work against each other)
             ├── Zero-sum (my win is your loss)
             │    ├ Example: Chess, Checkers, Go
             │    └ This is what we'll study!
             │
             └── Non-zero-sum (both can win/lose)
                  └ Example: Prisoner's Dilemma

Why Study Games?

  1. Games are fun! – But that’s not the academic reason…
  2. Historical role in AI – Games have been AI’s “grand challenges”:
  • 1994: Chinook beats the world checkers champion
  • 1997: Deep Blue beats Kasparov at chess
  • 2016: AlphaGo beats Lee Sedol at Go
  • 2017: AlphaZero masters chess, Go, and shogi
  3. Dealing with adversarial agents – Real-world applications:
  • Military strategy
  • Business competition
  • Cybersecurity (defending against attackers)
  4. Huge state spaces – Games are HARD:
  • Chess: ~10⁴⁷ possible positions
  • Go: ~10¹⁷⁰ possible positions (more than the atoms in the observable universe!)
  • If we can solve games, we can solve many real problems
  5. Clean environment – Clear rules, clear win/loss criteria

PART 2: GAME CHARACTERISTICS – CLASSIFYING THE PROBLEM

The 5 Key Axes of Games

1. Deterministic vs. Stochastic
        ↓
2. Number of Players (1, 2, many)
        ↓
3. Zero-sum vs. Non-zero-sum
        ↓
4. Perfect Information vs. Imperfect Information
        ↓
5. Turn-taking vs. Simultaneous moves

Axis 1: Deterministic vs. Stochastic

Deterministic Games:

  • No randomness
  • Same move always produces same result
  • Examples: Chess, Checkers, Go, Tic-tac-toe

Stochastic Games:

  • Random elements (dice, card draws, etc.)
  • Same move can have different outcomes
  • Examples: Backgammon, Poker, Monopoly

Axis 2: Number of Players

Single-player:

  • Just search! (8-puzzle, Rubik’s cube)
  • No adversary to worry about

Two-player:

  • Most classic games
  • Me vs. opponent
  • What we’ll focus on

Multi-player:

  • More complex alliances and betrayals
  • Diplomacy, many card games

Axis 3: Zero-sum vs. Non-zero-sum

Zero-sum:

  • My gain = your loss
  • Total utility always constant
  • Chess: I win (+1), you lose (-1), sum = 0
  • This simplifies things – we can be purely adversarial

Non-zero-sum:

  • Both can win or both can lose
  • Cooperation might benefit both
  • Prisoner’s Dilemma, many economic games

Axis 4: Perfect vs. Imperfect Information

Perfect Information:

  • Both players see everything
  • No hidden cards, no secrets
  • Chess, Checkers, Go, Tic-tac-toe

Imperfect Information:

  • Hidden information
  • Poker (can’t see opponent’s cards)
  • Kriegspiel (blind chess)

Axis 5: Turn-taking vs. Simultaneous

Turn-taking:

  • I move, then you move
  • Chess, most board games

Simultaneous:

  • Both move at once
  • Rock-paper-scissors
  • Some video games

What We’ll Study

Classic case: Two-player, deterministic, zero-sum, perfect information, turn-taking

  • Chess, Checkers, Go, Tic-tac-toe
  • This is the foundation

PART 3: SINGLE-PLAYER GAMES – THE EASY CASE

Wait, Why Start with Single-Player?

Because it connects to what we already know!

Single-player deterministic game = Search problem

  • 8-puzzle, Rubik’s cube, Freecell
  • Just need to find a sequence of moves

Key Insight: Value of a State

In single-player search, each node stores the best outcome reachable from that state:

        Start (win possible?)
       /    \
   Move1    Move2
    /        \
State A     State B
  |            |
Win!         Lose

Value(Start) = max(Value(Move1), Value(Move2))
              = max(Win, Lose)
              = Win

Implementation

def single_player_value(state):
    if is_goal(state):
        return WIN

    best_value = LOSE

    for move in legal_moves(state):
        next_state = apply_move(state, move)
        value = single_player_value(next_state)
        best_value = max(best_value, value)

    return best_value

After search, pick move leading to best value.
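Here is a runnable miniature of this idea. The game graph below is invented purely to match the diagram above (Start with two moves, one leading to a win and one to a loss):

```python
# Toy game graph matching the diagram above:
# 'start' can move to 'A' or 'B'; 'A' leads to a win, 'B' to a loss.
WIN, LOSE = 1, 0

MOVES = {
    "start": ["A", "B"],
    "A": ["win"],
    "B": ["lose"],
}
TERMINAL = {"win": WIN, "lose": LOSE}

def single_player_value(state):
    # Best outcome reachable from this state
    if state in TERMINAL:
        return TERMINAL[state]
    return max(single_player_value(s) for s in MOVES[state])

print(single_player_value("start"))  # prints 1: a win is reachable via A
```

Value(start) = max(Value(A), Value(B)) = max(WIN, LOSE) = WIN, exactly as in the diagram.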

This sets up the pattern for two-player games!


PART 4: MINIMAX – THE FOUNDATION OF TWO-PLAYER GAMES

The Core Idea

In two-player zero-sum games:

  • MAX player (us) wants to maximize our score
  • MIN player (opponent) wants to minimize our score
  • They alternate moves
  • We assume opponent plays optimally to beat us

The Minimax Value

For any state, the minimax value is:

  • If it’s MAX’s turn: the maximum value we can guarantee against best play
  • If it’s MIN’s turn: the minimum value MAX will get despite MIN’s best efforts
  • At terminal states: the actual utility (win/loss/draw)

Visualizing the Alternation

                    MAX (our turn)
                    /    |    \
                   /     |     \
           MIN turn     MIN     MIN
           /    \       /  \     / \
          /      \     /    \   /   \
      MAX turn   MAX  MAX   MAX MAX  MAX
        /  \      ...
      ...  ...

Simple Example Tree

        MAX
        /  \
       /    \
      MIN    MIN
     /  \    /  \
    8    2  5    6

Computation:

  1. Left MIN node: min(8,2) = 2
  2. Right MIN node: min(5,6) = 5
  3. MAX node: max(2,5) = 5

Result: MAX can guarantee a score of 5

Another Example

        MAX
        /  \
       /    \
      MIN    MIN
     /  \    /  \
    3   12  8    2

Left MIN: min(3,12) = 3
Right MIN: min(8,2) = 2
MAX: max(3,2) = 3
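Both hand computations can be checked with a minimal minimax over trees written as nested lists (a toy representation chosen here for brevity: leaves are numbers, internal nodes are lists of children):

```python
def minimax(node, maximizing=True):
    """Leaves are numbers; internal nodes are lists of children."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The two example trees above: root = MAX, next level = MIN
print(minimax([[8, 2], [5, 6]]))   # prints 5
print(minimax([[3, 12], [8, 2]]))  # prints 3
```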

The Minimax Algorithm

def minimax(state):
    """
    Returns the minimax value of the current state.
    (To choose a move, apply this to each successor and pick the best.)
    """
    if is_terminal(state):
        return utility(state)

    if player_to_move(state) == "MAX":
        return max_value(state)
    else:
        return min_value(state)

def max_value(state):
    if is_terminal(state):
        return utility(state)

    v = -infinity
    for action, next_state in successors(state):
        v = max(v, min_value(next_state))
    return v

def min_value(state):
    if is_terminal(state):
        return utility(state)

    v = +infinity
    for action, next_state in successors(state):
        v = min(v, max_value(next_state))
    return v

Tic-Tac-Toe Example (Conceptual)

Initial empty board (MAX's turn)
    ├── Place X in center
    │    └── MIN places O in corner
    │         └── MAX places X ...
    ├── Place X in corner
    │    └── MIN places O ...
    └── Place X in edge ...

The algorithm explores ALL possibilities, assuming MIN always blocks MAX’s wins.

Minimax Properties

  Property                  Value
  ------------------------  ------------------------------------
  Optimal?                  YES – against a perfect opponent
  Against imperfect play?   Still optimal in the worst case
  Time complexity           O(b^m) – exponential in depth!
  Space complexity          O(bm) – depth-first search
  b = branching factor      Chess: ~35
  m = maximum depth         Chess: ~100 moves

The problem: for chess, 35^100 ≈ 10^154 nodes – completely impossible to compute!


PART 5: RESOURCE LIMITS – MAKING MINIMAX PRACTICAL

The Reality Check

We cannot search to the end of most games:

  • Chess: ~35^100 positions
  • Even with pruning, impossible
  • We need to stop early

Depth-Limited Minimax

Instead of searching to terminal states:

  1. Search to a fixed depth (e.g., 4 ply = 2 moves each)
  2. At depth limit, use an evaluation function to estimate position value
  3. Propagate estimates up the tree
        MAX (depth 0)
        /          \
       /            \
    MIN (depth 1)   MIN
    /      \        /  \
   /        \      /    \
MAX(d2)    MAX   MAX    MAX
  |         |     |      |
Eval=5    Eval=3 Eval=8 Eval=2
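The three steps above can be sketched as a runnable toy on nested-list trees, with a made-up averaging heuristic standing in for a real evaluation function (a real engine would score position features instead):

```python
def evaluate(node):
    # Stand-in heuristic invented for this sketch: average of all
    # leaves below the cutoff (a real engine uses position features).
    if isinstance(node, (int, float)):
        return node
    leaves, stack = [], [node]
    while stack:
        n = stack.pop()
        if isinstance(n, (int, float)):
            leaves.append(n)
        else:
            stack.extend(n)
    return sum(leaves) / len(leaves)

def depth_limited_minimax(node, depth, maximizing=True):
    if isinstance(node, (int, float)):   # true terminal state
        return node
    if depth == 0:                       # cutoff: estimate, don't search
        return evaluate(node)
    values = [depth_limited_minimax(c, depth - 1, not maximizing)
              for c in node]
    return max(values) if maximizing else min(values)

tree = [[[8, 2], [5, 6]], [[3, 12], [8, 2]]]
print(depth_limited_minimax(tree, 1))  # shallow: heuristic estimate 6.25
print(depth_limited_minimax(tree, 3))  # full search: exact value 8
```

Note how the shallow search returns an estimate (6.25) while the full search finds the true minimax value (8): the deeper we search, the less we depend on the heuristic.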

Evaluation Functions

The heart of practical game AI! A function that scores any position without searching deeper.

Ideal evaluation function: Would return the true minimax value

  • But that would require solving the game!

Practical evaluation: Weighted linear combination of features

evaluate(s) = w₁×f₁(s) + w₂×f₂(s) + ... + wₙ×fₙ(s)

Chess evaluation features:

  • Material count: pawn=1, knight/bishop=3, rook=5, queen=9
  • Piece mobility: number of legal moves
  • King safety: pawn shelter around king
  • Center control: pieces attacking center squares
  • Pawn structure: doubled/isolated pawns

Example:

evaluate(s) = 1.0 × material_diff 
             + 0.1 × mobility_diff 
             + 0.5 × king_safety_diff

The Depth Trade-off

Shallow search (2 ply):
    Fast but dumb - misses tactics

Medium search (4-6 ply):
    Good for amateur play

Deep search (8-10 ply):
    Strong club player

Very deep (12+ ply):
    Master/Grandmaster level

Rule of thumb: Each additional ply (half-move) multiplies search time by roughly the effective branching factor (~35 for raw chess minimax, ~6 with good alpha-beta), but significantly improves play

Iterative Deepening for Games

Remember iterative deepening from uninformed search? It’s perfect for games!

def iterative_deepening_minimax(state, max_depth):
    best_move = None
    for depth in range(1, max_depth + 1):
        # Search to this depth; keep the result of the
        # last fully completed iteration
        best_move = depth_limited_minimax(state, depth)

        # Stop going deeper once the time budget is spent
        if out_of_time():
            break

    return best_move

Why for games?

  • We don’t know how deep we need to search
  • Time is limited (chess clocks!)
  • We can always return the best result from completed depth
  • Later depths refine the evaluation

PART 6: ALPHA-BETA PRUNING – THE GAME CHANGER

The Insight

“If you have an idea which is surely bad, don’t take the time to see how truly awful it is.” – Patrick Winston

In minimax, many branches are irrelevant. Once we know a move is bad, we can stop exploring it.

The Key Variables

  • α (alpha): The best value MAX can guarantee along current path
  • β (beta): The best value MIN can guarantee along current path

Initially: α = -∞, β = +∞

Rules:

  • If at MAX node, we update α = max(α, child_value)
  • If at MIN node, we update β = min(β, child_value)
  • Prune when α ≥ β – no need to explore further!

Simple Example

        MAX (α=-∞, β=+∞)
        /    \
       /      \
    MIN       MIN
    /  \       ?
   2    3

Step-by-step:
1. Explore left MIN: get value 2
   - MAX now knows it can guarantee at least 2 (α=2)

2. Explore right MIN: first child returns 1
   - This MIN node now has value ≤1 (since it's MIN)
   - β for this MIN = 1

3. Compare: α=2, β=1 → α ≥ β → PRUNE!
   - MAX already has guaranteed 2
   - This MIN can't give more than 1
   - So MAX will never choose this branch

Detailed Walkthrough

Consider this tree (MAX at top):

            MAX
           /   \
          /     \
         MIN     MIN
        /  \     /  \
      10   9    8   100

Without pruning:

  • Left MIN: min(10,9) = 9
  • Right MIN: min(8,100) = 8
  • MAX: max(9,8) = 9

With alpha-beta:

Step 1: Start at root
    α = -∞, β = +∞

Step 2: Go to left MIN
    α = -∞, β = +∞

Step 3: Explore first child (10)
    MIN gets value 10
    β = min(∞, 10) = 10

Step 4: Explore second child (9)
    9 < current β (10)
    MIN updates β = min(10, 9) = 9

    MIN returns 9 to MAX

Step 5: Back at MAX
    α = max(-∞, 9) = 9
    β still +∞

Step 6: Go to right MIN
    α = 9, β = +∞ (passed from MAX)

Step 7: Explore first child (8)
    MIN gets 8
    β = min(∞, 8) = 8

Step 8: Check prune condition
    α = 9, β = 8
    9 ≥ 8? YES!

    PRUNE! Don't explore second child (100)
    MIN returns 8 (or any value ≤8)

Why Pruning is Safe

We prune only when:

  • MAX already has a guaranteed value α
  • Current MIN node can’t possibly exceed α
  • Therefore MAX will never choose this branch

The value we lost by pruning is irrelevant to the final decision!

The Power of Move Ordering

Alpha-beta’s effectiveness depends on examining good moves first.

Worst ordering: No pruning at all

MAX
├── Worst move first (value=1)
├── Better move (value=5)  ← no pruning because α still low
└── Best move (value=10)

Best ordering: Maximum pruning

MAX
├── Best move first (value=10) ← α=10 now!
├── Next move (value=5) ← PRUNED (5 can't beat 10)
└── Next move (value=3) ← PRUNED

Perfect Ordering Benefit

With perfect move ordering:

  • Time complexity: O(b^(m/2))
  • Effectively doubles the search depth for same time!

For chess:

  • Without pruning: depth 6
  • With perfect alpha-beta: depth 12
  • HUGE improvement!

Alpha-Beta Implementation

def alpha_beta_search(state):
    # Returns best action for MAX
    v, best_action = max_value(state, -infinity, +infinity)
    return best_action

def max_value(state, alpha, beta):
    if is_terminal(state):
        return utility(state), None

    v = -infinity
    best_action = None

    for action, next_state in successors(state):
        # Get value from MIN's perspective
        child_v, _ = min_value(next_state, alpha, beta)

        if child_v > v:
            v = child_v
            best_action = action

        # Update alpha
        alpha = max(alpha, v)

        # Prune?
        if v >= beta:
            return v, best_action  # Beta cutoff

    return v, best_action

def min_value(state, alpha, beta):
    if is_terminal(state):
        return utility(state), None

    v = +infinity
    best_action = None

    for action, next_state in successors(state):
        child_v, _ = max_value(next_state, alpha, beta)

        if child_v < v:
            v = child_v
            best_action = action

        # Update beta
        beta = min(beta, v)

        # Prune?
        if v <= alpha:
            return v, best_action  # Alpha cutoff

    return v, best_action
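The algorithm can be exercised on the walkthrough tree to confirm the pruning, and to show how move ordering changes it. This sketch uses the nested-list toy representation (values only, no best-action bookkeeping) and records which leaves actually get evaluated:

```python
def alpha_beta(node, alpha, beta, maximizing, visited):
    # Leaves are numbers; 'visited' records which leaves get evaluated.
    if isinstance(node, (int, float)):
        visited.append(node)
        return node
    if maximizing:
        v = float("-inf")
        for child in node:
            v = max(v, alpha_beta(child, alpha, beta, False, visited))
            alpha = max(alpha, v)
            if v >= beta:
                break                    # beta cutoff
        return v
    v = float("inf")
    for child in node:
        v = min(v, alpha_beta(child, alpha, beta, True, visited))
        beta = min(beta, v)
        if v <= alpha:
            break                        # alpha cutoff
    return v

# The walkthrough tree: good ordering (best move first) vs. bad ordering
good, bad = [], []
alpha_beta([[10, 9], [8, 100]], float("-inf"), float("inf"), True, good)
alpha_beta([[8, 100], [10, 9]], float("-inf"), float("inf"), True, bad)
print(good)  # [10, 9, 8] – the 100 leaf is pruned, as in the walkthrough
print(bad)   # [8, 100, 10, 9] – same answer, but no pruning at all
```

Both orderings return the correct value 9, but the good ordering never looks at the 100 leaf.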

Visualizing Alpha-Beta Propagation

Level 1 (MAX): α=-∞, β=+∞
    |
Level 2 (MIN): α=-∞, β=+∞
    |
Level 3 (MAX): α=-∞, β=+∞ → gets value 3, returns up
    |
Level 2 (MIN): gets 3, β=3, passes α=-∞, β=3 down
    |
Level 3 (MAX): now α=-∞, β=3
              first child returns 5 ≥ β (3) → PRUNE!

PART 7: EXPECTIMAX – HANDLING RANDOMNESS

When Things Aren’t Deterministic

Many games have random elements:

  • Backgammon (dice)
  • Monopoly (dice)
  • Poker (random cards)
  • Many video games (random drops, critical hits)

We can’t use minimax because outcomes aren’t fully under opponent’s control.

The Expectimax Idea

Instead of “MIN” nodes (minimize), we have “CHANCE” nodes (average):

  • MAX nodes: maximize (our choices)
  • CHANCE nodes: take weighted average based on probabilities
  • MIN nodes still exist for adversarial chance games

Expectimax Example

        MAX
        /  \
       /    \
   CHANCE   CHANCE
    /|\      /|\
   1 2 3    4 5 6
  (1/3 each)

Computation:

  1. Left CHANCE: (1+2+3)/3 = 2
  2. Right CHANCE: (4+5+6)/3 = 5
  3. MAX: max(2,5) = 5

More Complex Example

        MAX
        /  \
       /    \
   CHANCE   MIN
    / \      / \
   8  10    2   100
  1/2 1/2

Computation:

  1. Left CHANCE: (8×0.5 + 10×0.5) = 9
  2. Right MIN: min(2,100) = 2
  3. MAX: max(9,2) = 9

Expectimax Implementation

def expectimax(state):
    if is_terminal(state):
        return utility(state)

    if player_to_move(state) == "MAX":
        return max_value(state)
    elif player_to_move(state) == "MIN":
        return min_value(state)
    else:  # CHANCE node
        return exp_value(state)

def max_value(state):
    v = -infinity
    for action, next_state in successors(state):
        v = max(v, expectimax(next_state))
    return v

def min_value(state):
    v = +infinity
    for action, next_state in successors(state):
        v = min(v, expectimax(next_state))
    return v

def exp_value(state):
    v = 0
    total_prob = 0

    for outcome, next_state, prob in stochastic_successors(state):
        v += prob * expectimax(next_state)
        total_prob += prob

    # Normalize in case probabilities don't sum to 1
    return v / total_prob
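Here is a self-contained sketch checked against the mixed CHANCE/MIN example above. The tuple representation (`("max", ...)`, `("min", ...)`, `("chance", ...)`) is a toy encoding invented for this example:

```python
def expectimax(node):
    # Leaves are numbers; internal nodes are (kind, children) tuples.
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectimax(c) for c in children)
    if kind == "min":
        return min(expectimax(c) for c in children)
    # "chance": children are (probability, subtree) pairs
    return sum(p * expectimax(c) for p, c in children)

# The mixed CHANCE/MIN example above
tree = ("max", [
    ("chance", [(0.5, 8), (0.5, 10)]),
    ("min", [2, 100]),
])
print(expectimax(tree))  # prints 9.0: max(avg(8,10)=9, min(2,100)=2)
```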

When to Use Expectimax vs Minimax

  Scenario                        Use
  ------------------------------  -----------------------------
  Deterministic, adversarial      Minimax
  Deterministic, no adversary     Regular search
  Random outcomes, adversarial    Expectimax with MIN nodes
  Random outcomes, no adversary   Expectimax without MIN nodes
  Unknown probabilities           Hard! Need to learn them

PART 8: GAME TREE COMPLEXITIES – REAL NUMBERS

Branching Factors

  Game         Branching Factor   Typical Game Length   Tree Size
  -----------  -----------------  --------------------  --------------------
  Tic-tac-toe  ~4                 9 moves               4⁹ ≈ 262,144
  Checkers     ~10                70 moves              10⁷⁰
  Chess        ~35                100 moves             35¹⁰⁰ ≈ 10¹⁵⁴
  Go           ~300               200 moves             300²⁰⁰ ≈ 10⁴⁹⁴
  Backgammon   ~400               100 moves             400¹⁰⁰ ≈ 10²⁶⁰

What Alpha-Beta Achieves

With perfect ordering, effective branching factor becomes √b:

  Game    b     √b    Without pruning   With perfect pruning
  ------  ----  ----  ----------------  ---------------------
  Chess   35    ~6    depth d           depth 2d
  Go      300   ~17   depth d           depth 2d

Deep Blue’s Numbers

Deep Blue (beat Kasparov in 1997):

  • Searched 200 million positions per second
  • Used alpha-beta pruning
  • Reached depth 12-14 typically
  • Could go to depth 40+ in some lines
  • Special-purpose hardware

PART 9: COMPLETE ALGORITHM COMPARISON

  Algorithm             Opponent      Randomness   Optimal?   Time         Space
  --------------------  ------------  -----------  ---------  -----------  ------
  Single-agent search   None          No           Yes        O(b^d)       O(bd)
  Minimax               Adversarial   No           Yes*       O(b^m)       O(bm)
  Alpha-beta            Adversarial   No           Yes        O(b^(m/2))   O(bm)
  Expectimax            Mixed         Yes          Yes*       O(b^m)       O(bm)

*Against optimal play


PART 10: KEY TAKEAWAYS FOR YOUR MIDTERM

The Big Ideas

  1. Games are multi-agent – Opponents actively work against us
  2. Minimax – Foundation for adversarial search
  3. Assume optimal opponent – Plan for worst case
  4. Evaluation functions – Needed when can’t search to end
  5. Alpha-beta pruning – Dramatically improves efficiency
  6. Move ordering – Critical for pruning effectiveness
  7. Expectimax – Handle randomness

Memory Aids

  • Minimax = “I maximize, you minimize”
  • Alpha = “What MAX can guarantee” (starts low, goes up)
  • Beta = “What MIN can guarantee” (starts high, goes down)
  • Prune when α ≥ β = “MAX’s guarantee beats MIN’s limit”
  • Expectimax = “Expectation + Maximization”

Common Exam Questions

Q: Why use alpha-beta pruning?
A: Reduces search space from O(b^m) to O(b^(m/2)) with perfect ordering, allowing deeper search in same time.

Q: When is minimax not optimal?
A: Against suboptimal opponents, it’s still optimal in worst case, but might miss opportunities to exploit opponent mistakes.

Q: What makes a good evaluation function?
A: Fast to compute, correlates with actual winning chances, uses meaningful features of the position.

Q: Why does move ordering matter for alpha-beta?
A: Good ordering leads to early cutoffs, bad ordering explores irrelevant branches before finding good ones.

Q: Difference between minimax and expectimax?
A: Minimax assumes worst-case (minimizing opponent), expectimax averages over chance outcomes.


PRACTICE PROBLEMS

Problem 1

Draw the minimax tree for this game where MAX moves first, values are at leaves: [3, 8, 2, 5, 9, 1, 4, 6] in a complete binary tree of depth 3.

Answer:

        MAX
       /   \
      MIN   MIN
     / \    / \
   MAX MAX MAX MAX
   /\   /\   /\   /\
  3 8  2 5  9 1  4 6

Compute bottom-up:

  • Left-left MAX: max(3,8)=8
  • Left-right MAX: max(2,5)=5
  • Left MIN: min(8,5)=5
  • Right-left MAX: max(9,1)=9
  • Right-right MAX: max(4,6)=6
  • Right MIN: min(9,6)=6
  • Root MAX: max(5,6)=6

Problem 2

In the above tree, which branches would alpha-beta prune with perfect ordering?

Answer: Pruning does occur. With perfect ordering, the right subtree (the root’s best move, value 6) is explored first. The right MIN examines its 6-subtree first, setting β=6; in its 9-subtree, the first leaf 9 ≥ β=6 causes a cutoff, so leaf 1 is pruned. Back at the root, α=6. At the left MIN, its best child (the max(2,5)=5 subtree) is explored first and returns 5 ≤ α=6, so the remaining subtree (leaves 3 and 8) is pruned. Three of the eight leaves are never examined.

Problem 3

You have 10 seconds per move, can evaluate 1 million nodes per second. With alpha-beta, how deep can you search in chess (b=35)?

Without pruning: 35^d = 10^7 → d = log(10^7)/log(35) ≈ 4.5 ply

With perfect alpha-beta: (√35)^d = 35^(d/2) = 10^7 → d/2 = 4.5 → d = 9 ply
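The arithmetic can be checked directly (node budget: 10 s × 10⁶ nodes/s = 10⁷):

```python
import math

nodes = 10 * 1_000_000                         # 10 seconds x 1e6 nodes/sec

depth_plain = math.log(nodes) / math.log(35)   # solve 35^d = 10^7
depth_ab = 2 * depth_plain                     # perfect alpha-beta: 35^(d/2)

print(round(depth_plain, 1))  # prints 4.5
print(round(depth_ab, 1))     # prints 9.1
```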

Problem 4

Design a simple evaluation function for tic-tac-toe.

Answer:

def evaluate(board):
    # (helpers three_in_a_row, corners, board.center assumed given)
    if three_in_a_row(board, MAX):
        return +100
    if three_in_a_row(board, MIN):
        return -100

    score = 0
    # Reward center control
    if board.center == MAX: score += 10
    if board.center == MIN: score -= 10

    # Reward corners
    for corner in corners:
        if board[corner] == MAX: score += 5
        if board[corner] == MIN: score -= 5

    return score

Problem 5

In expectimax, if a chance node has outcomes with probabilities [0.2, 0.3, 0.5] and values [10, 20, 5], what’s its value?

Answer: 0.2×10 + 0.3×20 + 0.5×5 = 2 + 6 + 2.5 = 10.5


Adversarial search connects back to everything we’ve learned – search trees, complexity analysis, and the trade-off between optimality and practicality. Master minimax and alpha-beta, understand when to use expectimax, and you’ll be well-prepared.
