Welcome to adversarial search – where we’re not just searching through a static world, but facing an opponent who actively tries to defeat us! This is game playing AI, and it’s one of the most exciting topics in the course.
PART 1: WHY GAMES MATTER IN AI
Games as Multi-Agent Environments
Until now, we’ve been in single-agent environments:
- The world doesn’t fight back
- We just need to find a path to the goal
- The only challenge is the size of the state space
But in games, we face other agents:
- They have their own goals
- They actively try to stop us
- They respond to our moves
- We need to anticipate their actions
Types of Multi-Agent Environments
Multi-Agent Environments
├── Cooperative (agents work together)
│ ├ Example: Robots cleaning a house together
│ └ Example: Teammates in soccer
│
└── Competitive (agents work against each other)
├── Zero-sum (my win is your loss)
│ ├ Example: Chess, Checkers, Go
│ └ This is what we'll study!
│
└── Non-zero-sum (both can win/lose)
└ Example: Prisoner's Dilemma
Why Study Games?
- Games are fun! – But that’s not the academic reason…
- Historical role in AI – Games have been AI’s “grand challenges”:
- 1994: Chinook becomes world checkers champion
- 1997: Deep Blue beats Kasparov at chess
- 2016: AlphaGo beats Lee Sedol at Go
- 2017: AlphaZero masters chess, Go, and shogi
- Dealing with adversarial agents – Real-world applications:
- Military strategy
- Business competition
- Cybersecurity (defending against attackers)
- Huge state spaces – Games are HARD:
- Chess: ~10⁴⁷ possible positions
- Go: ~10¹⁷⁰ possible positions (more than atoms in universe!)
- If we can solve games, we can solve many real problems
- Clean environment – Clear rules, clear win/loss criteria
PART 2: GAME CHARACTERISTICS – CLASSIFYING THE PROBLEM
The 5 Key Axes of Games
1. Deterministic vs. Stochastic
↓
2. Number of Players (1, 2, many)
↓
3. Zero-sum vs. Non-zero-sum
↓
4. Perfect Information vs. Imperfect Information
↓
5. Turn-taking vs. Simultaneous moves
Axis 1: Deterministic vs. Stochastic
Deterministic Games:
- No randomness
- Same move always produces same result
- Examples: Chess, Checkers, Go, Tic-tac-toe
Stochastic Games:
- Random elements (dice, card draws, etc.)
- Same move can have different outcomes
- Examples: Backgammon, Poker, Monopoly
Axis 2: Number of Players
Single-player:
- Just search! (8-puzzle, Rubik’s cube)
- No adversary to worry about
Two-player:
- Most classic games
- Me vs. opponent
- What we’ll focus on
Multi-player:
- More complex alliances and betrayals
- Diplomacy, many card games
Axis 3: Zero-sum vs. Non-zero-sum
Zero-sum:
- My gain = your loss
- Total utility always constant
- Chess: I win (+1), you lose (-1), sum = 0
- This simplifies things – we can be purely adversarial
Non-zero-sum:
- Both can win or both can lose
- Cooperation might benefit both
- Prisoner’s Dilemma, many economic games
Axis 4: Perfect vs. Imperfect Information
Perfect Information:
- Both players see everything
- No hidden cards, no secrets
- Chess, Checkers, Go, Tic-tac-toe
Imperfect Information:
- Hidden information
- Poker (can’t see opponent’s cards)
- Kriegspiel (blind chess)
Axis 5: Turn-taking vs. Simultaneous
Turn-taking:
- I move, then you move
- Chess, most board games
Simultaneous:
- Both move at once
- Rock-paper-scissors
- Some video games
What We’ll Study
Classic case: Two-player, deterministic, zero-sum, perfect information, turn-taking
- Chess, Checkers, Go, Tic-tac-toe
- This is the foundation
PART 3: SINGLE-PLAYER GAMES – THE EASY CASE
Wait, Why Start with Single-Player?
Because it connects to what we already know!
Single-player deterministic game = Search problem
- 8-puzzle, Rubik’s cube, Freecell
- Just need to find a sequence of moves
Key Insight: Value of a State
In single-player search, each node stores the best outcome reachable from that state:
Start (win possible?)
/ \
Move1 Move2
/ \
State A State B
| |
Win! Lose
Value(Start) = max(Value(Move1), Value(Move2))
= max(Win, Lose)
= Win
Implementation
def single_player_value(state):
if is_goal(state):
return WIN
best_value = LOSE
for move in legal_moves(state):
next_state = apply_move(state, move)
value = single_player_value(next_state)
best_value = max(best_value, value)
return best_value
After search, pick move leading to best value.
This sets up the pattern for two-player games!
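The pseudocode above can be made concrete on the tiny tree from the diagram. In this sketch the dict-based game encoding and all state names are invented for illustration; states map to their successors, and terminal states carry WIN/LOSE values:

```python
WIN, LOSE = 1, 0

# Toy game: each state maps to its successor states; terminals map to outcomes.
GAME = {
    "Start": ["Move1", "Move2"],
    "Move1": ["StateA"],
    "Move2": ["StateB"],
    "StateA": WIN,    # terminal: win
    "StateB": LOSE,   # terminal: lose
}

def single_player_value(state):
    node = GAME[state]
    if isinstance(node, int):          # terminal state: actual outcome
        return node
    # Best outcome reachable from here = max over all moves
    return max(single_player_value(s) for s in node)

print(single_player_value("Start"))    # 1 -- a win is reachable via Move1
```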
PART 4: MINIMAX – THE FOUNDATION OF TWO-PLAYER GAMES
The Core Idea
In two-player zero-sum games:
- MAX player (us) wants to maximize our score
- MIN player (opponent) wants to minimize our score
- They alternate moves
- We assume opponent plays optimally to beat us
The Minimax Value
For any state, the minimax value is:
- If it’s MAX’s turn: the maximum value we can guarantee against best play
- If it’s MIN’s turn: the minimum value MAX will get despite MIN’s best efforts
- At terminal states: the actual utility (win/loss/draw)
Visualizing the Alternation
MAX (our turn)
/ | \
/ | \
MIN turn MIN MIN
/ \ / \ / \
/ \ / \ / \
MAX turn MAX MAX MAX MAX MAX
/ \ ...
... ...
Simple Example Tree
MAX
/ \
/ \
MIN MIN
/ \ / \
8 2 5 6
Computation:
- Left MIN node: min(8,2) = 2
- Right MIN node: min(5,6) = 5
- MAX node: max(2,5) = 5
Result: MAX can guarantee a score of 5
Another Example
MAX
/ \
/ \
MIN MIN
/ \ / \
3 12 8 2
Left MIN: min(3,12) = 3
Right MIN: min(8,2) = 2
MAX: max(3,2) = 3
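Both example trees can be checked in a few lines. This minimal sketch (a helper of my own, not part of the course pseudocode) represents a game tree as nested lists, with numbers as leaf utilities:

```python
def minimax_value(node, maximizing=True):
    """Minimax over a nested-list game tree; leaves are utilities."""
    if isinstance(node, (int, float)):   # leaf: terminal utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

print(minimax_value([[8, 2], [5, 6]]))   # 5  (first example)
print(minimax_value([[3, 12], [8, 2]]))  # 3  (second example)
```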
The Minimax Algorithm
def minimax(state):
"""
Returns the best action for MAX from current state
"""
if is_terminal(state):
return utility(state)
if player_to_move(state) == "MAX":
return max_value(state)
else:
return min_value(state)
def max_value(state):
if is_terminal(state):
return utility(state)
v = -infinity
for action, next_state in successors(state):
v = max(v, min_value(next_state))
return v
def min_value(state):
if is_terminal(state):
return utility(state)
v = +infinity
for action, next_state in successors(state):
v = min(v, max_value(next_state))
return v
Tic-Tac-Toe Example (Conceptual)
Initial empty board (MAX's turn)
├── Place X in center
│ └── MIN places O in corner
│ └── MAX places X ...
├── Place X in corner
│ └── MIN places O ...
└── Place X in edge ...
The algorithm explores ALL possibilities, assuming MIN always blocks MAX’s wins.
Minimax Properties
| Property | Value |
|---|---|
| Optimal? | YES – against perfect opponent |
| Against imperfect? | Still optimal in worst case |
| Time complexity | O(b^m) – exponential in depth! |
| Space complexity | O(bm) – depth-first search |
| b = branching factor | Chess: ~35 |
| m = max depth | Chess: ~100 moves |
The problem: For chess, 35^100 is completely impossible to compute!
PART 5: RESOURCE LIMITS – MAKING MINIMAX PRACTICAL
The Reality Check
We cannot search to the end of most games:
- Chess: ~35^100 positions
- Even with pruning, impossible
- We need to stop early
Depth-Limited Minimax
Instead of searching to terminal states:
- Search to a fixed depth (e.g., 4 ply = 2 moves each)
- At depth limit, use an evaluation function to estimate position value
- Propagate estimates up the tree
MAX (depth 0)
/ \
/ \
MIN (depth 1) MIN
/ \ / \
/ \ / \
MAX(d2) MAX MAX MAX
| | | |
Eval=5 Eval=3 Eval=8 Eval=2
Evaluation Functions
The heart of practical game AI! A function that scores any position without searching deeper.
Ideal evaluation function: Would return the true minimax value
- But that would require solving the game!
Practical evaluation: Weighted linear combination of features
evaluate(s) = w₁×f₁(s) + w₂×f₂(s) + ... + wₙ×fₙ(s)
Chess evaluation features:
- Material count: pawn=1, knight/bishop=3, rook=5, queen=9
- Piece mobility: number of legal moves
- King safety: pawn shelter around king
- Center control: pieces attacking center squares
- Pawn structure: doubled/isolated pawns
Example:
evaluate(s) = 1.0 × material_diff
+ 0.1 × mobility_diff
+ 0.5 × king_safety_diff
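The weighted-sum form translates directly into code. In this sketch the feature values describe a hypothetical position (the weights are the ones from the example above):

```python
def evaluate(features, weights):
    """Linear evaluation: a weighted sum of position features."""
    return sum(w * f for w, f in zip(weights, features))

# Hypothetical position: up 2 points of material, +4 mobility, equal king safety
features = [2.0, 4.0, 0.0]          # material_diff, mobility_diff, king_safety_diff
weights  = [1.0, 0.1, 0.5]
print(evaluate(features, weights))  # 2.4
```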
The Depth Trade-off
Shallow search (2 ply):
Fast but dumb - misses tactics
Medium search (4-6 ply):
Good for amateur play
Deep search (8-10 ply):
Strong club player
Very deep (12+ ply):
Master/Grandmaster level
Rule of thumb: Each additional ply (half-move) multiplies search time by the effective branching factor (roughly 6 for chess with good alpha-beta ordering, ~35 without), but each extra ply significantly improves play
Iterative Deepening for Games
Remember iterative deepening from uninformed search? It’s perfect for games!
def iterative_deepening_minimax(state, max_depth):
    best_move = None
    for depth in range(1, max_depth+1):
        # Stop before starting a search we can't finish;
        # keep the result from the last completed depth
        if out_of_time():
            break
        best_move = depth_limited_minimax(state, depth)
    return best_move
Why for games?
- We don’t know how deep we need to search
- Time is limited (chess clocks!)
- We can always return the best result from completed depth
- Later depths refine the evaluation
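On the nested-list trees used earlier, the whole pipeline of depth-limited search plus iterative deepening can be sketched as below. The `heuristic` here is a toy stand-in (averaging the values below a node) for a real evaluation function, and the names are my own:

```python
import time

def depth_limited_value(node, depth, maximizing=True):
    if isinstance(node, (int, float)):          # leaf: exact utility
        return node
    if depth == 0:                              # depth limit: estimate instead
        return heuristic(node)
    values = [depth_limited_value(c, depth - 1, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

def heuristic(node):
    # Toy evaluation: average of everything below (a real engine would use
    # domain features like material and mobility)
    if isinstance(node, (int, float)):
        return node
    return sum(heuristic(c) for c in node) / len(node)

def iterative_deepening(root, time_budget=0.05, max_depth=10):
    deadline = time.monotonic() + time_budget
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                               # keep last completed depth's answer
        best = depth_limited_value(root, depth)
    return best

print(iterative_deepening([[3, 12], [8, 2]]))   # 3
```

At depth 1 the search returns the heuristic estimate 7.5; from depth 2 onward it finds the true minimax value 3, illustrating how later iterations refine earlier ones.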
PART 6: ALPHA-BETA PRUNING – THE GAME CHANGER
The Insight
“If you have an idea which is surely bad, don’t take the time to see how truly awful it is.” – Patrick Winston
In minimax, many branches are irrelevant. Once we know a move is bad, we can stop exploring it.
The Key Variables
- α (alpha): The best value MAX can guarantee along current path
- β (beta): The best value MIN can guarantee along current path
Initially: α = -∞, β = +∞
Rules:
- If at MAX node, we update α = max(α, child_value)
- If at MIN node, we update β = min(β, child_value)
- Prune when α ≥ β – no need to explore further!
Simple Example
MAX (α=-∞, β=+∞)
/ \
/ \
MIN MIN
/ \ ?
2 3
Step-by-step:
1. Explore left MIN: get value 2
- MAX now knows it can guarantee at least 2 (α=2)
2. Explore right MIN: first child returns 1
- This MIN node now has value ≤1 (since it's MIN)
- β for this MIN = 1
3. Compare: α=2, β=1 → α ≥ β → PRUNE!
- MAX already has guaranteed 2
- This MIN can't give more than 1
- So MAX will never choose this branch
Detailed Walkthrough
Consider this tree (MAX at top):
MAX
/ \
/ \
MIN MIN
/ \ / \
10 9 8 100
Without pruning:
- Left MIN: min(10,9) = 9
- Right MIN: min(8,100) = 8
- MAX: max(9,8) = 9
With alpha-beta:
Step 1: Start at root
α = -∞, β = +∞
Step 2: Go to left MIN
α = -∞, β = +∞
Step 3: Explore first child (10)
MIN gets value 10
β = min(∞, 10) = 10
Step 4: Explore second child (9)
9 < current β (10)
MIN updates β = min(10, 9) = 9
MIN returns 9 to MAX
Step 5: Back at MAX
α = max(-∞, 9) = 9
β still +∞
Step 6: Go to right MIN
α = 9, β = +∞ (passed from MAX)
Step 7: Explore first child (8)
MIN gets 8
β = min(∞, 8) = 8
Step 8: Check prune condition
α = 9, β = 8
9 ≥ 8? YES!
PRUNE! Don't explore second child (100)
MIN returns 8 (or any value ≤8)
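The walkthrough above can be reproduced in code. This sketch (a leaf-recording variant of my own, not the course's pseudocode) confirms that the leaf 100 is never evaluated:

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, visited=None):
    """Alpha-beta over a nested-list tree; records every leaf it evaluates."""
    if visited is None:
        visited = []
    if isinstance(node, (int, float)):      # leaf
        visited.append(node)
        return node, visited
    v = float("-inf") if maximizing else float("inf")
    for child in node:
        cv, _ = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            v = max(v, cv)
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            beta = min(beta, v)
        if alpha >= beta:                   # cutoff: remaining children pruned
            break
    return v, visited

value, seen = alphabeta([[10, 9], [8, 100]])
print(value)   # 9
print(seen)    # [10, 9, 8] -- leaf 100 was pruned
```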
Why Pruning is Safe
We prune only when:
- MAX already has a guaranteed value α
- Current MIN node can’t possibly exceed α
- Therefore MAX will never choose this branch
The value we lost by pruning is irrelevant to the final decision!
The Power of Move Ordering
Alpha-beta’s effectiveness depends on examining good moves first.
Worst ordering: Weakest moves first – α rises slowly, so almost nothing is cut off
MAX
├── Worst move first (value=1) ← α=1, still low
├── Better move (value=5) ← α=5; little pruning inside this subtree
└── Best move (value=10) ← explored last, in full
Best ordering: Best move first – maximum pruning
MAX
├── Best move first (value=10) ← α=10 now!
├── Next move (value=5) ← inside this subtree, the first reply worth ≤10 triggers a cutoff
└── Next move (value=3) ← likewise cut off almost immediately
Note that each of the root's children is still examined at least partially – pruning skips the rest of a bad subtree, not the subtree entirely.
Perfect Ordering Benefit
With perfect move ordering:
- Time complexity: O(b^(m/2))
- Effectively doubles the search depth for same time!
For chess:
- Without pruning: depth 6
- With perfect alpha-beta: depth 12
- HUGE improvement!
Alpha-Beta Implementation
def alpha_beta_search(state):
# Returns best action for MAX
v, best_action = max_value(state, -infinity, +infinity)
return best_action
def max_value(state, alpha, beta):
if is_terminal(state):
return utility(state), None
v = -infinity
best_action = None
for action, next_state in successors(state):
# Get value from MIN's perspective
child_v, _ = min_value(next_state, alpha, beta)
if child_v > v:
v = child_v
best_action = action
# Update alpha
alpha = max(alpha, v)
# Prune?
if v >= beta:
return v, best_action # Beta cutoff
return v, best_action
def min_value(state, alpha, beta):
if is_terminal(state):
return utility(state), None
v = +infinity
best_action = None
for action, next_state in successors(state):
child_v, _ = max_value(next_state, alpha, beta)
if child_v < v:
v = child_v
best_action = action
# Update beta
beta = min(beta, v)
# Prune?
if v <= alpha:
return v, best_action # Alpha cutoff
return v, best_action
Visualizing Alpha-Beta Propagation
Level 1 (MAX): α=-∞, β=+∞
|
Level 2 (MIN): α=-∞, β=+∞
|
Level 3 (MAX): α=-∞, β=+∞ → gets value 3, returns up
|
Level 2 (MIN): gets 3, β=3, passes α=-∞, β=3 down
|
Level 3 (MAX): now α=-∞, β=3
             first child returns 5; v=5 ≥ β=3 → PRUNE the remaining children!
             (the MIN above already has a move worth 3, so it will never let MAX collect 5 here)
PART 7: EXPECTIMAX – HANDLING RANDOMNESS
When Things Aren’t Deterministic
Many games have random elements:
- Backgammon (dice)
- Monopoly (dice)
- Poker (random cards)
- Many video games (random drops, critical hits)
We can’t use minimax because outcomes aren’t fully under opponent’s control.
The Expectimax Idea
Instead of “MIN” nodes (minimize), we have “CHANCE” nodes (average):
- MAX nodes: maximize (our choices)
- CHANCE nodes: take weighted average based on probabilities
- MIN nodes still appear when a stochastic game also has an adversary (the combination is often called expectiminimax)
Expectimax Example
MAX
/ \
/ \
CHANCE CHANCE
/|\ /|\
1 2 3 4 5 6
(1/3 each)
Computation:
- Left CHANCE: (1+2+3)/3 = 2
- Right CHANCE: (4+5+6)/3 = 5
- MAX: max(2,5) = 5
More Complex Example
MAX
/ \
/ \
CHANCE MIN
/ \ / \
8 10 2 100
1/2 1/2
Computation:
- Left CHANCE: (8×0.5 + 10×0.5) = 9
- Right MIN: min(2,100) = 2
- MAX: max(9,2) = 9
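The mixed tree above can be verified with a small sketch. Here each node is a tuple tagged with its type, and chance children carry their probabilities (this encoding is my own):

```python
def expectimax_value(node):
    kind = node[0]
    if kind == "leaf":
        return node[1]
    if kind == "max":
        return max(expectimax_value(c) for c in node[1:])
    if kind == "min":
        return min(expectimax_value(c) for c in node[1:])
    # chance node: children are (probability, subtree) pairs
    return sum(p * expectimax_value(c) for p, c in node[1:])

tree = ("max",
        ("chance", (0.5, ("leaf", 8)), (0.5, ("leaf", 10))),
        ("min", ("leaf", 2), ("leaf", 100)))
print(expectimax_value(tree))   # 9.0
```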
Expectimax Implementation
def expectimax(state):
if is_terminal(state):
return utility(state)
if player_to_move(state) == "MAX":
return max_value(state)
elif player_to_move(state) == "MIN":
return min_value(state)
else: # CHANCE node
return exp_value(state)
def max_value(state):
v = -infinity
for action, next_state in successors(state):
v = max(v, expectimax(next_state))
return v
def min_value(state):
v = +infinity
for action, next_state in successors(state):
v = min(v, expectimax(next_state))
return v
def exp_value(state):
v = 0
total_prob = 0
for outcome, next_state, prob in stochastic_successors(state):
v += prob * expectimax(next_state)
total_prob += prob
# Probabilities at a chance node should sum to 1; dividing guards against rounding error
return v / total_prob
When to Use Expectimax vs Minimax
| Scenario | Use |
|---|---|
| Deterministic, adversarial | Minimax |
| Deterministic, no adversary | Regular search |
| Random outcomes, adversarial | Expectimax with MIN |
| Random outcomes, no adversary | Expectimax without MIN |
| Unknown probabilities | Hard! Need to learn them |
PART 8: GAME TREE COMPLEXITIES – REAL NUMBERS
Branching Factors
| Game | Branching Factor | Typical Game Length | Tree Size |
|---|---|---|---|
| Tic-tac-toe | ~4 | 9 moves | 4⁹ ≈ 262,144 |
| Checkers | ~10 | 70 moves | 10⁷⁰ |
| Chess | ~35 | 100 moves | 35¹⁰⁰ ≈ 10¹⁵⁴ |
| Go | ~300 | 200 moves | 300²⁰⁰ ≈ 10⁴⁹⁴ |
| Backgammon | ~400 | 100 moves | 400¹⁰⁰ ≈ 10²⁶⁰ |
What Alpha-Beta Achieves
With perfect ordering, effective branching factor becomes √b:
| Game | b | √b | Without pruning | With perfect pruning |
|---|---|---|---|---|
| Chess | 35 | ~6 | depth d | depth 2d |
| Go | 300 | ~17 | depth d | depth 2d |
Deep Blue’s Numbers
Deep Blue (beat Kasparov in 1997):
- Searched 200 million positions per second
- Used alpha-beta pruning
- Reached depth 12-14 typically
- Could go to depth 40+ in some lines
- Special-purpose hardware
PART 9: COMPLETE ALGORITHM COMPARISON
| Algorithm | Opponent | Randomness | Optimal? | Time | Space |
|---|---|---|---|---|---|
| Single-agent search | None | No | Yes | O(b^d) | O(bd) |
| Minimax | Adversarial | No | Yes* | O(b^m) | O(bm) |
| Alpha-beta | Adversarial | No | Yes | O(b^(m/2)) | O(bm) |
| Expectimax | Mixed | Yes | Yes* | O(b^m) | O(bm) |
*Against optimal play
PART 10: KEY TAKEAWAYS FOR YOUR MIDTERM
The Big Ideas
- Games are multi-agent – Opponents actively work against us
- Minimax – Foundation for adversarial search
- Assume optimal opponent – Plan for worst case
- Evaluation functions – Needed when can’t search to end
- Alpha-beta pruning – Dramatically improves efficiency
- Move ordering – Critical for pruning effectiveness
- Expectimax – Handle randomness
Memory Aids
- Minimax = “I maximize, you minimize”
- Alpha = “What MAX can guarantee” (starts low, goes up)
- Beta = “What MIN can guarantee” (starts high, goes down)
- Prune when α ≥ β = “MAX’s guarantee beats MIN’s limit”
- Expectimax = “Expectation + Maximization”
Common Exam Questions
Q: Why use alpha-beta pruning?
A: Reduces search space from O(b^m) to O(b^(m/2)) with perfect ordering, allowing deeper search in same time.
Q: When is minimax not optimal?
A: Against suboptimal opponents, it’s still optimal in worst case, but might miss opportunities to exploit opponent mistakes.
Q: What makes a good evaluation function?
A: Fast to compute, correlates with actual winning chances, uses meaningful features of the position.
Q: Why does move ordering matter for alpha-beta?
A: Good ordering leads to early cutoffs, bad ordering explores irrelevant branches before finding good ones.
Q: Difference between minimax and expectimax?
A: Minimax assumes worst-case (minimizing opponent), expectimax averages over chance outcomes.
PRACTICE PROBLEMS
Problem 1
Draw the minimax tree for this game where MAX moves first, values are at leaves: [3, 8, 2, 5, 9, 1, 4, 6] in a complete binary tree of depth 3.
Answer:
MAX
/ \
MIN MIN
/ \ / \
MAX MAX MAX MAX
/\ /\ /\ /\
3 8 2 5 9 1 4 6
Compute bottom-up:
- Left-left MAX: max(3,8)=8
- Left-right MAX: max(2,5)=5
- Left MIN: min(8,5)=5
- Right-left MAX: max(9,1)=9
- Right-right MAX: max(4,6)=6
- Right MIN: min(9,6)=6
- Root MAX: max(5,6)=6
Problem 2
In the above tree, which branches would alpha-beta prune with perfect ordering?
Answer: With the leaves in the given left-to-right order, alpha-beta makes no cutoffs at all – all 8 leaves are examined (the α and β bounds never cross). With perfect ordering (best child first at every node: the value-6 subtree before the value-5 subtree at the root, 6 before 4, 9 before 1, and so on), only 5 leaves are examined: after the first subtree sets β=6, leaf 9 causes a cutoff (leaf 1 pruned), and on the other side the value-5 subtree gives β=5 ≤ α=6, pruning the max(8,3) subtree entirely. This matches the perfect-ordering minimum of 2² + 2¹ − 1 = 5 leaves.
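Both orderings can be checked by counting evaluated leaves. The leaf-recording helper below is a sketch of my own:

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, visited=None):
    """Alpha-beta over a nested-list tree; records every leaf it evaluates."""
    if visited is None:
        visited = []
    if isinstance(node, int):               # leaf
        visited.append(node)
        return node, visited
    v = float("-inf") if maximizing else float("inf")
    for child in node:
        cv, _ = alphabeta(child, alpha, beta, not maximizing, visited)
        v = max(v, cv) if maximizing else min(v, cv)
        if maximizing:
            alpha = max(alpha, v)
        else:
            beta = min(beta, v)
        if alpha >= beta:                   # cutoff: skip remaining children
            break
    return v, visited

given   = [[[3, 8], [2, 5]], [[9, 1], [4, 6]]]     # leaf order from Problem 1
ordered = [[[6, 4], [9, 1]], [[5, 2], [8, 3]]]     # best child first at every node

v1, seen1 = alphabeta(given)
v2, seen2 = alphabeta(ordered)
print(v1, len(seen1))   # 6 8 -- no cutoffs in the given order
print(v2, len(seen2))   # 6 5 -- leaves 1, 8, 3 are pruned
```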
Problem 3
You have 10 seconds per move, can evaluate 1 million nodes per second. With alpha-beta, how deep can you search in chess (b=35)?
Without pruning: 35^d = 10^7 → d = log(10^7)/log(35) ≈ 4.5 ply
With perfect alpha-beta: (√35)^d = 35^(d/2) = 10^7 → d/2 = 4.5 → d = 9 ply
Problem 4
Design a simple evaluation function for tic-tac-toe.
Answer:
evaluate(board):
if MAX has 3 in a row: return +100
if MIN has 3 in a row: return -100
score = 0
# Reward center control
if board.center == MAX: score += 10
if board.center == MIN: score -= 10
# Reward corners
for corner in corners:
if board[corner] == MAX: score += 5
if board[corner] == MIN: score -= 5
return score
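The pseudocode above can be made runnable. In this sketch the board is a flat list of nine cells – "X" for MAX, "O" for MIN, None for empty (an encoding chosen here for illustration):

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals
CORNERS = (0, 2, 6, 8)

def evaluate(board, max_p="X", min_p="O"):
    # Terminal check: three in a row dominates everything else
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return 100 if board[a] == max_p else -100
    score = 0
    if board[4] == max_p: score += 10          # reward center control
    elif board[4] == min_p: score -= 10
    for i in CORNERS:                          # reward corner control
        if board[i] == max_p: score += 5
        elif board[i] == min_p: score -= 5
    return score

board = ["X", None, "O",
         None, "X", None,
         None, None, None]
print(evaluate(board))   # 10 + 5 - 5 = 10
```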
Problem 5
In expectimax, if a chance node has outcomes with probabilities [0.2, 0.3, 0.5] and values [10, 20, 5], what’s its value?
Answer: 0.2×10 + 0.3×20 + 0.5×5 = 2 + 6 + 2.5 = 10.5
Adversarial search connects back to everything we’ve learned – search trees, complexity analysis, and the trade-off between optimality and practicality. Master minimax and alpha-beta, understand when to use expectimax, and you’ll be well-prepared.
