Welcome to adversarial search – where we’re not just searching through a static world, but facing an opponent who actively tries to defeat us! This is game playing AI, and it’s one of the most exciting topics in the course.
PART 1: WHY GAMES MATTER IN AI
Games as Multi-Agent Environments
Until now, we’ve been in single-agent environments:
- The world doesn’t fight back
- We just need to find a path to the goal
- The only challenge is the size of the state space
But in games, we face other agents:
- They have their own goals
- They actively try to stop us
- They respond to our moves
- We need to anticipate their actions
Types of Multi-Agent Environments
Multi-Agent Environments
├── Cooperative (agents work together)
│ ├ Example: Robots cleaning a house together
│ └ Example: Teammates in soccer
│
└── Competitive (agents work against each other)
├── Zero-sum (my win is your loss)
│ ├ Example: Chess, Checkers, Go
│ └ This is what we'll study!
│
└── Non-zero-sum (both can win/lose)
└ Example: Prisoner's Dilemma
Why Study Games?
- Games are fun! – But that’s not the academic reason…
- Historical role in AI – Games have been AI’s “grand challenges”:
- 1994: Chinook becomes world checkers champion
- 1997: Deep Blue beats Kasparov at chess
- 2016: AlphaGo beats Lee Sedol at Go
- 2017: AlphaZero masters chess, Go, and shogi
- Dealing with adversarial agents – Real-world applications:
- Military strategy
- Business competition
- Cybersecurity (defending against attackers)
- Huge state spaces – Games are HARD:
- Chess: ~10⁴⁷ possible positions
- Go: ~10¹⁷⁰ possible positions (more than atoms in universe!)
- If we can solve games, we can solve many real problems
- Clean environment – Clear rules, clear win/loss criteria
PART 2: GAME CHARACTERISTICS – CLASSIFYING THE PROBLEM
The 5 Key Axes of Games
1. Deterministic vs. Stochastic
↓
2. Number of Players (1, 2, many)
↓
3. Zero-sum vs. Non-zero-sum
↓
4. Perfect Information vs. Imperfect Information
↓
5. Turn-taking vs. Simultaneous moves
Axis 1: Deterministic vs. Stochastic
Deterministic Games:
- No randomness
- Same move always produces same result
- Examples: Chess, Checkers, Go, Tic-tac-toe
Stochastic Games:
- Random elements (dice, card draws, etc.)
- Same move can have different outcomes
- Examples: Backgammon, Poker, Monopoly
Axis 2: Number of Players
Single-player:
- Just search! (8-puzzle, Rubik’s cube)
- No adversary to worry about
Two-player:
- Most classic games
- Me vs. opponent
- What we’ll focus on
Multi-player:
- More complex alliances and betrayals
- Diplomacy, many card games
Axis 3: Zero-sum vs. Non-zero-sum
Zero-sum:
- My gain = your loss
- Total utility always constant
- Chess: I win (+1), you lose (-1), sum = 0
- This simplifies things – we can be purely adversarial
Non-zero-sum:
- Both can win or both can lose
- Cooperation might benefit both
- Prisoner’s Dilemma, many economic games
Axis 4: Perfect vs. Imperfect Information
Perfect Information:
- Both players see everything
- No hidden cards, no secrets
- Chess, Checkers, Go, Tic-tac-toe
Imperfect Information:
- Hidden information
- Poker (can’t see opponent’s cards)
- Kriegspiel (blind chess)
Axis 5: Turn-taking vs. Simultaneous
Turn-taking:
- I move, then you move
- Chess, most board games
Simultaneous:
- Both move at once
- Rock-paper-scissors
- Some video games
What We’ll Study
Classic case: Two-player, deterministic, zero-sum, perfect information, turn-taking
- Chess, Checkers, Go, Tic-tac-toe
- This is the foundation
PART 3: SINGLE-PLAYER GAMES – THE EASY CASE
Wait, Why Start with Single-Player?
Because it connects to what we already know!
Single-player deterministic game = Search problem
- 8-puzzle, Rubik’s cube, Freecell
- Just need to find a sequence of moves
Key Insight: Value of a State
In single-player search, each node stores the best outcome reachable from that state:
Start (win possible?)
/ \
Move1 Move2
/ \
State A State B
| |
Win! Lose
Value(Start) = max(Value(Move1), Value(Move2))
= max(Win, Lose)
= Win
Implementation
def single_player_value(state):
if is_goal(state):
return WIN
best_value = LOSE
for move in legal_moves(state):
next_state = apply_move(state, move)
value = single_player_value(next_state)
best_value = max(best_value, value)
return best_value
After search, pick move leading to best value.
This sets up the pattern for two-player games!
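The pseudocode above can be made concrete on the tiny tree from the diagram. In this sketch the dict-based game encoding and all state names are invented for illustration; states map to their successors, and terminal states carry WIN/LOSE values:

```python
WIN, LOSE = 1, 0

# Toy game: each state maps to its successor states; terminals map to outcomes.
GAME = {
    "Start": ["Move1", "Move2"],
    "Move1": ["StateA"],
    "Move2": ["StateB"],
    "StateA": WIN,    # terminal: win
    "StateB": LOSE,   # terminal: lose
}

def single_player_value(state):
    node = GAME[state]
    if isinstance(node, int):          # terminal state: actual outcome
        return node
    # Best outcome reachable from here = max over all moves
    return max(single_player_value(s) for s in node)

print(single_player_value("Start"))    # 1 -- a win is reachable via Move1
```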
PART 4: MINIMAX – THE FOUNDATION OF TWO-PLAYER GAMES
The Core Idea
In two-player zero-sum games:
- MAX player (us) wants to maximize our score
- MIN player (opponent) wants to minimize our score
- They alternate moves
- We assume opponent plays optimally to beat us
The Minimax Value
For any state, the minimax value is:
- If it’s MAX’s turn: the maximum value we can guarantee against best play
- If it’s MIN’s turn: the minimum value MAX will get despite MIN’s best efforts
- At terminal states: the actual utility (win/loss/draw)
Visualizing the Alternation
MAX (our turn)
/ | \
/ | \
MIN turn MIN MIN
/ \ / \ / \
/ \ / \ / \
MAX turn MAX MAX MAX MAX MAX
/ \ ...
... ...
Simple Example Tree
MAX
/ \
/ \
MIN MIN
/ \ / \
8 2 5 6
Computation:
- Left MIN node: min(8,2) = 2
- Right MIN node: min(5,6) = 5
- MAX node: max(2,5) = 5
Result: MAX can guarantee a score of 5
Another Example
MAX
/ \
/ \
MIN MIN
/ \ / \
3 12 8 2
Left MIN: min(3,12) = 3
Right MIN: min(8,2) = 2
MAX: max(3,2) = 3
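Both example trees can be checked in a few lines. This minimal sketch (a helper of my own, not part of the course pseudocode) represents a game tree as nested lists, with numbers as leaf utilities:

```python
def minimax_value(node, maximizing=True):
    """Minimax over a nested-list game tree; leaves are utilities."""
    if isinstance(node, (int, float)):   # leaf: terminal utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

print(minimax_value([[8, 2], [5, 6]]))   # 5  (first example)
print(minimax_value([[3, 12], [8, 2]]))  # 3  (second example)
```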
The Minimax Algorithm
def minimax(state):
"""
Returns the best action for MAX from current state
"""
if is_terminal(state):
return utility(state)
if player_to_move(state) == "MAX":
return max_value(state)
else:
return min_value(state)
def max_value(state):
if is_terminal(state):
return utility(state)
v = -infinity
for action, next_state in successors(state):
v = max(v, min_value(next_state))
return v
def min_value(state):
if is_terminal(state):
return utility(state)
v = +infinity
for action, next_state in successors(state):
v = min(v, max_value(next_state))
return v
Tic-Tac-Toe Example (Conceptual)
Initial empty board (MAX's turn)
├── Place X in center
│ └── MIN places O in corner
│ └── MAX places X ...
├── Place X in corner
│ └── MIN places O ...
└── Place X in edge ...
The algorithm explores ALL possibilities, assuming MIN always blocks MAX’s wins.
Minimax Properties
| Property | Value |
|---|---|
| Optimal? | YES – against perfect opponent |
| Against imperfect? | Still optimal in worst case |
| Time complexity | O(b^m) – exponential in depth! |
| Space complexity | O(bm) – depth-first search |
| b = branching factor | Chess: ~35 |
| m = max depth | Chess: ~100 moves |
The problem: For chess, 35^100 is completely impossible to compute!
PART 5: RESOURCE LIMITS – MAKING MINIMAX PRACTICAL
The Reality Check
We cannot search to the end of most games:
- Chess: ~35^100 positions
- Even with pruning, impossible
- We need to stop early
Depth-Limited Minimax
Instead of searching to terminal states:
- Search to a fixed depth (e.g., 4 ply = 2 moves each)
- At depth limit, use an evaluation function to estimate position value
- Propagate estimates up the tree
MAX (depth 0)
/ \
/ \
MIN (depth 1) MIN
/ \ / \
/ \ / \
MAX(d2) MAX MAX MAX
| | | |
Eval=5 Eval=3 Eval=8 Eval=2
Evaluation Functions
The heart of practical game AI! A function that scores any position without searching deeper.
Ideal evaluation function: Would return the true minimax value
- But that would require solving the game!
Practical evaluation: Weighted linear combination of features
evaluate(s) = w₁×f₁(s) + w₂×f₂(s) + ... + wₙ×fₙ(s)
Chess evaluation features:
- Material count: pawn=1, knight/bishop=3, rook=5, queen=9
- Piece mobility: number of legal moves
- King safety: pawn shelter around king
- Center control: pieces attacking center squares
- Pawn structure: doubled/isolated pawns
Example:
evaluate(s) = 1.0 × material_diff
+ 0.1 × mobility_diff
+ 0.5 × king_safety_diff
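The weighted-sum form translates directly into code. In this sketch the feature values describe a hypothetical position (the weights are the ones from the example above):

```python
def evaluate(features, weights):
    """Linear evaluation: a weighted sum of position features."""
    return sum(w * f for w, f in zip(weights, features))

# Hypothetical position: up 2 points of material, +4 mobility, equal king safety
features = [2.0, 4.0, 0.0]          # material_diff, mobility_diff, king_safety_diff
weights  = [1.0, 0.1, 0.5]
print(evaluate(features, weights))  # 2.4
```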
The Depth Trade-off
Shallow search (2 ply):
Fast but dumb - misses tactics
Medium search (4-6 ply):
Good for amateur play
Deep search (8-10 ply):
Strong club player
Very deep (12+ ply):
Master/Grandmaster level
Rule of thumb: Each additional ply (half-move) multiplies search time by the effective branching factor (roughly 6 for chess with good alpha-beta ordering, ~35 without), but each extra ply significantly improves play
Iterative Deepening for Games
Remember iterative deepening from uninformed search? It’s perfect for games!
def iterative_deepening_minimax(state, max_depth):
    best_move = None
    for depth in range(1, max_depth+1):
        # Stop before starting a search we can't finish;
        # keep the result from the last completed depth
        if out_of_time():
            break
        best_move = depth_limited_minimax(state, depth)
    return best_move
Why for games?
- We don’t know how deep we need to search
- Time is limited (chess clocks!)
- We can always return the best result from completed depth
- Later depths refine the evaluation
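On the nested-list trees used earlier, the whole pipeline of depth-limited search plus iterative deepening can be sketched as below. The `heuristic` here is a toy stand-in (averaging the values below a node) for a real evaluation function, and the names are my own:

```python
import time

def depth_limited_value(node, depth, maximizing=True):
    if isinstance(node, (int, float)):          # leaf: exact utility
        return node
    if depth == 0:                              # depth limit: estimate instead
        return heuristic(node)
    values = [depth_limited_value(c, depth - 1, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

def heuristic(node):
    # Toy evaluation: average of everything below (a real engine would use
    # domain features like material and mobility)
    if isinstance(node, (int, float)):
        return node
    return sum(heuristic(c) for c in node) / len(node)

def iterative_deepening(root, time_budget=0.05, max_depth=10):
    deadline = time.monotonic() + time_budget
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                               # keep last completed depth's answer
        best = depth_limited_value(root, depth)
    return best

print(iterative_deepening([[3, 12], [8, 2]]))   # 3
```

At depth 1 the search returns the heuristic estimate 7.5; from depth 2 onward it finds the true minimax value 3, illustrating how later iterations refine earlier ones.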
PART 6: ALPHA-BETA PRUNING – THE GAME CHANGER
The Insight
“If you have an idea which is surely bad, don’t take the time to see how truly awful it is.” – Patrick Winston
In minimax, many branches are irrelevant. Once we know a move is bad, we can stop exploring it.
The Key Variables
- α (alpha): The best value MAX can guarantee along current path
- β (beta): The best value MIN can guarantee along current path
Initially: α = -∞, β = +∞
Rules:
- If at MAX node, we update α = max(α, child_value)
- If at MIN node, we update β = min(β, child_value)
- Prune when α ≥ β – no need to explore further!
Simple Example
MAX (α=-∞, β=+∞)
/ \
/ \
MIN MIN
/ \ ?
2 3
Step-by-step:
1. Explore left MIN: get value 2
- MAX now knows it can guarantee at least 2 (α=2)
2. Explore right MIN: first child returns 1
- This MIN node now has value ≤1 (since it's MIN)
- β for this MIN = 1
3. Compare: α=2, β=1 → α ≥ β → PRUNE!
- MAX already has guaranteed 2
- This MIN can't give more than 1
- So MAX will never choose this branch
Detailed Walkthrough
Consider this tree (MAX at top):
MAX
/ \
/ \
MIN MIN
/ \ / \
10 9 8 100
Without pruning:
- Left MIN: min(10,9) = 9
- Right MIN: min(8,100) = 8
- MAX: max(9,8) = 9
With alpha-beta:
Step 1: Start at root
α = -∞, β = +∞
Step 2: Go to left MIN
α = -∞, β = +∞
Step 3: Explore first child (10)
MIN gets value 10
β = min(∞, 10) = 10
Step 4: Explore second child (9)
9 < current β (10)
MIN updates β = min(10, 9) = 9
MIN returns 9 to MAX
Step 5: Back at MAX
α = max(-∞, 9) = 9
β still +∞
Step 6: Go to right MIN
α = 9, β = +∞ (passed from MAX)
Step 7: Explore first child (8)
MIN gets 8
β = min(∞, 8) = 8
Step 8: Check prune condition
α = 9, β = 8
9 ≥ 8? YES!
PRUNE! Don't explore second child (100)
MIN returns 8 (or any value ≤8)
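The walkthrough above can be reproduced in code. This sketch (a leaf-recording variant of my own, not the course's pseudocode) confirms that the leaf 100 is never evaluated:

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, visited=None):
    """Alpha-beta over a nested-list tree; records every leaf it evaluates."""
    if visited is None:
        visited = []
    if isinstance(node, (int, float)):      # leaf
        visited.append(node)
        return node, visited
    v = float("-inf") if maximizing else float("inf")
    for child in node:
        cv, _ = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            v = max(v, cv)
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            beta = min(beta, v)
        if alpha >= beta:                   # cutoff: remaining children pruned
            break
    return v, visited

value, seen = alphabeta([[10, 9], [8, 100]])
print(value)   # 9
print(seen)    # [10, 9, 8] -- leaf 100 was pruned
```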
Why Pruning is Safe
We prune only when:
- MAX already has a guaranteed value α
- Current MIN node can’t possibly exceed α
- Therefore MAX will never choose this branch
The value we lost by pruning is irrelevant to the final decision!
The Power of Move Ordering
Alpha-beta’s effectiveness depends on examining good moves first.
Worst ordering: Weakest moves first – α rises slowly, so almost nothing is cut off
MAX
├── Worst move first (value=1) ← α=1, still low
├── Better move (value=5) ← α=5; little pruning inside this subtree
└── Best move (value=10) ← explored last, in full
Best ordering: Best move first – maximum pruning
MAX
├── Best move first (value=10) ← α=10 now!
├── Next move (value=5) ← inside this subtree, the first reply worth ≤10 triggers a cutoff
└── Next move (value=3) ← likewise cut off almost immediately
Note that each of the root's children is still examined at least partially – pruning skips the rest of a bad subtree, not the subtree entirely.
Perfect Ordering Benefit
With perfect move ordering:
- Time complexity: O(b^(m/2))
- Effectively doubles the search depth for same time!
For chess:
- Without pruning: depth 6
- With perfect alpha-beta: depth 12
- HUGE improvement!
Alpha-Beta Implementation
def alpha_beta_search(state):
# Returns best action for MAX
v, best_action = max_value(state, -infinity, +infinity)
return best_action
def max_value(state, alpha, beta):
if is_terminal(state):
return utility(state), None
v = -infinity
best_action = None
for action, next_state in successors(state):
# Get value from MIN's perspective
child_v, _ = min_value(next_state, alpha, beta)
if child_v > v:
v = child_v
best_action = action
# Update alpha
alpha = max(alpha, v)
# Prune?
if v >= beta:
return v, best_action # Beta cutoff
return v, best_action
def min_value(state, alpha, beta):
if is_terminal(state):
return utility(state), None
v = +infinity
best_action = None
for action, next_state in successors(state):
child_v, _ = max_value(next_state, alpha, beta)
if child_v < v:
v = child_v
best_action = action
# Update beta
beta = min(beta, v)
# Prune?
if v <= alpha:
return v, best_action # Alpha cutoff
return v, best_action
Visualizing Alpha-Beta Propagation
Level 1 (MAX): α=-∞, β=+∞
|
Level 2 (MIN): α=-∞, β=+∞
|
Level 3 (MAX): α=-∞, β=+∞ → gets value 3, returns up
|
Level 2 (MIN): gets 3, β=3, passes α=-∞, β=3 down
|
Level 3 (MAX): now α=-∞, β=3
             first child returns 5; v=5 ≥ β=3 → PRUNE the remaining children!
             (the MIN above already has a move worth 3, so it will never let MAX collect 5 here)
PART 7: EXPECTIMAX – HANDLING RANDOMNESS
When Things Aren’t Deterministic
Many games have random elements:
- Backgammon (dice)
- Monopoly (dice)
- Poker (random cards)
- Many video games (random drops, critical hits)
We can’t use minimax because outcomes aren’t fully under opponent’s control.
The Expectimax Idea
Instead of “MIN” nodes (minimize), we have “CHANCE” nodes (average):
- MAX nodes: maximize (our choices)
- CHANCE nodes: take weighted average based on probabilities
- MIN nodes still appear when a stochastic game also has an adversary (the combination is often called expectiminimax)
Expectimax Example
MAX
/ \
/ \
CHANCE CHANCE
/|\ /|\
1 2 3 4 5 6
(1/3 each)
Computation:
- Left CHANCE: (1+2+3)/3 = 2
- Right CHANCE: (4+5+6)/3 = 5
- MAX: max(2,5) = 5
More Complex Example
MAX
/ \
/ \
CHANCE MIN
/ \ / \
8 10 2 100
1/2 1/2
Computation:
- Left CHANCE: (8×0.5 + 10×0.5) = 9
- Right MIN: min(2,100) = 2
- MAX: max(9,2) = 9
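The mixed tree above can be verified with a small sketch. Here each node is a tuple tagged with its type, and chance children carry their probabilities (this encoding is my own):

```python
def expectimax_value(node):
    kind = node[0]
    if kind == "leaf":
        return node[1]
    if kind == "max":
        return max(expectimax_value(c) for c in node[1:])
    if kind == "min":
        return min(expectimax_value(c) for c in node[1:])
    # chance node: children are (probability, subtree) pairs
    return sum(p * expectimax_value(c) for p, c in node[1:])

tree = ("max",
        ("chance", (0.5, ("leaf", 8)), (0.5, ("leaf", 10))),
        ("min", ("leaf", 2), ("leaf", 100)))
print(expectimax_value(tree))   # 9.0
```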
Expectimax Implementation
def expectimax(state):
if is_terminal(state):
return utility(state)
if player_to_move(state) == "MAX":
return max_value(state)
elif player_to_move(state) == "MIN":
return min_value(state)
else: # CHANCE node
return exp_value(state)
def max_value(state):
v = -infinity
for action, next_state in successors(state):
v = max(v, expectimax(next_state))
return v
def min_value(state):
v = +infinity
for action, next_state in successors(state):
v = min(v, expectimax(next_state))
return v
def exp_value(state):
v = 0
total_prob = 0
for outcome, next_state, prob in stochastic_successors(state):
v += prob * expectimax(next_state)
total_prob += prob
# Probabilities at a chance node should sum to 1; dividing guards against rounding error
return v / total_prob
When to Use Expectimax vs Minimax
| Scenario | Use |
|---|---|
| Deterministic, adversarial | Minimax |
| Deterministic, no adversary | Regular search |
| Random outcomes, adversarial | Expectimax with MIN |
| Random outcomes, no adversary | Expectimax without MIN |
| Unknown probabilities | Hard! Need to learn them |
PART 8: GAME TREE COMPLEXITIES – REAL NUMBERS
Branching Factors
| Game | Branching Factor | Typical Game Length | Tree Size |
|---|---|---|---|
| Tic-tac-toe | ~4 | 9 moves | 4⁹ ≈ 262,144 |
| Checkers | ~10 | 70 moves | 10⁷⁰ |
| Chess | ~35 | 100 moves | 35¹⁰⁰ ≈ 10¹⁵⁴ |
| Go | ~300 | 200 moves | 300²⁰⁰ ≈ 10⁴⁹⁴ |
| Backgammon | ~400 | 100 moves | 400¹⁰⁰ ≈ 10²⁶⁰ |
What Alpha-Beta Achieves
With perfect ordering, effective branching factor becomes √b:
| Game | b | √b | Without pruning | With perfect pruning |
|---|---|---|---|---|
| Chess | 35 | ~6 | depth d | depth 2d |
| Go | 300 | ~17 | depth d | depth 2d |
Deep Blue’s Numbers
Deep Blue (beat Kasparov in 1997):
- Searched 200 million positions per second
- Used alpha-beta pruning
- Reached depth 12-14 typically
- Could go to depth 40+ in some lines
- Special-purpose hardware
PART 9: COMPLETE ALGORITHM COMPARISON
| Algorithm | Opponent | Randomness | Optimal? | Time | Space |
|---|---|---|---|---|---|
| Single-agent search | None | No | Yes | O(b^d) | O(bd) |
| Minimax | Adversarial | No | Yes* | O(b^m) | O(bm) |
| Alpha-beta | Adversarial | No | Yes | O(b^(m/2)) | O(bm) |
| Expectimax | Mixed | Yes | Yes* | O(b^m) | O(bm) |
*Against optimal play
PART 10: KEY TAKEAWAYS FOR YOUR MIDTERM
The Big Ideas
- Games are multi-agent – Opponents actively work against us
- Minimax – Foundation for adversarial search
- Assume optimal opponent – Plan for worst case
- Evaluation functions – Needed when can’t search to end
- Alpha-beta pruning – Dramatically improves efficiency
- Move ordering – Critical for pruning effectiveness
- Expectimax – Handle randomness
Memory Aids
- Minimax = “I maximize, you minimize”
- Alpha = “What MAX can guarantee” (starts low, goes up)
- Beta = “What MIN can guarantee” (starts high, goes down)
- Prune when α ≥ β = “MAX’s guarantee beats MIN’s limit”
- Expectimax = “Expectation + Maximization”
Common Exam Questions
Q: Why use alpha-beta pruning?
A: Reduces search space from O(b^m) to O(b^(m/2)) with perfect ordering, allowing deeper search in same time.
Q: When is minimax not optimal?
A: Against suboptimal opponents, it’s still optimal in worst case, but might miss opportunities to exploit opponent mistakes.
Q: What makes a good evaluation function?
A: Fast to compute, correlates with actual winning chances, uses meaningful features of the position.
Q: Why does move ordering matter for alpha-beta?
A: Good ordering leads to early cutoffs, bad ordering explores irrelevant branches before finding good ones.
Q: Difference between minimax and expectimax?
A: Minimax assumes worst-case (minimizing opponent), expectimax averages over chance outcomes.
PRACTICE PROBLEMS
Problem 1
Draw the minimax tree for this game where MAX moves first, values are at leaves: [3, 8, 2, 5, 9, 1, 4, 6] in a complete binary tree of depth 3.
Answer:
MAX
/ \
MIN MIN
/ \ / \
MAX MAX MAX MAX
/\ /\ /\ /\
3 8 2 5 9 1 4 6
Compute bottom-up:
- Left-left MAX: max(3,8)=8
- Left-right MAX: max(2,5)=5
- Left MIN: min(8,5)=5
- Right-left MAX: max(9,1)=9
- Right-right MAX: max(4,6)=6
- Right MIN: min(9,6)=6
- Root MAX: max(5,6)=6
Problem 2
In the above tree, which branches would alpha-beta prune with perfect ordering?
Answer: With the leaves in the given left-to-right order, alpha-beta makes no cutoffs at all – all 8 leaves are examined (the α and β bounds never cross). With perfect ordering (best child first at every node: the value-6 subtree before the value-5 subtree at the root, 6 before 4, 9 before 1, and so on), only 5 leaves are examined: after the first subtree sets β=6, leaf 9 causes a cutoff (leaf 1 pruned), and on the other side the value-5 subtree gives β=5 ≤ α=6, pruning the max(8,3) subtree entirely. This matches the perfect-ordering minimum of 2² + 2¹ − 1 = 5 leaves.
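Both orderings can be checked by counting evaluated leaves. The leaf-recording helper below is a sketch of my own:

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, visited=None):
    """Alpha-beta over a nested-list tree; records every leaf it evaluates."""
    if visited is None:
        visited = []
    if isinstance(node, int):               # leaf
        visited.append(node)
        return node, visited
    v = float("-inf") if maximizing else float("inf")
    for child in node:
        cv, _ = alphabeta(child, alpha, beta, not maximizing, visited)
        v = max(v, cv) if maximizing else min(v, cv)
        if maximizing:
            alpha = max(alpha, v)
        else:
            beta = min(beta, v)
        if alpha >= beta:                   # cutoff: skip remaining children
            break
    return v, visited

given   = [[[3, 8], [2, 5]], [[9, 1], [4, 6]]]     # leaf order from Problem 1
ordered = [[[6, 4], [9, 1]], [[5, 2], [8, 3]]]     # best child first at every node

v1, seen1 = alphabeta(given)
v2, seen2 = alphabeta(ordered)
print(v1, len(seen1))   # 6 8 -- no cutoffs in the given order
print(v2, len(seen2))   # 6 5 -- leaves 1, 8, 3 are pruned
```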
Problem 3
You have 10 seconds per move, can evaluate 1 million nodes per second. With alpha-beta, how deep can you search in chess (b=35)?
Without pruning: 35^d = 10^7 → d = log(10^7)/log(35) ≈ 4.5 ply
With perfect alpha-beta: (√35)^d = 35^(d/2) = 10^7 → d/2 = 4.5 → d = 9 ply
Problem 4
Design a simple evaluation function for tic-tac-toe.
Answer:
evaluate(board):
if MAX has 3 in a row: return +100
if MIN has 3 in a row: return -100
score = 0
# Reward center control
if board.center == MAX: score += 10
if board.center == MIN: score -= 10
# Reward corners
for corner in corners:
if board[corner] == MAX: score += 5
if board[corner] == MIN: score -= 5
return score
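The pseudocode above can be made runnable. In this sketch the board is a flat list of nine cells – "X" for MAX, "O" for MIN, None for empty (an encoding chosen here for illustration):

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals
CORNERS = (0, 2, 6, 8)

def evaluate(board, max_p="X", min_p="O"):
    # Terminal check: three in a row dominates everything else
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return 100 if board[a] == max_p else -100
    score = 0
    if board[4] == max_p: score += 10          # reward center control
    elif board[4] == min_p: score -= 10
    for i in CORNERS:                          # reward corner control
        if board[i] == max_p: score += 5
        elif board[i] == min_p: score -= 5
    return score

board = ["X", None, "O",
         None, "X", None,
         None, None, None]
print(evaluate(board))   # 10 + 5 - 5 = 10
```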
Problem 5
In expectimax, if a chance node has outcomes with probabilities [0.2, 0.3, 0.5] and values [10, 20, 5], what’s its value?
Answer: 0.2×10 + 0.3×20 + 0.5×5 = 2 + 6 + 2.5 = 10.5
Adversarial search connects back to everything we’ve learned – search trees, complexity analysis, and the trade-off between optimality and practicality. Master minimax and alpha-beta, understand when to use expectimax, and you’ll be well-prepared.
