Minimax with Alpha-Beta Pruning
Game Tree Search
The Goal of a Chess Engine
A chess engine is solving:
“Given a position, what move leads to the best possible future?”
But the engine cannot know the future, so it simulates it.
This simulation is called search.
The Game Tree
Every legal move creates a new position. From that position, the opponent also has moves.
This forms a tree:
Position
├── Move A
│   ├── Opp Move A1
│   │   ├── Move A1a
│   │   └── Move A1b
│   └── Opp Move A2
│       └── ...
└── Move B
    └── ...
This is called the game tree.
Evaluation Function
We cannot search all the way to checkmate (the tree is far too big). So at some depth we stop and estimate who is better:
evaluate(position) → score
- positive = good for White
- negative = good for Black
We can also think of it as a static evaluation function, because it produces a score just by looking at the current board, with no look-ahead.
Example:
+1.20 → White is winning
-0.50 → Black slightly better
0.00 → Equal
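To make this concrete, here is a minimal material-count evaluator, as a Python sketch. The board representation (a square → piece map) and the classic 1/3/3/5/9 piece values are assumptions for illustration; real engines add piece-square tables, pawn structure, king safety, and much more.

# Minimal material-count evaluation (illustrative sketch).
# Assumes `board` maps squares to piece codes like "wP" or "bQ",
# with the classic 1/3/3/5/9 piece values.
PIECE_VALUES = {"P": 1.0, "N": 3.0, "B": 3.0, "R": 5.0, "Q": 9.0, "K": 0.0}

def evaluate(board):
    """Score in pawns: positive = good for White, negative = good for Black."""
    score = 0.0
    for piece in board.values():
        color, kind = piece[0], piece[1]
        value = PIECE_VALUES[kind]
        score += value if color == "w" else -value
    return score

# White has an extra knight against two extra pawns:
board = {"e1": "wK", "e8": "bK", "g1": "wN", "a7": "bP", "b7": "bP"}
print(evaluate(board))  # 3.0 - 2.0 = +1.0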
The Minimax Principle
- Max player (White): tries to maximize the score
- Min player (Black): tries to minimize the score
- Each player assumes the opponent plays optimally
- At each ply, we alternate between max and min
function minimax(position, depth, isMaximizingPlayer):
    if depth == 0 or position.isGameOver():
        return evaluate(position)

    if isMaximizingPlayer:
        maxEval = -INFINITY
        for each move in position.getLegalMoves():
            newPosition = position.makeMove(move)
            eval = minimax(newPosition, depth - 1, false)
            maxEval = max(maxEval, eval)
        return maxEval
    else:
        minEval = +INFINITY
        for each move in position.getLegalMoves():
            newPosition = position.makeMove(move)
            eval = minimax(newPosition, depth - 1, true)
            minEval = min(minEval, eval)
        return minEval
Example
Root Position (White to move, MAX, depth=3)
Score: ?
│
├── Move A: Nf3
│   │
│   ├── Black Move A1: d5 (MIN, depth=2)
│   │   Score: ?
│   │   │
│   │   ├── White Move A1a: d4 (MAX, depth=1)
│   │   │   └── [EVAL: +0.3] ← Static evaluation at depth=0!
│   │   │
│   │   └── White Move A1b: e3 (MAX, depth=1)
│   │       └── [EVAL: +0.1] ← Static evaluation at depth=0!
│   │
│   │   → MAX picks highest: max(+0.3, +0.1) = +0.3
│   │
│   └── Black Move A2: e5 (MIN, depth=2)
│       Score: ?
│       │
│       ├── White Move A2a: d4 (MAX, depth=1)
│       │   └── [EVAL: +0.5] ← Static evaluation at depth=0!
│       │
│       └── White Move A2b: Nc3 (MAX, depth=1)
│           └── [EVAL: +0.2] ← Static evaluation at depth=0!
│
│       → MAX picks highest: max(+0.5, +0.2) = +0.5
│
│   → MIN at Move A picks lowest: min(+0.3, +0.5) = +0.3
│
├── Move B: e4
│   │
│   ├── Black Move B1: c5 (MIN, depth=2)
│   │   Score: ?
│   │   │
│   │   ├── White Move B1a: Nf3 (MAX, depth=1)
│   │   │   └── [EVAL: +0.4] ← Static evaluation at depth=0!
│   │   │
│   │   └── White Move B1b: d4 (MAX, depth=1)
│   │       └── [EVAL: +0.6] ← Static evaluation at depth=0!
│   │
│   │   → MAX picks highest: max(+0.4, +0.6) = +0.6
│   │
│   └── Black Move B2: e5 (MIN, depth=2)
│       Score: ?
│       │
│       ├── White Move B2a: Nf3 (MAX, depth=1)
│       │   └── [EVAL: +0.3] ← Static evaluation at depth=0!
│       │
│       └── White Move B2b: Bc4 (MAX, depth=1)
│           └── [EVAL: +0.5] ← Static evaluation at depth=0!
│
│       → MAX picks highest: max(+0.3, +0.5) = +0.5
│
│   → MIN at Move B picks lowest: min(+0.6, +0.5) = +0.5
│
└── Move C: d4
    │
    ├── Black Move C1: d5 (MIN, depth=2)
    │   Score: ?
    │   │
    │   ├── White Move C1a: c4 (MAX, depth=1)
    │   │   └── [EVAL: +0.2] ← Static evaluation at depth=0!
    │   │
    │   └── White Move C1b: Nf3 (MAX, depth=1)
    │       └── [EVAL: +0.1] ← Static evaluation at depth=0!
    │
    │   → MAX picks highest: max(+0.2, +0.1) = +0.2
    │
    └── Black Move C2: Nf6 (MIN, depth=2)
        Score: ?
        │
        ├── White Move C2a: c4 (MAX, depth=1)
        │   └── [EVAL: +0.3] ← Static evaluation at depth=0!
        │
        └── White Move C2b: Nc3 (MAX, depth=1)
            └── [EVAL: +0.4] ← Static evaluation at depth=0!

        → MAX picks highest: max(+0.3, +0.4) = +0.4

    → MIN at Move C picks lowest: min(+0.2, +0.4) = +0.2

FINAL DECISION at Root:
White chooses: max(+0.3, +0.5, +0.2) = +0.5
→ Best move: e4 (Move B)
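The same calculation as a runnable Python sketch: the tree above is encoded as nested lists of leaf evaluations (an illustration of the algorithm, not an engine).

# The example tree as nested lists of leaf evaluations.
# Outer level: White's root moves; middle: Black's replies;
# inner: static evals after White's second move.
TREE = [
    [[0.3, 0.1], [0.5, 0.2]],  # Move A: Nf3 -> +0.3
    [[0.4, 0.6], [0.3, 0.5]],  # Move B: e4  -> +0.5
    [[0.2, 0.1], [0.3, 0.4]],  # Move C: d4  -> +0.2
]

def minimax(node, maximizing):
    if isinstance(node, float):  # leaf: static evaluation
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

scores = [minimax(branch, maximizing=False) for branch in TREE]
print(scores)       # [0.3, 0.5, 0.2]
print(max(scores))  # 0.5 -> best move: e4 (Move B)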
Alpha-Beta Pruning
The Problem with Plain Minimax
Plain minimax explores every node in the tree, even when we already know some branches won’t affect the final decision.
Example: You’re comparing your move options and have already found one that guarantees you +5. Then you start looking at another move, and the opponent’s first reply gives you -3. Do you need to check the opponent’s other replies? NO! That second move is now worth at most -3, and you already have +5, so you will never play it in the first place.
This optimization is called Alpha-Beta Pruning.
The Two Parameters
Alpha (α): The best score MAX can guarantee so far
- Best score the maximizing side (us) is GUARANTEED so far
- Starts at -∞
- Only MAX updates it (increases it)
- Represents MAX’s “floor” - the worst MAX will accept
Beta (β): The best score MIN can guarantee so far
- Best score the minimizing side (opponent) is GUARANTEED so far
- Starts at +∞
- Only MIN updates it (decreases it)
- Represents MIN’s “ceiling” - the most MIN will concede
So at any node:
The true value of the position must lie inside [α, β]
If we ever prove it lies outside this window → stop searching, because the parent will never choose this line.
The Two Types of Pruning
There are two different situations where we prune, depending on which player discovers the cutoff:
1. Beta Cutoff
Occurs at a node where we are choosing (maximizing node).
Meaning
We found a move so good for us that the opponent will NEVER allow this position to be reached.
Example
We are deep inside Move A, at a node where we choose.
Higher up, the opponent already has an alternative line that holds us to:
beta = +2 (the opponent can limit us to +2 elsewhere)
Now at this node we find a move scoring:
position score = +5 for us
The opponent says:
“Nope. I will never steer the game into this position and hand you +5 when I can choose another line giving you only +2.”
So the rest of this branch is irrelevant.
Therefore:
score >= beta → BETA CUTOFF
2. Alpha Cutoff
Occurs at a node where the opponent is choosing (minimizing node).
Meaning
The opponent found a reply so bad for us that we will never enter this line.
Example
We already have a line elsewhere that guarantees us:
alpha = -4
Now, at a node where the opponent chooses, we find one of their replies leads to:
score = -7
We will never play into this line: the opponent can punish it with -7 when we already have -4 guaranteed, so their remaining replies don’t matter.
Therefore:
score <= alpha → ALPHA CUTOFF
Pseudocode
function alphaBeta(position, depth, alpha, beta, isMaximizingPlayer):
    if depth == 0 or position.isGameOver():
        return evaluate(position)

    if isMaximizingPlayer:
        maxEval = -INFINITY
        for each move in position.getLegalMoves():
            newPosition = position.makeMove(move)
            eval = alphaBeta(newPosition, depth - 1, alpha, beta, false)
            maxEval = max(maxEval, eval)
            alpha = max(alpha, eval)   // ← MAX updates alpha
            if beta <= alpha:          // ← Pruning condition
                break                  // Beta cutoff: MIN won't allow this path
        return maxEval
    else: // Minimizing player
        minEval = +INFINITY
        for each move in position.getLegalMoves():
            newPosition = position.makeMove(move)
            eval = alphaBeta(newPosition, depth - 1, alpha, beta, true)
            minEval = min(minEval, eval)
            beta = min(beta, eval)     // ← MIN updates beta
            if beta <= alpha:          // ← Pruning condition
                break                  // Alpha cutoff: MAX won't allow this path
        return minEval

// Initial call from root (returns the score; the root loop tracks the best move)
bestScore = alphaBeta(position, MAX_DEPTH, -INFINITY, +INFINITY, true)
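The same pruning logic as a runnable Python sketch, reusing the toy tree from the minimax example and counting leaf evaluations to show the saving (the exact count depends on move order):

INF = float("inf")

def alphabeta(node, alpha, beta, maximizing, counter):
    if isinstance(node, float):  # leaf: static evaluation
        counter[0] += 1
        return node
    if maximizing:
        value = -INF
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, value)
            if beta <= alpha:    # beta cutoff
                break
        return value
    else:
        value = INF
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True, counter))
            beta = min(beta, value)
            if beta <= alpha:    # alpha cutoff
                break
        return value

TREE = [[[0.3, 0.1], [0.5, 0.2]],   # Nf3 -> 0.3
        [[0.4, 0.6], [0.3, 0.5]],   # e4  -> 0.5
        [[0.2, 0.1], [0.3, 0.4]]]   # d4  -> 0.2
counter = [0]
print(alphabeta(TREE, -INF, INF, True, counter))  # 0.5, same answer as minimax
print(counter[0])  # 9 of the 12 leaves evaluated; 3 were pruned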
Negamax - A Simpler Way to Write Minimax
Negamax is not a different algorithm - it’s just a cleaner way to write minimax.
Instead of having separate logic for MAX and MIN players, negamax exploits a mathematical symmetry:
“Your best move is my worst move (negated)”
The Key Insight
In chess (and most zero-sum games):
- What’s good for White (+5) is equally bad for Black (-5)
- White maximizes the score
- Black minimizes the score
But we can flip the perspective:
- From White’s view: position scores +5
- From Black’s view: same position scores -5
So instead of:
White (MAX): pick maximum
Black (MIN): pick minimum
We can do:
Current player: pick maximum FROM THEIR PERSPECTIVE
Opponent: negate the score to flip perspective
Minimax vs Negamax - Side by Side
Regular Minimax
function minimax(position, depth, isMaximizingPlayer):
    if depth == 0:
        return evaluate(position)

    if isMaximizingPlayer: // White's turn
        maxEval = -INFINITY
        for each move in position.getLegalMoves():
            eval = minimax(makeMove(move), depth - 1, false)
            maxEval = max(maxEval, eval)
        return maxEval
    else: // Black's turn
        minEval = +INFINITY
        for each move in position.getLegalMoves():
            eval = minimax(makeMove(move), depth - 1, true)
            minEval = min(minEval, eval)
        return minEval
Negamax - Simplified!
function negamax(position, depth, color):
    if depth == 0:
        return color * evaluate(position)   // ← Flip perspective!

    maxEval = -INFINITY
    for each move in position.getLegalMoves():
        eval = -negamax(makeMove(move), depth - 1, -color)  // ← Negate!
        maxEval = max(maxEval, eval)
    return maxEval

// Initial call
// color = +1 for White (maximizing)
// color = -1 for Black (minimizing)
bestScore = negamax(position, depth, +1)
Why Negate the Recursive Call?
eval = -negamax(...)
Think of it as switching perspective:
- Child node returns score from their perspective
- We want score from our perspective
- Our best = opponent’s worst
- So we negate!
Example:
Black finds a position worth -5 from White’s view → From Black’s view, that’s +5 (good for Black!) → Negamax returns +5 to Black → White receives -5 (bad for White, correctly)
Let’s trace a simple tree:
Position (White to move, color=+1, depth=2)
│
├── Move A
│   │
│   └── Position after A (Black to move, color=-1, depth=1)
│       │
│       ├── Move A1
│       │   └── Leaf (White to move, color=+1, depth=0)
│       │       [EVAL from White's view: +3]
│       │       Leaf returns: +1 * (+3) = +3
│       │       Black's loop negates: -(+3) = -3
│       │
│       └── Move A2
│           └── Leaf (White to move, color=+1, depth=0)
│               [EVAL from White's view: +1]
│               Leaf returns: +1 * (+1) = +1
│               Black's loop negates: -(+1) = -1
│
│       Black picks max(-3, -1) = -1 (best for Black)
│       Return to White: -(-1) = +1
│
└── Move B
    │
    └── Position after B (Black to move, color=-1, depth=1)
        │
        ├── Move B1
        │   └── Leaf (White to move, color=+1, depth=0)
        │       [EVAL: +5] → leaf returns +5 → Black's loop negates: -5
        │
        └── Move B2
            └── Leaf (White to move, color=+1, depth=0)
                [EVAL: +2] → leaf returns +2 → Black's loop negates: -2

        Black picks max(-5, -2) = -2
        Return to White: -(-2) = +2

White picks max(+1, +2) = +2
Best move: B
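The same trace as a tiny runnable Python sketch (a hand-built tree for illustration, not an engine; leaf values are stored from White's point of view, as in the diagram):

INF = float("inf")

# Leaf evaluations from White's point of view, matching the trace.
TREE = [[3.0, 1.0],   # Move A -> Black's replies A1, A2
        [5.0, 2.0]]   # Move B -> Black's replies B1, B2

def negamax(node, color):
    if isinstance(node, float):
        return color * node  # flip the static eval to the side to move
    best = -INF
    for child in node:
        best = max(best, -negamax(child, -color))
    return best

print(negamax(TREE, +1))  # +2 -> best move: B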
Negamax with Alpha-Beta Pruning
function negamax(position, depth, alpha, beta, color):
    if depth == 0:
        return color * evaluate(position)

    maxEval = -INFINITY
    for each move in position.getLegalMoves():
        eval = -negamax(makeMove(move), depth - 1, -beta, -alpha, -color)
        maxEval = max(maxEval, eval)
        alpha = max(alpha, eval)
        if alpha >= beta:
            break   // Prune
    return maxEval

// Initial call
bestScore = negamax(position, MAX_DEPTH, -INFINITY, +INFINITY, +1)
Why swap alpha and beta?
- Alpha/beta represent a window from current player’s perspective
- When we flip perspective (negate), we must also flip the window
- Your alpha becomes opponent’s negative beta, and vice versa
Alpha and Beta in Negamax
In negamax, alpha and beta always represent the window from the CURRENT player’s perspective.
Alpha: "I'm guaranteed at least this score"
Beta: "My opponent won't let me get more than this"
Both players use the same logic, but when we recurse, we negate and swap the window.
Why We Swap Alpha and Beta
Your alpha = “best I can guarantee”
- For the opponent, this becomes their beta = “the most they’ll allow you”
- But negated: their beta = -alpha
Your beta = “the most the opponent will allow”
- For the opponent, this becomes their alpha = “the best they can guarantee”
- But negated: their alpha = -beta
Step-by-Step Example
Root: White's turn (color = +1)
alpha = -∞, beta = +∞
"I want at least -∞, opponent won't give me more than +∞"

├── Exploring Move A
│
│   Call: negamax(positionA, depth-1, -∞, +∞, -1)
│         ↓
│   After negation: negamax(..., -beta, -alpha, -color)
│                             -(+∞)  -(-∞)
│                              = -∞   = +∞
│
│   Black's turn (color = -1)
│   alpha = -∞, beta = +∞ (from Black's perspective)
│   "I (Black) want at least -∞, White won't give me more than +∞"
│
│   ├── Move A1: returns +3 (from Black's view)
│   │   alpha = max(-∞, +3) = +3
│   │   "I now have at least +3 (from my Black perspective)"
│   │
│   └── Move A2: returns +1
│       alpha = max(+3, +1) = +3 (no change)
│
│   Black returns: +3
│
│   Back to White: -(+3) = -3
│   White's alpha = max(-∞, -3) = -3
│   "I (White) now have at least -3"
│
└── Exploring Move B

    White now has alpha = -3, beta = +∞

    Call: negamax(positionB, depth-1, -beta, -alpha, -color)
                                      -(+∞)  -(-3)
                                       = -∞   = +3  ← Swapped!

    Black's turn
    alpha = -∞, beta = +3 (from Black's perspective)
    "I want at least -∞, but White won't let me get more than +3"
                                                              ↑
                                    This came from White's alpha = -3

    ├── Move B1: returns +5 (from Black's view)
    │   alpha = max(-∞, +5) = +5
    │   Check: alpha >= beta? → +5 >= +3? YES! ✂️
    │   PRUNE! (Beta cutoff)
    │
    │   Why prune?
    │   Black found +5 for themselves
    │   But White (parent) already has -3 guaranteed
    │   From White's view, +5 for Black = -5 for White
    │   White won't choose this branch (-5 < -3)
    │
    └── Move B2: ✂️ Not explored
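The pruning above can be reproduced in a few lines of Python (a sketch over a hand-built tree; leaf values are stored from White's point of view, so Black sees them negated as +3, +1, +5, +2, matching the diagram):

INF = float("inf")

# Leaf evals from White's point of view; Black sees them negated
# (+3, +1, +5, +2 for Black), matching the diagram above.
TREE = [[("A1", -3.0), ("A2", -1.0)],   # after Move A (Black to move)
        [("B1", -5.0), ("B2", -2.0)]]   # after Move B (Black to move)

visited = []

def negamax(node, alpha, beta, color):
    if isinstance(node, tuple):  # leaf: (name, eval from White's view)
        name, score = node
        visited.append(name)
        return color * score
    best = -INF
    for child in node:
        best = max(best, -negamax(child, -beta, -alpha, -color))
        alpha = max(alpha, best)
        if alpha >= beta:        # cutoff
            break
    return best

print(negamax(TREE, -INF, INF, +1))  # -3 -> White prefers Move A
print(visited)                       # ['A1', 'A2', 'B1'] - B2 was pruned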
Iterative Deepening
Imagine you’re writing a chess engine and give it 10 seconds to find a move.
Naive approach:
"Let me search to depth 6... oh wait, this is taking 15 seconds!"
→ Time's up, no move found yet!
Better approach (Iterative Deepening):
Depth 1: 0.001s → Found move: e4 (score: +0.2)
Depth 2: 0.01s → Found move: e4 (score: +0.3)
Depth 3: 0.1s → Found move: Nf3 (score: +0.4)
Depth 4: 1s → Found move: Nf3 (score: +0.5)
Depth 5: 8s → Found move: d4 (score: +0.6)
[Time's up!]
→ Return d4 (best move found at depth 5)
Iterative Deepening: Search depth 1, then depth 2, then depth 3… until time runs out. Always have a move ready!
Pseudocode
Basic Iterative Deepening
function iterativeDeepeningSearch(position, maxTime):
    startTime = currentTime()
    bestMove = null
    bestScore = -INFINITY

    for depth = 1 to INFINITY:
        # Check if we have time for this depth
        if currentTime() - startTime >= maxTime:
            break

        # Search at current depth
        # (a root-level search that returns both score and best move)
        score, move = negamax(position, depth, -INF, +INF, +1)

        # Update best move found so far
        bestMove = move
        bestScore = score

        # Optional: print info
        print("Depth:", depth, "Move:", move, "Score:", score)

    return bestMove
With Time Management
function iterativeDeepeningSearch(position, maxTime):
    startTime = currentTime()
    bestMove = null
    timeForDepth = []   # Track time per depth

    for depth = 1 to MAX_DEPTH:
        depthStartTime = currentTime()

        # Estimate if we have time for this depth
        if depth > 2:
            estimatedTime = predictTimeForDepth(depth, timeForDepth)
            if currentTime() - startTime + estimatedTime > maxTime:
                break   # Not enough time, use previous result

        # Search at current depth (root-level search returning score and move)
        score, move = alphaBeta(position, depth, -INF, +INF, true)

        # Record how long this depth took
        timeForDepth.append(currentTime() - depthStartTime)

        # Update best move
        bestMove = move
        bestScore = score

        # Check if time is up
        if currentTime() - startTime >= maxTime:
            break

    return bestMove
Why Does This Work?
“Isn’t searching depth 1, 2, 3… wasteful?”
Intuition says: Yes! You’re re-searching the same positions multiple times.
Reality says: No! The time is dominated by the deepest search.
The Math Behind It
Chess has a branching factor of ~35 (average legal moves per position).
Time for each depth:
Depth 1: 35^1 = 35 nodes
Depth 2: 35^2 = 1,225 nodes
Depth 3: 35^3 = 42,875 nodes
Depth 4: 35^4 = 1,500,625 nodes
Depth 5: 35^5 = 52,521,875 nodes
Depth 6: 35^6 = 1,838,265,625 nodes
Total work with Iterative Deepening to depth 6:
Total = 35 + 1,225 + 42,875 + 1,500,625 + 52,521,875 + 1,838,265,625
      ≈ 1,892,332,260 nodes
Just depth 6 alone: 1,838,265,625 nodes
Overhead from earlier depths: 54,066,635 nodes (about 2.9%)
The overhead is small! The deepest search dominates everything.
General Formula
For branching factor b and depth d:
Time for depth d: b^d
Total time with ID:
b^1 + b^2 + b^3 + ... + b^d = (b^(d+1) - b) / (b - 1)
≈ b^d * (b / (b-1))
For b=35:
Overhead factor = 35 / 34 = 1.029
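A quick sanity check of these numbers in Python:

b, d = 35, 6
nodes = [b ** k for k in range(1, d + 1)]
print(nodes[-1])               # 1838265625 nodes for depth 6 alone
print(sum(nodes))              # 1892332260 nodes for depths 1..6
print(sum(nodes) / nodes[-1])  # ~1.029 -> about 2.9% overhead, matching 35/34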
Benefits of Iterative Deepening
1. Time Management
# Always have a move ready
for depth in 1..∞:
    if timeUp():
        return bestMoveFoundSoFar   # ← Always valid!
    search(depth)
2. Better Move Ordering
Results from shallower searches help order moves in deeper searches:
# Depth 3 found: d4 is best, Nf3 second, e4 third
# At depth 4, search in this order first:
moves = [d4, Nf3, e4, ...] # ← Previous best first!
# → More alpha-beta pruning!
This is called Principal Variation (PV) move ordering.
Principal Variation = the line of moves the engine currently believes is best.
Example after depth 4:
PV: 1. Nf3 d5 2. d4 Nf6
This dramatically speeds up alpha-beta pruning, because
Alpha-beta is only fast if you search the BEST move first.
If the best move is searched late, pruning almost disappears.
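A small demonstration of the effect, as a Python sketch reusing the toy tree from earlier. The exact leaf counts are specific to this tree, but the direction of the effect is general: searching the best root move first produces more cutoffs.

INF = float("inf")

def alphabeta(node, alpha, beta, maximizing, counter):
    if isinstance(node, float):      # leaf: static evaluation
        counter[0] += 1
        return node
    best = -INF if maximizing else INF
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing, counter)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, value)
        else:
            best = min(best, value)
            beta = min(beta, value)
        if beta <= alpha:            # cutoff
            break
    return best

A = [[0.3, 0.1], [0.5, 0.2]]   # Nf3 -> 0.3
B = [[0.4, 0.6], [0.3, 0.5]]   # e4  -> 0.5 (the best root move)
C = [[0.2, 0.1], [0.3, 0.4]]   # d4  -> 0.2

for label, order in [("best first", [B, A, C]), ("best last", [C, A, B])]:
    counter = [0]
    score = alphabeta(order, -INF, INF, True, counter)
    print(label, score, counter[0])  # best first: 8 leaves; best last: 10 leaves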
3. Progressive Information
Depth 1: e4 (+0.2) [0.001s]
Depth 2: e4 (+0.3) [0.01s]
Depth 3: Nf3 (+0.4) [0.1s]
Depth 4: Nf3 (+0.5) [1s]
You see the engine “thinking deeper” in real-time!
4. Handles Unknown Time Limits
Don’t know how long to search? Just keep going deeper until interrupted!
Pseudocode (Time Limit and PV Move Ordering)
function findBestMove(position, maxTime):
    startTime = now()
    bestMove = null
    pvMoves = []   # Principal variation from previous depth

    for depth = 1 to 100:   # Practically infinite
        # Check time before starting new depth
        elapsed = now() - startTime
        if elapsed > maxTime * 0.9:   # Leave 10% buffer
            break

        # Search at current depth with move ordering
        result = search(position, depth, pvMoves)

        # Update best move and PV
        bestMove = result.move
        bestScore = result.score
        pvMoves = result.principalVariation

        # Log progress
        print(f"Depth {depth}: {bestMove} (score: {bestScore}, time: {elapsed}s)")

        # Check for forced mate
        if abs(bestScore) > MATE_THRESHOLD:
            print(f"Mate found in {depth} plies!")
            break

    return bestMove

function search(position, depth, pvMoves):
    # Order moves: PV move first, then others
    moves = orderMoves(position.getLegalMoves(), pvMoves)

    bestMove = null
    bestScore = -INFINITY
    alpha = -INFINITY
    beta = +INFINITY
    pv = []

    for move in moves:
        newPos = position.makeMove(move)
        # Negamax variant that also returns the child's principal variation
        score, childPV = negamax(newPos, depth - 1, -beta, -alpha, -1)
        score = -score
        if score > bestScore:
            bestScore = score
            bestMove = move
            pv = [move] + childPV
        alpha = max(alpha, score)

    return {move: bestMove, score: bestScore, principalVariation: pv}
Iterative Deepening with Aspiration Windows
Advanced engines narrow the alpha-beta window using previous depth’s score:
Normally search starts with:
alpha = -∞
beta = +∞
Meaning:
“I have no idea what the position score is — search everything.”
This is slow because pruning is weak.
Key Observation
From iterative deepening, you already know the score from previous depth.
Example:
Depth 6 → score = +0.32
Depth 7 → score ≈ probably near +0.32
Chess positions are stable. Score rarely jumps from +0.3 → -5.0 in one extra ply.
So instead of searching full range…
we search only near the expected value.
Aspiration Window Idea
Instead of:
alpha = -∞
beta = +∞
we search:
alpha = previousScore - margin
beta = previousScore + margin
Why this is FAST
Alpha-beta pruning strength depends on window size.
Smaller window ⇒ way more cutoffs ⇒ exponential speedup
function iterativeDeepening(position, maxTime):
    previousScore = 0
    bestMove = null

    for depth = 1 to ∞:
        # Use a narrow window around the previous depth's score
        windowSize = 0.5
        alpha = previousScore - windowSize
        beta = previousScore + windowSize
        # (root-level negamax variant returning score and move)
        score, move = negamax(position, depth, alpha, beta, +1)

        # Re-search with the full window if the score fell outside it
        if score <= alpha or score >= beta:
            score, move = negamax(position, depth, -INF, +INF, +1)

        previousScore = score
        bestMove = move

        if timeUp():
            break

    return bestMove