Minimax evaluates game states by exploring possible future moves in a search tree and backing up scores from leaves to the root using alternating max and min operations. You define a utility function that assigns a value to terminal outcomes (for example: win = +1, draw = 0, loss = -1 from the AI’s perspective). Starting at the current state (the root), you generate legal moves to create child states. The algorithm recursively evaluates those children, and then at each internal node it chooses either the maximum child value (if it’s your turn) or the minimum child value (if it’s the opponent’s turn). The best move is the root child that produces the backed-up root value.
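The back-up step can be sketched in a few lines. In this minimal sketch, the tree encoding (nested lists for internal nodes, numbers for terminal utilities) and the `backup` helper are illustrative, not a fixed API:

```python
# Back up minimax values through a hand-built game tree.
# A node is either a number (terminal utility from the maximizer's
# perspective) or a list of child nodes (an internal position).

def backup(node, maximizing):
    """Return the minimax value of `node`."""
    if isinstance(node, (int, float)):   # leaf: known utility
        return node
    values = [backup(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Root is a MAX node with two moves; each reply is a MIN node.
tree = [[3, 12], [2, 8]]
print(backup(tree, True))  # prints 3: MIN backs up 3 and 2, MAX picks 3
```

The root child that produced the backed-up value 3 (the left subtree) is the move the algorithm would play.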
In code, the simplest structure is a function like minimax(state, depth, maximizing) that returns a numeric score. The base case is either a terminal state (game over) or a depth cutoff (to keep the search finite). For terminal states, return the known utility. For cutoff states, return a heuristic evaluation function (more on that later). Otherwise, iterate over legal moves, apply each move to get a successor state, and recursively evaluate it with maximizing flipped. If maximizing is true, return the maximum score over children; if false, return the minimum. Most implementations also return the best move alongside the score at the root, typically by tracking which child achieved the best score at the top level.
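That structure, made runnable, might look like the sketch below. The game here is a toy stand-in (a Nim-style pile where each player removes 1 or 2 stones and taking the last stone wins), and `legal_moves` is an illustrative helper, not part of any particular library:

```python
# A minimal minimax(state, depth, maximizing) sketch over a toy
# Nim-style game: the state is a pile count, a move removes 1 or 2
# stones, and whoever takes the last stone wins.

def legal_moves(pile):
    return [m for m in (1, 2) if m <= pile]

def minimax(pile, depth, maximizing):
    """Return (score, best_move) from the maximizer's perspective."""
    if pile == 0:
        # Terminal: the player who just moved took the last stone,
        # so whoever is "to move" now has lost.
        return (-1 if maximizing else 1), None
    if depth == 0:
        return 0, None  # depth cutoff: neutral heuristic for this sketch
    best_score, best_move = None, None
    for move in legal_moves(pile):
        score, _ = minimax(pile - move, depth - 1, not maximizing)
        if best_score is None or (
            (score > best_score) if maximizing else (score < best_score)
        ):
            best_score, best_move = score, move
    return best_score, best_move

score, move = minimax(4, depth=10, maximizing=True)
# score == 1, move == 1: take one stone, leaving a losing pile of 3
```

Returning the `(score, move)` pair everywhere is slightly wasteful below the root, but it keeps the code to a single function; the common alternative is a score-only recursive helper plus a thin root loop that tracks the best move.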
A concrete example helps. Suppose you’re building tic-tac-toe. At a MAX node (AI’s turn), you try all empty squares. Each resulting state becomes a MIN node (opponent’s turn). Eventually you reach terminal boards: three-in-a-row for AI (+1), three-in-a-row for opponent (-1), or a full board with no winner (a draw, 0). When you back up values, MIN nodes pick the smallest child value because the opponent wants you to do poorly. This “assume optimal counterplay” is what makes Minimax different from “just pick the move with the highest immediate heuristic.” The same evaluation pattern can be applied to non-game decision trees as long as you can define (1) successors and (2) a consistent scoring function. In systems where state evaluation depends on retrieved context, you can treat “retrieve candidate evidence” as part of the successor generation and then score downstream outcomes. If your retrieval layer uses Milvus or Zilliz Cloud, keep the scoring deterministic for a given query during a single search iteration so that Minimax’s backed-up values remain stable and debuggable.
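The tic-tac-toe version fits in a compact sketch. The board encoding (a list of nine cells holding `'X'` for the AI, `'O'` for the opponent, or `None`) and the helper names are illustrative choices, assumed for this example:

```python
# Tic-tac-toe minimax: MAX nodes (AI, 'X') take the largest child
# value, MIN nodes (opponent, 'O') take the smallest.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, ai_turn):
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    empties = [i for i, cell in enumerate(board) if cell is None]
    if not empties:
        return 0  # full board, no winner: draw
    scores = []
    for i in empties:
        board[i] = 'X' if ai_turn else 'O'   # apply move
        scores.append(minimax(board, not ai_turn))
        board[i] = None                       # undo move
    return max(scores) if ai_turn else min(scores)

def best_move(board):
    """Return the empty square whose MIN-node reply tree backs up highest."""
    empties = [i for i, cell in enumerate(board) if cell is None]
    def score(i):
        board[i] = 'X'
        s = minimax(board, False)
        board[i] = None
        return s
    return max(empties, key=score)

# X has 0 and 1, O has 3 and 4; playing square 2 wins immediately.
board = ['X', 'X', None, 'O', 'O', None, None, None, None]
print(best_move(board))  # prints 2
```

Applying and undoing moves in place avoids copying the board at every node; for larger games you would also add a depth cutoff and a heuristic, as described above.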
