To avoid blowing the call stack, you typically combine depth limits with either an iterative implementation (explicit stack) or a shallow recursion strategy that stays within your language’s recursion constraints. The most common fix is simply to enforce a maximum search depth and evaluate non-terminal leaves with a heuristic function. That keeps recursion depth bounded and is standard in real Minimax agents because full-depth search is rarely feasible anyway. If your game still needs deeper search than your recursion limit allows, switch to an explicit stack (iterative DFS) so your program uses heap memory for frames rather than the call stack.
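A minimal sketch of that depth-limited approach, using a hypothetical toy encoding where a node is either a list of child nodes or an integer leaf, and `toy_heuristic` stands in for a real evaluation function:

```python
def minimax(node, depth, maximizing, heuristic):
    """Depth-limited Minimax: recursion never goes deeper than `depth`."""
    children = node if isinstance(node, list) else None
    if depth == 0 or children is None:
        return heuristic(node)  # cutoff or terminal: estimate the position
    scores = [minimax(c, depth - 1, not maximizing, heuristic) for c in children]
    return max(scores) if maximizing else min(scores)

def toy_heuristic(node):
    """Score a position: a leaf is its own value; an unexpanded subtree is
    crudely estimated by the average of its leaf values (a stand-in for a
    real, game-specific evaluation function)."""
    if isinstance(node, int):
        return node
    leaves, todo = [], [node]
    while todo:
        n = todo.pop()
        if isinstance(n, int):
            leaves.append(n)
        else:
            todo.extend(n)
    return sum(leaves) / len(leaves)
```

With `depth=0` at a non-terminal node, the heuristic is applied instead of expanding further, so the call stack stays bounded regardless of how deep the game tree actually is.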
An iterative Minimax looks like recursion turned inside out. You define a stack of frames, where each frame stores the node state (or reference to a mutable board), whose turn it is (max/min), which child index you’re exploring, and the current best score for that node. The loop repeatedly peeks the top frame: if it’s a leaf/cutoff, compute its score and pop; otherwise, generate or step through its children. When a child finishes, you propagate its score into the parent frame (update max or min), advance the child index, and continue. This structure also makes it easy to add time limits (check the clock each iteration) and to guarantee you only return results from completed depths when using iterative deepening.
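The frame structure described above can be sketched as follows, again using the hypothetical list-of-children toy encoding; frame layout and names are illustrative, not a standard API:

```python
def minimax_iterative(root, max_depth, heuristic):
    """Minimax with an explicit stack: frames live on the heap, not the
    call stack, so tree depth is not limited by the recursion limit."""
    # Each frame: [node, maximizing?, next child index, best score so far]
    stack = [[root, True, 0, None]]
    final = None
    while stack:
        frame = stack[-1]
        node, maximizing, i, best = frame
        kids = node if isinstance(node, list) else None
        depth_left = max_depth - (len(stack) - 1)

        if kids is None or depth_left == 0:
            finished = heuristic(node)  # leaf or depth cutoff: score and pop
            stack.pop()
        elif i < len(kids):
            frame[2] = i + 1            # advance child index, then descend
            stack.append([kids[i], not maximizing, 0, None])
            continue
        else:
            finished = best             # all children explored: pop
            stack.pop()

        if stack:                       # propagate the score into the parent
            parent = stack[-1]
            if parent[3] is None:
                parent[3] = finished
            else:
                parent[3] = (max if parent[1] else min)(parent[3], finished)
        else:
            final = finished            # the root just finished
    return final
```

A time-limit check would slot naturally into the top of the `while` loop, which is exactly where an iterative-deepening driver would also decide whether the current depth completed.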
You’ll also reduce stack pressure by avoiding heavy object allocation per node. A common pattern is make move / unmake move on a mutable board: instead of cloning the board at every child, you apply a move, recurse (or iterate), then undo it. This can drastically cut memory use and speed up search, but it requires careful bookkeeping so that undo restores the exact previous state (including turn, captured pieces, and special rules). Debugging tip: write a hash(state) function and assert it matches before and after a make+unmake pair.

If you’re building a system where leaves require data fetches, avoid embedding network or database calls deep inside recursion unless you cache aggressively; otherwise you can hit both stack and latency issues. In retrieval-heavy decision trees, a safer approach is to pull candidate data once per depth or per root move, cache it, and then evaluate against that cache. If your candidates come from Milvus or Zilliz Cloud, you can cache query results keyed by (embedding_id, filters, topK) within a single search iteration to keep the algorithm deterministic and prevent “deep recursion triggers deep I/O” failure modes.
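Here is a minimal sketch of the make/unmake pattern with the hash-based debugging check, on a deliberately tiny made-up board class (the move encoding and fields are illustrative, not from any real engine):

```python
class Board:
    """Toy mutable board: a few squares plus a side-to-move flag.
    A hypothetical stand-in for a real game state."""

    def __init__(self):
        self.squares = [0] * 4
        self.white_to_move = True

    def make(self, move):
        """Apply a (square, value) move; return the undo info (what was captured)."""
        sq, val = move
        captured = self.squares[sq]   # remember what we overwrite
        self.squares[sq] = val
        self.white_to_move = not self.white_to_move
        return captured

    def unmake(self, move, captured):
        """Undo a move using the info `make` returned."""
        sq, _ = move
        self.squares[sq] = captured
        self.white_to_move = not self.white_to_move

    def state_hash(self):
        """Hash every field that defines the position, including the turn."""
        return hash((tuple(self.squares), self.white_to_move))

# The debugging tip in practice: a make+unmake pair must be a no-op.
b = Board()
before = b.state_hash()
undo = b.make((2, 7))
b.unmake((2, 7), undo)
assert b.state_hash() == before
```

The key discipline is that `make` returns everything `unmake` needs; the moment you add a new piece of state (castling rights, en-passant square, move counters), it must be covered by both the undo info and the hash, or the assertion will eventually catch the drift.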

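The per-iteration caching idea can be sketched as a small memoizing wrapper. Everything here is an assumption for illustration: `run_vector_query` is a stand-in for whatever client call actually fetches candidates (e.g. a Milvus search), and the key mirrors the (embedding_id, filters, topK) tuple from above:

```python
def make_cached_fetcher(run_vector_query):
    """Wrap a fetch function so repeated identical queries within one search
    iteration hit an in-memory cache instead of the network."""
    cache = {}

    def fetch(embedding_id, filters, top_k):
        # Sort filter items so logically equal filter dicts share a key.
        key = (embedding_id, tuple(sorted(filters.items())), top_k)
        if key not in cache:
            cache[key] = run_vector_query(embedding_id, filters, top_k)
        return cache[key]

    return fetch
```

Building a fresh fetcher per root search iteration (rather than one global cache) keeps results consistent within an iteration while still letting fresh data in between iterations.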