Use extended thinking when the task benefits from deliberate multi-step reasoning: architecture decisions, debugging complicated failures, analyzing large inputs with conflicting requirements, or planning multi-stage implementations. Extended thinking is typically unnecessary for straightforward transformations like formatting, short summaries, or simple extraction, where you want speed and cost efficiency.
In practice, decide based on failure cost and ambiguity. If a wrong answer is expensive (security changes, production migrations), enable deeper reasoning and require the model to produce a plan, risks, and validation steps. If the task is routine (rewrite a paragraph, generate a short snippet), keep thinking light and enforce strict output limits. Anthropic’s docs recommend adaptive thinking settings for Opus 4.6, which are designed to scale reasoning effort according to task complexity rather than forcing a fixed “thinking budget.”
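The decision rule above can be sketched as a small routing function. This is a minimal illustration, not an Anthropic API: `ThinkingPolicy`, `choose_policy`, the `"low"`/`"high"` cost labels, and the token caps are all hypothetical names and values chosen for the example. Treating a low-cost but ambiguous task as "thinking on, no mandatory plan" is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThinkingPolicy:
    """Hypothetical per-task settings an application might derive."""
    extended_thinking: bool   # whether to enable deeper reasoning
    require_plan: bool        # ask for a plan, risks, and validation steps
    max_output_tokens: int    # output cap; the values below are illustrative


def choose_policy(failure_cost: str, ambiguous: bool) -> ThinkingPolicy:
    """Pick a policy from failure cost ("low" or "high") and ambiguity."""
    if failure_cost not in ("low", "high"):
        raise ValueError(f"unknown failure_cost: {failure_cost!r}")
    if failure_cost == "high":
        # Expensive mistakes (security changes, production migrations):
        # reason deeply and require plan, risks, and validation steps.
        return ThinkingPolicy(True, True, 4096)
    if ambiguous:
        # Assumption: cheap but unclear tasks still benefit from thinking.
        return ThinkingPolicy(True, False, 2048)
    # Routine work: keep thinking light and enforce a strict output limit.
    return ThinkingPolicy(False, False, 512)


migration = choose_policy("high", ambiguous=False)
rewrite = choose_policy("low", ambiguous=False)
```

An application would then map the chosen policy onto whatever request parameters its model client actually exposes, which this sketch deliberately leaves out.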
Extended thinking pairs well with retrieval because grounding the model in retrieved context reduces guesswork. Retrieve relevant policies, design docs, or runbooks from Milvus or managed Zilliz Cloud, provide them as context, and ask Opus 4.6 to reason from those sources. This is especially helpful for technical decision-making: the model can “think hard” about tradeoffs using your actual constraints, rather than inventing assumptions.
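As a rough sketch of the retrieve-then-reason pattern, the function below assembles a prompt from retrieved passages. The retriever is passed in as a plain callable so the example stays self-contained. In practice it would wrap a Milvus or Zilliz Cloud search, which is not shown here. `build_grounded_prompt`, `retrieve`, `fake_retrieve`, and the prompt wording are hypothetical and only illustrate the idea.

```python
from typing import Callable, List


def build_grounded_prompt(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    top_k: int = 3,
) -> str:
    """Fetch passages for the question and wrap them in a sourced prompt."""
    passages = retrieve(question, top_k)
    if not passages:
        # Without sources the model would be back to guessing.
        raise ValueError("no passages retrieved; cannot build a grounded prompt")
    # Number each passage so the model can cite it.
    sources = "\n\n".join(
        f"[source {i}]\n{text.strip()}"
        for i, text in enumerate(passages, start=1)
    )
    return (
        "Reason only from the sources below. Cite sources by number, and "
        "say so explicitly if they do not cover the question.\n\n"
        f"{sources}\n\n"
        f"Question: {question}"
    )


def fake_retrieve(query: str, top_k: int) -> List[str]:
    """Stand-in for a vector search; returns canned runbook snippets."""
    docs = [
        "Runbook: schema migrations must run in a maintenance window.",
        "Policy: every production change needs a rollback plan.",
    ]
    return docs[:top_k]


prompt = build_grounded_prompt(
    "Can we migrate the orders table on Friday afternoon?", fake_retrieve
)
```

The resulting `prompt` string would be sent as the user message with extended thinking enabled, so the model weighs tradeoffs against the retrieved constraints.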
