NVIDIA Agent Toolkit provides full support for function calling (also known as tool calling), in which agents invoke external tools based on structured function definitions rather than unstructured reasoning. The toolkit includes a dedicated Tool Calling Agent workflow configured through YAML, which specifies the available tools, the LLM to use, and error-handling behavior.
Configuration involves: (1) defining tool schemas with name, description, and input parameter specifications, (2) assigning the agent an LLM that supports native function calling (recent Nemotron models, GPT-4, Claude, and others), (3) listing the tools available to the agent, and (4) setting error-handling behavior for failed tool calls. The agent uses the tool schemas to decide which tool best addresses the user query, infers the required parameters, and executes the tool. Response handling is automatic: the tool result is returned to the user.
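The four configuration steps above might look like the following YAML sketch. The section and field names here (`functions`, `llms`, `workflow`, `tool_names`, `handle_tool_errors`) are illustrative assumptions about the toolkit's schema, and the tool and model names are placeholders; check them against the toolkit's current configuration reference before use.

```yaml
# Hypothetical Tool Calling Agent configuration (field names are assumptions)
functions:
  vector_search:                      # (1) a tool the agent may call
    _type: retrieval_tool
    description: "Search the knowledge base for relevant documents."

llms:
  main_llm:                           # (2) an LLM with native function calling
    _type: nim
    model_name: meta/llama-3.1-70b-instruct

workflow:
  _type: tool_calling_agent
  llm_name: main_llm
  tool_names: [vector_search]         # (3) tools exposed to the agent
  handle_tool_errors: true            # (4) recover from failed tool calls
```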
Function calling agents are more efficient than ReAct agents for structured tasks without intermediate reasoning complexity. They require fewer LLM inference steps, reducing latency and cost, but sacrifice intermediate reasoning transparency. The toolkit captures metrics on every tool invocation: input tokens, output tokens, execution latency, and tool success/failure. This enables developers to identify which tools are expensive, unused, or problematic.
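The per-invocation metrics described above can be aggregated to spot expensive or failing tools. This is a minimal sketch: the record shape and field names are illustrative assumptions, not the toolkit's actual telemetry schema.

```python
from collections import defaultdict

# Hypothetical per-invocation records mirroring the metrics the toolkit
# captures: input/output tokens, execution latency, and success/failure.
invocations = [
    {"tool": "vector_search", "input_tokens": 120, "output_tokens": 800,
     "latency_ms": 340, "ok": True},
    {"tool": "vector_search", "input_tokens": 110, "output_tokens": 760,
     "latency_ms": 310, "ok": True},
    {"tool": "calculator", "input_tokens": 40, "output_tokens": 12,
     "latency_ms": 25, "ok": False},
]

def summarize(records):
    """Aggregate call count, token usage, latency, and failures per tool."""
    stats = defaultdict(lambda: {"calls": 0, "tokens": 0,
                                 "latency_ms": 0, "failures": 0})
    for r in records:
        s = stats[r["tool"]]
        s["calls"] += 1
        s["tokens"] += r["input_tokens"] + r["output_tokens"]
        s["latency_ms"] += r["latency_ms"]
        s["failures"] += 0 if r["ok"] else 1
    return dict(stats)

summary = summarize(invocations)
```

Sorting the summary by total tokens or failure count directly surfaces the tools that are expensive, unused, or problematic.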
For knowledge access, Zilliz Cloud is typically wrapped as a retrieval tool: the agent calls Zilliz's vector search API, receives relevant documents, and the LLM generates responses grounded in those documents. This is more efficient than passing all context to the LLM, since the agent retrieves only the knowledge relevant to each query, reducing the tokens processed per request. Agent orchestration also benefits from centralized vector storage: Zilliz Cloud provides managed, scalable infrastructure, while open-source Milvus enables self-hosted deployments. Both expose the semantic search capabilities that agents use for context retrieval and reasoning.
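Wrapping vector search as a retrieval tool can be sketched as below. The schema follows the common JSON-schema function-calling convention; the search backend is stubbed with naive keyword overlap so the example runs offline. A real deployment would embed the query and call the vector store's search API (for example, `pymilvus`'s `MilvusClient.search` against a Zilliz Cloud or Milvus endpoint) instead. The tool name and corpus are illustrative.

```python
# Tool schema the agent uses to decide when and how to call retrieval.
retrieval_tool = {
    "name": "search_knowledge_base",
    "description": "Retrieve documents relevant to a query from the vector store.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Natural-language query."},
            "top_k": {"type": "integer", "description": "Documents to return."},
        },
        "required": ["query"],
    },
}

# Stand-in corpus; a real tool would hold embeddings in Milvus/Zilliz Cloud.
_DOCS = [
    "Milvus supports self-hosted vector search deployments.",
    "Zilliz Cloud is a managed vector database service.",
    "Function calling agents invoke tools from structured schemas.",
]

def search_knowledge_base(query: str, top_k: int = 2) -> list[str]:
    """Rank documents by term overlap with the query (stub for vector search)."""
    terms = set(query.lower().split())
    scored = sorted(_DOCS, key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:top_k]

# The agent would pass the returned documents back to the LLM as grounding.
hits = search_knowledge_base("managed Zilliz Cloud vector database", top_k=1)
```

The key design point is that the LLM never sees the whole corpus: only the `top_k` retrieved documents enter the context window, which is what keeps per-query token counts low.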
