Welcome back, future AI architect! In our previous chapters, you’ve mastered the fundamentals of any-llm, from seamless provider switching to handling various prompt types. You’re already generating amazing text, but what if you need more than just free-form prose? What if your application demands data in a specific, machine-readable format – like JSON – or needs the LLM to decide when to call a specific function in your code?

This is where structured reasoning and output formats come into play. It’s a game-changer for building robust, reliable, and truly intelligent AI applications. In this chapter, we’ll dive deep into how any-llm empowers you to guide LLMs towards predictable, structured outputs, turning your AI from a conversationalist into a data-generating powerhouse.

We’ll start by understanding why structured output is so crucial, then explore the any-llm mechanisms for achieving it, including JSON mode and function calling. We’ll even integrate Python’s powerful Pydantic library to ensure your outputs are not just structured, but also rigorously validated. Get ready to elevate your LLM interactions from simple text generation to sophisticated data orchestration!

Why Structured Output is Essential

Imagine building an application that needs to extract specific information from a user’s request, like a booking system that needs to know the destination, date, and number of guests. If the LLM just returns a conversational sentence like “Okay, I’ve noted you want to travel to Paris next Tuesday with two people,” your application still has to parse that natural language string to extract the critical data points. This is prone to errors, requires complex regex, and is difficult to scale.

This is where structured output shines! By instructing the LLM to return data in a predictable format, such as JSON, your application can easily parse and process the information.

Here’s why it’s a must-have for serious AI development:

  • Predictability: Ensures your application receives data in a consistent, parseable format, reducing the need for complex natural language processing on the output.
  • Reliability: Minimizes parsing errors and makes your application more robust.
  • Automation: Facilitates seamless integration with downstream systems that expect structured data (databases, APIs, other services).
  • Enhanced Reasoning: Allows LLMs to perform more complex reasoning by deciding which tools (functions) to use based on the input, rather than just generating text.
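To make the contrast concrete, here is a tiny, self-contained comparison (using mock LLM outputs, not a live call): extracting a booking detail from free text requires a brittle regex, while the structured version is one `json.loads` away.

```python
import json
import re

# Mock LLM outputs for the same booking request (not a live call).
unstructured = "Okay, I've noted you want to travel to Paris next Tuesday with two people."
structured = '{"destination": "Paris", "date": "next Tuesday", "guests": 2}'

# Unstructured: a regex that breaks the moment the phrasing changes.
match = re.search(r"travel to (\w+)", unstructured)
destination_from_text = match.group(1) if match else None

# Structured: one standard-library call, no guesswork.
booking = json.loads(structured)

print(destination_from_text)  # Paris
print(booking["guests"])      # 2
```

Notice that the regex version silently loses the date and guest count unless you write more patterns, while the JSON version hands you every field at once.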

Let’s look at a simple visual representation of the difference:

graph LR
    subgraph Unstructured_Output["Unstructured Output"]
        User_U[User Request] --> LLM_U[LLM Text Output]
        LLM_U --> App_U[Application Logic]
        App_U --> Parser[Needs Complex Parser]
        Parser --> Data_U[Extracted Data Error Prone]
    end
    subgraph Structured_Output["Structured Output"]
        User_S[User Request] --> LLM_S[LLM Structured Output]
        LLM_S --> App_S[Application Logic]
        App_S --> Validator[Easy Validation Parsing]
        Validator --> Data_S[Structured Data Reliable]
    end
    style Unstructured_Output fill:#f8d7da,stroke:#dc3545,stroke-width:2px,color:#dc3545
    style Structured_Output fill:#d4edda,stroke:#28a745,stroke-width:2px,color:#28a745

As you can see, structured output streamlines the process, making the “Validator/Parser” step much simpler and more reliable.

Achieving Structured Output with any-llm

any-llm, designed for unifying LLM interactions, provides straightforward ways to request structured outputs, abstracting away the provider-specific nuances. The two primary methods are JSON Mode and Function Calling.

1. JSON Mode: Getting Data in a JSON Format

Many modern LLMs support a “JSON mode” where they are explicitly instructed to return a valid JSON object. any-llm provides a unified way to enable this. When JSON mode is active, the LLM will try its best to generate output that adheres to the JSON specification, even if it means slightly altering its natural language response.

Why use JSON Mode?

  • You need a simple, schema-less (or loosely structured) JSON object.
  • You want to extract key-value pairs from a user’s input.
  • Your downstream systems consume JSON directly.

Let’s see how to request JSON output using any-llm.

Step-by-Step: Requesting JSON Output

First, ensure you have any-llm-sdk installed. For this chapter, we’ll assume any-llm-sdk version 1.0.0 or newer is installed, along with your chosen provider’s dependencies (e.g., [openai] or [ollama]). If you haven’t already, install it:

pip install "any-llm-sdk[openai]>=1.0.0" pydantic

(Note: We’re also installing pydantic now, as we’ll use it shortly for validation!)

Now, let’s write some Python code.

# structured_output_json.py
import os
from any_llm import completion

# Ensure your API key is set as an environment variable
# For OpenAI, it would be OPENAI_API_KEY
# For Mistral, MISTRAL_API_KEY, etc.
# Example: export OPENAI_API_KEY="sk-..."
if not os.environ.get("OPENAI_API_KEY"): # Or your chosen provider
    print("Please set your LLM provider's API key as an environment variable.")
    exit()

print("--- Requesting JSON Output ---")

# We'll define a simple prompt asking for structured information.
# It's good practice to explicitly state in the prompt that you want JSON.
prompt_text = """
Extract the product name, quantity, and unit price from the following order:
"I'd like to order 5 bags of premium coffee beans at $12.50 each."
Return the data as a JSON object with keys: "product", "quantity", "unit_price".
"""

try:
    # Use any_llm.completion and set response_format to {"type": "json_object"}
    response = completion(
        model="gpt-3.5-turbo", # Or your preferred model, e.g., "mistral-tiny"
        messages=[{"role": "user", "content": prompt_text}],
        response_format={"type": "json_object"} # THIS is the magic!
    )

    # The response will be a CompletionResponse object.
    # The content of the first choice will contain the JSON string.
    json_string_output = response.choices[0].message.content
    print(f"\nRaw JSON Output:\n{json_string_output}")

    # To use it in Python, you'll typically parse this string
    import json
    parsed_data = json.loads(json_string_output)
    print(f"\nParsed Python Dictionary:\n{parsed_data}")
    print(f"Product: {parsed_data.get('product')}")
    print(f"Quantity: {parsed_data.get('quantity')}")

except Exception as e:
    print(f"An error occurred: {e}")

Explanation:

  1. We import completion from any_llm.
  2. We construct a prompt_text that clearly instructs the LLM on what to extract and how to format it (as JSON with specific keys). While response_format helps, a clear prompt is always beneficial.
  3. The crucial part is response_format={"type": "json_object"}. This tells any-llm (and the underlying LLM provider) to enforce JSON output.
  4. After receiving the response, we access response.choices[0].message.content to get the raw JSON string.
  5. Finally, we use Python’s built-in json.loads() to parse this string into a usable Python dictionary.

Go ahead, run this script! You should see perfectly formatted JSON output that you can easily work with in your Python application.
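One practical note: even with JSON mode enabled, some models occasionally wrap their JSON in markdown code fences. A small defensive parser keeps your pipeline resilient — this is a sketch, and `parse_llm_json` is our own helper name, not part of any-llm:

```python
import json

def parse_llm_json(raw: str) -> dict:
    """Parse an LLM response that should be JSON, tolerating common wrapping.

    Some models wrap their JSON in markdown code fences even when asked
    not to; this helper strips them before parsing.
    """
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence (with an optional "json" language tag)
        # and the closing fence.
        if "\n" in text:
            text = text.split("\n", 1)[1]
        text = text.rsplit("```", 1)[0]
    return json.loads(text)

print(parse_llm_json('```json\n{"product": "coffee beans", "quantity": 5}\n```'))
```

Plain JSON passes straight through unchanged, so you can apply this helper to every JSON-mode response without special-casing.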

2. Pydantic for Robust JSON Validation

While JSON mode is great, LLMs can sometimes “hallucinate” or deviate slightly from a strict schema, especially with complex prompts or less capable models. This is where Pydantic comes in! Pydantic is a Python library that allows you to define data schemas using standard Python type hints. It then provides powerful validation and parsing capabilities.
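Here's a thirty-second taste of Pydantic on its own, before we connect it to any-llm. Note how it coerces compatible values while rejecting genuinely incompatible ones:

```python
from pydantic import BaseModel

class Order(BaseModel):
    product: str
    quantity: int

# Pydantic coerces compatible inputs: the string "5" becomes the int 5.
# An incompatible input (e.g. quantity="a few") would raise a ValidationError.
order = Order(product="coffee", quantity="5")
print(order.quantity, type(order.quantity).__name__)  # 5 int
```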

How Pydantic works with LLMs:

  1. You define a Pydantic model representing your desired output structure.
  2. You pass this Pydantic model (or its schema) to the LLM.
  3. The LLM is prompted to generate JSON that matches this schema.
  4. You parse the LLM’s output using Pydantic, which will validate it and raise errors if it doesn’t conform.

any-llm provides direct support for Pydantic models, making this integration seamless. It will often convert your Pydantic model into a JSON Schema and pass it to the LLM, along with instructions to adhere to that schema.
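You can inspect the JSON Schema that a Pydantic model produces yourself with `model_json_schema()`. This is standard Pydantic; how much of the schema a given provider actually consumes may vary:

```python
from pydantic import BaseModel, Field

class OrderItem(BaseModel):
    product: str = Field(description="The name of the product ordered")
    quantity: int = Field(description="The number of items ordered")

schema = OrderItem.model_json_schema()

# The field descriptions travel with the schema, which is why they
# help the LLM understand each field's purpose.
print(schema["properties"]["product"]["description"])
print(schema["required"])
```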

Step-by-Step: Using Pydantic with any-llm

Let’s refine our previous example to use Pydantic for robust validation.

# structured_output_pydantic.py
import os
import json
from any_llm import completion
from pydantic import BaseModel, Field # Import BaseModel and Field from Pydantic

# Ensure your API key is set
if not os.environ.get("OPENAI_API_KEY"): # Or your chosen provider
    print("Please set your LLM provider's API key as an environment variable.")
    exit()

print("--- Requesting Pydantic-Validated JSON Output ---")

# 1. Define your desired output structure using a Pydantic BaseModel
class OrderItem(BaseModel):
    product: str = Field(description="The name of the product ordered")
    quantity: int = Field(description="The number of items ordered")
    unit_price: float = Field(description="The price per unit of the product")

# It's also good practice to define a main response model if you expect multiple items
class OrderConfirmation(BaseModel):
    items: list[OrderItem] = Field(description="A list of items in the order")
    total_cost: float = Field(description="The calculated total cost of the order")
    currency: str = Field(default="USD", description="The currency of the order")

# We'll update the prompt to be a bit more complex,
# but still guide the LLM towards the structure implicitly.
prompt_text = """
Extract the order details from the following customer request.
Calculate the total cost. Assume USD if not specified.

Customer request: "I need 2 large pizzas at $25.00 each, and also 3 cans of soda for $2.00 per can."
"""

try:
    # 2. Pass the Pydantic model directly to any_llm's completion function
    # any-llm will handle converting this to a JSON Schema for the LLM.
    response = completion(
        model="gpt-4o", # GPT-4o or similar advanced model is recommended for complex Pydantic schemas
        messages=[{"role": "user", "content": prompt_text}],
        response_model=OrderConfirmation # THIS is where you specify your Pydantic model
    )

    # 3. any-llm automatically parses and validates the response into your Pydantic object!
    # If the LLM output is invalid, a Pydantic ValidationError is raised.
    parsed_order: OrderConfirmation = response.parsed_content # Access the parsed Pydantic object

    print(f"\nParsed Pydantic Object:\n{parsed_order.model_dump_json(indent=2)}")
    print(f"\nOrder Items:")
    for item in parsed_order.items:
        print(f"  - Product: {item.product}, Quantity: {item.quantity}, Unit Price: {item.unit_price}")
    print(f"Total Cost: {parsed_order.total_cost} {parsed_order.currency}")
    print(f"Type of parsed_order: {type(parsed_order)}")

except Exception as e:
    print(f"An error occurred during Pydantic parsing or LLM call: {e}")
    # If the LLM response didn’t match the schema, this will surface as a Pydantic ValidationError.
    # You can catch and inspect this error to see the discrepancies.

Explanation:

  1. We define OrderItem and OrderConfirmation as Pydantic.BaseModel classes. The Field function allows us to add descriptions, which any-llm can use to help the LLM understand the purpose of each field.
  2. Instead of response_format={"type": "json_object"}, we now use response_model=OrderConfirmation. any-llm takes care of generating the appropriate prompt and schema for the LLM.
  3. The most exciting part: response.parsed_content directly gives you an instance of your OrderConfirmation Pydantic model, already validated and type-correct! If the LLM fails to produce valid JSON according to your schema, any-llm (or Pydantic internally) will raise an error, preventing malformed data from entering your application.

This approach dramatically increases the reliability of your LLM-powered applications.
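When validation does fail, Pydantic's `ValidationError` tells you exactly which field went wrong. You can see this offline by validating a deliberately malformed payload with `model_validate_json` — no LLM call needed:

```python
from pydantic import BaseModel, ValidationError

class OrderItem(BaseModel):
    product: str
    quantity: int
    unit_price: float

# A malformed payload: "quantity" is prose instead of an integer.
bad_json = '{"product": "coffee beans", "quantity": "a few", "unit_price": 12.5}'

try:
    OrderItem.model_validate_json(bad_json)
except ValidationError as e:
    # Each error entry pinpoints the offending field and the reason.
    for err in e.errors():
        print(err["loc"], err["msg"])
```

Inspecting `e.errors()` like this is also the basis for retry strategies, where you feed the discrepancies back to the model and ask it to correct itself.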

3. Function Calling: Giving the LLM Tools

Function calling takes structured output to the next level. Instead of just asking the LLM to return data, you can ask it to decide which function to call from a predefined set of tools, and what arguments to call that function with. The LLM acts as a smart router or an agent, interpreting user intent and translating it into executable actions.

Why use Function Calling?

  • You want the LLM to interact with external systems (databases, APIs).
  • You need to perform specific computations or actions based on user input.
  • You’re building agents that can choose appropriate tools.

any-llm provides a unified interface for defining and using tools (functions), abstracting the provider-specific ways of doing so (e.g., OpenAI’s tools parameter, Google’s tool_config).

Step-by-Step: Implementing Function Calling

Let’s create a scenario where the LLM needs to decide whether to get weather information or current stock prices.

# structured_output_function_calling.py
import os
import json
from any_llm import completion
from pydantic import BaseModel, Field

# Ensure your API key is set
if not os.environ.get("OPENAI_API_KEY"):
    print("Please set your LLM provider's API key as an environment variable.")
    exit()

print("--- Function Calling Example ---")

# 1. Define Pydantic models for your function arguments
class GetCurrentWeatherArgs(BaseModel):
    location: str = Field(description="The city and state, e.g., San Francisco, CA")
    unit: str = Field(default="fahrenheit", description="The unit of temperature to use, e.g., 'celsius' or 'fahrenheit'")

class GetStockPriceArgs(BaseModel):
    ticker: str = Field(description="The stock ticker symbol, e.g., AAPL for Apple")

# 2. Define your functions (tools)
def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    """Get the current weather in a given location."""
    print(f"--- Calling get_current_weather for {location} in {unit} ---")
    # In a real application, this would call a weather API.
    # For now, we'll return mock data.
    if "San Francisco" in location:
        return {"location": location, "temperature": "60", "unit": unit, "forecast": "Partly cloudy"}
    elif "New York" in location:
        return {"location": location, "temperature": "45", "unit": unit, "forecast": "Cloudy"}
    else:
        return {"location": location, "temperature": "N/A", "unit": unit, "forecast": "Unknown"}

def get_stock_price(ticker: str) -> dict:
    """Get the current stock price for a given ticker symbol."""
    print(f"--- Calling get_stock_price for {ticker} ---")
    # In a real application, this would call a stock price API.
    # For now, we'll return mock data.
    if ticker.upper() == "AAPL":
        return {"ticker": ticker.upper(), "price": 175.25, "currency": "USD"}
    elif ticker.upper() == "GOOG":
        return {"ticker": ticker.upper(), "price": 139.80, "currency": "USD"}
    else:
        return {"ticker": ticker.upper(), "price": "N/A", "currency": "N/A"}

# 3. Create a list of tools for any-llm
# Use a dictionary format that any-llm expects, including the Pydantic model for args.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": GetCurrentWeatherArgs.model_json_schema() # Pydantic model's JSON Schema
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get the current stock price for a given ticker symbol",
            "parameters": GetStockPriceArgs.model_json_schema()
        }
    }
]

# Function to process the LLM's tool call
def handle_tool_call(tool_call):
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments) # Arguments are a JSON string

    print(f"\nLLM decided to call function: {function_name} with arguments: {arguments}")

    # Dynamically call the function based on its name
    if function_name == "get_current_weather":
        # Validate arguments using Pydantic for safety
        validated_args = GetCurrentWeatherArgs(**arguments)
        return get_current_weather(location=validated_args.location, unit=validated_args.unit)
    elif function_name == "get_stock_price":
        validated_args = GetStockPriceArgs(**arguments)
        return get_stock_price(ticker=validated_args.ticker)
    else:
        return {"error": f"Unknown function: {function_name}"}

# Example user queries
user_queries = [
    "What's the weather like in San Francisco?",
    "Tell me the current stock price of AAPL.",
    "What's the temperature in New York in Celsius?",
    "How much is GOOG?",
    "Tell me a joke." # This should not trigger a tool call
]

for i, query in enumerate(user_queries):
    print(f"\n--- Processing Query {i+1}: '{query}' ---")
    try:
        # 4. Call completion with the tools list
        response = completion(
            model="gpt-4o", # Function calling works best with advanced models
            messages=[{"role": "user", "content": query}],
            tools=tools,
            tool_choice="auto" # Let the LLM decide if it needs a tool
        )

        # 5. Check if the LLM decided to call a tool
        if response.choices[0].message.tool_calls:
            for tool_call in response.choices[0].message.tool_calls:
                tool_output = handle_tool_call(tool_call)
                print(f"Tool Output: {tool_output}")

                # Optional: Send tool output back to LLM for further reasoning
                # This creates a multi-turn conversation where the LLM can use the tool's result.
                # For simplicity, we'll just print the output here.
                # In a full agent, you'd append messages like:
                # messages.append(response.choices[0].message)
                # messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(tool_output)})
                # then call completion again.
        else:
            # If no tool was called, it's a regular text response
            print(f"LLM Response (no tool call): {response.choices[0].message.content}")

    except Exception as e:
        print(f"An error occurred: {e}")

Explanation:

  1. We define Pydantic models (GetCurrentWeatherArgs, GetStockPriceArgs) for the arguments each function expects. This provides schema validation for the arguments generated by the LLM.
  2. We define Python functions (get_current_weather, get_stock_price) that represent our “tools.” These are placeholder functions that would typically interact with external APIs or databases.
  3. We construct a tools list. Each tool object includes its name, a description (crucial for the LLM to understand its purpose), and parameters. any-llm expects the parameters to be a JSON Schema, which we easily get from our Pydantic models using model_json_schema().
  4. When calling completion, we pass our tools list and set tool_choice="auto". This tells the LLM it can choose to call a tool if it deems it necessary. You can also specify a particular tool to force its use.
  5. After receiving the response, we check response.choices[0].message.tool_calls. If this list is not empty, the LLM has decided to call one or more tools.
  6. The handle_tool_call function demonstrates how you would parse the LLM’s suggested tool call (function name and arguments) and then execute your actual Python function. We use Pydantic again to validate the arguments provided by the LLM before calling our internal functions, adding another layer of safety.
  7. For queries like “Tell me a joke,” the LLM correctly identifies that no tool is needed and provides a standard text response.

This function calling pattern is fundamental for building sophisticated LLM agents and integrating LLMs into complex workflows.
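The commented-out follow-up turn in the script above — sending the tool's output back so the LLM can compose a final answer — can be sketched as a small helper. This sketch uses plain dictionaries in the OpenAI-style message shape shown in this chapter; `build_followup_messages` is our own hypothetical name, and exact field names may differ by provider:

```python
import json

def build_followup_messages(messages, assistant_message, tool_call, tool_output):
    """Append the assistant's tool-call turn and the tool's result so the
    LLM can use the result on the next completion call.
    """
    followup = list(messages)
    followup.append(assistant_message)        # the turn containing tool_calls
    followup.append({
        "role": "tool",
        "tool_call_id": tool_call["id"],      # ties this result to the call
        "content": json.dumps(tool_output),   # tool results are sent as text
    })
    return followup

msgs = build_followup_messages(
    [{"role": "user", "content": "Weather in San Francisco?"}],
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"id": "call_1"},
    {"temperature": "60", "forecast": "Partly cloudy"},
)
print(msgs[-1]["role"])  # tool
```

You would then call completion again with `msgs`, and the LLM would answer in natural language using the weather data it just received.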

Mini-Challenge: Structured Recipe Extraction

You’ve learned how to get structured JSON and use function calling. Now, let’s combine some of these ideas!

Challenge: Create a Python script that uses any-llm and Pydantic to extract a recipe’s name, ingredients (with quantity and unit), and steps from a natural language description.

Here’s what your Pydantic models should look like:

from pydantic import BaseModel, Field

class Ingredient(BaseModel):
    name: str = Field(description="Name of the ingredient")
    quantity: float = Field(description="Numerical quantity of the ingredient")
    unit: str = Field(description="Unit of measurement for the ingredient, e.g., 'cups', 'grams', 'pieces'")

class Recipe(BaseModel):
    name: str = Field(description="The name of the recipe")
    ingredients: list[Ingredient] = Field(description="A list of ingredients with quantities and units")
    instructions: list[str] = Field(description="A list of step-by-step instructions for the recipe")
    prep_time_minutes: int | None = Field(default=None, description="Optional preparation time in minutes")
    cook_time_minutes: int | None = Field(default=None, description="Optional cooking time in minutes")

Your task:

  1. Write a Python script.
  2. Use the Recipe Pydantic model as the response_model for any_llm.completion.
  3. Craft a clear user prompt (e.g., “Give me a recipe for chocolate chip cookies, including ingredients, quantities, units, and steps. Also, tell me prep and cook times.”).
  4. Handle potential errors during the completion call.
  5. Print the extracted Recipe object in a user-friendly format.

Hint: Pay close attention to your prompt engineering. Explicitly ask the LLM to provide the information you need for each field in your Recipe model. For quantity and unit, it’s helpful to explicitly request numerical quantities and standard units.

What to observe/learn:

  • How well the LLM adheres to the nested Pydantic structure.
  • How any-llm seamlessly converts the LLM’s JSON output into your Python objects.
  • The importance of clear, guiding prompts when using structured output.

Common Pitfalls & Troubleshooting

  1. Mismatched Schema:

    • Problem: The LLM generates JSON that doesn’t exactly match your Pydantic model’s schema, leading to a Pydantic ValidationError or unexpected parsing issues.
    • Reason: The LLM might misunderstand a field’s purpose, hallucinate extra fields, or fail to conform to a specific type (e.g., generating a string when an integer is expected).
    • Solution:
      • Improve Prompt Engineering: Be extremely explicit in your prompt about the desired output structure, field names, types, and constraints. Provide examples if necessary.
      • Add Descriptions: Use Field(description=...) in your Pydantic models. any-llm passes these descriptions to the LLM, which helps it understand the schema better.
      • Use Stronger Models: More capable LLMs (e.g., GPT-4o, Claude 3 Opus) are significantly better at adhering to complex schemas and function calls.
      • Iterate: Test with different prompts and models to find the sweet spot.
  2. Incorrect Tool Selection or Arguments (Function Calling):

    • Problem: The LLM calls the wrong function, or provides incorrect/missing arguments for the chosen function.
    • Reason: The LLM didn’t fully understand the user’s intent or the purpose/parameters of your tools.
    • Solution:
      • Clear Tool Descriptions: Ensure your function.description in the tools list is concise and accurately describes what the function does.
      • Clear Parameter Descriptions: Use Field(description=...) in your function argument Pydantic models.
      • Specific Prompts: Guide the user to phrase requests in a way that clearly indicates tool usage.
      • Error Handling in handle_tool_call: Implement robust validation and error handling after the LLM suggests a tool call but before you execute your actual function. This is where Pydantic validation of arguments is critical.
  3. Provider Limitations:

    • Problem: Some older or less capable LLM providers/models might not fully support response_format={"type": "json_object"} or function calling directly, or they might implement it less reliably.
    • Reason: These features are relatively new advancements.
    • Solution: Check the any-llm documentation or the specific provider’s documentation for feature support. If a feature isn’t supported, you might need to fall back to a “parse raw text with regex/heuristics” approach, or choose a different LLM provider/model. any-llm aims to unify this, but underlying model capabilities still vary.
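Tying the first pitfall's solutions together: a common mitigation is a bounded retry loop that feeds Pydantic's validation errors back to the model so it can self-correct. Here is a sketch — `parse_with_retry` and `call_llm` are our own names, and `call_llm` stands in for whatever thin wrapper you put around completion. The demo uses a fake LLM so it runs offline:

```python
from pydantic import BaseModel, ValidationError

class OrderItem(BaseModel):
    product: str
    quantity: int

def parse_with_retry(call_llm, prompt, model_cls, max_attempts=3):
    """Ask for JSON, validate it, and re-prompt with the validation errors.

    `call_llm` is any callable that takes a prompt string and returns the
    model's raw text output.
    """
    current_prompt = prompt
    for attempt in range(max_attempts):
        raw = call_llm(current_prompt)
        try:
            return model_cls.model_validate_json(raw)
        except ValidationError as e:
            # Feed the errors back so the model can correct itself.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was invalid:\n{e}\n"
                "Return corrected JSON only."
            )
    raise RuntimeError(f"No valid output after {max_attempts} attempts")

# Demo: a fake LLM that fails once (non-numeric quantity), then succeeds.
replies = iter(['{"product": "coffee", "quantity": "five"}',
                '{"product": "coffee", "quantity": 5}'])
result = parse_with_retry(lambda p: next(replies), "Extract the order.", OrderItem)
print(result.quantity)  # 5
```

Keep the attempt count small: if a model fails the same schema three times, a better prompt or a stronger model is usually the real fix.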

Summary

Congratulations! You’ve successfully navigated the world of structured reasoning and output formats with any-llm. Here are the key takeaways:

  • Structured output is vital for building reliable, automated, and scalable AI applications, moving beyond mere conversational text.
  • any-llm provides unified ways to request structured data, abstracting provider-specific implementations.
  • JSON Mode (response_format={"type": "json_object"}) is perfect for extracting data into a parseable JSON string.
  • Pydantic integration (response_model=YourPydanticModel) offers robust schema definition, validation, and automatic parsing, significantly increasing output reliability.
  • Function Calling (tools=...) transforms the LLM into an intelligent agent, allowing it to decide which external tools (your Python functions) to invoke and with what arguments.
  • Clear prompt engineering and detailed descriptions in your Pydantic models and tool definitions are crucial for guiding the LLM to produce accurate structured outputs.
  • Always be prepared for potential schema mismatches or incorrect tool choices, and implement robust error handling, often leveraging Pydantic’s validation capabilities.

You’re now equipped to build applications where LLMs don’t just talk, but act and provide data in a way that your software can directly understand and utilize. This is a monumental step towards building truly intelligent systems!

In the next chapter, we’ll explore Asynchronous Usage and Performance Tuning, learning how to make your any-llm applications even faster and more responsive.
