Introduction to Robust Error Handling
Welcome back, future AI architect! In the previous chapters, we’ve explored the fascinating world of any-llm – Mozilla’s unified interface for Large Language Models. You’ve learned how to set up your environment, make basic completion calls, and configure different LLM providers. But what happens when things don’t go as planned? What if an API key is wrong, the network flickers, or a model is overloaded?
That’s where robust error handling comes into play! Just like a sturdy bridge needs to withstand unexpected winds and tremors, your AI applications need to gracefully handle errors and exceptions. Ignoring errors can lead to brittle applications that crash unexpectedly, provide poor user experiences, or even incur unnecessary costs.
In this chapter, we’ll dive deep into any-llm’s approach to error handling. We’ll learn how to anticipate, catch, and intelligently respond to various issues that can arise when interacting with LLMs. By the end, you’ll be equipped to build more resilient and production-ready AI systems.
Before we begin, make sure you have a basic understanding of Python’s try...except blocks and how to make a simple any-llm completion call, as covered in Chapters 3 and 4.
Understanding any-llm’s Exception Hierarchy
When you interact with various LLM providers, each might return errors in its own unique format. This is precisely where any-llm shines, by abstracting these differences and presenting a unified, consistent set of exceptions. This consistency makes your code cleaner and easier to maintain, as you don’t need to write provider-specific error logic.
At its core, any-llm provides a hierarchy of exception classes, all inheriting from a base AnyLLMError. This structure allows you to catch specific errors for granular control or broader errors for general handling.
Let’s visualize a conceptual any-llm exception hierarchy using a class diagram:
(Diagram: Conceptual any-llm Exception Hierarchy, with AnyLLMError at the root and the specific error classes branching from it.)
As you can see, AnyLLMError is the parent, with more specific errors branching off. AnyLLMAPIError often carries details like status_code and provider_error_code, which are crucial for debugging.
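Because the diagram is conceptual, it helps to see the same hierarchy sketched as plain Python classes. The names and attributes below mirror the conceptual diagram, not necessarily the library's exact public API:

```python
# Illustrative only: the conceptual hierarchy as plain Python classes.
# The real any-llm package's class names and attributes may differ.

class AnyLLMError(Exception):
    """Base class for every error any-llm raises."""

class AnyLLMAPIError(AnyLLMError):
    """Provider-side API error, carrying extra debugging context."""
    def __init__(self, message, status_code=None, provider_error_code=None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.provider_error_code = provider_error_code

class AnyLLMConnectionError(AnyLLMError):
    """Network problem reaching the provider."""

class AnyLLMAuthenticationError(AnyLLMError):
    """Missing, invalid, or expired API key."""

class AnyLLMRateLimitError(AnyLLMError):
    """Too many requests in the allowed time window."""

class AnyLLMInvalidRequestError(AnyLLMError):
    """Malformed request payload (bad model name, prompt, etc.)."""

class AnyLLMTimeoutError(AnyLLMAPIError):
    """Provider took too long to respond."""
```

Because every class ultimately inherits from AnyLLMError, a single `except AnyLLMError:` clause catches all of them, while more specific clauses give you granular control.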
Common Error Types in any-llm
While the exact exception names might evolve, here are common categories you’ll encounter and how any-llm typically represents them:
AnyLLMConnectionError:
- What it is: Occurs when there’s a problem establishing or maintaining a network connection with the LLM provider. This could be due to your internet connection, the provider’s servers being down, or DNS issues.
- Why it’s important: These are often transient errors, meaning they might resolve themselves if you retry the request after a short delay.

AnyLLMAuthenticationError:
- What it is: Indicates that your API key is missing, invalid, or expired. The LLM provider refused your request because it couldn’t verify your identity.
- Why it’s important: This usually requires user intervention (e.g., setting the correct API key) and is not typically resolved by retries.

AnyLLMRateLimitError:
- What it is: The LLM provider has rejected your request because you’ve exceeded the allowed number of requests within a given timeframe. Providers implement rate limits to prevent abuse and ensure fair usage.
- Why it’s important: Like connection errors, these are often transient. Implementing a retry strategy with exponential backoff is crucial here.

AnyLLMInvalidRequestError:
- What it is: Your request payload (e.g., the prompt, model name, temperature setting) contains an error that the LLM provider cannot process. This means your input is malformed or violates the provider’s rules.
- Why it’s important: This usually points to a bug in your application’s logic or prompt engineering. Retrying without fixing the request won’t help.

AnyLLMAPIError (General API Error):
- What it is: A broad category for various issues on the LLM provider’s side that don’t fit into the more specific categories above. This could include internal server errors, model failures, or other unexpected issues.
- Why it’s important: Sometimes transient, sometimes indicative of a larger issue with the provider or model. It often contains more detailed error messages from the provider.

AnyLLMTimeoutError:
- What it is: A specific type of AnyLLMAPIError (or sometimes a direct subclass of AnyLLMError) that occurs when the LLM provider takes too long to respond, exceeding a predefined timeout period.
- Why it’s important: Can be transient due to network latency or model overload. Retries can be effective.
By understanding these distinctions, you can write more intelligent and robust error handling logic.
Strategies for Handling Errors
Now that we know the types of errors, how do we handle them effectively?
- Graceful Degradation: When an LLM call fails, can your application still function, perhaps with reduced capabilities? For example, if a creative generation fails, can you fall back to a simpler, cached response or inform the user politely?
- Retries with Exponential Backoff: For transient errors like AnyLLMConnectionError or AnyLLMRateLimitError, simply retrying the request after a short delay can often resolve the issue. Exponential backoff means you increase the delay between retries exponentially (e.g., 1s, then 2s, then 4s, 8s…). This prevents you from overwhelming the service and gives it time to recover.
- Logging: Always log errors! Detailed logs are your best friend for debugging in development and monitoring in production. Include timestamps, error types, messages, and any relevant request details (but be mindful of sensitive information).
- Alerting: For critical errors in production, integrate with an alerting system (e.g., PagerDuty, Slack, email) to notify your team immediately.
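As a taste of what we'll build step by step below, here is a minimal, library-agnostic retry helper. It is a sketch: the `transient_errors` tuple and the function being wrapped are placeholders you would supply for your own stack.

```python
import logging
import random
import time

def call_with_backoff(fn, *, transient_errors, max_retries=3, base_delay=1.0):
    """Call fn(), retrying transient errors with exponential backoff plus jitter."""
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except transient_errors as e:
            if attempt == max_retries:
                raise  # out of attempts; let the caller decide what to do
            # Delay doubles each attempt; jitter spreads out simultaneous clients.
            wait = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            logging.warning("Transient error %s; retrying in %.2fs", e, wait)
            time.sleep(wait)
```

Non-transient errors (like a bad API key) are deliberately not listed in `transient_errors`, so they propagate immediately instead of being retried.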
Step-by-Step Implementation: Handling any-llm Exceptions
Let’s put these concepts into practice. We’ll start with a basic any-llm completion call and then progressively add error handling.
First, ensure you have any-llm-sdk installed. We’ll install it with the extra for a common provider, for example mistral. As of December 2025, the installation command is:
pip install 'any-llm-sdk[mistral]'
And set your API key as an environment variable (replace YOUR_MISTRAL_API_KEY with your actual key):
export MISTRAL_API_KEY="YOUR_MISTRAL_API_KEY"
Or, if you’re using a different provider, ensure its respective API key environment variable (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY) is set.
Now, let’s create a Python file named error_handling_example.py.
Step 1: Basic try...except for AnyLLMError
We’ll start by catching the most generic any-llm exception. This is a good first step to prevent your application from crashing due to any issue from any-llm.
# error_handling_example.py
import os
import logging

from any_llm import completion, AnyLLMError

# Configure logging for better visibility
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def get_llm_response(prompt: str, provider: str = "mistral") -> str | None:
    """
    Attempts to get a completion from an LLM provider,
    handling generic any-llm errors.
    """
    logging.info(f"Attempting to get LLM response for provider: {provider}")
    try:
        response = completion(
            model="mistral-large",  # Or "gpt-4o", "claude-3-opus-20240229", etc.
            messages=[{"role": "user", "content": prompt}],
            provider=provider
        )
        return response.choices[0].message.content
    except AnyLLMError as e:
        logging.error(f"An any-llm error occurred: {e}")
        return None
    except Exception as e:
        # Catch any other unexpected Python errors
        logging.critical(f"An unexpected non-any-llm error occurred: {e}")
        return None

if __name__ == "__main__":
    test_prompt = "What is the capital of France?"
    response_content = get_llm_response(test_prompt)
    if response_content:
        print(f"\nLLM Response: {response_content}")
    else:
        print("\nFailed to get a response from the LLM.")
Explanation:

- We import logging, completion, and AnyLLMError.
- logging.basicConfig sets up basic logging to show INFO messages and above.
- The get_llm_response function wraps the completion call in a try...except block.
- except AnyLLMError as e: catches any error specifically raised by any-llm. We log it as an error.
- except Exception as e: is a fallback for any other Python error that might occur, which is logged as critical. While useful as a catch-all, it’s generally better to catch more specific exceptions when possible.

To Test:

- Run the script: python error_handling_example.py. It should print the capital of France.
- To simulate an error: Temporarily unset or mess up your MISTRAL_API_KEY environment variable (e.g., export MISTRAL_API_KEY="INVALID_KEY" in your terminal, then run the script). You should see an AnyLLMError logged. Remember to set it back to a valid key afterward!
Step 2: Handling Specific any-llm Exceptions
Now, let’s refine our error handling to differentiate between common problems. This allows us to provide more targeted feedback or implement specific recovery strategies.
Modify your error_handling_example.py to include more specific except blocks:
# error_handling_example.py (continued)
import os
import logging
import time

from any_llm import completion
from any_llm import (
    AnyLLMError,
    AnyLLMConnectionError,
    AnyLLMAuthenticationError,
    AnyLLMRateLimitError,
    AnyLLMInvalidRequestError,
    AnyLLMAPIError  # Catches general API errors, including timeouts if not explicitly caught
)

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def get_llm_response_specific_errors(prompt: str, provider: str = "mistral") -> str | None:
    """
    Attempts to get a completion from an LLM provider,
    handling specific any-llm errors.
    """
    logging.info(f"Attempting to get LLM response for provider: {provider}")
    try:
        response = completion(
            model="mistral-large",
            messages=[{"role": "user", "content": prompt}],
            provider=provider
        )
        return response.choices[0].message.content
    except AnyLLMAuthenticationError:
        logging.error("Authentication failed. Please check your API key for the provider.")
        return None
    except AnyLLMRateLimitError:
        logging.warning("Rate limit exceeded. Please wait and try again later.")
        return None
    except AnyLLMConnectionError:
        logging.error("Network connection issue. Please check your internet connection or provider status.")
        return None
    except AnyLLMInvalidRequestError as e:
        logging.error(f"Invalid request parameters: {e}. Check your prompt or model configuration.")
        return None
    except AnyLLMAPIError as e:
        logging.error(f"An API error occurred with the LLM provider: {e.status_code} - {e.message}. "
                      f"Provider error code: {getattr(e, 'provider_error_code', 'N/A')}")
        return None
    except AnyLLMError as e:  # Catch any other general any-llm error not specifically handled
        logging.error(f"An unexpected any-llm error occurred: {e}")
        return None
    except Exception as e:
        logging.critical(f"An entirely unexpected Python error occurred: {e}")
        return None

if __name__ == "__main__":
    # ... (keep the previous if __name__ block for testing)
    print("\n--- Testing with specific error handling ---")
    test_prompt = "Explain the concept of quantum entanglement in simple terms."
    response_content = get_llm_response_specific_errors(test_prompt)
    if response_content:
        print(f"\nLLM Response (specific errors): {response_content}")
    else:
        print("\nFailed to get a response from the LLM with specific error handling.")
Explanation:

- We imported specific any-llm exception classes.
- The except blocks are now ordered from most specific to most general. This is important because Python will catch the first matching except block. For instance, AnyLLMAuthenticationError is caught before AnyLLMError.
- Each except block provides a more informative log message tailored to the specific error. For AnyLLMAPIError, we try to extract status_code and message for better debugging.

To Test:

- Authentication Error: Again, temporarily invalidate your MISTRAL_API_KEY. You should now see the “Authentication failed…” message.
- Invalid Request Error: You could try passing an unsupported model to simulate this, though any-llm might abstract some of these. A more direct simulation would involve providing a prompt that’s too long or malformed if the provider’s API has strict limits. For this exercise, assume that if you pass a model name that doesn’t exist, it might trigger an AnyLLMInvalidRequestError or AnyLLMAPIError.
Step 3: Implementing Retries with Exponential Backoff
For transient errors like AnyLLMConnectionError and AnyLLMRateLimitError, retries are essential. Let’s build a simple retry mechanism into our function.
# error_handling_example.py (continued)
import os
import logging
import time
import random  # For adding jitter to backoff

from any_llm import completion
from any_llm import (
    AnyLLMError,
    AnyLLMConnectionError,
    AnyLLMAuthenticationError,
    AnyLLMRateLimitError,
    AnyLLMInvalidRequestError,
    AnyLLMAPIError
)

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def get_llm_response_with_retries(prompt: str, provider: str = "mistral", max_retries: int = 3) -> str | None:
    """
    Attempts to get a completion from an LLM provider,
    handling specific any-llm errors with retries for transient issues.
    """
    logging.info(f"Attempting to get LLM response for provider: {provider}")
    for attempt in range(1, max_retries + 1):
        try:
            logging.info(f"Attempt {attempt}/{max_retries} for LLM completion.")
            response = completion(
                model="mistral-large",
                messages=[{"role": "user", "content": prompt}],
                provider=provider
            )
            return response.choices[0].message.content
        except (AnyLLMConnectionError, AnyLLMRateLimitError) as e:
            # These are transient errors, so we'll retry
            wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1)  # Exponential backoff with jitter
            logging.warning(f"Transient error encountered: {e}. Waiting {wait_time:.2f} seconds before retrying...")
            time.sleep(wait_time)
        except AnyLLMAuthenticationError:
            logging.error("Authentication failed. Please check your API key for the provider. Not retrying.")
            return None
        except AnyLLMInvalidRequestError as e:
            logging.error(f"Invalid request parameters: {e}. Check your prompt or model configuration. Not retrying.")
            return None
        except AnyLLMAPIError as e:
            # For general API errors, check if it's potentially transient or a server error
            if getattr(e, 'status_code', 500) >= 500:  # Server-side errors might be transient
                wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1)
                logging.warning(f"Server-side API error ({e.status_code}): {e.message}. Waiting {wait_time:.2f} seconds before retrying...")
                time.sleep(wait_time)
            else:
                logging.error(f"Non-retryable API error occurred: {e.status_code} - {e.message}. Not retrying.")
                return None
        except AnyLLMError as e:
            logging.error(f"An unexpected any-llm error occurred: {e}. Not retrying.")
            return None
        except Exception as e:
            logging.critical(f"An entirely unexpected Python error occurred: {e}. Not retrying.")
            return None
    logging.error(f"Failed to get LLM response after {max_retries} attempts.")
    return None

if __name__ == "__main__":
    # ... (keep the previous if __name__ blocks for testing)
    print("\n--- Testing with retries and exponential backoff ---")
    test_prompt = "What are the benefits of learning Python for AI development?"
    response_content = get_llm_response_with_retries(test_prompt, max_retries=3)
    if response_content:
        print(f"\nLLM Response (with retries): {response_content}")
    else:
        print("\nFailed to get a response from the LLM after multiple retries.")
Explanation:

- We added import random for “jitter” (a small random component) to the backoff time. This helps prevent many clients from retrying at the exact same moment, which can cause a “thundering herd” problem.
- The for attempt in range(1, max_retries + 1): loop manages our retry attempts.
- AnyLLMConnectionError and AnyLLMRateLimitError are grouped together, as they both benefit from retries.
- wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1) calculates the exponential backoff. For attempt 1, it’s 2^0 + jitter (1s + jitter); for attempt 2, it’s 2^1 + jitter (2s + jitter), and so on.
- time.sleep(wait_time) pauses execution.
- For AnyLLMAPIError, we add a check: if the status_code is 5xx (server error), we consider it potentially transient and retry. Otherwise, it’s treated as a non-retryable error (e.g., a 4xx client error).
- Non-retryable errors (authentication, invalid request, or truly unexpected errors) cause the function to return None immediately.
- If all retries fail, a final error message is logged, and None is returned.
This robust function is now much more resilient to temporary network glitches or provider-side load.
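To build intuition for how long the function might block, you can compute the deterministic part of the backoff schedule (each attempt also adds up to one second of jitter):

```python
# Deterministic part of the backoff schedule for max_retries=3,
# matching wait_time = 2 ** (attempt - 1) in the function above.
max_retries = 3
delays = [2 ** (attempt - 1) for attempt in range(1, max_retries + 1)]
print(delays)       # [1, 2, 4]
print(sum(delays))  # 7 -> worst-case seconds spent sleeping, before jitter
```

If seven-plus seconds of blocking is unacceptable for your use case, lower max_retries or cap the per-attempt delay.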
Mini-Challenge: Simulate and Handle a Rate Limit
Let’s put your new knowledge to the test!
Challenge:
Modify the get_llm_response_with_retries function (or create a new one based on it) to simulate an AnyLLMRateLimitError on the first attempt, but then succeed on a subsequent retry.
Hint:
Inside your get_llm_response_with_retries function, at the very beginning of the try block, you can temporarily add a condition like if attempt == 1: raise AnyLLMRateLimitError("Simulated rate limit"). Observe how your exponential backoff and retry logic handles this. Make sure to remove this simulation code after you’ve completed the challenge!
What to observe/learn:
- How the warning log for the rate limit appears.
- How time.sleep correctly pauses execution.
- That the function eventually succeeds after the simulated failure.
- The importance of random.uniform(0, 1) (jitter) in making the wait times slightly unpredictable.
# Mini-Challenge: Add this to your error_handling_example.py

if __name__ == "__main__":
    # ... previous test blocks ...
    print("\n--- Mini-Challenge: Simulating Rate Limit ---")

    # Define a new function or modify the existing one temporarily
    def get_llm_response_simulate_rate_limit(prompt: str, provider: str = "mistral", max_retries: int = 3) -> str | None:
        """
        Simulates a rate limit error on the first attempt and retries.
        """
        logging.info(f"Attempting LLM response for provider: {provider} with rate limit simulation.")
        for attempt in range(1, max_retries + 1):
            try:
                logging.info(f"Simulation Attempt {attempt}/{max_retries} for LLM completion.")
                # --- SIMULATION CODE START ---
                if attempt == 1:
                    logging.warning("Simulating AnyLLMRateLimitError on first attempt!")
                    raise AnyLLMRateLimitError("Simulated rate limit exceeded for testing purposes.")
                # --- SIMULATION CODE END ---
                response = completion(
                    model="mistral-large",
                    messages=[{"role": "user", "content": prompt}],
                    provider=provider
                )
                return response.choices[0].message.content
            except (AnyLLMConnectionError, AnyLLMRateLimitError) as e:
                wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1)
                logging.warning(f"Transient error encountered: {e}. Waiting {wait_time:.2f} seconds before retrying...")
                time.sleep(wait_time)
            except AnyLLMAuthenticationError:
                logging.error("Authentication failed. Not retrying.")
                return None
            except AnyLLMInvalidRequestError as e:
                logging.error(f"Invalid request parameters: {e}. Not retrying.")
                return None
            except AnyLLMAPIError as e:
                if getattr(e, 'status_code', 500) >= 500:
                    wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1)
                    logging.warning(f"Server-side API error ({e.status_code}): {e.message}. Waiting {wait_time:.2f} seconds before retrying...")
                    time.sleep(wait_time)
                else:
                    logging.error(f"Non-retryable API error occurred: {e.status_code} - {e.message}. Not retrying.")
                    return None
            except AnyLLMError as e:
                logging.error(f"An unexpected any-llm error occurred: {e}. Not retrying.")
                return None
            except Exception as e:
                logging.critical(f"An entirely unexpected Python error occurred: {e}. Not retrying.")
                return None
        logging.error(f"Failed to get LLM response after {max_retries} attempts.")
        return None

    challenge_prompt = "Tell me a short story about a brave squirrel."
    challenge_response = get_llm_response_simulate_rate_limit(challenge_prompt, max_retries=3)
    if challenge_response:
        print(f"\nLLM Response (Challenge Success): {challenge_response}")
    else:
        print("\nLLM Response (Challenge Failed) after retries.")
Run your script and observe the logs! You should see the simulated rate limit, a pause, and then a successful completion.
Common Pitfalls & Troubleshooting
Even with robust error handling, developers can encounter issues. Here are a few common pitfalls:
- Catching Exception too broadly: While except Exception as e: acts as a safety net, relying on it too heavily can hide specific problems. Always try to catch specific any-llm exceptions first. If you catch Exception too high up, you might be retrying for an AnyLLMAuthenticationError (which won’t help) or missing crucial debugging info.
- Not implementing retries for transient errors: Forgetting exponential backoff for AnyLLMRateLimitError or AnyLLMConnectionError will make your application brittle and prone to failure under moderate load or network instability.
- Ignoring error messages: AnyLLMAPIError and AnyLLMInvalidRequestError often contain valuable message and status_code attributes. Log these details! They are crucial for understanding why an error occurred.
- Infinite retries or too many retries: While retries are good, an infinite loop or too many attempts can overwhelm the API provider, consume your credits, or make your application unresponsive. Always set a max_retries limit.
- Hardcoding API keys: Never embed your API keys directly in your code. Always use environment variables or a secure configuration management system. This is a security best practice that prevents accidental exposure in version control.
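For that last pitfall, a common complementary pattern is to read the key from the environment once at startup and fail fast with a clear message, rather than letting a missing key surface later as an authentication error mid-request. A minimal sketch (the helper name and default variable are our own, matching the Mistral example from earlier; adjust per provider):

```python
import os

def require_api_key(var_name: str = "MISTRAL_API_KEY") -> str:
    """Fetch an API key from the environment, failing fast with a clear message."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell before running, "
            f"and never hardcode keys in source control."
        )
    return key
```

Calling `require_api_key()` at the top of your program turns a confusing runtime authentication failure into an immediate, actionable configuration error.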
Summary
Phew! You’ve just mastered a crucial aspect of building reliable AI applications. Let’s recap what we’ve covered:
- Why error handling matters: It makes your applications resilient, user-friendly, and easier to debug.
- any-llm’s unified exception hierarchy: It simplifies handling errors across different LLM providers, with a base AnyLLMError and specific subclasses.
- Common any-llm error types: We explored AnyLLMConnectionError, AnyLLMAuthenticationError, AnyLLMRateLimitError, AnyLLMInvalidRequestError, and AnyLLMAPIError, understanding their causes and implications.
- Effective error handling strategies: Including graceful degradation, retries with exponential backoff (and jitter!), and comprehensive logging.
- Hands-on implementation: You built Python code demonstrating basic, specific, and retry-enabled error handling for any-llm calls.
- Mini-Challenge: You practiced simulating and handling a rate limit error.
- Common pitfalls: You learned to avoid broad Exception catches, neglecting retries, ignoring error details, excessive retries, and hardcoding API keys.
By applying these principles, you’re well on your way to developing robust and professional AI solutions with any-llm.
What’s next? In the upcoming chapter, we’ll explore how to handle multiple LLM requests efficiently using asynchronous programming with any-llm, allowing your applications to perform many tasks concurrently without getting bogged down!
References
- Mozilla.ai any-llm GitHub Repository: The primary source for the any-llm SDK, including installation instructions and code examples.
- Introducing any-llm: A unified API to access any LLM provider (Mozilla.ai Blog): Provides context on the motivation and features of any-llm v1.0.
- Python logging module documentation: Essential for understanding how to effectively log messages in your Python applications.
- Python time module documentation: Used for pausing execution in retry mechanisms.
- Retries with Exponential Backoff: A common pattern for handling transient errors in networked applications.