Welcome to a truly exciting chapter where we turn theory into practice! In our previous discussions, we’ve explored the foundational concepts of AI workflow languages, agent operating systems, and orchestration engines. Now, it’s time to get our hands dirty and build a simplified, yet insightful, collaborative AI assistant that brings these ideas to life.
In this chapter, you’ll embark on a hands-on journey to create a system where multiple AI agents work together to achieve a complex goal: researching a specific topic and generating a concise summary. This project will solidify your understanding of multi-agent collaboration, tool integration, and basic orchestration, preparing you for more advanced frameworks like OpenFang and ChatDev. Get ready to write some code and see your agents in action!
Before we dive in, please ensure you have a basic grasp of Python programming, understand how to interact with Large Language Model (LLM) APIs, and are familiar with the core concepts of agents, tools, and orchestration as covered in earlier chapters. Let’s make some AI magic happen!
Core Concepts for Our Collaborative Assistant
Our objective is to construct an AI assistant capable of taking a user query (e.g., “What is quantum computing?”) and producing a well-structured summary. To accomplish this, we’ll design a system featuring two specialized agents and a straightforward orchestrator:
- Researcher Agent: This agent’s primary responsibility will be to find relevant information by utilizing an external “tool,” which will simulate a search engine.
- Summarizer Agent: This agent will receive the raw information gathered by the Researcher Agent and condense it into a coherent, easy-to-understand summary.
- Orchestrator: This will be our central Python script, acting as the conductor of our agent ensemble. It will guide the flow of information between the agents, ensuring they complete their tasks in the correct sequence to achieve the overall goal.
Why adopt this approach? It beautifully illustrates the power of multi-agent collaboration and specialization. Instead of relying on one monolithic AI to handle every aspect of the task, we break down the problem into smaller, more manageable sub-tasks. Each sub-task is then assigned to an “expert” agent designed specifically for that role. This modularity not only makes our system more robust and easier to debug but also significantly enhances its scalability.
The workflow is straightforward: the orchestrator hands the user query to the Researcher Agent, the Researcher's raw findings flow to the Summarizer Agent, and the Summarizer's output is returned to the user.
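Stripped of all LLM machinery, this orchestration is just sequential function composition. A minimal sketch with stand-in functions (the real agents are built later in this chapter):

```python
# Minimal sketch of the pipeline: each "agent" is a stand-in function here.
def research(query: str) -> str:           # Researcher Agent (stand-in)
    return f"raw notes about {query}"

def summarize(text: str) -> str:           # Summarizer Agent (stand-in)
    return f"summary of: {text}"

def orchestrate(query: str) -> str:        # Orchestrator: sequential hand-off
    raw = research(query)                  # step 1: gather information
    return summarize(raw)                  # step 2: condense it

print(orchestrate("quantum computing"))
# -> summary of: raw notes about quantum computing
```

The hand-off in `orchestrate` is exactly what our real orchestrator will do, only with classes and an LLM behind the second step.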
This project focuses on building these fundamental components from scratch. This hands-on experience will provide a solid foundation before you explore more advanced frameworks like OpenFang (an Agent Operating System that manages agent lifecycle, memory, and communication) or ChatDev (a multi-agent collaboration framework specifically designed for software development).
What Exactly is a “Tool” for an AI Agent?
In the context of AI agent systems, a “tool” is essentially a function, an API call, or a piece of software that an agent can invoke to interact with the external world or perform a specific task. Think of it as an agent’s specialized gadget! LLMs are incredibly powerful at generating and understanding text, but they don’t inherently have real-time access to the internet, the ability to run code, or a way to interact with databases. That’s where tools come in.
For our project, we’ll simulate an external search API using a simple Python function that returns some pre-defined or dynamically generated text. In a real-world, production-grade scenario, this search_tool would make an actual API call to services like Google Search, Bing Search, or a specialized internal knowledge base. The key idea is to give our agents capabilities beyond their core LLM reasoning.
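To make the "real API call" idea concrete, here is a hedged sketch using only Python's standard library. The endpoint URL, query parameters, and JSON response shape are hypothetical placeholders, not any real service's API:

```python
import json
import urllib.parse
import urllib.request

def web_search(query: str, api_key: str) -> str:
    """Sketch of a production search tool.

    The endpoint, parameters, and response shape below are hypothetical
    placeholders; substitute your chosen search provider's real API.
    """
    url = "https://api.example-search.invalid/v1/search?" + urllib.parse.urlencode(
        {"q": query, "key": api_key}
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    # Join the top snippets into one block of text for the agent to consume.
    return "\n".join(item["snippet"] for item in data.get("results", [])[:3])

if __name__ == "__main__":
    # Requires a real endpoint and key, so this is not runnable as-is.
    pass
```

The important part is the shape: a plain function that takes a string and returns a string, which is exactly the interface our simulated `search_tool` will share.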
Understanding Our Simplified Agent Structure
Each of our agents will adhere to a basic, yet effective, structure that allows them to perform their designated roles:
- Perceive: Agents must be able to receive input, such as a research query from the orchestrator or raw text to be summarized.
- Plan (Implicit): For this project, our agents will have an implicit, pre-defined “plan” based on their assigned role (e.g., “research this query” or “summarize this text”). In more sophisticated agent operating systems, this planning phase could involve complex reasoning, decision-making, and even self-correction.
- Act: Agents will perform an action based on their perception and plan. This might involve calling an external tool, making a request to an LLM, or processing data internally.
- Respond: Finally, agents will provide their output or result back to the orchestrator or another agent, completing their part of the workflow.
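These four phases can be made concrete with a toy agent. The class below is purely illustrative and does not appear in the project code:

```python
class EchoAgent:
    """Toy agent illustrating the perceive -> plan -> act -> respond loop."""

    def perceive(self, message: str) -> str:
        return message.strip()               # receive and normalize input

    def plan(self, observation: str) -> str:
        return "echo"                        # implicit, role-based plan

    def act(self, plan: str, observation: str) -> str:
        return observation.upper()           # perform the (trivial) action

    def respond(self, result: str) -> str:
        return f"EchoAgent says: {result}"   # hand the result back

    def run(self, message: str) -> str:
        obs = self.perceive(message)
        return self.respond(self.act(self.plan(obs), obs))

print(EchoAgent().run("  hello  "))  # -> EchoAgent says: HELLO
```

Our real agents collapse these phases into a single `run` method, but the same four stages are implicitly present.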
Step-by-Step Implementation: Building Our Assistant
Let’s get our hands on the keyboard and start coding our collaborative AI assistant! We’ll build this system incrementally, explaining each piece as we go.
Step 0: Environment Setup
First things first, let’s prepare our Python environment. Python 3.10 or newer is recommended for modern AI development.

Create a Project Directory: Open your terminal or command prompt and create a new directory for our project.

    mkdir collaborative_ai_assistant
    cd collaborative_ai_assistant

Set up a Virtual Environment: It’s always best practice to use a virtual environment to manage project dependencies and avoid conflicts with other Python projects.

    python3.10 -m venv venv
    source venv/bin/activate  # On Windows: .\venv\Scripts\activate

You should see `(venv)` at the beginning of your terminal prompt, indicating the virtual environment is active.

Install Dependencies: We’ll need a library to interact with Large Language Models. We’ll use the `openai` library as a widely adopted example, but the core principles can be adapted for Anthropic’s Claude, Google’s Gemini, or others. We’ll also use `python-dotenv` to manage API keys securely. Pinned versions keep the walkthrough reproducible; newer releases should also work.

    pip install openai==1.14.0 python-dotenv==1.0.1

Set up your LLM API Key: Create a file named `.env` in the root of your `collaborative_ai_assistant` project directory. This file will store your OpenAI API key. It is critically important to never hardcode API keys directly into your source code, especially if it might be shared or committed to version control!

    # .env
    OPENAI_API_KEY="sk-YOUR_OPENAI_API_KEY_HERE"

Action: Replace `"sk-YOUR_OPENAI_API_KEY_HERE"` with your actual OpenAI API key. If you don’t have one, you’ll need to create an account and generate a key on the OpenAI Platform.
Step 1: Define Our “Tool” - A Simple Search Function
Let’s start by creating the external “tool” our agents can use. Create a new Python file named tools.py in your project directory.
# tools.py

def search_tool(query: str) -> str:
    """
    Simulates an external search engine.
    In a real application, this would make an API call to a search service
    like Google Search, Bing Search, or a specialized knowledge base.
    """
    print(f"DEBUG: Search tool called with query: '{query}'")
    # Simulate different search results based on the query content.
    # This avoids needing a live API key for a search engine for this project.
    if "quantum computing" in query.lower():
        return (
            "Quantum computing is a new type of computing that uses the principles of quantum mechanics "
            "to solve complex problems that are beyond the capability of classical computers. "
            "It leverages phenomena like superposition and entanglement. "
            "Companies like IBM and Google are at the forefront of its development, "
            "focusing on improving qubit stability and error correction."
        )
    elif "artificial intelligence" in query.lower():
        return (
            "Artificial intelligence (AI) is a broad field of computer science that gives computers "
            "the ability to perform tasks that typically require human intelligence. "
            "This includes learning, problem-solving, perception, and understanding language. "
            "Machine learning and deep learning are key subfields of AI, driving innovations "
            "in areas like natural language processing and computer vision."
        )
    else:
        return (
            f"No specific information found for '{query}'. Here's a generic response about technology: "
            "Technology constantly evolves, driving innovation across various sectors, from healthcare "
            "to entertainment. Modern tech often involves data science and automation."
        )
Explanation of `tools.py`:

- We define a simple Python function called `search_tool` that accepts a `query` string as input.
- The `print` statement is a handy debugging aid, letting us know when the tool is invoked and with what query.
- Inside the function, basic `if`/`elif`/`else` statements check for keywords in the query and return different pre-defined strings. This simulates how a real search engine would return relevant results, without requiring us to set up and pay for an actual search API for this introductory project. It's a placeholder that demonstrates the concept of tool use.
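The same dictionary-dispatch idea our agents will use generalizes cleanly to several tools. A small illustrative registry (the `calculator_tool` is a toy addition for this example, not part of the project files):

```python
# A tool registry maps string names to callables; agents dispatch by key.
def search_tool(query: str) -> str:
    return f"results for: {query}"       # stand-in for the simulated tool above

def calculator_tool(expression: str) -> str:
    # Toy example only: never eval untrusted input in real code.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"search": search_tool, "calc": calculator_tool}

def invoke(tool_name: str, arg: str) -> str:
    if tool_name not in TOOLS:
        raise KeyError(f"Unknown tool: {tool_name!r}")
    return TOOLS[tool_name](arg)

print(invoke("search", "quantum computing"))  # -> results for: quantum computing
print(invoke("calc", "2 + 3"))                # -> 5
```

Raising on an unknown key fails fast, which is much easier to debug than a silently skipped tool call.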
Step 2: Create a BaseAgent Class
Next, let’s create a foundational class for all our agents. This BaseAgent class will encapsulate common functionalities, particularly how agents interact with the Large Language Model. Create a new Python file named agents.py.
# agents.py
import os
from abc import ABC, abstractmethod  # For creating abstract base classes

from dotenv import load_dotenv
from openai import OpenAI  # Using OpenAI's client library

# Load environment variables from the .env file
load_dotenv()


class BaseAgent(ABC):
    """
    An abstract base class for all AI agents.
    Provides common functionalities like LLM interaction.
    """

    def __init__(self, name: str, llm_model: str = "gpt-4-turbo-preview", temperature: float = 0.7):
        """
        Initializes the base agent with a name and LLM configuration.

        Args:
            name (str): The name of the agent (e.g., "Researcher", "Summarizer").
            llm_model (str): The name of the LLM model to use. Defaults to
                "gpt-4-turbo-preview"; substitute whichever capable model your
                provider currently offers.
            temperature (float): Controls the randomness of the LLM's output. Lower values
                (e.g., 0.1-0.5) make output more deterministic, higher values (e.g., 0.7-1.0)
                more creative.
        """
        self.name = name
        # Initialize the OpenAI client using the API key from environment variables
        self.client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.llm_model = llm_model
        self.temperature = temperature
        print(f"DEBUG: Initializing {self.name} with model {self.llm_model}")

    def _call_llm(self, messages: list[dict]) -> str:
        """
        Internal helper method to make a call to the LLM API.

        Args:
            messages (list[dict]): A list of message dictionaries in the OpenAI chat format.

        Returns:
            str: The content of the LLM's response, or an error message if the call fails.
        """
        try:
            response = self.client.chat.completions.create(
                model=self.llm_model,
                messages=messages,
                temperature=self.temperature,
            )
            # Extract and return the content of the first message choice
            return response.choices[0].message.content
        except Exception as e:
            # Basic error handling for LLM API calls
            print(f"ERROR: LLM call failed for {self.name}: {e}")
            return "An error occurred during LLM processing."

    @abstractmethod
    def run(self, *args, **kwargs) -> str:
        """
        Abstract method that every specific agent must implement.
        This method defines the agent's core operational logic.
        """
        pass
Explanation of `agents.py` (`BaseAgent`):

- Imports: `os`, `abc`, `load_dotenv`, and `OpenAI` bring in everything we need. `ABC` (Abstract Base Class) and `abstractmethod` are crucial for creating a blueprint that ensures every class inheriting from `BaseAgent` provides its own implementation of the `run` method.
- `load_dotenv()`: This line is vital! It loads the `OPENAI_API_KEY` from our `.env` file into the process environment, making it accessible to our Python script.
- `class BaseAgent(ABC)`: Declares our abstract base class.
- `__init__(self, name, llm_model, temperature)`:
  - The constructor takes a `name` (e.g., "Researcher"), the `llm_model` to use (defaulting to `gpt-4-turbo-preview`), and a `temperature` to control LLM creativity.
  - It initializes the `OpenAI` client using the API key securely loaded from the environment variables.
- `_call_llm(self, messages)`:
  - A protected helper method (indicated by the leading underscore) that interacts with the LLM API. It takes a list of `messages` formatted for OpenAI's chat completions API.
  - It sends these messages to the configured `llm_model` and returns the generated response content.
  - A basic `try`/`except` block provides rudimentary error handling for API calls.
- `@abstractmethod def run(self, *args, **kwargs)`: The most important part of the abstract class. Any concrete class inheriting from `BaseAgent` must provide its own implementation of `run`, which contains the specific, unique logic for each specialized agent.
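To see what the abstract-base-class machinery buys us, here is a compact demonstration (a simplified mirror of the class above, not the project code): instantiating the abstract base fails, while a subclass that implements `run` works:

```python
from abc import ABC, abstractmethod

class BaseAgent(ABC):                # simplified mirror of the class above
    @abstractmethod
    def run(self, *args, **kwargs) -> str: ...

class GoodAgent(BaseAgent):
    def run(self) -> str:            # concrete implementation: OK
        return "done"

try:
    BaseAgent()                      # abstract class: cannot be instantiated
except TypeError as e:
    print(f"TypeError: {e}")

print(GoodAgent().run())             # -> done
```

This is why forgetting to implement `run` in a new agent surfaces immediately at construction time rather than as a confusing failure mid-workflow.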
Step 3: Implement the ResearcherAgent
Now, let’s create our first concrete agent class, ResearcherAgent. This agent will leverage our search_tool to gather information. Add this code to your existing agents.py file, after the BaseAgent class.
# agents.py (add this to the existing agents.py file)
# ... (existing imports and BaseAgent class) ...
from tools import search_tool  # Import our search tool (tools.py sits alongside agents.py)


class ResearcherAgent(BaseAgent):
    """
    A specialized agent responsible for researching topics using an external search tool.
    """

    def __init__(self, name: str = "Researcher"):
        """
        Initializes the Researcher Agent.
        """
        super().__init__(name)  # Call the BaseAgent's constructor
        self.tools = {"search": search_tool}  # The Researcher Agent "knows" about the search tool
        print(f"{self.name} is ready to research.")

    def run(self, query: str) -> str:
        """
        The Researcher Agent's core logic: uses the search tool to find information
        based on the provided query.

        Args:
            query (str): The topic or question to research.

        Returns:
            str: The raw information found by the search tool.
        """
        print(f"{self.name} received query: '{query}'")
        # Invoke the search tool using its dictionary key
        search_results = self.tools["search"](query)
        print(f"{self.name} found information (snippet): '{search_results[:100]}...'")
        return search_results
Explanation of `ResearcherAgent`:

- `from tools import search_tool`: We import the `search_tool` function we defined earlier. A plain absolute import works here because `tools.py` sits in the same directory as `agents.py` and we run the project as top-level scripts rather than as a package.
- `class ResearcherAgent(BaseAgent)`: Declares our `ResearcherAgent` class, explicitly stating that it inherits from `BaseAgent`.
- `__init__(self, name)`:
  - `super().__init__(name)`: This is crucial! It calls the `BaseAgent` constructor, ensuring common functionalities like LLM client initialization are set up.
  - `self.tools = {"search": search_tool}`: This is how the agent "learns" about its available tools. We store `search_tool` in a dictionary, making it callable by its string key `"search"`.
- `run(self, query)`:
  - The concrete implementation of the abstract `run` method for the `ResearcherAgent`. It takes a `query` string as input.
  - `self.tools["search"](query)` executes our simulated search tool, which returns a string of information; the agent then returns that `search_results` string.
Step 4: Implement the SummarizerAgent
Next, let’s create the SummarizerAgent in the same agents.py file. This agent will directly use the LLM’s capabilities to condense text. Add this code after the ResearcherAgent class.
# agents.py (add this to the existing agents.py file)
# ... (existing imports, BaseAgent, ResearcherAgent classes) ...


class SummarizerAgent(BaseAgent):
    """
    A specialized agent responsible for summarizing provided text using an LLM.
    """

    def __init__(self, name: str = "Summarizer"):
        """
        Initializes the Summarizer Agent.
        """
        super().__init__(name)  # Call the BaseAgent's constructor
        print(f"{self.name} is ready to summarize.")

    def run(self, text_to_summarize: str) -> str:
        """
        The Summarizer Agent's core logic: uses the LLM to summarize provided text.

        Args:
            text_to_summarize (str): The raw text content to be summarized.

        Returns:
            str: A concise summary generated by the LLM.
        """
        print(f"{self.name} received text for summarization (snippet): '{text_to_summarize[:100]}...'")
        # Construct the messages list for the LLM API call.
        # The 'system' message sets the persona and instructions for the LLM.
        # The 'user' message provides the actual content to be summarized.
        messages = [
            {"role": "system", "content": "You are an expert summarizer. Condense the given text into a concise, informative, and neutral summary. Focus on the main points."},
            {"role": "user", "content": f"Please summarize the following text:\n\n{text_to_summarize}"},
        ]
        # Call the inherited _call_llm method to get the summary from the LLM.
        summary = self._call_llm(messages)
        print(f"{self.name} generated summary (snippet): '{summary[:100]}...'")
        return summary
Explanation of `SummarizerAgent`:

- `class SummarizerAgent(BaseAgent)`: Declares our `SummarizerAgent` class, also inheriting from `BaseAgent`.
- `__init__(self, name)`:
  - `super().__init__(name)`: Again, we call the `BaseAgent` constructor for common setup. This agent doesn't use external tools like the `ResearcherAgent`; its core capability comes from direct interaction with the LLM.
- `run(self, text_to_summarize)`:
  - The concrete implementation of `run` for the `SummarizerAgent`. It takes `text_to_summarize` as input, typically the raw information returned by the `ResearcherAgent`.
  - Prompt engineering: We construct a list of `messages` for the LLM. The `system` message is crucial here; it gives the LLM a specific persona and instructions ("You are an expert summarizer…"). The `user` message then provides the actual content to process. This is a fundamental pattern for guiding LLMs to perform specific tasks.
  - `self._call_llm(messages)`: The agent calls the inherited helper to send the messages to the LLM, then returns the generated `summary`.
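The prompt construction inside `run` can be pulled into a small helper so it can be inspected and tested without any API call. This refactor is a suggestion, not part of the chapter's files:

```python
def build_summary_messages(text: str) -> list[dict]:
    """Build the system/user message pair the Summarizer sends to the LLM."""
    return [
        {"role": "system",
         "content": "You are an expert summarizer. Condense the given text "
                    "into a concise, informative, and neutral summary. "
                    "Focus on the main points."},
        {"role": "user",
         "content": f"Please summarize the following text:\n\n{text}"},
    ]

msgs = build_summary_messages("Quantum computing uses qubits.")
print(msgs[0]["role"], "->", msgs[1]["role"])  # -> system -> user
```

Separating prompt construction from the API call makes it trivial to assert on message structure in unit tests and to iterate on wording without spending tokens.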
Step 5: Build the Orchestrator (Main Script)
Finally, let’s create the brain of our operation: the orchestrator. This script will instantiate our agents and coordinate their workflow, passing information between them in the correct sequence. Create a new Python file named orchestrator.py in your project directory.
# orchestrator.py
from agents import ResearcherAgent, SummarizerAgent


def main():
    """
    The main function to orchestrate the collaborative AI assistant workflow.
    """
    print("--- Starting Collaborative AI Assistant ---")

    # 1. Instantiate our specialized agents.
    # The orchestrator is responsible for bringing the agents to life.
    researcher = ResearcherAgent()
    summarizer = SummarizerAgent()

    # 2. Define the user's initial query or task.
    user_query = "What are the latest advancements in quantum computing?"
    # Try another query to see different tool results:
    # user_query = "Explain artificial intelligence and its subfields."
    print(f"\nOrchestrator: User query received: '{user_query}'")

    # 3. Orchestrate the workflow: guide information flow between agents.

    # Step A: Delegate the research task to the Researcher Agent.
    print("\nOrchestrator: Delegating research task to Researcher Agent...")
    raw_information = researcher.run(user_query)
    # The Researcher Agent's output (raw_information) becomes the input for the next step.

    # Step B: Delegate the summarization task to the Summarizer Agent.
    print("\nOrchestrator: Delegating summarization task to Summarizer Agent...")
    final_summary = summarizer.run(raw_information)
    # The Summarizer Agent's output (final_summary) is our desired end result.

    # 4. Present the final result to the user.
    print("\n--- Final Output ---")
    print(f"Original Query: {user_query}")
    print("\nGenerated Summary:")
    print(final_summary)
    print("\n--- Collaborative AI Assistant Finished ---")


if __name__ == "__main__":
    main()
Explanation of `orchestrator.py`:

- `from agents import ResearcherAgent, SummarizerAgent`: Imports the two agent classes we just created, making them available in the orchestrator script.
- `main()` function: Encapsulates the entire workflow.
  - Instantiation: `researcher = ResearcherAgent()` and `summarizer = SummarizerAgent()` create instances of our specialized agents. This is the orchestrator "assembling" its team.
  - User query: `user_query` holds the initial task that our collaborative assistant needs to perform.
  - Orchestration logic, the core of the workflow:
    - The orchestrator first calls `researcher.run(user_query)`, delegating the research task to the `ResearcherAgent` and capturing its output (`raw_information`).
    - Crucially, it then passes this `raw_information` as input to `summarizer.run(raw_information)`. This demonstrates the sequential flow of information and task delegation between specialized agents.
  - Output: Finally, it prints the original query and the `final_summary` generated through the agents' collaborative effort.
- `if __name__ == "__main__": main()`: This standard Python idiom ensures that `main()` runs only when the script is executed directly, not when it is imported as a module.
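Because the orchestrator depends only on each agent's `run` method, you can verify the wiring offline with stub agents before spending any API credit. A sketch of that testing pattern (not part of the project files):

```python
class StubResearcher:
    """Stands in for ResearcherAgent: no tools, no LLM."""
    def run(self, query: str) -> str:
        return f"[raw facts about {query}]"

class StubSummarizer:
    """Stands in for SummarizerAgent: no API key needed."""
    def run(self, text: str) -> str:
        return f"[summary of {text}]"

def run_pipeline(query: str, researcher, summarizer) -> str:
    # Same sequential hand-off as main(), with the agents injected.
    return summarizer.run(researcher.run(query))

result = run_pipeline("quantum computing", StubResearcher(), StubSummarizer())
print(result)  # -> [summary of [raw facts about quantum computing]]
```

Swapping the stubs for the real agents requires no change to `run_pipeline`, which is the essence of keeping orchestration decoupled from agent internals.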
Step 6: Run Your Collaborative AI Assistant!
You’ve built all the pieces! Now, it’s time to see your collaborative AI assistant in action.
Open your terminal or command prompt. Make sure your virtual environment is activated (you should see (venv) at the beginning of your prompt). Then, run your orchestrator script:
python orchestrator.py
What to Observe: You should see a sequence of debug messages and final output that looks similar to this (actual LLM-generated summary content will vary slightly each time):
--- Starting Collaborative AI Assistant ---
DEBUG: Initializing Researcher with model gpt-4-turbo-preview
Researcher is ready to research.
DEBUG: Initializing Summarizer with model gpt-4-turbo-preview
Summarizer is ready to summarize.
Orchestrator: User query received: 'What are the latest advancements in quantum computing?'
Orchestrator: Delegating research task to Researcher Agent...
Researcher received query: 'What are the latest advancements in quantum computing?'
DEBUG: Search tool called with query: 'What are the latest advancements in quantum computing?'
Researcher found information (snippet): 'Quantum computing is a new type of computing that uses the principles of quantum mechanics to solve complex...'
Orchestrator: Delegating summarization task to Summarizer Agent...
Summarizer received text for summarization (snippet): 'Quantum computing is a new type of computing that uses the principles of quantum mechanics to solve complex...'
Summarizer generated summary (snippet): 'Quantum computing utilizes quantum mechanics principles like superposition and entanglement to tackle complex...'
--- Final Output ---
Original Query: What are the latest advancements in quantum computing?
Generated Summary:
Quantum computing is an emerging computing paradigm that applies quantum-mechanical principles, notably superposition and entanglement, to tackle complex problems beyond the reach of classical computers. Companies such as IBM and Google are at the forefront of its development, with current efforts focused on improving qubit stability and error correction.
--- Collaborative AI Assistant Finished ---
Congratulations! You’ve successfully built a basic multi-agent system where a ResearcherAgent uses a tool to gather information, and a SummarizerAgent processes that information using an LLM, all coordinated by an Orchestrator. This is the very essence of advanced AI engineering and agentic workflows!
Mini-Challenge: Add a Fact-Checker Agent
You’ve done an excellent job creating specialized agents and orchestrating their tasks. Now, let’s enhance our system further by adding another layer of intelligence!
Challenge: Create a new agent called FactCheckerAgent. This agent should take the generated summary and a specific claim within that summary, then use the search_tool to attempt to verify that claim.
Steps to Implement:

1. Create `FactCheckerAgent` in `agents.py`:
   - It should inherit from `BaseAgent`.
   - Its `__init__` method should call `super().__init__(name)` and define its own tools (it will need the `search_tool`!).
   - Its `run` method should take `summary_text` (the full summary) and `claim_to_verify` (a specific sentence or phrase from the summary) as input.
   - Inside `run`, it should first use the `search_tool` (just like the `ResearcherAgent`) with `claim_to_verify` as the query to gather external evidence.
   - Then it should use its internal LLM (`self._call_llm`) to compare the `search_results` with the `claim_to_verify` and determine whether the claim is supported, contradicted, or unclear based only on the search results. Return a clear verdict (e.g., "supported", "contradicted", "unclear") along with a brief explanation.

2. Integrate into `orchestrator.py`:
   - Instantiate the `FactCheckerAgent` in the `main()` function, alongside your other agents.
   - After the `SummarizerAgent` generates its `final_summary`, add a new orchestration step.
   - Choose a specific, verifiable claim from the `final_summary` (or a known fact related to the topic) to pass to the `FactCheckerAgent`.
   - Call the `FactCheckerAgent`'s `run` method with the `final_summary` and your chosen `claim_to_verify`.
   - Print the verdict returned by the `FactCheckerAgent`.

Hint for the `FactCheckerAgent.run` prompt: the `messages` it sends to the LLM might look something like this to guide its reasoning:

    # Inside FactCheckerAgent.run(...)
    # First, get search_results using self.tools["search"](claim_to_verify)
    messages = [
        {"role": "system", "content": (
            "You are a diligent and unbiased fact-checker. Your task is to compare "
            "a given claim against provided search results. State your verdict as "
            "'SUPPORTED', 'CONTRADICTED', or 'UNCLEAR'. Provide a concise explanation "
            "for your verdict, citing information from the search results if possible."
        )},
        {"role": "user", "content": (
            f"Claim to verify: '{claim_to_verify}'\n\n"
            f"Search Results:\n{search_results}\n\n"
            "Based ONLY on the search results, what is your verdict and explanation?"
        )},
    ]
    verdict_and_explanation = self._call_llm(messages)
    return verdict_and_explanation

Remember to import your new `FactCheckerAgent` into `orchestrator.py`!
What to observe/learn from this challenge:
- How easily you can extend your multi-agent system by adding new specialized agents.
- The critical importance of precise prompt engineering for LLMs to perform specific, nuanced tasks like fact-checking accurately.
- The iterative nature of designing and refining agent interactions and their prompts.
Give it a shot! Don’t worry if it’s not perfect on the first try; debugging and refining agent prompts and workflows is a core part of modern AI engineering.
Common Pitfalls & Troubleshooting
Working with multi-agent systems, especially those leveraging LLMs, can introduce unique challenges compared to traditional software development. Their emergent behaviors and reliance on external APIs mean you’ll encounter different types of issues. Here are a few common pitfalls and how to address them:
LLM API Errors (Rate Limits, Invalid Keys, Network Issues):

- Pitfall: Your program crashes with an API-related error, or `_call_llm` returns "An error occurred during LLM processing."
- Troubleshooting:
  - Check the `.env` file: Double-check that `OPENAI_API_KEY` is correctly set in your `.env` file and that the key itself is valid and active, with no extra spaces or characters.
  - API dashboard: Log into your OpenAI (or other LLM provider) account. Verify that your API key is active, you have sufficient credit, and your usage hasn't hit any hard limits.
  - Rate limits: LLM providers cap how many requests you can make per minute or second. For this simple project that's unlikely to be an issue, but larger, concurrent systems need robust retry logic with exponential backoff.
  - Network connectivity: Ensure your machine has an active and stable internet connection to reach the LLM API endpoints.
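For those larger systems, retry-with-exponential-backoff can be sketched as a generic wrapper, not tied to any particular SDK:

```python
import random
import time

def call_with_retries(fn, max_attempts: int = 5, base_delay: float = 1.0):
    """Retry a flaky call, doubling the delay each attempt plus a little jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise                                   # out of attempts: re-raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Demo with a function that fails twice before succeeding.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

print(call_with_retries(flaky, base_delay=0.01))  # -> ok
```

In production you would catch only the provider's retryable error types (rate-limit and transient network errors) rather than the blanket `Exception` used here for brevity.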
Agent “Hallucinations” or Irrelevant/Poor Output:

- Pitfall: The `SummarizerAgent` produces a summary that doesn't quite align with the raw information, or the `FactCheckerAgent` (from the mini-challenge) gives a strange or incorrect verdict.
- Troubleshooting:
  - Prompt engineering is key! This is often the root cause. Review the `system` and `user` messages for each agent. Are they clear, concise, and unambiguous? Be more specific about the desired output format, tone, or constraints; for example, explicitly tell the `SummarizerAgent` to "focus on the main points and avoid speculation."
  - Temperature adjustment: The `temperature` parameter in `BaseAgent`'s `__init__` controls the randomness of the LLM's output. A higher `temperature` (e.g., 0.8-1.0) makes the LLM more creative and potentially less factual, while a lower `temperature` (e.g., 0.1-0.5) makes it more deterministic and factual. Experiment with this value.
  - Context window limits: While modern LLMs like `gpt-4-turbo-preview` have very large context windows, ensure that `text_to_summarize` or `search_results` aren't excessively long. Extremely verbose inputs can sometimes cause the LLM to "lose track" of earlier information.
  - Tool output quality: Remember the "garbage in, garbage out" principle. If the `search_tool` (even our simulated one) returns irrelevant or poor-quality information to the `SummarizerAgent`, the summary will likely reflect that poor quality.
Tool Integration and Invocation Issues:

- Pitfall: The `search_tool` isn't called, or it's called with the wrong arguments, leading to unexpected behavior or errors.
- Troubleshooting:
  - `print` debugging: Use `print` statements liberally (like the `DEBUG` ones we added) to trace the flow of execution, confirm when tools are called, and inspect the values passed to and returned from tools.
  - Dictionary keys: Double-check that the tool's name (the string key) in `self.tools` (e.g., `"search"`) exactly matches how it's invoked (e.g., `self.tools["search"]`). Typographical errors here are common.
  - Import paths: Ensure that `from tools import search_tool` is correct and that `tools.py` sits in the same directory as `agents.py`.
Managing Agent State and Memory (Future Consideration):

- Pitfall: In more complex, conversational, or long-running scenarios, agents might "forget" previous interactions or contextual information crucial to their current task.
- Troubleshooting: For our current simple project this isn't a major issue, because context is explicitly passed from one agent to the next. In real-world applications, however, you would implement explicit memory mechanisms, such as:
  - Conversational history: Passing previous user and assistant messages as part of the `messages` list to the LLM.
  - Knowledge graphs or vector databases: Storing and retrieving relevant information from an AI-native database to augment the LLM's context.
  - This is precisely where full-fledged Agent Operating Systems like OpenFang (v0.3.30) provide robust solutions for managing agent state, memory, and persistent context.
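A minimal version of the conversational-history idea keeps the system prompt plus the most recent messages; this sketch uses an arbitrary cutoff policy for illustration:

```python
def trim_history(messages: list[dict], max_messages: int = 6) -> list[dict]:
    """Keep the system prompt plus the most recent messages (naive policy)."""
    if len(messages) <= max_messages:
        return messages
    system, rest = messages[:1], messages[1:]
    return system + rest[-(max_messages - 1):]

# Build a long fake conversation: 1 system message + 10 user/assistant turns.
history = [{"role": "system", "content": "You are helpful."}]
for i in range(10):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

history = trim_history(history)
print(len(history), history[0]["role"], history[-1]["content"])
# -> 6 system answer 9
```

Real systems use smarter policies (summarizing dropped turns, or retrieving relevant past messages from a vector store), but the shape of the solution is the same: curate what goes into the `messages` list on every call.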
Remember, debugging agentic systems is a dynamic process that often involves a blend of traditional software debugging techniques and the art of prompt engineering. Be patient, experiment with your prompts, and use your print statements generously to understand the internal workings of your agents!
Summary
In this hands-on chapter, you’ve taken a significant step from theoretical understanding to practical application in AI engineering. You’ve experienced firsthand the power of breaking down complex problems into manageable, collaborative tasks for AI agents. Here’s a quick recap of our key accomplishments:
- Built a Multi-Agent System: You successfully designed and implemented a foundational collaborative AI assistant featuring a `ResearcherAgent` and a `SummarizerAgent`.
- Practiced Tool Integration: You gained practical experience in how AI agents can extend their capabilities by leveraging external functions, demonstrated through our `search_tool`.
- Implemented Basic Orchestration: You created an `orchestrator.py` script that defines the workflow and manages the sequential interaction and information flow between specialized agents.
- Understood Agent Specialization: You experienced the benefits of breaking down a complex task into smaller, specialized roles for different agents, leading to a more modular, efficient, and understandable system.
- Gained Practical Experience with LLM APIs: You directly used the OpenAI API client to integrate a powerful large language model into your agent system, learning about prompt engineering in the process.
This project provided a solid foundational understanding of how concepts like multi-agent collaboration, tool use, and orchestration come together to form intelligent systems. While our current assistant is simplified, it lays the groundwork for understanding and eventually working with more sophisticated frameworks like OpenFang (an Agent Operating System for comprehensive agent management) and ChatDev (a multi-agent collaboration framework specifically designed for automated software development).
In the next chapters, we’ll continue our exploration into more advanced aspects of AI engineering, diving deeper into topics like dedicated AI workflow languages, AI-native IDEs, and the future trends shaping this incredibly exciting field. Keep building, keep experimenting, and keep learning!
References
- OpenAI API Documentation
- RightNow-AI/openfang - Agent Operating System - GitHub
- ChatDev 2.0: Dev All through LLM-powered Multi-Agent Collaboration - GitHub
- Python Official Documentation
- python-dotenv GitHub Repository