ChatGPT answers questions.
An AI Agent takes actions.
The difference is the gap between “telling you how to book a flight” and “actually booking the flight for you.” This is the most important architectural shift in software since microservices.
This is the Mastery Guide to AI Agents — from single-tool calls to multi-agent orchestration with LangGraph.
Part 1: Foundations (The Mental Model)
LLM = The Smart Advisor
A plain LLM is your brilliant friend who can advise you on anything. But they can only talk. They cannot access your email, call your bank, or update your calendar. They have no tools.
AI Agent = The Smart Employee
An Agent is an LLM that has been given Tools and the ability to decide when and how to use them.
```
User: "Book me a meeting with Sarah next Tuesday at 3pm."
        │
        ▼
[LLM Thinks]: "I need to check Sarah's calendar first."
        │
        ▼ (Tool Call)
[Tool: get_calendar("[email protected]", "next Tuesday")]
        │ Returns: "Sarah is free 2-4pm."
        ▼
[LLM Thinks]: "Now I'll book a 1-hour slot at 3pm."
        │
        ▼ (Tool Call)
[Tool: create_event("Meeting with Sarah", "Tuesday 3pm", attendees=[...])]
        │ Returns: "Event created. Invitation sent."
        ▼
Agent: "Done! Meeting booked for Tuesday at 3pm."
```
The LLM is the brain. The tools are the hands.
Multi-Agent = The Company
A single agent handles one specialty. A multi-agent system is a company of specialized workers:
- Orchestrator Agent: The Manager. Breaks down complex tasks and assigns to specialists.
- Researcher Agent: Searches the web, reads documents.
- Coder Agent: Writes and executes code.
- Critic Agent: Reviews outputs and flags errors.
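The division of labor above can be sketched in plain Python. This is a minimal illustration of the orchestrator pattern, not any framework's API: the specialist functions, the keyword-based routing, and the task breakdown are all hypothetical stand-ins for what would normally be LLM calls.

```python
# Minimal orchestrator sketch: the "manager" routes each subtask to a
# specialist. In a real system an LLM would decompose the task and pick
# the route; here both are hard-coded for illustration.

def researcher(task: str) -> str:
    return f"[research notes on: {task}]"

def coder(task: str) -> str:
    return f"[code written for: {task}]"

def critic(output: str) -> str:
    return f"[review of: {output}]"

SPECIALISTS = {"research": researcher, "code": coder}

def orchestrator(task: str) -> str:
    # 1. Break the task down (hard-coded here; an LLM would do this)
    subtasks = [("research", task), ("code", task)]
    # 2. Delegate each subtask to the matching specialist
    outputs = [SPECIALISTS[kind](sub) for kind, sub in subtasks]
    # 3. Pass the combined draft through the critic before returning
    return critic(" ".join(outputs))

print(orchestrator("build a currency converter"))
```

The structure is the point: specialists know nothing about each other, and only the orchestrator sees the whole task.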
Part 2: The Investigation (The ReAct Loop)
Every agent runs on a core loop called ReAct (Reason + Act):
```
THOUGHT: "What do I need to do?"
   ↓
ACTION: Call a tool
   ↓
OBSERVATION: See the tool's result
   ↓
THOUGHT: "What did I learn? What's next?"
   ↓
(repeat until task is done or I give up)
   ↓
FINAL ANSWER
```
```python
from openai import OpenAI
import json

client = OpenAI()

# Define what tools the agent can use
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

def get_weather(city: str) -> str:
    # Your actual weather API call here
    return f"Sunny, 28°C in {city}"

messages = [{"role": "user", "content": "What's the weather in Hanoi?"}]

# First call: LLM decides to use the tool
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools
)

# Execute the tool
tool_call = response.choices[0].message.tool_calls[0]
result = get_weather(**json.loads(tool_call.function.arguments))

# Feed result back to LLM for final answer
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

final = client.chat.completions.create(model="gpt-4o", messages=messages)
print(final.choices[0].message.content)
# "The weather in Hanoi is currently sunny with a temperature of 28°C."
```
Part 3: The Diagnosis (Agent Failure Modes)
| Problem | Symptom | Fix |
|---|---|---|
| Infinite Loop | Agent keeps calling tools without finishing | Set `max_iterations=10`. Detect repeated tool calls. |
| Tool Hallucination | Agent calls a tool with made-up arguments | Use strict JSON schema validation on tool inputs. |
| Context Overflow | Long agentic tasks fill up the context window | Implement a “memory” layer. Summarize old steps. |
| Cascading Failure | One agent fails; the whole pipeline dies | Error handling per agent. Retry with backoff. |
| Unreliable Output | Agent gives a different format each time | Use structured output (`response_format={"type": "json_object"}`). |
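The first two rows of the table can be enforced mechanically in the agent loop itself. Here is a sketch of a guarded loop with an iteration cap and duplicate-call detection; `fake_llm` and `run_tool` are illustrative stubs (a deliberately buggy “model” that asks for the same tool forever), not a real framework's API.

```python
# Guarded agent loop: hard iteration cap + repeated-tool-call detection.

MAX_ITERATIONS = 10

def fake_llm(history):
    # Stub "model" that keeps requesting the same tool call forever,
    # simulating the Infinite Loop failure mode.
    return {"tool": "get_weather", "args": {"city": "Hanoi"}}

def run_tool(name, args):
    return f"Sunny, 28°C in {args['city']}"

def run_agent(task: str) -> str:
    history, seen_calls = [task], set()
    for step in range(MAX_ITERATIONS):
        decision = fake_llm(history)
        # Hashable fingerprint of the tool call: name + sorted arguments
        call_key = (decision["tool"], str(sorted(decision["args"].items())))
        if call_key in seen_calls:
            # Identical call repeated: the agent is looping, so bail out.
            return "Stopped: repeated tool call detected."
        seen_calls.add(call_key)
        history.append(run_tool(decision["tool"], decision["args"]))
    return "Stopped: hit iteration limit."

print(run_agent("What's the weather in Hanoi?"))
# -> "Stopped: repeated tool call detected."
```

In production you would log the bail-out reason and either surface a partial answer or escalate to a human, rather than silently returning.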
Part 4: The Resolution (LangGraph for Multi-Agent)
LangGraph is one of the most production-ready frameworks for building stateful, multi-agent workflows.
The key concept: define agents as nodes in a graph. Control flow with edges.
```python
from langgraph.graph import START, END, StateGraph
from typing import TypedDict, Annotated
import operator

# Define the shared state between agents
class AgentState(TypedDict):
    messages: Annotated[list, operator.add]  # Accumulate messages
    task: str
    result: str

def research_agent(state: AgentState) -> AgentState:
    """Searches the web and returns findings."""
    # ... call LLM with search tools ...
    return {"messages": [research_result]}

def writer_agent(state: AgentState) -> AgentState:
    """Takes research and writes a report."""
    # ... call LLM with the research in context ...
    return {"messages": [written_report], "result": written_report}

def should_continue(state: AgentState) -> str:
    """Router: decide which agent goes next."""
    if not state.get("result"):
        return "writer"
    return END

# Build the graph
workflow = StateGraph(AgentState)
workflow.add_node("researcher", research_agent)
workflow.add_node("writer", writer_agent)
workflow.add_edge(START, "researcher")
workflow.add_conditional_edges("researcher", should_continue)
workflow.add_edge("writer", END)

app = workflow.compile()
result = app.invoke({"task": "Write a report on AI trends in 2026"})
```
Final Mental Model
```
LLM         -> The Smart Brain. Can reason, but cannot act.
Tool        -> The Hands. APIs, DB queries, code execution, web search.
Agent       -> Brain + Hands. Can think and act in a loop until done.
Multi-Agent -> A Company. Specialized brains working in parallel.
ReAct Loop  -> Think → Act → Observe → Repeat.
LangGraph   -> The org chart software for your AI company.
```
The shift of 2026: Moving from “chatbots that answer” to “agents that do.” If your AI product only generates text, you’re already behind. The future is AI that executes.