Series · LangGraph from Scratch · Part 6 of 8

· 27 min read

LangGraph from Scratch, Part 6: Giving the Bot Tools

Your bot stops guessing. It calls a real calculator and a real web search, decides for itself when to reach for them, and streams the work back to the UI over the exact belt you built in Part 5.

langgraph · fastapi · tools · tutorial

Ask your bot from Part 5 what 23 times 17 is. It will answer with total confidence, and there's a real chance it's wrong. Ask it for today's headlines and it will invent some. The model isn't lying on purpose. It's a brilliant improviser that has never once done arithmetic or seen the internet. It pattern-matches words, and sometimes the pattern for "23 times 17" lands on the wrong number.

By the end of this page, your bot stops guessing. It gets a calculator that's always right and a web search that actually looks things up, and, the part that makes it an agent instead of a chatbot, it decides on its own when to use them.

Today's destination. One question, two tools the bot chose to call, and a final answer built from their results. The dashed bubbles are the bot showing its work.

This part has more moving pieces than the last, so here's the map: one new Python file for the tools, a few lines that let the model reach for them, a graph that can now loop, and two new envelope types riding the SSE belt from Part 5. You'll install one new package along the way. Let's give the bot some hands.

Why your bot is a confident liar

The thing to understand before any code: a language model has no calculator inside it and no live connection to anything. It generates the most plausible next word. For "the capital of France" the most plausible continuation is "Paris," and it's right. For "23 times 17" the most plausible next token is some number, and the odds it's exactly 391 are not great.

Tool calling fixes this by giving the model an escape hatch. When it hits something it can't do reliably, it stops guessing and asks your program to run a real function. The calculator computes 391 because it actually multiplies. Search returns real results because it actually queries the web. The model's job shrinks to the thing it's genuinely good at: deciding which tool to reach for and turning the result into a sentence.

What tool calling actually is

Here's the part that trips everyone up, so let's be precise. The model never runs your code. It can't. It has no Python interpreter and no access to your server. What it produces is a request: a little structured note that says "please run the tool named calculator with the argument 23 * 17." Your application reads that note, runs the real function, and hands the answer back to the model.

The handoff in one picture. The model writes down what it wants run; your app runs it; the answer rides back. The model and your code never trade places.

This is the whole idea, and it's why tool calling is safe to reason about: the model is a planner, your app is the only thing that ever executes. That division of labor is also why your first job is to write tools that are safe to hand a planner you don't fully control. Which brings us to the calculator.

A calculator that can't be turned against you

Make a new file, backend/app/tools.py. The obvious way to build a calculator in Python is eval("23 * 17"), and you must never do that. eval runs any Python, and the string is coming from a model that a stranger on the internet is steering. eval("__import__('os').system('rm -rf ~')") is also a valid expression. Hand the model eval and you've handed your server to whoever talks to it.

So we parse the expression into a syntax tree and walk it ourselves, allowing only arithmetic and refusing everything else. Start with the imports and a whitelist of the operators we permit:

PYTHON
import ast
import operator
from langchain_core.tools import tool
_OPERATORS = {
ast.Add: operator.add,
ast.Sub: operator.sub,
ast.Mult: operator.mul,
ast.Div: operator.truediv,
ast.Pow: operator.pow,
ast.USub: operator.neg,
}

ast.parse turns "23 * 17" into a tree of nodes: a multiplication node with two number nodes under it. Our _OPERATORS dict maps the handful of node types we trust to the Python function that performs each one. Anything not in this dict, a function call, a name, an attribute lookup, has no entry at all, and that's the security boundary.

Now the walker. It recurses through the tree, and the only nodes it knows how to handle are plain numbers and our whitelisted operators:

PYTHON
def _eval(node):
if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
return node.value
if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
return _OPERATORS[type(node.op)](_eval(node.left), _eval(node.right))
if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
return _OPERATORS[type(node.op)](_eval(node.operand))
raise ValueError("unsupported expression")

Read the last line as the hero. If the tree contains anything that isn't a number, a binary operator, or a unary minus, none of the if branches match and _eval raises ValueError. __import__('os').system(...) is a function call, which is an ast.Call node, which has no branch, so it dies on raise instead of running. The whitelist is a closed door, not a list of locks to pick.

Last, wrap it as a tool the model can actually see:

PYTHON
@tool
def calculator(expression: str) -> str:
"""Evaluate a basic math expression, e.g. '23 * 17 + 5'."""
tree = ast.parse(expression, mode="eval")
return str(_eval(tree.body))

That docstring matters more than it looks. "Evaluate a basic math expression" is how the model knows this is the thing to reach for when it sees arithmetic. A vague docstring gets a tool the model never calls.

A tool that can see the internet

The calculator you wrote by hand because it's a teaching moment. For web search you'd be reinventing a search API, so we use a prebuilt one: Tavily, a search service shaped for LLMs. It has a free tier that's plenty for this. Sign up, grab an API key from the dashboard, and install the LangChain Tavily integration:

BASH
pip install langchain-tavily

Then add the search tool to the bottom of tools.py. It's a single line, because Tavily's package already ships a ready-made tool:

PYTHON
from langchain_tavily import TavilySearch
web_search = TavilySearch(max_results=3)

TavilySearch is already a tool, the same shape your @tool-decorated calculator produces, so the model can call it the same way. max_results=3 keeps replies tight. Its built-in name, the one you'll see in the stream, is tavily_search.

Now save the file and restart your server. If you skipped a step, you're about to meet a very specific error, and meeting it here on purpose is cheaper than meeting it confused at midnight.

The deliberate break. TavilySearch checks for its key the instant it's constructed, which is the instant the module is imported, which is server startup. No key, no boot.

You saw this shape back in Part 3, when ChatOpenAI couldn't find its key. Same fix, new key. TavilySearch looks for TAVILY_API_KEY the moment it's built, and the server can't even start without it. Add the key to your backend/.env, next to the LLM key that's already there:

BASH
OPENAI_API_KEY=sk-...
TAVILY_API_KEY=tvly-...

One detail, because import order matters here. tools.py constructs TavilySearch at the top level, the moment it's imported, so it needs the .env loaded before that line runs. Add load_dotenv() to the top of tools.py so the file can find its own key no matter who imports it first:

PYTHON
from dotenv import load_dotenv
load_dotenv() # put this at the very top, above the Tavily import

Restart, and the server boots clean. You now have two tools sitting in a file. The model still has no idea they exist.

Teaching the model the tools exist

Open graph.py. The model you built in Part 3 was created with ChatOpenAI(...) or ChatAnthropic(...), whichever tab you picked. Right after that line, bind the tools to it:

PYTHON
from app.tools import calculator, web_search # with your other imports
tools = [calculator, web_search]
llm = llm.bind_tools(tools) # right after you create llm

bind_tools returns a new model object that carries the tool descriptions in every request it makes. We reassign llm to that bound version, so call_model doesn't change by a single character; it still calls llm.invoke(...), but now every call quietly includes "by the way, here are two tools you can ask for."

A graph that can change its mind

Here's the conceptual heart of the part. Until now your graph was a straight line: START to llm to END, every time. But a tool-using bot can't be a straight line, because the model might want to call a tool, see the result, and then decide it needs another one before it answers. The graph has to be able to loop.

The ReAct loop. The model reasons about whether it needs a tool, acts by calling one, observes the result, and reasons again. It leaves the loop only when it has nothing left to look up.

That pattern, reason then act then observe then repeat, is called the ReAct loop, and LangGraph gives you every piece of it prebuilt. You need three changes to the graph: a node that runs tools, an edge that decides whether to use it, and an edge that loops back. Replace your old wiring block in graph.py with this:

PYTHON
from langgraph.prebuilt import ToolNode, tools_condition
builder = StateGraph(State)
builder.add_node("llm", call_model)
builder.add_node("tools", ToolNode(tools))
builder.add_edge(START, "llm")
builder.add_conditional_edges("llm", tools_condition)
builder.add_edge("tools", "llm")
graph = builder.compile()

Three lines are new, and each one maps to a piece of the loop diagram. add_node("tools", ToolNode(tools)) adds the ACT station: ToolNode is LangGraph's prebuilt runner that reads the model's tool-call request, executes the matching function, and puts the result back on the tray. You hand it the same tools list you bound to the model.

add_conditional_edges("llm", tools_condition) is the fork. A normal edge always goes to the same place; a conditional edge runs a function to decide. tools_condition is a prebuilt checker: it looks at what the model just produced and returns "tools" if the model asked for a tool, or "__end__" if it just answered. You don't pass a routing table; tools_condition already returns the right node names. This line replaces your old add_edge("llm", END), because ending is now just one of the two things that can happen.

add_edge("tools", "llm") is the loop. After the tools node runs, the result goes back to the model, which gets to reason again. That single edge is what lets the bot chain "calculate this, then search for that, then answer."

Don't take my word for the shape. Ask LangGraph to draw it, the same way you did in Part 3:

TEXT
graph TD;
__start__([__start__]):::first
llm(llm)
tools(tools)
__end__([__end__]):::last
__start__ --> llm;
llm -.-> __end__;
llm -.-> tools;
tools --> llm;

The dotted lines out of llm are the conditional edge: it goes to one or the other, decided at runtime. The solid line from tools back to llm is your loop. Here's that same graph rendered:

Mermaid

New events on the same belt

The graph runs tools now, but the frontend can't see any of it yet. Time to collect on the design decision from Part 5. Remember the envelope: every SSE message is data: {"type": ..., ...}, and you swore that when tools arrived, the frontend parser wouldn't change, it would just learn new type values. Today's the day that pays off.

graph.astream_events already reports tool activity. Alongside the on_chat_model_stream events you handled in Part 5, the same firehose now emits on_tool_start when a tool begins and on_tool_end when it returns. Forward them as two new envelope types. Open main.py and grow your token_stream:

PYTHON
async def token_stream(message: str):
inputs = {"messages": [HumanMessage(content=message)]}
async for event in graph.astream_events(inputs, version="v2"):
kind = event["event"]
if kind == "on_chat_model_stream":
token = event["data"]["chunk"].content
if token:
yield sse({"type": "token", "content": token})
elif kind == "on_tool_start":
yield sse({
"type": "tool_start",
"name": event["name"],
"args": event["data"]["input"],
})
elif kind == "on_tool_end":
result = event["data"]["output"].content
yield sse({
"type": "tool_end",
"name": event["name"],
"result": result,
})
yield sse({"type": "done"})

The on_chat_model_stream branch is untouched from Part 5; the final answer still streams token by token (the empty chunks the model emits while writing a tool call are dropped by the if token guard you already have). The two new branches each read three fields. event["name"] is the tool's own name, like calculator. event["data"]["input"] is the arguments the model chose, like {"expression": "23 * 17"}. And the result has one gotcha worth a callout.

Save and curl it, with the -N flag from Part 5 so nothing buffers:

BASH
curl -N -X POST http://localhost:8000/chat \
-H "Content-Type: application/json" \
-d '{"message": "what is 23 * 17?"}'
The same wire from Part 5, carrying two new labels. tool_start and tool_end interleave with the token envelopes; the blank-line framing and the done envelope are exactly as you left them.

There they are: tool_start and tool_end on the wire, in between the tokens, framed by the same blank line, closed by the same done. The backend is finished. The frontend just needs to learn what to do when those two new types arrive.

Tool calling, taken literally. The bot can't do 2 + 2 in its head either, but now it knows to reach for the calculator instead of guessing. Knowing which tool to grab is the whole skill.

Showing the work in the UI

Back in frontend/app/page.tsx. Your messages have been { role, content } since Part 4. A tool call doesn't fit that shape; it has a name, arguments, and a result, not a body of text. So widen the Message type into a union:

TSX
type Message =
| { role: "user" | "assistant"; content: string }
| { role: "tool"; name: string; args: Record<string, unknown>; result: string | null };

A tool message starts with result: null, the "running" state, and gets filled in when its tool_end arrives. Add two small helpers next to appendToken, one to start a tool and one to finish it:

TSX
function startTool(name: string, args: Record<string, unknown>) {
setMessages((prev) => [...prev, { role: "tool", name, args, result: null }]);
}
function endTool(result: string) {
setMessages((prev) => {
const next = [...prev];
for (let i = next.length - 1; i >= 0; i--) {
const m = next[i];
if (m.role === "tool" && m.result === null) {
next[i] = { ...m, result };
break;
}
}
return next;
});
}

startTool pushes a new tool bubble in its running state. endTool walks backward to the most recent unfinished tool and fills in its result, the same immutable-update pattern you've used since Part 4: new array, new object, fresh references so React repaints.

There's one more change, and it's a consequence of the loop. In Part 5, when the user sent a message, you added an empty assistant bubble immediately, ready to fill with tokens. But now the bot might call a tool before it says anything, and you don't want an empty grey bubble sitting above the tool call. So stop pre-creating it. On send, add only the user's message:

TSX
setMessages((prev) => [...prev, { role: "user", content: text }]);

And let appendToken create the assistant bubble on demand, the first time a token actually arrives:

TSX
function appendToken(token: string) {
setMessages((prev) => {
const last = prev[prev.length - 1];
if (last && last.role === "assistant") {
const next = [...prev];
next[next.length - 1] = { ...last, content: last.content + token };
return next;
}
return [...prev, { role: "assistant", content: token }];
});
}

If the last message is already an assistant bubble, grow it. If it isn't, because the last thing on screen is a tool result, or the conversation just started, open a fresh one. This is what lets the bot talk, run a tool, and talk again, each turn landing in its own bubble in the right order.

Now the moment Part 5 was built for. Find your reader loop, the one that splits on the blank line and parses each envelope. The parser doesn't change. You add two branches to the part that reacts to type, and nothing else moves:

TSX
for (const part of parts) {
if (!part.startsWith("data: ")) continue;
const envelope = JSON.parse(part.slice(6));
if (envelope.type === "token") appendToken(envelope.content);
else if (envelope.type === "tool_start") startTool(envelope.name, envelope.args);
else if (envelope.type === "tool_end") endTool(envelope.result);
}

That's it. That's the entire reward for designing an envelope in Part 5 instead of streaming bare tokens. Two new lines, no rewrite, no new parser. The buffer, the \n\n split, the kept tail: all of it works untouched, because you taught it to read a shape, not a token.

Last, a component to render a tool bubble so it reads as the bot working, not as a chat message. Add a small ToolCallBubble above your Chat component:

TSX
function ToolCallBubble({ name, args, result }: {
name: string;
args: Record<string, unknown>;
result: string | null;
}) {
const argText = Object.values(args).join(", ");
return (
<div className="text-left">
<span className="inline-flex flex-col gap-0.5 rounded-xl border border-dashed px-3 py-2 font-mono text-xs text-muted-foreground">
<span className="text-foreground">
<span className="font-semibold">{name}</span>({argText})
</span>
<span>{result === null ? "running…" : `${result}`}</span>
</span>
</div>
);
}

The dashed border and monospace font are deliberate: they make a tool call look like a machine doing work, clearly not the same thing as a person or the bot talking. Then, in your messages.map, send tool messages to it and everything else to the bubble you already have:

TSX
{messages.map((m, i) =>
m.role === "tool" ? (
<ToolCallBubble key={i} name={m.name} args={m.args} result={m.result} />
) : (
<div key={i} className={m.role === "user" ? "text-right" : "text-left"}>
{/* the same <span> bubble (with the streaming caret) from Part 5, unchanged */}
</div>
)
)}

Save and send "What is 23 times 17?". A dashed calculator(23 * 17) bubble pops in reading "running…", flips to "→ 391" a beat later, and then the answer streams in below it.

The tool mid-run. The bubble appears the instant tool_start lands and shows 'running…' until tool_end fills in the result. The bot is visibly thinking with its hands.

Watch it reach for both

Now the question this whole part was building toward. Ask it something that needs the calculator and the web in one breath:

What is 23 times 17, and what's the population of Tokyo?

Watch the loop run. The model reasons, decides it needs the calculator, and a calculator(23 * 17) bubble appears and resolves to 391. The result goes back to the model, which reasons again, decides it also needs the web, and a tavily_search bubble appears and resolves with Tokyo's population. That goes back too, and only now, with both facts in hand, does the model write its final answer, streaming in word by word under the two tool bubbles. You can see the dessert shot at the top of this page; that's your bot, running the loop twice and showing every step.

Right now you have: a bot that holds two real tools, decides on its own when to use them, runs them in a loop that can chain one into the next, and narrates every tool call and result to the UI over the same stream you built last time. It stopped being a chatbot. It's an agent.

What you built

Part 6
  • A safe calculator tool built on an ast walker with an operator whitelist, so the model can do exact math without ever running eval on untrusted input.
  • A real web_search tool from Tavily, wired in with one line and its own API key in .env.
  • Tools bound to the model with llm.bind_tools(...), so it can ask for them, and a graph that loops through a prebuilt ToolNode via tools_condition.
  • The ReAct loop running end to end: the bot reasons, acts, observes the result, and reasons again until it can answer.
  • Two new envelope types, tool_start and tool_end, streamed over the Part 5 belt and rendered as live tool bubbles, with the frontend parser unchanged.

Test yourself

Score ··
01

When the model 'calls a tool,' what does it actually do?

02

Why does the calculator walk an ast tree instead of calling Python's eval() on the expression?

03

What does builder.add_conditional_edges('llm', tools_condition) do?

04

Adding tool events to the UI needed only two new lines in the frontend's reader. Why so little?

05

On an on_tool_end event, event['data']['output'] is a ToolMessage. How do you get the result string for the envelope?

Commit it, from the project root, in a terminal that isn't hosting a server:

BASH
git add .
git commit -m "part 6: give the bot a calculator and web search via tools"

Your bot is sharp now, but it has the memory of a goldfish. Tell it your name, then ask what your name is, and it will have no idea, because every message you send starts a brand-new conversation with no past. In Part 7 you'll give it a memory, so it can hold a thread across turns and you can finally have a real conversation.