Series · LangGraph from Scratch · Part 7 of 8

· 27 min read

LangGraph from Scratch, Part 7: Conversation Memory

Your bot finally remembers what you told it. A checkpointer, one thread per conversation, a New chat button, and a sidebar of past chats, all without a database.

langgraph · fastapi · memory · tutorial

Tell your bot from Part 6 your name. It will say something friendly back. Now ask it what your name is. It has no idea. Not because it's broken, but because it never had a chance: every message you send starts a brand-new conversation with no past in it. Your bot is sharp, it has tools, and it forgets you the instant you hit Send.

By the end of this page, that's fixed. Your bot holds a conversation across turns, a "New chat" button starts a clean one, and a sidebar lets you flip between past chats, all with no database and about thirty lines of code.

Today's destination. You tell it your name, change the subject, ask again, and it remembers. Same graph, same stream; it stopped starting from zero every time.

This part is almost all about one idea with a friendly name: a checkpointer. It's the piece that lets the graph save where it was and pick back up. You'll add it in one line, hit a very loud error on purpose, fix it by teaching your app to label each conversation, and then build the UI on top. Let's cure the amnesia.

Your bot has the memory of a goldfish

Watch the failure first, because naming it is half the fix. You say your name, the bot greets you, you ask it to recall the name, and it draws a complete blank.

The bug, met on purpose. The bot greeted Alex one message ago and now swears they've never met. Nothing is broken; nothing is being remembered.

Here's why. Every time the frontend calls /chat, the backend runs graph.astream_events with exactly one message in it: the one you just typed. The graph runs, the model answers, the request ends, and everything the graph held is thrown away. The next message starts a fresh run with no trace of the last one. The model isn't forgetting; there's nothing to forget from. Each turn is the bot's first day on the job.

In-memory state, taken literally. A goldfish has a famous nine-second memory; a bot whose only memory is one request is worse. The PROD label is the foreshadowing for Part 8.

Threads: one notebook per conversation

To remember a conversation, two things have to be true. The history has to be saved somewhere between requests, and your app has to know which saved history belongs to this chat. The second one is the part people skip, and it's where the whole design hinges.

Picture a drawer of notebooks, one per conversation. Each notebook has a label on its spine. When a message comes in carrying the label a1b2, the graph pulls that exact notebook, reads everything written in it so far, adds the new turn, and puts it back. A message labeled 9f3c opens a different notebook entirely. The label is the only thing that decides which memory you get.

The whole mental model. The thread_id rides in with the request, the checkpointer opens the matching notebook, and the rest of the graph never has to think about it.

Giving the graph a memory

The thing that reads and writes those notebooks is a checkpointer. LangGraph ships one that keeps everything in your computer's memory, InMemorySaver, and wiring it in is genuinely one line of import and one line of use. Open graph.py and find the spot at the bottom where you compile the graph. Right now it reads graph = builder.compile(). Give it a checkpointer:

PYTHON
from langgraph.checkpoint.memory import InMemorySaver
checkpointer = InMemorySaver()
graph = builder.compile(checkpointer=checkpointer)

That's the whole backend change to enable memory. Everything above it, the state, the llm node, the ToolNode, the conditional edge, stays exactly as you left it in Part 6. compile(checkpointer=...) hands the graph a place to save its state after every step and to reload it before the next run.

The graph now demands to know who it's talking to

Save graph.py, restart the server, and send a message from the UI. Instead of a reply, the backend falls over. Read the error; it's about to tell you exactly what's missing.

The deliberate break. The moment the graph has a memory, it refuses to run without knowing which conversation to use. This is LangGraph protecting you from silently mixing everyone's chats into one notebook.

ValueError: Checkpointer requires one or more of the following 'configurable' keys: thread_id, checkpoint_ns, checkpoint_id. You saw this family of error in Part 3 (missing API key) and Part 6 (missing Tavily key): the library checks a precondition up front and fails loudly instead of doing something quietly wrong. A graph with a checkpointer must be told which thread it's working on, every single call. You added the memory; now you have to hand it the notebook label. Right now you're passing none, so it stops you.

Telling it which conversation

Two small changes wire the thread through. First, the request needs to carry a thread_id, so add it to your ChatRequest model in main.py:

PYTHON
class ChatRequest(BaseModel):
message: str
thread_id: str

Second, token_stream has to pass that id into the graph as config. LangGraph reads the thread from a nested dict under the configurable key, and that dict goes in as the second argument to astream_events. Update the generator and the endpoint that calls it:

PYTHON
async def token_stream(message: str, thread_id: str):
inputs = {"messages": [HumanMessage(content=message)]}
config = {"configurable": {"thread_id": thread_id}}
async for event in graph.astream_events(inputs, config, version="v2"):
... # the token / tool_start / tool_end branches from Part 6, unchanged
yield sse({"type": "done"})
@app.post("/chat")
async def chat(request: ChatRequest):
return StreamingResponse(
token_stream(request.message, request.thread_id),
media_type="text/event-stream",
)

The body of the event loop doesn't change at all; the three branches you wrote in Part 6 still handle tokens and tool calls exactly as before. The only new thing is config, slotted in as the second argument, carrying the one fact the checkpointer was asking for.

Now look at what inputs still is: a single message, the one the user just typed. Not the whole history. That feels wrong the first time, so here's the picture that makes it click.

Why you only send the newest message. The checkpointer loads the saved history onto the tray before the node runs, and saves the grown tray after. You supply one line; it supplies the rest.

Remember the add_messages reducer from Part 3, the rule that appends to the message list instead of replacing it? This is its payoff. The checkpointer loads the saved messages onto the tray, your one new message gets appended by that reducer, the model sees the full conversation, and its reply gets appended and saved. You send one message; the graph remembers the rest. That's the trade the checkpointer makes for you on every turn.

The frontend picks a name for the chat

The backend is ready to remember, but the frontend isn't sending a thread_id yet. Open frontend/app/page.tsx. The plan: on first load, mint a random id and stash it in localStorage so it survives refreshes, then send it with every message.

Add a piece of state for the current thread and an effect that sets it up once:

TSX
const [threadId, setThreadId] = useState("");
useEffect(() => {
let id = localStorage.getItem("thread_id");
if (!id) {
id = crypto.randomUUID();
localStorage.setItem("thread_id", id);
}
setThreadId(id);
}, []);

On the first visit there's nothing stored, so crypto.randomUUID() mints a fresh id like a1b2c3d4-... and saves it. On every visit after, the stored id comes straight back, so the same browser keeps talking to the same notebook. Now send it: add thread_id to the body of your fetch:

TSX
body: JSON.stringify({ message: text, thread_id: threadId }),

Save both files, send your name, then ask for it back. The bot remembers. Same streaming, same tool bubbles, but now the conversation has a spine.

The payoff. The same shot from the top of the page, now real. The thread_id rode along on every request, the checkpointer kept the notebook, and the bot read its own history before answering.

A button that starts fresh

One thread is a great start, but you'll want to begin a clean conversation without the old one bleeding in. That's a "New chat" button, and it's three lines of logic: mint a new id, point localStorage at it, and clear the messages on screen.

TSX
function newChat() {
const id = crypto.randomUUID();
localStorage.setItem("thread_id", id);
setThreadId(id);
setMessages([]);
}

The old conversation isn't deleted; its notebook still sits in the checkpointer under its old id. You've just opened a brand-new blank one and pointed the UI at it. Drop a button in your card header next to the title:

TSX
<div className="flex items-center justify-between border-b px-5 py-4">
<span className="font-semibold">Chatbot</span>
<Button variant="outline" size="sm" onClick={newChat}>
+ New chat
</Button>
</div>

Click it and the screen wipes clean. Tell the new chat a different name, ask it back: it knows the new name and has no memory of the old thread. Two separate notebooks, exactly as the diagram promised.

Right now you have a bot that holds a real conversation, survives a page refresh, and can start over on demand. For most readers that's a perfect place to stop, and the part is complete. The rest is a stretch goal that's genuinely fun: a sidebar of every chat you've had, click to jump back in.

Stretch: a shelf of past conversations

Here's the honest scope before you start, because this is the one section in the series that adds real surface area. To list past chats you need to remember their ids on the frontend; to reopen one you need to rebuild its messages, which means a new backend endpoint that reads a thread's history out of the checkpointer. None of it is hard, but it's more moving parts than a one-line button. If you'd rather ship what you have, skip to the recap; nothing below changes what you already built.

Reading a thread back out of memory

The checkpointer already holds every conversation. You need a way to ask it for one. LangGraph exposes the saved state through get_state, and there's an async version, aget_state, that fits a async def endpoint cleanly. Add this to main.py:

PYTHON
class StoredMessage(BaseModel):
role: str
content: str
@app.get("/threads/{thread_id}/messages")
async def thread_messages(thread_id: str) -> list[StoredMessage]:
config = {"configurable": {"thread_id": thread_id}}
snapshot = await graph.aget_state(config)
messages = snapshot.values.get("messages", [])
return [
StoredMessage(role="user" if m.type == "human" else "assistant", content=m.content)
for m in messages
if m.type in ("human", "ai") and m.content
]

aget_state hands back a snapshot of the thread; snapshot.values is the saved state dict, and .get("messages", []) pulls the list out, falling back to empty for a thread that was never used. Each stored message carries a .type, one of human, ai, or tool, and we keep the human and assistant text, mapping it to the {role, content} shape the frontend already speaks. The and m.content filter drops the empty-bodied turns the model emits while it's calling a tool, so the rebuilt history reads like a clean conversation.

Prove it works before touching the UI. Open a Python shell in your backend and read a thread straight back:

The conversation, read back out of the checkpointer. This is exactly what the new endpoint serves: the saved history of one thread, addressed by its id.

Listing and switching threads in the UI

Back in page.tsx. The frontend needs to remember which threads exist and let you click between them. Keep a small list of { id, title } objects in state and in localStorage, alongside the active thread id:

TSX
type Thread = { id: string; title: string };
const [threads, setThreads] = useState<Thread[]>([]);
useEffect(() => {
setThreads(JSON.parse(localStorage.getItem("threads") ?? "[]"));
}, []);

When a brand-new thread sends its first message, record it with that message as its title. Add this at the top of sendMessage, right where you append the user's message:

TSX
if (messages.length === 0) {
const entry = { id: threadId, title: text };
const updated = [entry, ...threads];
setThreads(updated);
localStorage.setItem("threads", JSON.stringify(updated));
}

To reopen a thread, point at its id and pull its messages from the new endpoint. Two small functions handle it:

TSX
async function loadThread(id: string) {
const res = await fetch(`${API_BASE}/threads/${id}/messages`);
if (res.ok) setMessages(await res.json());
}
function switchThread(id: string) {
localStorage.setItem("thread_id", id);
setThreadId(id);
loadThread(id);
}

loadThread asks the backend for a thread's history and drops it into messages; the {role, content} objects it returns already match your assistant and user bubbles, so they render with no extra work. switchThread makes that thread the active one so your next message continues it. Finally, render the list as a sidebar and let the whole layout sit side by side:

TSX
<aside className="w-56 shrink-0 border-r p-3">
<Button variant="outline" size="sm" className="w-full" onClick={newChat}>
+ New chat
</Button>
<nav className="mt-3 space-y-1">
{threads.map((t) => (
<button
key={t.id}
onClick={() => switchThread(t.id)}
className={`block w-full truncate rounded-md px-3 py-2 text-left text-sm ${
t.id === threadId ? "bg-muted font-medium text-primary" : "text-muted-foreground"
}`}
>
{t.title}
</button>
))}
</nav>
</aside>

Each button is one saved thread; clicking it swaps the conversation under you. The active one gets the accent treatment so you always know which notebook you're writing in.

The finished thing. Every conversation is a notebook on the shelf; click one to open it. Each lives under its own thread_id in the same checkpointer the bot reads on every turn.

The catch hiding in InMemorySaver

You built real memory without a database, and that's the magic trick, but it's worth knowing exactly how the trick works. InMemorySaver keeps every thread in a plain dict inside the running Python process. It's fast, it's zero-setup, and it vanishes the instant that process stops. Restart the server and every conversation is gone, every notebook blank. This is the goldfish bowl labeled PROD from the comic: it works beautifully right up until something restarts.

Here's the full frontend/app/page.tsx after this part, sidebar and all, in case a piece drifted while you wired it up:

TSX
"use client";
import { useState, useRef, useEffect, type FormEvent } from "react";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import { Card } from "@/components/ui/card";
type Message =
| { role: "user" | "assistant"; content: string }
| { role: "tool"; name: string; args: Record<string, unknown>; result: string | null };
type Thread = { id: string; title: string };
const API_BASE = process.env.NEXT_PUBLIC_API_BASE_URL;
function ToolCallBubble({ name, args, result }: {
name: string;
args: Record<string, unknown>;
result: string | null;
}) {
const argText = Object.values(args).join(", ");
return (
<div className="text-left">
<span className="inline-flex flex-col gap-0.5 rounded-xl border border-dashed px-3 py-2 font-mono text-xs text-muted-foreground">
<span className="text-foreground">
<span className="font-semibold">{name}</span>({argText})
</span>
<span>{result === null ? "running…" : `${result}`}</span>
</span>
</div>
);
}
export default function Chat() {
const [messages, setMessages] = useState<Message[]>([]);
const [input, setInput] = useState("");
const [loading, setLoading] = useState(false);
const [error, setError] = useState<string | null>(null);
const [controller, setController] = useState<AbortController | null>(null);
const [threadId, setThreadId] = useState("");
const [threads, setThreads] = useState<Thread[]>([]);
const bottomRef = useRef<HTMLDivElement>(null);
useEffect(() => {
let id = localStorage.getItem("thread_id");
if (!id) {
id = crypto.randomUUID();
localStorage.setItem("thread_id", id);
}
setThreadId(id);
setThreads(JSON.parse(localStorage.getItem("threads") ?? "[]"));
}, []);
useEffect(() => {
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
}, [messages]);
function appendToken(token: string) {
setMessages((prev) => {
const last = prev[prev.length - 1];
if (last && last.role === "assistant") {
const next = [...prev];
next[next.length - 1] = { ...last, content: last.content + token };
return next;
}
return [...prev, { role: "assistant", content: token }];
});
}
function startTool(name: string, args: Record<string, unknown>) {
setMessages((prev) => [...prev, { role: "tool", name, args, result: null }]);
}
function endTool(result: string) {
setMessages((prev) => {
const next = [...prev];
for (let i = next.length - 1; i >= 0; i--) {
const m = next[i];
if (m.role === "tool" && m.result === null) {
next[i] = { ...m, result };
break;
}
}
return next;
});
}
function newChat() {
const id = crypto.randomUUID();
localStorage.setItem("thread_id", id);
setThreadId(id);
setMessages([]);
setError(null);
}
async function loadThread(id: string) {
const res = await fetch(`${API_BASE}/threads/${id}/messages`);
if (res.ok) setMessages(await res.json());
}
function switchThread(id: string) {
localStorage.setItem("thread_id", id);
setThreadId(id);
loadThread(id);
}
function stop() {
controller?.abort();
}
async function sendMessage(e: FormEvent) {
e.preventDefault();
const text = input.trim();
if (!text || loading) return;
if (messages.length === 0) {
const updated = [{ id: threadId, title: text }, ...threads];
setThreads(updated);
localStorage.setItem("threads", JSON.stringify(updated));
}
setMessages((prev) => [...prev, { role: "user", content: text }]);
setInput("");
setLoading(true);
setError(null);
const controller = new AbortController();
setController(controller);
try {
const res = await fetch(`${API_BASE}/chat`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ message: text, thread_id: threadId }),
signal: controller.signal,
});
if (!res.ok || !res.body) throw new Error();
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
const { value, done } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const parts = buffer.split("\n\n");
buffer = parts.pop() ?? "";
for (const part of parts) {
if (!part.startsWith("data: ")) continue;
const envelope = JSON.parse(part.slice(6));
if (envelope.type === "token") appendToken(envelope.content);
else if (envelope.type === "tool_start") startTool(envelope.name, envelope.args);
else if (envelope.type === "tool_end") endTool(envelope.result);
}
}
} catch (err) {
if ((err as Error).name !== "AbortError") {
setError("Could not reach the backend. Is it running on :8000?");
}
} finally {
setLoading(false);
setController(null);
}
}
return (
<main className="mx-auto flex h-dvh max-w-4xl flex-col p-4">
<Card className="flex flex-1 overflow-hidden">
<aside className="w-56 shrink-0 border-r p-3">
<Button variant="outline" size="sm" className="w-full" onClick={newChat}>
+ New chat
</Button>
<nav className="mt-3 space-y-1">
{threads.map((t) => (
<button
key={t.id}
onClick={() => switchThread(t.id)}
className={`block w-full truncate rounded-md px-3 py-2 text-left text-sm ${
t.id === threadId ? "bg-muted font-medium text-primary" : "text-muted-foreground"
}`}
>
{t.title}
</button>
))}
</nav>
</aside>
<div className="flex flex-1 flex-col overflow-hidden">
<div className="border-b px-5 py-4 font-semibold">Chatbot</div>
<div className="flex-1 space-y-4 overflow-y-auto p-5">
{messages.map((m, i) =>
m.role === "tool" ? (
<ToolCallBubble key={i} name={m.name} args={m.args} result={m.result} />
) : (
<div key={i} className={m.role === "user" ? "text-right" : "text-left"}>
<span className={`inline-block max-w-[75%] rounded-2xl px-4 py-2 ${
m.role === "user" ? "bg-primary text-primary-foreground" : "bg-muted"
}`}>
{m.content}
{loading && m.role === "assistant" && i === messages.length - 1 && (
<span className="ml-0.5 animate-pulse"></span>
)}
</span>
</div>
)
)}
<div ref={bottomRef} />
</div>
{error && (
<p className="mx-5 mb-2 rounded-md bg-red-50 px-4 py-2 text-sm text-red-700">
{error}
</p>
)}
<form onSubmit={sendMessage} className="flex gap-2 border-t p-4">
<Input
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Ask me anything..."
disabled={loading}
/>
{loading ? (
<Button type="button" variant="outline" onClick={stop}>
Stop
</Button>
) : (
<Button type="submit">Send</Button>
)}
</form>
</div>
</Card>
</main>
);
}

What you built

Part 7
  • A checkpointer wired into the graph with one line, builder.compile(checkpointer=InMemorySaver()), so the graph saves and reloads its own state.
  • Threads: every conversation is keyed by a thread_id, generated with crypto.randomUUID() and kept in localStorage so it survives refreshes.
  • The thread wired end to end: the frontend sends thread_id, the backend passes it as the graph's config, and the bot reads its own history before each reply.
  • A New chat button that mints a fresh thread, and a sidebar that lists past chats and reopens them via a GET /threads/{id}/messages endpoint reading from the checkpointer.
  • A clear-eyed view of the limit: InMemorySaver lives in RAM, so every conversation is wiped on restart, the problem Part 8 leaves you ready to solve.

Test yourself

Score ··
01

Why did the Part 6 bot forget your name between two messages?

02

What is a thread_id?

03

After adding the checkpointer, the server crashed with ValueError: Checkpointer requires one or more of the following 'configurable' keys. Why?

04

With memory on, why does the backend still send only the newest message to the graph, not the whole history?

05

The series uses InMemorySaver. What happens to every conversation when the server restarts?

Commit it, from the project root, in a terminal that isn't hosting a server:

BASH
git add .
git commit -m "part 7: give the bot conversation memory with a checkpointer and threads"

Your bot remembers you now, runs tools, and streams its answers, the whole thing humming on your laptop. There's one wall left: it only exists on your machine. In Part 8 you'll put it on the public internet, share the link, and watch it forget everyone the first time you redeploy, which turns out to be the perfect reason to care about everything this part taught you.