Series · LangGraph from Scratch · Part 7 of 8
· 27 min read
LangGraph from Scratch, Part 7: Conversation Memory
Your bot finally remembers what you told it. A checkpointer, one thread per conversation, a New chat button, and a sidebar of past chats, all without a database.
langgraph · fastapi · memory · tutorial
Tell your bot from Part 6 your name. It will say something friendly back. Now ask it what your name is. It has no idea. Not because it's broken, but because it never had a chance: every message you send starts a brand-new conversation with no past in it. Your bot is sharp, it has tools, and it forgets you the instant you hit Send.
By the end of this page, that's fixed. Your bot holds a conversation across turns, a "New chat" button starts a clean one, and a sidebar lets you flip between past chats, all with no database and about thirty lines of code.
This part is almost all about one idea with a friendly name: a checkpointer. It's the piece that lets the graph save where it was and pick back up. You'll add it in one line, hit a very loud error on purpose, fix it by teaching your app to label each conversation, and then build the UI on top. Let's cure the amnesia.
Your bot has the memory of a goldfish
Watch the failure first, because naming it is half the fix. You say your name, the bot greets you, you ask it to recall the name, and it draws a complete blank.
Here's why. Every time the frontend calls /chat, the backend runs graph.astream_events with exactly one message in it: the one you just typed. The graph runs, the model answers, the request ends, and everything the graph held is thrown away. The next message starts a fresh run with no trace of the last one. The model isn't forgetting; there's nothing to forget from. Each turn is the bot's first day on the job.
Threads: one notebook per conversation
To remember a conversation, two things have to be true. The history has to be saved somewhere between requests, and your app has to know which saved history belongs to this chat. The second one is the part people skip, and it's where the whole design hinges.
Picture a drawer of notebooks, one per conversation. Each notebook has a label on its spine. When a message comes in carrying the label a1b2, the graph pulls that exact notebook, reads everything written in it so far, adds the new turn, and puts it back. A message labeled 9f3c opens a different notebook entirely. The label is the only thing that decides which memory you get.
Giving the graph a memory
The thing that reads and writes those notebooks is a checkpointer. LangGraph ships one that keeps everything in your computer's memory, InMemorySaver, and wiring it in is genuinely one line of import and one line of use. Open graph.py and find the spot at the bottom where you compile the graph. Right now it reads graph = builder.compile(). Give it a checkpointer:
from langgraph.checkpoint.memory import InMemorySaver
checkpointer = InMemorySaver()graph = builder.compile(checkpointer=checkpointer)That's the whole backend change to enable memory. Everything above it, the state, the llm node, the ToolNode, the conditional edge, stays exactly as you left it in Part 6. compile(checkpointer=...) hands the graph a place to save its state after every step and to reload it before the next run.
The graph now demands to know who it's talking to
Save graph.py, restart the server, and send a message from the UI. Instead of a reply, the backend falls over. Read the error; it's about to tell you exactly what's missing.
ValueError: Checkpointer requires one or more of the following 'configurable' keys: thread_id, checkpoint_ns, checkpoint_id. You saw this family of error in Part 3 (missing API key) and Part 6 (missing Tavily key): the library checks a precondition up front and fails loudly instead of doing something quietly wrong. A graph with a checkpointer must be told which thread it's working on, every single call. You added the memory; now you have to hand it the notebook label. Right now you're passing none, so it stops you.
Telling it which conversation
Two small changes wire the thread through. First, the request needs to carry a thread_id, so add it to your ChatRequest model in main.py:
class ChatRequest(BaseModel): message: str thread_id: strSecond, token_stream has to pass that id into the graph as config. LangGraph reads the thread from a nested dict under the configurable key, and that dict goes in as the second argument to astream_events. Update the generator and the endpoint that calls it:
async def token_stream(message: str, thread_id: str): inputs = {"messages": [HumanMessage(content=message)]} config = {"configurable": {"thread_id": thread_id}} async for event in graph.astream_events(inputs, config, version="v2"): ... # the token / tool_start / tool_end branches from Part 6, unchanged yield sse({"type": "done"})
@app.post("/chat")async def chat(request: ChatRequest): return StreamingResponse( token_stream(request.message, request.thread_id), media_type="text/event-stream", )The body of the event loop doesn't change at all; the three branches you wrote in Part 6 still handle tokens and tool calls exactly as before. The only new thing is config, slotted in as the second argument, carrying the one fact the checkpointer was asking for.
Now look at what inputs still is: a single message, the one the user just typed. Not the whole history. That feels wrong the first time, so here's the picture that makes it click.
Remember the add_messages reducer from Part 3, the rule that appends to the message list instead of replacing it? This is its payoff. The checkpointer loads the saved messages onto the tray, your one new message gets appended by that reducer, the model sees the full conversation, and its reply gets appended and saved. You send one message; the graph remembers the rest. That's the trade the checkpointer makes for you on every turn.
The frontend picks a name for the chat
The backend is ready to remember, but the frontend isn't sending a thread_id yet. Open frontend/app/page.tsx. The plan: on first load, mint a random id and stash it in localStorage so it survives refreshes, then send it with every message.
Add a piece of state for the current thread and an effect that sets it up once:
const [threadId, setThreadId] = useState("");
useEffect(() => { let id = localStorage.getItem("thread_id"); if (!id) { id = crypto.randomUUID(); localStorage.setItem("thread_id", id); } setThreadId(id);}, []);On the first visit there's nothing stored, so crypto.randomUUID() mints a fresh id like a1b2c3d4-... and saves it. On every visit after, the stored id comes straight back, so the same browser keeps talking to the same notebook. Now send it: add thread_id to the body of your fetch:
body: JSON.stringify({ message: text, thread_id: threadId }),Save both files, send your name, then ask for it back. The bot remembers. Same streaming, same tool bubbles, but now the conversation has a spine.
A button that starts fresh
One thread is a great start, but you'll want to begin a clean conversation without the old one bleeding in. That's a "New chat" button, and it's three lines of logic: mint a new id, point localStorage at it, and clear the messages on screen.
function newChat() { const id = crypto.randomUUID(); localStorage.setItem("thread_id", id); setThreadId(id); setMessages([]);}The old conversation isn't deleted; its notebook still sits in the checkpointer under its old id. You've just opened a brand-new blank one and pointed the UI at it. Drop a button in your card header next to the title:
<div className="flex items-center justify-between border-b px-5 py-4"> <span className="font-semibold">Chatbot</span> <Button variant="outline" size="sm" onClick={newChat}> + New chat </Button></div>Click it and the screen wipes clean. Tell the new chat a different name, ask it back: it knows the new name and has no memory of the old thread. Two separate notebooks, exactly as the diagram promised.
Right now you have a bot that holds a real conversation, survives a page refresh, and can start over on demand. For most readers that's a perfect place to stop, and the part is complete. The rest is a stretch goal that's genuinely fun: a sidebar of every chat you've had, click to jump back in.
Stretch: a shelf of past conversations
Here's the honest scope before you start, because this is the one section in the series that adds real surface area. To list past chats you need to remember their ids on the frontend; to reopen one you need to rebuild its messages, which means a new backend endpoint that reads a thread's history out of the checkpointer. None of it is hard, but it's more moving parts than a one-line button. If you'd rather ship what you have, skip to the recap; nothing below changes what you already built.
Reading a thread back out of memory
The checkpointer already holds every conversation. You need a way to ask it for one. LangGraph exposes the saved state through get_state, and there's an async version, aget_state, that fits a async def endpoint cleanly. Add this to main.py:
class StoredMessage(BaseModel): role: str content: str
@app.get("/threads/{thread_id}/messages")async def thread_messages(thread_id: str) -> list[StoredMessage]: config = {"configurable": {"thread_id": thread_id}} snapshot = await graph.aget_state(config) messages = snapshot.values.get("messages", []) return [ StoredMessage(role="user" if m.type == "human" else "assistant", content=m.content) for m in messages if m.type in ("human", "ai") and m.content ]aget_state hands back a snapshot of the thread; snapshot.values is the saved state dict, and .get("messages", []) pulls the list out, falling back to empty for a thread that was never used. Each stored message carries a .type, one of human, ai, or tool, and we keep the human and assistant text, mapping it to the {role, content} shape the frontend already speaks. The and m.content filter drops the empty-bodied turns the model emits while it's calling a tool, so the rebuilt history reads like a clean conversation.
Prove it works before touching the UI. Open a Python shell in your backend and read a thread straight back:
Listing and switching threads in the UI
Back in page.tsx. The frontend needs to remember which threads exist and let you click between them. Keep a small list of { id, title } objects in state and in localStorage, alongside the active thread id:
type Thread = { id: string; title: string };const [threads, setThreads] = useState<Thread[]>([]);
useEffect(() => { setThreads(JSON.parse(localStorage.getItem("threads") ?? "[]"));}, []);When a brand-new thread sends its first message, record it with that message as its title. Add this at the top of sendMessage, right where you append the user's message:
if (messages.length === 0) { const entry = { id: threadId, title: text }; const updated = [entry, ...threads]; setThreads(updated); localStorage.setItem("threads", JSON.stringify(updated));}To reopen a thread, point at its id and pull its messages from the new endpoint. Two small functions handle it:
async function loadThread(id: string) { const res = await fetch(`${API_BASE}/threads/${id}/messages`); if (res.ok) setMessages(await res.json());}
function switchThread(id: string) { localStorage.setItem("thread_id", id); setThreadId(id); loadThread(id);}loadThread asks the backend for a thread's history and drops it into messages; the {role, content} objects it returns already match your assistant and user bubbles, so they render with no extra work. switchThread makes that thread the active one so your next message continues it. Finally, render the list as a sidebar and let the whole layout sit side by side:
<aside className="w-56 shrink-0 border-r p-3"> <Button variant="outline" size="sm" className="w-full" onClick={newChat}> + New chat </Button> <nav className="mt-3 space-y-1"> {threads.map((t) => ( <button key={t.id} onClick={() => switchThread(t.id)} className={`block w-full truncate rounded-md px-3 py-2 text-left text-sm ${ t.id === threadId ? "bg-muted font-medium text-primary" : "text-muted-foreground" }`} > {t.title} </button> ))} </nav></aside>Each button is one saved thread; clicking it swaps the conversation under you. The active one gets the accent treatment so you always know which notebook you're writing in.
The catch hiding in InMemorySaver
You built real memory without a database, and that's the magic trick, but it's worth knowing exactly how the trick works. InMemorySaver keeps every thread in a plain dict inside the running Python process. It's fast, it's zero-setup, and it vanishes the instant that process stops. Restart the server and every conversation is gone, every notebook blank. This is the goldfish bowl labeled PROD from the comic: it works beautifully right up until something restarts.
Here's the full frontend/app/page.tsx after this part, sidebar and all, in case a piece drifted while you wired it up:
"use client";
import { useState, useRef, useEffect, type FormEvent } from "react";import { Button } from "@/components/ui/button";import { Input } from "@/components/ui/input";import { Card } from "@/components/ui/card";
type Message = | { role: "user" | "assistant"; content: string } | { role: "tool"; name: string; args: Record<string, unknown>; result: string | null };
type Thread = { id: string; title: string };
const API_BASE = process.env.NEXT_PUBLIC_API_BASE_URL;
function ToolCallBubble({ name, args, result }: { name: string; args: Record<string, unknown>; result: string | null;}) { const argText = Object.values(args).join(", "); return ( <div className="text-left"> <span className="inline-flex flex-col gap-0.5 rounded-xl border border-dashed px-3 py-2 font-mono text-xs text-muted-foreground"> <span className="text-foreground"> <span className="font-semibold">{name}</span>({argText}) </span> <span>{result === null ? "running…" : `→ ${result}`}</span> </span> </div> );}
export default function Chat() { const [messages, setMessages] = useState<Message[]>([]); const [input, setInput] = useState(""); const [loading, setLoading] = useState(false); const [error, setError] = useState<string | null>(null); const [controller, setController] = useState<AbortController | null>(null); const [threadId, setThreadId] = useState(""); const [threads, setThreads] = useState<Thread[]>([]); const bottomRef = useRef<HTMLDivElement>(null);
useEffect(() => { let id = localStorage.getItem("thread_id"); if (!id) { id = crypto.randomUUID(); localStorage.setItem("thread_id", id); } setThreadId(id); setThreads(JSON.parse(localStorage.getItem("threads") ?? "[]")); }, []);
useEffect(() => { bottomRef.current?.scrollIntoView({ behavior: "smooth" }); }, [messages]);
function appendToken(token: string) { setMessages((prev) => { const last = prev[prev.length - 1]; if (last && last.role === "assistant") { const next = [...prev]; next[next.length - 1] = { ...last, content: last.content + token }; return next; } return [...prev, { role: "assistant", content: token }]; }); }
function startTool(name: string, args: Record<string, unknown>) { setMessages((prev) => [...prev, { role: "tool", name, args, result: null }]); }
function endTool(result: string) { setMessages((prev) => { const next = [...prev]; for (let i = next.length - 1; i >= 0; i--) { const m = next[i]; if (m.role === "tool" && m.result === null) { next[i] = { ...m, result }; break; } } return next; }); }
function newChat() { const id = crypto.randomUUID(); localStorage.setItem("thread_id", id); setThreadId(id); setMessages([]); setError(null); }
async function loadThread(id: string) { const res = await fetch(`${API_BASE}/threads/${id}/messages`); if (res.ok) setMessages(await res.json()); }
function switchThread(id: string) { localStorage.setItem("thread_id", id); setThreadId(id); loadThread(id); }
function stop() { controller?.abort(); }
async function sendMessage(e: FormEvent) { e.preventDefault(); const text = input.trim(); if (!text || loading) return;
if (messages.length === 0) { const updated = [{ id: threadId, title: text }, ...threads]; setThreads(updated); localStorage.setItem("threads", JSON.stringify(updated)); }
setMessages((prev) => [...prev, { role: "user", content: text }]); setInput(""); setLoading(true); setError(null);
const controller = new AbortController(); setController(controller);
try { const res = await fetch(`${API_BASE}/chat`, { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ message: text, thread_id: threadId }), signal: controller.signal, }); if (!res.ok || !res.body) throw new Error();
const reader = res.body.getReader(); const decoder = new TextDecoder(); let buffer = "";
while (true) { const { value, done } = await reader.read(); if (done) break; buffer += decoder.decode(value, { stream: true }); const parts = buffer.split("\n\n"); buffer = parts.pop() ?? ""; for (const part of parts) { if (!part.startsWith("data: ")) continue; const envelope = JSON.parse(part.slice(6)); if (envelope.type === "token") appendToken(envelope.content); else if (envelope.type === "tool_start") startTool(envelope.name, envelope.args); else if (envelope.type === "tool_end") endTool(envelope.result); } } } catch (err) { if ((err as Error).name !== "AbortError") { setError("Could not reach the backend. Is it running on :8000?"); } } finally { setLoading(false); setController(null); } }
return ( <main className="mx-auto flex h-dvh max-w-4xl flex-col p-4"> <Card className="flex flex-1 overflow-hidden"> <aside className="w-56 shrink-0 border-r p-3"> <Button variant="outline" size="sm" className="w-full" onClick={newChat}> + New chat </Button> <nav className="mt-3 space-y-1"> {threads.map((t) => ( <button key={t.id} onClick={() => switchThread(t.id)} className={`block w-full truncate rounded-md px-3 py-2 text-left text-sm ${ t.id === threadId ? "bg-muted font-medium text-primary" : "text-muted-foreground" }`} > {t.title} </button> ))} </nav> </aside>
<div className="flex flex-1 flex-col overflow-hidden"> <div className="border-b px-5 py-4 font-semibold">Chatbot</div> <div className="flex-1 space-y-4 overflow-y-auto p-5"> {messages.map((m, i) => m.role === "tool" ? ( <ToolCallBubble key={i} name={m.name} args={m.args} result={m.result} /> ) : ( <div key={i} className={m.role === "user" ? "text-right" : "text-left"}> <span className={`inline-block max-w-[75%] rounded-2xl px-4 py-2 ${ m.role === "user" ? "bg-primary text-primary-foreground" : "bg-muted" }`}> {m.content} {loading && m.role === "assistant" && i === messages.length - 1 && ( <span className="ml-0.5 animate-pulse">▍</span> )} </span> </div> ) )} <div ref={bottomRef} /> </div> {error && ( <p className="mx-5 mb-2 rounded-md bg-red-50 px-4 py-2 text-sm text-red-700"> {error} </p> )} <form onSubmit={sendMessage} className="flex gap-2 border-t p-4"> <Input value={input} onChange={(e) => setInput(e.target.value)} placeholder="Ask me anything..." disabled={loading} /> {loading ? ( <Button type="button" variant="outline" onClick={stop}> Stop </Button> ) : ( <Button type="submit">Send</Button> )} </form> </div> </Card> </main> );}What you built
Part 7- A checkpointer wired into the graph with one line,
builder.compile(checkpointer=InMemorySaver()), so the graph saves and reloads its own state. - Threads: every conversation is keyed by a
thread_id, generated withcrypto.randomUUID()and kept inlocalStorageso it survives refreshes. - The thread wired end to end: the frontend sends
thread_id, the backend passes it as the graph'sconfig, and the bot reads its own history before each reply. - A
New chatbutton that mints a fresh thread, and a sidebar that lists past chats and reopens them via aGET /threads/{id}/messagesendpoint reading from the checkpointer. - A clear-eyed view of the limit:
InMemorySaverlives in RAM, so every conversation is wiped on restart, the problem Part 8 leaves you ready to solve.
Test yourself
Why did the Part 6 bot forget your name between two messages?
What is a thread_id?
After adding the checkpointer, the server crashed with ValueError: Checkpointer requires one or more of the following 'configurable' keys. Why?
With memory on, why does the backend still send only the newest message to the graph, not the whole history?
The series uses InMemorySaver. What happens to every conversation when the server restarts?
Commit it, from the project root, in a terminal that isn't hosting a server:
git add .git commit -m "part 7: give the bot conversation memory with a checkpointer and threads"Your bot remembers you now, runs tools, and streams its answers, the whole thing humming on your laptop. There's one wall left: it only exists on your machine. In Part 8 you'll put it on the public internet, share the link, and watch it forget everyone the first time you redeploy, which turns out to be the perfect reason to care about everything this part taught you.