dev_to · March 15, 2026


Building Conversational AI Agents That Remember: LangGraph, Postgres Checkpointing, and the Future of Financial UX

Tags: langgraph, financial-ai, conversational-ai, postgres-checkpointing, llm-state-management



How interrupt/resume graph topology turns stateless LLMs into stateful financial advisors, and why this changes everything for CFO-facing AI products.

Every demo of a financial AI agent looks the same: the user asks a question, the agent answers, end of story. One shot. One turn. The agent forgets you exist the moment the response is sent.

But real financial conversations don't work that way. A CFO doesn't ask a single question and walk away. She starts with "What drove the variance in OPEX this quarter?", gets an answer, then drills down: "Break that out by department." Then pivots: "OK, run a scenario where we delay the European expansion by one quarter - what happens to our cash runway?" Each question builds on the last. Context accumulates. The agent needs to remember where the conversation has been, what analyses it has already run, and what the user cares about.

This is the gap between AI demos and AI products. And closing it requires a fundamentally different architecture.

I recently had the opportunity to build a conversational AI agent with multi-turn memory, interrupt/resume capabilities, and persistent state stored in Postgres. The patterns I discovered apply directly to financial AI, and I believe they represent a UX paradigm shift for how CFOs and finance teams will interact with AI systems. This article walks through the architecture, the core ideas, and the implications for financial products.

Most agent frameworks treat each invocation as independent. The user sends a message, the agent processes it, returns a response, and the entire computational graph, along with all intermediate state, evaporates. For simple Q&A, this works. For financial workflows, it's a disaster. Consider what a real financial conversation looks like:

Turn 1: "What was our revenue growth rate last quarter?"
Turn 2: "How does that compare to our three closest competitors?"
Turn 3: "Pull the gross margin trends for the same period."
Turn 4: "Based on all of this, draft a board commentary paragraph."

By turn 4, the agent needs to remember the revenue figures from turn 1, the competitive data from turn 2, and the margin analysis from turn 3. Without persistent state, each turn starts from scratch. The user is forced to repeat context, re-upload documents, and re-explain what they're trying to accomplish. This isn't just an inconvenience: it's a fundamental UX failure that prevents AI from replacing the iterative, conversational workflow that finance professionals actually use.

The solution relies on three primitives from LangGraph working together:

1. A looping graph topology where the agent responds, waits for human input, and loops back
2. `interrupt()` to suspend execution mid-graph and persist state
3. A Postgres checkpointer that saves the full graph state to a database at every suspension point

Here's the conversation lifecycle in plain terms:

```
User sends message
        ↓
Agent processes message + full history
        ↓
Agent responds, decides it needs more input
        ↓
interrupt() is called
        ↓
Full state → serialized to Postgres
        ↓
... minutes, hours, days pass ...
        ↓
User sends a follow-up message
        ↓
Graph resumes from the Postgres checkpoint
        ↓
New message is injected into conversation history
        ↓
Agent processes everything (old + new context)
        ↓
(cycle repeats until conversation is resolved)
```

The critical insight: the graph doesn't terminate between turns. It suspends. The entire state (message history, turn counter, intermediate results, routing decisions) is serialized to Postgres. When the user comes back, the graph resumes exactly where it left off.

Let's build this step by step. The first decision is what to persist across turns.
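Before wiring up LangGraph, the suspend/serialize/resume cycle just described can be illustrated with a framework-free sketch. This is a toy illustration, not the library's mechanism: state is a plain dict serialized to JSON between turns, and the `_checkpoints` dict stands in for the Postgres checkpoint table.

```python
import json

# Toy stand-in for a checkpoint store (Postgres in the real system).
# Keys are thread IDs, values are JSON-serialized conversation state.
_checkpoints: dict[str, str] = {}

def run_turn(thread_id: str, user_message: str) -> dict:
    """Load state, process one turn, then 'suspend' by serializing state."""
    # Resume: load the checkpoint if this thread already exists.
    raw = _checkpoints.get(thread_id)
    state = json.loads(raw) if raw else {"messages": [], "turn": 0}

    # Process: append the new message and a (canned) agent reply.
    state["messages"].append({"role": "user", "content": user_message})
    state["messages"].append({"role": "agent", "content": f"ack turn {state['turn']}"})
    state["turn"] += 1

    # Suspend: serialize the full state back to the store.
    _checkpoints[thread_id] = json.dumps(state)
    return state

# Two turns on the same thread: the second resumes with full history.
first = run_turn("conversation-001", "What was our OPEX last quarter?")
second = run_turn("conversation-001", "Break that out by department")
```

Between the two calls, the process holds no in-memory conversation state; everything round-trips through the serialized checkpoint. That is exactly the property the Postgres checkpointer provides, except it also survives process restarts.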
LangGraph uses a TypedDict as the state schema:

```python
from typing import Annotated, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph import add_messages

class ChatState(TypedDict):
    # Conversation history: new messages are appended automatically
    messages: Annotated[list[BaseMessage], add_messages]
    # Whether the agent needs more input from the user
    awaiting_input: bool
    # How many turns the conversation has gone through
    turn: int
```

The add_messages annotation is a LangGraph reducer: it tells the framework to append new messages to the existing list rather than overwriting it. This is how conversation history accumulates across turns without any manual bookkeeping. awaiting_input is the flag the LLM sets when it decides it needs more information from the user. It drives the routing logic that determines whether to suspend the graph or end the conversation.

This is a minimal example. In a real financial agent, you'd add fields for accumulated analysis results, which specialized tools have been called, and any structured data the agent has gathered. The principle is the same: everything the agent needs to remember goes into the state, and the checkpointer handles persistence automatically.

The graph creates a cycle between two nodes: the agent and a "human gate" that suspends execution:

```python
from langchain_core.messages import AIMessage, SystemMessage
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
from langgraph.constants import END
from langgraph.graph import StateGraph
from langgraph.types import interrupt

# `llm` and `SYSTEM_PROMPT` are assumed to be defined elsewhere,
# e.g. llm = ChatOpenAI(model="gpt-4o").
async def agent_node(state: ChatState) -> dict:
    """
    The agent node. It receives the full conversation history, reasons
    over it, and decides whether to continue or wait for more input.
    """
    # In production, you'd use with_structured_output() here to get a
    # typed response with an explicit awaiting_input flag. For
    # simplicity, this example uses a plain LLM call.
    response = await llm.ainvoke(
        [SystemMessage(content=SYSTEM_PROMPT)] + state["messages"]
    )

    # Determine if we need more input (simplified logic)
    needs_input = "?" in response.content  # naive heuristic for demo

    return {
        "messages": [AIMessage(content=response.content)],
        "awaiting_input": needs_input,
        "turn": state["turn"] + 1,
    }

async def human_gate(state: ChatState) -> dict:
    """
    Suspends the graph and waits for the user's next message.

    interrupt() does three things:
    1. Triggers the checkpointer to save full state to Postgres
    2. Halts execution of the graph
    3. Returns the user's new message when the graph resumes
    """
    user_message = interrupt("Waiting for user")
    return {
        "messages": [user_message],
        "awaiting_input": False,
    }

def route(state: ChatState) -> str:
    """Send to human gate if the agent wants more input, otherwise end."""
    return "human_gate" if state["awaiting_input"] else "end"

# Assemble the graph
builder = StateGraph(ChatState)
builder.add_node("agent", agent_node)
builder.add_node("human_gate", human_gate)
builder.set_entry_point("agent")
builder.add_conditional_edges("agent", route, {"human_gate": "human_gate", "end": END})
builder.add_edge("human_gate", "agent")

# Compile with checkpointer: this is what makes interrupt() work.
# Note: in recent langgraph-checkpoint-postgres releases, from_conn_string()
# is an async context manager, used as:
#   async with AsyncPostgresSaver.from_conn_string("postgresql://...") as checkpointer:
checkpointer = AsyncPostgresSaver.from_conn_string("postgresql://...")
await checkpointer.setup()  # creates checkpoint tables (idempotent)
graph = builder.compile(checkpointer=checkpointer)
```

This creates the following topology:

```
entry → agent → [awaiting_input=True]  → human_gate → (back to agent)
              → [awaiting_input=False] → END
```

Without the checkpointer, interrupt() would raise an error: there's nowhere to persist the state. The checkpointer is not optional infrastructure; it's a structural requirement of the interrupt/resume pattern.
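The `"?" in response.content` check above is a deliberately naive placeholder. A sturdier pattern is to have the LLM emit a typed decision object and route on its fields; with LangChain you would bind a schema via `llm.with_structured_output(...)`. The sketch below is framework-free: `RouteDecision` and `route_from_decision` are illustrative names, with a plain dataclass standing in for the Pydantic model you would bind to the chat model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RouteDecision:
    """Typed decision the LLM is asked to emit instead of free text.

    With LangChain this would be a Pydantic model bound via
    llm.with_structured_output(RouteDecision).
    """
    reply: str                 # the natural-language answer shown to the user
    awaiting_input: bool       # does the agent need another user message?
    next_agent: Optional[str]  # hand off to a specialist, e.g. "revenue"

def route_from_decision(decision: RouteDecision) -> str:
    """Map a typed decision onto graph edges; no string parsing involved."""
    if decision.next_agent is not None:
        return f"{decision.next_agent}_agent"
    return "human_gate" if decision.awaiting_input else "end"

# A decision that needs more user input suspends at the human gate;
# a final answer with no follow-up ends the conversation.
suspend = route_from_decision(RouteDecision("Which quarter?", True, None))
finish = route_from_decision(RouteDecision("Here is the breakdown.", False, None))
```

Because routing reads explicit booleans and enums rather than scanning free text, a rephrased LLM answer can never silently change the control flow.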
On the application side, you invoke the graph with a thread_id that identifies the conversation:

```python
from langchain_core.messages import HumanMessage
from langgraph.types import Command

thread_config = {"configurable": {"thread_id": "conversation-001"}}

# First turn: start the conversation
result = await graph.ainvoke(
    {
        "messages": [HumanMessage(content="What was our OPEX last quarter?")],
        "awaiting_input": False,
        "turn": 0,
    },
    config=thread_config,
)

# ... time passes, user comes back ...

# Second turn: resume from the interrupt. Command(resume=...) delivers
# the new message as the return value of interrupt() in human_gate.
result = await graph.ainvoke(
    Command(resume=HumanMessage(content="Break that out by department")),
    config=thread_config,
)

# Third turn: still the same thread, full history available
result = await graph.ainvoke(
    Command(resume=HumanMessage(content="Draft a board paragraph from this")),
    config=thread_config,
)
```

Same thread_id = same conversation = resume from the last checkpoint. The graph loads the full state from Postgres before processing each new message. By turn 3, the agent has the OPEX figures from turn 1, the departmental breakdown from turn 2, and the full reasoning chain, all without the user repeating anything.

The pattern becomes truly powerful when the conversational agent can delegate to specialized agents.
Instead of one monolithic LLM doing everything, you have an orchestrator that routes to domain experts:

```python
# run_revenue_analysis and run_forecast_model are hypothetical domain
# helpers; each sub-agent can use its own tools, prompts, and model.
async def revenue_agent(state: ChatState) -> dict:
    """Specialized agent for revenue analysis."""
    analysis = await run_revenue_analysis(state["messages"])
    return {"messages": [AIMessage(content=analysis)]}

async def forecast_agent(state: ChatState) -> dict:
    """Specialized agent for scenario modeling."""
    forecast = await run_forecast_model(state["messages"])
    return {"messages": [AIMessage(content=forecast)]}

# Extended routing. Assumes ChatState gains a `next_agent: str | None`
# field that the orchestrator sets.
def route(state: ChatState) -> str:
    if state.get("next_agent") == "revenue":
        return "revenue_agent"
    if state.get("next_agent") == "forecast":
        return "forecast_agent"
    if state["awaiting_input"]:
        return "human_gate"
    return "end"

# Register sub-agents and route them back to the orchestrator
builder.add_node("revenue_agent", revenue_agent)
builder.add_node("forecast_agent", forecast_agent)
builder.add_edge("revenue_agent", "agent")
builder.add_edge("forecast_agent", "agent")
```

Now the conversation flow becomes:

```
User: "Compare our margins to competitors"
  → agent decides: need margin data first
  → routes to revenue_agent
  → revenue_agent returns results into state
  → agent synthesizes, responds to user
  → interrupt() → state saved to Postgres

User: "Now model what happens if we cut R&D by 10%"
  → graph resumes from checkpoint
  → agent decides: need forecast model
  → routes to forecast_agent
  → forecast_agent runs scenario, returns results
  → agent combines revenue analysis + forecast
  → responds with comprehensive answer
```

The user experiences a natural conversation. Behind the scenes, multiple specialized agents are being orchestrated, their results accumulated in state, and the entire history persisted across turns. Each sub-agent can use different tools, different prompts, even different LLM models; the conversational agent just cares about results.

This architecture isn't just a technical pattern. It's a UX paradigm shift for financial AI products, and here's why it matters.

Today's financial AI tools are essentially search engines with natural language wrappers.
You ask a question, you get an answer. The interaction model is transactional. The interrupt/resume pattern enables a fundamentally different model: conversations. A CFO can start an analysis, drill down into anomalies, pivot to scenario modeling, and build up to a complex deliverable (a board presentation, a variance analysis, a budget recommendation) over multiple turns. The AI maintains full context throughout.

This mirrors how CFOs actually work with their FP&A teams. You don't hand your analyst a single question and wait for a report. You have a conversation. You iterate. You refine. The conversation is the interface.

Not every financial question has an instant answer. Some analyses require running complex models, querying multiple data sources, or waiting for market data feeds. With the interrupt/resume pattern, the agent can say "I'm running the Monte Carlo simulation on your revenue scenarios; I'll notify you when results are ready" and checkpoint its state. When the computation finishes, the conversation resumes where it left off. This opens the door to financial AI that handles genuinely complex workflows: multi-day budget review processes, iterative forecast refinement, or collaborative analysis sessions where the CFO and the AI work through a problem over the course of a week.

Every checkpoint is a serialized snapshot of the full conversation state at a specific point in time. This means you get a complete, immutable audit trail of every decision, every analysis, and every piece of data the agent considered, as a natural byproduct of the architecture. In financial services, where regulatory compliance demands traceability, this isn't a feature. It's table stakes.

You can query the checkpoint history for any conversation thread and reconstruct exactly what the agent knew, what it recommended, and why, at any point in the conversation. No additional logging infrastructure required.
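In LangGraph, a compiled graph exposes this history via `get_state_history(config)`, which yields the thread's checkpoints from newest to oldest. The framework-free sketch below illustrates the underlying idea: each turn appends an immutable JSON snapshot, and the state at any past turn can be reconstructed by index. The snapshot format and helper names here are illustrative, not LangGraph's.

```python
import json

# Append-only audit log: one immutable JSON snapshot per checkpoint,
# mimicking what the Postgres checkpoint table accumulates per thread.
audit_log: list[str] = []

def checkpoint(state: dict) -> None:
    """Record an immutable snapshot of the full conversation state."""
    audit_log.append(json.dumps(state, sort_keys=True))

def state_at(turn_index: int) -> dict:
    """Reconstruct exactly what the agent knew after a given turn."""
    return json.loads(audit_log[turn_index])

# Simulate three turns of accumulating analysis (toy data).
checkpoint({"turn": 1, "facts": ["OPEX up QoQ"]})
checkpoint({"turn": 2, "facts": ["OPEX up QoQ", "driver: cloud spend"]})
checkpoint({"turn": 3, "facts": ["OPEX up QoQ", "driver: cloud spend",
                                 "recommendation: renegotiate contracts"]})

# After turn 2, the agent knew the driver but had made no recommendation
# yet; the audit trail proves it.
midpoint = state_at(1)
```

Because snapshots are written once and never mutated, the log answers the compliance question "what did the agent know when it recommended this?" by construction rather than by extra logging code.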
The sub-agent pattern maps naturally to how finance teams are organized. You build specialized agents for different domains (revenue analysis, cost allocation, cash flow forecasting, competitive intelligence, regulatory compliance) and let the conversational agent route between them based on what the user is asking about. Each agent maintains its own domain expertise while the orchestrator maintains conversational context. The result is an AI system that mirrors the organizational structure of a finance team: specialized expertise coordinated by a generalist who understands the big picture and remembers the full conversation.

Building this pattern for production taught me several things I wouldn't have learned from documentation alone.

**The checkpointer is not optional.** It's tempting to think of persistence as a nice-to-have that you'll add later. It's not. Without interrupt() + checkpointer, you simply cannot build multi-turn conversational agents. The entire architecture depends on the graph's ability to suspend and resume with full state intact. Start with the checkpointer from day one.

**Use structured output for routing.** Don't try to parse routing decisions out of free-text LLM output. Use with_structured_output() to get a typed response object with explicit fields like awaiting_input: bool and next_agent: str | None. Free-text parsing is fragile and leads to subtle bugs that only surface in production conversations.

**Track conversation status explicitly.** You need a way to distinguish "the agent is actively processing" from "the agent is waiting for the user to respond." A distinct PAUSED status in your task or conversation model gives you this, and enables operational features like timeout cleanup, stale conversation alerts, and accurate status indicators in the UI.

**State accumulation is the killer feature.** The ability to accumulate analysis results across turns means the agent's context grows richer with every interaction. By the end of a 10-turn conversation, the agent has a comprehensive picture of the analysis the user is building: the revenue data from turn 1, the competitive benchmarks from turn 4, the scenario models from turn 7. No stateless agent can achieve this.

**Keep the graph topology simple.** It's tempting to build elaborate conditional routing with dozens of edges. Resist this. A clean loop (agent → human gate → agent, with sub-agents branching off and returning to the orchestrator) handles the vast majority of conversational workflows. Complexity in the graph is complexity in debugging.

The industry is converging on a model where AI financial assistants are not tools you query but collaborators you converse with. The technical infrastructure to support this (persistent state, interrupt/resume, multi-agent orchestration) is now mature enough for production. I believe the next generation of CFO-facing AI products will be built on these patterns. Not single-shot Q&A systems, but stateful conversational agents that remember your context, orchestrate specialized analyses, and evolve their understanding of your business over time.

The companies that figure this out first will have a decisive advantage. Not because the underlying LLMs are better, but because the architecture around them (the state management, the orchestration, the persistence) creates an experience that feels like working with an exceptionally capable colleague rather than querying a database with natural language. The technology is ready. The question is who builds the product.

I'm a CFO and AI Solutions Architect with 20+ years in fintech and banking. I build production agentic systems at the intersection of finance and AI. If you're working on similar problems, particularly conversational AI for enterprise finance, let's connect on LinkedIn.