Dissecting the Warp AI Agent (Part 3): Conversation as a State Machine, or How 3,700 Lines of Code Manage the Agent's Full Lifecycle

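Part three of the series. The first two parts covered type safety for Actions and execution scheduling. This part moves on to Layer 4, AIConversation: the conversation state machine that Warp implements in 3,733 lines of code. It manages the full lifecycle, from "user asks a question" to "task completed / cancelled / blocked".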

1. The Problem: How Complex Is the State of an AI Conversation?

A complete Agent conversation is not a simple "request → response":

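    User asks a question
    → Agent thinks
    → Runs Action 1 (reads 3 files in parallel)
    → Reads finish, Agent analyzes the results
    → Runs Action 2 (edits a file, waits for user confirmation)
    → User confirms
    → Runs Action 3 (runs a command, which is still running...)
    → Command output snapshot 1
    → Agent decides to keep waiting / send more input / hand control back to the user
    → Runs Action 4 (starts a subagent)
    → Subagent running...
    → Task completed / error / cancelled / blocked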

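This flow involves four ways for a conversation to end, several intermediate waiting states (awaiting user confirmation, a command still running, a subagent still running), nested subtasks, and transfer of control over long-running commands. Written as if-else chains, it would turn into spaghetti code. Warp's answer is a state machine.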

2. AIConversation: The Core Structure

    // app/src/ai/agent/conversation.rs (3,733 lines)

    /// An Agent Mode conversation.
    #[derive(Debug, Clone)]
    pub struct AIConversation {
        id: AIConversationId,
        status: ConversationStatus,
        // ... more fields
    }

2.1 Five Statuses, Four of Them Terminal

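    #[derive(Debug, Clone, PartialEq, Eq, Hash)]
    pub enum ConversationStatus {
        /// Still in progress
        InProgress,
        /// Completed successfully
        Success,
        /// Terminated with an error
        Error,
        /// Cancelled by the user
        Cancelled,
        /// Blocked and waiting (e.g. a dangerous operation needs user confirmation)
        Blocked { blocked_action: String },
    }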

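Why is Blocked a terminal state rather than an intermediate one? Because in Warp's design a blocked conversation does not resume automatically: it sits in the UI waiting for the user to decide. Once the user confirms, a new Exchange (conversation turn) is created to continue execution, rather than the conversation being "unblocked".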
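The "Blocked is terminal, confirmation opens a new Exchange" rule can be sketched roughly as follows. This is a minimal illustration, not Warp's code: the `Conversation` struct, its `exchanges` field, and the `is_terminal` and `confirm_blocked_action` helpers are all hypothetical names introduced here.

```rust
/// Simplified stand-in for the ConversationStatus enum shown above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConversationStatus {
    InProgress,
    Success,
    Error,
    Cancelled,
    Blocked { blocked_action: String },
}

impl ConversationStatus {
    /// Hypothetical helper: every status except InProgress ends the current run.
    pub fn is_terminal(&self) -> bool {
        !matches!(self, ConversationStatus::InProgress)
    }
}

/// Hypothetical, heavily reduced conversation: a status plus a list of turns.
#[derive(Debug)]
pub struct Conversation {
    pub status: ConversationStatus,
    pub exchanges: Vec<String>,
}

impl Conversation {
    /// Hypothetical helper: the user approves the blocked action.
    /// The old turn is not "unblocked"; a new exchange is appended instead.
    /// Returns false if the conversation was not blocked.
    pub fn confirm_blocked_action(&mut self) -> bool {
        let action = match &self.status {
            ConversationStatus::Blocked { blocked_action } => blocked_action.clone(),
            _ => return false,
        };
        self.exchanges.push(format!("user approved: {}", action));
        self.status = ConversationStatus::InProgress;
        true
    }
}
```

Treating Blocked as terminal keeps each run self-contained: a consumer only has to ask "has this run ended?", and resuming is modeled as starting new work rather than as a special transition out of a paused state.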

2.2 Fifteen Input Types

    // app/src/ai/agent/mod.rs
    pub enum AIAgentInput {
        UserQuery(String),
        ActionResult(AIAgentActionResult),
        InvokeSkill(SkillReference),
        StartFromAmbientRunPrompt(AmbientRunPrompt),
        // ... 15 variants in total
    }

2.3 Output Message Types

    pub enum AIAgentOutputMessageType {
        Text,
        Reasoning,      // thinking process
        Summarization,  // summary
        Action,         // Action execution
        TodoOperation,  // todo-list operation
        Subagent,       // subagent communication
        WebSearch,      // web search
        WebFetch,       // web page fetch
    }

3. Conversation Context: What the Agent Can "See"

    pub enum AIAgentContext {
        Directory(PathBuf),               // current working directory
        SelectedText(String),             // text selected by the user
        ExecutionEnvironment(String),     // execution environment info
        CurrentTime(String),              // current time
        Image(Vec<u8>),                   // image
        Codebase(Vec<FileContext>),       // codebase context
        ProjectRules(Vec<ProjectRule>),   // project rules
        File(FileContext),                // file contents
        Git(GitContext),                  // Git status
        Skills(Vec<SkillReference>),      // available Skills
        Block(BlockContext),              // terminal Block context
    }

Note: ProjectRules is the WARP.md/AGENTS.md dual-file rules system that Part 5 of this series will cover.

4. Dual-Mode Control of Long-Running Commands (LRC)

This is one of the most distinctive designs in the Warp AI Agent: the Agent and the user can take turns controlling the same running command.

4.1 Three Write Modes

    pub enum AIAgentPtyWriteMode {
        /// Raw write: send the bytes to the PTY as-is
        Raw,
        /// Line write: append a newline, then send
        Line,
        /// Block write: send multi-line text line by line
        Block,
    }
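To make the difference between the three modes concrete, here is a small sketch of how one write request might be turned into the byte chunks sent to the PTY. It follows only the doc comments above; the `PtyWriteMode` name and the `encode_pty_write` function are hypothetical, and the real implementation may well differ (for example in how it handles `\r\n` or pacing between lines).

```rust
/// Stand-in for the AIAgentPtyWriteMode enum shown above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PtyWriteMode {
    Raw,
    Line,
    Block,
}

/// Hypothetical helper: turn one write request into the chunks sent to the PTY.
pub fn encode_pty_write(mode: PtyWriteMode, input: &str) -> Vec<Vec<u8>> {
    match mode {
        // Raw: bytes go out untouched, e.g. a control character such as Ctrl-C.
        PtyWriteMode::Raw => vec![input.as_bytes().to_vec()],
        // Line: a single chunk with a trailing newline appended.
        PtyWriteMode::Line => vec![format!("{}\n", input).into_bytes()],
        // Block: one newline-terminated chunk per line of the input.
        PtyWriteMode::Block => input
            .lines()
            .map(|line| format!("{}\n", line).into_bytes())
            .collect(),
    }
}
```

For example, `encode_pty_write(PtyWriteMode::Block, "a\nb")` yields two chunks, `a\n` and `b\n`, whereas Raw would send the three bytes `a`, `\n`, `b` as one chunk.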

4.2 Transfer of Control

    // Action variant
    TransferShellCommandControlToUser { reason: String }

Scenario: the Agent starts an interactive python session, enters a few lines of code, and then decides to let the user take over:

    Agent: starts an interactive python session
      → WriteToLongRunningShellCommand(mode=Line, input="import pandas as pd")
      → WriteToLongRunningShellCommand(mode=Line, input="df = pd.read_csv('data.csv')")
      → ReadShellCommandOutput → sees the output
      → TransferShellCommandControlToUser(reason="Data is loaded, over to you to keep exploring")
    User: takes over the terminal and types further commands by hand
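The handoff in this scenario boils down to tracking who currently owns the command and refusing Agent writes once ownership has moved. A minimal sketch, assuming a hypothetical `LongRunningCommand` type with a `Controller` field (neither name comes from Warp's source):

```rust
/// Hypothetical: who is currently allowed to write to the running command.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Controller {
    Agent,
    User,
}

/// Hypothetical, reduced model of a long-running command under shared control.
#[derive(Debug)]
pub struct LongRunningCommand {
    controller: Controller,
    written: Vec<String>,
    transfer_reason: Option<String>,
}

impl LongRunningCommand {
    /// A command started by the Agent begins under Agent control.
    pub fn new() -> Self {
        LongRunningCommand {
            controller: Controller::Agent,
            written: Vec::new(),
            transfer_reason: None,
        }
    }

    /// The Agent may only write while it still holds control.
    pub fn agent_write(&mut self, input: &str) -> Result<(), String> {
        if self.controller != Controller::Agent {
            return Err("command is now controlled by the user".to_string());
        }
        self.written.push(input.to_string());
        Ok(())
    }

    /// Mirrors TransferShellCommandControlToUser { reason }.
    pub fn transfer_to_user(&mut self, reason: &str) {
        self.controller = Controller::User;
        self.transfer_reason = Some(reason.to_string());
    }

    pub fn controller(&self) -> Controller {
        self.controller
    }

    pub fn written(&self) -> &[String] {
        &self.written
    }

    pub fn transfer_reason(&self) -> Option<&str> {
        self.transfer_reason.as_deref()
    }
}
```

Carrying the `reason` along with the transfer matters for the UI: the user who suddenly owns a half-finished session needs to know why the Agent stepped back.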

4.3 Snapshot Mode vs. Real-Time Mode

For long-running commands, Warp returns results as snapshots rather than as streamed output:

    pub enum ReadShellCommandOutputResult {
        Success { output: LongRunningCommandSnapshot },
        // ...
    }

    pub struct LongRunningCommandSnapshot {
        // a full snapshot of the command's output so far
    }

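Why not stream? Because the Agent needs "the command's current state", not a live log. If a cargo build runs for 30 seconds, the Agent does not need to watch 30 seconds of compiler output; it only needs to fetch the current snapshot when it wants one.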
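The snapshot idea can be illustrated with a buffer that keeps absorbing output while the reader pulls a copy only when it asks for one. The fields of `LongRunningCommandSnapshot` are elided in the excerpt above, so the `CommandSnapshot` struct below, with its `output` and `is_running` fields, is purely an assumption for illustration, as is the `OutputBuffer` type.

```rust
/// Hypothetical stand-in for a command snapshot: output so far plus a running flag.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CommandSnapshot {
    pub output: String,
    pub is_running: bool,
}

/// Hypothetical buffer that the terminal side appends to as output arrives.
#[derive(Debug, Default)]
pub struct OutputBuffer {
    output: String,
    finished: bool,
}

impl OutputBuffer {
    /// Called by the producer whenever the command prints something.
    pub fn append(&mut self, chunk: &str) {
        self.output.push_str(chunk);
    }

    /// Called once when the command exits.
    pub fn finish(&mut self) {
        self.finished = true;
    }

    /// Called by the Agent on demand: a point-in-time copy, not a stream.
    pub fn snapshot(&self) -> CommandSnapshot {
        CommandSnapshot {
            output: self.output.clone(),
            is_running: !self.finished,
        }
    }
}
```

The Agent pays for one read per decision point instead of one event per output chunk, which suits a model that reasons in discrete turns.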

5. AIAgentHarness: Multi-Engine Conversations

    /// The harness that produced an agent conversation.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum AIAgentHarness {
        Oz,          // Warp's built-in Agent
        ClaudeCode,  // delegated to the Claude CLI
        Gemini,      // delegated to the Gemini CLI
        Unknown,     // unknown harness (backward compatibility)
    }

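Every conversation records which engine produced it. This means:

Warp's conversation history can mix records from different engines
Users can switch engines and continue a conversation
Background Agent tasks can specify a particular engine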

5.1 The Harness Enum (CLI Layer)

    // crates/warp_cli/src/agent.rs
    pub enum Harness {
        #[default]
        #[value(name = "oz")]
        Oz,
        #[value(name = "claude", alias = "claude-code")]
        Claude,
        #[value(name = "opencode", alias = "open-code")]
        OpenCode,
        #[value(name = "gemini")]
        Gemini,
        #[value(skip)]
        Unknown,
    }

5.2 HarnessKind (Driver Layer)

    // app/src/ai/agent_sdk/driver/harness/mod.rs
    pub(crate) enum HarnessKind {
        Oz,
        ThirdParty(Box<dyn ThirdPartyHarness>),
        Unsupported(Harness),
    }

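Three layers of abstraction:

Layer	Enum	Responsibility
CLI	`Harness`	Parses the command-line argument
Conversation	`AIAgentHarness`	Records where a conversation came from
Driver	`HarnessKind`	Routes to the right Runner at execution time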
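How a value might travel through the three layers can be sketched as below. Everything here is a guess at the shape, not Warp's code: `parse_harness`, `route`, and `record_origin` are hypothetical functions, `ThirdParty` carries a plain name instead of a `Box<dyn ThirdPartyHarness>`, and which harnesses end up `Unsupported` or recorded as `Unknown` is an assumption. The real CLI parsing is done by derive attributes and would reject unrecognized values rather than return `Unknown`.

```rust
/// CLI layer: stand-in for `Harness` (parsing attributes omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Harness {
    Oz,
    Claude,
    OpenCode,
    Gemini,
    Unknown,
}

/// Conversation layer: stand-in for `AIAgentHarness`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AIAgentHarness {
    Oz,
    ClaudeCode,
    Gemini,
    Unknown,
}

/// Driver layer: reduced `HarnessKind`; ThirdParty holds a name, not a trait object.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HarnessKind {
    Oz,
    ThirdParty(&'static str),
    Unsupported(Harness),
}

/// Hypothetical parsing using the names and aliases listed in the CLI enum.
pub fn parse_harness(arg: &str) -> Harness {
    match arg {
        "oz" => Harness::Oz,
        "claude" | "claude-code" => Harness::Claude,
        "opencode" | "open-code" => Harness::OpenCode,
        "gemini" => Harness::Gemini,
        _ => Harness::Unknown,
    }
}

/// Hypothetical routing: which harnesses have a runner is an assumption.
pub fn route(harness: Harness) -> HarnessKind {
    match harness {
        Harness::Oz => HarnessKind::Oz,
        Harness::Claude => HarnessKind::ThirdParty("claude"),
        Harness::Gemini => HarnessKind::ThirdParty("gemini"),
        other => HarnessKind::Unsupported(other),
    }
}

/// Hypothetical mapping used when stamping a conversation with its origin.
pub fn record_origin(harness: Harness) -> AIAgentHarness {
    match harness {
        Harness::Oz => AIAgentHarness::Oz,
        Harness::Claude => AIAgentHarness::ClaudeCode,
        Harness::Gemini => AIAgentHarness::Gemini,
        Harness::OpenCode | Harness::Unknown => AIAgentHarness::Unknown,
    }
}
```

Keeping three separate enums means each layer can evolve on its own: the CLI can accept a new name before a runner exists for it, and stored conversations can fall back to `Unknown` without breaking older records.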

6. Autoexecute Mode

    pub enum AIConversationAutoexecuteMode {
        /// Respect the user's settings (default)
        RespectUserSettings,
        /// Keep executing until done (used by background Agents)
        RunToCompletion,
    }

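RunToCompletion is designed for Ambient Agents (background Agents): a background task has no user online to confirm anything, so every Action is executed automatically. This complements the risk-tiering mechanism from Part 2: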

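Aspect	Foreground Agent	Background Agent
Autoexecute mode	RespectUserSettings	RunToCompletion
Read-only operations	Executed automatically	Executed automatically
Dangerous operations	Wait for user confirmation	Also executed automatically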
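The table above reduces to a small decision function. The sketch below is an assumption-laden simplification: `ActionRisk`, `Decision`, `decide`, and the `user_allows_dangerous` flag are hypothetical, and the real risk tiering from Part 2 has more than two levels.

```rust
/// Stand-in for AIConversationAutoexecuteMode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AutoexecuteMode {
    RespectUserSettings,
    RunToCompletion,
}

/// Hypothetical two-level risk classification (the real one is richer).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ActionRisk {
    ReadOnly,
    Dangerous,
}

/// Hypothetical outcome of the autoexecute check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Execute,
    AskUser,
}

/// Decide whether an Action runs immediately or waits for the user.
/// `user_allows_dangerous` models a user setting and is an assumption.
pub fn decide(mode: AutoexecuteMode, risk: ActionRisk, user_allows_dangerous: bool) -> Decision {
    match (mode, risk) {
        // Background agents have nobody to ask, so everything runs.
        (AutoexecuteMode::RunToCompletion, _) => Decision::Execute,
        // Read-only actions run automatically in the foreground too.
        (AutoexecuteMode::RespectUserSettings, ActionRisk::ReadOnly) => Decision::Execute,
        // Dangerous actions follow the user's setting.
        (AutoexecuteMode::RespectUserSettings, ActionRisk::Dangerous) => {
            if user_allows_dangerous {
                Decision::Execute
            } else {
                Decision::AskUser
            }
        }
    }
}
```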

7. Ambient Agent Status Mapping

    // app/src/ai/ambient_agents/mod.rs
    pub enum AmbientConversationStatus {
        Success,
        Error { error: RenderableAIError },
        Cancelled { reason: CancellationReason },
        Blocked { blocked_action: String },
    }

The background Agent's status is mapped directly from ConversationStatus:

    pub fn conversation_output_status_from_conversation(
        conversation: &AIConversation,
    ) -> Option<AmbientConversationStatus> {
        // Check Blocked first
        if let ConversationStatus::Blocked { blocked_action } = conversation.status() {
            return Some(AmbientConversationStatus::Blocked {
                blocked_action: blocked_action.clone(),
            });
        }
        // Then check the status of the last output
        let last_exchange = conversation.root_task_exchanges().last()?;
        // ...
    }
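The excerpt elides what happens after the last exchange is fetched, so the following is only one plausible shape of the "Blocked first, then last exchange" pattern, not a reconstruction of the missing code. The `ExchangeOutcome` enum, the `String` payloads, and the rule that a still-running exchange yields `None` are all assumptions made for this sketch.

```rust
/// Reduced stand-in for ConversationStatus.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConversationStatus {
    InProgress,
    Success,
    Error,
    Cancelled,
    Blocked { blocked_action: String },
}

/// Reduced stand-in for AmbientConversationStatus (payloads simplified to String).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AmbientStatus {
    Success,
    Error { error: String },
    Cancelled { reason: String },
    Blocked { blocked_action: String },
}

/// Hypothetical summary of how the last root-task exchange ended.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ExchangeOutcome {
    Finished,
    Failed(String),
    Cancelled(String),
    StillRunning,
}

/// Sketch of the mapping: Blocked wins, otherwise the last exchange decides.
pub fn ambient_status(
    status: &ConversationStatus,
    last_exchange: Option<&ExchangeOutcome>,
) -> Option<AmbientStatus> {
    // Blocked is checked before anything else, as in the excerpt above.
    if let ConversationStatus::Blocked { blocked_action } = status {
        return Some(AmbientStatus::Blocked {
            blocked_action: blocked_action.clone(),
        });
    }
    // No exchange yet means there is nothing to report.
    match last_exchange? {
        ExchangeOutcome::Finished => Some(AmbientStatus::Success),
        ExchangeOutcome::Failed(error) => Some(AmbientStatus::Error { error: error.clone() }),
        ExchangeOutcome::Cancelled(reason) => Some(AmbientStatus::Cancelled {
            reason: reason.clone(),
        }),
        ExchangeOutcome::StillRunning => None,
    }
}
```

Returning `Option` fits the fact that the ambient enum has no in-progress variant: while work is ongoing there is simply no output status to hand back.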

8. Comparison with Other Tools in the Industry

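Dimension	Warp	Claude Code	Cursor	Hermes Agent
Conversation model	State machine (5 statuses)	Simple request-response	Simple request-response	Context compression
Long-running commands	Dual-mode control + snapshots	None	None	None
Control transfer	Agent ↔ user	Not supported	Not supported	Not supported
Multi-engine	Oz/Claude/Gemini	In-house only	In-house only	In-house only
Background Agent	AmbientAgent	Not supported	Not supported	Supported (subagent)
Blocked status	First-class	None	None	None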

What makes Warp distinctive: TransferShellCommandControlToUser + LongRunningCommandSnapshot. The Agent and the user can take turns controlling the same shell command, something rarely seen in other Agent frameworks.

9. A Reusable Pattern: Conversation State Machine

┌────────────────────────────────────────────┐
│ Conversation State Machine                 │
├────────────────────────────────────────────┤
│ 1. Five statuses, four terminal            │
│    InProgress → Success / Error /          │
│                 Cancelled / Blocked        │
│    Blocked never resumes on its own;       │
│    a new Exchange is opened instead        │
│                                            │
│ 2. LRC dual-mode control                   │
│    Snapshot: read current state on demand  │
│    Control transfer: Agent ↔ User          │
│    Write modes: Raw / Line / Block         │
│                                            │
│ 3. Multi-engine conversations              │
│    Conversation records its harness        │
│    Engine can be switched mid-conversation │
│                                            │
│ 4. Autoexecute policy                      │
│    Foreground: RespectUserSettings         │
│    Background: RunToCompletion             │
└────────────────────────────────────────────┘