Dissecting the Warp AI Agent (Part 1): Types as Protocol, a Compile-Time-Safe Design for 23 Kinds of Action

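This article examines a core design decision in the underlying architecture of the Warp AI Agent: defining the tool protocol with the Rust type system rather than JSON Schema. It argues that tool calling is the most error-prone part of an AI agent, with common problems including wrong tool names and mismatched argument types. Warp uses the Rust enum AIAgentActionType to get exhaustiveness checking over its 23 kinds of Action at compile time, which has clear advantages over JSON Schema: a new tool must be given a handler (a compile error rather than a run…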

qcx23 · Published 2026-05-02 09:03:08

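This is the first article in the "Dissecting the Warp AI Agent" series. The series has 5 parts and distills 7 reusable AI agent architecture patterns from the Warp source code. This part focuses on the lowest-level design decision: why define the agent's tool protocol with the Rust type system instead of JSON Schema?

1. The Problem: How Often Do an AI Agent's Tool Calls Go Wrong?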

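When building an AI agent, the tool call is the step most likely to go wrong:

The LLM returns a tool name that does not exist: the caller has no handler for it
Wrong argument type: a string is passed where an integer is expected
Result type mismatch: the tool returns type A but the agent expects type B
Missed cancellation state: after the user cancels, some tool forgets to handle the cancelled state

In Python, and in TypeScript code that treats tool calls as untyped JSON, these problems surface only at run time, with try-catch as the safety net. Warp takes a different route: let the compiler do the checking.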

2. The Six-Layer Architecture at a Glance

The full picture first, then the first layer:

┌────────────────────────────────────────────────────────────────────┐
│ Layer 6: Agent SDK / Harness    (Oz, Claude, Gemini CLI)           │
├────────────────────────────────────────────────────────────────────┤
│ Layer 5: Ambient Agents         (background tasks, auto-run)       │
├────────────────────────────────────────────────────────────────────┤
│ Layer 4: Conversation / Task    (state machine, Todo, CodeReview)  │
├────────────────────────────────────────────────────────────────────┤
│ Layer 3: Action Model           (risk tiers, queue scheduling)     │
├────────────────────────────────────────────────────────────────────┤
│ Layer 2: MCP Protocol           (Transport, Provider, reconnect)   │
├────────────────────────────────────────────────────────────────────┤
│ Layer 1: Core Primitives  ◄─── focus of this article               │
│   AIAgentActionType (23 variants) + AIAgentActionResultType        │
│   DiffValidation + Citation + ContentHash                          │
└────────────────────────────────────────────────────────────────────┘

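The first layer is the foundation of the whole agent system. If the Action definitions have holes, the five layers above are built on sand.

3. AIAgentActionType: 23 Kinds of Action, Exhaustively Checked at Compile Time

3.1 The Complete Enum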

// crates/ai/src/agent/action/mod.rs (827 lines)
#[derive(Debug, Clone, Eq, PartialEq, EnumDiscriminants)]
pub enum AIAgentActionType {
    // Shell operations
    RequestCommandOutput {
        command: String,
        is_read_only: Option<bool>,    // the LLM's judgment: is the command read-only?
        is_risky: Option<bool>,        // the LLM's judgment: is the command risky?
        wait_until_completion: bool,
        uses_pager: Option<bool>,
        rationale: Option<String>,     // the AI's reason for running it
        citations: Vec<AIAgentCitation>,
    },
    WriteToLongRunningShellCommand {
        block_id: BlockId,
        input: bytes::Bytes,
        mode: AIAgentPtyWriteMode,     // Raw / Line / Block
    },

    // File operations
    ReadFiles(ReadFilesRequest),
    UploadArtifact(UploadArtifactRequest),
    RequestFileEdits { file_edits: Vec<FileEdit>, title: Option<String> },

    // Search operations
    SearchCodebase(SearchCodebaseRequest),
    Grep { queries: Vec<String>, path: String },
    FileGlob { patterns: Vec<String>, path: Option<String> },
    FileGlobV2 { patterns: Vec<String>, search_dir: Option<String> },

    // MCP operations
    ReadMCPResource { server_id: Option<Uuid>, name: String, uri: Option<String> },
    CallMCPTool { server_id: Option<Uuid>, name: String, input: serde_json::Value },

    // Document operations
    ReadDocuments(ReadDocumentsRequest),
    EditDocuments(EditDocumentsRequest),
    CreateDocuments(CreateDocumentsRequest),

    // Interaction operations
    AskUserQuestion { questions: Vec<AskUserQuestionItem> },

    // Collaboration operations
    StartAgent {
        version: StartAgentVersion,
        name: String,
        prompt: String,
        execution_mode: StartAgentExecutionMode,
        lifecycle_subscription: Option<Vec<LifecycleEventType>>,
    },
    SendMessageToAgent { addresses: Vec<String>, subject: String, message: String },

    // Other
    SuggestNewConversation { message_id: String },
    SuggestPrompt(SuggestPromptRequest),
    InitProject,
    OpenCodeReview,
    ReadShellCommandOutput { block_id: BlockId, delay: Option<ShellCommandDelay> },
    UseComputer(UseComputerRequest),
    InsertCodeReviewComments { repo_path: PathBuf, comments: Vec<InsertReviewComment>, base_branch: Option<String> },
    RequestComputerUse(RequestComputerUseRequest),
    ReadSkill(ReadSkillRequest),
    FetchConversation { conversation_id: String },
    TransferShellCommandControlToUser { reason: String },
}

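Key design points:

is_read_only and is_risky are labeled by the LLM itself: the model assesses the risk level of its own action, rather than the host judging it after the fact
The rationale field asks the LLM to state why it wants to run the command, for display in the UI and for auditing
citations tracks the sources behind each Action, keeping it traceable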
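
Because both labels are Option<bool>, whatever consumes them has to decide what a missing label means. The excerpt does not show how Warp does this, so the following is only a minimal sketch under assumptions: the `Approval` enum and the `approval_for` function are hypothetical names, and the policy (auto-run only when the model affirmed both labels) is one cautious choice among several.

```rust
/// Hypothetical host-side decision; the names are illustrative, not Warp's.
#[derive(Debug, PartialEq)]
enum Approval {
    AutoRun,
    AskUser,
}

/// Combine the model's optional self-labels into an approval decision.
/// A missing label is treated as the cautious answer.
fn approval_for(is_read_only: Option<bool>, is_risky: Option<bool>) -> Approval {
    match (is_read_only, is_risky) {
        // Run without asking only when the model affirmed both labels.
        (Some(true), Some(false)) => Approval::AutoRun,
        // Anything else, including a missing label, goes to the user.
        _ => Approval::AskUser,
    }
}
```

The wildcard arm is deliberate here: the input is a pair of optional booleans rather than an enum that grows over time, so a cautious default is the safer behavior.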

3.2 Why Not JSON Schema?

Claude Code and most agent frameworks define tools with JSON Schema:

{
  "name": "read_file",
  "parameters": {
    "type": "object",
    "properties": {
      "path": { "type": "string" }
    }
  }
}

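The problems with this approach:

Problem	JSON Schema	Rust Enum
Forgetting the handler for a new tool	Runtime error	Compile error
Argument type mismatch	Caught when parsed at run time	Checked at compile time
Missed cancellation state	Easy to overlook	match exhaustiveness forces coverage
Missed call sites during refactoring	Requires a global search	Located by the compiler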
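
The Rust Enum column describes the code that handles an action. The model's output still arrives as untyped text, so one conversion step has to validate it at run time; the benefit is that this step is the only one. A minimal sketch of that boundary, assuming a hypothetical two-tool `Action` enum and hypothetical `parse_action` and `describe` helpers (none of them taken from Warp's source), using only the standard library:

```rust
use std::collections::HashMap;

/// Hypothetical, reduced action enum (not Warp's real definition).
#[derive(Debug, PartialEq)]
enum Action {
    ReadFile { path: String },
    Grep { query: String, max_results: u32 },
}

/// Everything that can go wrong at the untyped boundary.
#[derive(Debug, PartialEq)]
enum ParseError {
    UnknownTool(String),
    MissingArg(&'static str),
    BadArg(&'static str),
}

/// Convert the model's raw (name, args) pair into a typed Action.
/// This is the one place where runtime validation happens.
fn parse_action(name: &str, args: &HashMap<String, String>) -> Result<Action, ParseError> {
    let get = |key: &'static str| args.get(key).ok_or(ParseError::MissingArg(key));
    match name {
        "read_file" => Ok(Action::ReadFile { path: get("path")?.clone() }),
        "grep" => Ok(Action::Grep {
            query: get("query")?.clone(),
            max_results: get("max_results")?
                .parse::<u32>()
                .map_err(|_| ParseError::BadArg("max_results"))?,
        }),
        // A tool name the host does not know becomes an ordinary error value.
        other => Err(ParseError::UnknownTool(other.to_string())),
    }
}

/// Past the boundary, handlers take a typed Action and the match is exhaustive.
fn describe(action: &Action) -> String {
    match action {
        Action::ReadFile { path } => format!("read {path}"),
        Action::Grep { query, max_results } => format!("grep {query} (limit {max_results})"),
    }
}
```

Everything after `parse_action` works with `Action` values only, so a wrong tool name or a non-numeric `max_results` can no longer reach a handler.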

4. cancelled_result(): The Power of Exhaustiveness

Warp defines a cancelled_result() method that maps every Action type to its cancelled result:

impl AIAgentActionType {
    pub fn cancelled_result(&self) -> AIAgentActionResultType {
        match self {
            Self::RequestCommandOutput { .. } => 
                AIAgentActionResultType::RequestCommandOutput(
                    RequestCommandOutputResult::CancelledBeforeExecution,
                ),
            Self::CallMCPTool { .. } => 
                AIAgentActionResultType::CallMCPTool(CallMCPToolResult::Cancelled),
            Self::ReadFiles(..) => 
                AIAgentActionResultType::ReadFiles(ReadFilesResult::Cancelled),
            Self::UseComputer(..) => 
                AIAgentActionResultType::UseComputer(UseComputerResult::Cancelled),
            // ... 23 variants, each with a matching Cancelled result
            // Forget this line when adding a new Action and the compiler reports an error!
        }
    }
}

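This is the most elegant part of the design: Rust's match must cover every variant. When a new Action type is added and the corresponding arm in cancelled_result() is forgotten, compilation fails outright. The guarantee holds only as long as the match has no wildcard (_) arm, since a catch-all would silently absorb the new variant.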
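
The excerpt above elides most arms, so it does not compile on its own. A self-contained sketch of the same idea, using a hypothetical three-variant `Action` enum and a hypothetical `ActionResult` enum as stand-ins for Warp's types:

```rust
/// Hypothetical three-variant stand-in for the action enum.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    RunCommand { command: String },
    ReadFiles { paths: Vec<String> },
    CallTool { name: String },
}

/// One cancelled result per action kind.
#[derive(Debug, PartialEq)]
enum ActionResult {
    RunCommandCancelled,
    ReadFilesCancelled,
    CallToolCancelled,
}

impl Action {
    /// No `_` arm on purpose: adding a fourth variant to `Action`
    /// makes this match fail to compile (error E0004) until it gets an arm.
    fn cancelled_result(&self) -> ActionResult {
        match self {
            Action::RunCommand { .. } => ActionResult::RunCommandCancelled,
            Action::ReadFiles { .. } => ActionResult::ReadFilesCancelled,
            Action::CallTool { .. } => ActionResult::CallToolCancelled,
        }
    }
}
```

To see the check fire, add a variant such as `Action::Search` without touching `cancelled_result()`: the compiler rejects the match with a non-exhaustive patterns error that names the missing variant.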

Compare a Python implementation:

# Python: forgot to handle the cancelled state of a new Action?
# Nothing warns you until run time
def cancelled_result(self):
    if self.type == "RequestCommandOutput":
        return RequestCommandOutputResult(cancelled=True)
    elif self.type == "ReadFiles":
        return ReadFilesResult(cancelled=True)
    # Forgot CallMCPTool? No complaint: the function falls off the end and returns None...
    # ...and then something crashes in an unrelated corner

5. The Three-State Result Model: Success / Error / Cancelled

Every Action's result is in one of three states:

// crates/ai/src/agent/action_result/mod.rs
pub enum AIAgentActionResultType {
    RequestCommandOutput(RequestCommandOutputResult),
    ReadFiles(ReadFilesResult),
    CallMCPTool(CallMCPToolResult),
    // ...
}

// Each Result type has three states internally:
pub enum RequestCommandOutputResult {
    Success { output: String, exit_code: i32, ... },
    Error { message: String, ... },
    CancelledBeforeExecution,
}

impl AIAgentActionResultType {
    pub fn is_successful(&self) -> bool { ... }
    pub fn is_failed(&self) -> bool { ... }
    pub fn is_cancelled(&self) -> bool { ... }
}

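Why is Cancelled a first-class citizen? Because in an agent setting, cancellation is not an exception; it is the norm. The user may at any moment cancel a running command, reject a file edit, or abort an MCP call. Treating Cancelled as a subset of Error leads to:

Cancellation statistics mixed up with error statistics
Recovery logic after cancellation coupled to error-recovery logic
No way to tell "I stopped it on purpose" from "something went wrong"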
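
The first of these points can be made concrete with a small sketch. The `Outcome` enum, the `Tally` struct and the `tally` function are hypothetical names chosen for illustration; only the three-state shape and the three predicate names come from the excerpt above.

```rust
/// Hypothetical result type with cancellation as its own state.
#[derive(Debug)]
enum Outcome {
    Success { output: String },
    Error { message: String },
    Cancelled,
}

impl Outcome {
    fn is_successful(&self) -> bool {
        matches!(self, Outcome::Success { .. })
    }
    fn is_failed(&self) -> bool {
        matches!(self, Outcome::Error { .. })
    }
    fn is_cancelled(&self) -> bool {
        matches!(self, Outcome::Cancelled)
    }
}

/// Separate counters, so cancellations never inflate the error count.
#[derive(Debug, Default, PartialEq)]
struct Tally {
    succeeded: usize,
    failed: usize,
    cancelled: usize,
}

/// Count each state on its own; the match is exhaustive over Outcome.
fn tally(outcomes: &[Outcome]) -> Tally {
    let mut t = Tally::default();
    for o in outcomes {
        match o {
            Outcome::Success { .. } => t.succeeded += 1,
            Outcome::Error { .. } => t.failed += 1,
            Outcome::Cancelled => t.cancelled += 1,
        }
    }
    t
}
```

If Cancelled were folded into Error, the two cancellations in a run of one success, one error and two cancellations would show up as three failures.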

6. StartAgentExecutionMode: One Abstraction for Local and Remote

pub enum StartAgentExecutionMode {
    Local {
        /// None = Warp's built-in agent
        /// Some("claude") = delegate to the Claude CLI
        /// Some("gemini") = delegate to the Gemini CLI
        harness_type: Option<String>,
    },
    Remote {
        environment_id: String,
        skill_references: Vec<SkillReference>,
        model_id: String,
        computer_use_enabled: bool,
        worker_host: String,
        harness_type: String,
        title: String,
    },
}

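Note the design of Local { harness_type: Option<String> }: a single enum unifies two execution paths, "use Warp's own agent" and "delegate to the Claude/Gemini CLI". Together with the Remote variant, this means upper-layer code does not need to care whether the agent runs locally or remotely, or whether it uses Warp's own engine or a third-party CLI.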
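
How a caller might resolve such a mode in a single place can be sketched as follows. `ExecutionMode` is a reduced, hypothetical version of the enum above (most Remote fields are dropped), and `describe_target` is an illustrative helper, not a function from Warp's source.

```rust
/// Reduced, hypothetical version of the execution-mode enum.
enum ExecutionMode {
    Local { harness_type: Option<String> },
    Remote { worker_host: String, harness_type: String },
}

/// Resolve a mode to a human-readable target. Code above this
/// function never branches on local vs. remote itself.
fn describe_target(mode: &ExecutionMode) -> String {
    match mode {
        // No harness named: fall back to the built-in agent.
        ExecutionMode::Local { harness_type: None } => "local: built-in agent".to_string(),
        // A harness named: delegate to that CLI on this machine.
        ExecutionMode::Local { harness_type: Some(cli) } => format!("local: {cli} CLI"),
        ExecutionMode::Remote { worker_host, harness_type } => {
            format!("remote: {harness_type} on {worker_host}")
        }
    }
}
```

The nested pattern `Local { harness_type: None }` lets one match separate all three paths, and the compiler still verifies that every combination is covered.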

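7. Comparison with Other Approaches in the Industry

Dimension	Warp (Rust Enum)	Claude Code (JSON Schema)	Hermes Agent (TypeScript)	OpenAI Function Call
Tool definition	Compile-time enum	JSON Schema	Runtime registration	JSON Schema
Type safety	Guaranteed at compile time	Validated at run time	Validated at run time	Validated at run time
Cancellation handling	Exhaustive coverage	Handled as needed	Handled as needed	No built-in support
Forgotten handler for a new tool	Compile error	Runtime 404	Runtime 404	Runtime 404
Risk self-assessment	is_read_only/is_risky	Not built in	Not built in	Not built in
Execution rationale	rationale field	Not built in	Not built in	Not built in

Warp's core advantage: it shifts the safety checks of the agent tool protocol left, from run time to compile time. In a Rust project, this means:

If the code compiles, the type safety of its tool-call handling is already guaranteed.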

8. A Reusable Pattern: Type-Driven Tool Protocol

Distilled into a general pattern:

┌─────────────────────────────────────────────────┐
│            Type-Driven Tool Protocol            │
├─────────────────────────────────────────────────┤
│ 1. Define every tool as an algebraic data type  │
│    - each variant = one tool                    │
│    - variant fields = tool arguments            │
│                                                 │
│ 2. Define every result as an ADT as well        │
│    - three states: Success / Error / Cancelled  │
│    - cancelled_result() covers every variant    │
│                                                 │
│ 3. Guarantee coverage with exhaustive match     │
│    - new tool → compiler demands every branch   │
│    - eliminates "forgot to handle it" bugs      │
│                                                 │
│ 4. Let the LLM rate its own risk                │
│    - is_read_only + is_risky fields             │
│    - labeled by the model, not by a human       │
└─────────────────────────────────────────────────┘

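Variants in non-Rust languages:

TypeScript: a discriminated union plus an exhaustive switch (with the never type as the backstop)
Python: @dataclass plus match (Python 3.10+), with an exhaustiveness check such as typing.assert_never (Python 3.11+) enforced by a static type checker
Go: interface plus type switch (no compile-time exhaustiveness guarantee; needs help from code generation)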

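9. Summary

Pattern	Implementation	Core value
Type-Driven Tool Protocol	`AIAgentActionType` enum with 23 variants	Compile-time guarantee of tool type safety
Exhaustive Cancellation	`cancelled_result()` with an exhaustive match	A new Action cannot miss its cancellation handling
Three-State Result	Success/Error/Cancelled	Cancellation is a first-class citizen, never confused with errors
LLM Self-Rating	`is_read_only` + `is_risky` + `rationale`	The AI assesses its own risk; no manual labeling needed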