Part 9: Testing and Debugging, the Line of Defense for Code Quality

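nanobot builds a highly reliable Agent system through a layered testing strategy. Its tests follow the pyramid model: unit tests (70%) cover the independent logic of the core modules, integration tests (20%) verify that modules work together, and end-to-end tests (10%) check complete user scenarios. The tests make heavy use of pytest and mocking to isolate external dependencies such as the LLM API and the file system, which keeps them stable and repeatable. nanobot also provides structured debugging tools that let developers observe internal state in real time.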

王解 · Published 2026-02-22 12:00:00

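From unit tests to integration checks: how nanobot builds a "never-crashing" Agent system

In the previous eight articles, we dissected nanobot's core modules in full: AgentLoop, the plugin system, LLM interaction, the memory mechanism, tool calling, and configuration and logging. A fully functional framework is now in front of us. But one key question remains open: how do we guarantee the quality of this code? How do we make sure that changing one place does not break another? How do we locate problems quickly?

Testing and debugging are the last line of defense for code quality. nanobot is lightweight, yet its testing is carefully designed:

Unit tests: cover the independent logic of the core modules
Mock tests: isolate external dependencies (LLM API, file system)
Integration tests: verify cooperation between modules
Debugging tools: let developers "see" what the code is doing internally

If we compare the Agent to a person:

Unit tests are the physical exam: they check that each organ works properly
Integration tests are the exercise test: they verify whole-body coordination
Debugging tools are the endoscope: they show us what is happening inside

Today we take a close look at nanobot's testing strategy and debugging techniques.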

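1. The Test Pyramid: A Layered Strategy from Unit to Integration

nanobot's tests follow the classic test pyramid model: unit tests (about 70%) form the base, integration tests (about 20%) sit in the middle, and end-to-end tests (about 10%) sit at the top.

[Figure: test pyramid reference architecture]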

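1.1 Responsibilities of each test layer

Test layer	What is tested	Goal	Tools
Unit tests	A single function or class	Verify that independent logic is correct	pytest, unittest.mock
Integration tests	Interaction between modules	Verify that components cooperate correctly	pytest, test fixtures
End-to-end tests	The complete system	Verify user scenarios	CLI tests, simulated channels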
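
To make the 70/20/10 split concrete, the following is a minimal sketch of how the share of each layer could be computed from test counts. The function name `pyramid_shares` and the counts are hypothetical and chosen only for illustration; they are not taken from the nanobot repository.

```python
from typing import Dict


def pyramid_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each layer's share of the total test count, as a percentage."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tests counted")
    return {layer: round(100 * n / total, 1) for layer, n in counts.items()}


# Hypothetical counts, chosen only to illustrate a 70/20/10 split.
example = pyramid_shares({"unit": 140, "integration": 40, "e2e": 20})
print(example)
```

A check like this can run in CI to flag a suite that is drifting toward slow, top-heavy tests.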

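2. Unit Tests: Laying Every Brick Solidly

Unit tests are the foundation of the test pyramid. Each of nanobot's core modules has matching unit tests, which ensure that every function and method works correctly in isolation.

2.1 Test directory layout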

tests/
├── unit/
│   ├── test_agent_loop.py      # AgentLoop unit tests
│   ├── test_context.py          # ContextBuilder tests
│   ├── test_memory.py           # MemoryStore tests
│   ├── test_tools/               # Tool tests
│   │   ├── test_filesystem.py
│   │   ├── test_shell.py
│   │   └── test_web.py
│   └── test_config.py           # Config loading tests
├── integration/
│   ├── test_full_conversation.py # Full conversation flow
│   └── test_tool_chains.py       # Tool chain tests
└── conftest.py                   # Shared test fixtures

2.2 Using pytest and mock

nanobot's unit tests rely mainly on pytest and unittest.mock. pytest provides concise assertions and a fixture mechanism, while mock isolates external dependencies.
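
One detail of unittest.mock is worth knowing before mocking objects that carry a `name` field, such as tool calls. The `name` keyword of the Mock constructor only labels the mock for its repr, so it cannot be used to set a `.name` attribute. The short sketch below demonstrates this with the standard library alone; the values are illustrative.

```python
from unittest.mock import MagicMock

# Passing name= to the constructor only labels the mock for its repr;
# it does not create a .name attribute holding that string.
labelled = MagicMock(name="test_tool")
print(isinstance(labelled.name, str))  # False: .name is another mock

# Assigning the attribute after construction gives a real string.
function = MagicMock()
function.name = "test_tool"
function.arguments = '{"param": "value"}'
print(function.name)  # test_tool
```

`configure_mock(name=...)` is the other documented way to achieve the same result.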

2.2.1 Testing AgentLoop's core logic

# tests/unit/test_agent_loop.py
import pytest
from unittest.mock import AsyncMock, MagicMock
from agent.loop import AgentLoop
from agent.context import Context

@pytest.fixture
def mock_llm():
    """Mock LLM provider."""
    llm = AsyncMock()
    llm.complete.return_value = MagicMock(
        tool_calls=None,
        content="This is a test response"
    )
    return llm

@pytest.fixture
def mock_tool_registry():
    """Mock tool registry."""
    registry = MagicMock()
    registry.get_tool_schemas.return_value = []
    return registry

@pytest.fixture
def mock_context_builder():
    """Mock context builder."""
    builder = AsyncMock()
    builder.build.return_value = Context(messages=[
        {"role": "user", "content": "test message"}
    ])
    return builder

@pytest.fixture
def agent_loop(mock_llm, mock_tool_registry, mock_context_builder):
    """Create an AgentLoop instance for testing."""
    loop = AgentLoop(
        llm_provider=mock_llm,
        tool_registry=mock_tool_registry,
        config={"max_iterations": 5},
        message_bus=MagicMock()
    )
    loop.context_builder = mock_context_builder
    return loop

def make_tool_call(name, arguments):
    """Build a mock tool call.

    `name` is assigned after construction: MagicMock(name=...) only labels
    the mock for its repr and does not set a .name attribute.
    """
    function = MagicMock(arguments=arguments)
    function.name = name
    return MagicMock(function=function)

@pytest.mark.asyncio
async def test_process_message_simple_response(agent_loop, mock_llm):
    """A simple message gets a direct response (no tool calls)."""
    # Prepare test data
    message = MagicMock()
    message.content = "Hello"
    message.session_id = "test-session"

    # Run
    response = await agent_loop.process_message(message)

    # Verify
    assert response.content == "This is a test response"
    mock_llm.complete.assert_called_once()

@pytest.mark.asyncio
async def test_process_message_with_tool_call(agent_loop, mock_llm):
    """The LLM returns a tool call; after it runs, the LLM is called again."""
    mock_llm.complete.side_effect = [
        MagicMock(  # First call: asks for a tool call
            tool_calls=[make_tool_call("test_tool", '{"param": "value"}')],
            content=None
        ),
        MagicMock(  # Second call: returns the final answer
            tool_calls=None,
            content="Tool execution finished"
        )
    ]

    # Mock the tool execution
    agent_loop.tool_registry.get.return_value = AsyncMock(
        execute=AsyncMock(return_value="tool result")
    )

    # Run
    message = MagicMock()
    message.content = "Do something for me"
    message.session_id = "test-session"

    response = await agent_loop.process_message(message)

    # Verify: the LLM was called twice
    assert mock_llm.complete.call_count == 2
    assert response.content == "Tool execution finished"
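
The pattern in the second test, a scripted sequence of LLM replies driving a loop until a final answer appears, can be tried without nanobot installed. The sketch below uses a toy `run_loop` function that stands in for AgentLoop; it is an assumption made for illustration and not nanobot's implementation, and the `echo` tool is likewise hypothetical.

```python
import asyncio
import json
from types import SimpleNamespace
from unittest.mock import AsyncMock


async def run_loop(llm, tools, messages, max_iterations=5):
    """Toy loop: call the LLM, run any requested tools, repeat until it answers."""
    for _ in range(max_iterations):
        reply = await llm.complete(messages)
        if not reply.tool_calls:
            return reply.content
        for tool_call in reply.tool_calls:
            kwargs = json.loads(tool_call.arguments)
            result = await tools[tool_call.name](**kwargs)
            messages.append({"role": "tool", "content": result})
    raise RuntimeError("max_iterations reached without a final answer")


# side_effect with a list makes each call return the next scripted reply.
llm = AsyncMock()
llm.complete.side_effect = [
    SimpleNamespace(
        content=None,
        tool_calls=[SimpleNamespace(name="echo", arguments='{"text": "hi"}')],
    ),
    SimpleNamespace(content="done", tool_calls=None),
]
echo = AsyncMock(return_value="hi")

answer = asyncio.run(
    run_loop(llm, {"echo": echo}, [{"role": "user", "content": "say hi"}])
)
print(answer)
```

Because the replies are scripted, the test is deterministic: two LLM calls, one tool call, and a known final answer.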

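2.3 Mocking the LLM API: avoiding real calls

The biggest challenge in testing an Agent is its dependence on a real LLM API, which brings cost, latency, and nondeterminism. nanobot addresses this by mocking the LLM provider.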

# tests/unit/test_llm_provider.py
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from providers.litellm_provider import LiteLLMProvider

@pytest.mark.asyncio
async def test_litellm_provider_complete():
    """Test LiteLLMProvider.complete with the underlying call mocked."""
    # Build the config
    config = MagicMock()
    config.model = "gpt-3.5-turbo"
    config.temperature = 0.7
    config.max_tokens = 1000

    provider = LiteLLMProvider(config)

    # Mock litellm.acompletion
    with patch.object(provider, '_get_litellm') as mock_get_litellm:
        mock_litellm = MagicMock()
        mock_litellm.acompletion = AsyncMock(return_value={
            "choices": [{
                "message": {
                    "content": "This is a mocked response"
                }
            }]
        })
        mock_get_litellm.return_value = mock_litellm

        # Run
        messages = [{"role": "user", "content": "Hello"}]
        response = await provider.complete(messages)

        # Verify
        assert response.content == "This is a mocked response"
        mock_litellm.acompletion.assert_called_once()
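
An alternative to patching is a small hand-written fake provider that replays canned responses. The sketch below is hypothetical: `ScriptedLLM` and `FakeResponse` are names invented here, and the only assumption about the real provider is that it exposes an async `complete(messages)` method, as in the test above.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FakeResponse:
    content: Optional[str] = None
    tool_calls: Optional[list] = None


@dataclass
class ScriptedLLM:
    """Hypothetical stand-in provider that replays canned responses in order."""
    script: List[FakeResponse]
    seen: List[List[Dict[str, Any]]] = field(default_factory=list)

    async def complete(self, messages, **kwargs):
        self.seen.append(list(messages))  # snapshot what the caller sent
        if not self.script:
            raise AssertionError("complete() called more times than scripted")
        return self.script.pop(0)


fake = ScriptedLLM(script=[FakeResponse(content="simulated reply")])
reply = asyncio.run(fake.complete([{"role": "user", "content": "Hello"}]))
print(reply.content)
```

A fake like this fails loudly when the code under test calls the LLM more often than expected, and it keeps a record of every prompt for later assertions.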

2.4 Testing tools: isolating the file system

Tool tests must not touch the real file system. nanobot uses tempfile and pyfakefs to create isolated test environments.

# tests/unit/test_tools/test_filesystem.py
import pytest
import tempfile
from pathlib import Path
from agent.tools.filesystem import ReadFileTool, WriteFileTool

@pytest.fixture
def temp_workspace():
    """Create a temporary working directory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        workspace = Path(tmpdir)
        yield workspace

@pytest.mark.asyncio
async def test_write_and_read_file(temp_workspace):
    """Write a file, then read it back."""
    # Prepare the tools
    write_tool = WriteFileTool()
    read_tool = ReadFileTool()

    # Set the working directory (simulates configuration)
    write_tool.workspace = temp_workspace
    read_tool.workspace = temp_workspace

    # Test file path and content
    test_file = temp_workspace / "test.txt"
    test_content = "Hello, nanobot!"

    # Write
    write_result = await write_tool._execute_impl(
        path=str(test_file),
        content=test_content
    )
    assert "Successfully wrote" in write_result

    # Verify the file was created
    assert test_file.exists()

    # Read
    read_result = await read_tool._execute_impl(path=str(test_file))
    assert read_result == test_content

@pytest.mark.asyncio
async def test_read_nonexistent_file(temp_workspace):
    """Reading a file that does not exist should return an error."""
    read_tool = ReadFileTool()
    read_tool.workspace = temp_workspace

    result = await read_tool._execute_impl(
        path=str(temp_workspace / "nonexistent.txt")
    )
    assert "Error" in result
    assert "does not exist" in result
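
The tests above confine the tools by pointing `workspace` at a temporary directory. Whether the tools also reject paths that escape the workspace is not shown in this article. As a sketch of what such a check and its test could look like, the helper `resolve_in_workspace` below is hypothetical and is not part of nanobot's API.

```python
import tempfile
from pathlib import Path


def resolve_in_workspace(workspace: Path, user_path: str) -> Path:
    """Resolve user_path under workspace, refusing anything outside it."""
    root = workspace.resolve()
    candidate = (root / user_path).resolve()
    try:
        candidate.relative_to(root)  # raises ValueError if outside root
    except ValueError:
        raise ValueError(f"path escapes workspace: {user_path}") from None
    return candidate


with tempfile.TemporaryDirectory() as tmpdir:
    workspace = Path(tmpdir)

    # A relative path stays inside the workspace.
    inside = resolve_in_workspace(workspace, "notes/test.txt")
    inside_ok = inside.parent.parent == workspace.resolve()

    # A path that climbs out of the workspace is rejected.
    try:
        resolve_in_workspace(workspace, "../outside.txt")
        escaped = True
    except ValueError:
        escaped = False

print(inside_ok, escaped)
```

Resolving both the root and the candidate before comparing them keeps the check correct on systems where the temporary directory is itself a symlink.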

2.5 Testing the memory system

# tests/unit/test_memory.py
import pytest
import tempfile
from pathlib import Path
from agent.memory import MemoryStore
from datetime import datetime

@pytest.fixture
def memory_store():
    """Create a memory store for testing."""
    with tempfile.TemporaryDirectory() as tmpdir:
        workspace = Path(tmpdir)
        store = MemoryStore(workspace)
        yield store

@pytest.mark.asyncio
async def test_long_term_memory(memory_store):
    """Read and write long-term memory."""
    # It should start out empty
    memory = await memory_store.get_long_term()
    assert memory == ""

    # Write a memory
    await memory_store.update_long_term("The user likes coffee")

    # Read it back and verify
    memory = await memory_store.get_long_term()
    assert "The user likes coffee" in memory

@pytest.mark.asyncio
async def test_daily_notes(memory_store):
    """Daily notes."""
    # Write today's note
    await memory_store.append_today("Talked about testing today")

    # Read today's note
    today = datetime.now().strftime("%Y-%m-%d")
    notes = await memory_store.get_notes_by_date(today)
    assert "Talked about testing today" in notes

    # Test search
    results = await memory_store.search_notes("testing", days=1)
    assert "testing" in results
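
One caveat with the daily-notes test: it calls `datetime.now()` after writing, so a run that straddles midnight could look for the wrong file. A common remedy is to make the clock injectable. The sketch below shows the idea on a simplified, hypothetical `DailyNotes` class; it is not nanobot's MemoryStore, and only the `memory/YYYY-MM-DD.md` layout is taken from this article.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import Callable


class DailyNotes:
    """Hypothetical, simplified notes store with an injectable clock."""

    def __init__(self, workspace: Path, today: Callable[[], date] = date.today):
        self.dir = workspace / "memory"
        self.today = today

    def append_today(self, text: str) -> Path:
        self.dir.mkdir(parents=True, exist_ok=True)
        path = self.dir / f"{self.today().isoformat()}.md"
        with path.open("a", encoding="utf-8") as f:
            f.write(text + "\n")
        return path


with tempfile.TemporaryDirectory() as tmpdir:
    # The test pins "today" so the expected file name is known in advance.
    notes = DailyNotes(Path(tmpdir), today=lambda: date(2026, 2, 22))
    written = notes.append_today("talked about testing")
    written_name = written.name
    written_text = written.read_text(encoding="utf-8")

print(written_name)
```

With the date fixed by the test, the assertion no longer depends on when the suite happens to run.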

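3. Integration Tests: Verifying That Modules Work Together

Unit tests guarantee that each module works correctly on its own, but the real challenge lies in how modules cooperate. Integration tests exist to verify exactly that.

3.1 Testing a complete conversation flow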

# tests/integration/test_full_conversation.py
import pytest
from unittest.mock import AsyncMock, MagicMock
from agent.loop import AgentLoop
from agent.context import ContextBuilder
from agent.tools.registry import ToolRegistry
from bus.queue import MessageBus

def make_tool_call(name, arguments):
    """Build a mock tool call; .name must be set after construction."""
    function = MagicMock(arguments=arguments)
    function.name = name
    return MagicMock(function=function)

# temp_workspace must be shared through conftest.py to be visible here.
# The fixture awaits nothing, so a plain (non-async) fixture is enough and
# works in every pytest-asyncio mode.
@pytest.fixture
def integrated_agent(temp_workspace):
    """Create an Agent for integration tests (with a mocked LLM)."""
    # Real component (not mocked)
    bus = MessageBus()

    # Mocked LLM provider (avoids real calls)
    llm = AsyncMock()

    # Real tool registry
    tool_registry = ToolRegistry()
    tool_registry.register_builtin_tools()

    # Confine the tools to the temporary working directory
    for tool in tool_registry.get_all():
        if hasattr(tool, 'workspace'):
            tool.workspace = temp_workspace

    # Real context builder
    context_builder = ContextBuilder(
        config={"workspace": temp_workspace}
    )

    # Create the Agent
    agent = AgentLoop(
        llm_provider=llm,
        tool_registry=tool_registry,
        config={"max_iterations": 5},
        message_bus=bus
    )
    agent.context_builder = context_builder

    return agent, llm, bus

@pytest.mark.asyncio
async def test_file_operation_conversation(integrated_agent, temp_workspace):
    """A complete conversation that involves file operations."""
    agent, mock_llm, bus = integrated_agent

    # Script the LLM's multi-round responses
    mock_llm.complete.side_effect = [
        # Round 1: ask to write a file
        MagicMock(
            tool_calls=[make_tool_call(
                "write_file", '{"path": "test.txt", "content": "Hello"}'
            )],
            content=None
        ),
        # Round 2: ask to read the file
        MagicMock(
            tool_calls=[make_tool_call("read_file", '{"path": "test.txt"}')],
            content=None
        ),
        # Round 3: return the final answer
        MagicMock(
            tool_calls=None,
            content="File operations finished, the content is Hello"
        )
    ]

    # Mock the incoming message
    message = MagicMock()
    message.content = "Create a file for me and read it back"
    message.session_id = "test-session"

    # Run
    response = await agent.process_message(message)

    # Verify the final response
    assert response.content == "File operations finished, the content is Hello"

    # Verify the file was really created
    test_file = temp_workspace / "test.txt"
    assert test_file.exists()
    assert test_file.read_text() == "Hello"

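3.2 Testing tool chains

Some scenarios require the Agent to call several tools in a row. Testing them means verifying both the order of the tool calls and how results are passed along.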

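# tests/integration/test_tool_chains.py
# Assumes the integrated_agent and temp_workspace fixtures are shared
# through conftest.py.
import pytest
from unittest.mock import MagicMock, patch

def make_tool_call(name, arguments):
    """Build a mock tool call; .name must be set after construction."""
    function = MagicMock(arguments=arguments)
    function.name = name
    return MagicMock(function=function)

@pytest.mark.asyncio
async def test_search_and_read_chain(integrated_agent, temp_workspace):
    """Tool chain: search the web, then read the results."""
    agent, mock_llm, bus = integrated_agent

    # Script the LLM: search first, then read
    mock_llm.complete.side_effect = [
        # Round 1: search
        MagicMock(
            tool_calls=[make_tool_call(
                "web_search", '{"query": "nanobot testing"}'
            )],
            content=None
        ),
        # Round 2: read the search results file
        MagicMock(
            tool_calls=[make_tool_call(
                "read_file", '{"path": "search_results.txt"}'
            )],
            content=None
        ),
        # Round 3: summarize
        MagicMock(
            tool_calls=None,
            content="The search results contain testing-related content"
        )
    ]

    # Mock the web_search tool so no real request is made
    with patch('agent.tools.web.WebSearchTool._execute_impl') as mock_search:
        mock_search.return_value = "Search results saved to search_results.txt"

        message = MagicMock()
        message.content = "Search for nanobot testing content"
        message.session_id = "test-session"

        response = await agent.process_message(message)

        assert response.content == "The search results contain testing-related content"
        assert mock_llm.complete.call_count == 3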
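
The test above checks the final answer and the number of LLM calls. The order of the tool calls can be asserted as well by comparing an AsyncMock's `await_args_list` with the expected sequence. The sketch below illustrates the technique on a stand-in `run_chain` function; the single `dispatch` entry point is an assumption made for illustration and does not reflect how nanobot routes tool calls.

```python
import asyncio
from unittest.mock import AsyncMock, call


async def run_chain(dispatch):
    """Stand-in for an agent that runs two tools in sequence."""
    saved = await dispatch("web_search", query="nanobot testing")
    # The first tool's result is passed on to the second tool.
    return await dispatch("read_file", path=saved)


dispatch = AsyncMock(side_effect=["search_results.txt", "file contents"])
final = asyncio.run(run_chain(dispatch))

expected_calls = [
    call("web_search", query="nanobot testing"),
    call("read_file", path="search_results.txt"),
]
# await_args_list records every awaited call, in order.
order_ok = dispatch.await_args_list == expected_calls
print(final, order_ok)
```

The second expected call also proves result passing: `read_file` receives exactly what `web_search` returned.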

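4. Debugging Techniques: Making the Code "Speak"

Even with thorough tests, bugs are hard to avoid. nanobot provides a set of debugging tools that help developers locate problems quickly.

4.1 Debug mode

Start nanobot with the --debug flag to see detailed internal logs:

nanobot agent -m "Hello" --debug

Sample output:

09:15:23 [DEBUG] agent: message received - session_id=abc123, content_preview="Hello"
09:15:23 [DEBUG] context: building context - system prompt length=245, history messages=0
09:15:23 [DEBUG] memory: reading long-term memory - file size=1.2KB
09:15:23 [DEBUG] memory: reading daily notes - found 3 records in the last 7 days
09:15:23 [INFO] llm: calling model - model=anthropic/claude-opus-4-5
09:15:25 [DEBUG] llm: API response - tokens=156, finish_reason=stop
09:15:25 [DEBUG] agent: parsed LLM response - has tool calls=False
09:15:25 [INFO] agent: generated final response - length=42
09:15:25 [DEBUG] bus: published message - channel=telegram, session_id=abc123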
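
Log lines in this shape (time, level, component, message) can be produced with the standard `logging` module. The following is a minimal sketch of such a setup, not nanobot's actual logging code: the function `setup_logging` and the logger name `nanobot_demo` are assumptions made for illustration.

```python
import io
import logging

LOG_FORMAT = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"


def setup_logging(debug: bool, stream=None) -> logging.Handler:
    """Attach one handler to the demo logger; DEBUG only when debug=True."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt="%H:%M:%S"))
    root = logging.getLogger("nanobot_demo")
    root.handlers = [handler]
    root.setLevel(logging.DEBUG if debug else logging.INFO)
    root.propagate = False  # keep demo output out of the global root logger
    return handler


# Capture the output in memory so it can be inspected.
buffer = io.StringIO()
setup_logging(debug=True, stream=buffer)
logging.getLogger("nanobot_demo.agent").debug(
    "message received - session_id=%s", "abc123"
)
logging.getLogger("nanobot_demo.llm").info("calling model")
lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```

Child loggers such as `nanobot_demo.agent` inherit the level and handler, so a single debug switch controls the verbosity of every component.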

4.2 Debugging with PDB

Inserting breakpoint() into the code drops you into the interactive debugger:

# agent/loop.py
async def process_message(self, message):
    # ... earlier code ...

    response = await self.llm.complete(context.messages)
    breakpoint()  # Pause here and inspect response

    if response.tool_calls:
        # ...

When execution reaches breakpoint(), it enters the PDB prompt, where you can inspect variables, run code, and step through line by line:

> /home/user/nanobot/agent/loop.py(123)process_message()
-> if response.tool_calls:
(Pdb) print(response.tool_calls)
[FunctionCall(name='write_file', arguments={'path': 'test.txt'})]
(Pdb) print(context.messages[-1])
{'role': 'user', 'content': 'Write a file for me'}
(Pdb) continue

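4.3 Inspecting memory files

nanobot stores its memory in human-readable Markdown files, which you can view directly:

# View long-term memory
cat ~/.nanobot/workspace/MEMORY.md

# View today's notes
cat ~/.nanobot/workspace/memory/$(date +%Y-%m-%d).md

# View skill definitions
cat ~/.nanobot/workspace/skills/*.md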
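
The same date-based layout is easy to work with from Python when several days of notes need checking at once. The helper below is a hypothetical sketch: `recent_note_paths` is a name invented here, and it only assumes the `memory/YYYY-MM-DD.md` naming shown in the commands above.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import List


def recent_note_paths(workspace: Path, today: date, days: int = 7) -> List[Path]:
    """Paths of the daily-note files for the last `days` days, newest first."""
    return [
        workspace / "memory" / f"{(today - timedelta(days=offset)).isoformat()}.md"
        for offset in range(days)
    ]


paths = recent_note_paths(Path("workspace"), date(2026, 2, 22), days=3)
names = [p.name for p in paths]
print(names)
```

Filtering the returned paths with `Path.exists()` shows which days actually have notes.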

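4.4 Verifying tool calls

The --show-tools option displays the detailed arguments and result of every tool call:

nanobot agent -m "Write a Python script for me" --show-tools

Output:

🤖 User: Write a Python script for me

🧠 LLM decision: call tool write_file

🔧 Tool call [1/1]:
  Tool: write_file
  Arguments: {
    "path": "hello.py",
    "content": "print('Hello, nanobot!')"
  }

📤 Tool result:
  Successfully wrote file: hello.py

🧠 LLM reasoning again...

💬 Final response:
  I have created hello.py for you. It contains a single print statement.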

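4.5 Troubleshooting guide for common problems

Symptom	Likely cause	How to investigate
Agent does not respond	Message bus is not running	Check whether `nanobot gateway` is running
LLM call times out	Invalid API key or network problem	Check `llm.log`; try calling the API with curl
Tool execution fails	Insufficient permissions or wrong path	Enable `--debug` to see the detailed error
Memory has no effect	`MEMORY.md` is malformed	Check that the file is valid Markdown
Plugin is not loaded	Wrong plugin directory path	Look at the plugin scan messages in the startup log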

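5. Continuous Integration: An Automated Test Pipeline

To safeguard code quality, the nanobot project has a continuous integration (CI) pipeline that runs the tests automatically on every commit.

5.1 Example GitHub Actions configuration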

# .github/workflows/test.yml
name: Test

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11"]
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install pytest pytest-asyncio pytest-cov
        pip install -e .
    
    # providers/ is measured too, since it has its own coverage target
    - name: Run unit tests
      run: |
        pytest tests/unit/ --cov=agent --cov=providers --cov-report=xml
    
    - name: Run integration tests
      run: |
        pytest tests/integration/ --cov=agent --cov=providers --cov-append --cov-report=xml
    
    - name: Upload coverage report
      uses: codecov/codecov-action@v3
      with:
        file: ./coverage.xml

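5.2 Test coverage requirements

The nanobot project requires test coverage of at least 80% for its core modules:

Module	Target coverage	Key test points
`agent/loop.py`	90%	Main loop, iteration control, error handling
`agent/context.py`	85%	Context building, memory injection
`agent/memory.py`	90%	Read and write operations, file handling
`agent/tools/`	80%	Core functionality of each tool
`providers/`	80%	LLM calls, parameter handling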
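
Per-module targets like these can be checked against the `coverage.xml` report that the CI job produces. The sketch below parses a small hand-written XML fragment in the Cobertura style; the fragment, the `below_target` function, and the rates are illustrative assumptions, and the file names in a real report depend on how coverage is configured.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hand-written minimal report fragment, for illustration only.
SAMPLE_REPORT = """<coverage line-rate="0.88">
  <packages>
    <package name="agent" line-rate="0.88">
      <classes>
        <class name="loop.py" filename="agent/loop.py" line-rate="0.93"/>
        <class name="context.py" filename="agent/context.py" line-rate="0.80"/>
      </classes>
    </package>
  </packages>
</coverage>"""

TARGETS = {"agent/loop.py": 0.90, "agent/context.py": 0.85}


def below_target(report_xml: str, targets: Dict[str, float]) -> List[str]:
    """List the files whose line coverage is below their target."""
    root = ET.fromstring(report_xml)
    failures = []
    for cls in root.iter("class"):
        filename = cls.get("filename")
        rate = float(cls.get("line-rate", "0"))
        if filename in targets and rate < targets[filename]:
            failures.append(f"{filename}: {rate:.0%} < {targets[filename]:.0%}")
    return failures


failures = below_target(SAMPLE_REPORT, TARGETS)
print(failures)
```

For a single project-wide threshold, pytest-cov's `--cov-fail-under` option is a simpler alternative.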

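6. Summary: Design Lessons from Testing and Debugging

Looking back at the whole testing and debugging setup, a few key design lessons stand out:

Design point	Problem solved	How it is done
Test pyramid	Layered verification that balances speed and coverage	Unit tests (70%) + integration tests (20%) + E2E (10%)
Mocked external dependencies	Isolates the LLM API and avoids cost	`unittest.mock` mocks the API responses
Temporary file system	Tool tests do not pollute the real environment	`tempfile` + `pyfakefs`
Debug mode	Inspect internal state at runtime	Structured logs + verbosity levels
Tool call visualization	Understand Agent behavior at a glance	`--show-tools` option
CI automation	Every commit is verified automatically	GitHub Actions + tests on multiple Python versions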