The OpenClaw Memory System: Engineering Long-Term Memory for AI Assistants

Introduction

In modern AI assistant architectures, a long-term memory system is the key component behind context awareness and personalized interaction. OpenClaw, a personal AI assistant platform, has a Memory system whose design reflects careful attention to performance, reliability, and user experience. This article walks through the OpenClaw Memory system from three angles: architecture, core implementation, and technical highlights.


1. System Architecture Overview

1.1 Layered Architecture

The OpenClaw Memory system uses a clean layered architecture:

┌─────────────────────────────────────────────────────────────────┐
│                        Application Layer                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ CLI Commands │  │   Web UI     │  │   Agents     │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└─────────────────────────────────────────────────────────────────┘
                                ↓
┌─────────────────────────────────────────────────────────────────┐
│                         Interface Layer                         │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  MemorySearchManager Interface                           │  │
│  │  - search()    - readFile()  - status()                  │  │
│  │  - sync()      - close()     - probeEmbeddingAvailability│  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                                ↓
┌─────────────────────────────────────────────────────────────────┐
│                        Management Layer                         │
│  ┌──────────────────┐  ┌──────────────────┐                     │
│  │ MemoryIndexManager│  │ FallbackManager  │                     │
│  │ (Builtin SQLite)  │  │ (QMD + Builtin)  │                     │
│  └──────────────────┘  └──────────────────┘                     │
└─────────────────────────────────────────────────────────────────┘
                                ↓
┌─────────────────────────────────────────────────────────────────┐
│                          Search Layer                           │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ Vector Search│  │Keyword Search│  │Hybrid Merge  │           │
│  │ (Cosine Sim) │  │   (BM25/FTS) │  │ (MMR+Decay)  │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└─────────────────────────────────────────────────────────────────┘
                                ↓
┌─────────────────────────────────────────────────────────────────┐
│                          Storage Layer                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ SQLite DB    │  │ Vector Ext   │  │ FTS5 Table   │           │
│  │ (chunks,files)│  │(sqlite-vec)  │  │ (full-text)  │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└─────────────────────────────────────────────────────────────────┘
                                ↓
┌─────────────────────────────────────────────────────────────────┐
│                         Embedding Layer                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │  OpenAI      │  │   Gemini     │  │    Local     │           │
│  │  Voyage      │  │   Mistral    │  │   Ollama     │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└─────────────────────────────────────────────────────────────────┘

1.2 Core Component Interaction

User query
  → MemorySearchManager
  → MemoryIndexManager
  → is an embedding provider available?
      ├─ yes: embed the query → vector search + keyword search
      │        → hybrid merge → temporal decay → MMR rerank → return results
      └─ no (FTS mode): keyword search → return results
  → read file contents behind the returned hits

2. Core Functionality

2.1 Vector Search

Vector search is the heart of the Memory system; it performs semantic matching via cosine similarity:

// src/memory/manager-search.ts
export async function searchVector(params: {
  db: DatabaseSync;
  vectorTable: string;
  providerModel: string;
  queryVec: number[];
  limit: number;
  snippetMaxChars: number;
  ensureVectorReady: (dimensions: number) => Promise<boolean>;
  sourceFilterVec: { sql: string; params: SearchSource[] };
  sourceFilterChunks: { sql: string; params: SearchSource[] };
}): Promise<SearchRowResult[]> {
  // Prefer the sqlite-vec extension for accelerated search
  if (await params.ensureVectorReady(params.queryVec.length)) {
    const rows = params.db.prepare(`
      SELECT c.id, c.path, c.start_line, c.end_line, c.text,
             c.source,
             vec_distance_cosine(v.embedding, ?) AS dist
      FROM ${params.vectorTable} v
      JOIN chunks c ON c.id = v.id
      WHERE c.model = ?${params.sourceFilterVec.sql}
      ORDER BY dist ASC
      LIMIT ?
    `).all(vectorToBlob(params.queryVec), params.providerModel,
            ...params.sourceFilterVec.params, params.limit);

    return rows.map(row => ({
      id: row.id,
      path: row.path,
      startLine: row.start_line,
      endLine: row.end_line,
      score: 1 - row.dist,  // convert distance to similarity
      snippet: truncateUtf16Safe(row.text, params.snippetMaxChars),
      source: row.source,
    }));
  }

  // Fall back to in-memory computation
  const candidates = listChunks(params);
  const scored = candidates.map(chunk => ({
    chunk,
    score: cosineSimilarity(params.queryVec, chunk.embedding)
  })).filter(entry => Number.isFinite(entry.score));

  return scored.toSorted((a, b) => b.score - a.score)
               .slice(0, params.limit)
               .map(entry => ({ /* ... */ }));
}

Key points:

  • Dual implementation paths: prefer the sqlite-vec extension for speed, with automatic fallback to in-memory computation
  • Cosine similarity: score = 1 - cosine_distance, in the range [0, 1]
  • Source filtering: results can be filtered by memory / sessions source
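
The in-memory fallback path leans on a plain cosine-similarity helper. The source does not show `cosineSimilarity` itself, so the following is a minimal sketch of what it might look like:

```typescript
// Minimal sketch of the cosine-similarity helper assumed by the fallback path.
// Returns NaN for mismatched or empty vectors so callers can filter with
// Number.isFinite, as the fallback code above does.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return NaN;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? NaN : dot / denom;
}
```

For unit-normalized embeddings (which most providers return), this is equivalent to a plain dot product, which is why `1 - vec_distance_cosine` on the sqlite-vec path and this fallback produce comparable scores.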

2.2 Full-Text Search (FTS)

High-performance keyword search is built on the SQLite FTS5 extension:

// src/memory/manager-search.ts
export async function searchKeyword(params: {
  db: DatabaseSync;
  ftsTable: string;
  providerModel: string | undefined;
  query: string;
  limit: number;
  snippetMaxChars: number;
  sourceFilter: { sql: string; params: SearchSource[] };
  buildFtsQuery: (raw: string) => string | null;
  bm25RankToScore: (rank: number) => number;
}): Promise<Array<SearchRowResult & { textScore: number }>> {
  const ftsQuery = params.buildFtsQuery(params.query);
  if (!ftsQuery) return [];

  // BM25 ranking + AND query
  const modelClause = params.providerModel ? " AND model = ?" : "";
  const modelParams = params.providerModel ? [params.providerModel] : [];

  const rows = params.db.prepare(`
    SELECT id, path, source, start_line, end_line, text,
           bm25(${params.ftsTable}) AS rank
    FROM ${params.ftsTable}
    WHERE ${params.ftsTable} MATCH ?${modelClause}${params.sourceFilter.sql}
    ORDER BY rank ASC
    LIMIT ?
  `).all(ftsQuery, ...modelParams, ...params.sourceFilter.params, params.limit);

  return rows.map(row => {
    const textScore = params.bm25RankToScore(row.rank);
    return {
      id: row.id,
      path: row.path,
      startLine: row.start_line,
      endLine: row.end_line,
      score: textScore,
      textScore,
      snippet: truncateUtf16Safe(row.text, params.snippetMaxChars),
      source: row.source,
    };
  });
}

// Convert a BM25 rank into a 0-1 score
export function bm25RankToScore(rank: number): number {
  if (!Number.isFinite(rank)) return 1 / 1000;
  if (rank < 0) {
    const relevance = -rank;
    return relevance / (1 + relevance);
  }
  return 1 / (1 + rank);
}

Key techniques:

  • FTS5 virtual table: a full-text index supporting the MATCH operator
  • BM25 ranking: the classic document-relevance scoring algorithm
  • Query building: multilingual tokenization with AND/OR queries
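
`buildFtsQuery` is passed in as a parameter above rather than shown. A hypothetical sketch of such a builder, quoting each token and joining with AND (the project's real tokenization is much richer, as section 6.2 shows):

```typescript
// Hypothetical sketch of an FTS5 query builder: lowercase, split on
// whitespace/punctuation, quote each token (escaping embedded quotes for
// FTS5 string syntax), and require all tokens via AND.
export function buildFtsQuery(raw: string): string | null {
  const tokens = raw
    .toLowerCase()
    .split(/[\s\p{P}]+/u)
    .filter(Boolean)
    .map((t) => `"${t.replace(/"/g, '""')}"`);
  return tokens.length > 0 ? tokens.join(" AND ") : null;
}
```

Returning null for an empty query lets `searchKeyword` short-circuit, matching the `if (!ftsQuery) return [];` guard above.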

2.3 Hybrid Search

Hybrid search combines the strengths of vector and keyword search:

// src/memory/hybrid.ts
export async function mergeHybridResults(params: {
  vector: HybridVectorResult[];
  keyword: HybridKeywordResult[];
  vectorWeight: number;
  textWeight: number;
  workspaceDir?: string;
  mmr?: Partial<MMRConfig>;
  temporalDecay?: Partial<TemporalDecayConfig>;
  nowMs?: number;
}): Promise<Array<{ path: string; startLine: number; endLine: number;
                     score: number; snippet: string; source: string }>> {
  // 1. Merge vector and keyword results
  const byId = new Map<string, {
    id: string; path: string; startLine: number; endLine: number;
    source: string; snippet: string; vectorScore: number; textScore: number;
  }>();

  for (const r of params.vector) {
    byId.set(r.id, { ...r, textScore: 0 });
  }

  for (const r of params.keyword) {
    const existing = byId.get(r.id);
    if (existing) {
      existing.textScore = r.textScore;
    } else {
      byId.set(r.id, { ...r, vectorScore: 0 });
    }
  }

  // 2. Weighted combination
  const merged = Array.from(byId.values()).map(entry => {
    const score = params.vectorWeight * entry.vectorScore +
                  params.textWeight * entry.textScore;
    return { ...entry, score };
  });

  // 3. Apply temporal decay
  const decayed = await applyTemporalDecayToHybridResults({
    results: merged,
    temporalDecay: { ...DEFAULT_TEMPORAL_DECAY_CONFIG, ...params.temporalDecay },
    workspaceDir: params.workspaceDir,
    nowMs: params.nowMs,
  });

  // 4. Sort by score
  const sorted = decayed.toSorted((a, b) => b.score - a.score);

  // 5. Apply MMR diversity reranking
  const mmrConfig = { ...DEFAULT_MMR_CONFIG, ...params.mmr };
  if (mmrConfig.enabled) {
    return applyMMRToHybridResults(sorted, mmrConfig);
  }

  return sorted;
}

Hybrid search flow:

  1. Merge results: union vector and keyword results by ID
  2. Weighted scoring: score = vectorWeight * vectorScore + textWeight * textScore
  3. Temporal decay: adjust scores by file age
  4. Diversity reranking: the MMR algorithm prevents near-duplicate results
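
Steps 1 and 2 of this flow can be sketched as a small standalone function. The weights below are illustrative defaults, not the project's actual configuration:

```typescript
interface PartialHit {
  id: string;
  vectorScore: number;
  textScore: number;
}

// Sketch of hybrid steps 1-2: union results by id (a hit found by only one
// search contributes 0 on the other axis), then apply the weighted sum.
function mergeById(
  vector: Array<{ id: string; score: number }>,
  keyword: Array<{ id: string; score: number }>,
  vectorWeight = 0.7, // assumed default weight, for illustration only
  textWeight = 0.3,
): Array<{ id: string; score: number }> {
  const byId = new Map<string, PartialHit>();
  for (const v of vector) {
    byId.set(v.id, { id: v.id, vectorScore: v.score, textScore: 0 });
  }
  for (const k of keyword) {
    const hit = byId.get(k.id);
    if (hit) hit.textScore = k.score;
    else byId.set(k.id, { id: k.id, vectorScore: 0, textScore: k.score });
  }
  return [...byId.values()]
    .map((h) => ({ id: h.id, score: vectorWeight * h.vectorScore + textWeight * h.textScore }))
    .sort((a, b) => b.score - a.score);
}
```

With these weights, a chunk found by both searches (vector 1.0, keyword 0.5) scores 0.85 and outranks a keyword-only hit with a perfect text score (0.3), which is exactly the semantic-first bias the weighting encodes.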

2.4 Embedding Management

Multi-provider support
// src/memory/embeddings.ts
export async function createEmbeddingProvider(
  options: EmbeddingProviderOptions
): Promise<EmbeddingProviderResult> {
  const requestedProvider = options.provider;
  const fallback = options.fallback;

  const createProvider = async (id: EmbeddingProviderId) => {
    if (id === "local") {
      const provider = await createLocalEmbeddingProvider(options);
      return { provider };
    }
    if (id === "ollama") {
      const { provider, client } = await createOllamaEmbeddingProvider(options);
      return { provider, ollama: client };
    }
    // ... other providers
  };

  // Auto mode: automatically pick the first available provider
  if (requestedProvider === "auto") {
    const missingKeyErrors: string[] = [];
    let localError: string | undefined;

    // Try the local model first
    if (canAutoSelectLocal(options)) {
      try {
        const local = await createProvider("local");
        return { ...local, requestedProvider };
      } catch (err) {
        localError = formatLocalSetupError(err);
      }
    }

    // Then try remote providers, in order
    for (const provider of REMOTE_EMBEDDING_PROVIDER_IDS) {
      try {
        const result = await createProvider(provider);
        return { ...result, requestedProvider };
      } catch (err) {
        if (isMissingApiKeyError(err)) {
          missingKeyErrors.push(formatErrorMessage(err));
          continue;
        }
        throw err;
      }
    }

    // All providers unavailable → FTS-only mode
    return {
      provider: null,
      requestedProvider,
      providerUnavailableReason: "No embeddings provider available."
    };
  }

  // Explicitly requested provider, with optional fallback
  try {
    const primary = await createProvider(requestedProvider);
    return { ...primary, requestedProvider };
  } catch (primaryErr) {
    if (fallback && fallback !== "none" && fallback !== requestedProvider) {
      try {
        const fallbackResult = await createProvider(fallback);
        return {
          ...fallbackResult,
          requestedProvider,
          fallbackFrom: requestedProvider,
          fallbackReason: formatErrorMessage(primaryErr),
        };
      } catch (fallbackErr) {
        // Both primary and fallback failed → FTS-only
        if (isMissingApiKeyError(primaryErr) && isMissingApiKeyError(fallbackErr)) {
          return {
            provider: null,
            requestedProvider,
            providerUnavailableReason: "Both primary and fallback failed."
          };
        }
        throw fallbackErr;
      }
    }
    // No fallback → FTS-only
    if (isMissingApiKeyError(primaryErr)) {
      return {
        provider: null,
        requestedProvider,
        providerUnavailableReason: formatErrorMessage(primaryErr),
      };
    }
    throw primaryErr;
  }
}
Batch processing optimization
// src/memory/manager-embedding-ops.ts
private async embedChunksWithProviderBatch<TRequest extends { custom_id: string }>(params: {
  chunks: MemoryChunk[];
  entry: MemoryFileEntry | SessionFileEntry;
  source: MemorySource;
  provider: "voyage" | "openai" | "gemini";
  enabled: boolean;
  buildRequest: (chunk: MemoryChunk) => Omit<TRequest, "custom_id">;
  runBatch: (runnerOptions: {
    agentId: string;
    requests: TRequest[];
    wait: boolean;
    concurrency: number;
    pollIntervalMs: number;
    timeoutMs: number;
    debug: (message: string, data?: Record<string, unknown>) => void;
  }) => Promise<Map<string, number[]> | number[][]>;
}): Promise<number[][]> {
  if (!params.enabled) {
    return this.embedChunksInBatches(params.chunks);
  }

  // 1. Load existing embeddings from the cache
  const { embeddings, missing } = this.collectCachedEmbeddings(params.chunks);
  if (missing.length === 0) {
    return embeddings;
  }

  // 2. Build the batch requests
  const { requests, mapping } = this.buildBatchRequests<TRequest>({
    missing,
    entry: params.entry,
    source: params.source,
    build: params.buildRequest,
  });

  // 3. Run the batch embedding (with fallback)
  const runnerOptions = this.buildEmbeddingBatchRunnerOptions({
    requests,
    chunks: params.chunks,
    source: params.source,
  });

  const batchResult = await this.runBatchWithFallback({
    provider: params.provider,
    run: async () => await params.runBatch(runnerOptions),
    fallback: async () => await this.embedChunksInBatches(params.chunks),
  });

  // 4. Apply the batch results
  if (Array.isArray(batchResult)) {
    return batchResult;
  }
  this.applyBatchEmbeddings({ byCustomId: batchResult, mapping, embeddings });
  return embeddings;
}

// Fallback path when batch processing fails
private async runBatchWithFallback<T>(params: {
  provider: string;
  run: () => Promise<T>;
  fallback: () => Promise<number[][]>;
}): Promise<T | number[][]> {
  if (!this.batch.enabled) {
    return await params.fallback();
  }

  try {
    const result = await this.runBatchWithTimeoutRetry({
      provider: params.provider,
      run: params.run,
    });
    await this.resetBatchFailureCount();
    return result;
  } catch (err) {
    const failure = await this.recordBatchFailure({
      provider: params.provider,
      message: err instanceof Error ? err.message : String(err),
    });

    if (failure.disabled) {
      log.warn(`Batch disabled after ${failure.count} failures`);
    }

    // Fall back to non-batched embedding
    return await params.fallback();
  }
}

Batch processing highlights:

  • Cache first: load existing embeddings from the embedding_cache table before calling any API
  • Smart grouping: batch requests are grouped under a token budget (8000)
  • Failure fallback: batching is disabled automatically after 2 consecutive failures
  • Timeout retry: a timed-out batch is retried once
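
The retry-once-on-timeout policy can be sketched as a tiny wrapper. `runBatchWithTimeoutRetry` itself is not shown in the source, so this is an assumed shape, not the project's implementation:

```typescript
// Sketch of a retry-once-on-timeout policy: rerun the operation exactly once
// if the first failure looks like a timeout; any other error (or a second
// timeout) propagates to the caller, who then falls back to non-batched mode.
async function retryOnceOnTimeout<T>(
  run: () => Promise<T>,
  isTimeout: (err: unknown) => boolean,
): Promise<T> {
  try {
    return await run();
  } catch (err) {
    if (isTimeout(err)) return await run(); // single retry, then give up
    throw err;
  }
}
```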

2.5 Sync Mechanism

// src/memory/manager.ts
async sync(params?: {
  reason?: string;
  force?: boolean;
  progress?: (update: MemorySyncProgressUpdate) => void;
}): Promise<void> {
  if (this.closed) return;
  if (this.syncing) return this.syncing;

  this.syncing = this.runSyncWithReadonlyRecovery(params).finally(() => {
    this.syncing = null;
  });
  return this.syncing ?? Promise.resolve();
}

private async runSyncWithReadonlyRecovery(params?: {
  reason?: string;
  force?: boolean;
  progress?: (update: MemorySyncProgressUpdate) => void;
}): Promise<void> {
  try {
    await this.runSync(params);
    return;
  } catch (err) {
    // Detect a readonly-database error
    if (!this.isReadonlyDbError(err) || this.closed) {
      throw err;
    }

    this.readonlyRecoveryAttempts += 1;
    log.warn(`Readonly DB error detected; reopening connection`);

    // 1. Close the old connection
    try { this.db.close(); } catch {}

    // 2. Reopen the database
    this.db = this.openDatabase();

    // 3. Reset vector state
    this.vectorReady = null;
    this.vector.available = null;
    this.vector.loadError = undefined;

    // 4. Re-ensure the schema
    this.ensureSchema();

    // 5. Retry the sync
    try {
      await this.runSync(params);
      this.readonlyRecoverySuccesses += 1;
    } catch (retryErr) {
      this.readonlyRecoveryFailures += 1;
      throw retryErr;
    }
  }
}

Sync mechanism highlights:

  • Readonly recovery: readonly-database errors are detected and recovered from automatically
  • Dedup guard: a shared syncing Promise prevents concurrent syncs
  • File watching: chokidar watches for file changes and triggers syncs automatically
  • Session watching: session message changes trigger updates in real time
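
The dedup guard in sync() generalizes to a reusable pattern: concurrent callers share one in-flight Promise instead of starting a second sync. A self-contained sketch (the class name is invented for illustration):

```typescript
// Sketch of the single-in-flight guard used by sync(): while a sync is
// running, every caller gets the same Promise; once it settles, the next
// call starts a fresh sync.
class SyncGate {
  private inflight: Promise<void> | null = null;

  constructor(private readonly doSync: () => Promise<void>) {}

  sync(): Promise<void> {
    if (this.inflight) return this.inflight;
    this.inflight = this.doSync().finally(() => {
      this.inflight = null;
    });
    return this.inflight;
  }
}
```

Clearing the slot in finally (rather than then) matters: a failed sync must also release the gate, or every later caller would wait on a permanently rejected Promise.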

3. Technical Highlights

3.1 Temporal Decay

// src/memory/temporal-decay.ts
export function calculateTemporalDecayMultiplier(params: {
  ageInDays: number;
  halfLifeDays: number;
}): number {
  const lambda = Math.LN2 / params.halfLifeDays;
  const clampedAge = Math.max(0, params.ageInDays);
  if (lambda <= 0 || !Number.isFinite(clampedAge)) {
    return 1;
  }
  return Math.exp(-lambda * clampedAge);
}

export async function applyTemporalDecayToHybridResults<T extends {
  path: string; score: number; source: string
}>(params: {
  results: T[];
  temporalDecay?: Partial<TemporalDecayConfig>;
  workspaceDir?: string;
  nowMs?: number;
}): Promise<T[]> {
  const config = { ...DEFAULT_TEMPORAL_DECAY_CONFIG, ...params.temporalDecay };
  if (!config.enabled) return [...params.results];

  const nowMs = params.nowMs ?? Date.now();
  const timestampPromiseCache = new Map<string, Promise<Date | null>>();

  return Promise.all(params.results.map(async (entry) => {
    // 1. Extract a timestamp (from date-based paths or file mtime)
    const cacheKey = `${entry.source}:${entry.path}`;
    let timestampPromise = timestampPromiseCache.get(cacheKey);
    if (!timestampPromise) {
      timestampPromise = extractTimestamp({
        filePath: entry.path,
        source: entry.source,
        workspaceDir: params.workspaceDir,
      });
      timestampPromiseCache.set(cacheKey, timestampPromise);
    }

    const timestamp = await timestampPromise;
    if (!timestamp) return entry;  // evergreen content never decays

    // 2. Compute the decayed score
    const ageInDays = (nowMs - timestamp.getTime()) / (24 * 60 * 60 * 1000);
    const decayedScore = entry.score * calculateTemporalDecayMultiplier({
      ageInDays,
      halfLifeDays: config.halfLifeDays,
    });

    return { ...entry, score: decayedScore };
  }));
}

Temporal decay highlights:

  • Exponential decay: score * exp(-λ * age), with λ = ln(2) / halfLife
  • Evergreen protection: MEMORY.md and topic files never decay
  • Date-based paths: memory/2024-01-15.md has its date parsed automatically
  • Concurrency optimization: a Promise cache avoids repeated stat calls
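
To make the numbers concrete: with halfLifeDays = 30, λ = ln(2)/30 ≈ 0.0231, so a 30-day-old chunk keeps exactly half its score and a 60-day-old chunk a quarter. A standalone version of the multiplier:

```typescript
// Standalone restatement of calculateTemporalDecayMultiplier: exponential
// decay with the half-life parameterization, λ = ln(2) / halfLifeDays.
// Negative ages (clock skew, future-dated files) are clamped to 0.
function decayMultiplier(ageInDays: number, halfLifeDays: number): number {
  const lambda = Math.LN2 / halfLifeDays;
  return Math.exp(-lambda * Math.max(0, ageInDays));
}
```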

3.2 MMR Diversity Reranking

// src/memory/mmr.ts
export function mmrRerank<T extends MMRItem>(
  items: T[],
  config: Partial<MMRConfig> = {}
): T[] {
  const {
    enabled = DEFAULT_MMR_CONFIG.enabled,
    lambda = DEFAULT_MMR_CONFIG.lambda
  } = config;

  if (!enabled || items.length <= 1) return [...items];

  const clampedLambda = Math.max(0, Math.min(1, lambda));
  if (clampedLambda === 1) {
    return [...items].toSorted((a, b) => b.score - a.score);
  }

  // 1. Pre-tokenize all items
  const tokenCache = new Map<string, Set<string>>();
  for (const item of items) {
    tokenCache.set(item.id, tokenize(item.content));
  }

  // 2. Normalize scores to [0, 1]
  const maxScore = Math.max(...items.map(i => i.score));
  const minScore = Math.min(...items.map(i => i.score));
  const scoreRange = maxScore - minScore;

  const normalizeScore = (score: number): number => {
    if (scoreRange === 0) return 1;
    return (score - minScore) / scoreRange;
  };

  // 3. Iteratively pick the item with the best MMR score
  const selected: T[] = [];
  const remaining = new Set(items);

  while (remaining.size > 0) {
    let bestItem: T | null = null;
    let bestMMRScore = -Infinity;

    for (const candidate of remaining) {
      const normalizedRelevance = normalizeScore(candidate.score);
      const maxSim = maxSimilarityToSelected(candidate, selected, tokenCache);
      const mmrScore = clampedLambda * normalizedRelevance - (1 - clampedLambda) * maxSim;

      if (mmrScore > bestMMRScore ||
          (mmrScore === bestMMRScore && candidate.score > (bestItem?.score ?? -Infinity))) {
        bestMMRScore = mmrScore;
        bestItem = candidate;
      }
    }

    if (bestItem) {
      selected.push(bestItem);
      remaining.delete(bestItem);
    } else {
      break;
    }
  }

  return selected;
}

// Jaccard similarity
export function jaccardSimilarity(setA: Set<string>, setB: Set<string>): number {
  if (setA.size === 0 && setB.size === 0) return 1;
  if (setA.size === 0 || setB.size === 0) return 0;

  let intersectionSize = 0;
  const smaller = setA.size <= setB.size ? setA : setB;
  const larger = setA.size <= setB.size ? setB : setA;

  for (const token of smaller) {
    if (larger.has(token)) intersectionSize++;
  }

  const unionSize = setA.size + setB.size - intersectionSize;
  return unionSize === 0 ? 0 : intersectionSize / unionSize;
}

MMR algorithm highlights:

  • Relevance/diversity balance: MMR = λ * relevance - (1-λ) * max_similarity
  • Jaccard similarity: bag-of-words text similarity
  • Lambda parameter: 0 = maximum diversity, 1 = maximum relevance
  • Incremental selection: each round picks the unselected item with the highest MMR score
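
A quick worked example of the trade-off with λ = 0.7: a candidate with relevance 0.9 that overlaps an already-selected result at similarity 0.8 scores 0.7·0.9 − 0.3·0.8 = 0.39, and loses to a fresh candidate with relevance 0.6 and no overlap (0.42). Isolating the scoring step:

```typescript
// One MMR scoring step in isolation: a highly relevant candidate that
// heavily overlaps the selected set can lose to a less relevant novel one.
function mmrScore(relevance: number, maxSimToSelected: number, lambda: number): number {
  return lambda * relevance - (1 - lambda) * maxSimToSelected;
}
```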

3.3 Batch Processing Optimization

Smart grouping
// src/memory/manager-embedding-ops.ts
private buildEmbeddingBatches(chunks: MemoryChunk[]): MemoryChunk[][] {
  const batches: MemoryChunk[][] = [];
  let current: MemoryChunk[] = [];
  let currentTokens = 0;

  for (const chunk of chunks) {
    const estimate = estimateUtf8Bytes(chunk.text);
    const wouldExceed = current.length > 0 &&
                        currentTokens + estimate > EMBEDDING_BATCH_MAX_TOKENS;

    if (wouldExceed) {
      batches.push(current);
      current = [];
      currentTokens = 0;
    }

    // A single over-limit chunk becomes its own batch
    if (current.length === 0 && estimate > EMBEDDING_BATCH_MAX_TOKENS) {
      batches.push([chunk]);
      continue;
    }

    current.push(chunk);
    currentTokens += estimate;
  }

  if (current.length > 0) {
    batches.push(current);
  }

  return batches;
}
Concurrency control
// src/memory/batch-runner.ts
export async function runEmbeddingBatchGroups<TRequest>(params: {
  requests: TRequest[];
  maxRequests: number;
  wait: boolean;
  pollIntervalMs: number;
  timeoutMs: number;
  concurrency: number;
  debugLabel: string;
  debug?: (message: string, data?: Record<string, unknown>) => void;
  runGroup: (args: {
    group: TRequest[];
    groupIndex: number;
    groups: number;
    byCustomId: Map<string, number[]>;
  }) => Promise<void>;
}): Promise<Map<string, number[]>> {
  if (params.requests.length === 0) return new Map();

  // 1. Split into groups
  const groups = splitBatchRequests(params.requests, params.maxRequests);
  const byCustomId = new Map<string, number[]>();

  // 2. Run groups concurrently
  const tasks = groups.map((group, groupIndex) => async () => {
    await params.runGroup({ group, groupIndex, groups: groups.length, byCustomId });
  });

  params.debug?.(params.debugLabel, {
    requests: params.requests.length,
    groups: groups.length,
    wait: params.wait,
    concurrency: params.concurrency,
    pollIntervalMs: params.pollIntervalMs,
    timeoutMs: params.timeoutMs,
  });

  await runWithConcurrency(tasks, params.concurrency);
  return byCustomId;
}

Batch processing highlights:

  • Smart grouping: requests are grouped under an 8000-token budget; an oversized chunk gets its own batch
  • Concurrency control: configurable concurrency (default 4)
  • Timeout protection: 60s for queries, 120s for batches (5-10 min for local models)
  • Failure fallback: repeated failures disable batching automatically
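
`runWithConcurrency` is called above but not shown in the source. A minimal worker-pool sketch with that signature (an assumed implementation, not the project's):

```typescript
// Minimal sketch of runWithConcurrency: run async tasks with at most `limit`
// in flight at once. A shared queue is safe here because Array.shift is
// synchronous and JavaScript is single-threaded between awaits.
async function runWithConcurrency(
  tasks: Array<() => Promise<void>>,
  limit: number,
): Promise<void> {
  const queue = [...tasks];
  const workers = Array.from({ length: Math.max(1, limit) }, async () => {
    while (queue.length > 0) {
      const task = queue.shift();
      if (task) await task();
    }
  });
  await Promise.all(workers);
}
```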

3.4 Caching Strategy

// src/memory/manager-embedding-ops.ts
private loadEmbeddingCache(hashes: string[]): Map<string, number[]> {
  if (!this.cache.enabled || !this.provider) return new Map();
  if (hashes.length === 0) return new Map();

  // Deduplicate hashes
  const unique: string[] = [];
  const seen = new Set<string>();
  for (const hash of hashes) {
    if (hash && !seen.has(hash)) {
      seen.add(hash);
      unique.push(hash);
    }
  }

  // Query in batches of 400
  const out = new Map<string, number[]>();
  const baseParams = [this.provider.id, this.provider.model, this.providerKey];
  const batchSize = 400;

  for (let start = 0; start < unique.length; start += batchSize) {
    const batch = unique.slice(start, start + batchSize);
    const placeholders = batch.map(() => "?").join(", ");

    const rows = this.db.prepare(`
      SELECT hash, embedding
      FROM ${EMBEDDING_CACHE_TABLE}
      WHERE provider = ? AND model = ? AND provider_key = ? AND hash IN (${placeholders})
    `).all(...baseParams, ...batch) as Array<{ hash: string; embedding: string }>;

    for (const row of rows) {
      out.set(row.hash, parseEmbedding(row.embedding));
    }
  }

  return out;
}

// Cache pruning
protected pruneEmbeddingCacheIfNeeded(): void {
  if (!this.cache.enabled) return;
  const max = this.cache.maxEntries;
  if (!max || max <= 0) return;

  const row = this.db.prepare(`SELECT COUNT(*) as c FROM ${EMBEDDING_CACHE_TABLE}`).get() as
    | { c: number } | undefined;
  const count = row?.c ?? 0;

  if (count <= max) return;

  // Delete the oldest entries
  const excess = count - max;
  this.db.prepare(`
    DELETE FROM ${EMBEDDING_CACHE_TABLE}
    WHERE rowid IN (
      SELECT rowid FROM ${EMBEDDING_CACHE_TABLE}
      ORDER BY updated_at ASC
      LIMIT ?
    )
  `).run(excess);
}

Caching highlights:

  • Content-hash index: entries are keyed uniquely by provider + model + hash
  • Batched lookups: hashes are queried 400 at a time, cutting database round trips
  • LRU-style eviction: the oldest entries (by updated_at) are deleted first
  • Automatic pruning: the cache is trimmed whenever it exceeds its limit
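
The key scheme can be sketched as a single helper. SHA-256 is an assumption here; the source only shows that entries are keyed by provider + model + content hash:

```typescript
import { createHash } from "node:crypto";

// Sketch of the cache key scheme: embeddings are looked up by content hash,
// scoped to provider + model so that switching models can never reuse stale
// vectors of the wrong dimensionality. Hash algorithm is assumed.
function embeddingCacheKey(provider: string, model: string, text: string): string {
  const hash = createHash("sha256").update(text, "utf8").digest("hex");
  return `${provider}:${model}:${hash}`;
}
```

Because the key depends only on content, re-indexing an unchanged file is a pure cache hit regardless of its path, which is what makes the "90%+ fewer API calls" figure plausible for mostly-stable memory files.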

4. Sequence Diagram: The Complete Search Flow

The end-to-end flow involves six participants: User, Agent, SearchManager, IndexManager, EmbeddingProvider, and SQLite.

  1. User → Agent: sends the query "How do I configure the API?"
  2. Agent → SearchManager → IndexManager: search("How do I configure the API?")
  3. IndexManager checks provider availability:
     • Provider available:
       a. embedQuery(...) → EmbeddingProvider returns a vector [0.1, -0.2, ...]
       b. searchVector(queryVec, limit=50) against SQLite → vector results
       c. searchKeyword(query, limit=50) → keyword results
       d. mergeHybridResults() → applyTemporalDecay() → applyMMR()
     • Provider unavailable (FTS-only):
       a. extractKeywords() → searchKeyword(["configure", "API"], limit=50) → FTS results
  4. IndexManager → Agent: MemorySearchResult[]
  5. Agent → SearchManager: readFile("docs/api-config.md", 10, 20) to fetch the cited content
  6. Agent → User: the answer, with citations

5. Call Chain Analysis

5.1 Search Call Chain

Agent
  └─> memory_search tool
      └─> getMemorySearchManager()
          └─> MemoryIndexManager.get()
              └─> manager.search(query)
                  ├─> warmSession()           # session warm-up
                  ├─> embedQueryWithTimeout() # query embedding
                  ├─> searchVector()          # vector search
                  │   ├─> ensureVectorReady()
                  │   ├─> sqlite-vec search / in-memory computation
                  │   └─> cosineSimilarity()
                  ├─> searchKeyword()         # keyword search
                  │   ├─> buildFtsQuery()
                  │   ├─> bm25RankToScore()
                  │   └─> FTS5 MATCH query
                  └─> mergeHybridResults()
                      ├─> merge vector and keyword results
                      ├─> weighted scoring
                      ├─> applyTemporalDecay()
                      │   └─> extractTimestamp()
                      └─> applyMMR()
                          └─> jaccardSimilarity()

5.2 Sync Call Chain

File Watcher / Interval Timer
  └─> manager.sync()
      └─> runSyncWithReadonlyRecovery()
          └─> runSync()
              ├─> listFiles()              # scan files
              ├─> indexFile()              # index a file
              │   ├─> chunkMarkdown()      # Markdown chunking
              │   ├─> collectCachedEmbeddings()  # load cached embeddings
              │   ├─> embedChunksWithBatch()    # batch embedding
              │   │   ├─> buildEmbeddingBatches()   # grouping
              │   │   ├─> runEmbeddingBatchGroups() # concurrent execution
              │   │   └─> upsertEmbeddingCache()    # update cache
              │   └─> upsertToDB()          # write to the database
              └─> pruneEmbeddingCache()    # prune the cache

6. Key Code Examples

6.1 Markdown Chunking

// src/memory/internal.ts
export function chunkMarkdown(content: string, options?: {
  maxChars?: number;
  overlapChars?: number;
  minChars?: number;
}): MemoryChunk[] {
  const maxChars = options?.maxChars ?? 500;
  const overlapChars = options?.overlapChars ?? 50;
  const minChars = options?.minChars ?? 100;

  const lines = content.split("\n");
  const chunks: MemoryChunk[] = [];
  let currentChunk: string[] = [];
  let currentLength = 0;
  let currentStartLine = 1;

  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const lineLength = line.length;

    // A blank line acts as a separator
    if (line.trim() === "" && currentLength >= minChars) {
      if (currentChunk.length > 0) {
        chunks.push({
          text: currentChunk.join("\n"),
          startLine: currentStartLine,
          endLine: i,
          hash: hashText(currentChunk.join("\n")),
        });
        currentChunk = [];
        currentLength = 0;
        currentStartLine = i + 1;
      }
      continue;
    }

    // Check whether the chunk exceeds the maximum length
    if (currentLength + lineLength > maxChars && currentLength >= minChars) {
      chunks.push({
        text: currentChunk.join("\n"),
        startLine: currentStartLine,
        endLine: i,
        hash: hashText(currentChunk.join("\n")),
      });

      // Carry over the overlap
      const overlapLines: string[] = [];
      let overlapLength = 0;
      for (let j = currentChunk.length - 1; j >= 0; j--) {
        const overlapLine = currentChunk[j];
        if (overlapLength + overlapLine.length > overlapChars) break;
        overlapLines.unshift(overlapLine);
        overlapLength += overlapLine.length;
      }

      currentChunk = overlapLines;
      currentLength = overlapLength;
      currentStartLine = i - overlapLines.length + 1;
    }

    currentChunk.push(line);
    currentLength += lineLength;
  }

  // Flush the final chunk
  if (currentChunk.length > 0 && currentLength >= minChars) {
    chunks.push({
      text: currentChunk.join("\n"),
      startLine: currentStartLine,
      endLine: lines.length,
      hash: hashText(currentChunk.join("\n")),
    });
  }

  return chunks;
}

6.2 Query Expansion (FTS-only Mode)

// src/memory/query-expansion.ts
export function extractKeywords(query: string): string[] {
  const tokens = tokenize(query);
  const keywords: string[] = [];
  const seen = new Set<string>();

  for (const token of tokens) {
    // Skip stop words
    if (isQueryStopWordToken(token)) continue;
    // Skip invalid keywords
    if (!isValidKeyword(token)) continue;
    // Skip duplicates
    if (seen.has(token)) continue;

    seen.add(token);
    keywords.push(token);
  }

  return keywords;
}

// Multilingual tokenization
function tokenize(text: string): string[] {
  const tokens: string[] = [];
  const normalized = text.toLowerCase().trim();
  const segments = normalized.split(/[\s\p{P}]+/u).filter(Boolean);

  for (const segment of segments) {
    // Japanese: extract kana, kanji, and ASCII runs
    if (/[\u3040-\u30ff]/.test(segment)) {
      const jpParts = segment.match(
        /[a-z0-9_]+|[\u30a0-\u30ffー]+|[\u4e00-\u9fff]+|[\u3040-\u309f]{2,}/g
      ) ?? [];
      for (const part of jpParts) {
        if (/^[\u4e00-\u9fff]+$/.test(part)) {
          tokens.push(part);
          // add bigrams
          for (let i = 0; i < part.length - 1; i++) {
            tokens.push(part[i] + part[i + 1]);
          }
        } else {
          tokens.push(part);
        }
      }
    }
    // Chinese: character n-grams
    else if (/[\u4e00-\u9fff]/.test(segment)) {
      const chars = Array.from(segment).filter(c => /[\u4e00-\u9fff]/.test(c));
      tokens.push(...chars);
      for (let i = 0; i < chars.length - 1; i++) {
        tokens.push(chars[i] + chars[i + 1]);
      }
    }
    // Korean: strip trailing particles
    else if (/[\uac00-\ud7af]/.test(segment)) {
      const stem = stripKoreanTrailingParticle(segment);
      if (stem && !STOP_WORDS_KO.has(stem) && isUsefulKoreanStem(stem)) {
        tokens.push(stem);
      }
      if (!STOP_WORDS_KO.has(segment)) {
        tokens.push(segment);
      }
    }
    // Other scripts: use the segment as-is
    else {
      tokens.push(segment);
    }
  }

  return tokens;
}
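
The Chinese branch above can be exercised in isolation. This standalone demo reproduces the character-plus-bigram strategy, which lets FTS match any two-character substring of a query against the index:

```typescript
// Standalone demo of the Chinese tokenization branch: emit every Han
// character plus each adjacent-character bigram, mirroring the n-gram
// strategy in tokenize() above.
function chineseBigrams(segment: string): string[] {
  const chars = Array.from(segment).filter((c) => /[\u4e00-\u9fff]/.test(c));
  const tokens = [...chars];
  for (let i = 0; i < chars.length - 1; i++) {
    tokens.push(chars[i] + chars[i + 1]);
  }
  return tokens;
}
```

For "配置文件" (configuration file) this yields the four characters plus the bigrams 配置, 置文, and 文件, so a query for either 配置 or 文件 alone still hits the chunk.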

7. Performance Optimization Summary

Strategy              Technique                            Effect
Vector acceleration   sqlite-vec extension                 10-100x faster vector search
Caching               embedding_cache table                90%+ fewer embedding API calls
Batching              concurrent batch requests            50-80% less network overhead
Smart chunking        500-char chunks + 50-char overlap    balances precision and performance
Query expansion       multilingual tokens + bigrams        better FTS recall
Temporal decay        exponential decay + evergreen rules  fresher results
MMR reranking         Jaccard similarity                   fewer near-duplicate results
Readonly recovery     auto-detect and reconnect            better fault tolerance

8. Technical Value and Outlook

8.1 Technical Value

  1. High availability

    • The FTS-only fallback keeps search working without any API key
    • Automatic batch fallback keeps the service running
    • Automatic recovery from readonly-database errors
  2. High performance

    • Multiple cache layers (embedding cache, vector index, FTS index)
    • Concurrent batch processing
    • Smart chunking with overlap
  3. High-quality results

    • Hybrid search fuses semantic and keyword matching
    • Temporal decay keeps results fresh
    • MMR diversity reranking avoids duplicates
  4. Extensibility

    • Pluggable architecture supporting multiple storage backends (SQLite, LanceDB)
    • Multiple providers (OpenAI, Gemini, Voyage, Mistral, Ollama, Local)
    • Clean interfaces that are easy to extend

8.2 Outlook

  1. Richer semantic understanding

    • More advanced reranking models (e.g. Cohere Rerank)
    • Multimodal embeddings (images, audio)
    • Graph-augmented semantic search
  2. Personalization

    • Learning user preferences (auto-tuned search weights)
    • Behavioral analysis to improve ranking
    • Context-aware query expansion
  3. Performance

    • Distributed vector storage for large datasets
    • GPU-accelerated vector computation
    • Better incremental index updates
  4. Security

    • Encrypted storage for memory data
    • Access control and permission management
    • Audit logging and compliance support

9. Conclusion

The OpenClaw Memory system is a well-designed, thoroughly implemented long-term memory solution. Through hybrid search, smart caching, batch processing, temporal decay, and MMR reranking, it delivers high-quality, high-performance semantic search. Its layered architecture and pluggable design make it adaptable to a wide range of scenarios and user needs.

For developers building AI assistants, the OpenClaw Memory system offers valuable reference points:

  • Engineering practice: how to run reliable vector search in production
  • Performance: how caching and batching reduce cost and latency
  • Fault tolerance: how graceful degradation keeps the service available
  • User experience: how temporal decay and diversity reranking improve result quality

As AI technology evolves, the Memory system will keep evolving with it, delivering smarter and more personalized long-term memory for its users.
