Memory Networks: Building a Practical Long-Term Memory Store for AI Agents


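I. Introduction

Hook

Have you ever had this experience: you spend three days revising a paper with ChatGPT, and a week later you ask, "Do you remember how we designed the experiments section of my NeurIPS paper?", only to get a confident fabrication, as if all the earlier context were gone? Or the personal-assistant Agent you built was told yesterday that you are allergic to mango, and today it recommends mango pancakes for takeout?
This is not because the large model is not smart enough. It is because the model has a built-in "short-term memory impairment": even the 128K context window of GPT-4 Turbo holds only a few dozen papers' worth of text, and once you cross sessions, or the interaction history outgrows the window, everything outside it is "forgotten". This is also a core bottleneck that keeps AI Agents from landing in scenarios such as personal assistants, enterprise services, and game NPCs: an Agent without long-term memory stays a "one-off tool" and cannot become a real "intelligent assistant".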

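Problem background

As large-model technology matures, AI Agents have moved from proof of concept to production: from AutoGPT to enterprise "digital employees" and intelligent game NPCs, Agents are getting better at autonomous decision-making, but their weakness in memory is becoming more visible:

  1. Limited context window: even the latest models with 2M-token contexts cannot hold months or years of a user's interaction history, preferences, and experience;
  2. Memory lost across sessions: almost no base model retains memory across sessions, so every new session starts from scratch;
  3. Memory pollution: if you force the entire history into the context, large amounts of redundant, outdated, low-value information interfere with the model's judgment and raise the probability of hallucination;
  4. Inefficient memory retrieval: simply storing history in a vector database and running similarity search often returns many irrelevant items and cannot provide anything like the "associative memory" of the human brain.

The memory network is a technical approach designed to address these problems: it mimics the memory mechanisms of the human brain and gives an AI Agent an external long-term memory store with practically unlimited capacity, dynamic updates, associative retrieval, and a built-in forgetting mechanism, so that the Agent has real long-term memory.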

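Goals of this article

After reading this article, you will:

  1. Understand the core concepts and architecture of memory networks, and how they differ from and relate to RAG, knowledge graphs, and similar techniques;
  2. Build a practical long-term memory service for AI Agents from scratch, covering the full cycle of memory storage, retrieval, update, and forgetting;
  3. Know the common pitfalls, performance optimizations, and best practices for putting a memory network into production;
  4. Have a view of where memory networks are heading and which research directions are active.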

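II. Background

Core concepts

1. What is a memory network?

Memory Networks (MemNN) were first proposed by Facebook AI Research in 2014. The core idea is to separate a neural network's reasoning ability from an external, readable and writable memory module, so that the model can store and retrieve long-term information by reading and writing external memory, getting past the memory-capacity limits of the network itself.
In the era of large models, the memory network has evolved into a core component of AI Agents. It is no longer an end-to-end trained neural model but an engineered memory management system: it mimics the tiered memory of the human brain, implements the full cycle of encoding, storage, retrieval, update, and forgetting, and gives the Agent long-term memory across sessions and tasks.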

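2. The tiered memory model of an AI Agent

Human memory is commonly divided into three tiers: sensory memory, short-term (working) memory, and long-term memory. An AI Agent's memory system maps onto the same tiers:

| Memory tier | Human counterpart | Agent implementation | Capacity | Retention | Role |
| --- | --- | --- | --- | --- | --- |
| Sensory memory | Raw visual, auditory, and other input received in the moment | Speech-to-text output, image feature extraction results, raw input text | (not given) | A few seconds | Filters out useless input and keeps only what needs processing |
| Short-term (working) memory | What you are thinking about right now, such as intermediate steps of a math problem | The LLM's context window | Thousands to millions of tokens | Current session | Supports reasoning and decisions for the current task |
| Long-term memory | Knowledge, experience, and preferences retained over time | Memory network (external storage: vector database, relational database, knowledge graph) | Practically unlimited | Days to years | Retains information across sessions and tasks and gives reasoning a historical basis |

The "memory network" discussed in this article is the component responsible for the Agent's long-term memory.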
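
To make the tiering concrete, here is a minimal sketch of a two-tier memory in which a bounded working buffer stands in for the context window and evicted items flow into a long-term archive. The class name `TieredMemory` and the `window_size` parameter are illustrative assumptions, not part of AgentMem.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    """Toy two-tier memory: a bounded working buffer plus an unbounded archive."""
    window_size: int = 3  # hypothetical stand-in for the context-window limit
    short_term: deque = field(default_factory=deque)
    long_term: list = field(default_factory=list)

    def observe(self, message: str) -> None:
        """Add a message; move the oldest ones to long-term memory when full."""
        self.short_term.append(message)
        while len(self.short_term) > self.window_size:
            self.long_term.append(self.short_term.popleft())


memory = TieredMemory(window_size=2)
for msg in ["I am allergic to mango", "Book a table for Friday", "Order dessert"]:
    memory.observe(msg)

print(list(memory.short_term))  # the two most recent messages
print(memory.long_term)         # the evicted oldest message
```

A real system would not archive everything that falls out of the window; the write path described later filters by importance first.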

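3. Core capabilities of a memory network

A usable memory network needs five core capabilities:

  • Memory encoding: turn raw text, images, audio, and other content into a structured representation that can be stored and retrieved;
  • Memory storage: store different types of memory by tier and category, with efficient reads and writes;
  • Associative retrieval: retrieve relevant memories based on the semantics and context of the current query, with multi-dimensional ranking;
  • Memory update: support adding, modifying, and deleting memories, and dynamically adjust their importance and relations;
  • Active forgetting: automatically clean up outdated, low-value, and duplicate memories to avoid memory pollution and keep retrieval efficient.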

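Comparison with related techniques

Memory networks are often confused with RAG (retrieval-augmented generation). In practice the two differ considerably in positioning, capabilities, and use cases, as the table shows:

| Dimension | Traditional RAG | Agent memory network |
| --- | --- | --- |
| Core goal | Fill gaps in the LLM's static knowledge | Store the Agent's dynamic interaction history, experience, preferences, and facts |
| Stored content | Static public knowledge bases (documents, papers, web pages, etc.) | Dynamic private memory (session history, user preferences, task experience, fact records, etc.) |
| Retrieval logic | One-shot semantic similarity matching, no contextual association | Multi-dimensional hybrid ranking with association, time decay, and importance weighting |
| Update mechanism | Periodic full or incremental updates, weak real-time behavior | Real-time add, modify, and delete with dynamic updates |
| Forgetting mechanism | None, everything is kept | Active and passive forgetting that clears low-value and outdated memories |
| Interactivity | One-way retrieval, no feedback loop | User feedback adjusts importance, and retrieval results reinforce memory weights |
| Use cases | Knowledge Q&A, document summarization | AI Agents, personal assistants, intelligent customer service, game NPCs |

Memory networks and knowledge graphs are complementary: a knowledge graph can serve as a storage component of the memory network, holding the relations between entities in memories and improving associative retrieval.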
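
As a rough illustration of how entity relations can support associative retrieval, the sketch below keeps a two-way index between memories and the entities they mention, then finds memories that share at least one entity with a given memory. The `EntityIndex` class and its method names are hypothetical; a production system would typically use a graph database instead of in-memory dictionaries.

```python
from collections import defaultdict


class EntityIndex:
    """Hypothetical in-memory stand-in for the knowledge-graph component."""

    def __init__(self) -> None:
        self.entity_to_memories = defaultdict(set)
        self.memory_to_entities = defaultdict(set)

    def link(self, memory_id: str, entities: list) -> None:
        """Record which entities a memory mentions."""
        for entity in entities:
            self.entity_to_memories[entity].add(memory_id)
            self.memory_to_entities[memory_id].add(entity)

    def associated(self, memory_id: str) -> set:
        """Return memories that share at least one entity with memory_id (one hop)."""
        related = set()
        for entity in self.memory_to_entities.get(memory_id, set()):
            related |= self.entity_to_memories[entity]
        related.discard(memory_id)
        return related


index = EntityIndex()
index.link("m1", ["mango", "allergy"])
index.link("m2", ["mango", "dessert"])
index.link("m3", ["paper", "deadline"])
print(index.associated("m1"))  # memories connected to m1 through a shared entity
```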

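History of memory networks

The development of memory networks has closely tracked the evolution of NLP. The key milestones:

| Year | Milestone | Technical characteristics | Representative work |
| --- | --- | --- | --- |
| 2014 | Memory network concept first proposed | First combination of an external memory module with a neural network, targeting question answering | Facebook, "Memory Networks" |
| 2015 | End-to-end memory networks | No manually annotated memory slots, trainable end to end, applicable to more NLP tasks | Facebook, "End-To-End Memory Networks" |
| 2018 | Transformer architecture becomes widespread | Self-attention-based memory becomes mainstream and memory capacity grows substantially | Pretrained models such as GPT-1 and BERT |
| 2020 | Retrieval-augmented generation (RAG) introduced | Combines external knowledge retrieval with LLM generation to reduce hallucination | Facebook AI (now Meta), "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" |
| 2022 | LLM boom, Agent concept reaches practice | Long-term memory becomes one of the core bottlenecks for AI Agents | ChatGPT (late 2022), followed by AutoGPT (2023) |
| 2023 | Memory networks meet Agents | Tiered memory, forgetting, and associative retrieval are integrated into Agent frameworks | LangChain Memory module, LlamaIndex |
| 2024 | Multimodal and cross-Agent memory emerge | Memory over text, images, and audio, plus memory sharing among multiple Agents | OpenAI GPT-4o memory feature, Google Gemini multimodal memory |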

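III. Core Content: Building a Long-Term Memory Network for AI Agents from Scratch

We will implement a memory network service from scratch that supports create, read, update, and delete operations on memories, hybrid ranking, active forgetting, and other core capabilities, and that can be connected to a wide range of AI Agents.

Project overview

The memory network service is called AgentMem. It is positioned as a lightweight, extensible, easy-to-integrate long-term memory middleware for AI Agents. Core features:

  • Per-user memory isolation, suitable for consumer personal assistants and enterprise services;
  • Text and multimodal memory storage (this article implements the text part; multimodal support can be added);
  • Multi-dimensional hybrid ranking, weighting semantics, time, importance, and relations;
  • Built-in active forgetting that clears low-value memories and avoids memory pollution;
  • A RESTful API that can be called from Agent frameworks such as LangChain and LlamaIndex.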

Environment setup

First install the dependencies:

pip install fastapi uvicorn chromadb sentence-transformers pydantic scikit-learn numpy

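Notes on the dependencies:

  • fastapi + uvicorn: build and serve the API;
  • chromadb: a lightweight vector database suited to small-scale use; in production it can be replaced with Milvus or PGVector;
  • sentence-transformers: loads open-source embedding models; we use BGE-M3, a strong multilingual open-source embedding model;
  • pydantic: data structure validation;
  • numpy: used in the code below only to generate random suffixes for memory IDs.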

System architecture

The memory network has four layers (plus the client Agent that calls it). The architecture diagram:

[Architecture diagram: the Mermaid source failed to render. Recoverable structure: the client Agent sends requests to the memory input layer (semantic_emb, entity_extract, importance_score, tag_annotation), which feeds the memory processing layer (memory_search, memory_update, forget_mechanism, hybrid_sort), which reads and writes the memory storage layer (vector_db, relational_db, knowledge_graph); results pass through the memory output layer (memory_summary, context_concat, relevance_check) and are returned to the client Agent.]

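Responsibilities of each layer:

  1. Memory input layer: structures the raw memory content by generating embeddings, extracting entities, tagging, and scoring importance;
  2. Memory processing layer: the core logic layer, responsible for retrieving, updating, forgetting, and ranking memories;
  3. Memory storage layer: hybrid storage, with semantic embeddings in a vector database, structured metadata in a relational database, and entity relations in a knowledge graph;
  4. Memory output layer: summarizes, deduplicates, and relevance-checks the retrieved memories, then assembles them into context suitable for LLM input.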

Core data model

The core entities and their relationships are shown in the ER diagram below:

[ER diagram, reconstructed as text. Relationship labels in the original diagram: owns, contains, relates to. The foreign keys below show how the entities link.]

USER
  string user_id (PK)
  string username
  string create_time

MEMORY
  string memory_id (PK)
  string user_id (FK)
  string content
  vector embedding
  string memory_type
  float importance
  string tags
  int create_time
  int access_count
  int expire_time

ENTITY
  string entity_id (PK)
  string entity_name
  string entity_type

MEMORY_ENTITY_RELATION
  string relation_id (PK)
  string memory_id (FK)
  string entity_id (FK)
  string relation_type

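Core fields of each memory unit:

  • memory_id: unique identifier;
  • content: the original memory content;
  • embedding: semantic vector used for similarity search;
  • memory_type: memory type, one of fact, experience, preference, emotion;
  • importance: importance in the range 0-1, set explicitly by the user or scored automatically by an LLM;
  • tags: tags used for fast filtering;
  • create_time: creation timestamp;
  • access_count: number of accesses, used to compute memory strength;
  • expire_time: expiry timestamp, where -1 means the memory never expires.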
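
One possible way to realize the relational part of this model is sketched below with SQLite from the Python standard library. The table name `users`, the column constraints, and the decision to leave `embedding` out of the relational schema (on the assumption that it lives in the vector database) are illustrative choices, not something the AgentMem code in this article does.

```python
import sqlite3

# Hypothetical relational schema mirroring the ER model above.
SCHEMA = """
CREATE TABLE users (
    user_id TEXT PRIMARY KEY,
    username TEXT,
    create_time TEXT
);
CREATE TABLE memory (
    memory_id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL REFERENCES users(user_id),
    content TEXT NOT NULL,
    memory_type TEXT CHECK (memory_type IN ('fact', 'experience', 'preference', 'emotion')),
    importance REAL CHECK (importance BETWEEN 0 AND 1),
    tags TEXT,
    create_time INTEGER,
    access_count INTEGER DEFAULT 0,
    expire_time INTEGER DEFAULT -1  -- -1 means the memory never expires
);
CREATE TABLE entity (
    entity_id TEXT PRIMARY KEY,
    entity_name TEXT,
    entity_type TEXT
);
CREATE TABLE memory_entity_relation (
    relation_id TEXT PRIMARY KEY,
    memory_id TEXT REFERENCES memory(memory_id),
    entity_id TEXT REFERENCES entity(entity_id),
    relation_type TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users VALUES ('user_001', 'alice', '2024-08-31')")
conn.execute(
    "INSERT INTO memory (memory_id, user_id, content, memory_type, importance, tags, create_time) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("mem_1", "user_001", "Allergic to mango", "fact", 0.9, "health,food", 1725087654),
)
row = conn.execute(
    "SELECT access_count, expire_time FROM memory WHERE memory_id = 'mem_1'"
).fetchone()
print(row)  # defaults applied to access_count and expire_time
```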

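Core algorithm design

1. Hybrid ranking algorithm

When retrieving memories, we rank by a weighted combination of several dimensions. The overall score is:
$$S = \alpha \times S_{sim} + \beta \times S_{time} + \gamma \times S_{importance} + \delta \times S_{relation}$$
where:

  • $S_{sim}$: semantic similarity score, derived from the cosine distance between the query vector and the memory vector, $S_{sim} = 1 - cosine\_distance$. Cosine distance lies in 0-2, so this score is in 0-1 whenever the two vectors are not negatively correlated; negative values can be clamped to 0;
  • $S_{time}$: time-decay score, higher for newer memories, using exponential decay: $S_{time} = e^{-\lambda \times \Delta t}$, where $\Delta t$ is the difference between the current time and the memory's creation time (in seconds) and $\lambda$ is the decay coefficient. The default is $\lambda = 10^{-6}$ (1e-6), which means the score falls to 0.5 after about 8 days ($\ln 2 / \lambda \approx 6.9 \times 10^{5}$ s) and to $1/e \approx 0.37$ after about 11.6 days;
  • $S_{importance}$: importance score, taken directly from the memory's importance field, in 0-1;
  • $S_{relation}$: relation score, higher the more the memory is linked to entities in the query, in 0-1;
  • $\alpha, \beta, \gamma, \delta$ are weights, defaulting to 0.4, 0.2, 0.3, 0.1, and can be tuned per scenario.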
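
The scoring formula can be checked with a few lines of Python. The sketch below is a standalone illustration with hypothetical function names (`time_score`, `hybrid_score`); it also computes the half-life implied by a decay coefficient of 1e-6, which comes out to roughly 8 days.

```python
import math


def time_score(age_seconds: float, lam: float = 1e-6) -> float:
    """Exponential time decay: 1.0 for a brand-new memory, approaching 0 with age."""
    return math.exp(-lam * age_seconds)


def hybrid_score(sim: float, age_seconds: float, importance: float, relation: float,
                 weights=(0.4, 0.2, 0.3, 0.1), lam: float = 1e-6) -> float:
    """Weighted sum of similarity, time decay, importance, and relation scores."""
    alpha, beta, gamma, delta = weights
    return (alpha * sim + beta * time_score(age_seconds, lam)
            + gamma * importance + delta * relation)


# Half-life implied by lambda = 1e-6 per second, expressed in days.
half_life_days = math.log(2) / 1e-6 / 86400
print(round(half_life_days, 2))  # 8.02

# The same memory scored when fresh and again 30 days later.
fresh = hybrid_score(sim=0.8, age_seconds=0, importance=0.9, relation=0.5)
month_old = hybrid_score(sim=0.8, age_seconds=30 * 86400, importance=0.9, relation=0.5)
print(round(fresh, 2))  # 0.84
print(fresh > month_old)  # True
```

Because the time term carries a weight of only 0.2, an old but important and highly similar memory still scores well; only the recency contribution fades.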
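
2. Active forgetting algorithm

We combine passive and active forgetting:

  • Passive forgetting: a memory is deleted automatically once its expire_time is earlier than the current time;
  • Active forgetting: memories are scanned periodically, and a memory is deleted when its importance is below 0.2, it has not been accessed for more than 30 days, and it is not marked as permanent.
    The forgetting logic is modeled on the Ebbinghaus forgetting curve: retention falls as time passes and falls more slowly for stronger memories:
    $$R = e^{-k \times t}$$
    where $k$ is the decay rate, which is smaller for stronger memories (higher importance, more accesses), and $t$ is the time since the last access. Forgetting is triggered when $R$ drops below the threshold of 0.2.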
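
The article does not specify how the decay rate depends on importance and access count, so the sketch below makes one up purely for illustration: a hypothetical `strength` term grows with both, and the decay rate is a base rate divided by that strength. Only the shape of the curve and the 0.2 threshold come from the text above.

```python
import math


def decay_rate(importance: float, access_count: int, base_rate: float = 1e-6) -> float:
    """Hypothetical mapping: important, frequently accessed memories decay more slowly."""
    strength = 1.0 + 4.0 * importance + math.log1p(access_count)
    return base_rate / strength


def retention(importance: float, access_count: int, idle_seconds: float) -> float:
    """Retention R = exp(-k * t), with t the time since the last access."""
    return math.exp(-decay_rate(importance, access_count) * idle_seconds)


def should_forget(importance: float, access_count: int, idle_seconds: float,
                  threshold: float = 0.2) -> bool:
    """Forget once retention drops below the threshold."""
    return retention(importance, access_count, idle_seconds) < threshold


thirty_days = 30 * 24 * 3600
print(should_forget(importance=0.1, access_count=0, idle_seconds=thirty_days))   # True
print(should_forget(importance=0.9, access_count=10, idle_seconds=thirty_days))  # False
```

Note that this needs a last-access timestamp per memory, which the metadata in the implementation below does not record; that field would have to be added.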

3. Algorithm flow

The end-to-end memory processing logic:

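[Flowchart, reconstructed as text.]

Write path: receive memory input → semantic embedding → entity extraction / tag annotation → importance scoring → importance > threshold? → if no: discard the memory; if yes: store in vector database / relational database / knowledge graph → memory stored.

Read path: receive query → embed the query → initial retrieval (semantic search + filter conditions) → compute multi-dimensional scores (similarity / time / importance / relation) → hybrid ranking → summary compression / relevance check → return to the LLM context.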
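
The importance gate on the write path is not part of the AgentMem code shown later, so here is a small sketch of what it could look like. The `keyword_scorer` function is a deliberately crude stand-in for LLM-based importance scoring, and the threshold of 0.3 is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    content: str
    importance: float


def keyword_scorer(content: str) -> float:
    """Crude stand-in for LLM scoring: flag a few obviously important words."""
    important_words = ("allergic", "deadline", "password")
    return 0.9 if any(word in content.lower() for word in important_words) else 0.1


def write_path(content: str, score_importance: Callable[[str], float],
               store: List[Candidate], threshold: float = 0.3) -> Optional[Candidate]:
    """Score the content and store it only if it clears the importance threshold."""
    importance = score_importance(content)
    if importance <= threshold:
        return None  # discarded: not worth remembering
    candidate = Candidate(content=content, importance=importance)
    store.append(candidate)
    return candidate


store: List[Candidate] = []
write_path("Nice weather today, isn't it?", keyword_scorer, store)
write_path("I am allergic to mango", keyword_scorer, store)
print([c.content for c in store])  # only the allergy statement is kept
```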

Core implementation

1. Data structures

First, define the request and response structures with Pydantic:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from sentence_transformers import SentenceTransformer
import chromadb
from chromadb.config import Settings
import time
import math
from typing import List, Optional
import numpy as np

# Memory unit schema
class MemoryUnit(BaseModel):
    memory_id: Optional[str] = None
    content: str
    user_id: str
    memory_type: str = "fact"  # fact/experience/preference/emotion
    importance: float = 0.5  # 0-1, medium importance by default
    tags: List[str] = []
    expire_time: Optional[int] = None  # Unix timestamp; None means the memory never expires

# Search request schema
class SearchRequest(BaseModel):
    query: str
    user_id: str
    top_k: int = 5
    memory_type: Optional[str] = None
    time_range: Optional[tuple[int, int]] = None  # (start, end) Unix timestamps

2. Core MemoryNetwork class

class MemoryNetwork:
    def __init__(self, embedding_model_name: str = "BAAI/bge-m3", persist_path: str = "./memory_db"):
        # Load the embedding model (downloaded automatically on first run)
        self.embedding_model = SentenceTransformer(embedding_model_name)
        self.embedding_dim = self.embedding_model.get_sentence_embedding_dimension()
        # Initialize the vector database with on-disk persistence
        self.chroma_client = chromadb.Client(Settings(
            persist_directory=persist_path,
            is_persistent=True
        ))
        # Get or create the memory collection, using cosine distance
        self.memory_collection = self.chroma_client.get_or_create_collection(
            name="agent_memories",
            metadata={"hnsw:space": "cosine"}
        )
        # Ranking weights, tune per scenario
        self.alpha = 0.4  # similarity weight
        self.beta = 0.2  # time-decay weight
        self.gamma = 0.3  # importance weight
        self.delta = 0.1  # tag-match weight (tag overlap stands in for S_relation here)
        self.lambda_time = 1e-6  # time-decay coefficient: score halves after about 8 days

    def _calculate_time_score(self, timestamp: int) -> float:
        """Compute the time-decay score."""
        delta_t = time.time() - timestamp
        return math.exp(-self.lambda_time * delta_t)

    def _calculate_tag_score(self, query_tags: List[str], memory_tags: List[str]) -> float:
        """Compute the tag-match score as the Jaccard overlap of the two tag sets."""
        if not query_tags or not memory_tags:
            return 0.0
        intersection = len(set(query_tags) & set(memory_tags))
        union = len(set(query_tags) | set(memory_tags))
        return intersection / union if union > 0 else 0.0

    def add_memory(self, memory: MemoryUnit) -> str:
        """Add a memory and return its ID."""
        # Generate the semantic embedding
        embedding = self.embedding_model.encode(memory.content).tolist()
        # Generate a unique memory ID if the caller did not supply one
        memory_id = memory.memory_id if memory.memory_id else f"mem_{int(time.time() * 1000)}_{np.random.randint(1000)}"
        # Metadata (Chroma metadata values must be scalars, so tags are joined into one string)
        metadata = {
            "user_id": memory.user_id,
            "memory_type": memory.memory_type,
            "importance": memory.importance,
            "tags": ",".join(memory.tags),
            "create_time": int(time.time()),
            "access_count": 0,
            "expire_time": memory.expire_time if memory.expire_time else -1
        }
        # Write to the vector database. With is_persistent=True (chromadb >= 0.4),
        # writes are persisted automatically; the old client.persist() call no longer exists.
        self.memory_collection.add(
            ids=[memory_id],
            embeddings=[embedding],
            documents=[memory.content],
            metadatas=[metadata]
        )
        return memory_id

    def search_memory(self, request: SearchRequest) -> List[dict]:
        """Retrieve memories and re-rank them by the hybrid score."""
        # Embed the query
        query_embedding = self.embedding_model.encode(request.query).tolist()
        # Build the metadata filter. Chroma expects one operator per where
        # expression, so multiple conditions are combined with $and.
        conditions = [{"user_id": request.user_id}]
        if request.memory_type:
            conditions.append({"memory_type": request.memory_type})
        if request.time_range:
            conditions.append({"create_time": {"$gte": request.time_range[0]}})
            conditions.append({"create_time": {"$lte": request.time_range[1]}})
        where_clause = conditions[0] if len(conditions) == 1 else {"$and": conditions}
        # Over-fetch top_k * 2 candidates, then re-rank
        initial_results = self.memory_collection.query(
            query_embeddings=[query_embedding],
            n_results=request.top_k * 2,
            where=where_clause,
            include=["metadatas", "documents", "distances"]
        )
        if not initial_results["ids"][0]:
            return []
        # Compute the hybrid score and re-rank
        scored_results = []
        # Naive keyword extraction: whitespace-separated tokens longer than 2 characters
        # (this does not segment Chinese text). In production, extract entities with an NER model.
        query_tags = [word for word in request.query.split() if len(word) > 2]
        for i in range(len(initial_results["ids"][0])):
            memory_id = initial_results["ids"][0][i]
            content = initial_results["documents"][0][i]
            metadata = initial_results["metadatas"][0][i]
            distance = initial_results["distances"][0][i]
            # Per-dimension scores; clamp similarity so it stays in [0, 1]
            sim_score = max(0.0, 1 - distance)
            time_score = self._calculate_time_score(metadata["create_time"])
            importance_score = metadata["importance"]
            tag_score = self._calculate_tag_score(query_tags, metadata["tags"].split(","))
            # Hybrid score
            total_score = self.alpha * sim_score + self.beta * time_score + self.gamma * importance_score + self.delta * tag_score
            # Bump the access count; pass the full metadata back so no other field is lost
            new_access_count = metadata["access_count"] + 1
            self.memory_collection.update(
                ids=[memory_id],
                metadatas=[{**metadata, "access_count": new_access_count}]
            )
            scored_results.append({
                "memory_id": memory_id,
                "content": content,
                "score": round(total_score, 4),
                "memory_type": metadata["memory_type"],
                "importance": metadata["importance"],
                "create_time": metadata["create_time"],
                "access_count": new_access_count
            })
        # Sort by score, descending, and return the top_k
        scored_results.sort(key=lambda x: x["score"], reverse=True)
        return scored_results[:request.top_k]

    def forget_memory(self, user_id: str, threshold: float = 0.2) -> int:
        """Actively forget low-value memories; return the number of memories deleted."""
        all_memories = self.memory_collection.get(
            where={"user_id": user_id},
            include=["metadatas"]
        )
        to_delete = []
        current_time = time.time()
        for i in range(len(all_memories["ids"])):
            mem_id = all_memories["ids"][i]
            metadata = all_memories["metadatas"][i]
            # Permanent memories are never deleted
            if metadata["expire_time"] == -1:
                continue
            # Expired memories are deleted outright
            if metadata["expire_time"] > 0 and metadata["expire_time"] < current_time:
                to_delete.append(mem_id)
                continue
            # Delete memories below the importance threshold that are older than 30 days.
            # create_time is used as a proxy because no last-access time is stored.
            if metadata["importance"] < threshold and (current_time - metadata["create_time"]) > 30 * 24 * 3600:
                to_delete.append(mem_id)
        if to_delete:
            # Deletions are persisted automatically (chromadb >= 0.4); no persist() call is needed
            self.memory_collection.delete(ids=to_delete)
        return len(to_delete)

3. API endpoints

app = FastAPI(title="AgentMem Long-Term Memory Service", version="1.0.0")
memory_network = MemoryNetwork()

# Add-memory endpoint
@app.post("/api/v1/memories/add", summary="Add a new memory")
def add_memory(memory: MemoryUnit):
    try:
        memory_id = memory_network.add_memory(memory)
        return {"code": 0, "msg": "success", "data": {"memory_id": memory_id}}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to add memory: {str(e)}")

# Search-memory endpoint
@app.post("/api/v1/memories/search", summary="Retrieve relevant memories")
def search_memory(request: SearchRequest):
    try:
        memories = memory_network.search_memory(request)
        return {"code": 0, "msg": "success", "data": {"memories": memories}}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to search memories: {str(e)}")

# Forget endpoint
@app.delete("/api/v1/memories/forget", summary="Trigger active forgetting")
def forget_memory(user_id: str, threshold: float = 0.2):
    try:
        deleted_count = memory_network.forget_memory(user_id, threshold)
        return {"code": 0, "msg": "success", "data": {"deleted_count": deleted_count}}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Forgetting failed: {str(e)}")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)

Testing the service

Once the service is running, we can test the endpoints with curl:

  1. Add a memory
curl -X POST http://localhost:8000/api/v1/memories/add \
-H "Content-Type: application/json" \
-d '{
    "content": "The revision deadline for the NeurIPS 2024 paper I am writing is October 15, 2024",
    "user_id": "user_001",
    "memory_type": "fact",
    "importance": 0.9,
    "tags": ["paper", "NeurIPS", "deadline"]
}'

Response:

{"code":0,"msg":"success","data":{"memory_id":"mem_1725087654321_123"}}

  2. Retrieve memories
curl -X POST http://localhost:8000/api/v1/memories/search \
-H "Content-Type: application/json" \
-d '{
    "query": "When is the revision deadline for my NeurIPS paper?",
    "user_id": "user_001",
    "top_k": 3
}'

Example response (the exact score depends on the embedding model and on when the query is run):

{
    "code":0,
    "msg":"success",
    "data":{
        "memories":[
            {
                "memory_id":"mem_1725087654321_123",
                "content":"The revision deadline for the NeurIPS 2024 paper I am writing is October 15, 2024",
                "score":0.92,
                "memory_type":"fact",
                "importance":0.9,
                "create_time":1725087654,
                "access_count":1
            }
        ]
    }
}

Splicing this result into the LLM's context lets the model answer the user's question accurately.
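
The splicing step itself is not shown in the article. A minimal sketch, assuming the response shape above, could look like this; the `build_context` helper, its header text, and the character budget are hypothetical choices.

```python
def build_context(memories: list, max_chars: int = 500) -> str:
    """Format retrieved memories as a context block, highest score first, within a size budget."""
    lines = []
    used = 0
    for memory in sorted(memories, key=lambda m: m["score"], reverse=True):
        line = f"- [{memory['memory_type']}] {memory['content']}"
        if used + len(line) > max_chars:
            break  # stop once the budget is exhausted
        lines.append(line)
        used += len(line)
    if not lines:
        return ""
    return "Relevant long-term memories:\n" + "\n".join(lines)


retrieved = [
    {"content": "The user prefers concise answers", "score": 0.41, "memory_type": "preference"},
    {"content": "The NeurIPS revision deadline is October 15, 2024", "score": 0.92, "memory_type": "fact"},
]
print(build_context(retrieved))
```

The resulting string would typically be placed in the system prompt or ahead of the user's question, so that the model treats it as background knowledge.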


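IV. Advanced Topics and Best Practices

Common pitfalls and how to avoid them

1. Memory pollution

Problem: storing all interaction history in the memory network without filtering produces large amounts of redundant, outdated, low-value memories. Retrieval then returns too much noise, which can lead the LLM to generate wrong answers.
How to avoid it:

  • Filter by value before storing: let an LLM decide whether the content is worth keeping, and discard small talk and invalid input;
  • Consolidate memories regularly: summarize and merge memories on the same topic to reduce redundancy;
  • Enable active forgetting: periodically clean up low-value and outdated memories.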
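
The consolidation step described above relies on LLM summarization. As a simpler first line of defense, near-duplicate memories can be dropped with plain string similarity. The sketch below uses `difflib.SequenceMatcher` from the standard library; the 0.85 threshold is an arbitrary assumption, and this is a swapped-in technique for illustration, not the summarization approach itself.

```python
from difflib import SequenceMatcher


def deduplicate(contents: list, threshold: float = 0.85) -> list:
    """Keep the first occurrence of each memory and drop later near-duplicates."""
    kept = []
    for content in contents:
        is_duplicate = any(
            SequenceMatcher(None, content.lower(), existing.lower()).ratio() >= threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(content)
    return kept


memories = [
    "I am allergic to mango",
    "I am allergic to mango.",
    "My paper deadline is October 15",
]
print(deduplicate(memories))  # the second entry is dropped as a near-duplicate
```

Comparing every new memory against all kept ones is quadratic, so for large stores it is more practical to compare only against the nearest neighbors returned by the vector database.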
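
2. Low retrieval accuracy

Problem: retrieval based on semantic similarity alone often returns memories unrelated to the query, especially when the query is short or ambiguous.
How to avoid it:

  • Use hybrid retrieval: semantic search + keyword search + entity-relation search, recalling candidates along several dimensions;
  • Add re-ranking: use a CrossEncoder model to re-rank the initial results, which usually improves precision noticeably;
  • Filter by memory type and tags: at retrieval time, prefer high-priority memory types such as user preferences and important facts.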
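
One common way to merge the result lists of several retrievers is reciprocal rank fusion (RRF). The article does not name a fusion method, so the sketch below is one possible choice rather than what AgentMem does; `k=60` is the constant conventionally used with RRF, and the memory IDs are made up.

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse several ranked ID lists: each list contributes 1 / (k + rank) per item."""
    scores = {}
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] = scores.get(memory_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda memory_id: scores[memory_id], reverse=True)


semantic_hits = ["m1", "m2", "m3"]  # ranked by embedding similarity
keyword_hits = ["m3", "m1", "m4"]   # ranked by keyword match
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)  # ['m1', 'm3', 'm2', 'm4']
```

Items that appear in both lists rise to the top, which is the behavior hybrid retrieval is after. A CrossEncoder re-ranker can then be applied to the fused candidates.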
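
3. Memory conflicts

Problem: a user's later input contradicts an earlier memory, for example "I am allergic to mango" followed later by "I can eat mango now without a reaction", and retrieval returns the old, now incorrect memory.
How to avoid it:

  • Add conflict detection: when adding a memory, check for conflicting memories about the same entity and ask the user to confirm;
  • Prefer the newer memory: when memories conflict, return the more recent one first;
  • Version management: mark conflicting memories with versions and return both old and new at retrieval time, letting the LLM decide which to use.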
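
The "prefer the newer memory" rule can be sketched as follows, assuming memories have already been reduced to structured facts keyed by subject and attribute. That extraction step is the hard part and is not shown; the `Fact` dataclass and `resolve_conflicts` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str      # e.g. "user"
    attribute: str    # e.g. "mango_allergy"
    value: str
    create_time: int  # Unix timestamp


def resolve_conflicts(facts: list) -> tuple:
    """Return (current, superseded): the newest fact per key, plus older contradicting facts."""
    latest = {}
    superseded = []
    for fact in sorted(facts, key=lambda f: f.create_time):
        key = (fact.subject, fact.attribute)
        if key in latest and latest[key].value != fact.value:
            superseded.append(latest[key])  # keep the old version for auditing
        latest[key] = fact
    return list(latest.values()), superseded


facts = [
    Fact("user", "mango_allergy", "yes", create_time=1700000000),
    Fact("user", "mango_allergy", "no", create_time=1725000000),
    Fact("user", "favorite_conference", "NeurIPS", create_time=1710000000),
]
current, old = resolve_conflicts(facts)
print([(f.attribute, f.value) for f in current])
print([(f.attribute, f.value) for f in old])
```

Keeping the superseded facts instead of deleting them supports the version-management option: both versions can be returned with their timestamps when the LLM should make the call.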
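
4. Privacy and security

Problem: a memory network stores a large amount of private user data (health information, financial information, confidential work material), and a leak can cause serious damage.
How to avoid it:

  • On-device storage: for consumer personal-assistant apps, keep memories on the user's own device and do not upload them to the cloud;
  • Encrypted storage: encrypt all memory content at rest so that only the user can decrypt it;
  • Permission isolation: in multi-tenant deployments, strictly isolate each user's memories to prevent unauthorized access.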

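Performance optimization

1. Storage optimization

  • Tiered storage: keep hot memories (from the last month) in an in-memory vector database and cold memories (older than a month) on disk, loading them only when needed;
  • Vector quantization: quantize vectors to FP16 or INT8 to reduce storage and speed up retrieval;
  • Index optimization: use efficient vector indexes such as IVF or HNSW, with the goal of keeping retrieval latency under 100 ms at the scale of tens of millions of memories.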
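
To show what INT8 quantization does to a vector, here is a dependency-free sketch of symmetric per-vector quantization. Real vector databases implement this internally, usually with more sophisticated schemes; the function names are illustrative.

```python
def quantize_int8(vector: list) -> tuple:
    """Map floats to integers in [-127, 127] using one scale factor per vector."""
    scale = max(abs(x) for x in vector) / 127 or 1.0  # fall back to 1.0 for an all-zero vector
    return [round(x / scale) for x in vector], scale


def dequantize(quantized: list, scale: float) -> list:
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


embedding = [0.12, -0.5, 0.33, 0.0]
quantized, scale = quantize_int8(embedding)
restored = dequantize(quantized, scale)
print(quantized)  # small integers that fit in one byte each
print(max(abs(a - b) for a, b in zip(embedding, restored)))  # worst-case rounding error
```

Each component shrinks from 4 bytes (FP32) to 1 byte, and the rounding error per component is at most half the scale factor, which is usually small enough that nearest-neighbor rankings change little.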
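
2. Retrieval optimization

  • Pre-filtering: filter by user ID, memory type, and time range before the vector search to reduce the number of vectors searched;
  • Batch retrieval: merge multiple queries into one batch to increase throughput;
  • Caching: cache results for frequent queries to avoid repeated retrieval.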
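
A cache for search results needs both a size bound and an expiry time, since cached memories go stale once the user adds or changes something. The sketch below is a small LRU cache with a time-to-live, built on `OrderedDict`; the class name and the injectable `clock` parameter (used to make it testable) are assumptions.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Least-recently-used cache whose entries also expire after ttl_seconds."""

    def __init__(self, max_size: int = 128, ttl_seconds: float = 60.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entry
            return None
        self._data.move_to_end(key)  # mark as recently used
        return value

    def put(self, key, value) -> None:
        self._data[key] = (value, self._clock() + self.ttl)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = TTLCache(max_size=2, ttl_seconds=10.0)
cache.put(("user_001", "paper deadline"), ["mem_1"])
print(cache.get(("user_001", "paper deadline")))  # ['mem_1']
```

Including the user ID in the cache key keeps per-user isolation intact. It is also worth dropping a user's cached entries whenever that user's memories are added, updated, or forgotten.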

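Best practices

  1. Store memories by category: split memories into facts, experience, preferences, and emotions, and give each type its own weights, expiry time, and retrieval priority;
  2. Reinforce with explicit feedback: let users mark importance by hand, for example saying "this is important" sets the importance to the maximum so the memory is retrieved first;
  3. Review memories regularly: every week or month, have an LLM summarize recent memories into a "weekly/monthly memory report" and store it in long-term memory to reduce redundancy;
  4. Tune weights per scenario: in customer service, raise the time weight so recent interactions come first; in knowledge management, raise the importance weight so high-value content comes first.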
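
The effect of per-scenario weights is easy to see with two memories that differ only in recency and importance. The preset values below are invented for illustration (only the default 0.4/0.2/0.3/0.1 comes from this article), and the scores are passed in directly rather than computed from real data.

```python
# (alpha, beta, gamma, delta) = weights for similarity, time, importance, relation
WEIGHT_PRESETS = {
    "default": (0.4, 0.2, 0.3, 0.1),
    "customer_service": (0.35, 0.35, 0.2, 0.1),       # hypothetical: favors recent interactions
    "knowledge_management": (0.35, 0.1, 0.45, 0.1),   # hypothetical: favors important content
}


def score(preset: str, sim: float, time_s: float, importance: float, relation: float) -> float:
    """Hybrid score under the named weight preset."""
    alpha, beta, gamma, delta = WEIGHT_PRESETS[preset]
    return alpha * sim + beta * time_s + gamma * importance + delta * relation


recent_minor = dict(sim=0.7, time_s=0.9, importance=0.3, relation=0.0)
old_major = dict(sim=0.7, time_s=0.1, importance=0.9, relation=0.0)

for preset in ("customer_service", "knowledge_management"):
    winner = "recent_minor" if score(preset, **recent_minor) > score(preset, **old_major) else "old_major"
    print(preset, "->", winner)
```

Under the customer-service preset the recent memory ranks first, while the knowledge-management preset puts the older, more important one on top.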

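V. Conclusion

Key takeaways

  1. Long-term memory is a core bottleneck for putting AI Agents into production; the memory network mimics human memory mechanisms to get past the limits of the LLM context window;
  2. The core capabilities of a memory network are memory encoding, storage, retrieval, update, and forgetting; multi-dimensional hybrid ranking improves retrieval accuracy, and active forgetting avoids memory pollution;
  3. The AgentMem service implemented here can be connected to a wide range of Agents; in production, swap in an enterprise-grade vector database such as Milvus and add re-ranking, a knowledge graph, and similar capabilities to improve results;
  4. In production, watch for memory pollution, retrieval accuracy, conflicts, and privacy, and tune ranking weights and storage strategy to the scenario.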

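Future directions

The likely directions for memory networks are fairly clear:

  1. Multimodal memory: storing and retrieving text, image, audio, and video memories to match multimodal models such as GPT-4o;
  2. Brain-like memory: closer to human memory, distinguishing episodic, semantic, and procedural memory, and supporting association, reasoning, and dream-like memory reorganization;
  3. Cross-Agent memory sharing: cooperating Agents share memory, for example all customer-service Agents in a company share a user's history to keep service consistent;
  4. Metamemory: the Agent is aware of the boundaries of its own memory, knows what it does and does not know, retrieves what it needs proactively, and can ask the user for missing information.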

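Call to action

  1. Try the AgentMem code from this article, add long-term memory to your own Agent, and see what an assistant that does not forget is like to use;
  2. For production use, replace Chroma with an enterprise-grade vector database such as Milvus or PGVector, and add re-ranking, NER-based entity extraction, and similar capabilities;
  3. Feel free to share the problems you run into with Agent memory in the comments, and I will reply to each of them.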

(End of article; the original Chinese text runs to about 12,800 characters.)
