Building and Querying Temporally-Aware Knowledge Graphs with Graphiti
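1 Introduction
1.1 Overview
Graphiti is a Python framework for building temporally-aware knowledge graphs, designed for AI agents. It supports real-time incremental updates to the knowledge graph without batch recomputation, which makes it well suited to dynamic environments where relationships and information evolve over time.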
# GitHub repository
https://github.com/getzep/graphiti
# Documentation
https://help.getzep.com/graphiti
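1.2 Comparison with GraphRAG
Graphiti is designed specifically for the challenges of dynamic, frequently updated datasets, and is particularly suited to applications that need real-time interaction and precise historical queries.
| Aspect | GraphRAG | Graphiti |
|---|---|---|
| Primary use | Static document summarization | Dynamic, evolving context for agents |
| Data handling | Batch-oriented processing | Continuous, incremental updates |
| Knowledge structure | Entity clusters and community summaries | Temporal knowledge graph: entities, facts with validity windows, episodes, communities |
| Retrieval method | Sequential LLM summarization | Hybrid semantic, keyword, and graph-based search |
| Adaptability | Low | High |
| Temporal handling | Basic timestamp tracking | Explicit bi-temporal tracking with automatic fact invalidation |
| Contradiction handling | LLM-driven summarization judgments | Automatic fact invalidation with temporal history preserved |
| Query latency | Seconds to tens of seconds | Typically sub-second |
| Custom entity types | Not supported | Supported, customizable via Pydantic models |
| Scalability | Moderate | High, optimized for large datasets |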
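To make the "facts with validity windows" idea concrete, here is a minimal sketch of how such a window can answer a point-in-time query. The `Fact` class and the `facts_valid_at` helper are hypothetical illustrations and not part of Graphiti's API; only the field names `valid_at` and `invalid_at` mirror the attributes printed by the example script in section 2. Treating a missing bound as open-ended is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    """Hypothetical stand-in for a graph edge with a validity window."""
    text: str
    valid_at: Optional[datetime] = None    # when the fact became true
    invalid_at: Optional[datetime] = None  # when the fact stopped being true


def facts_valid_at(facts, moment):
    """Return the facts whose validity window contains `moment`.

    A missing bound is treated as open-ended (an assumption of this sketch).
    """
    selected = []
    for fact in facts:
        started = fact.valid_at is None or fact.valid_at <= moment
        not_ended = fact.invalid_at is None or moment < fact.invalid_at
        if started and not_ended:
            selected.append(fact)
    return selected


facts = [
    Fact('Harris is Attorney General of California',
         datetime(2011, 1, 3, tzinfo=timezone.utc),
         datetime(2017, 1, 3, tzinfo=timezone.utc)),
    Fact('Newsom is Governor of California',
         datetime(2019, 1, 7, tzinfo=timezone.utc)),
]

in_2015 = facts_valid_at(facts, datetime(2015, 6, 1, tzinfo=timezone.utc))
in_2020 = facts_valid_at(facts, datetime(2020, 6, 1, tzinfo=timezone.utc))
print([fact.text for fact in in_2015])
print([fact.text for fact in in_2020])
```

The superseded fact stays in the list instead of being deleted, which is what allows a historical query to still find it.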
1.3 Environment Setup
Install the dependency
pip install graphiti-core -i https://pypi.tuna.tsinghua.edu.cn/simple
Install the graph database
docker run -itd \
--name neo4j \
-p 7474:7474 \
-p 7687:7687 \
-v /home/neo4j/data:/data \
-v /home/neo4j/logs:/logs \
-v /home/neo4j/plugins:/plugins \
-e NEO4J_AUTH=neo4j/secretgraph \
neo4j:5.26.18
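Before running any Graphiti code, it can help to confirm that the two ports published by the container are reachable. The sketch below uses only the standard library; the `port_is_open` helper is a hypothetical name, and `localhost` is an assumption that should be replaced with the address of the machine running Docker.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == '__main__':
    # 7474 is the HTTP port and 7687 the Bolt port published by the container.
    for port in (7474, 7687):
        state = 'reachable' if port_is_open('localhost', port) else 'not reachable'
        print(f'localhost:{port} is {state}')
```

A successful TCP connection only shows that something is listening; it does not verify the Neo4j credentials.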
2 Official Example Code
Models used: Kimi for text generation, Qwen for embeddings, and Qwen for reranking.
import asyncio
import json
import logging
import os
from datetime import datetime, timezone
from logging import INFO

# Limit the number of concurrent operations. This is set before importing
# graphiti_core because the library appears to read SEMAPHORE_LIMIT at import
# time, in which case setting it afterwards would have no effect.
os.environ['SEMAPHORE_LIMIT'] = '2'

from graphiti_core import Graphiti
from graphiti_core.cross_encoder import OpenAIRerankerClient
from graphiti_core.driver.neo4j_driver import Neo4jDriver
from graphiti_core.embedder import OpenAIEmbedder, OpenAIEmbedderConfig
from graphiti_core.llm_client import LLMConfig
from graphiti_core.llm_client.openai_generic_client import OpenAIGenericClient
from graphiti_core.nodes import EpisodeType
from graphiti_core.search.search_config_recipes import NODE_HYBRID_SEARCH_RRF

# Configure logging
logging.basicConfig(
    level=INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
)
logger = logging.getLogger(__name__)

# Neo4j connection parameters
neo4j_uri = 'bolt://192.168.108.147:7687'
neo4j_user = 'neo4j'
neo4j_password = 'secretgraph'
async def main():
    # ---
    # 1 Initialize a Graphiti client backed by OpenAI-compatible services
    graphiti = Graphiti(
        # Graph database driver
        graph_driver=Neo4jDriver(
            uri=neo4j_uri,
            user=neo4j_user,
            password=neo4j_password,
        ),
        # OpenAI-compatible LLM client.
        # In my tests, the hosted Qwen, DeepSeek, and newer Kimi models all failed;
        # only the older Kimi model below worked.
        llm_client=OpenAIGenericClient(
            config=LLMConfig(
                api_key="sk-XXXX",
                model="moonshot-v1-32k",
                base_url="https://api.moonshot.cn/v1",
            )
        ),
        # OpenAI-compatible embedding model
        embedder=OpenAIEmbedder(
            config=OpenAIEmbedderConfig(
                api_key="sk-XXXX",
                embedding_model="text-embedding-v4",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
            )
        ),
        # OpenAI-compatible rerank model
        cross_encoder=OpenAIRerankerClient(
            config=LLMConfig(
                api_key="sk-XXXX",
                model="qwen3-rerank",
                base_url="https://dashscope.aliyuncs.com/compatible-api/v1/reranks",
            )
        ),
    )

    try:
        # ---
        # 2 Add episodes
        # Episodes are the primary unit of information in Graphiti. They can be
        # text or structured JSON and are processed automatically to extract
        # entities and relationships.
        # A list of episodes containing both text and JSON objects
        episodes = [
            {
                'content': 'Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.',
                'type': EpisodeType.text,
                'description': 'podcast transcript',
            },
            {
                'content': 'As AG, Harris was in office from January 3, 2011 – January 3, 2017',
                'type': EpisodeType.text,
                'description': 'podcast transcript',
            },
            {
                'content': {
                    'name': 'Gavin Newsom',
                    'position': 'Governor',
                    'state': 'California',
                    'previous_role': 'Lieutenant Governor',
                    'previous_location': 'San Francisco',
                },
                'type': EpisodeType.json,
                'description': 'podcast metadata',
            },
            {
                'content': {
                    'name': 'Gavin Newsom',
                    'position': 'Governor',
                    'term_start': 'January 7, 2019',
                    'term_end': 'Present',
                },
                'type': EpisodeType.json,
                'description': 'podcast metadata',
            },
        ]

        # Add the episodes to the graph
        for i, episode in enumerate(episodes):
            print(episode)
            # await asyncio.sleep(60)  # optional pause between episodes
            # Text content is passed through; JSON content is serialized to a string
            if isinstance(episode['content'], str):
                episode_body_str = episode['content']
            else:
                episode_body_str = json.dumps(episode['content'])
            await graphiti.add_episode(
                name=f'Freakonomics Radio {i}',
                episode_body=episode_body_str,
                source=episode['type'],
                source_description=episode['description'],
                reference_time=datetime.now(timezone.utc),
            )
            print(f'Added episode: Freakonomics Radio {i} ({episode["type"].value})')

        # ---
        # 3 Basic search
        # The simplest way to retrieve relationships (edges) from Graphiti is the
        # search method, which performs a hybrid search combining semantic
        # similarity and BM25 text retrieval.
        print("\nSearching for: 'Who was the California Attorney General?'")
        results = await graphiti.search('Who was the California Attorney General?')

        # Print search results
        print('\nSearch Results:')
        for result in results:
            print(f'UUID: {result.uuid}')
            print(f'Fact: {result.fact}')
            if hasattr(result, 'valid_at') and result.valid_at:
                print(f'Valid from: {result.valid_at}')
            if hasattr(result, 'invalid_at') and result.invalid_at:
                print(f'Valid until: {result.invalid_at}')
            print('---')

        # ---
        # 4 Center node search
        # For more contextually relevant results, a center node can be used to
        # rerank search results by their graph distance to that node.
        # Here the top search result supplies the center node.
        if results:
            # Take the source node UUID from the top search result
            center_node_uuid = results[0].source_node_uuid
            print('\nReranking search results based on graph distance:')
            print(f'Using center node UUID: {center_node_uuid}')
            reranked_results = await graphiti.search(
                'Who was the California Attorney General?', center_node_uuid=center_node_uuid
            )

            # Print reranked search results
            print('\nReranked Search Results:')
            for result in reranked_results:
                print(f'UUID: {result.uuid}')
                print(f'Fact: {result.fact}')
                if hasattr(result, 'valid_at') and result.valid_at:
                    print(f'Valid from: {result.valid_at}')
                if hasattr(result, 'invalid_at') and result.invalid_at:
                    print(f'Valid until: {result.invalid_at}')
                print('---')
        else:
            print('No results found in the initial search to use as center node.')

        # ---
        # 5 Node search with a predefined recipe
        # Graphiti ships predefined search recipes tuned for different scenarios.
        # NODE_HYBRID_SEARCH_RRF retrieves nodes directly instead of edges.
        print(
            '\nPerforming node search using _search method with standard recipe NODE_HYBRID_SEARCH_RRF:'
        )
        # Copy the predefined recipe so its limit can be changed safely
        node_search_config = NODE_HYBRID_SEARCH_RRF.model_copy(deep=True)
        # Return at most 5 results
        node_search_config.limit = 5
        # Run the node search
        node_search_results = await graphiti._search(
            query='California Governor',
            config=node_search_config,
        )

        # Print node search results
        print('\nNode Search Results:')
        for node in node_search_results.nodes:
            print(f'Node UUID: {node.uuid}')
            print(f'Node Name: {node.name}')
            node_summary = node.summary[:100] + '...' if len(node.summary) > 100 else node.summary
            print(f'Content Summary: {node_summary}')
            print(f'Node Labels: {", ".join(node.labels)}')
            print(f'Created At: {node.created_at}')
            if hasattr(node, 'attributes') and node.attributes:
                print('Attributes:')
                for key, value in node.attributes.items():
                    print(f'  {key}: {value}')
            print('---')
    finally:
        # Clean up: always close the Neo4j connection to release resources
        await graphiti.close()
        print('\nConnection closed')


if __name__ == '__main__':
    asyncio.run(main())
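The script sets SEMAPHORE_LIMIT to cap how many operations run at the same time, which helps stay under provider rate limits. The sketch below shows the general pattern with `asyncio.Semaphore`; it is not Graphiti's internal code, and `run_limited` and `fake_llm_call` are hypothetical names used only for illustration.

```python
import asyncio


async def run_limited(jobs, limit):
    """Run coroutine factories with at most `limit` of them active at once."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def guarded(job):
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            try:
                return await job()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_llm_call():
    """Stand-in for a network call to a model provider."""
    await asyncio.sleep(0.01)
    return 'ok'


results, peak = asyncio.run(run_limited([fake_llm_call] * 6, limit=2))
print(f'completed {len(results)} calls, peak concurrency {peak}')
```

A lower limit trades throughput for fewer rate-limit errors, which is presumably why the script uses a small value.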
3 Results
3.1 Execution Results
(1) Console output
……
Added episode: Freakonomics Radio 0 (text)
2026-03-20 18:13:39 - neo4j.notifications - INFO - Received notification from DBMS server: <GqlStatusObject gql_status='00NA0', status_description="note: successful completion - index or constraint already exists. The command 'CREATE RANGE INDEX community_uuid IF NOT EXISTS FOR (e:Community) ON (e.uuid)' has no effect. The index or constraint specified by 'RANGE INDEX community_uuid FOR (e:Community) ON (e.uuid)' already exists.", position=None, raw_classification='SCHEMA', classification=<NotificationClassification.SCHEMA: 'SCHEMA'>, raw_severity='INFORMATION', severity=<NotificationSeverity.INFORMATION: 'INFORMATION'>, diagnostic_record={'_classification': 'SCHEMA', '_severity': 'INFORMATION', 'OPERATION': '', 'OPERATION_CODE': '0', 'CURRENT_SCHEMA': '/'}> for query: 'CREATE INDEX community_uuid IF NOT EXISTS FOR (n:Community) ON (n.uuid)'
……
2026-03-20 18:13:42 - httpx - INFO - HTTP Request: POST https://api.moonshot.cn/v1/chat/completions "HTTP/1.1 200 OK"
2026-03-20 18:13:43 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings "HTTP/1.1 200 OK"
……
Added episode: Freakonomics Radio 1 (text)
……
Added episode: Freakonomics Radio 2 (json)
……
Added episode: Freakonomics Radio 3 (json)
Searching for: 'Who was the California Attorney General?'
Search Results:
UUID: 341b9eff-343e-4129-a41a-a2176ff1c897
Fact: Kamala Harris was the district attorney for San Francisco before becoming the Attorney General of California.
---
UUID: 54bc7574-5610-487d-92c3-ed9f7094e9c6
Fact: Kamala Harris served as the Attorney General of California.
Valid from: 2026-03-20 10:13:39.601668+00:00
Valid until: 2017-01-04 00:00:00+00:00
---
UUID: da7c9b7d-6b9f-44a1-a9ce-9a227055cb44
Fact: Kamala Harris is the Attorney General of California.
Valid from: 2026-03-20 08:09:37.882078+00:00
---
UUID: 753abd55-1c30-4315-a1c4-9c6070086e62
Fact: Gavin Newsom is currently the Governor of California.
Valid from: 2026-03-20 10:14:27.766084+00:00
---
UUID: 35e9fbc6-8fda-44c4-8842-f51046a52a01
Fact: Kamala Harris was previously the district attorney for San Francisco.
---
UUID: 5885816b-8454-423d-9d8e-2bcc62ae1e5d
Fact: Harris held the position of Attorney General from January 3, 2011, to January 3, 2017.
Valid from: 2011-01-03 00:00:00+00:00
Valid until: 2017-01-03 00:00:00+00:00
---
UUID: 7fc71156-4255-4eac-aed3-ff8f20b2cf4e
Fact: Gavin Newsom previously held the position of Lieutenant Governor.
---
UUID: eff5d784-8bdc-494a-ba38-f36912e04616
Fact: Kamala Harris was the district attorney for San Francisco prior to her current position.
---
UUID: 2d82ef48-7252-49d6-b964-2e66ab3898c2
Fact: The position of district attorney that Kamala Harris held was in San Francisco.
---
UUID: 5361c1a2-68f5-4a9a-9fb6-154338a3729d
Fact: Gavin Newsom was previously based in San Francisco.
---
Reranking search results based on graph distance:
Using center node UUID: 2c26b541-1d44-4366-8335-76705c326b0c
Reranked Search Results:
UUID: 341b9eff-343e-4129-a41a-a2176ff1c897
Fact: Kamala Harris was the district attorney for San Francisco before becoming the Attorney General of California.
---
UUID: 54bc7574-5610-487d-92c3-ed9f7094e9c6
Fact: Kamala Harris served as the Attorney General of California.
Valid from: 2026-03-20 10:13:39.601668+00:00
Valid until: 2017-01-04 00:00:00+00:00
---
UUID: da7c9b7d-6b9f-44a1-a9ce-9a227055cb44
Fact: Kamala Harris is the Attorney General of California.
Valid from: 2026-03-20 08:09:37.882078+00:00
---
UUID: 35e9fbc6-8fda-44c4-8842-f51046a52a01
Fact: Kamala Harris was previously the district attorney for San Francisco.
---
UUID: 5885816b-8454-423d-9d8e-2bcc62ae1e5d
Fact: Harris held the position of Attorney General from January 3, 2011, to January 3, 2017.
Valid from: 2011-01-03 00:00:00+00:00
Valid until: 2017-01-03 00:00:00+00:00
---
UUID: eff5d784-8bdc-494a-ba38-f36912e04616
Fact: Kamala Harris was the district attorney for San Francisco prior to her current position.
---
UUID: 2d82ef48-7252-49d6-b964-2e66ab3898c2
Fact: The position of district attorney that Kamala Harris held was in San Francisco.
---
UUID: 753abd55-1c30-4315-a1c4-9c6070086e62
Fact: Gavin Newsom is currently the Governor of California.
Valid from: 2026-03-20 10:14:27.766084+00:00
---
UUID: 7fc71156-4255-4eac-aed3-ff8f20b2cf4e
Fact: Gavin Newsom previously held the position of Lieutenant Governor.
---
UUID: 5361c1a2-68f5-4a9a-9fb6-154338a3729d
Fact: Gavin Newsom was previously based in San Francisco.
---
Performing node search using _search method with standard recipe NODE_HYBRID_SEARCH_RRF:
Node Search Results:
Node UUID: c95b4158-c5bf-4556-81e2-b24fe2dfc159
Node Name: California
Content Summary: Kamala Harris served as the Attorney General of California.
Gavin Newsom is currently the Governor o...
Node Labels: Entity
Created At: 2026-03-20 10:13:42.973291+00:00
---
Node UUID: b22f03ae-986f-46ff-b1ee-d49657190735
Node Name: Gavin Newsom
Content Summary: Gavin Newsom is currently the Governor of California.
Gavin Newsom previously held the position of L...
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: 89c7c583-5cb3-4d62-99b1-9588f78acbd4
Node Name: Governor
Content Summary: Gavin Newsom is the Governor of California, previously serving as Lieutenant Governor in San Francis...
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: 5c34f7f4-602f-4452-ae0e-3cbef1f934cf
Node Name: Lieutenant Governor
Content Summary: Gavin Newsom previously held the position of Lieutenant Governor.
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: fa5e67c7-dd3c-4eb9-8ceb-bb4e64564f8f
Node Name: Attorney General of California
Content Summary: Kamala Harris is the Attorney General of California.
Harris held the position of Attorney General fr...
Node Labels: Entity
Created At: 2026-03-20 08:09:49.618825+00:00
---
Connection closed
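The recipe name NODE_HYBRID_SEARCH_RRF refers to reciprocal rank fusion (RRF), a way of merging several ranked lists, such as a semantic ranking and a keyword ranking, into one. The sketch below is a generic version of the technique and not Graphiti's implementation; the constant `k=60` is a commonly used default and an assumption here, and the item names are placeholders.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each item scores the sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest score first; ties are broken by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


semantic_ranking = ['fact_a', 'fact_b', 'fact_c']
keyword_ranking = ['fact_b', 'fact_d', 'fact_a']
fused = reciprocal_rank_fusion([semantic_ranking, keyword_ranking])
print(fused)
```

Items that appear near the top of both lists rise in the fused order, while an item found by only one retriever still gets a place.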
(2) Neo4j results
[Figure: Neo4j screenshot]
[Figure: detailed graph view]
(3) Properties of a data node
{
  "n": {
    "identity": 0,
    "labels": [
      "Episodic"
    ],
    "properties": {
      "entity_edges": [
        "da7c9b7d-6b9f-44a1-a9ce-9a227055cb44",
        "35e9fbc6-8fda-44c4-8842-f51046a52a01"
      ],
      "group_id": "",
      "name": "Freakonomics Radio 0",
      "created_at": "2026-03-20T08:09:37.882078000Z",
      "source": "text",
      "uuid": "2006447a-2971-454a-97cb-7e0cf7a580f9",
      "content": "Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.",
      "source_description": "podcast transcript",
      "valid_at": "2026-03-20T08:09:37.882078000Z"
    },
    "elementId": "4:d48ecc53-79cb-424c-aa6e-0eba75e0ea5a:0"
  }
}
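The `entity_edges` property of this Episodic node holds the UUIDs of edges that also appear as facts in the console output, so an episode can be traced to the facts associated with it. The sketch below shows that lookup on data copied from this article; `facts_from_episode` is a hypothetical helper, and the plain dictionaries stand in for real graph queries.

```python
# Subset of the Episodic node properties shown above.
episode = {
    'name': 'Freakonomics Radio 0',
    'entity_edges': [
        'da7c9b7d-6b9f-44a1-a9ce-9a227055cb44',
        '35e9fbc6-8fda-44c4-8842-f51046a52a01',
    ],
}

# A few edge UUIDs and facts copied from the console output.
facts_by_uuid = {
    'da7c9b7d-6b9f-44a1-a9ce-9a227055cb44':
        'Kamala Harris is the Attorney General of California.',
    '35e9fbc6-8fda-44c4-8842-f51046a52a01':
        'Kamala Harris was previously the district attorney for San Francisco.',
    '753abd55-1c30-4315-a1c4-9c6070086e62':
        'Gavin Newsom is currently the Governor of California.',
}


def facts_from_episode(episode, facts_by_uuid):
    """Return the known facts whose edge UUIDs are listed on the episode."""
    return [
        facts_by_uuid[uuid]
        for uuid in episode['entity_edges']
        if uuid in facts_by_uuid
    ]


for fact in facts_from_episode(episode, facts_by_uuid):
    print(fact)
```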
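3.2 Troubleshooting
The error below is caused by an incompatible LLM version, and the only workaround I found is to switch models. In my tests, Kimi's moonshot-v1-32k works.
The error is raised at the call below. The likely cause is that the model's parsed output cannot be mapped onto the expected schema: the traceback shows the model returning `entity_name` where the `ExtractedEntities` model requires `name`.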
await graphiti.add_episode(
    name=f'Freakonomics Radio {i}',
    episode_body=episode_body_str,
    source=episode['type'],
    source_description=episode['description'],
    reference_time=datetime.now(timezone.utc),
)
Error output
pydantic_core._pydantic_core.ValidationError: 3 validation errors for ExtractedEntities
extracted_entities.0.name
Field required [type=missing, input_value={'entity_name': 'Kamala H...s', 'entity_type_id': 0}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.12/v/missing
extracted_entities.1.name
Field required [type=missing, input_value={'entity_name': 'Attorney...a', 'entity_type_id': 0}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.12/v/missing
extracted_entities.2.name
Field required [type=missing, input_value={'entity_name': 'San Fran...o', 'entity_type_id': 0}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.12/v/missing
Open bug report on GitHub
https://bgithub.xyz/getzep/graphiti/issues/912
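The traceback shows each entity arriving with an `entity_name` key while validation expects `name`. The sketch below illustrates that mismatch and one conceivable way to normalize such a payload before validation. It is hypothetical: `normalize_entity_keys` is not a Graphiti function, the sample payload is illustrative and not the exact model output, and applying it would require hooking into wherever the raw LLM JSON is parsed, which this article does not cover.

```python
def normalize_entity_keys(payload, aliases=None):
    """Rename alias keys on each extracted entity to the expected field names.

    Hypothetical helper: the default alias map covers only the mismatch
    seen in the traceback (`entity_name` instead of `name`).
    """
    aliases = aliases or {'entity_name': 'name'}
    normalized = []
    for entity in payload.get('extracted_entities', []):
        fixed = {aliases.get(key, key): value for key, value in entity.items()}
        normalized.append(fixed)
    return {**payload, 'extracted_entities': normalized}


# Illustrative payload shaped like the one in the traceback.
raw_payload = {
    'extracted_entities': [
        {'entity_name': 'Kamala Harris', 'entity_type_id': 0},
    ]
}

fixed_payload = normalize_entity_keys(raw_payload)
print(fixed_payload)
```

Switching to a model that already returns the expected keys, as described above, avoids the need for any such patch.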