Building and Querying Temporally Aware Knowledge Graphs with Graphiti


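1 Introduction

1.1 Overview

Graphiti is a Python framework for building temporally aware knowledge graphs, designed for AI agents. It supports real-time, incremental updates to the graph without batch recomputation, which makes it well suited to dynamic environments where relationships and information evolve over time.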

# GitHub repository
https://github.com/getzep/graphiti

# Official documentation
https://help.getzep.com/graphiti

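1.2 Comparison with GraphRAG

Graphiti is designed specifically to address the challenges of dynamic, frequently updated datasets, and is particularly suited to applications that require real-time interaction and precise historical queries.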

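| Aspect | GraphRAG | Graphiti |
| --- | --- | --- |
| Primary use | Static document summarization | Dynamic, evolving context for agents |
| Data handling | Batch-oriented processing | Continuous, incremental updates |
| Knowledge structure | Entity clusters and community summaries | Temporal knowledge graph: entities, facts with validity windows, episodes, communities |
| Retrieval method | Sequential LLM-based summarization | Hybrid semantic, keyword, and graph-based retrieval |
| Adaptability | Low | High |
| Temporal handling | Basic timestamp tracking | Explicit bi-temporal tracking with automatic fact invalidation |
| Contradiction handling | LLM-driven summarization judgments | Automatic fact invalidation that preserves temporal history |
| Query latency | Seconds to tens of seconds | Typically sub-second |
| Custom entity types | Not supported | Supported, customizable via Pydantic models |
| Scalability | Moderate | High, optimized for large datasets |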
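
To make the table's "bi-temporal tracking" and "automatic fact invalidation" rows more concrete, here is a minimal pure-Python sketch. It does not use Graphiti and is not how Graphiti is implemented. The `Fact` class and the `add_fact` and `facts_valid_at` helpers are hypothetical names, the sample data is made up, and the field names merely mirror the `valid_at` / `invalid_at` attributes that appear on search results later in this post. It assumes that a newer fact with the same subject and relation but a different object contradicts the older one.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Fact:
    subject: str
    relation: str
    obj: str
    valid_at: datetime                     # event time: when the fact became true
    invalid_at: Optional[datetime] = None  # event time: when it stopped being true
    created_at: datetime = field(default_factory=_utc_now)  # ingestion time


def add_fact(store: list[Fact], new: Fact) -> None:
    """Append `new` and close the window of any still-open fact it contradicts."""
    for old in store:
        same_slot = (old.subject, old.relation) == (new.subject, new.relation)
        if same_slot and old.obj != new.obj and old.invalid_at is None and old.valid_at <= new.valid_at:
            old.invalid_at = new.valid_at  # invalidate, but keep the record
    store.append(new)


def facts_valid_at(store: list[Fact], when: datetime) -> list[Fact]:
    """Point-in-time query over event time."""
    return [f for f in store if f.valid_at <= when and (f.invalid_at is None or when < f.invalid_at)]


# Made-up sample data (assumption, for illustration only)
store: list[Fact] = []
add_fact(store, Fact("Alice", "WORKS_AT", "Acme", datetime(2020, 1, 1, tzinfo=timezone.utc)))
add_fact(store, Fact("Alice", "WORKS_AT", "Globex", datetime(2023, 6, 1, tzinfo=timezone.utc)))

print([f.obj for f in facts_valid_at(store, datetime(2021, 1, 1, tzinfo=timezone.utc))])  # ['Acme']
print([f.obj for f in facts_valid_at(store, datetime(2024, 1, 1, tzinfo=timezone.utc))])  # ['Globex']
```

The older fact is never deleted; only its validity window is closed, which is what allows historical queries to keep working.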

1.3 Environment Setup

Install the dependency

pip install graphiti-core -i https://pypi.tuna.tsinghua.edu.cn/simple

Install the graph database

docker run -itd \
--name neo4j \
-p 7474:7474 \
-p 7687:7687 \
-v /home/neo4j/data:/data \
-v /home/neo4j/logs:/logs \
-v /home/neo4j/plugins:/plugins \
-e NEO4J_AUTH=neo4j/secretgraph \
neo4j:5.26.18
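
Before running the example script, it can be useful to confirm that the Bolt port published by the container is reachable. The sketch below uses only the Python standard library. `is_port_open` is a helper name introduced here for illustration, and the host and port you pass in should match your own `docker run` port mapping.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

For example, `is_port_open('localhost', 7687)` would be expected to return `True` once Neo4j has finished starting. A successful TCP connection only shows that the port is reachable, not that the credentials are accepted.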

2 Official Example Code

Models used: Kimi for text generation, Qwen for embeddings, and Qwen for reranking.

import asyncio
import json
import logging
import os
from datetime import datetime, timezone
from logging import INFO

from graphiti_core import Graphiti
from graphiti_core.cross_encoder import OpenAIRerankerClient
from graphiti_core.driver.neo4j_driver import Neo4jDriver
from graphiti_core.embedder import OpenAIEmbedder, OpenAIEmbedderConfig
from graphiti_core.llm_client import LLMConfig
from graphiti_core.llm_client.openai_generic_client import OpenAIGenericClient
from graphiti_core.nodes import EpisodeType
from graphiti_core.search.search_config_recipes import NODE_HYBRID_SEARCH_RRF

# Configure logging
logging.basicConfig(
    level=INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
)
logger = logging.getLogger(__name__)


neo4j_uri = 'bolt://192.168.108.147:7687'
neo4j_user = 'neo4j'
neo4j_password = 'secretgraph'


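# Limit concurrency. Note: graphiti_core appears to read SEMAPHORE_LIMIT at import time,
# so it may need to be set before the graphiti_core imports (or in the shell) to take effect.
os.environ['SEMAPHORE_LIMIT'] = '2'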


async def main():
    # ---
    # 1 Initialize the Graphiti client with OpenAI-compatible services
    graphiti = Graphiti(
        # Graph driver
        graph_driver=Neo4jDriver(
            # Neo4j connection settings
            uri=neo4j_uri,
            user=neo4j_user,
            password=neo4j_password
        ),

        # OpenAI-compatible LLM client
        # In my tests, the hosted Qwen, DeepSeek, and newer Kimi models all failed; only the older Kimi model below worked
        llm_client=OpenAIGenericClient(
            config=LLMConfig(
                api_key="sk-XXXX",
                model="moonshot-v1-32k",
                base_url="https://api.moonshot.cn/v1"
            )
        ),
        # OpenAI-compatible embedding model
        embedder=OpenAIEmbedder(
            config=OpenAIEmbedderConfig(
                api_key="sk-XXXX",
                embedding_model="text-embedding-v4",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
            )
        ),
        # OpenAI-compatible reranker (rerank model)
        cross_encoder=OpenAIRerankerClient(
            config=LLMConfig(
                api_key="sk-XXXX",
                model="qwen3-rerank",
                base_url="https://dashscope.aliyuncs.com/compatible-api/v1/reranks",
            )
        )
    )

    try:
        # ---
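        # 2 Add episodes
        # Episodes are the primary unit of information in Graphiti. They can be plain text or structured JSON, and are processed automatically to extract entities and relationships.

        # A list of episodes containing both text and JSON objects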
        episodes = [
            {
                'content': 'Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.',
                'type': EpisodeType.text,
                'description': 'podcast transcript',
            },
            {
                'content': 'As AG, Harris was in office from January 3, 2011 – January 3, 2017',
                'type': EpisodeType.text,
                'description': 'podcast transcript',
            },
            {
                'content': {
                    'name': 'Gavin Newsom',
                    'position': 'Governor',
                    'state': 'California',
                    'previous_role': 'Lieutenant Governor',
                    'previous_location': 'San Francisco',
                },
                'type': EpisodeType.json,
                'description': 'podcast metadata',
            },
            {
                'content': {
                    'name': 'Gavin Newsom',
                    'position': 'Governor',
                    'term_start': 'January 7, 2019',
                    'term_end': 'Present',
                },
                'type': EpisodeType.json,
                'description': 'podcast metadata',
            },
        ]

        # Add the episodes to the graph
        for i, episode in enumerate(episodes):
            print(episode)
            # time.sleep(60)

            # Text episodes are passed through as-is; JSON episodes are serialized to a string
            if isinstance(episode['content'], str):
                episode_body_str = episode['content']
            else:
                episode_body_str = json.dumps(episode['content'])

            await graphiti.add_episode(
                name=f'Freakonomics Radio {i}',
                episode_body=episode_body_str,
                source=episode['type'],
                source_description=episode['description'],
                reference_time=datetime.now(timezone.utc),
            )
            print(f'Added episode: Freakonomics Radio {i} ({episode["type"].value})')

        # ---
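        # 3 Basic search
        # The simplest way to retrieve relationships (edges) from Graphiti is the search method, which performs a hybrid search combining semantic similarity and BM25 text retrieval.
        print("\nSearch query: 'Who was the California Attorney General?'")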
        results = await graphiti.search('Who was the California Attorney General?')

        # Print search results
        print('\nSearch Results:')
        for result in results:
            print(f'UUID: {result.uuid}')
            print(f'Fact: {result.fact}')
            if hasattr(result, 'valid_at') and result.valid_at:
                print(f'Valid from: {result.valid_at}')
            if hasattr(result, 'invalid_at') and result.invalid_at:
                print(f'Valid until: {result.invalid_at}')
            print('---')

        # ---
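        # 4 Center node search
        # For more contextually relevant results, a center node can be used to rerank search results by their graph distance to that node

        # Use the source node of the top search result as the center node for reranking
        if results and len(results) > 0:
            # Get the source node UUID from the top search result
            center_node_uuid = results[0].source_node_uuid

            print('\nReranking search results by graph distance:')
            print(f'Center node UUID: {center_node_uuid}')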

            reranked_results = await graphiti.search(
                'Who was the California Attorney General?', center_node_uuid=center_node_uuid
            )

            # Print reranked search results
            print('\nReranked results:')
            for result in reranked_results:
                print(f'UUID: {result.uuid}')
                print(f'Fact: {result.fact}')
                if hasattr(result, 'valid_at') and result.valid_at:
                    print(f'Valid from: {result.valid_at}')
                if hasattr(result, 'invalid_at') and result.invalid_at:
                    print(f'Valid until: {result.invalid_at}')
                print('---')
        else:
            print('The initial search returned no results, so no center node could be selected.')

        # ---
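        # 5 Node search using a predefined search recipe
        # Graphiti ships predefined search recipes optimized for different search scenarios. NODE_HYBRID_SEARCH_RRF retrieves nodes directly instead of edges (relationships).
        # Example: node search via the _search method with a predefined standard recipe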
        print(
            '\nPerforming node search using _search method with standard recipe NODE_HYBRID_SEARCH_RRF:'
        )

        # Copy the predefined search recipe and adjust its limit
        node_search_config = NODE_HYBRID_SEARCH_RRF.model_copy(deep=True)
        # Return at most 5 results
        node_search_config.limit = 5

        # Run the node search
        node_search_results = await graphiti._search(
            query='California Governor',
            config=node_search_config,
        )

        # Print the node details
        print('\nNode search results:')
        for node in node_search_results.nodes:
            print(f'Node UUID: {node.uuid}')
            print(f'Node Name: {node.name}')
            node_summary = node.summary[:100] + '...' if len(node.summary) > 100 else node.summary
            print(f'Content Summary: {node_summary}')
            print(f'Node Labels: {", ".join(node.labels)}')
            print(f'Created At: {node.created_at}')
            if hasattr(node, 'attributes') and node.attributes:
                print('Attributes:')
                for key, value in node.attributes.items():
                    print(f'  {key}: {value}')
            print('---')

    finally:
        # Clean up
        # Always close the connection to Neo4j when finished so that resources are released properly
        await graphiti.close()
        print('\nConnection closed')


if __name__ == '__main__':
    asyncio.run(main())
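
The "RRF" in `NODE_HYBRID_SEARCH_RRF` refers to reciprocal rank fusion, a common way to merge several ranked lists, such as a semantic ranking and a BM25 keyword ranking, into one. The sketch below shows the general technique in plain Python. It is not Graphiti's implementation: the function name, the constant `k = 60` (a conventional default for this technique), and the sample node IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each item scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda item: scores[item], reverse=True)


# Hypothetical rankings from two retrievers
semantic = ["node-a", "node-b", "node-c"]
keyword = ["node-b", "node-d", "node-a"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)  # ['node-b', 'node-a', 'node-d', 'node-c']
```

Items that rank well in both lists rise to the top, while items found by only one retriever are still kept, which is why this kind of fusion suits hybrid search.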

3 Results

3.1 Execution Output

(1) Console output

……
Added episode: Freakonomics Radio 0 (text)
2026-03-20 18:13:39 - neo4j.notifications - INFO - Received notification from DBMS server: <GqlStatusObject gql_status='00NA0', status_description="note: successful completion - index or constraint already exists. The command 'CREATE RANGE INDEX community_uuid IF NOT EXISTS FOR (e:Community) ON (e.uuid)' has no effect. The index or constraint specified by 'RANGE INDEX community_uuid FOR (e:Community) ON (e.uuid)' already exists.", position=None, raw_classification='SCHEMA', classification=<NotificationClassification.SCHEMA: 'SCHEMA'>, raw_severity='INFORMATION', severity=<NotificationSeverity.INFORMATION: 'INFORMATION'>, diagnostic_record={'_classification': 'SCHEMA', '_severity': 'INFORMATION', 'OPERATION': '', 'OPERATION_CODE': '0', 'CURRENT_SCHEMA': '/'}> for query: 'CREATE INDEX community_uuid IF NOT EXISTS FOR (n:Community) ON (n.uuid)'
……
2026-03-20 18:13:42 - httpx - INFO - HTTP Request: POST https://api.moonshot.cn/v1/chat/completions "HTTP/1.1 200 OK"
2026-03-20 18:13:43 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings "HTTP/1.1 200 OK"
……
Added episode: Freakonomics Radio 1 (text)
……
Added episode: Freakonomics Radio 2 (json)
……
Added episode: Freakonomics Radio 3 (json)


Search query: 'Who was the California Attorney General?'

Search Results:
UUID: 341b9eff-343e-4129-a41a-a2176ff1c897
Fact: Kamala Harris was the district attorney for San Francisco before becoming the Attorney General of California.
---
UUID: 54bc7574-5610-487d-92c3-ed9f7094e9c6
Fact: Kamala Harris served as the Attorney General of California.
Valid from: 2026-03-20 10:13:39.601668+00:00
Valid until: 2017-01-04 00:00:00+00:00
---
UUID: da7c9b7d-6b9f-44a1-a9ce-9a227055cb44
Fact: Kamala Harris is the Attorney General of California.
Valid from: 2026-03-20 08:09:37.882078+00:00
---
UUID: 753abd55-1c30-4315-a1c4-9c6070086e62
Fact: Gavin Newsom is currently the Governor of California.
Valid from: 2026-03-20 10:14:27.766084+00:00
---
UUID: 35e9fbc6-8fda-44c4-8842-f51046a52a01
Fact: Kamala Harris was previously the district attorney for San Francisco.
---
UUID: 5885816b-8454-423d-9d8e-2bcc62ae1e5d
Fact: Harris held the position of Attorney General from January 3, 2011, to January 3, 2017.
Valid from: 2011-01-03 00:00:00+00:00
Valid until: 2017-01-03 00:00:00+00:00
---
UUID: 7fc71156-4255-4eac-aed3-ff8f20b2cf4e
Fact: Gavin Newsom previously held the position of Lieutenant Governor.
---
UUID: eff5d784-8bdc-494a-ba38-f36912e04616
Fact: Kamala Harris was the district attorney for San Francisco prior to her current position.
---
UUID: 2d82ef48-7252-49d6-b964-2e66ab3898c2
Fact: The position of district attorney that Kamala Harris held was in San Francisco.
---
UUID: 5361c1a2-68f5-4a9a-9fb6-154338a3729d
Fact: Gavin Newsom was previously based in San Francisco.
---

Reranking search results by graph distance:
Center node UUID: 2c26b541-1d44-4366-8335-76705c326b0c

Reranked results:
UUID: 341b9eff-343e-4129-a41a-a2176ff1c897
Fact: Kamala Harris was the district attorney for San Francisco before becoming the Attorney General of California.
---
UUID: 54bc7574-5610-487d-92c3-ed9f7094e9c6
Fact: Kamala Harris served as the Attorney General of California.
Valid from: 2026-03-20 10:13:39.601668+00:00
Valid until: 2017-01-04 00:00:00+00:00
---
UUID: da7c9b7d-6b9f-44a1-a9ce-9a227055cb44
Fact: Kamala Harris is the Attorney General of California.
Valid from: 2026-03-20 08:09:37.882078+00:00
---
UUID: 35e9fbc6-8fda-44c4-8842-f51046a52a01
Fact: Kamala Harris was previously the district attorney for San Francisco.
---
UUID: 5885816b-8454-423d-9d8e-2bcc62ae1e5d
Fact: Harris held the position of Attorney General from January 3, 2011, to January 3, 2017.
Valid from: 2011-01-03 00:00:00+00:00
Valid until: 2017-01-03 00:00:00+00:00
---
UUID: eff5d784-8bdc-494a-ba38-f36912e04616
Fact: Kamala Harris was the district attorney for San Francisco prior to her current position.
---
UUID: 2d82ef48-7252-49d6-b964-2e66ab3898c2
Fact: The position of district attorney that Kamala Harris held was in San Francisco.
---
UUID: 753abd55-1c30-4315-a1c4-9c6070086e62
Fact: Gavin Newsom is currently the Governor of California.
Valid from: 2026-03-20 10:14:27.766084+00:00
---
UUID: 7fc71156-4255-4eac-aed3-ff8f20b2cf4e
Fact: Gavin Newsom previously held the position of Lieutenant Governor.
---
UUID: 5361c1a2-68f5-4a9a-9fb6-154338a3729d
Fact: Gavin Newsom was previously based in San Francisco.
---

Performing node search using _search method with standard recipe NODE_HYBRID_SEARCH_RRF:

Node search results:
Node UUID: c95b4158-c5bf-4556-81e2-b24fe2dfc159
Node Name: California
Content Summary: Kamala Harris served as the Attorney General of California.
Gavin Newsom is currently the Governor o...
Node Labels: Entity
Created At: 2026-03-20 10:13:42.973291+00:00
---
Node UUID: b22f03ae-986f-46ff-b1ee-d49657190735
Node Name: Gavin Newsom
Content Summary: Gavin Newsom is currently the Governor of California.
Gavin Newsom previously held the position of L...
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: 89c7c583-5cb3-4d62-99b1-9588f78acbd4
Node Name: Governor
Content Summary: Gavin Newsom is the Governor of California, previously serving as Lieutenant Governor in San Francis...
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: 5c34f7f4-602f-4452-ae0e-3cbef1f934cf
Node Name: Lieutenant Governor
Content Summary: Gavin Newsom previously held the position of Lieutenant Governor.
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: fa5e67c7-dd3c-4eb9-8ceb-bb4e64564f8f
Node Name: Attorney General of California
Content Summary: Kamala Harris is the Attorney General of California.
Harris held the position of Attorney General fr...
Node Labels: Entity
Created At: 2026-03-20 08:09:49.618825+00:00
---

Connection closed
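
In the reranked listing above, the Gavin Newsom facts all move below the Kamala Harris facts, which is what ordering by graph distance from a center node would be expected to produce. The following is a minimal sketch of that idea, using breadth-first search over an undirected adjacency map. It is not Graphiti's implementation. The graph structure and the function names are hypothetical, and scoring an edge by its closer endpoint is an assumption.

```python
from collections import deque


def hop_distances(adjacency: dict[str, set[str]], center: str) -> dict[str, int]:
    """Breadth-first search: number of hops from `center` to every reachable node."""
    distances = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, set()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def rerank_by_distance(
    edges: list[tuple[str, str, str]],
    adjacency: dict[str, set[str]],
    center: str,
) -> list[tuple[str, str, str]]:
    """Sort (fact, source, target) edges by the distance of their closer endpoint.

    sorted() is stable, so edges at the same distance keep their original order.
    """
    distances = hop_distances(adjacency, center)
    unreachable = float("inf")

    def distance_of(edge: tuple[str, str, str]) -> float:
        _, source, target = edge
        return min(distances.get(source, unreachable), distances.get(target, unreachable))

    return sorted(edges, key=distance_of)


# Hypothetical graph shape, loosely based on the entities in this example
adjacency = {
    "Kamala Harris": {"California", "San Francisco"},
    "California": {"Kamala Harris", "Gavin Newsom"},
    "San Francisco": {"Kamala Harris"},
    "Gavin Newsom": {"California", "Lieutenant Governor"},
    "Lieutenant Governor": {"Gavin Newsom"},
}
edges = [
    ("Newsom is Governor of California", "Gavin Newsom", "California"),
    ("Harris is Attorney General of California", "Kamala Harris", "California"),
    ("Newsom was Lieutenant Governor", "Gavin Newsom", "Lieutenant Governor"),
    ("Harris was district attorney for San Francisco", "Kamala Harris", "San Francisco"),
]

reranked = rerank_by_distance(edges, adjacency, center="Kamala Harris")
for fact, _, _ in reranked:
    print(fact)
```

With "Kamala Harris" as the center, both Harris facts come first, followed by the two Newsom facts.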

(2) Results in Neo4j

[Figure: Neo4j screenshot]

[Figure: detailed view]

(3) Properties of a data node

{
    "n": {
      "identity": 0,
      "labels": [
        "Episodic"
      ],
      "properties": {
        "entity_edges": [
          "da7c9b7d-6b9f-44a1-a9ce-9a227055cb44",
          "35e9fbc6-8fda-44c4-8842-f51046a52a01"
        ],
        "group_id": "",
        "name": "Freakonomics Radio 0",
        "created_at": "2026-03-20T08:09:37.882078000Z",
        "source": "text",
        "uuid": "2006447a-2971-454a-97cb-7e0cf7a580f9",
        "content": "Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.",
        "source_description": "podcast transcript",
        "valid_at": "2026-03-20T08:09:37.882078000Z"
      },
      "elementId": "4:d48ecc53-79cb-424c-aa6e-0eba75e0ea5a:0"
    }
}
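
In the node above, `valid_at` is identical to `created_at`, which suggests that it comes from the `reference_time` passed to `add_episode`; in the script that is the current time. This is probably also why one fact in the console output has a "Valid from" in 2026 but a "Valid until" in 2017. A small consistency check for such windows can be sketched as follows. `window_is_consistent` is a hypothetical helper, not part of Graphiti, and the timestamps are copied from the console output.

```python
from datetime import datetime
from typing import Optional


def window_is_consistent(valid_at: Optional[datetime], invalid_at: Optional[datetime]) -> bool:
    """A validity window is consistent if it is open-ended or starts no later than it ends."""
    if valid_at is None or invalid_at is None:
        return True
    return valid_at <= invalid_at


# "Kamala Harris served as the Attorney General of California."
suspicious = window_is_consistent(
    datetime.fromisoformat("2026-03-20 10:13:39.601668+00:00"),
    datetime.fromisoformat("2017-01-04 00:00:00+00:00"),
)

# "Harris held the position of Attorney General from January 3, 2011, to January 3, 2017."
expected = window_is_consistent(
    datetime.fromisoformat("2011-01-03 00:00:00+00:00"),
    datetime.fromisoformat("2017-01-03 00:00:00+00:00"),
)

print(suspicious, expected)  # False True
```

If the episodes describe past events, passing the time at which each episode actually occurred as `reference_time`, instead of `datetime.now(timezone.utc)`, may produce more meaningful windows.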

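3.2 Troubleshooting

The error below is caused by an incompatible LLM; the only fix I found was to switch models. In my tests, Kimi's moonshot-v1-32k works.

The error is raised at the following call. The likely cause is that the model's output, once parsed, cannot be mapped onto the expected schema.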

await graphiti.add_episode(
    name=f'Freakonomics Radio {i}',
    episode_body=episode_body_str,
    source=episode['type'],
    source_description=episode['description'],
    reference_time=datetime.now(timezone.utc),
)

Error output

pydantic_core._pydantic_core.ValidationError: 3 validation errors for ExtractedEntities
extracted_entities.0.name
  Field required [type=missing, input_value={'entity_name': 'Kamala H...s', 'entity_type_id': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/missing
extracted_entities.1.name
  Field required [type=missing, input_value={'entity_name': 'Attorney...a', 'entity_type_id': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/missing
extracted_entities.2.name
  Field required [type=missing, input_value={'entity_name': 'San Fran...o', 'entity_type_id': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/missing
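
The validation error shows that the model returned each entity under the key `entity_name`, while the `ExtractedEntities` schema expects `name`. One conceivable workaround, shown purely as an illustration, is to rename the key before validation. The helper below is hypothetical and works on plain dictionaries; where (or whether) such a step could be hooked into Graphiti is not covered here. Switching to a model that follows the schema remains the only approach that was actually tested in this post.

```python
from typing import Any


def normalize_extracted_entities(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `payload` in which each entity uses 'name' instead of 'entity_name'."""
    normalized = []
    for entity in payload.get("extracted_entities", []):
        entity = dict(entity)  # shallow copy so the caller's data is not mutated
        if "name" not in entity and "entity_name" in entity:
            entity["name"] = entity.pop("entity_name")
        normalized.append(entity)
    return {**payload, "extracted_entities": normalized}


# Placeholder payload shaped like the input_value in the error message
raw = {"extracted_entities": [{"entity_name": "Example Entity", "entity_type_id": 0}]}
fixed = normalize_extracted_entities(raw)
print(fixed)
# {'extracted_entities': [{'entity_type_id': 0, 'name': 'Example Entity'}]}
```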

Open bug report on GitHub

https://bgithub.xyz/getzep/graphiti/issues/912