What is the point of using LangChain?

Background

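Before the concept of RAG (retrieval-augmented generation) emerged, large and mid-sized companies already relied on retrieval when building KBQA (knowledge-base question answering) systems (see the CSDN blog post "基于ElasticSearch+文本相似度模型的检索式智能对话方案", a retrieval-based dialogue solution built on ElasticSearch plus a text-similarity model). The difference is that what was retrieved back then was the "Q" of hand-curated "QA pairs".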

See the screenshot of the WeChat dialogue open platform (微信对话开放平台) below:

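The biggest advantage of this approach is that the chatbot's answers are highly controllable; the biggest drawback is that configuring the QA pairs takes a lot of manual effort.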
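To make the QA-pair approach concrete, here is a minimal sketch of it using only the Python standard library. The QA pairs, the `answer` function, the fallback message, and the 0.3 threshold are all made up for illustration, and the bag-of-words cosine similarity is a simple stand-in for the text-similarity model a real system would use.

```python
import math
import re
from collections import Counter

# Hypothetical QA pairs; in a real system these are configured by hand.
QA_PAIRS = [
    ("How do I reset my password?", "Open Settings > Account > Reset password."),
    ("What are your business hours?", "We are open 9:00-18:00, Monday to Friday."),
    ("How can I contact support?", "Email support@example.com."),
]
FALLBACK = "Sorry, I don't have an answer for that."


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def answer(query, threshold=0.3):
    """Match the query against the stored questions (the "Q" side only)
    and return the answer paired with the most similar one."""
    query_vec = Counter(tokenize(query))
    best_score, best_answer = 0.0, None
    for question, stored_answer in QA_PAIRS:
        score = cosine(query_vec, Counter(tokenize(question)))
        if score > best_score:
            best_score, best_answer = score, stored_answer
    # Below the threshold nothing matched well enough, so fall back.
    return best_answer if best_score >= threshold else FALLBACK


print(answer("how to reset password"))
print(answer("What's the weather like today?"))
```

Because every reply comes verbatim from a hand-written answer, the output is fully controllable, but every question the bot should handle has to be written down in advance.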

Using LangChain for KBQA

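With LangChain, you usually do not need to curate QA pairs as in the approach above. Instead, the documents that carry the "knowledge" are split into text chunks directly, and those chunks are later compared with the user's question by vector similarity. The flow is roughly as follows: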

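(The flow above can also be implemented without LangChain. LangChain simply wraps steps such as "text chunking" and "vectorization" in ready-made methods, which makes the implementation more convenient.)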
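As a rough illustration of what those wrapped steps do, the chunk-then-compare flow can be sketched without LangChain. Everything here is an assumption made for the example: `split_text` is a plain fixed-size splitter (simpler than LangChain's recursive, separator-aware one), `embed` is a toy bag-of-words vector rather than a real embedding model, and the `KNOWLEDGE` text is invented.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=80, chunk_overlap=20):
    """Cut text into fixed-size character chunks that overlap slightly."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def embed(text):
    """Toy "embedding": token counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical "knowledge" document, chunked directly with no QA pairs.
KNOWLEDGE = (
    "Python is an interpreted programming language created by Guido van Rossum. "
    "LangChain is a framework for building LLM-powered applications. "
    "PyTorch is an open-source deep learning framework developed by Meta AI."
)
chunks = split_text(KNOWLEDGE)
print(retrieve("Who created Python?", chunks, k=1))
```

The retrieved chunks would then be placed into the prompt sent to the LLM. LangChain packages each of these steps (splitter, embeddings, vector store, retriever) behind a common interface.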

Code example

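    from langchain_core.documents import Document
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.vectorstores import InMemoryVectorStore
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # Assumption: the original post does not show how `embeddings` and `model`
    # are created. Any LangChain embeddings class and chat model should work;
    # the two below are only one possible choice.
    from langchain_huggingface import HuggingFaceEmbeddings
    from langchain_openai import ChatOpenAI

    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    model = ChatOpenAI(model="gpt-4o-mini")

    # Simulated knowledge base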
    documents = [
        Document(page_content="Python is a dynamically-typed, interpreted programming language created by Guido van Rossum and first released in 1991."),
        Document(page_content="LangChain is a framework for building LLM-powered applications, supporting chains, agents, and RAG patterns."),
        Document(page_content="Deep learning is a subset of machine learning that uses multi-layer neural networks to learn representations from data."),
        Document(page_content="PyTorch is an open-source deep learning framework developed by Meta AI, known for its dynamic computation graph."),
        Document(page_content="The Transformer architecture was proposed by Vaswani et al. in 2017 and is the foundation of modern LLMs."),
    ]

    # Split into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
    splits = text_splitter.split_documents(documents)

    # Build vector store with local embeddings
    vectorstore = InMemoryVectorStore.from_documents(splits, embeddings)
    retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

    # RAG prompt template
    rag_prompt = ChatPromptTemplate.from_template("""\
    You are a knowledgeable assistant. Answer the question based on the provided documents only.
    If the documents don't contain the answer, say "I cannot answer based on the provided information."

    Documents:
    {context}

    Question: {question}

    Answer:""")

    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)

    rag_chain = (
        {"context": retriever | format_docs, "question": lambda x: x}
        | rag_prompt
        | model
        | StrOutputParser()
    )

    questions = [
        "What is LangChain?",
        "Who created Python?",
        "What's the weather like today?",
    ]

    for q in questions:
        result = rag_chain.invoke(q)
        print(f"\nQ: {q}")
        print(f"A: {result}")
