大模型从入门到应用——LangChain:链(Chains)-[链与索引:检索式问答]
分类目录:《大模型从入门到应用》总目录
下面这个示例展示了如何在索引上进行问答:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
loader = TextLoader("../../state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)
日志输出:
Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff", retriever=docsearch.as_retriever())
query = "What did the president say about Ketanji Brown Jackson"
qa.run(query)
输出:
" The president said that she is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support, from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."
链的类型
我们可以指定不同的链的类型来加载和使用RetrievalQA链。有关这些类型的更详细说明可以参考《大模型从入门到应用——LangChain:链(Chains)-[链与索引:问答的基础知识]》。
有两种加载不同链的类型的方法。首先,我们可以在from_chain_type
方法中指定链的类型参数。这允许我们传入要使用的链的类型的名称。例如,在下面的示例中,我们将链的类型更改为map_reduce
。
qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="map_reduce", retriever=docsearch.as_retriever())
query = "What did the president say about Ketanji Brown Jackson"
qa.run(query)
输出:
" The president said that Judge Ketanji Brown Jackson is one of our nation's top legal minds, a former top litigator in private practice and a former federal public defender, from a family of public school educators and police officers, a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."
上述方法确实简单地更改了链的类型,但它对该链的类型的参数提供了很大的灵活性。如果我们想要控制这些参数,可以直接加载链式,然后将其直接传递给RetrievalQA链的combine_documents_chain
参数,例如:
from langchain.chains.question_answering import load_qa_chainqa_chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
qa = RetrievalQA(combine_documents_chain=qa_chain, retriever=docsearch.as_retriever())
query = "What did the president say about Ketanji Brown Jackson"
qa.run(query)
输出:
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."
自定义提示
我们可以传递自定义提示来进行问答,这些提示与我们可以传递给基础问答链的提示相同。
from langchain.prompts import PromptTemplate
prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.{context}Question: {question}
Answer in Italian:"""
PROMPT = PromptTemplate(template=prompt_template, input_variables=["context", "question"]
)
chain_type_kwargs = {"prompt": PROMPT}
qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff", retriever=docsearch.as_retriever(), chain_type_kwargs=chain_type_kwargs)
query = "What did the president say about Ketanji Brown Jackson"
qa.run(query)
输出:
" Il presidente ha detto che Ketanji Brown Jackson è una delle menti legali più importanti del paese, che continuerà l'eccellenza di Justice Breyer e che ha ricevuto un ampio sostegno, da Fraternal Order of Police a ex giudici nominati da democratici e repubblicani."
返回源文档
此外,我们可以通过在构建链式时指定一个可选参数来返回用于回答问题的源文档。
qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff", retriever=docsearch.as_retriever(), return_source_documents=True)
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"query": query})
result["result"]
输出:
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice and a former federal public defender from a family of public school educators and police officers, and that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."
输入:
result["source_documents"]
输出:
[Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0),Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n\nWe can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling. \n\nWe’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n\nWe’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n\nWe’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0),Document(page_content='And for our LGBTQ+ Americans, let’s finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \n\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \n\nWhile it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \n\nAnd soon, we’ll strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \n\nSo tonight I’m offering a Unity Agenda for the Nation. Four big things we can do together. \n\nFirst, beat the opioid epidemic.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0),Document(page_content='Tonight, I’m announcing a crackdown on these companies overcharging American businesses and consumers. \n\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up. \n\nThat ends on my watch. \n\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \n\nWe’ll also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \n\nLet’s pass the Paycheck Fairness Act and paid leave. \n\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. \n\nLet’s increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls America’s best-kept secret: community colleges.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0)]
使用源文档进行检索式问答
本节介绍了如何在索引上使用源文档进行问答。它通过使用RetrievalQAWithSourcesChain
实现,该链式结构可以从索引中查找文档。
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch
from langchain.vectorstores import Chromawith open("../../state_of_the_union.txt") as f:state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_texts(texts, embeddings, metadatas=[{"source": f"{i}-pl"} for i in range(len(texts))])
日志输出:
Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
输入:
from langchain.chains import RetrievalQAWithSourcesChain
from langchain import OpenAIchain = RetrievalQAWithSourcesChain.from_chain_type(OpenAI(temperature=0), chain_type="stuff", retriever=docsearch.as_retriever())
chain({"question": "What did the president say about Justice Breyer"}, return_only_outputs=True)
输出:
{'answer': ' The president honored Justice Breyer for his service and mentioned his legacy of excellence.\n','sources': '31-pl'}
链的类型
我们可以指定不同的链的类型,以在RetrievalQAWithSourcesChain
链中加载和使用。有关这些类型的更详细说明,可以参考《大模型从入门到应用——LangChain:链(Chains)-[链与索引:图问答(Graph QA)和带来源的问答(Q&A with Sources)]》中带来源的问答的部分。
有两种加载不同链类型的方式。首先,我们可以在from_chain_type
方法中指定链类型参数,这允许我们传递要使用的链类型的名称。例如,在下面的示例中,我们将链类型更改为map_reduce
:
chain = RetrievalQAWithSourcesChain.from_chain_type(OpenAI(temperature=0), chain_type="map_reduce", retriever=docsearch.as_retriever())
chain({"question": "What did the president say about Justice Breyer"}, return_only_outputs=True)
输出:
{'answer': ' The president said "Justice Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service."\n','sources': '31-pl'}
上述方法允许我们非常简单地更改链式类型,但它确实在链的类型的参数上提供了很大的灵活性。如果我们想控制这些参数,可以直接加载链式结构,然后使用combine_documents_chain
参数将其直接传递给RetrievalQAWithSourcesChain
链式结构:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
qa_chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
qa = RetrievalQAWithSourcesChain(combine_documents_chain=qa_chain, retriever=docsearch.as_retriever())
qa({"question": "What did the president say about Justice Breyer"}, return_only_outputs=True)
输出:
{'answer': ' The president honored Justice Breyer for his service and mentioned his legacy of excellence.\n','sources': '31-pl'}
参考文献:
[1] LangChain官方网站:https://www.langchain.com/
[2] LangChain 🦜️🔗 中文网,跟着LangChain一起学LLM/GPT开发:https://www.langchain.com.cn/
[3] LangChain中文网 - LangChain 是一个用于开发由语言模型驱动的应用程序的框架:http://www.cnlangchain.com/
相关文章:
大模型从入门到应用——LangChain:链(Chains)-[链与索引:检索式问答]
分类目录:《大模型从入门到应用》总目录 下面这个示例展示了如何在索引上进行问答: from langchain.embeddings.openai import OpenAIEmbeddings from langchain.vectorstores import Chroma from langchain.text_splitter import CharacterTextSplitte…...

【LeetCode-中等题】142. 环形链表 II
文章目录 题目方法一:哈希表set去重方法二:快慢指针 题目 方法一:哈希表set去重 思路:我们遍历链表中的每个节点,并将它记录下来;一旦遇到了此前遍历过的节点,就可以判定链表中存在环。借助哈希…...

Android TV开发之VerticalGridView
Android TV应用开发和手机应用开发是一样的,只是多了焦点控制,即选中变色。 androidx.leanback.widget.VerticalGridView 继承 BaseGridView , BaseGridView 继承 RecyclerView 。 所以 VerticalGridView 就是 RecyclerView ,使…...
SpringBoot+Vue项目添加腾讯云人脸识别
一、引言 人脸识别是一种基于人脸特征进行身份认证和识别的技术。它使用计算机视觉和模式识别的方法,通过分析图像或视频中的人脸特征,例如脸部轮廓、眼睛、鼻子、嘴巴等,来验证一个人的身份或识别出他们是谁。 人脸识别可以应用在多个领域…...
什么是IPv4?什么又是IPv6?
IPv4网络IPv4地址 IPv6网络IPv6地址 路由总结感谢 💖 hello大家好😊 IPv4网络 IPv4(Internet Protocol Version 4)是当今互联网上使用的主要网络协议。 IPv4地址 IPv4 地址有32位,通常使用点号分隔的四个十进制八位…...

飞腾FT-2000/4、D2000 log报错指导(3)
在爱好者群中遇见了很多的固件问题,这里总结记录了大家的交流内容和调试心得。主要是飞腾桌面CPU FT-2000/4 D2000相关的,包含uboot和UEFI。希望对大家调试有所帮助。 这个专题会持续更新,凑够一些就发。 23 在s3 唤醒时报错如下 check suspend ,Platform exception report…...

基于安卓的考研助手系统app 微信小程序
,设计并开发实用、方便的应用程序具有重要的意义和良好的市场前景。HBuilder技术作为当前最流行的操作平台,自然也存在着大量的应用服务需求。 本课题研究的是基于HBuilder技术平台的安卓的考研助手APP,开发这款安卓的考研助手APP主要是为了…...

Leetcode:238. 除自身以外数组的乘积【题解超详细】
纯C语言实现(小白也能看明白) 题目 给你一个整数数组 nums,返回 数组 answer ,其中 answer[i] 等于 nums 中除 nums[i] 之外其余各元素的乘积 。 题目数据 保证 数组 nums之中任意元素的全部前缀元素和后缀的乘积都在 32 位 整数…...

基于单片机的智能数字电子秤proteus仿真设计
一、系统方案 1、当电子称开机时,单片机会进入一系列初始化,进入1602显示模式设定,如开关显示、光标有无设置、光标闪烁设置,定时器初始化,进入定时器模式,如初始值赋值。之后液晶会显示Welcome To Use Ele…...

大数据(二)大数据行业相关统计数据
大数据(二)大数据行业相关统计数据 目录 一、大数据相关的各种资讯 二、转载自网络的大数据统计数据 2.1、国家大数据政策 2.2、产业结构分析 2.3、应用结构分析 2.4、数据中心 2.5、云计算 一、大数据相关的各种资讯 1. 据IDC预测࿰…...

Ruoyi安装部署(linux环境、前后端不分离版本)
目录 简介 1 新建目录 2 安装jdk 2.1 jdk下载 2.2 解压并移动文件夹到/data/service目录 2.3 配置环境变量 3 安装maven 3.1 进入官网下载最新的maven 3.2 解压并移动文件夹到/data//service目录 3.3 配置环境变量 3.4 配置本地仓库地址与阿里云镜像 4 安装git 4.…...

PHP聚合支付网站源码/对接十多个支付接口 第三方/第四方支付/系统源码
PHP聚合支付网站源码/对接十多个支付接口 第三方/第四方支付/系统源码 内附数十个支付接口代码文件。 下载地址:https://bbs.csdn.net/topics/616764485...

容器化微服务:用Kubernetes实现弹性部署
随着云计算的迅猛发展,容器化和微服务架构成为了构建现代应用的重要方式。而在这个过程中,Kubernetes(常简称为K8s)作为一个开源的容器编排平台,正在引领着容器化微服务的部署和管理革命。本文将深入探讨容器化微服务的…...

DevOps系列文章 之 Python基础
Python语法结构 语句块缩进 1.python代码块通过缩进对齐表达代码逻辑而不是使用大括号 2.缩进表达一个语句属于哪个代码块 3.缩进风格 : 建议使用四个空格 如果是Linux系统的话,可以这样做,实现自动缩进 : vim ~/.vimrc set ai…...

Harbour.Space Scholarship Contest 2023-2024 (Div. 1 + Div. 2) A ~ D
比赛链接 A 正常枚举就行,从最后一位往前枚举,-1、-2、-3...这样 #include<bits/stdc.h> #define IOS ios::sync_with_stdio(0);cin.tie(0);cout.tie(0); #define endl \nusing namespace std;typedef pair<int, int> PII; typedef long l…...

[管理与领导-53]:IT基层管理者 - 8项核心技能 - 8 - 持续改进
前言: 管理者存在的价值就是制定目标,即目标管理、通过团队(他人)拿到结果。 要想通过他人拿到结果: (1)目标:制定符合SMART原则的符合业务需求的目标,团队跳一跳就可以…...

芯片验证板卡设计原理图:446-基于VU440T的多核处理器多输入芯片验证板卡
基于VU440T的多核处理器多输入芯片验证板卡 一、板卡概述 基于XCVU440-FLGA2892的多核处理器多输入芯片验证板卡为实现网络交换芯片的验证,包括四个FMC接口、DDR、GPIO等,北京太速科技芯片验证板卡用于完成甲方的芯片验证任务,多任务…...

几个nlp的小任务(机器翻译)
几个nlp的小任务(机器翻译) 安装依赖库数据集介绍与模型介绍加载数据集看一看数据集的样子评测测试数据预处理测试tokenizer处理目标特殊的token预处理函数对数据集的所有数据进行预处理微调预训练模型设置训练参数需要一个数据收集器,把处理好数据喂给模型设置评估方法参数…...

飞腾X100 LPDDR颗粒线序配置辅助工具
B站讲解视频: 正文内容: 一、 飞腾X100显存使用LPDDR4时,需要工程师在X100的固件中去配置线序交换说明,就类似下面这个: 图1 我们需要输入每个slice中DQ的线序,也需要输入slice之间的交换关系,这个工作量也不小,同时容易出现错误,所以开发了一款辅助小工具,…...

二、数学建模之整数规划篇
1.定义 2.例题 3.使用软件及解题 一、定义 1.整数规划(Integer Programming,简称IP):是一种数学优化问题,它是线性规划(Linear Programming,简称LP)的一个扩展形式。在线性规划中&…...

Linux 文件类型,目录与路径,文件与目录管理
文件类型 后面的字符表示文件类型标志 普通文件:-(纯文本文件,二进制文件,数据格式文件) 如文本文件、图片、程序文件等。 目录文件:d(directory) 用来存放其他文件或子目录。 设备…...
工程地质软件市场:发展现状、趋势与策略建议
一、引言 在工程建设领域,准确把握地质条件是确保项目顺利推进和安全运营的关键。工程地质软件作为处理、分析、模拟和展示工程地质数据的重要工具,正发挥着日益重要的作用。它凭借强大的数据处理能力、三维建模功能、空间分析工具和可视化展示手段&…...

C++ Visual Studio 2017厂商给的源码没有.sln文件 易兆微芯片下载工具加开机动画下载。
1.先用Visual Studio 2017打开Yichip YC31xx loader.vcxproj,再用Visual Studio 2022打开。再保侟就有.sln文件了。 易兆微芯片下载工具加开机动画下载 ExtraDownloadFile1Info.\logo.bin|0|0|10D2000|0 MFC应用兼容CMD 在BOOL CYichipYC31xxloaderDlg::OnIni…...

【数据分析】R版IntelliGenes用于生物标志物发现的可解释机器学习
禁止商业或二改转载,仅供自学使用,侵权必究,如需截取部分内容请后台联系作者! 文章目录 介绍流程步骤1. 输入数据2. 特征选择3. 模型训练4. I-Genes 评分计算5. 输出结果 IntelliGenesR 安装包1. 特征选择2. 模型训练和评估3. I-Genes 评分计…...
MySQL 8.0 事务全面讲解
以下是一个结合两次回答的 MySQL 8.0 事务全面讲解,涵盖了事务的核心概念、操作示例、失败回滚、隔离级别、事务性 DDL 和 XA 事务等内容,并修正了查看隔离级别的命令。 MySQL 8.0 事务全面讲解 一、事务的核心概念(ACID) 事务是…...

Elastic 获得 AWS 教育 ISV 合作伙伴资质,进一步增强教育解决方案产品组合
作者:来自 Elastic Udayasimha Theepireddy (Uday), Brian Bergholm, Marianna Jonsdottir 通过搜索 AI 和云创新推动教育领域的数字化转型。 我们非常高兴地宣布,Elastic 已获得 AWS 教育 ISV 合作伙伴资质。这一重要认证表明,Elastic 作为 …...

五子棋测试用例
一.项目背景 1.1 项目简介 传统棋类文化的推广 五子棋是一种古老的棋类游戏,有着深厚的文化底蕴。通过将五子棋制作成网页游戏,可以让更多的人了解和接触到这一传统棋类文化。无论是国内还是国外的玩家,都可以通过网页五子棋感受到东方棋类…...

倒装芯片凸点成型工艺
UBM(Under Bump Metallization)与Bump(焊球)形成工艺流程。我们可以将整张流程图分为三大阶段来理解: 🔧 一、UBM(Under Bump Metallization)工艺流程(黄色区域ÿ…...
计算机系统结构复习-名词解释2
1.定向:在某条指令产生计算结果之前,其他指令并不真正立即需要该计算结果,如果能够将该计算结果从其产生的地方直接送到其他指令中需要它的地方,那么就可以避免停顿。 2.多级存储层次:由若干个采用不同实现技术的存储…...
自定义线程池1.2
自定义线程池 1.2 1. 简介 上次我们实现了 1.1 版本,将线程池中的线程数量交给使用者决定,并且将线程的创建延迟到任务提交的时候,在本文中我们将对这个版本进行如下的优化: 在新建线程时交给线程一个任务。让线程在某种情况下…...