Debugging a RAG (Retrieval-Augmented Generation) Document Question-Answering System with an LLM on a Local Ollama Platform

Environment: Python 3.10.13

1. Load the document

import bs4
from langchain_community.document_loaders import WebBaseLoader

# Only keep post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()

2. Split the document

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)

# Check how many chunks the split produced: 66 small text chunks.
len(all_splits)
# 66

# Check the length of the first chunk's content
len(all_splits[0].page_content)

# Inspect the metadata of one chunk
all_splits[10].metadata
# {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',
# 'start_index': 7056}
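
To make the roles of chunk_size, chunk_overlap, and start_index more concrete, here is a minimal pure-Python sketch. It is a simplified fixed-size sliding window, not the separator-aware recursive strategy that RecursiveCharacterTextSplitter uses, and the function name split_with_overlap and the sample text are hypothetical.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(
            {"page_content": text[start:start + chunk_size], "start_index": start}
        )
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


# Hypothetical 2500-character sample text
sample_text = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(sample_text)
for c in chunks:
    print(c["start_index"], len(c["page_content"]))
```

Each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary is still fully contained in at least one chunk. The recorded start_index makes it possible to trace a chunk back to its position in the source text.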

3. Vector store

from langchain_community.embeddings.ollama import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

vectorstore = Chroma.from_documents(
    documents=all_splits,
    embedding=OllamaEmbeddings(base_url='http://192.168.17.***:11434'),
)

4. Retrieval

retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
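
As a rough illustration of what a similarity search with k results does, the sketch below ranks stored vectors by cosine similarity to a query vector and returns the top k texts. The toy three-dimensional vectors, the chunk texts, and the helper names are all hypothetical, and cosine similarity is an assumption made for illustration: the metric and index that Chroma actually uses depend on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=6):
    """Return the texts of the k stored entries most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical store of (chunk text, embedding) pairs
toy_store = [
    ("chunk about task decomposition", [0.9, 0.1, 0.0]),
    ("chunk about memory", [0.1, 0.9, 0.0]),
    ("chunk about tool use", [0.0, 0.2, 0.9]),
]
toy_query = [1.0, 0.0, 0.0]  # hypothetical embedding of the question
results = top_k(toy_query, toy_store, k=2)
print(results)
```

In the real pipeline the query vector comes from the same embedding model that embedded the chunks, which is why the retriever and the vector store must share one embedding configuration.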

5. Augmentation

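# Prompt template
from langchain import hub
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain_community.chat_models import ChatOllama

prompt = hub.pull("rlm/rag-prompt")

# A model that has already been downloaded in Ollama
llm = ChatOllama(base_url='http://192.168.17.***:11434', model="qwen:14b", temperature=0)

# Join the retrieved chunks into a single context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Assemble the chain (assumed to follow the LangChain quickstart pattern)
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)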

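Output (as recorded; the text describes types of memory rather than task decomposition, so it appears to come from a different query):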

Memory in human brains can be broadly categorized into several types:
1. Sensory Memory: This is the initial stage where we retain impressions of sensory information, such as visual (iconic memory), auditory (echoic memory), and tactile (haptic memory).
2. Working Memory: Also known as short-term memory, it allows us to temporarily hold and manipulate a limited amount of information.
3. Long-Term Memory: This is the permanent storage of information in our brains. It can be further divided into:
- Episodic Memory: Memories of specific events or experiences.
- Semantic Memory: General knowledge and facts about the world.
- Procedural Memory: Skills and habits, such as riding a bike.
These memory types interact and support each other to enable complex cognitive processes.
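
To make the augmentation step in section 5 more concrete, the following sketch shows how retrieved chunks might be joined and inserted into a prompt next to the user's question before the text is sent to the model. The template wording is hypothetical and is not the actual content of the rlm/rag-prompt template; the chunk strings and the build_prompt helper are also made up for illustration.

```python
# Hypothetical template; the real hub prompt may be worded differently.
HYPOTHETICAL_TEMPLATE = (
    "Use the following retrieved context to answer the question.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def join_chunks(chunks):
    """Join retrieved chunk texts into one context string, separated by blank lines."""
    return "\n\n".join(chunks)


def build_prompt(question, retrieved_chunks):
    """Fill the template with the question and the joined context."""
    return HYPOTHETICAL_TEMPLATE.format(
        question=question, context=join_chunks(retrieved_chunks)
    )


augmented = build_prompt(
    "What is Task Decomposition?",
    ["First retrieved chunk.", "Second retrieved chunk."],
)
print(augmented)
```

The model never sees the vector store directly. It only sees this augmented prompt, so the quality of the answer depends on which chunks the retriever selected.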

