Environment: Python 3.10.13
1. Load the document
import bs4
from langchain_community.document_loaders import WebBaseLoader

# Only keep post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()
2. Split the document
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)
# Check how many chunks were produced: 66 small text chunks after splitting.
len(all_splits)
# 66
# Check the length of the first chunk's content
len(all_splits[0].page_content)
# Inspect the metadata of a chunk
all_splits[10].metadata
# {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',
#  'start_index': 7056}
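To make chunk_size, chunk_overlap, and add_start_index more concrete, here is a minimal pure-Python sketch of overlapping fixed-size windows. It is not how RecursiveCharacterTextSplitter works internally: the real splitter first tries to break on separators such as paragraphs and line breaks, so its chunk boundaries will differ. The function name split_with_overlap and the sample text are hypothetical.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size windows that overlap, recording each start index."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(
            {"start_index": start, "page_content": text[start:start + chunk_size]}
        )
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij" * 250  # 2,500 characters of hypothetical text
pieces = split_with_overlap(sample)
for piece in pieces:
    print(piece["start_index"], len(piece["page_content"]))
```

Each chunk repeats the last 200 characters of the previous one, which helps keep a sentence that straddles a boundary retrievable from at least one chunk.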
3. Vector store
from langchain_community.embeddings.ollama import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

vectorstore = Chroma.from_documents(
    documents=all_splits,
    embedding=OllamaEmbeddings(base_url='http://192.168.17.***:11434'),
)
4. Retrieval
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
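Conceptually, a similarity retriever with k results embeds the query, scores every stored chunk against it, and returns the k best matches. The sketch below illustrates that idea with hand-written toy vectors and cosine similarity. Both are assumptions for illustration: real embeddings come from the embedding model, and the metric Chroma actually uses depends on how the collection is configured. The names top_k, toy_index, and toy_query are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed_chunks, k):
    """Return the texts of the k chunks most similar to the query vector."""
    scored = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


# Toy (text, embedding) pairs; real vectors have far more dimensions.
toy_index = [
    ("chunk about task decomposition", [0.9, 0.1, 0.0]),
    ("chunk about memory", [0.1, 0.9, 0.0]),
    ("chunk about tool use", [0.0, 0.2, 0.9]),
]
toy_query = [0.8, 0.2, 0.1]

results = top_k(toy_query, toy_index, k=2)
print(results)
```

A vector store does the same ranking at scale, usually with an approximate index so that it does not have to score every chunk.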
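5. Augmentation
# Prompt template
from langchain import hub
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain_community.chat_models import ChatOllama

prompt = hub.pull("rlm/rag-prompt")

# Use a model that has already been pulled in Ollama
llm = ChatOllama(base_url='http://192.168.17.***:11434', model="qwen:14b", temperature=0)

# The original omits the rag_chain definition; the version below is an
# assumption that follows the LangChain Quickstart pattern.
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)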
Output:
Memory in human brains can be broadly categorized into several types:
1. Sensory Memory: This is the initial stage where we retain impressions of sensory information, such as visual (iconic memory), auditory (echoic memory), and tactile (haptic memory).
2. Working Memory: Also known as short-term memory, it allows us to temporarily hold and manipulate a limited amount of information.
3. Long-Term Memory: This is the permanent storage of information in our brains. It can be further divided into:
- Episodic Memory: Memories of specific events or experiences.
- Semantic Memory: General knowledge and facts about the world.
- Procedural Memory: Skills and habits, such as riding a bike.
These memory types interact and support each other to enable complex cognitive processes.
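The augmentation in step 5 comes down to string assembly: the retrieved chunks are joined into one context block and substituted into a prompt template together with the question. Here is a minimal sketch without LangChain. The template text is a simplified stand-in, not the exact wording of rlm/rag-prompt, and build_prompt and the sample chunks are hypothetical.

```python
def format_docs(chunks):
    """Join retrieved chunk texts into a single context block."""
    return "\n\n".join(chunks)


# Simplified stand-in template; the hub prompt's actual wording differs.
TEMPLATE = (
    "Use the following retrieved context to answer the question.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def build_prompt(question, chunks):
    """Fill the template with the question and the joined context."""
    return TEMPLATE.format(question=question, context=format_docs(chunks))


retrieved = ["First retrieved chunk.", "Second retrieved chunk."]
final_prompt = build_prompt("What is Task Decomposition?", retrieved)
print(final_prompt)
```

The resulting string is what the LLM actually receives, so printing it is a quick way to check whether the retrieved context matches the question being asked.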
References:
Quickstart | 🦜️🔗 LangChain
智见无极
Applying LLM RAG to Text2SQL in practice (大模型 LLM RAG在 Text2SQL 上的应用实践), CSDN blog