LangChain

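LangChain iterates extremely fast: the API changes often and the documentation lags behind the code. This post records some representative pitfalls I have run into.

Pitfall 1: Version compatibility

LangChain split its packaging. What used to be a single langchain package is now divided into:

- langchain-core: core abstractions
- langchain: the main package
- langchain-community: third-party integrations
- langchain-openai, langchain-anthropic, and so on: standalone packages for each model provider

Many older tutorials use import paths that are no longer correct in newer versions:

    # Old style (may raise ImportError)
    from langchain.chat_models import ChatOpenAI

    # New style
    from langchain_openai import ChatOpenAI

Fix: pin your versions, or follow the migration hint printed in the error message.

Pitfall 2: ConversationBufferMemory cannot be used directly in LCEL

When migrating from the old Chain API to LCEL, I found that the old Memory classes cannot simply be dropped in:

    # In LCEL you have to manage the history yourself
    from langchain_core.messages import HumanMessage, AIMessage

    history = []

    def chat(user_input: str) -> str:
        # `chain` is assumed to be an LCEL chain defined elsewhere that
        # returns a plain string (for example, one ending in StrOutputParser)
        history.append(HumanMessage(content=user_input))
        response = chain.invoke({"messages": history})
        history.append(AIMessage(content=response))
        return response

LCEL leans functional, so state management is up to you.

Pitfall 3: Chroma persistence

    import os
    # Import added for completeness; `docs` and `embeddings` are assumed
    # to be defined elsewhere
    from langchain_community.vectorstores import Chroma

    # Wrong: rebuilds the vectorstore on every run, so the existing data is
    # not reused (the documents are embedded and written again)
    vectorstore = Chroma.from_documents(docs, embeddings, persist_directory="./db")

    # Right: if the database already exists, load it directly
    if os.path.exists("./db"):
        vectorstore = Chroma(persist_directory="./db", embedding_function=embeddings)
    else:
        vectorstore = Chroma.from_documents(docs, embeddings, persist_directory="./db")

Pitfall 4: Error handling in output parsers

LLMs occasionally produce output that does not match the expected format, which makes the parser raise an error: ...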
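The excerpt cuts off before showing the author's fix, so the following is not their solution. As a library-independent sketch of one common way to cope with malformed replies, the code below parses a JSON reply defensively and retries a bounded number of times. `parse_json_reply`, `call_with_retries`, and the `generate` callable (a stand-in for whatever invokes the model) are hypothetical names introduced for illustration; LangChain ships its own parser classes, which are not used here.

```python
import json
from typing import Callable, Optional

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in


def parse_json_reply(text: str) -> dict:
    """Parse a model reply that is expected to be a JSON object.

    Tolerates a Markdown code fence around the JSON.
    Raises ValueError if no JSON object can be recovered.
    """
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (with any language tag) and the closing fence.
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    try:
        data = json.loads(cleaned)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply is valid JSON but not an object")
    return data


def call_with_retries(generate: Callable[[], str], max_attempts: int = 3) -> Optional[dict]:
    """Call `generate` until its reply parses; return None after max_attempts failures."""
    for _ in range(max_attempts):
        try:
            return parse_json_reply(generate())
        except ValueError:
            continue  # malformed reply: ask again
    return None
```

Returning `None` after the last attempt, rather than raising, is a design choice of this sketch: it forces the caller to decide what a sensible fallback is instead of letting one bad reply crash the whole pipeline.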

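In early 2023, LangChain's GitHub star count grew at a startling pace, and almost overnight it became the default toolkit for building LLM applications. This post walks through its core design and basic usage.

Why LangChain is needed

Calling the OpenAI API directly gets you a long way, but once an application grows more complex you find yourself reinventing the same wheels:

- managing history for multi-turn conversations
- templating prompts
- chaining multiple LLM steps together
- connecting to external data sources

LangChain packages these as composable components.

Core concepts

Chain

Chain is LangChain's core abstraction. It strings multiple operations together:

    # LLMChain belongs to the legacy Chain API; newer releases mark it
    # deprecated in favor of LCEL
    from langchain.chains import LLMChain
    from langchain.prompts import PromptTemplate
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model="gpt-3.5-turbo")
    prompt = PromptTemplate(
        input_variables=["topic"],
        template="Explain {topic} in 3 sentences for beginners",
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    result = chain.invoke({"topic": "vector databases"})
    print(result["text"])

Memory

The Memory component maintains the conversation history:

    from langchain.memory import ConversationBufferMemory
    from langchain.chains import ConversationChain

    memory = ConversationBufferMemory()
    conversation = ConversationChain(llm=llm, memory=memory)
    conversation.predict(input="My name is Kada")
    conversation.predict(input="Do you still remember my name?")  # it remembers

Document Loaders

Loading external documents is the foundation for building RAG:

    from langchain_community.document_loaders import TextLoader, PyPDFLoader

    # Load a text file
    loader = TextLoader("./doc.txt", encoding="utf-8")
    docs = loader.load()

    # Load a PDF
    pdf_loader = PyPDFLoader("./report.pdf")
    pages = pdf_loader.load_and_split()

LCEL: the new chaining syntax

Since LangChain 0.1, the recommended way to write chains is LCEL (LangChain Expression Language): ...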
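The excerpt is truncated before the author's LCEL example, so what follows is not it. LCEL composes steps with the `|` operator, and the idea behind that syntax can be illustrated without the library. The sketch below is a toy model in plain Python: `Step`, `fake_llm`, and the three lambda stages are hypothetical names, no model is actually called, and the real LangChain classes do considerably more (batching, streaming, async).

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a composable step; NOT LangChain's own class."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stages mirroring the prompt -> model -> output parser shape.
prompt = Step(lambda inputs: f"Explain {inputs['topic']} in 3 sentences for beginners")
fake_llm = Step(lambda text: {"content": f"[model reply to: {text}]"})  # no real API call
parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"topic": "vector databases"}))
```

Because each `|` returns another `Step`, a composed chain can itself be piped into further steps, which is the property that makes this style of chaining composable.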

LangChain Pitfalls Roundup: The Problems That Gave Me Headaches

Getting Started with LangChain: Build Your First LLM Application in Python