LangChain Tutorial: Building Production AI Applications 2025
LangChain has become one of the most widely used frameworks for building LLM-powered applications. Whether you're creating chatbots, document analysis tools, or autonomous agents, LangChain provides the building blocks to go from prototype to production. This tutorial covers installation, core concepts, RAG, agents, production practices, testing, and common pitfalls.
What is LangChain?
The Framework Overview
LangChain is an open-source framework that simplifies building applications with Large Language Models (LLMs). Think of it as the "React for AI". It provides:
- Standardized interfaces for different LLMs
- Chains: Sequences of AI operations
- Agents: Autonomous decision-making systems
- Memory: Conversation state management
- Retrieval: Integration with vector databases
- Tools: Pre-built components for common tasks
Why LangChain in 2025?
Market Adoption:
- 85,000+ GitHub stars
- Used by OpenAI, Google, Microsoft
- 500,000+ developers
- Production-ready with LangSmith monitoring
- Active community and ecosystem
Key Advantages:
- Vendor-agnostic (works with any LLM)
- Production-grade monitoring
- Built-in best practices
- Extensive documentation
- Regular updates
Installation and Setup
Basic Installation
# Core LangChain
pip install langchain
# LangChain with OpenAI
pip install langchain-openai
# Vector store support
pip install langchain-community
# Additional utilities
pip install langchainhub
# Development tools
pip install langsmith
Environment Setup
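import os
from dotenv import load_dotenv
# Load variables from .env into the process environment
load_dotenv()
# API Keys
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
LANGCHAIN_API_KEY = os.getenv("LANGCHAIN_API_KEY")
# Enable monitoring: tracing is read from the environment, so set an
# environment variable rather than a Python variable
os.environ["LANGCHAIN_TRACING_V2"] = "true"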
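A missing key usually surfaces only when the first request fails, so it can help to check the environment at startup. The sketch below is one possible way to do that with the standard library only; `require_env` is a hypothetical helper name and is not part of LangChain.

```python
import os

def require_env(names, environ=None):
    """Return the requested variables, raising if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}

# Example with an explicit mapping instead of the real environment
config = require_env(["OPENAI_API_KEY"], {"OPENAI_API_KEY": "sk-placeholder"})
print(sorted(config))
```

In an application you would call `require_env(["OPENAI_API_KEY", "LANGCHAIN_API_KEY"])` right after `load_dotenv()`.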
Project Structure
my_ai_app/
├── .env
├── requirements.txt
├── app/
│   ├── __init__.py
│   ├── chains/
│   │   ├── __init__.py
│   │   └── qa_chain.py
│   ├── agents/
│   │   ├── __init__.py
│   │   └── research_agent.py
│   ├── memory/
│   │   ├── __init__.py
│   │   └── conversation_memory.py
│   └── utils/
│       ├── __init__.py
│       └── vector_store.py
├── data/
└── tests/
Core Concepts
1. LLMs and Chat Models
Basic LLM Usage
from langchain_openai import ChatOpenAI
# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4-turbo",
    temperature=0.7,
    max_tokens=1000
)
# Simple completion
response = llm.invoke("What is LangChain?")
print(response.content)
Structured Output
# Recent LangChain releases use Pydantic v2 directly;
# langchain_core.pydantic_v1 is deprecated
from pydantic import BaseModel, Field
class ProductReview(BaseModel):
    """Structured product review"""
    sentiment: str = Field(description="positive, negative, or neutral")
    rating: int = Field(description="Rating from 1-5")
    summary: str = Field(description="Brief summary")
    key_points: list[str] = Field(description="Key points from review")
# Get structured output (the result is a ProductReview instance)
structured_llm = llm.with_structured_output(ProductReview)
review = structured_llm.invoke("This product is amazing! Best purchase ever.")
print(review.sentiment, review.rating)
2. Prompt Templates
Basic Templates
from langchain_core.prompts import ChatPromptTemplate
# Create template
template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant specializing in {domain}."),
    ("human", "{question}")
])
# Use template
prompt = template.format_messages(
    domain="machine learning",
    question="What is gradient descent?"
)
response = llm.invoke(prompt)
Few-Shot Prompting
from langchain_core.prompts import FewShotChatMessagePromptTemplate
# Define examples
examples = [
    {
        "input": "I loved it!",
        "output": "Sentiment: Positive, Score: 0.9"
    },
    {
        "input": "Terrible experience.",
        "output": "Sentiment: Negative, Score: 0.1"
    }
]
# Create few-shot template
example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}")
])
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples
)
3. Chains: Composing Operations
Simple Sequential Chain
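# Legacy API: LLMChain and SimpleSequentialChain are deprecated in recent
# LangChain releases in favor of LCEL (next section)
from langchain.chains import LLMChain, SimpleSequentialChain
# Chain 1: Generate topic
topic_chain = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "Generate a blog topic about {subject}"
    )
)
# Chain 2: Write outline
outline_chain = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "Create a detailed outline for: {topic}"
    )
)
# Combine chains: the output of each chain becomes the input of the next
overall_chain = SimpleSequentialChain(
    chains=[topic_chain, outline_chain],
    verbose=True
)
# Execute (the final text is returned under the "output" key)
result = overall_chain.invoke("artificial intelligence")
print(result["output"])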
Custom Chain with LCEL (LangChain Expression Language)
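from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
# The chain needs a prompt template (not already-formatted messages)
qa_prompt = ChatPromptTemplate.from_template(
    "Answer using this context:\n{context}\n\nQuestion: {question}"
)
# Modern approach using LCEL
chain = (
    {
        # itemgetter picks one key from the input dict; RunnablePassthrough()
        # here would forward the whole dict to both variables
        "context": itemgetter("context"),
        "question": itemgetter("question")
    }
    | qa_prompt
    | llm
    | StrOutputParser()
)
# Run chain
result = chain.invoke({
    "context": "LangChain is an AI framework",
    "question": "What is LangChain used for?"
})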
4. Memory: Maintaining Conversation State
Conversation Buffer Memory
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
# Create memory
memory = ConversationBufferMemory()
# Create conversation chain
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
# Multi-turn conversation
conversation.predict(input="Hi, I'm Alice")
conversation.predict(input="What's my name?") # Will remember "Alice"
Window Memory (Last N Messages)
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(
    k=5  # Keep last 5 interactions
)
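To make the windowing behavior concrete, here is a small standard-library sketch of the same idea: a fixed-size buffer that silently drops the oldest exchange. `WindowMemory` is a hypothetical class written for illustration and is not how LangChain implements its memory.

```python
from collections import deque

class WindowMemory:
    """Keep only the last k (human, ai) exchanges."""

    def __init__(self, k):
        # deque with maxlen discards the oldest item when full
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        self.turns.append((human, ai))

    def as_prompt(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

window = WindowMemory(k=2)
for i in range(4):
    window.save(f"question {i}", f"answer {i}")
print(window.as_prompt())
```

Only the last two exchanges survive, which is why a small `k` keeps prompts short but can make the model forget details from early in the conversation.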
Summary Memory
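# ConversationSummaryMemory has no token limit setting; the buffer variant
# keeps recent messages verbatim and summarizes older ones once the
# history exceeds max_token_limit
from langchain.memory import ConversationSummaryBufferMemory
memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=1000
)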
Persistent Memory with Database
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import RedisChatMessageHistory
# Redis-backed memory (requires a running Redis server and the redis package)
message_history = RedisChatMessageHistory(
    url="redis://localhost:6379",
    session_id="user_123"
)
memory = ConversationBufferMemory(
    chat_memory=message_history
)
Advanced: Retrieval-Augmented Generation (RAG)
What is RAG?
RAG combines LLMs with external knowledge bases to provide accurate, up-to-date, and source-cited responses.
Architecture:
Question → Retrieve relevant docs → Add to context → LLM generates answer
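The flow above can be sketched without any LLM or vector database. The example below is a toy: it uses word-count cosine similarity in place of real embeddings, and it stops at building the prompt that would be sent to the model. The function names and sample documents are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, docs, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def build_prompt(question, docs, k=2):
    """Add the retrieved documents to the context of the prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

docs = [
    "Refunds are accepted within 30 days of purchase.",
    "Our office is open Monday to Friday.",
    "Shipping takes three to five business days.",
]
question = "Are refunds possible after purchase?"
print(build_prompt(question, docs, k=1))
```

A real pipeline replaces `cosine` over word counts with embedding similarity in a vector store, which is what the following steps set up.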
Step 1: Document Loading
from langchain_community.document_loaders import (
    PyPDFLoader,
    WebBaseLoader,
    TextLoader
)
# Load PDF
pdf_loader = PyPDFLoader("document.pdf")
pdf_docs = pdf_loader.load()
# Load website
web_loader = WebBaseLoader("https://example.com")
web_docs = web_loader.load()
# Load text files
text_loader = TextLoader("data.txt")
text_docs = text_loader.load()
Step 2: Text Splitting
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Create splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", " ", ""]
)
# Split documents
splits = text_splitter.split_documents(pdf_docs)
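To see how `chunk_size` and `chunk_overlap` interact, consider this simplified sketch. It splits at fixed character offsets only; it does not reproduce the separator-aware recursion of `RecursiveCharacterTextSplitter`, and `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Fixed-width chunks where each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing and embedding some text twice.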
Step 3: Embeddings and Vector Store
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
# Create embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
# Create vector store
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    persist_directory="./chroma_db"
)
Step 4: Retrieval Chain
from langchain.chains import RetrievalQA
# Create retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4}
)
# Create QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)
# Ask question (invoke replaces the deprecated direct call qa_chain({...}))
result = qa_chain.invoke({"query": "What is the main topic?"})
print(result["result"])
print(result["source_documents"])  # View sources
Advanced RAG with Custom Prompt
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
# Custom prompt
template = """You are an AI assistant. Use the following context to answer the question.
If you don't know the answer, say so. Always cite your sources.
Context: {context}
Question: {question}
Answer with sources:"""
prompt = PromptTemplate(
    template=template,
    input_variables=["context", "question"]
)
# Advanced QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={"prompt": prompt},
    return_source_documents=True
)
Multi-Query Retrieval
from langchain.retrievers.multi_query import MultiQueryRetriever
# Generate multiple query perspectives
multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=llm
)
# More comprehensive retrieval (invoke replaces the deprecated
# get_relevant_documents)
docs = multi_query_retriever.invoke(
    "What are the benefits of AI?"
)
Building AI Agents
What Are Agents?
Agents are autonomous systems that:
- Understand user goals
- Plan steps to achieve them
- Use tools to take actions
- Adjust based on outcomes
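The loop behind these four points can be sketched in plain Python. In the example below a scripted policy stands in for the LLM, and the two tools and the population value are placeholders invented for illustration; a real agent would ask the model to choose each action.

```python
def run_agent(goal, policy, tools, max_steps=5):
    """Ask the policy for an action, run the tool, feed back the observation."""
    history = []  # (action, argument, observation) triples
    for _ in range(max_steps):
        action, argument = policy(goal, history)
        if action == "finish":
            return argument, history
        observation = tools[action](argument)
        history.append((action, argument, observation))
    return None, history  # step budget exhausted

def lookup(key):
    facts = {"population": "1000"}  # placeholder value, not a real figure
    return facts.get(key, "unknown")

def multiply(expression):
    left, right = expression.split("*")
    return str(float(left) * float(right))

def scripted_policy(goal, history):
    """Stand-in for the LLM: plan the next step from what has happened so far."""
    if len(history) == 0:
        return "lookup", "population"
    if len(history) == 1:
        return "multiply", history[0][2] + "*0.25"
    return "finish", history[-1][2]

answer, steps = run_agent(
    "What is 25% of the population?",
    scripted_policy,
    {"lookup": lookup, "multiply": multiply},
)
print(answer)
```

The `max_steps` budget plays the same role as `max_iterations` in an agent executor: it stops a policy that never decides to finish.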
Step 1: Define Tools
from langchain_core.tools import BaseTool, Tool
from langchain_community.tools import DuckDuckGoSearchRun
# Search tool
search = DuckDuckGoSearchRun()
# Calculator tool
class CalculatorTool(BaseTool):
    # Pydantic requires type annotations on overridden fields
    name: str = "Calculator"
    description: str = "Useful for math calculations"
    def _run(self, query: str) -> str:
        try:
            # WARNING: eval on model-generated text is unsafe; removing
            # builtins reduces but does not eliminate the risk
            return str(eval(query, {"__builtins__": {}}, {}))
        except Exception:
            return "Invalid calculation"
# Define tools list
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Search the internet for current information"
    ),
    CalculatorTool()
]
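The calculator tool above passes its input to `eval`, which will execute whatever expression the model produces. One safer option is to parse the expression and evaluate only arithmetic nodes. The sketch below uses the standard `ast` module; `safe_eval` is a hypothetical helper, and the set of supported operators is a deliberately small assumption.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expression):
    """Evaluate +, -, *, / on numeric literals without calling eval."""
    def walk(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

print(safe_eval("2 + 3 * 4"))  # 14
```

Inside `_run` you could return `str(safe_eval(query))` and catch `ValueError`, `SyntaxError`, and `ZeroDivisionError` to produce the "Invalid calculation" message.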
Step 2: Create Agent
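# Legacy API: initialize_agent is deprecated in recent releases in favor of
# constructors such as create_react_agent (next section)
from langchain.agents import initialize_agent, AgentType
# Initialize agent
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    max_iterations=5
)
# Run agent (the answer is returned under the "output" key)
result = agent.invoke({"input": "What is 25% of the population of Tokyo?"})
print(result["output"])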
Advanced: Custom Agent with Memory
from langchain.agents import AgentExecutor, create_react_agent
from langchain.memory import ConversationBufferMemory
# Create memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)
# Create custom agent
# custom_agent_prompt is not defined in this tutorial: supply your own prompt
# template. create_react_agent requires the variables {tools}, {tool_names},
# and {agent_scratchpad}; add {input} and {chat_history} to use the memory above
agent = create_react_agent(
    llm=llm,
    tools=tools,
    prompt=custom_agent_prompt
)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True
)
# Use agent
agent_executor.invoke({
    "input": "Research AI trends and summarize"
})
Function Calling Agent (Most Powerful)
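from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# Function schema, shown for reference: this is the shape sent to the model.
# LangChain derives it from each tool, so it is not passed to the agent below
functions = [
    {
        "name": "get_customer_info",
        "description": "Retrieve customer information",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {
                    "type": "string",
                    "description": "Customer ID"
                }
            },
            "required": ["customer_id"]
        }
    }
]
# The prompt must contain an agent_scratchpad placeholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])
# Create function agent
agent = create_openai_functions_agent(
    llm=llm,
    tools=tools,
    prompt=prompt
)
agent_executor = AgentExecutor(agent=agent, tools=tools)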
Production Best Practices
1. Error Handling
from langchain_core.callbacks import BaseCallbackHandler
class ErrorHandler(BaseCallbackHandler):
    def on_llm_error(self, error: BaseException, **kwargs):
        print(f"LLM Error: {error}")
        # Log to monitoring service
        # Implement retry logic
# Returned to the user when the chain fails (example text)
fallback_response = "Sorry, something went wrong. Please try again."
# Use callback: handlers are passed through the config argument
try:
    result = chain.invoke(input_data, config={"callbacks": [ErrorHandler()]})
except Exception as e:
    # Handle gracefully
    result = fallback_response
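The "retry logic" mentioned in the handler is often implemented as exponential backoff around the call itself. The following is a minimal sketch under that assumption; `invoke_with_retry` is a hypothetical helper, and the `sleep` parameter exists only so the delay can be replaced in tests. LangChain runnables also offer a `with_retry` method, which may be a better fit in real code.

```python
import time

def invoke_with_retry(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call a zero-argument function, doubling the wait after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle it
            sleep(base_delay * 2 ** (attempt - 1))

# Demonstration with a function that fails twice, then succeeds
attempts = {"count": 0}

def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

waits = []
outcome = invoke_with_retry(flaky_call, sleep=waits.append)
print(outcome, waits)
```

With a chain, the call would be wrapped as `invoke_with_retry(lambda: chain.invoke(input_data))`, keeping the fallback response for the case where every attempt fails.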
2. Cost Tracking
from langchain_community.callbacks import get_openai_callback
# Token counts and cost accumulate for every call made inside the block
with get_openai_callback() as cb:
    result = chain.invoke(input_data)
    print(f"Total Tokens: {cb.total_tokens}")
    print(f"Total Cost: ${cb.total_cost:.4f}")
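The cost figure is simple arithmetic over token counts and per-token prices. The sketch below shows the calculation; the prices are placeholders chosen for the example, not current rates for any model, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Cost in dollars, with input and output tokens priced separately."""
    input_cost = (prompt_tokens / 1000) * input_price_per_1k
    output_cost = (completion_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost

# Placeholder prices: check your provider's pricing page for real values
cost = estimate_cost(
    prompt_tokens=1500,
    completion_tokens=500,
    input_price_per_1k=0.01,
    output_price_per_1k=0.03,
)
print(f"Estimated cost: ${cost:.4f}")
```

Because output tokens usually cost more than input tokens, capping `max_tokens` tends to reduce cost more than trimming the prompt by the same number of tokens.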
3. Caching Responses
from langchain_community.cache import InMemoryCache, SQLiteCache
from langchain.globals import set_llm_cache
# In-memory cache (lost when the process exits)
set_llm_cache(InMemoryCache())
# Or persistent cache (choose one; a second call replaces the first)
set_llm_cache(SQLiteCache(database_path=".langchain.db"))
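Conceptually, an LLM cache is a lookup keyed on the prompt and the model settings. The sketch below illustrates that idea with a plain dictionary; `CachedModel` and `fake_model` are hypothetical names, and this is not how LangChain's cache classes are implemented.

```python
class CachedModel:
    """Wrap a model function and reuse answers for identical requests."""

    def __init__(self, model_fn):
        self.model_fn = model_fn
        self.cache = {}
        self.calls = 0  # number of real (uncached) model calls

    def invoke(self, prompt, **params):
        # Same prompt with different settings must not share a cache entry
        key = (prompt, tuple(sorted(params.items())))
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.model_fn(prompt, **params)
        return self.cache[key]

def fake_model(prompt, **params):
    return f"response to: {prompt}"

cached = CachedModel(fake_model)
first = cached.invoke("What is LangChain?", temperature=0)
second = cached.invoke("What is LangChain?", temperature=0)
third = cached.invoke("What is LangChain?", temperature=0.7)
print(cached.calls)
```

Exact-match caching pays off for repeated identical prompts; with a nonzero temperature it also makes responses repeat, which may or may not be what you want.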
4. Rate Limiting
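from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_openai import ChatOpenAI
# LangChain has no RateLimitCallback; chat models accept a rate_limiter instead
rate_limiter = InMemoryRateLimiter(
    requests_per_second=1,  # About 60 requests per minute
    check_every_n_seconds=0.1,
    max_bucket_size=10
)
llm = ChatOpenAI(
    model="gpt-4-turbo",
    rate_limiter=rate_limiter,
    max_retries=3
)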
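Request rate limiting is commonly built on a token bucket: tokens refill at a steady rate, and each request spends one. The sketch below shows the mechanism in plain Python. `TokenBucket` is a hypothetical class, and the injectable `clock` is an assumption made so the behavior can be demonstrated without waiting.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self):
        """Return True and spend a token if one is available."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Demonstration with a fake clock that we advance by hand
fake_time = [0.0]
bucket = TokenBucket(rate_per_second=1, capacity=2, clock=lambda: fake_time[0])
burst = [bucket.try_acquire() for _ in range(3)]
fake_time[0] = 1.0  # one second later, one token has been refilled
after_wait = bucket.try_acquire()
print(burst, after_wait)
```

A blocking limiter would sleep and retry when `try_acquire` returns False instead of rejecting the request.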
5. Monitoring with LangSmith
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your_key"
os.environ["LANGCHAIN_PROJECT"] = "production-app"
# All chains automatically traced
# View at smith.langchain.com
6. Structured Logging
import logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
# Log chain execution (%s arguments are only formatted if the level is enabled)
logger.info("Executing chain with input: %s", input_data)
result = chain.invoke(input_data)
logger.info("Chain result: %s", result)
Real-World Application Examples
Example 1: Customer Support Bot
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Chroma
# llm and embeddings are the objects created in the earlier sections
# Load company documentation
loader = PyPDFLoader("company_docs.pdf")
docs = loader.load_and_split()
# Create vector store
vectorstore = Chroma.from_documents(docs, embeddings)
# Create memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)
# Create support bot
support_bot = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=memory
)
# Handle customer query (the reply is returned under the "answer" key)
response = support_bot.invoke({"question": "What is your refund policy?"})
print(response["answer"])
Example 2: Document Analysis Pipeline
from langchain.chains import AnalyzeDocumentChain
from langchain.chains.summarize import load_summarize_chain
# Summarization chain
summary_chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce"
)
# Document analysis
analyze_chain = AnalyzeDocumentChain(
    combine_docs_chain=summary_chain
)
# Analyze document
with open("report.txt") as f:
    summary = analyze_chain.run(f.read())
Example 3: SQL Query Agent
from langchain_community.agent_toolkits import SQLDatabaseToolkit, create_sql_agent
from langchain_community.utilities import SQLDatabase
# Connect to database
db = SQLDatabase.from_uri("sqlite:///company.db")
# Create toolkit
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
# Create agent
sql_agent = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True
)
# Natural language query (use a read-only database user: the agent writes its own SQL)
result = sql_agent.invoke({"input": "How many customers signed up last month?"})
print(result["output"])
Example 4: Research Assistant
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_community.tools import DuckDuckGoSearchRun, WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
# Research tools
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
search = DuckDuckGoSearchRun()
tools = [
    Tool(
        name="Wikipedia",
        func=wikipedia.run,
        description="Search Wikipedia"
    ),
    Tool(
        name="Web Search",
        func=search.run,
        description="Search the internet"
    )
]
# Create research agent
research_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)
# Research task (the report is returned under the "output" key)
report = research_agent.invoke({
    "input": "Research the latest developments in quantum computing and write a summary"
})
print(report["output"])
Performance Optimization
1. Batch Processing
# Instead of one call per input
results = [chain.invoke(item) for item in inputs]
# Use batch, which processes the inputs concurrently
results = chain.batch(inputs)
2. Streaming Responses
# Stream tokens as they arrive
for chunk in llm.stream("Write a long story"):
    print(chunk.content, end="", flush=True)
3. Async Operations
import asyncio
async def process_queries(queries):
    tasks = [chain.ainvoke(q) for q in queries]
    results = await asyncio.gather(*tasks)
    return results
# Run async
results = asyncio.run(process_queries(query_list))
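Launching every query at once can run into provider rate limits when the list is long. One common remedy is to cap concurrency with a semaphore. The sketch below assumes this pattern; `gather_bounded` and `fake_query` are hypothetical names, and `fake_query` stands in for `chain.ainvoke`.

```python
import asyncio

async def gather_bounded(coro_fn, items, limit=5):
    """Run coro_fn over items with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(item):
        async with semaphore:
            return await coro_fn(item)

    # gather preserves input order in its results
    return await asyncio.gather(*(run_one(item) for item in items))

# Stand-in for chain.ainvoke that records how many calls overlap
in_flight = {"now": 0, "peak": 0}

async def fake_query(query):
    in_flight["now"] += 1
    in_flight["peak"] = max(in_flight["peak"], in_flight["now"])
    await asyncio.sleep(0)  # yield control, as a network call would
    in_flight["now"] -= 1
    return query.upper()

answers = asyncio.run(gather_bounded(fake_query, ["a", "b", "c", "d"], limit=2))
print(answers, in_flight["peak"])
```

With a real chain you would call `gather_bounded(chain.ainvoke, query_list, limit=5)` and tune `limit` to your rate limits.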
4. Optimize Embeddings
# Use smaller, faster embedding models for large datasets
from langchain_community.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Testing LangChain Applications
Unit Tests
import pytest
from langchain_core.language_models.fake import FakeListLLM
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
def test_prompt_template():
    template = ChatPromptTemplate.from_template(
        "Translate {text} to {language}"
    )
    result = template.format_messages(
        text="Hello",
        language="Spanish"
    )
    assert "Hello" in str(result)
    assert "Spanish" in str(result)
def test_chain_output():
    # Mock LLM for testing: returns canned responses without calling an API
    template = ChatPromptTemplate.from_template("Translate {text} to Spanish")
    fake_llm = FakeListLLM(responses=["Mocked response"])
    chain = template | fake_llm | StrOutputParser()
    result = chain.invoke({"text": "test input"})
    assert result == "Mocked response"
Integration Tests
@pytest.mark.integration
def test_rag_pipeline():
    # Test end-to-end RAG. The three helpers below are project-specific
    # functions you would write for your own application
    docs = load_test_documents()
    vectorstore = create_test_vectorstore(docs)
    chain = create_qa_chain(vectorstore)
    result = chain.invoke({"query": "test question"})
    assert result["result"]
    assert result["source_documents"]
Common Pitfalls and Solutions
Pitfall 1: Token Limits
Problem: Exceeding context window
Solution:
from langchain.text_splitter import TokenTextSplitter
splitter = TokenTextSplitter(
    chunk_size=1000,
    chunk_overlap=100
)
Pitfall 2: Poor Retrieval Quality
Problem: Retrieving irrelevant documents
Solution:
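# Use hybrid search: combine vector similarity with BM25 keyword matching
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
# BM25Retriever needs the rank_bm25 package; splits comes from the text splitting step
bm25_retriever = BM25Retriever.from_documents(splits)
ensemble_retriever = EnsembleRetriever(
    retrievers=[
        vectorstore.as_retriever(),
        bm25_retriever
    ],
    weights=[0.5, 0.5]
)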
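Combining two retrievers means merging two ranked lists into one. To our knowledge the ensemble retriever does this with weighted Reciprocal Rank Fusion; the sketch below shows that technique in plain Python. `weighted_rrf`, the constant `c=60`, and the document IDs are assumptions for illustration.

```python
def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked lists: each hit scores weight / (rank + c), summed per document."""
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank + c)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]   # from semantic search
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # from keyword search
fused = weighted_rrf([vector_hits, keyword_hits], weights=[0.5, 0.5])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents found by both retrievers rise to the top, which is why hybrid search often helps when queries mix exact terms with paraphrased concepts.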
Pitfall 3: Hallucinations
Problem: LLM making up information
Solution:
# Add source verification
prompt = """Answer based only on the provided context.
If you cannot answer from the context, say "I don't have that information."
Context: {context}
Question: {question}"""
Resources and Learning Path
Week 1: Fundamentals
- Basic chains
- Prompt engineering
- Memory systems
Week 2: RAG
- Document loading
- Vector stores
- Retrieval chains
Week 3: Agents
- Tool creation
- Agent types
- Custom agents
Week 4: Production
- Error handling
- Monitoring
- Optimization
Getting Help from Sayl Solutions
We specialize in LangChain application development:
Our Services:
- Consulting: Architecture and strategy
- Development: Custom applications
- Training: Team education
- Support: Production assistance
Recent Projects:
- Enterprise RAG system (1M+ documents)
- Customer support automation (90% resolution)
- Research assistant agent
- Document analysis pipeline
Conclusion
LangChain has matured into a production-ready framework for AI applications. Whether you're building chatbots, analysis tools, or autonomous agents, LangChain provides the foundation.
Start with simple chains, master RAG for knowledge-grounded responses, then explore agents for autonomous capabilities. The learning curve is real, but the results are transformative.
Ready to build production AI applications? Contact Sayl Solutions for expert LangChain development and consulting.