LangChain Tutorial: Building Production AI Applications 2025

Sayl Solutions · 14 min read

LangChain has become one of the most widely used frameworks for building LLM-powered applications. Whether you're creating chatbots, document analysis tools, or autonomous agents, LangChain provides the building blocks to go from prototype to production. This tutorial covers the core concepts, retrieval-augmented generation (RAG), agents, and the production practices you need to work with LangChain in 2025.

What is LangChain?

The Framework Overview

LangChain is an open-source framework that simplifies building applications with Large Language Models (LLMs). Think of it as the "React for AI". It provides:

  • Standardized interfaces for different LLMs
  • Chains: Sequences of AI operations
  • Agents: Autonomous decision-making systems
  • Memory: Conversation state management
  • Retrieval: Integration with vector databases
  • Tools: Pre-built components for common tasks

Why LangChain in 2025?

Market Adoption:

  • 85,000+ GitHub stars
  • Used by OpenAI, Google, Microsoft
  • 500,000+ developers
  • Production-ready with LangSmith monitoring
  • Active community and ecosystem

Key Advantages:

  • Vendor-agnostic (works with any LLM)
  • Production-grade monitoring
  • Built-in best practices
  • Extensive documentation
  • Regular updates

Installation and Setup

Basic Installation

# Core LangChain
pip install langchain

# LangChain with OpenAI
pip install langchain-openai

# Vector store support
pip install langchain-community

# Additional utilities
pip install langchainhub

# Development tools
pip install langsmith

# .env file loading (needed for the dotenv import in Environment Setup)
pip install python-dotenv

Environment Setup

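import os
from dotenv import load_dotenv

# Load variables from the .env file into the process environment
load_dotenv()

# API Keys
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
LANGCHAIN_API_KEY = os.getenv("LANGCHAIN_API_KEY")

# Tracing is read from the environment, so set it there
# (assigning a plain Python variable has no effect)
os.environ["LANGCHAIN_TRACING_V2"] = "true"  # Enable monitoring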

Project Structure

my_ai_app/
├── .env
├── requirements.txt
├── app/
│   ├── __init__.py
│   ├── chains/
│   │   ├── __init__.py
│   │   └── qa_chain.py
│   ├── agents/
│   │   ├── __init__.py
│   │   └── research_agent.py
│   ├── memory/
│   │   ├── __init__.py
│   │   └── conversation_memory.py
│   └── utils/
│       ├── __init__.py
│       └── vector_store.py
├── data/
└── tests/

Core Concepts

1. LLMs and Chat Models

Basic LLM Usage

from langchain_openai import ChatOpenAI

# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4-turbo",
    temperature=0.7,
    max_tokens=1000
)

# Simple completion
response = llm.invoke("What is LangChain?")
print(response.content)

Structured Output

# Recent LangChain releases use Pydantic v2 directly;
# the langchain_core.pydantic_v1 shim is deprecated
from pydantic import BaseModel, Field

class ProductReview(BaseModel):
    """Structured product review"""
    sentiment: str = Field(description="positive, negative, or neutral")
    rating: int = Field(description="Rating from 1-5")
    summary: str = Field(description="Brief summary")
    key_points: list[str] = Field(description="Key points from review")

# Get structured output
structured_llm = llm.with_structured_output(ProductReview)
review = structured_llm.invoke("This product is amazing! Best purchase ever.")

2. Prompt Templates

Basic Templates

from langchain_core.prompts import ChatPromptTemplate

# Create template
template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant specializing in {domain}."),
    ("human", "{question}")
])

# Use template
prompt = template.format_messages(
    domain="machine learning",
    question="What is gradient descent?"
)

response = llm.invoke(prompt)

Few-Shot Prompting

from langchain_core.prompts import FewShotChatMessagePromptTemplate

# Define examples
examples = [
    {
        "input": "I loved it!",
        "output": "Sentiment: Positive, Score: 0.9"
    },
    {
        "input": "Terrible experience.",
        "output": "Sentiment: Negative, Score: 0.1"
    }
]

# Create few-shot template
example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}")
])

few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples
)
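
To make the mechanics concrete, here is a minimal plain-Python sketch of what few-shot prompting boils down to: the example pairs are placed between the system message and the new user input. The helper name `build_few_shot_messages` and the system text are assumptions for illustration, not LangChain APIs.

```python
# Sketch only: build_few_shot_messages is a hypothetical helper, not a LangChain API.

def build_few_shot_messages(system, examples, user_input):
    """Return (role, text) pairs: system, then the example pairs, then the new input."""
    messages = [("system", system)]
    for example in examples:
        messages.append(("human", example["input"]))
        messages.append(("ai", example["output"]))
    messages.append(("human", user_input))
    return messages


examples = [
    {"input": "I loved it!", "output": "Sentiment: Positive, Score: 0.9"},
    {"input": "Terrible experience.", "output": "Sentiment: Negative, Score: 0.1"},
]

messages = build_few_shot_messages(
    "Classify the sentiment of each message.", examples, "It was okay."
)
for role, text in messages:
    print(f"{role}: {text}")
```

The model sees the examples as earlier turns of the conversation, which is why their format tends to carry over to the new answer.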

3. Chains: Composing Operations

Simple Sequential Chain

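# Legacy API: LLMChain and SimpleSequentialChain are deprecated in recent
# LangChain releases. They still illustrate the idea, but prefer LCEL for new code.
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain_core.prompts import ChatPromptTemplate

# Chain 1: Generate topic
topic_chain = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "Generate a blog topic about {subject}"
    )
)

# Chain 2: Write outline (receives the output of chain 1 as {topic})
outline_chain = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "Create a detailed outline for: {topic}"
    )
)

# Combine chains
overall_chain = SimpleSequentialChain(
    chains=[topic_chain, outline_chain],
    verbose=True
)

# Execute (invoke replaces the deprecated run method)
result = overall_chain.invoke("artificial intelligence")
print(result["output"])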

Custom Chain with LCEL (LangChain Expression Language)

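from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Prompt template with the two variables the chain supplies
qa_prompt = ChatPromptTemplate.from_template(
    "Use this context to answer.\n\nContext: {context}\n\nQuestion: {question}"
)

# Modern approach using LCEL
chain = (
    {
        # itemgetter picks one key from the input dict; RunnablePassthrough
        # would forward the whole dict into both variables
        "context": itemgetter("context"),
        "question": itemgetter("question")
    }
    | qa_prompt
    | llm
    | StrOutputParser()
)

# Run chain
result = chain.invoke({
    "context": "LangChain is an AI framework",
    "question": "What is LangChain used for?"
})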
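
If the `|` syntax looks like magic, the sketch below shows the general idea in plain Python: each step overloads `__or__` so that `a | b` returns a new step that runs `a` and feeds its output to `b`. The `Step` class and the three stand-in steps are assumptions for illustration. LangChain's real runnables do much more (batching, streaming, async).

```python
class Step:
    """Wrap a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # self | other -> a new Step that runs self first, then other
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Stand-ins for a prompt template, a model, and an output parser
format_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_model = Step(lambda text: {"content": text.upper()})
parse_output = Step(lambda message: message["content"])

pipeline = format_prompt | fake_model | parse_output
result = pipeline.invoke({
    "context": "LangChain is an AI framework",
    "question": "What is LangChain used for?",
})
print(result)
```

Because every step exposes the same `invoke` interface, any step can be swapped out without touching the rest of the pipeline.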

4. Memory: Maintaining Conversation State

Conversation Buffer Memory

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

# Create memory
memory = ConversationBufferMemory()

# Create conversation chain
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

# Multi-turn conversation
conversation.predict(input="Hi, I'm Alice")
conversation.predict(input="What's my name?")  # Will remember "Alice"

Window Memory (Last N Messages)

from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(
    k=5  # Keep last 5 interactions
)
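
Conceptually, window memory is just a fixed-size queue of recent exchanges. Here is a minimal sketch using `collections.deque`. The `WindowMemory` class is a hypothetical stand-in written for this post, not the LangChain implementation.

```python
from collections import deque


class WindowMemory:
    """Keep only the last k (user, assistant) exchanges."""

    def __init__(self, k):
        self.turns = deque(maxlen=k)  # older exchanges are dropped automatically

    def save(self, user_text, ai_text):
        self.turns.append((user_text, ai_text))

    def as_prompt_text(self):
        lines = []
        for user_text, ai_text in self.turns:
            lines.append(f"Human: {user_text}")
            lines.append(f"AI: {ai_text}")
        return "\n".join(lines)


memory_sketch = WindowMemory(k=2)
memory_sketch.save("Hi, I'm Alice", "Hello Alice!")
memory_sketch.save("I like hiking", "Sounds fun.")
memory_sketch.save("What's the weather?", "I can't check that.")
print(memory_sketch.as_prompt_text())
```

Note the trade-off: with `k=2` the exchange containing the user's name has already been dropped, so the model can no longer recall it. Pick `k` with that in mind.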

Summary Memory

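from langchain.memory import ConversationSummaryBufferMemory

# max_token_limit belongs to the summary *buffer* variant: recent turns are
# kept verbatim and older ones are summarized once the limit is exceeded
memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=1000
)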

Persistent Memory with Database

from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import RedisChatMessageHistory

# Redis-backed memory
message_history = RedisChatMessageHistory(
    url="redis://localhost:6379",
    session_id="user_123"
)

memory = ConversationBufferMemory(
    chat_memory=message_history
)

Advanced: Retrieval-Augmented Generation (RAG)

What is RAG?

RAG combines LLMs with external knowledge bases to provide accurate, up-to-date, and source-cited responses.

Architecture:

Question → Retrieve relevant docs → Add to context → LLM generates answer
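
The flow above can be sketched end to end in plain Python. This toy version uses word counts as a stand-in for embeddings and stops at building the prompt. The three sample documents and all function names are made up for illustration. A real system would use a learned embedding model and a vector database.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts. Real systems use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Add the retrieved documents to the context, ready to send to an LLM."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Made-up sample documents
docs_sketch = [
    "Refunds are available within 30 days of purchase.",
    "Our office is open Monday to Friday.",
    "Shipping takes three to five business days.",
]
print(build_prompt("Are refunds available after purchase?", docs_sketch))
```

The steps in the rest of this section replace each toy piece with a production component: loaders, splitters, embeddings, and a vector store.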

Step 1: Document Loading

from langchain_community.document_loaders import (
    PyPDFLoader,
    WebBaseLoader,
    TextLoader
)

# Load PDF
pdf_loader = PyPDFLoader("document.pdf")
pdf_docs = pdf_loader.load()

# Load website
web_loader = WebBaseLoader("https://example.com")
web_docs = web_loader.load()

# Load text files
text_loader = TextLoader("data.txt")
text_docs = text_loader.load()

Step 2: Text Splitting

from langchain_text_splitters import RecursiveCharacterTextSplitter

# Create splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", " ", ""]
)

# Split documents
splits = text_splitter.split_documents(pdf_docs)
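
To see what `chunk_size` and `chunk_overlap` actually do, here is a simplified sketch that splits on fixed character positions. It is an illustration only: the recursive splitter additionally tries to break on the separators you give it, so real chunk boundaries will differ. The function name `split_with_overlap` is an assumption for this post.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character chunks that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one. That overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk.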

Step 3: Embeddings and Vector Store

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Create embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

# Create vector store
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    persist_directory="./chroma_db"
)

Step 4: Retrieval Chain

from langchain.chains import RetrievalQA

# Create retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4}
)

# Create QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

# Ask question
result = qa_chain.invoke({"query": "What is the main topic?"})
print(result["result"])
print(result["source_documents"])  # View sources

Advanced RAG with Custom Prompt

from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

# Custom prompt
template = """You are an AI assistant. Use the following context to answer the question.
If you don't know the answer, say so. Always cite your sources.

Context: {context}

Question: {question}

Answer with sources:"""

prompt = PromptTemplate(
    template=template,
    input_variables=["context", "question"]
)

# Advanced QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={"prompt": prompt},
    return_source_documents=True
)

Multi-Query Retrieval

from langchain.retrievers import MultiQueryRetriever

# Generate multiple query perspectives
multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=llm
)

# More comprehensive retrieval (invoke replaces the deprecated
# get_relevant_documents method)
docs = multi_query_retriever.invoke(
    "What are the benefits of AI?"
)

Building AI Agents

What Are Agents?

Agents are autonomous systems that:

  1. Understand user goals
  2. Plan steps to achieve them
  3. Use tools to take actions
  4. Adjust based on outcomes
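
The four steps above amount to a loop. The sketch below shows that loop in plain Python with a scripted function standing in for the LLM's decisions. The tool names, the pretend stock figure, and `scripted_policy` are all assumptions for illustration.

```python
def run_agent(goal, decide, tools, max_iterations=5):
    """Loop: ask the policy for an action, run the tool, feed the result back.

    decide(goal, history) returns ("tool_name", tool_input) or ("finish", answer).
    """
    history = []
    for _ in range(max_iterations):
        action, argument = decide(goal, history)
        if action == "finish":
            return argument
        observation = tools[action](argument)
        history.append((action, argument, observation))
    return "Stopped: iteration limit reached"


# Scripted stand-in for the LLM's decisions (hypothetical, for illustration only)
def scripted_policy(goal, history):
    if not history:
        return ("lookup", "widgets in stock")
    if len(history) == 1:
        return ("multiply", (history[0][2], 0.25))
    return ("finish", f"Result: {history[-1][2]}")


tools_sketch = {
    "lookup": lambda query: 200,  # pretend data source
    "multiply": lambda pair: pair[0] * pair[1],
}

answer = run_agent("What is 25% of the widgets in stock?", scripted_policy, tools_sketch)
print(answer)  # Result: 50.0
```

The iteration cap matters in practice: without it, a model that keeps choosing tools can loop (and bill you) indefinitely.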

Step 1: Define Tools

from langchain_core.tools import BaseTool, Tool
from langchain_community.tools import DuckDuckGoSearchRun

# Search tool (requires the duckduckgo-search package)
search = DuckDuckGoSearchRun()

# Calculator tool
class CalculatorTool(BaseTool):
    # Type annotations are required when overriding BaseTool fields
    name: str = "Calculator"
    description: str = "Useful for math calculations"

    def _run(self, query: str) -> str:
        try:
            # WARNING: eval() executes arbitrary code. Acceptable for a local
            # demo only; never pass it untrusted or model-generated input.
            return str(eval(query))
        except Exception:
            return "Invalid calculation"

# Define tools list
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Search the internet for current information"
    ),
    CalculatorTool()
]
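
Since an agent passes model-generated text to its tools, a calculator built on `eval` is risky. One safer option, sketched here, is to parse the expression with Python's `ast` module and allow only basic arithmetic. The function names are assumptions for this post, and exponentiation is left out on purpose so a huge power cannot stall the process.

```python
import ast
import operator

# Only these operators are accepted; anything else is rejected
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.USub: operator.neg,
}


def _evaluate(node):
    """Recursively evaluate numbers and whitelisted arithmetic operators."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def safe_calculate(expression):
    """Return the result as a string, or an error message for invalid input."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return "Invalid calculation"


print(safe_calculate("(2 + 3) * 4"))       # 20
print(safe_calculate("__import__('os')"))  # Invalid calculation
```

To use it in the tool, call `safe_calculate(query)` from `_run` in place of the `eval` line.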

Step 2: Create Agent

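# initialize_agent is a legacy helper (deprecated in recent LangChain releases)
from langchain.agents import initialize_agent, AgentType

# Initialize agent
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    max_iterations=5
)

# Run agent (invoke replaces the deprecated run method)
result = agent.invoke({"input": "What is 25% of the population of Tokyo?"})
print(result["output"])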

Advanced: Custom Agent with Memory

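from langchain.agents import AgentExecutor, create_react_agent
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import PromptTemplate

# ReAct prompt: create_react_agent requires the {tools}, {tool_names}
# and {agent_scratchpad} variables; {chat_history} is filled by the memory
custom_agent_prompt = PromptTemplate.from_template(
    """Answer the question as best you can. You have access to these tools:

{tools}

Use this format:
Question: the input question
Thought: what you should do next
Action: one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (Thought/Action/Action Input/Observation can repeat)
Thought: I now know the final answer
Final Answer: the final answer

Previous conversation:
{chat_history}

Question: {input}
Thought:{agent_scratchpad}"""
)

# Create memory (returns history as text, which suits a string prompt)
memory = ConversationBufferMemory(memory_key="chat_history")

# Create custom agent
agent = create_react_agent(
    llm=llm,
    tools=tools,
    prompt=custom_agent_prompt
)

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True
)

# Use agent
agent_executor.invoke({
    "input": "Research AI trends and summarize"
})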

Function Calling Agent (Most Powerful)

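from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool

# Define functions: the @tool decorator derives the function schema
# (name, description, parameters) from the signature and docstring
@tool
def get_customer_info(customer_id: str) -> str:
    """Retrieve customer information for the given customer ID."""
    # Placeholder body: replace with a real lookup in your own system
    return f"No record found for customer {customer_id}"

function_tools = [get_customer_info]

# The prompt must include an agent_scratchpad placeholder
function_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

# Create function agent
agent = create_openai_functions_agent(
    llm=llm,
    tools=function_tools,
    prompt=function_prompt
)

agent_executor = AgentExecutor(agent=agent, tools=function_tools)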

Production Best Practices

1. Error Handling

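import logging

from langchain_core.callbacks import BaseCallbackHandler

logger = logging.getLogger(__name__)

class ErrorHandler(BaseCallbackHandler):
    def on_llm_error(self, error: BaseException, **kwargs):
        # Log to monitoring service; retry logic can hook in here
        logger.error("LLM Error: %s", error)

# Placeholder text returned when the chain cannot answer
fallback_response = "Sorry, something went wrong. Please try again."

# Attach the handler per call, and fall back if the chain still fails
try:
    result = chain.invoke(input_data, config={"callbacks": [ErrorHandler()]})
except Exception:
    # Handle gracefully
    logger.exception("Chain invocation failed")
    result = fallback_response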
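
Transient failures (timeouts, rate-limit responses) often succeed on a second try, so a common pattern is to retry with exponential backoff before falling back. Here is a minimal sketch. `invoke_with_retry` and the simulated `flaky_call` are assumptions for illustration, and in real code you would catch only the provider's transient error types.

```python
import time


def invoke_with_retry(call, fallback, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Try call() up to max_attempts times, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as error:  # narrow this to transient errors in real code
            print(f"Attempt {attempt + 1} failed: {error}")
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return fallback


# Simulated call that fails twice, then succeeds
attempts = {"count": 0}

def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result_sketch = invoke_with_retry(
    flaky_call, fallback="Sorry, please try again.", sleep=delays.append
)
print(result_sketch, delays)  # ok [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the helper easy to test. In production you would leave it at the default `time.sleep`.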

2. Cost Tracking

from langchain_community.callbacks import get_openai_callback

with get_openai_callback() as cb:
    result = chain.invoke(input_data)
    print(f"Total Tokens: {cb.total_tokens}")
    print(f"Total Cost: ${cb.total_cost:.4f}")

3. Caching Responses

from langchain_community.cache import InMemoryCache, SQLiteCache
from langchain.globals import set_llm_cache

# In-memory cache (cleared when the process exits)
set_llm_cache(InMemoryCache())

# Or persistent cache. Choose one: a later set_llm_cache call replaces the earlier cache.
# set_llm_cache(SQLiteCache(database_path=".langchain.db"))
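
Under the hood, an exact-match LLM cache is a lookup table keyed on the prompt and the model settings. The sketch below shows that idea with a hypothetical `PromptCache` class and a fake model function. It illustrates the concept and is not LangChain's cache implementation.

```python
class PromptCache:
    """Exact-match cache: identical (model, prompt) pairs reuse the stored response."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_call(self, model_name, prompt, call):
        key = (model_name, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = call(prompt)  # only runs on a cache miss
        self._store[key] = response
        return response


calls = []

def fake_llm(prompt):
    """Stand-in for a paid API call; records how often it actually runs."""
    calls.append(prompt)
    return f"response to: {prompt}"


cache = PromptCache()
first = cache.get_or_call("demo-model", "What is LangChain?", fake_llm)
second = cache.get_or_call("demo-model", "What is LangChain?", fake_llm)
print(cache.hits, cache.misses, len(calls))  # 1 1 1
```

Exact-match caching only helps when the same prompt recurs verbatim, and with a nonzero temperature it also freezes one sampled answer, so it fits FAQs and test suites better than open-ended chat.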

4. Rate Limiting

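from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_openai import ChatOpenAI

# Client-side limiter shared by every call made through this model
rate_limiter = InMemoryRateLimiter(
    requests_per_second=1,       # 60 requests per minute
    check_every_n_seconds=0.1,   # how often to check for an available slot
    max_bucket_size=10           # maximum burst size
)

llm = ChatOpenAI(
    model="gpt-4-turbo",
    rate_limiter=rate_limiter,
    max_retries=3
)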
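
A common way to implement client-side rate limiting is a token bucket: tokens refill at a steady rate, each request spends one, and the bucket size caps bursts. The `TokenBucket` class below is a minimal sketch with an injectable clock so the behavior can be checked without waiting. The names and numbers are assumptions for illustration.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.updated = clock()

    def try_acquire(self):
        """Spend one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Fake clock so the example runs instantly and deterministically
fake_now = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: fake_now[0])

burst = [bucket.try_acquire() for _ in range(3)]
fake_now[0] = 1.0  # one second later, one token has been refilled
after_wait = bucket.try_acquire()
print(burst, after_wait)  # [True, True, False] True
```

A caller that gets `False` would typically sleep briefly and try again rather than drop the request.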

5. Monitoring with LangSmith

import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your_key"
os.environ["LANGCHAIN_PROJECT"] = "production-app"

# All chains automatically traced
# View at smith.langchain.com

6. Structured Logging

import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

logger = logging.getLogger(__name__)

# Log chain execution (%s arguments are only formatted if the record is emitted)
logger.info("Executing chain with input: %s", input_data)
result = chain.invoke(input_data)
logger.info("Chain result: %s", result)

Real-World Application Examples

Example 1: Customer Support Bot

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Chroma

# Load company documentation
loader = PyPDFLoader("company_docs.pdf")
docs = loader.load_and_split()

# Create vector store (embeddings as defined in the RAG section)
vectorstore = Chroma.from_documents(docs, embeddings)

# Create memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Create support bot
support_bot = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=memory
)

# Handle customer query
response = support_bot.invoke({"question": "What is your refund policy?"})
print(response["answer"])

Example 2: Document Analysis Pipeline

from langchain.chains import AnalyzeDocumentChain
from langchain.chains.summarize import load_summarize_chain

# Summarization chain
summary_chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce"
)

# Document analysis
analyze_chain = AnalyzeDocumentChain(
    combine_docs_chain=summary_chain
)

# Analyze document
with open("report.txt") as f:
    summary = analyze_chain.run(f.read())

Example 3: SQL Query Agent

from langchain_community.agent_toolkits import SQLDatabaseToolkit, create_sql_agent
from langchain_community.utilities import SQLDatabase

# Connect to database (use a read-only account for anything beyond a demo)
db = SQLDatabase.from_uri("sqlite:///company.db")

# Create toolkit
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

# Create agent
sql_agent = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True
)

# Natural language query
result = sql_agent.invoke({"input": "How many customers signed up last month?"})
print(result["output"])

Example 4: Research Assistant

from langchain.agents import AgentType, Tool, initialize_agent
from langchain_community.tools import DuckDuckGoSearchRun, WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Research tools
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
search = DuckDuckGoSearchRun()

tools = [
    Tool(
        name="Wikipedia",
        func=wikipedia.run,
        description="Search Wikipedia"
    ),
    Tool(
        name="Web Search",
        func=search.run,
        description="Search the internet"
    )
]

# Create research agent
research_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Research task
report = research_agent.run(
    "Research the latest developments in quantum computing and write a summary"
)

Performance Optimization

1. Batch Processing

# Instead of multiple calls
results = [chain.invoke(input) for input in inputs]

# Use batch
results = chain.batch(inputs)

2. Streaming Responses

# Stream tokens as they arrive
for chunk in llm.stream("Write a long story"):
    print(chunk.content, end="", flush=True)

3. Async Operations

import asyncio

async def process_queries(queries):
    tasks = [chain.ainvoke(q) for q in queries]
    results = await asyncio.gather(*tasks)
    return results

# Run async
results = asyncio.run(process_queries(query_list))
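
Launching every query at once can trip provider rate limits. One way to keep the benefits of async while capping the load is an `asyncio.Semaphore`, sketched below with a fake coroutine in place of the real chain call. `process_with_limit` and `fake_llm_call` are assumptions for illustration.

```python
import asyncio


async def process_with_limit(items, worker, max_concurrency=3):
    """Run worker(item) for every item, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with semaphore:  # waits here while the limit is reached
            return await worker(item)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_llm_call(query):
    await asyncio.sleep(0.01)  # stands in for network latency
    return f"answer to {query}"


results_sketch = asyncio.run(
    process_with_limit(["a", "b", "c", "d"], fake_llm_call, max_concurrency=2)
)
print(results_sketch)
```

With a real chain, `worker` would be `chain.ainvoke`.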

4. Optimize Embeddings

# Use smaller, faster embedding models for large datasets
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

Testing LangChain Applications

Unit Tests

import pytest
from langchain_core.language_models import FakeListLLM
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

def test_prompt_template():
    template = ChatPromptTemplate.from_template(
        "Translate {text} to {language}"
    )

    result = template.format_messages(
        text="Hello",
        language="Spanish"
    )

    assert "Hello" in str(result)
    assert "Spanish" in str(result)

def test_chain_output():
    # Mock LLM for testing: returns canned responses, makes no API call
    template = ChatPromptTemplate.from_template(
        "Translate {text} to {language}"
    )
    fake_llm = FakeListLLM(responses=["Mocked response"])
    chain = template | fake_llm | StrOutputParser()

    result = chain.invoke({"text": "Hello", "language": "Spanish"})
    assert result == "Mocked response"

Integration Tests

@pytest.mark.integration
def test_rag_pipeline():
    # Test end-to-end RAG. load_test_documents, create_test_vectorstore and
    # create_qa_chain are project-specific helpers you define in your test suite.
    docs = load_test_documents()
    vectorstore = create_test_vectorstore(docs)
    chain = create_qa_chain(vectorstore)

    result = chain.invoke({"query": "test question"})

    assert result["result"]
    assert result["source_documents"]

Common Pitfalls and Solutions

Pitfall 1: Token Limits

Problem: Exceeding context window

Solution:

# Requires the tiktoken package
from langchain_text_splitters import TokenTextSplitter

splitter = TokenTextSplitter(
    chunk_size=1000,
    chunk_overlap=100
)
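
Splitting handles long documents, but long conversations can overflow the context window too. A simple guard is to keep only the most recent messages that fit a token budget, sketched below. The 4-characters-per-token estimate is a rough rule of thumb for English text and an assumption here. Use the model's real tokenizer when the count has to be exact.

```python
def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token for English text."""
    return max(1, len(text) // 4)


def trim_to_budget(messages, max_tokens):
    """Keep the most recent messages whose estimated tokens fit within max_tokens."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order


history_sketch = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
trimmed = trim_to_budget(history_sketch, max_tokens=25)
print(len(trimmed))  # 2
```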

Pitfall 2: Poor Retrieval Quality

Problem: Retrieving irrelevant documents

Solution:

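# Use hybrid search: combine semantic (vector) and keyword (BM25) retrieval
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# Keyword retriever over the same chunks (requires the rank_bm25 package)
bm25_retriever = BM25Retriever.from_documents(splits)

ensemble_retriever = EnsembleRetriever(
    retrievers=[
        vectorstore.as_retriever(),
        bm25_retriever
    ],
    weights=[0.5, 0.5]
)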
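
How do two ranked lists become one? A common fusion technique is weighted reciprocal rank fusion (RRF): each list contributes `weight / (c + rank)` to a document's score, so documents ranked well by both retrievers rise to the top. The sketch below assumes the conventional constant `c=60` and uses made-up document IDs. It illustrates the technique and is not a copy of any library's code.

```python
def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked lists: each list adds weight / (c + rank) to a document's score."""
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (c + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical semantic-search ranking
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical BM25 ranking

fused = weighted_rrf([vector_hits, keyword_hits], weights=[0.5, 0.5])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

`doc_b` wins because it ranks near the top of both lists, even though neither list is trusted more than the other. Raising one weight shifts the balance toward that retriever.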

Pitfall 3: Hallucinations

Problem: LLM making up information

Solution:

# Add source verification
prompt = """Answer based only on the provided context.
If you cannot answer from the context, say "I don't have that information."

Context: {context}
Question: {question}"""

Resources and Learning Path

Week 1: Fundamentals

  • Basic chains
  • Prompt engineering
  • Memory systems

Week 2: RAG

  • Document loading
  • Vector stores
  • Retrieval chains

Week 3: Agents

  • Tool creation
  • Agent types
  • Custom agents

Week 4: Production

  • Error handling
  • Monitoring
  • Optimization

Getting Help from Sayl Solutions

We specialize in LangChain application development:

Our Services:

  • Consulting: Architecture and strategy
  • Development: Custom applications
  • Training: Team education
  • Support: Production assistance

Recent Projects:

  • Enterprise RAG system (1M+ documents)
  • Customer support automation (90% resolution)
  • Research assistant agent
  • Document analysis pipeline

Conclusion

LangChain has matured into a production-ready framework for AI applications. Whether you're building chatbots, analysis tools, or autonomous agents, LangChain provides the foundation.

Start with simple chains, master RAG for knowledge-grounded responses, then explore agents for autonomous capabilities. The learning curve is real, but the results are transformative.

Ready to build production AI applications? Contact Sayl Solutions for expert LangChain development and consulting.

