LangChain 101: Build Your First Real LLM Application Step by Step
A practical LangChain intro that teaches prompts, chains, and parsing through a real working example. Build your first LLM workflow the right way.
If you're reading this, you probably want to start building with LLMs but don't know where to begin with LangChain. I get it. The documentation can feel overwhelming, and most tutorials jump straight into complex RAG systems or agent architectures. Let's take a step back.
This is your LangChain 101. Think of it as your first day learning to code, except instead of "Hello World," we're going to build something actually useful. By the end of this guide, you'll understand the core concepts that make LangChain tick, and more importantly, you'll have working code that does something real.

What We're Building (And Why It Matters)
We're going to build a system that takes messy customer emails and turns them into clean, structured JSON data. Why this example? Because it shows you every fundamental LangChain concept in action without getting lost in the weeds. You'll see prompts, chains, parsers, and error handling all working together.
But honestly, the specific use case doesn't matter that much. What matters is that you'll understand how to connect these pieces. Once you get this, you can build anything.
The Core Concepts You Actually Need
Before we write any code, let me explain the four concepts that power everything in LangChain. And I mean everything.
Runnables and Chains
A runnable is just something you can call with input and get output. That's it. A chain is when you connect multiple runnables together with the pipe operator. Think of it like Unix pipes but for LLMs. You take a prompt, pipe it to a model, pipe that to a parser. Simple.
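To make that concrete, here's a toy sketch of the idea in plain Python. This is not LangChain's actual implementation, just the shape of it: anything with an invoke method is a runnable, and the pipe operator composes two of them into one.

```python
class Runnable:
    """Toy runnable: something you can invoke with input to get output."""
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # The pipe operator composes two runnables into one:
        # the output of self becomes the input of other.
        return Runnable(lambda x: other.invoke(self.invoke(x)))

shout = Runnable(str.upper)
exclaim = Runnable(lambda s: s + "!")
chain = shout | exclaim
print(chain.invoke("hello"))  # HELLO!
```

LangChain's real runnables do far more (batching, streaming, async), but the mental model is exactly this.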
Prompt Templates
These let you create reusable prompts with variables. Instead of hardcoding "Extract data from this email: [email text]" every time, you create a template once and inject different emails at runtime. It keeps your prompts organized and testable. For more advanced prompting strategies, check out our techniques for prompting reasoning models for clear, accurate answers.
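Under the hood, templating is not much more than string formatting with validation; plain Python gives the gist (a toy sketch, not LangChain's template class):

```python
# Toy version of a prompt template: one reusable string, variables injected at runtime
TEMPLATE = "Extract data from this email: {email}"

def render(email: str) -> str:
    return TEMPLATE.format(email=email)

print(render("Hi, my left earbud stopped working."))
```

LangChain's templates add role handling and input validation on top, as you'll see shortly.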
Structured Output Parsers
Here's where things get interesting. Parsers force the LLM to return data in a specific format. You define what fields you want, what types they should be, and the parser validates everything. No more regex nightmares trying to extract data from free-form text.
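Before we see LangChain's version, here's a bare-bones sketch of what a structured output parser does conceptually: parse JSON out of the model's text, then check every field against a schema. The schema and function names here are made up for illustration; this is toy code, not the library's implementation.

```python
import json

# Hypothetical schema: field name -> expected Python type
SCHEMA = {"type": str, "product": str, "action": str}

def parse_structured(llm_text: str) -> dict:
    """Parse JSON from model output and validate it against the schema."""
    data = json.loads(llm_text)
    for field, expected_type in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field} should be {expected_type.__name__}")
    return data

print(parse_structured('{"type": "complaint", "product": "earbuds", "action": "refund"}'))
```

LangChain's parsers do the same job, plus they generate the format instructions you feed back into the prompt.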
Messages
LangChain uses message objects to represent conversation turns. SystemMessage for instructions, HumanMessage for user input, AIMessage for model responses. This makes multi-turn conversations explicit and portable across different models. If you want to dive deeper into making your LLMs learn from examples on the fly, explore our guide on in-context learning techniques to boost LLM accuracy.
Quick Start: From Zero to Working Code
Alright, let's build something. First, install what you need. Nothing fancy here, just the basics:
!pip install -U langchain langchain-community langchain-openai openai python-dotenv
Now, about API keys. Please don't hardcode them. I learned this the hard way in a previous role. Use environment variables or, if you're in Colab, use their secrets feature:
import os
from google.colab import userdata
from google.colab.userdata import SecretNotFoundError

keys = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]
missing = []
for k in keys:
    value = None
    try:
        value = userdata.get(k)
    except SecretNotFoundError:
        pass
    os.environ[k] = value if value is not None else ""
    if not os.environ[k]:
        missing.append(k)
if missing:
    raise EnvironmentError(f"Missing keys: {', '.join(missing)}. Add them in Colab → Settings → Secrets.")
print("All keys loaded.")
Let's verify everything's working:
import os
assert os.getenv("OPENAI_API_KEY"), "Set OPENAI_API_KEY in your Colab secrets"
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain.output_parsers import ResponseSchema, StructuredOutputParser
Create your model instance. I'm using temperature 0 because we want consistent outputs:
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,
    max_tokens=300,
)
Test it with a simple prompt to make sure your API key works:
msg = llm.invoke("Summarize why consistent JSON outputs help downstream systems.")
print(type(msg))
print(msg.content)
Check the response metadata. This shows you token usage, which matters for cost:
print("Response metadata:", getattr(msg, "response_metadata", {}))
print("Usage metadata:", getattr(msg, "usage_metadata", {}))
Building Your First Chain
Now for the fun part. Let's build a conversation with explicit message roles:
messages = [
    SystemMessage(content="You are a concise assistant that extracts key facts."),
    HumanMessage(content="I purchased earbuds last week. The left bud is dead."),
    AIMessage(content="Noted. A device failure on the left earbud."),
    HumanMessage(content="What information would you need to process a warranty claim?"),
]
reply = llm.invoke(messages)
print(reply.content)
Create a prompt template. This is where you inject variables:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "{persona}"),
        ("human", "{user_input}"),
    ]
)
rendered = prompt.invoke({
    "persona": "You are a helpful customer support assistant.",
    "user_input": "Customer reports a faulty left earbud after 7 days. Next step?",
})
print(rendered.to_messages())
Actually, let me emphasize something. Parameterized prompts are crucial for production systems. You want to version control these, test them, swap them out easily. For more on building reliable LLM features, see our guide on prompt engineering with LLM APIs for reliable outputs.
Now compose a chain using LCEL (LangChain Expression Language):
chain = prompt | llm
resp = chain.invoke({
    "persona": "You are a helpful customer support assistant.",
    "user_input": "The customer wants a refund for defective earbuds. What should we do?",
})
print(resp.content)
Adding Structure with Parsers
This is where LangChain really shines. Define what fields you want to extract:
schemas = [
    ResponseSchema(
        name="type",
        description="One of complaint, inquiry, feedback."
    ),
    ResponseSchema(
        name="product",
        description="Product or service mentioned, string."
    ),
    ResponseSchema(
        name="action",
        description="Recommended action like refund, replace, clarify, route_to_support."
    ),
]
Create a parser and generate format instructions:
parser = StructuredOutputParser.from_response_schemas(schemas)
format_instructions = parser.get_format_instructions()
print(format_instructions[:200], "...")
Build the complete extraction chain:
extraction_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You extract structured fields from customer emails. "
            "Return JSON that strictly follows these rules. {format_instructions}"
        ),
        ("human", "Email:\n{email}"),
    ]
)
extraction_chain = extraction_prompt | llm | parser
Run some customer emails through it:
emails = [
    "Hi, my left earbud stopped working after a week. I want a refund please.",
    "Hello, can you tell me if the Model X earbuds support wireless charging?",
    "Just wanted to say the new firmware fixed my microphone issue. Thanks."
]
for e in emails:
    result = extraction_chain.invoke({
        "email": e,
        "format_instructions": format_instructions
    })
    print(result)
Validate that you're getting the right fields:
def validate_result(d):
    assert isinstance(d, dict)
    assert d["type"] in {"complaint", "inquiry", "feedback"}
    assert isinstance(d["product"], str)
    assert isinstance(d["action"], str)

for e in emails:
    d = extraction_chain.invoke({"email": e, "format_instructions": format_instructions})
    validate_result(d)
Making It Production-Ready
Let's be honest: the basic chain works, but it's not ready for production. You need error handling, monitoring, and flexibility.
First, check performance without parsing overhead:
from time import perf_counter
raw_chain = extraction_prompt | llm
start = perf_counter()
raw_msg = raw_chain.invoke({"email": emails[0], "format_instructions": format_instructions})
elapsed = perf_counter() - start
print("Latency seconds:", round(elapsed, 3))
print("Usage:", getattr(raw_msg, "usage_metadata", {}))
parsed = parser.invoke(raw_msg)
print(parsed)
Build a reusable function with proper validation:
def extract_email_fields(email: str) -> dict:
    raw = raw_chain.invoke({"email": email, "format_instructions": format_instructions})
    usage = getattr(raw, "usage_metadata", {})
    parsed = parser.invoke(raw)
    validate_result(parsed)
    return {"data": parsed, "usage": usage}
print(extract_email_fields("The Model X case will not charge. Need a replacement."))
Add domain-specific personas. I've found this incredibly useful when dealing with technical support versus sales inquiries:
persona = "You are a support triage assistant for consumer audio devices."
persona_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", persona + " Return strict JSON. {format_instructions}"),
        ("human", "Email:\n{email}"),
    ]
)
persona_chain = persona_prompt | llm | parser
print(persona_chain.invoke({"email": emails[1], "format_instructions": format_instructions}))
Test with your own examples, including edge cases:
my_emails = [
    "Order 1234. Model X earbuds arrived scratched. I want a refund.",
    "Do the Model Y earbuds pair with two phones at once?",
    "Love the sound on Model Z. Battery could be better, just feedback."
]
for e in my_emails:
    print(extraction_chain.invoke({"email": e, "format_instructions": format_instructions}))
Track telemetry for each call:
import time

def timed_invoke(email):
    t0 = time.perf_counter()
    raw = raw_chain.invoke({"email": email, "format_instructions": format_instructions})
    dt = time.perf_counter() - t0
    usage = getattr(raw, "usage_metadata", {})
    return dt, usage, parser.invoke(raw)

for e in emails:
    dt, usage, data = timed_invoke(e)
    print({"latency_s": round(dt, 3), "usage": usage, "data": data})
Here's something important. Models sometimes return malformed JSON. You need retry logic:
from langchain_core.exceptions import OutputParserException

corrective_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Return valid JSON only. Do not include commentary. {format_instructions}"),
        ("human", "Email:\n{email}"),
    ]
)
retry_chain = corrective_prompt | llm | parser

def safe_extract(email, max_retries=1):
    try:
        return extraction_chain.invoke({"email": email, "format_instructions": format_instructions})
    except OutputParserException:
        for attempt in range(max_retries):
            try:
                return retry_chain.invoke({"email": email, "format_instructions": format_instructions})
            except OutputParserException:
                continue
        raise

print(safe_extract("Refund me please. Model X left earbud broke in a week."))
Extending and Customizing
As your needs grow, you'll want to expand the schema:
schemas_extended = schemas + [
    ResponseSchema(name="urgency", description="low, medium, high based on sentiment and urgency cues.")
]
parser_ext = StructuredOutputParser.from_response_schemas(schemas_extended)
fmt_ext = parser_ext.get_format_instructions()
prompt_ext = ChatPromptTemplate.from_messages(
    [
        ("system", "Extract fields and urgency. Return strict JSON. {format_instructions}"),
        ("human", "Email:\n{email}"),
    ]
)
chain_ext = prompt_ext | llm | parser_ext
print(chain_ext.invoke({"email": emails[0], "format_instructions": fmt_ext}))
Add basic tests. Trust me, you'll thank yourself later:
def test_extraction():
    sample = "Left earbud on Model X stopped working. Please replace."
    d = extraction_chain.invoke({"email": sample, "format_instructions": format_instructions})
    assert d["type"] in {"complaint", "inquiry", "feedback"}
    assert isinstance(d["product"], str)
    assert d["action"] in {"refund", "replace", "clarify", "route_to_support"}

test_extraction()
Use fixtures for reproducible testing:
fixtures = [
    {
        "email": "Model X case not charging. Need a replacement.",
        "expect_type": {"complaint"},
    },
    {
        "email": "Do Model Y earbuds support USB C charging?",
        "expect_type": {"inquiry"},
    },
]
for fx in fixtures:
    d = extraction_chain.invoke({"email": fx["email"], "format_instructions": format_instructions})
    assert d["type"] in fx["expect_type"]
Process multiple emails efficiently:
results = [extraction_chain.invoke({"email": e, "format_instructions": format_instructions}) for e in emails]
print(results)
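That list comprehension runs requests one at a time. LCEL runnables also expose a `.batch(inputs)` method that parallelizes calls for you. To show what batching buys without needing an API key, here's the same pattern sketched with the standard library, where `fake_invoke` is a stub standing in for `extraction_chain.invoke`:

```python
from concurrent.futures import ThreadPoolExecutor

def fake_invoke(payload: dict) -> dict:
    # Stub standing in for extraction_chain.invoke (network-bound in real life)
    return {"type": "complaint", "email": payload["email"]}

inputs = [{"email": e} for e in ["a", "b", "c"]]
with ThreadPoolExecutor(max_workers=5) as pool:
    # pool.map preserves input order, just like .batch does
    results = list(pool.map(fake_invoke, inputs))
print(len(results))  # 3
```

In practice, prefer the built-in `extraction_chain.batch(inputs)` over rolling your own threading.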
The beauty of LangChain? Swapping models is trivial. In a personal project, I compared models for extraction tasks with just this kind of change; swapping in a ChatAnthropic instance works exactly the same way:
llm_alt = ChatOpenAI(model="gpt-4o", temperature=0, max_tokens=300)
extraction_chain_alt = extraction_prompt | llm_alt | parser
print(extraction_chain_alt.invoke({"email": emails[2], "format_instructions": format_instructions}))
Version your prompts and parsers:
PROMPT_VERSION = "v1.2"
SCHEMA_VERSION = "v1.1"
print({"prompt_version": PROMPT_VERSION, "schema_version": SCHEMA_VERSION})
Use environment variables for configuration:
MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4o-mini")
TEMP = float(os.getenv("TEMP", "0"))
llm_cfg = ChatOpenAI(model=MODEL_NAME, temperature=TEMP, max_tokens=300)
cfg_chain = extraction_prompt | llm_cfg | parser
Try these test cases to see it all work:
cases = [
    "I love the sound on Model X, but the right bud randomly disconnects. Can you replace it?",
    "Do Model Y earbuds work with iOS 17? If yes, how to pair?",
    "Great update, pairing is faster now. Just a note for your team."
]
for c in cases:
    print(cfg_chain.invoke({"email": c, "format_instructions": format_instructions}))
Where to Go From Here
You've just built a complete LangChain workflow. You understand runnables, chains, prompts, and parsers. You can handle errors, track usage, and swap models. That's the foundation.
What's next? Well, if you need to ground your responses in external data, look into RAG (Retrieval-Augmented Generation). Our ultimate guide to vector store retrieval for RAG systems shows you how to add semantic search to your chains.
Want more control over model behavior? Consider fine-tuning. Our step-by-step guide to fine-tuning large language models walks through the entire process.
And if you're curious about what's happening under the hood, check out our ultimate guide to transformer models for LLM practitioners.
But honestly? Start by building something with what you learned today. Pick a problem, any problem, and solve it with LangChain. The best way to learn is by doing. And now you know enough to actually do something useful.