Pydantic

Also known as: pydantic v2, BaseModel, data validation

TL;DR

Pydantic is the runtime type-validation library that has quietly become a hard dependency of the Python ML ecosystem. You declare a BaseModel, get validation, JSON-schema export, and a v2 Rust core for free.

Pydantic is the data-validation library that, somewhere between 2020 and 2023, became load-bearing for the entire Python ML stack. You declare a class that inherits from BaseModel, annotate its fields with types, and get back: runtime validation, JSON serialization, JSON-schema export, and — in v2 — a Rust core that makes the whole thing fast enough to put on the hot path.

from pydantic import BaseModel

class Citation(BaseModel):
    doc_id: str
    quote: str
    confidence: float

c = Citation.model_validate({"doc_id": "abc", "quote": "...", "confidence": 0.92})

That’s the whole API surface most people touch. The rest of the library is built around making this pattern scale to nested models, custom validators, and schemas that round-trip cleanly to LLM providers.

Why every ML codebase ends up using it

Python is a dynamically typed language being used to write systems whose correctness depends entirely on data shape. The contradiction is unsustainable. Half the Python ML toolchain — PyTorch, JAX, NumPy — is genuinely worth the dynamic-language tax because the iteration speed is real. But the moment data crosses a process boundary (HTTP, queue, LLM output, file format), the type checker stops helping and you’re back to KeyError at 3am.

Pydantic closes that gap. It’s the runtime check that mypy can’t perform.

The type checker tells you what you wrote. Pydantic tells you what actually arrived.
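
A minimal sketch of what "what actually arrived" looks like in practice, reusing the Citation model from above. The payload is made up for illustration: one field is missing and one has the wrong type.

```python
from pydantic import BaseModel, ValidationError

class Citation(BaseModel):
    doc_id: str
    quote: str
    confidence: float

# Hypothetical payload from an external source: "quote" is missing and
# "confidence" is a string that cannot be parsed as a float.
payload = {"doc_id": "abc", "confidence": "high"}

try:
    Citation.model_validate(payload)
    errors = []
except ValidationError as exc:
    # exc.errors() returns one dict per failing field, including its
    # location ("loc"), an error type ("type"), and a message ("msg").
    errors = exc.errors()

for err in errors:
    print(err["loc"], err["msg"])
```

A type checker has nothing to say about this payload because it only exists at runtime; the ValidationError reports both problems in one pass instead of failing on the first.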

This is why it shows up in places where you might not expect it:

  • FastAPI uses Pydantic models for request/response schemas — they double as OpenAPI specs.
  • LangChain, LlamaIndex, and the OpenAI/Anthropic SDKs use Pydantic for tool definitions and structured outputs.
  • Hugging Face’s newer training configs, Modal’s function specs, Prefect’s flow inputs — all Pydantic.
  • MCP servers describe their tools with Pydantic-compatible JSON schemas.

Once you accept that Python is the language and you’re not going to escape, Pydantic is the cheapest path to the kind of guarantees a statically typed language gives you for free.

The schema-as-truth pattern

The single most useful thing about Pydantic in an LLM context is that the same class drives the prompt, the response format, and the downstream consumer:

from pydantic import BaseModel
from openai import OpenAI

class Answer(BaseModel):
    summary: str
    citations: list[str]
    confidence: float

client = OpenAI()
resp = client.chat.completions.parse(
    model="gpt-4o",
    messages=[...],
    response_format=Answer,
)
answer: Answer = resp.choices[0].message.parsed

The schema gets shipped to the model, the model is constrained to emit conforming JSON, and the result is parsed straight back into a typed Python object. There is exactly one definition of Answer in the codebase, and it’s the source of truth for prompt, transport, and consumer, so the three cannot drift apart unnoticed.

This is the same pattern that powers tool calling: the tool’s signature is a Pydantic model, the LLM emits arguments matching it, and your handler receives a validated object.
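
The tool side of the pattern can be sketched without any provider SDK. Everything named here (GetWeatherArgs, get_weather, TOOLS, dispatch) is hypothetical, and the raw argument string stands in for whatever the LLM would emit.

```python
from pydantic import BaseModel, Field

class GetWeatherArgs(BaseModel):
    """Arguments for a hypothetical get_weather tool."""
    city: str
    units: str = Field(default="celsius", pattern="^(celsius|fahrenheit)$")

def get_weather(args: GetWeatherArgs) -> str:
    # Placeholder handler; a real one would call a weather service.
    return f"Weather for {args.city} in {args.units}"

# Registry mapping tool name -> (argument model, handler).
TOOLS = {"get_weather": (GetWeatherArgs, get_weather)}

def dispatch(tool_name: str, raw_arguments: str) -> str:
    """Validate the JSON arguments for a tool call, then run the handler."""
    args_model, handler = TOOLS[tool_name]
    args = args_model.model_validate_json(raw_arguments)  # raises ValidationError on bad input
    return handler(args)

# The schema that would be advertised to the LLM as the tool's parameters.
tool_schema = GetWeatherArgs.model_json_schema()

# Simulated tool call; "units" is omitted, so the default applies.
result = dispatch("get_weather", '{"city": "Oslo"}')
print(result)
```

The handler never sees a raw dict: by the time get_weather runs, the arguments have already been parsed and checked against the same class that produced the advertised schema.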

Performance — v1 to v2 was a Rust rewrite

Pydantic v1 was pure Python and, on hot paths, painfully slow — easily a bottleneck on data-heavy services. v2 (released 2023) rewrote the validation engine in Rust as pydantic-core and exposed it through a thin Python facade.

Typical v1 to v2 speedups
  • Simple model validation: 5-10x
  • Nested models: 10-20x
  • Large lists of models: 20-50x
  • JSON parsing-and-validation in one pass: ~17x

The catch: v2 broke a number of v1 idioms. @validator became @field_validator, Config classes became model_config dicts, .dict() became .model_dump(). Most projects can migrate with bump-pydantic, but a handful of edge cases (custom root validators, recursive models, GenericModel) need hand-tuning.
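
As a rough illustration of those three renames, here is a small v2 model with the v1 spelling noted in comments. The User model and its validation rule are invented for the example.

```python
from pydantic import BaseModel, ConfigDict, field_validator

class User(BaseModel):
    # v1 used a nested `class Config:`; v2 uses a model_config dict.
    model_config = ConfigDict(str_strip_whitespace=True)

    name: str
    email: str

    # v1: @validator("email")
    @field_validator("email")
    @classmethod
    def email_must_contain_at(cls, value: str) -> str:
        if "@" not in value:
            raise ValueError("email must contain '@'")
        return value.lower()

user = User(name="  Ada ", email="Ada@Example.com")

# v1: user.dict()
data = user.model_dump()
print(data)
```

The old names still exist in v2 as deprecated shims in many cases, so a codebase can appear to work after the upgrade while emitting deprecation warnings; the renames are worth doing anyway.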

Even v2 has a cost — it’s not zero. Hot paths to watch: (a) deeply nested models with thousands of items, where the per-item validation overhead compounds; (b) .model_dump() called in a serialization loop, which re-walks the tree every time; (c) custom @field_validator functions written in pure Python, which break out of the Rust core. Fixes: validate once at the boundary and pass typed objects internally, cache .model_dump() results when serializing repeatedly, push expensive validators to native types where possible.
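
One way to apply "validate once at the boundary" is to parse a whole batch with a single TypeAdapter call and hand plain typed objects to the rest of the code. This is a sketch under that assumption; Item, load_items, and total_quantity are illustrative names.

```python
from pydantic import BaseModel, TypeAdapter

class Item(BaseModel):
    sku: str
    qty: int

# Build the adapter once at module level and reuse it for every batch.
ITEMS = TypeAdapter(list[Item])

def load_items(raw_json: str) -> list[Item]:
    """Boundary: parse and validate the whole batch in one call."""
    return ITEMS.validate_json(raw_json)

def total_quantity(items: list[Item]) -> int:
    """Internal code: trusts the typed objects and does not re-validate."""
    return sum(item.qty for item in items)

raw = '[{"sku": "a1", "qty": 2}, {"sku": "b2", "qty": 3}]'
items = load_items(raw)
total = total_quantity(items)
print(total)
```

Parsing and validating the JSON in one validate_json call avoids building an intermediate list of dicts in Python first, and nothing downstream of load_items pays the validation cost again.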

Where it composes

Pydantic’s value compounds with everything it touches:

  • model_json_schema() produces the JSON Schema that the OpenAI and Anthropic structured-output and tool APIs are built around. Provider SDKs may tighten it first: OpenAI’s strict mode, for example, requires additionalProperties: false and every field listed as required.
  • Tool definitions are Pydantic models; arguments are parsed back into instances.
  • Agents define their tool surface as Pydantic schemas, the LLM emits structured calls, your dispatcher receives typed objects.
  • MCP tool descriptions are JSON Schema, which Pydantic produces natively. Most Python MCP servers are Pydantic models all the way down.
  • FastAPI — request/response/dependency injection all use Pydantic; the OpenAPI doc falls out for free.

The pattern is always the same: declare the shape once, export the schema, validate at the boundary.
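
The three steps can be sketched end to end with no network involved. The Answer model mirrors the one used earlier; the field constraints and the raw JSON string are assumptions added for illustration.

```python
import json
from pydantic import BaseModel, Field

# 1. Declare the shape once.
class Answer(BaseModel):
    summary: str = Field(description="One-paragraph answer")
    citations: list[str]
    confidence: float = Field(ge=0.0, le=1.0)

# 2. Export the schema (this dict is what gets handed to a provider or an API doc).
schema = Answer.model_json_schema()
print(json.dumps(schema, indent=2))

# 3. Validate at the boundary (raw stands in for JSON received from outside).
raw = '{"summary": "Yes.", "citations": ["doc-1"], "confidence": 0.8}'
answer = Answer.model_validate_json(raw)
print(answer.confidence)
```

Constraints such as ge and le appear in the exported schema as minimum and maximum, so the same declaration documents the contract and enforces it.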

Antipatterns

A few things that show up regularly in code review and shouldn’t:

  • Using Pydantic models for in-memory hot loops. If the data doesn’t cross a boundary, a dataclass or NamedTuple is faster and the type checker is enough. Pydantic earns its keep at edges, not in inner loops.
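  • model_config = ConfigDict(extra="allow") everywhere. This accepts and stores any unexpected field without validating it, so typos in field names and junk data pass straight through. The default ("ignore") drops unknown fields and is usually right; "forbid" rejects them and is right for strict APIs. "allow" should be a deliberate choice, not a copy-paste.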
  • Custom validators that do I/O. @field_validator should be a pure function. The moment it makes a network call or hits a database, it’s no longer validation — it’s business logic in a confusing place.
  • Mutating models after construction. Pydantic models can be frozen (model_config = ConfigDict(frozen=True)); doing so eliminates a class of bugs where downstream code modifies a “validated” object into an invalid state.
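
A short sketch of the two config options mentioned in the list, extra="forbid" and frozen=True, on a made-up StrictEvent model:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class StrictEvent(BaseModel):
    # Reject unknown fields and block attribute assignment after construction.
    model_config = ConfigDict(extra="forbid", frozen=True)

    name: str
    count: int

event = StrictEvent(name="click", count=1)

# A misspelled extra field ("cnt") is rejected instead of being dropped or stored.
try:
    StrictEvent(name="click", count=1, cnt=2)
    extra_rejected = False
except ValidationError:
    extra_rejected = True

# Assigning to a field on a frozen model raises instead of mutating it.
try:
    event.count = -1
    mutation_rejected = False
except ValidationError:
    mutation_rejected = True

print(extra_rejected, mutation_rejected, event.count)
```

Both settings are opt-in, so a shared base class carrying this model_config is one way to make strictness the project default rather than something each model has to remember.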

Go further

Why not just use dataclasses or TypedDict?

dataclasses and TypedDict are static-only — the type checker believes you, but at runtime a str masquerading as an int will sail through. Pydantic validates, coerces where sane, and raises a structured ValidationError at the boundary. For data crossing a network edge (LLM output, HTTP request, queue message), runtime validation is the only thing that actually protects you.
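
The difference is easy to see side by side. PlainRecord and CheckedRecord are illustrative names; the behavior shown is Pydantic's default (lax) mode.

```python
from dataclasses import dataclass
from pydantic import BaseModel, ValidationError

@dataclass
class PlainRecord:
    user_id: int  # annotation only; nothing checks it at runtime

class CheckedRecord(BaseModel):
    user_id: int  # validated when the object is constructed

# The dataclass stores the string as-is, despite the int annotation.
plain = PlainRecord(user_id="not-a-number")

# Pydantic coerces a numeric string to int in its default mode...
coerced = CheckedRecord(user_id="42")

# ...and rejects input that cannot be interpreted as an int.
try:
    CheckedRecord(user_id="not-a-number")
    rejected = False
except ValidationError:
    rejected = True

print(type(plain.user_id).__name__, coerced.user_id, rejected)
```

If coercion is not wanted, Pydantic also has a strict mode that rejects "42" for an int field rather than converting it.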

How does Pydantic plug into LLM structured output?

BaseModel.model_json_schema() exports a JSON Schema that OpenAI, Anthropic, and most other providers accept as the response-format or tool-input spec, sometimes after small adjustments that the provider SDK applies for you. The LLM emits JSON conforming to the schema; you call MyModel.model_validate_json(raw) and get a typed object back. The schema is the single source of truth: the same class drives the prompt and the parser.

Is v2 actually faster, or is it marketing?

Pydantic v2 rewrote the validation core in Rust (pydantic-core). Benchmarks show 5-50x speedups depending on workload — more skewed toward the high end on nested models and large lists. If you have a v1 codebase processing millions of records, the migration is usually worth it on throughput alone.
