choosing a data model
| Topic | Summary |
|---|---|
| Decision | Database → SQLAlchemy, validation → Pydantic, else → dataclasses |
| Performance | dataclasses are fastest at creation; Pydantic serializes 2.5x faster than dataclasses |
| Dependencies | dataclasses (stdlib), Pydantic (1 package), SQLAlchemy (complex deps) |
| Common combo | FastAPI apps commonly pair Pydantic with SQLAlchemy |
| Recommendation | Use dataclasses for internal data: zero deps, maximum performance |
overview
Python offers three main approaches to data modeling:
- dataclasses - simple containers (standard library)
- pydantic - validation and serialization
- sqlalchemy - database persistence
quick decision guide
| i need to… | use this | 
|---|---|
| store data internally | dataclasses | 
| validate user input | pydantic | 
| handle json/api data | pydantic | 
| save to a database | sqlalchemy | 
| validate config files | pydantic | 
| pass data between functions | dataclasses | 
| build a web app | pydantic + sqlalchemy | 
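The table can be condensed into two questions: does the data need validating, and does it need persisting? The sketch below encodes that reading. `choose_data_model` is a hypothetical helper written for illustration, not part of any of the three libraries.

```python
def choose_data_model(needs_database: bool, needs_validation: bool) -> list[str]:
    """Hypothetical helper: suggest tools following the decision guide."""
    tools = []
    if needs_validation:
        tools.append("pydantic")      # external input, JSON/API data, config files
    if needs_database:
        tools.append("sqlalchemy")    # persistence, queries, transactions
    # Neither requirement applies: plain internal data.
    return tools or ["dataclasses"]


web_app = choose_data_model(needs_database=True, needs_validation=True)
internal = choose_data_model(needs_database=False, needs_validation=False)
```

A web app that both validates requests and stores them gets both tools, which matches the last row of the table.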
key differences
| aspect | dataclasses | pydantic | sqlalchemy | 
|---|---|---|---|
| purpose | data containers | validation | database orm | 
| validation | manual | automatic | database-level | 
| json support | basic | full | via serializers | 
| dependencies | none (stdlib) | pydantic package | sqlalchemy + driver | 
| complexity | simple | moderate | complex | 
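To make the "manual" validation and "basic" JSON entries for dataclasses concrete, here is a minimal stdlib-only sketch. The `Reading` class and its checks are hypothetical examples, not taken from the benchmarks on this page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Reading:
    """Hypothetical example type used only for illustration."""
    sensor_id: int
    value: float

    def __post_init__(self) -> None:
        # dataclasses do not check types or ranges; any validation is hand-written.
        if not isinstance(self.sensor_id, int) or self.sensor_id < 0:
            raise ValueError("sensor_id must be a non-negative int")
        # Manual coercion: accept anything float() understands.
        self.value = float(self.value)


reading = Reading(sensor_id=7, value="21.5")

# "Basic" JSON support: convert to a dict first, then use the json module.
payload = json.dumps(asdict(reading))
restored = Reading(**json.loads(payload))
```

Everything Pydantic would do automatically from the type annotations (coercion, range checks, error reporting) has to be written by hand in `__post_init__`, which is the trade-off the table summarizes.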
when to use each
use dataclasses when
- data is already validated
- working with internal application data
- performance is critical
- you want zero dependencies
```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
```
use pydantic when
- handling external data (apis, files, user input)
- need automatic validation
- working with json/yaml
- building apis with fastapi
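```python
from pydantic import BaseModel, EmailStr, Field

class User(BaseModel):
    email: EmailStr  # requires the optional email-validator package
    age: int = Field(ge=0, le=150)
```
use sqlalchemy when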
- need to persist data
- require complex queries
- need transactions
- working with relational data
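```python
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):  # SQLAlchemy 2.0 declarative base
    pass

class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(unique=True)
```
common patterns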
api development (fastapi)
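```python
from pydantic import BaseModel
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

# pydantic for validation of the incoming request body
class UserCreate(BaseModel):
    email: str
    password: str

# sqlalchemy for storage (only the hashed password is persisted)
class UserDB(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str]
    hashed_password: Mapped[str]

# pydantic for the response (no password fields exposed)
class UserResponse(BaseModel):
    id: int
    email: str
```
configuration management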
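```python
# pydantic for configuration (environment variables and .env files)
# In Pydantic v2, BaseSettings lives in the separate pydantic-settings package.
from pydantic import SecretStr
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    database_url: str
    api_key: SecretStr
    debug: bool = False
```
data processing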
```python
# dataclasses for internal data (slots=True requires Python 3.10+)
from dataclasses import dataclass

@dataclass(slots=True)
class DataPoint:
    timestamp: float
    value: float
    sensor_id: int
```
combining approaches
You can use multiple tools together:
- pydantic + sqlalchemy: validate input, store in database
- dataclasses + pydantic: process data internally, validate at boundaries
- all three: complex applications with multiple layers
simple decision flow
```text
need database? → sqlalchemy
need validation? → pydantic
else → dataclasses
```
performance
Relative timings from benchmarks across Python 3.10 to 3.14, with dataclasses as the baseline (values below 1.0x are faster than dataclasses):
| operation | dataclasses | pydantic | sqlalchemy | 
|---|---|---|---|
| creation | 1.0x (baseline) | 1.8x slower | 50.5x slower | 
| modification | 1.0x (baseline) | 1.1x slower | 138.8x slower | 
| serialization | 1.0x (baseline) | 0.4x (2.5x faster!) | 1.0x (similar) | 
Note: SQLAlchemy times include database operations (in-memory SQLite), so they are not a like-for-like comparison with in-memory object creation.
Important: performance varies significantly by Python version. See the version comparison for details.
learn more
library documentation
performance benchmarks
- full performance comparison - detailed benchmarks with visualizations
- python version analysis - how performance evolved from python 3.10 to 3.14
- raw benchmark data - scripts and results