choosing a data model

overview

python offers three main approaches for data modeling:

  1. dataclasses - simple containers (standard library)
  2. pydantic - validation and serialization
  3. sqlalchemy - database persistence

quick decision guide

i need to…                      use this
store data internally           dataclasses
validate user input             pydantic
handle json/api data            pydantic
save to a database              sqlalchemy
validate config files           pydantic
pass data between functions     dataclasses
build a web app                 pydantic + sqlalchemy

key differences

aspect          dataclasses        pydantic            sqlalchemy
purpose         data containers    validation          database orm
validation      manual             automatic           database-level
json support    basic              full                via serializers
dependencies    none (stdlib)      pydantic package    sqlalchemy + driver
complexity      simple             moderate            complex

when to use each

use dataclasses when

  • data is already validated
  • working with internal application data
  • performance is critical
  • you want zero dependencies
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
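a minimal sketch of why "data is already validated" matters: dataclasses store whatever you pass them, since type hints are not enforced at runtime.

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)   # fields stored exactly as given
q = Point("a", "b")   # no error: type hints are not enforced at runtime
print(p.x, q.x)
```

this is fine for internal data you already trust, and it is what keeps dataclasses fast.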

use pydantic when

  • handling external data (apis, files, user input)
  • need automatic validation
  • working with json/yaml
  • building apis with fastapi
from pydantic import BaseModel, EmailStr, Field  # EmailStr needs the optional email-validator extra

class User(BaseModel):
    email: EmailStr
    age: int = Field(ge=0, le=150)
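a runnable sketch of the automatic validation, assuming pydantic v2 and using a plain str email to avoid the optional email-validator dependency: valid input is coerced, invalid input raises ValidationError.

```python
from pydantic import BaseModel, Field, ValidationError

class User(BaseModel):
    email: str
    age: int = Field(ge=0, le=150)

user = User(email="a@example.com", age="30")  # "30" is coerced to int
print(user.age)  # 30

try:
    User(email="a@example.com", age=-1)  # violates the ge=0 constraint
except ValidationError as exc:
    print("rejected:", exc.errors()[0]["type"])
```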

use sqlalchemy when

  • need to persist data
  • require complex queries
  • need transactions
  • working with relational data
from sqlalchemy.orm import Mapped, mapped_column

class User(Base):  # Base is a DeclarativeBase subclass
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(unique=True)

common patterns

api development (fastapi)

# pydantic for validation
class UserCreate(BaseModel):
    email: str
    password: str

# sqlalchemy for storage
class UserDB(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str]
    hashed_password: Mapped[str]

# pydantic for response
class UserResponse(BaseModel):
    id: int
    email: str
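the glue between the storage and response layers is usually pydantic's from_attributes mode (pydantic v2), which builds a response model straight from an orm row's attributes. a sketch, with a plain class standing in for the sqlalchemy row:

```python
from pydantic import BaseModel, ConfigDict

class UserResponse(BaseModel):
    model_config = ConfigDict(from_attributes=True)  # read from object attributes
    id: int
    email: str

# a plain class stands in for a sqlalchemy row object here
class UserRow:
    id = 1
    email = "a@example.com"

resp = UserResponse.model_validate(UserRow())
print(resp.model_dump())  # {'id': 1, 'email': 'a@example.com'}
```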

configuration management

# pydantic for config files
class Settings(BaseSettings):
    database_url: str
    api_key: SecretStr
    debug: bool = False

data processing

# dataclasses for internal data
@dataclass(slots=True)
class DataPoint:
    timestamp: float
    value: float
    sensor_id: int

combining approaches

you can use multiple tools together:

  • pydantic + sqlalchemy: validate input, store in database
  • dataclasses + pydantic: process data internally, validate at boundaries
  • all three: complex applications with multiple layers

simple decision flow

need database? → sqlalchemy
need validation? → pydantic
else → dataclasses

performance

based on comprehensive benchmarks across python 3.10-3.14:

operation        dataclasses        pydantic               sqlalchemy
creation         1.0x (baseline)    1.8x slower            50.5x slower
modification     1.0x (baseline)    1.1x slower            138.8x slower
serialization    1.0x (baseline)    0.4x (2.5x faster!)    1.0x (similar)

note: sqlalchemy times include database operations (in-memory sqlite)

important: performance varies significantly by python version. see version comparison for details.
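to get a number for your own machine and python version, a minimal timeit sketch for the creation row (PointDC is an illustrative stand-in, not a class from the benchmarks):

```python
from dataclasses import dataclass
from timeit import timeit

@dataclass
class PointDC:
    x: float
    y: float

# time n instance creations and report nanoseconds per operation
n = 100_000
t = timeit(lambda: PointDC(1.0, 2.0), number=n)
print(f"dataclass creation: {t / n * 1e9:.0f} ns/op")
```

swapping in an equivalent pydantic model or orm class gives the corresponding ratios.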

learn more

library documentation

performance benchmarks
