Python Data Models Performance

Overview

Three approaches to data modeling in Python:

  • dataclasses: standard library data containers
  • pydantic: validation and serialization
  • sqlalchemy: database ORM

This benchmark measures their relative performance.

Test Setup

Each library implements the same model:

# Dataclass (standard library)
from dataclasses import dataclass

@dataclass
class PersonDataclass:
    name: str
    age: int
    email: str
    active: bool = True

# Pydantic
from pydantic import BaseModel

class PersonPydantic(BaseModel):
    name: str
    age: int
    email: str
    active: bool = True

# SQLAlchemy (with in-memory SQLite)
from sqlalchemy import Boolean, Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class PersonSQLAlchemy(Base):
    __tablename__ = 'persons'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)
    email = Column(String)
    active = Column(Boolean, default=True)

Operations Tested

  • creation: instantiating objects
  • modification: updating attributes
  • serialization: converting to dictionaries

Each operation runs for 10,000 iterations, timed with time.perf_counter().
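The measurement loop can be sketched roughly like this. It is a minimal, stdlib-only harness; the `bench` helper is illustrative, not the repository's actual benchmark.py:

```python
import time
from dataclasses import dataclass

@dataclass
class PersonDataclass:
    name: str
    age: int
    email: str
    active: bool = True

def bench(fn, iterations=10_000):
    """Call fn `iterations` times and return the average time per call in microseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6

# Time the "creation" operation for the dataclass model.
avg_us = bench(lambda: PersonDataclass("Ada", 36, "ada@example.com"))
print(f"creation: {avg_us:.2f} us/op")
```

The same loop body is swapped out for attribute updates (modification) and dict conversion (serialization).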

Results

[Figure: Python data model performance comparison]

Average Performance (Python 3.10-3.14)

Library        Creation        Modification      Serialization
dataclasses    1.0x            1.0x              1.0x
pydantic       1.8x slower     1.1x slower       2.5x faster
sqlalchemy     50.5x slower    138.8x slower     1.0x

Key Findings

  • Pydantic serializes ~2.5x faster thanks to its Rust-based core
  • SQLAlchemy operates at a different scale (50-140x slower) because every operation involves the database layer
  • dataclasses and Pydantic have similar performance for basic creation and modification
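For reference, the dataclasses side of the serialization test boils down to dataclasses.asdict(), which copies fields into a plain dict in pure Python; Pydantic v2's model_dump() does the equivalent work in its Rust-backed core, which is what the 2.5x gap reflects. A stdlib-only sketch of the dataclasses case:

```python
from dataclasses import dataclass, asdict

@dataclass
class PersonDataclass:
    name: str
    age: int
    email: str
    active: bool = True

p = PersonDataclass("Ada", 36, "ada@example.com")
# asdict() recursively copies every field into a new dict in pure Python;
# this is the operation measured in the serialization column.
d = asdict(p)
print(d)  # {'name': 'Ada', 'age': 36, 'email': 'ada@example.com', 'active': True}
```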

Actual Times (Microseconds)

Operation        Dataclasses    Pydantic    SQLAlchemy
creation         0.64           0.90        26.08
modification     1.72           1.96        239.49
serialization    2.78           0.76        2.03

When to Use Each

Dataclasses

  • simple data containers
  • validation not required
  • zero dependencies
  • significant performance improvements in Python 3.13+

Pydantic

  • data validation required
  • JSON serialization workflows
  • API development (FastAPI)
  • configuration management

SQLAlchemy

  • database persistence
  • complex queries and relationships
  • transaction management
  • database-agnostic applications

Benchmark Files

Running the Benchmarks

# Run benchmarks with specific Python version
uv run --python 3.14 --prerelease allow --with pydantic --with sqlalchemy python benchmark.py

# Analyze results
uv run --with matplotlib python analyze.py

# Generate visualizations
uv run --with matplotlib python visualize.py

Conclusion

Performance is one factor among many:

  • dataclasses: optimal for simple use cases, especially with Python 3.13+
  • pydantic: best choice for validation and serialization workflows
  • sqlalchemy: necessary for database-backed applications

Choose based on requirements, not benchmarks alone.
