dataclasses

overview

dataclasses - standard library module for creating classes that primarily store data

  • part of python standard library since 3.7
  • reduces boilerplate for simple data containers
  • automatic __init__, __repr__, __eq__
  • no runtime type validation and no built-in json/yaml serialization (only asdict/astuple)
  • zero dependencies
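as a rough illustration of the generated methods listed above, the sketch below uses a hypothetical Point3D class (not part of the original text) and relies only on the generated __init__, __repr__ and __eq__.

```python
from dataclasses import dataclass


@dataclass
class Point3D:  # hypothetical example class
    x: float
    y: float
    z: float = 0.0


a = Point3D(1.0, 2.0)        # generated __init__, z falls back to its default
b = Point3D(1.0, 2.0, 0.0)

print(a)         # generated __repr__: Point3D(x=1.0, y=2.0, z=0.0)
print(a == b)    # generated __eq__ compares field by field: True
```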

when to use

use dataclasses when:

  • you need simple data containers
  • data is already validated elsewhere
  • performance matters (minimal overhead)
  • you want to avoid external dependencies

don’t use when:

  • you need runtime validation
  • you need json/yaml serialization
  • you’re building apis or handling external data
  • complex business logic dominates

basic usage

simple data class

from dataclasses import dataclass, field
from typing import List, Optional
from datetime import datetime

@dataclass
class User:
    name: str
    email: str
    age: int
    tags: List[str] = field(default_factory=list)  # mutable defaults need field()
    created_at: datetime = field(default_factory=datetime.now)

immutable data

@dataclass(frozen=True)
class Point:
    x: float
    y: float

# point = Point(1.0, 2.0)
# point.x = 3.0  # raises FrozenInstanceError
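a minimal sketch of working with a frozen instance: assignment raises FrozenInstanceError, and dataclasses.replace builds a modified copy instead. the variable names moved and seen are illustrative only.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point:
    x: float
    y: float


point = Point(1.0, 2.0)

try:
    point.x = 3.0                      # assignment is blocked on frozen instances
except FrozenInstanceError:
    moved = replace(point, x=3.0)      # new instance; the original is untouched

# frozen=True together with the default eq=True also generates __hash__,
# so instances can be used as set members or dict keys
seen = {point, moved}
```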

post-init validation

@dataclass
class Email:
    address: str

    def __post_init__(self):
        if "@" not in self.address:
            raise ValueError(f"invalid email: {self.address}")
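__post_init__ can also fill in derived fields. the sketch below uses a hypothetical Rectangle class and combines a check with a computed field declared as init=False.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:  # hypothetical example class
    width: float
    height: float
    area: float = field(init=False)  # computed, not accepted by __init__

    def __post_init__(self):
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
        self.area = self.width * self.height


rect = Rectangle(2.0, 3.0)

try:
    Rectangle(-1.0, 3.0)
except ValueError as exc:
    error = str(exc)
```

note that __post_init__ runs only at construction time, so later assignments such as rect.width = -1 are not re-checked.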

field options

@dataclass
class Config:
    # hide from repr
    api_key: str = field(repr=False)

    # exclude from equality checks
    cache_size: int = field(default=100, compare=False)

    # not included in __init__
    _internal: str = field(default="", init=False)

    # keyword-only argument (python 3.10+)
    timeout: int = field(default=30, kw_only=True)

performance features

slots (python 3.10+)

@dataclass(slots=True)
class OptimizedData:
    x: int
    y: int
    # lower per-instance memory (no per-instance __dict__); savings vary by class
    # slightly faster attribute access
    # no dynamic attributes allowed

ordered comparisons

@dataclass(order=True)
class Task:
    priority: int = field(compare=True)
    title: str = field(compare=False)
    created: datetime = field(default_factory=datetime.now, compare=False)

# tasks sort by priority only
tasks = [Task(3, "low"), Task(1, "high"), Task(2, "medium")]
sorted_tasks = sorted(tasks)  # sorts by priority

pattern matching (python 3.10+)

@dataclass
class Response:
    status: int
    data: Optional[dict] = None

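class NotFound(Exception):
    """placeholder error for a 404 response"""

class APIError(Exception):
    """placeholder error for any other failure"""

def handle(response: Response) -> dict:
    match response:
        case Response(status=200, data=data) if data:
            return data
        case Response(status=404):
            raise NotFound()
        case Response(status=status):
            raise APIError(f"status {status}")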

serialization patterns

basic dict conversion

from dataclasses import asdict, astuple, dataclass

@dataclass
class Person:
    name: str
    age: int

person = Person("alice", 30)
data = asdict(person)  # {"name": "alice", "age": 30}
values = astuple(person)  # ("alice", 30)
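asdict recurses into nested dataclasses and copies containers. the sketch below uses hypothetical Address and Employee classes to show both behaviours.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Address:  # hypothetical example class
    city: str
    country: str


@dataclass
class Employee:  # hypothetical example class
    name: str
    address: Address
    skills: list[str] = field(default_factory=list)


employee = Employee("alice", Address("paris", "fr"), ["python"])

data = asdict(employee)
# {'name': 'alice', 'address': {'city': 'paris', 'country': 'fr'}, 'skills': ['python']}

# the result holds copies, so mutating it leaves the instance alone
data["skills"].append("sql")
print(employee.skills)   # ['python']
```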

json serialization

import json
from dataclasses import asdict, dataclass

@dataclass
class Person:
    name: str
    age: int

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, json_str: str):
        return cls(**json.loads(json_str))
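the round trip above only works for json-native field types. for something like a datetime field, one possible approach (a sketch, not the only option) is to encode with json.dumps(default=...) and decode by hand; the Event class here is hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class Event:  # hypothetical example class
    name: str
    starts_at: datetime

    def to_json(self) -> str:
        # default= is called for values json cannot encode on its own
        return json.dumps(asdict(self), default=datetime.isoformat)

    @classmethod
    def from_json(cls, json_str: str) -> "Event":
        raw = json.loads(json_str)
        raw["starts_at"] = datetime.fromisoformat(raw["starts_at"])
        return cls(**raw)


event = Event("launch", datetime(2024, 1, 15, 9, 30))
payload = event.to_json()
restored = Event.from_json(payload)
```

nested dataclasses need the same manual rebuilding in from_json, since json.loads returns plain dicts.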

common patterns

configuration with defaults

@dataclass
class DatabaseConfig:
    host: str = "localhost"
    port: int = 5432
    database: str = "myapp"

    @classmethod
    def from_env(cls):
        import os
        return cls(
            host=os.getenv("DB_HOST", cls.host),
            port=int(os.getenv("DB_PORT", cls.port)),
            database=os.getenv("DB_NAME", cls.database)
        )
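a usage sketch for from_env, with made-up environment values. it works because plain defaults stay available as class attributes (cls.host, cls.port); fields declared with default_factory do not get a class attribute, so this pattern would not apply to them.

```python
import os
from dataclasses import dataclass


@dataclass
class DatabaseConfig:
    host: str = "localhost"
    port: int = 5432
    database: str = "myapp"

    @classmethod
    def from_env(cls) -> "DatabaseConfig":
        return cls(
            host=os.getenv("DB_HOST", cls.host),
            port=int(os.getenv("DB_PORT", cls.port)),
            database=os.getenv("DB_NAME", cls.database),
        )


os.environ["DB_HOST"] = "db.internal"   # hypothetical value
os.environ["DB_PORT"] = "6543"          # environment values are always strings
os.environ.pop("DB_NAME", None)         # unset, so the class default is used

config = DatabaseConfig.from_env()
print(config)  # DatabaseConfig(host='db.internal', port=6543, database='myapp')
```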

inheritance

@dataclass
class BaseModel:
    id: int
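    # keyword-only (python 3.10+): lets subclasses add required fields after this default
    created_at: datetime = field(default_factory=datetime.now, kw_only=True)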

@dataclass
class User(BaseModel):
    name: str
    email: str
    # inherits id and created_at

custom init

@dataclass
class Temperature:
    celsius: float = field(init=False)

    # a hand-written __init__ is kept; @dataclass does not overwrite it
    def __init__(self, fahrenheit: float):
        self.celsius = (fahrenheit - 32) * 5 / 9
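an alternative that keeps the generated __init__ is a classmethod factory. helpers such as dataclasses.replace call the constructor with field names, so a custom signature can get in their way. from_fahrenheit is a hypothetical name for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Temperature:
    celsius: float

    @classmethod
    def from_fahrenheit(cls, fahrenheit: float) -> "Temperature":
        # convert once, then go through the generated __init__
        return cls((fahrenheit - 32) * 5 / 9)


boiling = Temperature.from_fahrenheit(212)
print(boiling)   # Temperature(celsius=100.0)
```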

complete example

from dataclasses import asdict, dataclass, field
from typing import List, Optional
from datetime import datetime
from decimal import Decimal

@dataclass(frozen=True, slots=True)
class Product:
    """immutable product with efficient memory usage"""
    id: int
    name: str
    price: Decimal
    tags: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.price < 0:
            raise ValueError("price cannot be negative")

@dataclass
class Cart:
    """mutable shopping cart with computed properties"""
    items: List[Product] = field(default_factory=list)
    created_at: datetime = field(default_factory=datetime.now)

    def add_item(self, product: Product) -> None:
        self.items.append(product)

    @property
    def total(self) -> Decimal:
        # start from Decimal(0) so an empty cart returns a Decimal, not int 0
        return sum((item.price for item in self.items), Decimal(0))

    def to_dict(self) -> dict:
        return {
            "items": [asdict(item) for item in self.items],
            "total": str(self.total),
            "created_at": self.created_at.isoformat()
        }

comparison with alternatives

feature             dataclasses   pydantic   attrs      sqlalchemy
standard library    yes           no         no         no
runtime validation  no            yes        opt-in     opt-in
serialization       basic         advanced   advanced   orm
performance         fastest       fast       fast       slower
learning curve      minimal       moderate   moderate   steep

best practices

do

  • use frozen=True for immutable data
  • use slots=True for better performance
  • use field(default_factory=...) for mutable defaults
  • validate in __post_init__ when needed
  • keep dataclasses focused on data

don’t

  • use mutable default values directly
  • mix heavy business logic into dataclasses
  • rely on type hints for runtime validation
  • use for classes with many methods
  • inherit from multiple dataclasses
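the mutable-default rule is enforced by the decorator itself. in the sketch below (hypothetical BadCart and GoodCart classes), a bare list default is rejected at class definition time, while default_factory gives every instance its own list.

```python
from dataclasses import dataclass, field

rejected = False
try:
    @dataclass
    class BadCart:
        items: list = []          # mutable default used directly
except ValueError:
    rejected = True               # @dataclass refuses list, dict and set defaults


@dataclass
class GoodCart:
    items: list = field(default_factory=list)


first, second = GoodCart(), GoodCart()
first.items.append("apple")
print(second.items)   # []
```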

limitations

  • no automatic type validation at runtime
  • limited serialization capabilities
  • no schema generation
  • no async validation support
  • no built-in json schema support
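to make the first limitation concrete: annotations are not checked when an instance is created. the check_types helper below is a hypothetical, deliberately simple sketch that only handles plain class annotations such as str or int, not generics like List[str] or Optional[...].

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


# annotations are not enforced, so this succeeds
person = Person(name=123, age="thirty")


def check_types(instance) -> list[str]:
    """hypothetical helper: names of fields whose value does not match a plain class annotation"""
    problems = []
    for f in fields(instance):
        value = getattr(instance, f.name)
        if isinstance(f.type, type) and not isinstance(value, f.type):
            problems.append(f.name)
    return problems


print(check_types(person))             # ['name', 'age']
print(check_types(Person("bob", 4)))   # []
```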

migration path

when dataclasses aren’t enough:

  1. need validation → pydantic
  2. need database → sqlalchemy
  3. need both → see choosing a data model
