Modern Python Intermediate #1: dataclass and __slots__


If you’ve finished the Modern Python Basics series, it’s time to step up. The intermediate series is seven posts that take the tools we only touched on in basics and explore them seriously.

  • #1 dataclass and __slots__ ← this post
  • #2 typing in earnest — Generic, Protocol, TypedDict, Literal
  • #3 Context managers (with, contextlib)
  • #4 Iterables/generators/yield from
  • #5 Decorator patterns
  • #6 Pattern matching in depth
  • #7 Async intro (asyncio)

The first topic is @dataclass, a tool for writing data-holding classes with less boilerplate, plus the __slots__ option that saves memory.

What problem do data classes solve? #

Anyone who’s written a class like this knows what’s annoying right away.

🚫 hand-written class — long and repetitive
class User:
    def __init__(self, id: int, name: str, age: int):
        self.id = id
        self.name = name
        self.age = age

    def __repr__(self) -> str:
        return f"User(id={self.id!r}, name={self.name!r}, age={self.age!r})"

    def __eq__(self, other) -> bool:
        if not isinstance(other, User):
            return NotImplemented
        return (self.id, self.name, self.age) == (other.id, other.name, other.age)

Three fields require hand-writing __init__, __repr__, and __eq__. Adding one field means editing all three places.

@dataclass solves this.

✅ @dataclass — same job
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str
    age: int

That’s all. __init__, __repr__, and __eq__ are auto-generated.

It works
u = User(id=1, name="커티스", age=30)
print(u)
# User(id=1, name='커티스', age=30)

print(u == User(id=1, name="커티스", age=30))   # True

Type hints (id: int, etc.) are themselves the field definitions. Declare the data shape in one place; behaviors come automatically.

@dataclass options — the common ones #

With options
from dataclasses import dataclass

@dataclass(frozen=True, kw_only=True, slots=True)
class User:
    id: int
    name: str
    age: int = 0

What each option means:

Option | Default | Meaning
frozen | False | True: immutable; fields can’t change after creation
kw_only | False | True: every field is keyword-only (3.10+)
slots | False | True: auto-add __slots__ (3.10+)
eq | True | auto-generate __eq__
order | False | True: auto-generate <, <=, >, >=
repr | True | auto-generate __repr__
init | True | auto-generate __init__

frozen=True — immutable objects #

frozen
@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
p.x = 3.0    # ✗ FrozenInstanceError

Why immutability is good:

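  • Usable as a dict key or set element (frozen=True with the default eq=True generates __hash__; every field must itself be hashable)
  • Blocks unintended mutations, so data can’t be modified behind your back after it has been passed around
  • Safer to share across threads, since fields can’t be reassigned (the freeze is shallow: a mutable value stored in a field, such as a list, can still change)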

A great fit for domain model objects that shouldn’t change after creation.
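The points above can be sketched in a few lines. This is a minimal illustration rather than a recipe: it reuses the Point class from this section and shows the rejected assignment, use as a dict key, and dataclasses.replace for building a modified copy.

```python
from dataclasses import dataclass, FrozenInstanceError, replace

@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)

# Assignment is rejected at runtime.
try:
    p.x = 3.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# frozen=True plus the default eq=True generates __hash__, so instances
# work as dict keys (as long as every field is itself hashable).
labels = {Point(0.0, 0.0): "origin", p: "start"}
label = labels[Point(1.0, 2.0)]   # an equal point finds the same entry

# "Changing" a frozen object means building a new one.
moved = replace(p, x=3.0)

print(mutated, label, moved)
# False start Point(x=3.0, y=2.0)
```

replace() leaves the original untouched, which is usually what you want when an object has already been handed to other code.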

kw_only=True — no positional args #

kw_only
@dataclass(kw_only=True)
class User:
    id: int
    name: str
    age: int = 0

u = User(id=1, name="커티스")          # OK
u = User(1, "커티스")                  # ✗ TypeError

This has the same effect as the keyword-only parameters from Basics #5. Calls with many positional arguments, like User(1, "커티스", 30, True, "admin"), are hard to read; kw_only=True (Python 3.10+) rejects them. It is a sensible default for new data classes.

slots=True — memory and speed #

slots
@dataclass(slots=True)
class Point:
    x: float
    y: float

We cover this in detail later in the post. One-line summary: “makes instances lighter and faster”.

Comparable — order=True #

order
@dataclass(order=True)
class Score:
    value: int
    name: str

scores = [Score(80, "B"), Score(95, "A"), Score(70, "C")]
scores.sort()    # works automatically
print(scores)    # [Score(value=70, name='C'), Score(value=80, name='B'), Score(value=95, name='A')]

<, <=, >, >= compare field-by-field like a tuple. If the first field ties, it falls through to the next. Useful where order matters (scores, times, coordinates).
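To make the tie-breaking rule concrete, here is a small sketch using the same Score class. Two scores share a value, so the comparison falls through to name.

```python
from dataclasses import dataclass

@dataclass(order=True)
class Score:
    value: int
    name: str

a = Score(80, "B")
b = Score(80, "A")   # same value, different name

# Comparison behaves like (value, name) tuples: value ties, so name decides.
tie_break = b < a

# Highest first; the two 80s are ordered by name within the tie.
ranking = sorted([a, b, Score(95, "C")], reverse=True)
names = [s.name for s in ranking]

print(tie_break, names)
# True ['C', 'B', 'A']
```

Field order in the class body is therefore the sort priority, so put the most significant field first.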

field() — fine-grained per-field config #

When the default isn’t a simple value, or when you need finer options, use field().

Pitfall — don’t put a mutable default directly #

🚫 Error
@dataclass
class User:
    name: str
    tags: list[str] = []   # ✗ ValueError
# mutable default <class 'list'> for field tags is not allowed

The pitfall from Basics #5. dataclass kindly catches it. Use default_factory.

✅ default_factory
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    tags: list[str] = field(default_factory=list)

Each instance gets a fresh empty list.
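A quick check of that claim, as a minimal sketch with the same User class: appending to one instance’s list leaves the other untouched.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    tags: list[str] = field(default_factory=list)

alice = User("alice")
bob = User("bob")
alice.tags.append("admin")

# default_factory is called once per instance, so the lists are independent.
print(alice.tags, bob.tags, alice.tags is bob.tags)
# ['admin'] [] False
```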

Other field() options #

field options
from dataclasses import dataclass, field

@dataclass
class User:
    id: int
    # 1. default
    role: str = "member"
    # 2. mutable default
    tags: list[str] = field(default_factory=list)
    # 3. exclude from repr/eq
    password: str = field(repr=False, compare=False, default="")
    # 4. exclude from init — populate later
    created_at: float = field(init=False, default=0.0)
    # 5. metadata
    score: int = field(default=0, metadata={"max": 100})

repr=False is common for fields like passwords that shouldn’t appear in logs. compare=False keeps a field out of equality checks: in the class above, two users that differ only in password still compare equal.
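The sketch below exercises those options on a trimmed copy of the User class (the tags field is dropped for brevity). The values "hunter2" and "different" are placeholders. It also shows that metadata is not interpreted by dataclasses itself; you read it back through fields().

```python
from dataclasses import dataclass, field, fields

@dataclass
class User:
    id: int
    role: str = "member"
    password: str = field(default="", repr=False, compare=False)
    created_at: float = field(init=False, default=0.0)
    score: int = field(default=0, metadata={"max": 100})

a = User(1, password="hunter2")
b = User(1, password="different")

text = repr(a)      # password is left out of the repr
same = a == b       # password is ignored by __eq__

# metadata is just carried along; look it up via fields().
max_score = {f.name: f.metadata for f in fields(User)}["score"]["max"]

try:
    User(1, created_at=123.0)   # init=False fields are not constructor parameters
    accepted = True
except TypeError:
    accepted = False

print(text, same, max_score, accepted)
# User(id=1, role='member', created_at=0.0, score=0) True 100 False
```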

__post_init__ — post-creation hook #

Since __init__ is auto-generated, you normally don’t write it yourself. Use __post_init__ when you need extra processing after the object is created.

__post_init__
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)
print(r.area)   # 12.0

A common pattern: exclude a field from the constructor with init=False and compute it in __post_init__.
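__post_init__ is also a reasonable place for lightweight checks. The sketch below adds a non-negative check to the Rectangle example; the rule and the error message are assumptions made for illustration, not part of the original class.

```python
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self) -> None:
        # Runs right after the generated __init__ has assigned width and height.
        if self.width < 0 or self.height < 0:
            raise ValueError("width and height must be non-negative")
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)

try:
    Rectangle(-1.0, 4.0)
    rejected = False
except ValueError:
    rejected = True

print(r.area, rejected)
# 12.0 True
```

Note that area is computed once at construction. Assigning a new width later does not update it, so for values that must stay in sync a property may be the better fit.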

Where dataclass doesn’t fit #

It isn’t a panacea. Look elsewhere for these:

Case | Better tool
Strong validation (email format, length limits) | Pydantic
Frequent JSON conversion (serialize/deserialize) | Pydantic, attrs, msgspec
Inheritance + lots of behavior | regular class
Named tuple is enough | NamedTuple
A dict is fine | TypedDict

For API input validation in particular, Pydantic (briefly seen in Basics #2) is usually the better choice. Treat dataclass as the tool for internal data models.

__slots__ — memory and speed #

Now to the real story behind slots=True.

Regular instances — using __dict__ #

Python objects store attributes in a dict by default.

dict-based
class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
print(p.__dict__)
# {'x': 1.0, 'y': 2.0}

p.z = 3.0    # can freely add attributes
print(p.__dict__)
# {'x': 1.0, 'y': 2.0, 'z': 3.0}

Pro: very flexible. Con: every instance carries its own dict, and that overhead adds up when you create millions of objects.

__slots__ — only predeclared attributes #

__slots__
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
p.z = 3.0    # ✗ AttributeError: 'Point' object has no attribute 'z'

Defining __slots__:

  • No __dict__ is created — memory drops
  • Can’t add attributes — only declared ones
  • Slightly faster attribute access — direct slot access instead of dict lookup

As a rough guide, commonly cited figures are 40–50% less memory per instance and 10–25% faster attribute access. The real gain varies with object size and interpreter version, so measure on your own workload before relying on a number.
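If you want your own numbers, one rough way to get them is the standard-library tracemalloc module. The sketch below is an assumption-laden measurement, not a benchmark: PlainPoint, SlottedPoint, and bytes_per_instance are hypothetical names, and the totals include the float values and list entries, which cost the same for both classes. No expected output is shown because the result depends on the interpreter.

```python
import tracemalloc

class PlainPoint:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

def bytes_per_instance(cls, n: int = 50_000) -> float:
    """Rough average of traced bytes allocated per instance of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    points = [cls(float(i), float(i)) for i in range(n)]  # keep them alive
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del points
    return (after - before) / n

plain = bytes_per_instance(PlainPoint)
slotted = bytes_per_instance(SlottedPoint)
print(f"plain: {plain:.0f} B, slotted: {slotted:.0f} B per instance")
```

Run it on the Python version you deploy with; the gap between the two figures is what slots would save you for objects of this shape.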

dataclass(slots=True) is the easiest path #

Writing __slots__ directly means listing field names twice — once in the type annotations and once in __slots__. dataclass(slots=True) handles both automatically.

Auto slots
from dataclasses import dataclass

@dataclass(slots=True)
class Point:
    x: float
    y: float

Behind the scenes, it generates the same __slots__ you would write by hand. It costs one option, and for most data classes there’s little reason not to use it.

Things to watch when using __slots__ #

It isn’t a silver bullet.

1) Multiple inheritance restrictions #

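Inheriting from two or more base classes that each define non-empty __slots__ fails with TypeError (instance layout conflict). Single inheritance is fine, though a subclass that doesn’t declare its own __slots__ gets a __dict__ again.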
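Here is a small sketch of the conflict and one common way around it: keep storage in a single base and give mixins an empty __slots__. All class names are hypothetical.

```python
class HasA:
    __slots__ = ("a",)

class HasB:
    __slots__ = ("b",)

try:
    class Both(HasA, HasB):      # two bases with non-empty __slots__
        __slots__ = ()
    conflict = False
except TypeError:
    conflict = True

class Mixin:
    __slots__ = ()               # behavior only, no storage

    def describe(self) -> str:
        return f"{type(self).__name__} with a={self.a!r}"

class Combined(HasA, Mixin):     # only one base contributes slots: allowed
    __slots__ = ("c",)

obj = Combined()
obj.a, obj.c = 1, 2

print(conflict, obj.describe())
# True Combined with a=1
```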

2) Weak references — weakref doesn’t work #

Default __slots__ doesn’t support weakref. If needed:

weakref support
class Node:
    __slots__ = ("data", "__weakref__")

dataclass(slots=True, weakref_slot=True) is also available (3.11+).
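The difference is easy to see with hand-written __slots__, which keeps the sketch independent of the 3.11 requirement. Plain and Referable are hypothetical names.

```python
import weakref

class Plain:
    __slots__ = ("data",)

class Referable:
    __slots__ = ("data", "__weakref__")

try:
    weakref.ref(Plain())         # no __weakref__ slot
    plain_ok = True
except TypeError:
    plain_ok = False

node = Referable()
ref = weakref.ref(node)
alive = ref() is node            # the weak reference resolves while node exists

print(plain_ok, alive)
# False True
```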

3) Beware class-variable conflicts #

🚫 Conflict
class Bad:
    __slots__ = ("x",)
    x = 0    # ✗ ValueError — same-named class variable and slot

4) Can’t add dynamic attributes #

Patterns that attach temporary attributes (plugins, mocking) break. Usually fine, but worth considering when building libraries — users may try to do this.

When to turn slots on? #

Situation | slots
Data models with tens of thousands to millions of instances (coordinates, graph nodes) | ✅ definitely
Strong immutability: block arbitrary attribute additions | ✅
Regular domain objects, not many instances | ⭕ no harm in turning it on (just turn it on)
Metaprogramming / dynamic attributes / heavy multiple inheritance | ❌ off, or use carefully

When in doubt, defaulting to dataclass(slots=True) is the typical modern Python answer.

Wrap-up #

The tools this post covered:

  • @dataclass auto-generates __init__/__repr__/__eq__
  • Options: frozen (immutable, hashable), kw_only (no positional), order (sorting), slots (memory)
  • Use field(default_factory=list) for mutable defaults
  • Per-field control with field(repr=False, compare=False, init=False)
  • Post-creation hook with __post_init__
  • __slots__ removes the per-instance dict, saving memory and speeding up attribute access
  • dataclass(slots=True) is the shortest way to use slots
  • For strong validation/JSON serialization, Pydantic, not dataclass

In the next post (#2 typing in earnest) we cover the powerful tools of the type system — Generic, Protocol, TypedDict, Literal. The next step from the type hints we set up in basics.
