dataclass and __slots__
Every option of @dataclass for short, safe data classes — frozen, kw_only, field() — plus __slots__ for memory savings, all in one place.
This is the first chapter of Part 2, Structuring Code. By the end of Part 1’s seven chapters we had functions, collections, exceptions, and modules in hand, but we had not yet seen a concise way to express “objects with a fixed shape.” This chapter uses @dataclass and __slots__ to handle that problem.
The dataclass from this chapter is the one you’ll meet most often in the rest of the book. It pairs with the Protocol from Chapter 9 (typing in earnest: Generic, Protocol, TypedDict, Literal), and the Pydantic models used with FastAPI in Part 4 follow the same declare-the-fields idea with validation added (Chapter 24, Pydantic v2 in depth, compares them head-on).
What problem do data classes solve?
Anyone who’s written a class like this knows right away what’s annoying about it.
class User:
    def __init__(self, id: int, name: str, age: int):
        self.id = id
        self.name = name
        self.age = age

    def __repr__(self) -> str:
        return f"User(id={self.id!r}, name={self.name!r}, age={self.age!r})"

    def __eq__(self, other) -> bool:
        if not isinstance(other, User):
            return NotImplemented
        return (self.id, self.name, self.age) == (other.id, other.name, other.age)

Three fields require hand-writing __init__, __repr__, and __eq__. Adding one field means editing all three places.
@dataclass solves this.
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str
    age: int

That’s all. __init__, __repr__, and __eq__ are auto-generated.
u = User(id=1, name="Curtis", age=30)
print(u)
# User(id=1, name='Curtis', age=30)
print(u == User(id=1, name="Curtis", age=30))  # True

Type hints (id: int, etc.) are themselves the field definitions. Declare the data shape in one place; behaviors come automatically.
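To see that the annotations really are the field definitions, here is a minimal sketch using dataclasses.fields(). The species attribute is a hypothetical addition for illustration: because it has no annotation, it stays a plain class attribute and is not treated as a field.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    id: int
    name: str
    age: int
    species = "human"  # hypothetical: no annotation, so not a dataclass field


# fields() returns one Field object per annotated attribute, in declaration order.
field_names = [f.name for f in fields(User)]
field_types = [f.type for f in fields(User)]

print(field_names)  # ['id', 'name', 'age']
print(User(1, "Curtis", 30).species)  # human
```

Note that species is shared by every instance and is not a parameter of the generated __init__.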
@dataclass options — the common ones
from dataclasses import dataclass

@dataclass(frozen=True, kw_only=True, slots=True)
class User:
    id: int
    name: str
    age: int = 0

What each option means:
| Option | Default | Meaning |
|---|---|---|
| frozen | False | True: immutable; fields can’t change after creation |
| kw_only | False | True: every field is keyword-only (3.10+) |
| slots | False | True: auto-add __slots__ (3.10+) |
| eq | True | auto-generate __eq__ |
| order | False | True: auto-generate <, <=, >, >= |
| repr | True | auto-generate __repr__ |
| init | True | auto-generate __init__ |
frozen=True — immutable objects
@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
p.x = 3.0  # ✗ FrozenInstanceError

Why immutability is good:
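- Usable as a dict key or set element (with the default eq=True, frozen=True also generates __hash__)
- Blocks unintended mutations: nobody can edit the data after you pass it along
- Safer to share between threads, since fields can’t be reassigned (mutable values stored in a field, such as a list, can still change)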
A great fit for “this data doesn’t change after creation” in domain models.
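The hashability point can be sketched as follows. MutablePoint is a hypothetical counterpart added only for contrast, and dataclasses.replace() is shown as one way to get a "changed" copy of a frozen instance.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass
class MutablePoint:  # hypothetical non-frozen counterpart, for contrast
    x: float
    y: float


@dataclass(frozen=True)
class Point:
    x: float
    y: float


# eq=True without frozen=True sets __hash__ to None, so instances are unhashable.
try:
    hash(MutablePoint(1.0, 2.0))
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False

# Frozen instances hash by field values, so equal points find the same dict entry.
labels = {Point(1.0, 2.0): "start"}
label = labels[Point(1.0, 2.0)]

# replace() builds a modified copy; the original is left untouched.
p = Point(1.0, 2.0)
moved = replace(p, x=3.0)

try:
    p.x = 3.0
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True

print(mutable_is_hashable, label, moved, assignment_blocked)
```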
kw_only=True — no positional args
@dataclass(kw_only=True)
class User:
    id: int
    name: str
    age: int = 0

u = User(id=1, name="Curtis")  # OK
u = User(1, "Curtis")  # ✗ TypeError

This has the same effect as the keyword-only arguments from Chapter 5 function argument patterns. Calls with many fields, like User(1, "Curtis", 30, True, "admin"), are hard to read; kw_only=True blocks them. It is a good default for new data classes.
slots=True — memory and speed
@dataclass(slots=True)
class Point:
    x: float
    y: float

We cover this in detail later in the chapter. Short version: “makes instances lighter and faster”.
Comparable — order=True
@dataclass(order=True)
class Score:
    value: int
    name: str

scores = [Score(80, "B"), Score(95, "A"), Score(70, "C")]
scores.sort()  # works automatically
print(scores)
# [Score(value=70, name='C'), Score(value=80, name='B'), Score(value=95, name='A')]

<, <=, >, >= compare field-by-field like a tuple. If the first field ties, the comparison falls through to the next. Useful where order matters (scores, times, coordinates).
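The tie-breaking rule can be sketched like this. The note field and the sample names are hypothetical; marking a field with compare=False keeps it out of both equality and ordering.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Score:
    value: int
    name: str
    note: str = field(default="", compare=False)  # hypothetical field, ignored by ==, <, >


scores = [Score(80, "Bo"), Score(95, "Al"), Score(80, "Ann", note="retake")]

# Compared like the tuples (value, name): the two 80s tie, so name decides.
ranked = sorted(scores)
names = [s.name for s in ranked]
print(names)  # ['Ann', 'Bo', 'Al']

# note does not take part in comparison at all.
print(Score(80, "Ann", note="retake") == Score(80, "Ann"))  # True
```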
field() — fine-grained per-field config
When the default isn’t a simple value, or when you need finer options, use field().
Pitfall: don’t assign a mutable default directly
@dataclass
class User:
    name: str
    tags: list[str] = []  # ✗ ValueError
    # mutable default <class 'list'> for field tags is not allowed: use default_factory

This is the mutable-default pitfall from Chapter 5. dataclass kindly catches it when the class is defined. Use default_factory instead.
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    tags: list[str] = field(default_factory=list)

Each instance gets a fresh empty list.
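A quick check of the "fresh list per instance" claim, as a minimal sketch (the second user name is made up for the example):

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    tags: list[str] = field(default_factory=list)


a = User("Curtis")
b = User("Dana")  # hypothetical second user
a.tags.append("admin")

print(a.tags)  # ['admin']
print(b.tags)  # []
print(a.tags is b.tags)  # False: default_factory ran once per instance
```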
Other field() options
from dataclasses import dataclass, field

@dataclass
class User:
    id: int
    # 1. default
    role: str = "member"
    # 2. mutable default
    tags: list[str] = field(default_factory=list)
    # 3. exclude from repr/eq
    password: str = field(repr=False, compare=False, default="")
    # 4. exclude from init, populate later
    created_at: float = field(init=False, default=0.0)
    # 5. metadata
    score: int = field(default=0, metadata={"max": 100})

repr=False is common for fields like passwords that shouldn’t appear in logs (Chapter 31 logging and observability revisits PII / secret logging prevention patterns). compare=False keeps a field out of equality: for example, marking created_at with compare=False would let users with different creation times still count as equal.
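Here is a small sketch of what these options do at runtime, using a trimmed-down version of the class above. The sample passwords are placeholders; metadata is not interpreted by dataclass itself, so it is read back through dataclasses.fields().

```python
from dataclasses import dataclass, field, fields


@dataclass
class User:
    id: int
    password: str = field(default="", repr=False, compare=False)
    created_at: float = field(init=False, default=0.0)
    score: int = field(default=0, metadata={"max": 100})


u1 = User(id=1, password="hunter2")    # placeholder secrets
u2 = User(id=1, password="different")

print(u1)        # User(id=1, created_at=0.0, score=0)
print(u1 == u2)  # True: password is excluded from comparison

# Look up the metadata mapping attached to the score field.
max_score = {f.name: f.metadata for f in fields(User)}["score"]["max"]
print(max_score)  # 100
```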
__post_init__ — post-creation hook
Because __init__ is auto-generated, you can’t easily add your own logic to it. Define __post_init__ when you need extra processing right after the generated __init__ runs.
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)
print(r.area)  # 12.0

A common pattern: exclude a field from the constructor with init=False and compute it in __post_init__.
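__post_init__ is also a natural place for light validation. The sketch below combines that with frozen=True, which is an assumption beyond the example above: a frozen class blocks normal assignment even inside its own methods, so the derived field is set through object.__setattr__.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self) -> None:
        # Illustrative validation rule, not part of the original example.
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
        # self.area = ... would raise FrozenInstanceError here.
        object.__setattr__(self, "area", self.width * self.height)


r = Rectangle(3.0, 4.0)
print(r.area)  # 12.0
```

For anything beyond a couple of simple checks, the validation-focused tools discussed in the next section are usually a better fit.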
Where dataclass doesn’t fit
It isn’t a panacea. Look elsewhere for these:
| Situation | Better tool |
|---|---|
| Strong validation (email format, length limits) | Pydantic |
| Frequent JSON conversion (serialize/deserialize) | Pydantic, attrs, msgspec |
| Inheritance + lots of behavior | regular class |
| Named tuple is enough | NamedTuple |
| A dict is fine | TypedDict |
For API input validation, Pydantic is overwhelmingly better. Chapter 24 Pydantic v2 in depth covers it in earnest. Treat dataclass as the tool for “internal data models.”
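The validation gap is easy to demonstrate. In this minimal sketch, the type hints are not checked at construction time, and dataclasses.asdict() only covers the outbound half of JSON handling.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class User:
    id: int
    name: str


# No error: annotations are not enforced when the instance is created.
u = User(id="not-a-number", name="Curtis")
print(type(u.id).__name__)  # str

# asdict() gives plain dicts, enough for simple JSON output,
# but parsing JSON back into checked objects is left to you.
payload = json.dumps(asdict(User(id=1, name="Curtis")))
print(payload)  # {"id": 1, "name": "Curtis"}
```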
__slots__ — memory and speed
Now to the real story behind slots=True.
Regular instances — using __dict__
Python objects store attributes in a dict by default.
class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
print(p.__dict__)
# {'x': 1.0, 'y': 2.0}
p.z = 3.0  # can freely add attributes
print(p.__dict__)
# {'x': 1.0, 'y': 2.0, 'z': 3.0}

Pro: very flexible. Con: every instance carries its own dict, so memory grows quickly when you create millions of objects.
__slots__ — only predeclared attributes
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
p.z = 3.0  # ✗ AttributeError: 'Point' object has no attribute 'z'

Defining __slots__ has three effects:
- No __dict__ is created, so memory drops
- Can’t add attributes: only the declared ones exist
- Slightly faster attribute access: direct slot access instead of a dict lookup
In numbers, 40–50% memory savings per instance and 10–25% faster attribute access are typical (varies by object size and interpreter version). Precise measurement is covered with the right tools in Chapter 21 performance — cProfile, py-spy, memory profiling.
dataclass(slots=True) is the easiest path
Writing __slots__ directly means listing field names twice (typing declaration + __slots__). dataclass(slots=True) handles it automatically.
from dataclasses import dataclass

@dataclass(slots=True)
class Point:
    x: float
    y: float

Behind the scenes, the result matches a hand-written __slots__ class. It costs one keyword argument, so for most data classes there’s little reason not to use it.
Things to watch when using __slots__
It isn’t a silver bullet.
1) Multiple inheritance restrictions
Inheriting from two classes that both define non-empty __slots__ raises TypeError (an instance layout conflict). Single inheritance is fine.
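The restriction, and one common way around it, can be sketched as follows. All class names here are hypothetical; the workaround assumes the extra base is a behavior-only mixin that can declare empty __slots__.

```python
class HasX:
    __slots__ = ("x",)


class HasY:
    __slots__ = ("y",)


try:
    class Both(HasX, HasY):  # two bases with non-empty __slots__
        __slots__ = ()
    layout_conflict = False
except TypeError:
    layout_conflict = True


class Mixin:
    __slots__ = ()  # behavior only: contributes no slots of its own

    def describe(self) -> str:
        return f"x={self.x}"


class Point(HasX, Mixin):  # only one base contributes slots, so this works
    __slots__ = ()


p = Point()
p.x = 1
print(layout_conflict)  # True
print(p.describe())     # x=1
```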
2) Weak references — weakref doesn’t work
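A class with __slots__ has no __weakref__ slot by default, so its instances can’t be weakly referenced. If you need weak references, add the slot yourself:
class Node:
    __slots__ = ("data", "__weakref__")

dataclass(slots=True, weakref_slot=True) is also available (3.11+).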
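A minimal sketch of the difference, where Plain is a hypothetical class without the __weakref__ slot:

```python
import weakref


class Plain:  # hypothetical: slots without __weakref__
    __slots__ = ("data",)


class Node:
    __slots__ = ("data", "__weakref__")


try:
    weakref.ref(Plain())
    plain_supports_weakref = True
except TypeError:
    plain_supports_weakref = False

node = Node()
node.data = 42
ref = weakref.ref(node)

print(plain_supports_weakref)  # False
print(ref() is node)           # True
```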
3) Beware class-variable conflicts
class Bad:
    __slots__ = ("x",)
    x = 0  # ✗ ValueError: same-named class variable and slot

4) Can’t add dynamic attributes
Patterns that attach temporary attributes (plugins, mocking) break. Usually fine, but consider this when building libraries — users may try.
When to turn slots on?
| Situation | slots |
|---|---|
| Data models with tens of thousands to millions of instances (coordinates, graph nodes) | ✅ definitely |
| Fixed attribute set: block arbitrary attribute additions | ✅ |
| Regular domain objects, not many instances | ⭕ no harm in turning it on |
| Metaprogramming / dynamic attributes / heavy multiple inheritance | ❌ off, or use carefully |
When in doubt, defaulting to dataclass(slots=True) is a typical modern Python choice.
Exercises
- Define Address(country: str, city: str, postal_code: str) with @dataclass(frozen=True, kw_only=True, slots=True). Verify that calling Address("KR", "Seoul", "06000") raises TypeError (kw_only), that two instances built with the same values are ==, that hash(addr) works (frozen → hashable), and that addr.city = "Busan" raises FrozenInstanceError.
- Build a Cart class with @dataclass. The items: list[str] field needs field(default_factory=list) because it’s a mutable default. Exclude total: float from init with field(init=False, default=0.0) and fill it in __post_init__ as the number of items × 1000.
- Create 10,000 @dataclass instances and 10,000 @dataclass(slots=True) instances, then compare per-instance memory using sys.getsizeof(instance) + sys.getsizeof(instance.__dict__) (wrap the slots side in try / except since it has no __dict__). Precise memory measurement tools come back in Chapter 21 performance.
In one line: one line of @dataclass auto-generates __init__ / __repr__ / __eq__. Common options are frozen (immutable, hashable), kw_only (call readability), slots (memory), order (sorting). Use field(default_factory=...) for mutable defaults; __post_init__ for post-creation processing. If you need validation / serialization, use Pydantic instead of dataclass.
Next chapter
In Chapter 9 typing in earnest — Generic, Protocol, TypedDict, Literal we cover the powerful tools of the type system. When this chapter’s dataclass meets Chapter 9’s Protocol, the core pattern of “structural typing” comes together.