Modern Python Advanced #1: Magic methods in depth and protocols

If you’ve finished the Modern Python Intermediate series, it’s time to go into the depths of the language. The Advanced series — seven posts — covers tools you meet in library/framework code: magic methods, descriptors, metaclasses, async depth, GIL/concurrency, advanced typing, and performance.

  • #1 Magic methods in depth and protocols ← this post
  • #2 Descriptors and __set_name__
  • #3 Metaclasses — when do you really need them?
  • #4 Async in depth (event loop, gather/wait, async generator)
  • #5 GIL and concurrency — threading vs multiprocessing vs asyncio
  • #6 Advanced typing — Variance, ParamSpec, Self
  • #7 Performance — cProfile, line_profiler, memory profiling

Magic methods (a.k.a. dunder methods, dunder = double underscore) are the official hooks where Python objects meet language features. len(x) calls x.__len__(); a + b calls a.__add__(b). Knowing these hooks precisely lets you build Pythonic objects and understand what is being called when you read library code.

Object lifecycle #

__init__ vs __new__ — the difference #

The two roles
class Foo:
    def __new__(cls, *args, **kwargs):
        # creates an instance (memory allocation)
        instance = super().__new__(cls)
        return instance

    def __init__(self, value):
        # initializes the created instance
        self.value = value
  • __new__ — acts like a class method; builds and returns the instance. Takes cls first.
  • __init__ — instance method; initializes an already-built instance. Takes self.

Most code only writes __init__. The places that need __new__ are narrow:

  • Subclassing immutable types (tuple, str)
  • Caching instances (singletons)
  • Blocking the call to __init__ itself
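As a minimal sketch of the first case, here is a hypothetical Slug class (the name and the normalization rule are assumptions made up for illustration). Because str is immutable, the value has to be fixed inside __new__; by the time __init__ runs it is too late to change it.

```python
class Slug(str):
    """Hypothetical str subclass that normalizes its text at creation time."""

    def __new__(cls, text):
        # The string value must be decided here: str is immutable,
        # so __init__ could not alter it afterwards.
        normalized = text.strip().lower().replace(" ", "-")
        return super().__new__(cls, normalized)


s = Slug("  Hello World ")
print(s)                    # hello-world
print(isinstance(s, str))   # True
```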
When __new__ returns a different object
class Singleton:
    _instance = None
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

a = Singleton()
b = Singleton()
print(a is b)   # True

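If __new__ returns an object that isn't an instance of the class, __init__ is never called. The reverse also holds: the Singleton above always returns an instance of the class, so if it defined __init__, that method would run again on every Singleton() call. Both are subtle edge cases worth keeping in mind.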
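Both behaviours can be checked with a small sketch. The class names Weird and Config are hypothetical, invented only for this demonstration.

```python
class Weird:
    def __new__(cls):
        return 42  # not an instance of Weird

    def __init__(self):
        # Never reached, because __new__ returned a foreign object.
        raise RuntimeError("__init__ should not run")


w = Weird()
print(w)  # 42


class Config:
    _instance = None
    init_calls = 0

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # Runs on every Config() call, even though the instance is cached.
        type(self).init_calls += 1


first = Config()
second = Config()
print(first is second)     # True
print(Config.init_calls)   # 2
```

If the cached instance carries state, guard __init__ against re-running or do the one-time setup inside __new__.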

__del__ — almost never used #

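Called when an object is about to be destroyed. But:

  • No guarantee when it runs: CPython calls it as soon as the reference count drops to zero, but other implementations, and objects caught in reference cycles, wait for a garbage-collector pass
  • May not be called at all for objects that are still alive when the interpreter exits
  • Exceptions raised inside are not propagated; Python only prints a warning to stderr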

For resource cleanup, the answer is context managers (Intermediate #3), not __del__.
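A minimal sketch of that alternative, using a hypothetical Connection class (the name and the is_open flag are assumptions for illustration). Cleanup runs deterministically when the with block exits, even if an exception is raised inside it.

```python
class Connection:
    def __init__(self, name):
        self.name = name
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always runs when the with block exits, exception or not.
        self.is_open = False
        return False  # do not swallow the exception


conn = Connection("db")
try:
    with conn:
        raise ValueError("boom")
except ValueError:
    pass

print(conn.is_open)  # False
```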

Representation — __repr__ vs __str__ #

repr / str
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self) -> str:
        return f"Point(x={self.x}, y={self.y})"

    def __str__(self) -> str:
        return f"({self.x}, {self.y})"

p = Point(1, 2)
print(repr(p))   # Point(x=1, y=2)   ← for debugging, unambiguous
print(str(p))    # (1, 2)            ← human-readable
print(f"{p}")    # (1, 2)            ← f-string uses str
print(p)         # (1, 2)            ← print uses str

Rules:

  • __repr__: unambiguous. Aim for eval(repr(x)) == x if possible
  • __str__: for users to read. If undefined, falls back to __repr__

@dataclass (Intermediate #1) auto-generates __repr__, not __str__, so str() on a dataclass instance falls back to that generated __repr__.
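The fallback rule, and the fact that containers always show their elements with repr, can be seen in a short sketch (OnlyRepr and Both are hypothetical class names):

```python
class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"


class Both:
    def __repr__(self):
        return "Both()"

    def __str__(self):
        return "both"


print(str(OnlyRepr()))   # OnlyRepr()  (no __str__, so __repr__ is used)
print(str(Both()))       # both
print([Both()])          # [Both()]    (containers use repr for elements)
print(f"{Both()!r}")     # Both()      (!r forces repr inside an f-string)
```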

__format__ — f-string format spec #

Receive format spec
class Money:
    def __init__(self, amount, currency):
        self.amount, self.currency = amount, currency

    def __format__(self, spec):
        if spec == "k":
            return f"{self.amount / 1000:.1f}k {self.currency}"
        return f"{self.amount} {self.currency}"

m = Money(12345, "KRW")
print(f"{m}")     # 12345 KRW
print(f"{m:k}")   # 12.3k KRW

The fmt part of f-string f"{x:fmt}" becomes the argument to __format__(spec). This is the hook for adding custom formatting to domain objects.
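One possible refinement, sketched here as an assumption rather than something the Money class above requires: pass any spec you do not recognize on to the built-in format() for the underlying number, so standard specs such as "," keep working.

```python
class Money:
    def __init__(self, amount, currency):
        self.amount, self.currency = amount, currency

    def __format__(self, spec):
        if spec == "k":
            return f"{self.amount / 1000:.1f}k {self.currency}"
        # Delegate every other spec (including the empty one) to the number.
        return f"{format(self.amount, spec)} {self.currency}"


m = Money(12345, "KRW")
print(f"{m}")       # 12345 KRW
print(f"{m:k}")     # 12.3k KRW
print(f"{m:,}")     # 12,345 KRW
```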

Comparison #

__eq__ and __hash__ — partners #

These two always go together.

Basic rule
class User:
    def __init__(self, id, name):
        self.id, self.name = id, name

    def __eq__(self, other):
        if not isinstance(other, User):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

Rules:

  • If a == b, then hash(a) == hash(b) must hold
  • Defining only __eq__ automatically sets __hash__ to None, making it unusable as a set/dict key
  • Mutable objects are usually left unhashable, because a hash that changes while the object sits in a set or dict breaks lookups

@dataclass(frozen=True) auto-generates both methods together.
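A short sketch of both rules in action. EqOnly and UserKey are hypothetical names used only for this demonstration.

```python
from dataclasses import dataclass


class EqOnly:
    def __init__(self, id):
        self.id = id

    def __eq__(self, other):
        if not isinstance(other, EqOnly):
            return NotImplemented
        return self.id == other.id


print(EqOnly.__hash__)  # None: defining __eq__ alone disables hashing

try:
    {EqOnly(1)}
except TypeError as exc:
    print("TypeError:", exc)


@dataclass(frozen=True)
class UserKey:
    id: int
    name: str


# Equal objects hash equally, so the set keeps a single element.
print(len({UserKey(1, "a"), UserKey(1, "a")}))  # 1
```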

__lt__ etc. — functools.total_ordering #

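Writing all six comparison operators (<, <=, >, >=, ==, !=) by hand is tedious. Define __eq__ plus one ordering method, and functools.total_ordering derives the rest.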

total_ordering
from functools import total_ordering

@total_ordering
class Score:
    def __init__(self, value):
        self.value = value

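    def __eq__(self, other):
        if not isinstance(other, Score):
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other):
        if not isinstance(other, Score):
            return NotImplemented
        return self.value < other.value
# total_ordering derives <=, > and >= from these two; != already comes from __eq__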

@dataclass(order=True) is shorter but only fits simple field comparisons. For complex comparison logic, use total_ordering.
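For comparison, a sketch of the dataclass route with a hypothetical Version class. Fields are compared in declaration order, like a tuple, and field(compare=False) leaves a field out of both equality and ordering.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Version:
    major: int
    minor: int
    # Excluded from ==, <, <=, >, >=
    label: str = field(default="", compare=False)


print(Version(1, 2) < Version(1, 10))             # True
print(Version(1, 2, "a") == Version(1, 2, "b"))   # True
print(sorted([Version(2, 0), Version(1, 5)])[0])  # Version(major=1, minor=5, label='')
```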

Container-like — sequences/mappings #

__len__, __getitem__, __contains__, __iter__ #

Building a sequence
class Page:
    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]

    def __contains__(self, value):
        return value in self.items

    def __iter__(self):
        return iter(self.items)

p = Page(["a", "b", "c"])
len(p)            # 3
p[0]              # 'a'
"b" in p          # True
list(p)           # ['a', 'b', 'c']
for x in p: ...   # OK

Filling in these four makes the object behave almost like a list. In fact, __getitem__ alone is enough to make for ... in work: Python falls back to calling it with indices 0, 1, 2, ... until IndexError is raised.
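The fallback can be seen with a class that defines nothing but __getitem__ (Squares is a hypothetical example):

```python
class Squares:
    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # Raising IndexError is what tells the fallback iteration to stop.
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * index


print(list(Squares(4)))   # [0, 1, 4, 9]
print(9 in Squares(4))    # True (membership also falls back to iteration)
```

Defining __iter__ explicitly is still the clearer choice in new code; the fallback mainly explains why older sequence-like classes work.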

Slicing also works automatically #

__getitem__’s argument can be a slice object.

Handle slice
class MyList:
    def __init__(self, items):
        self.items = items

    def __getitem__(self, key):
        if isinstance(key, slice):
            return MyList(self.items[key])
        return self.items[key]

m = MyList([1, 2, 3, 4, 5])
m[1:3].items   # [2, 3]

Calling m[1:3] passes slice(1, 3, None) as key.

__setitem__, __delitem__ #

Write/delete
class Cache:
    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

c = Cache()
c["x"] = 1
print(c["x"])   # 1
del c["x"]

Use these to make an object behave like a dict.
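If you want the rest of the dict interface as well, one option is to inherit from collections.abc.MutableMapping. The sketch below uses a hypothetical DictCache class: you supply five methods, and the abstract base class derives get, update, in, items and others from them.

```python
from collections.abc import MutableMapping


class DictCache(MutableMapping):
    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


c = DictCache()
c["x"] = 1
c.update(y=2)          # provided by MutableMapping
print("x" in c)        # True
print(c.get("z", 0))   # 0
print(dict(c))         # {'x': 1, 'y': 2}
```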

Callable — __call__ #

The hook that lets you call an object like a function.

__call__
class Counter:
    def __init__(self):
        self.count = 0

    def __call__(self, value):
        self.count += 1
        return value * 2

c = Counter()
print(c(5))        # 10
print(c(7))        # 14
print(c.count)     # 2

The class-form decorator from Intermediate #5 is built on this hook. PyTorch’s nn.Module also has __call__ so models can be called like functions.
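As a reminder of how that looks, here is a minimal sketch of a class-based decorator. CountCalls and double are hypothetical names chosen for illustration.

```python
import functools


class CountCalls:
    def __init__(self, func):
        # Copy __name__, __doc__ and friends from the wrapped function.
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)


@CountCalls
def double(x):
    return x * 2


print(double(5))         # 10
print(double(7))         # 14
print(double.calls)      # 2
print(double.__name__)   # double
```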

Attribute access — __getattr__, __setattr__, __getattribute__ #

__getattr__ — when missing attributes are requested #

Lazy missing-attribute handling
class Lazy:
    def __getattr__(self, name):
        if name.startswith("get_"):
            field = name[4:]
            return lambda: f"value of {field}"
        raise AttributeError(name)

l = Lazy()
l.get_name()   # 'value of name'
l.get_age()    # 'value of age'

Called only when normal lookup fails to find the attribute; existing attributes follow the normal path. Common in ORM auto-generated methods and proxy objects.
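A proxy can be sketched in a few lines. The Proxy class below is a hypothetical example; note that implicit special-method lookups such as len(p) bypass __getattr__, because Python looks those up on the type, not the instance.

```python
class Proxy:
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Reached only when normal lookup fails, so _target itself
        # is found the usual way and does not recurse.
        return getattr(self._target, name)


p = Proxy([3, 1, 2])
p.sort()                 # forwarded to the list
print(p._target)         # [1, 2, 3]
print(p.count(1))        # 1

try:
    len(p)               # special methods are not forwarded
except TypeError as exc:
    print("TypeError:", exc)
```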

__getattribute__ — every attribute request #

Intercept everything — dangerous
class All:
    def __getattribute__(self, name):
        print(f"access: {name}")
        return super().__getattribute__(name)

Intercepts every attribute access. Easy to recurse infinitely if used wrong (using self.x inside calls __getattribute__ again). Usually only __getattr__ is used.

__setattr__, __delattr__ #

Intercept writes
class Frozen:
    def __init__(self, x):
        self.x = x
        object.__setattr__(self, "_locked", True)

    def __setattr__(self, name, value):
        if getattr(self, "_locked", False):
            raise AttributeError(f"cannot modify {name}")
        super().__setattr__(name, value)

f = Frozen(5)
f.x = 10   # AttributeError

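@dataclass(frozen=True) rests on the same idea: its generated __setattr__ and __delattr__ raise FrozenInstanceError (a subclass of AttributeError), and its __init__ sets the fields through object.__setattr__.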
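For comparison, the dataclass version can be tried directly. FrozenPoint is a hypothetical name; FrozenInstanceError is the exception class exported by the dataclasses module.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


p = FrozenPoint(1, 2)
try:
    p.x = 10
except FrozenInstanceError as exc:
    print(type(exc).__name__)   # FrozenInstanceError

print(issubclass(FrozenInstanceError, AttributeError))  # True
print(p.x)  # 1
```

Like the hand-written version, this is a guard rather than a hard guarantee: object.__setattr__(p, "x", 10) still gets through.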

Arithmetic — __add__ etc. #

Operator overloading
class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

    def __mul__(self, k):
        return Vec(self.x * k, self.y * k)

    def __rmul__(self, k):
        return self.__mul__(k)

v = Vec(1, 2) + Vec(3, 4)        # __add__
w = Vec(1, 2) * 3                 # __mul__
u = 3 * Vec(1, 2)                 # __rmul__ (left operand doesn't know the type)

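To support a foreign type on the left, also define the reflected version, such as __rmul__. Python calls it when the left operand's __mul__ returns NotImplemented, that is, when it does not know how to multiply itself by us (e.g., int * Vec). Because scalar multiplication is commutative, __rmul__ can simply delegate to __mul__.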
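A slightly more defensive sketch of the same class: returning NotImplemented for unsupported operands lets Python try the other operand's reflected method and, failing that, raise a clean TypeError. The isinstance checks and the __repr__ are additions for illustration, not part of the version above.

```python
from numbers import Real


class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return f"Vec({self.x}, {self.y})"

    def __add__(self, other):
        if not isinstance(other, Vec):
            return NotImplemented
        return Vec(self.x + other.x, self.y + other.y)

    def __mul__(self, k):
        if not isinstance(k, Real):
            return NotImplemented
        return Vec(self.x * k, self.y * k)

    __rmul__ = __mul__  # scalar multiplication is commutative


print(Vec(1, 2) + Vec(3, 4))   # Vec(4, 6)
print(3 * Vec(1, 2))           # Vec(3, 6)

try:
    Vec(1, 2) + 1
except TypeError as exc:
    print("TypeError:", exc)
```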

__bool__ — truthiness #

bool
class Bag:
    def __init__(self):
        self.items = []

    def __bool__(self):
        return bool(self.items)

b = Bag()
if b:
    print("something is inside")

If __bool__ is not defined, Python falls back to __len__ (length zero is falsy); if neither is defined, the object is always truthy. For container-shaped classes, __len__ alone is often enough.
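The fallback chain in a short sketch (Queue and Plain are hypothetical classes):

```python
class Queue:
    def __init__(self):
        self.items = []

    def __len__(self):
        return len(self.items)


class Plain:
    pass


q = Queue()
print(bool(q))        # False (no __bool__, __len__ returns 0)
q.items.append("job")
print(bool(q))        # True
print(bool(Plain()))  # True  (neither method defined)
```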

Subclass-time hook — __init_subclass__ #

A hook called when a subclass is created. You can do similar things to a metaclass without one.

__init_subclass__
class Plugin:
    registry = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Plugin.registry.append(cls)

class JsonPlugin(Plugin):
    pass

class CsvPlugin(Plugin):
    pass

print(Plugin.registry)
# [<class '__main__.JsonPlugin'>, <class '__main__.CsvPlugin'>]  (when run as a script)

Common for plugin auto-registration. Metaclasses (#3) are more powerful, but for jobs this small, __init_subclass__ is lighter and safer.
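The hook also receives any keyword arguments written in the class statement, which makes it handy for declaring a registration key. A sketch under the assumption that plugins register under a name keyword (that keyword, the class names and the dict-based registry are made up for this example):

```python
class Plugin:
    registry = {}

    def __init_subclass__(cls, *, name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        # Fall back to the lowercased class name when no name is given.
        Plugin.registry[name or cls.__name__.lower()] = cls


class JsonPlugin(Plugin, name="json"):
    pass


class CsvPlugin(Plugin):
    pass


print(sorted(Plugin.registry))              # ['csvplugin', 'json']
print(Plugin.registry["json"] is JsonPlugin)  # True
```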

Common methods — one table #

| Category | Methods | When called |
|---|---|---|
| Birth/death | __new__, __init__, __del__ | instance creation / cleanup |
| Representation | __repr__, __str__, __format__, __bool__ | repr(), str(), f"", if |
| Comparison | __eq__, __hash__, __lt__, etc. | ==, hash(), < |
| Container | __len__, __getitem__, __setitem__, __contains__, __iter__ | len(), x[k], in, for |
| Call | __call__ | x(...) |
| Attributes | __getattr__, __setattr__, __delattr__ | attribute access |
| Arithmetic | __add__, __sub__, __mul__, __truediv__, … | +, -, *, / |
| Async | __await__, __aiter__, __anext__, __aenter__, __aexit__ | await, async for/with |
| Inheritance hooks | __init_subclass__, __class_getitem__ | subclass / Cls[T] |

You don’t need to memorize this table. Simply knowing these hooks exist means you can look them up whenever you encounter them in library code.

Wrap-up #

What this post covered:

  • __new__ creates the instance, __init__ initializes; mostly only __init__
  • Prefer context managers over __del__
  • __repr__ (developer) vs __str__ (user); without __str__, __repr__ is the fallback
  • __format__ to handle f-string format specs
  • __eq__ and __hash__ are partners; frozen=True dataclass auto-generates both
  • Containers act almost like lists with __len__ + __getitem__
  • __call__ makes objects callable like functions
  • __getattr__ for missing attributes, __getattribute__ for everything (dangerous)
  • __init_subclass__ for lightweight auto-registration without metaclasses
  • Magic methods are the official hooks linking objects and language features

In the next post (#2 Descriptors and __set_name__) we cover a special category among magic methods — descriptors that turn attributes into objects. @property is actually one form of descriptor.
