gh-112653: Intern dataclass field names to improve performance #112657
XuehaiPan wants to merge 13 commits into python:main
sobolevn left a comment
If the intent is to improve performance, can you please share some numbers about it?
Here is an example use case:

import dataclasses

@dataclasses.dataclass
class Foo:
    a: int = dataclasses.field(default=0)
    b: int = dataclasses.field(default=0)
    c: int = dataclasses.field(default=0)
    d: int = dataclasses.field(default=0)
    e: int = dataclasses.field(default=0)
    f: int = dataclasses.field(default=0)
    g: int = dataclasses.field(default=0)
    h: int = dataclasses.field(default=0)
    i: int = dataclasses.field(default=0)
    j: int = dataclasses.field(default=0)
    k: int = dataclasses.field(default=0)
    l: int = dataclasses.field(default=0)
    m: int = dataclasses.field(default=0)
    n: int = dataclasses.field(default=0)
    o: int = dataclasses.field(default=0)
    p: int = dataclasses.field(default=0)
    q: int = dataclasses.field(default=0)
    r: int = dataclasses.field(default=0)
    s: int = dataclasses.field(default=0)
    t: int = dataclasses.field(default=0)
    u: int = dataclasses.field(default=0)
    v: int = dataclasses.field(default=0)
    w: int = dataclasses.field(default=0)
    x: int = dataclasses.field(default=0)
    y: int = dataclasses.field(default=0)
    z: int = dataclasses.field(default=0)

Benchmark results:

>>> %timeit Foo()
Before:
1.5 µs ± 110 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
After:
1.36 µs ± 36.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
>>> foo = Foo()
>>> %timeit dataclasses.asdict(foo)
Before:
9.67 µs ± 749 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
After:
9.08 µs ± 60.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The patched version consistently runs faster and has a smaller variance in running time.

What about dataclasses with a reasonable number of fields? Like 0, 1, 5? Can you also test these use cases?

I ran some benchmarks. Interning the field names always improves the performance of [...]. However, interning the default factory names may have a negative impact during instantiation, so I removed that in the last commit.

Script:

import textwrap
import timeit

NUMBER = 1_000_000
REPEAT = 5

for num_fields in [1, 2, 3, 4, 5, 8, 16, 32, 64]:
    SETUP = textwrap.dedent(
        f"""
        import dataclasses

        Foo = dataclasses.make_dataclass(
            'Foo',
            fields=[
                (f'field_xxx_{{i}}', int, dataclasses.field(default=0))
                for i in range({num_fields})
            ],
        )
        foo = Foo()
        """
    ).strip()
    ctor_time = (
        min(
            timeit.repeat(
                'Foo()',
                setup=SETUP,
                number=NUMBER,
                repeat=REPEAT,
            )
        )
        / NUMBER
    )
    asdict_time = (
        min(
            timeit.repeat(
                'dataclasses.asdict(foo)',
                setup=SETUP,
                number=NUMBER,
                repeat=REPEAT,
            )
        )
        / NUMBER
    )
    print(
        f'num_fields: {num_fields:<2d} '
        f'ctor: {ctor_time * 1e6:5.3f}us '
        f'asdict: {asdict_time * 1e6:5.3f}us'
    )

Results (I ran these on another device, macOS with an M2 Pro):

I don't think this performance gain is worth it, but I'm open to other opinions.


This PR interns the field names of user-defined dataclasses. Interning happens only once, at type creation, so the overhead is relatively small. We already make a similar optimization for namedtuple:
cpython/Lib/collections/__init__.py, line 384 in a971574
cpython/Lib/collections/__init__.py, line 424 in a971574
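
For readers unfamiliar with interning, here is a minimal sketch of what `sys.intern` does to names built at runtime, such as the ones generated for `dataclasses.make_dataclass` in the benchmark script. The helper `intern_names` is hypothetical and exists only for illustration. It is not the code from this PR, and the identity checks rely on CPython behavior.

```python
import sys


def intern_names(names):
    """Hypothetical helper: return the interned version of each name."""
    return [sys.intern(name) for name in names]


# Build the names at runtime so that CPython does not intern them
# automatically, unlike identifier-like string literals in source code.
dynamic = ["".join(["field_xxx_", str(i)]) for i in range(3)]
again = ["".join(["field_xxx_", str(i)]) for i in range(3)]

# The two lists hold equal strings, but each string is a separate object.
equal_values = dynamic == again
distinct_objects = all(a is not b for a, b in zip(dynamic, again))

# After interning, equal names resolve to one shared object, so a
# comparison can succeed on identity without comparing characters.
interned_a = intern_names(dynamic)
interned_b = intern_names(again)
shared_objects = all(a is b for a, b in zip(interned_a, interned_b))
```

Whether this sharing yields a measurable speedup is exactly what the benchmarks in this conversation try to establish. The sketch only shows the identity property that interning provides.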
Resolves #112653