gh-126703: Add freelists for iterators and range, method and builtin_function_or_method objects #128368
eendebakpt wants to merge 11 commits into python:main
|
I don't think we should share the freelists for iterators. We're not using that much memory and it's really bug-prone to share them. |
I agree with you. I am experimenting a bit to see whether it is possible at all to do this with different types (maybe for PyType_GenericAlloc, or some size-based freelist), but for the iterators I will probably split it again. |
|
The results are excellent! 1% faster geomean. Great work and congrats Pieter! |
|
I am not sure that it's worth adding the free list every time if there is a small margin (<3-5%). |
|
pycfunctionobject / pycmethodobject / class_method / shared_iters are maybe good to be added. |
Benchmark results show consistent 1% geomean speedup on pyperformance. That's pretty worth it (for comparison, the entire types optimizer in the JIT is only 1% speedup and is way more code). Though you're probably right that not all of them are worth it. I'm thinking the method and list/tuple iters are most worth it. |
|
I made PRs for the individual components that are worthwhile (based on the stats). The ones that do not have a PR yet (because the implementation would be more complex) are generators, StopIteration (or exceptions more generally) and ints of small size (e.g. 2 or 3 digits). I will close this PR as it is superseded by the others. |
This one looks promising. Maybe start to extract this free list from your PR as a new PR? |
I created 4 followup PRs. The |
|
Closing in favor of the individual PRs |
In this PR we add freelists for the most frequently allocated objects (measured using the pyperformance benchmark). Some frequently allocated objects that have not yet been added: ints with 2 or 3 digits, exceptions (StopIteration, IndexError) and generators.
If the freelists increase performance, the PR should probably be split into multiple ones.
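For readers unfamiliar with the technique, here is a toy Python model of a bounded freelist. It is only a conceptual sketch: the real freelists are implemented in C inside CPython, and the class name `FreeList`, its methods and its counters are hypothetical names invented for this illustration.

```python
class FreeList:
    """Toy model of a bounded freelist (hypothetical; not CPython's C code)."""

    def __init__(self, factory, max_size):
        self._factory = factory    # called when no cached object is available
        self._max_size = max_size  # upper bound on the number of cached objects
        self._free = []            # objects that were released and can be reused
        self.allocated = 0         # how many objects came from the factory
        self.reused = 0            # how many objects came from the freelist

    def acquire(self):
        # Fast path: hand back a previously released object.
        # The caller is responsible for reinitializing its state.
        if self._free:
            self.reused += 1
            return self._free.pop()
        # Slow path: fall back to a fresh allocation.
        self.allocated += 1
        return self._factory()

    def release(self, obj):
        # Cache the object only while there is room; otherwise drop it
        # so that the freelist cannot grow without bound.
        if len(self._free) < self._max_size:
            self._free.append(obj)
```

The size bound is what keeps the extra memory small: once the list is full, released objects are simply dropped and reclaimed in the usual way.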
Microbenchmarks:
The list, float and int freelists are already in main, so we don't expect an improvement there. The iterator benchmarks show a modest improvement. bench_builtin_or_method shows an improvement, but it is a somewhat artificial benchmark.
Benchmark script
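The original script is not reproduced here. As a rough sketch of what such microbenchmarks could look like, the following uses the standard-library `timeit` module. Apart from `bench_builtin_or_method`, which is named above, the function names, their bodies and the loop counts are assumptions made for illustration and not the script used for the reported numbers.

```python
import timeit


def bench_list_iter(n=1000):
    # Creates and exhausts one list iterator per outer loop pass.
    data = [1, 2]
    for _ in range(n):
        for _ in data:
            pass


def bench_tuple_iter(n=1000):
    # Creates and exhausts one tuple iterator per outer loop pass.
    data = (1, 2)
    for _ in range(n):
        for _ in data:
            pass


def bench_range_iter(n=1000):
    # Creates one range object and one range iterator per outer loop pass.
    for _ in range(n):
        for _ in range(2):
            pass


def bench_builtin_or_method(n=1000):
    # Binding a builtin method creates a builtin_function_or_method object.
    data = []
    for _ in range(n):
        m = data.append


BENCHMARKS = [
    bench_list_iter,
    bench_tuple_iter,
    bench_range_iter,
    bench_builtin_or_method,
]


def run_benchmarks(repeat=3, number=20):
    """Return the best time in seconds per call for each benchmark."""
    results = {}
    for func in BENCHMARKS:
        timings = timeit.repeat(func, repeat=repeat, number=number)
        results[func.__name__] = min(timings) / number
    return results


if __name__ == "__main__":
    for name, seconds in run_benchmarks().items():
        print(f"{name}: {seconds * 1e6:.1f} us per call")
```

Running the same script on a build of main and on a build of the PR branch gives a rough comparison; a dedicated tool such as pyperf would give more stable numbers than this sketch.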