GH-102300: Reuse objects with refcount == 1 in float specialized binary ops. #102301
I can't see the results. I think external parties can't view that repo.
Sorry, you can change the repo name from |
brandtbucher left a comment
Looks good (I like the clever use of _Py_DECREF_NO_DEALLOC). Do we want a helper function or macro to reduce duplication?
In the original approach (#30594), I only considered the LHS, and added a special case for in-place ops followed by some variant of LOAD_FAST (where the refcount is 2). Not sure if we want to consider adding something similar later.
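For anyone following along, here is a minimal sketch of the reuse idea under discussion. It uses a toy boxed float rather than CPython's real float object, so `BoxedFloat`, `boxed_new`, `boxed_decref` and `boxed_add` are hypothetical names for illustration only, not CPython APIs and not the code in this PR. The operation consumes one reference to each operand; if either operand is uniquely referenced, its storage is overwritten with the result instead of allocating a new object.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a boxed float. Hypothetical; NOT CPython's PyFloatObject. */
typedef struct {
    long refcnt;
    double value;
} BoxedFloat;

/* Allocate a new boxed float holding one reference. */
static BoxedFloat *boxed_new(double value) {
    BoxedFloat *f = malloc(sizeof *f);
    assert(f != NULL);
    f->refcnt = 1;
    f->value = value;
    return f;
}

/* Drop one reference, freeing the object when the count reaches zero. */
static void boxed_decref(BoxedFloat *f) {
    if (--f->refcnt == 0) {
        free(f);
    }
}

/* Add two boxed floats. Consumes one reference to each operand and
 * returns a new reference to the result. */
static BoxedFloat *boxed_add(BoxedFloat *lhs, BoxedFloat *rhs) {
    double result = lhs->value + rhs->value;
    if (lhs->refcnt == 1) {
        /* We hold the only reference to lhs, so nobody can observe the
         * overwrite: reuse its storage for the result. */
        lhs->value = result;
        boxed_decref(rhs);
        return lhs;
    }
    if (rhs->refcnt == 1) {
        rhs->value = result;
        /* lhs->refcnt was > 1 here, so this decrement cannot reach zero
         * and no deallocation check is needed. */
        lhs->refcnt--;
        return rhs;
    }
    /* Both operands are shared: fall back to a fresh allocation. */
    BoxedFloat *res = boxed_new(result);
    boxed_decref(lhs);
    boxed_decref(rhs);
    return res;
}
```

In the second branch the plain decrement plays a role similar to a no-dealloc decref: the refcount check has already established that the object stays alive. The fallback path uses the full `boxed_decref` so that the toy stays correct even when both operands are the same object.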
An inline function is tricky without introducing extra branches, so I've added a macro. The |
About 0.5% faster:
https://github.com/faster-cpython/benchmarking/tree/main/results/bm-20230227-3.12.0a5%2B-c29d369
The results are quite noisy, but they look good: the most float-heavy benchmark, nbody, shows a 7% speedup.