Get rid of conditional inputs and outputs for instructions in bytecodes.c #128914
Comments
(Tagging it as a |
…and the code generators (pythonGH-128918)
…odes.c` and the code generators (pythonGH-128918)" The commit introduced a large performance regression in the free threading build. This reverts commit ab61d3f.
…odes.c` and the code generators (pythonGH-128918)" The commit introduced a ~2.5-3% regression in the free threading build. This reverts commit ab61d3f.
It seems that #128918 caused a 2-3% slowdown on the free-threading build.
FWIW, I couldn't reproduce this on our infrastructure. I'm not claiming one is better or more reliable, only that the effect being noticed here might be more complicated...


We should remove the conditional stack effects in instruction definitions in bytecodes.c
Conditional stack effects already complicate code generation and that is only going to get worse with top-of-stack caching and other interpreter/JIT optimizations.
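As a rough illustration of why a conditional effect is awkward for a code generator, here is a toy Python model of the net stack effect of `LOAD_GLOBAL`. The function names are hypothetical and are not CPython APIs; the only behavior assumed is the documented one, that `LOAD_GLOBAL` pushes an extra `NULL` when the low bit of its oparg is set.

```python
# Toy model only: these helpers are hypothetical, not part of CPython.

def load_global_effect_conditional(oparg: int) -> int:
    """Net stack effect when the NULL push depends on the low bit of oparg."""
    pushed = 1              # the looked-up global
    if oparg & 1:           # conditional output: NULL only when the bit is set
        pushed += 1
    return pushed


def load_global_effect_fixed() -> int:
    """Net stack effect once the conditional output is removed."""
    return 1                # always exactly one value, whatever the oparg


def push_null_effect() -> int:
    """PUSH_NULL always pushes exactly one value."""
    return 1
```

With the conditional form, anything that tracks stack depth statically has to carry the oparg along. With the fixed form, the depth after each instruction is a constant, and the `NULL` has to come from a separate `PUSH_NULL`.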
There were two reasons for having conditional stack effects:
Reason 1 no longer applies. Instructions are much more regular now and it isn't that much work to remove the remaining conditional stack effects.
That leaves performance. I experimentally removed the conditional stack effects for `LOAD_GLOBAL` and `LOAD_ATTR`, which is the worst possible case for performance, as it makes no attempt to mitigate the extra dispatch costs and possibly worse specialization.
The results are here
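The extra dispatch cost can be sketched with a simplified model of the instruction stream for a call to a global, such as `f()`. The `(opname, oparg)` tuples and the helper names below are assumptions made for illustration. They do not reproduce the compiler's actual output or oparg encoding.

```python
# Hypothetical instruction streams for calling a global function `f()`.
# Each entry is (opname, oparg); the encoding is simplified for illustration.

def with_conditional_effect(name_index: int) -> list[tuple[str, int]]:
    # The low bit of the oparg asks LOAD_GLOBAL to push the extra NULL itself.
    return [("LOAD_GLOBAL", (name_index << 1) | 1), ("CALL", 0)]


def without_conditional_effect(name_index: int) -> list[tuple[str, int]]:
    # LOAD_GLOBAL pushes exactly one value, so NULL needs its own instruction.
    return [("LOAD_GLOBAL", name_index), ("PUSH_NULL", 0), ("CALL", 0)]


def dispatch_count(instructions: list[tuple[str, int]]) -> int:
    # One interpreter dispatch per instruction in this model.
    return len(instructions)
```

In this model the unmitigated version pays one additional dispatch per call site.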
Overall we see a 0.8% slowdown. It seems that specialization is not significantly worse, but there is a large increase in `PUSH_NULL` following `LOAD_GLOBAL` that appears to be responsible for the slowdown. An extra specialization should fix that.
Prior discussion
Linked PRs
…`bytecodes.c` and the code generators #128918
…`bytecodes.c` and the code generators (GH-128918)" #129202