Python reports vectorcall as available even when using it will not help and may even hurt performance #123372
Labels: performance, topic-C-API, type-bug


Bug report
Bug description:
The following Python function
has vectorcall enabled (PyVectorcall_Function() returns a non-NULL function), but because it collects args, Python needs to convert the arguments into a tuple, so calling it through vectorcall won't be any faster than allocating a tuple directly.
This is not a problem if we know the number of arguments beforehand, but if we have to allocate the argument array for vectorcall, we allocate twice needlessly: once for the array and again for the tuple Python builds from it.
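As a minimal sketch of the point above, the function below is a hypothetical stand-in (the name collect is an assumption, not the function from the report). It only shows that a function declared with *args always receives its positional arguments packed into a tuple, which is why a caller-side argument array ends up being copied into a tuple anyway.

```python
def collect(*args):
    """Hypothetical stand-in: gathers all positional arguments."""
    # By the time the body runs, the interpreter has already packed
    # the positional arguments into a tuple.
    return args


packed = collect(1, 2, 3)
print(type(packed).__name__)  # tuple
print(packed)                 # (1, 2, 3)
```

The same tuple contents result from collect(*some_list), so the caller's container type does not change what the callee sees.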
Skipping the vectorcall allocation altogether does not always give the best performance either, because vectorcall can have an edge with bound methods when the PY_VECTORCALL_ARGUMENTS_OFFSET flag is used.
Per the docs:
As such, I believe this function (and similar functions) should not implement vectorcall.
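The bound-method advantage mentioned earlier can be pictured with a pure-Python model. This is only an illustration of the slot-reuse idea behind PY_VECTORCALL_ARGUMENTS_OFFSET, not the C API itself; every name in it (receive, bound_call_copying, bound_call_in_place) is hypothetical, and a Python list stands in for the C argument array.

```python
def receive(vector, start, count):
    """Hypothetical callee: reads `count` arguments from `vector` at `start`."""
    return tuple(vector[start:start + count])


def bound_call_copying(self_obj, vector, start, count):
    # No spare slot is available: build a fresh, longer vector
    # just to put `self_obj` in front of the arguments.
    fresh = [self_obj] + vector[start:start + count]
    return receive(fresh, 0, count + 1)


def bound_call_in_place(self_obj, vector, start, count):
    # The caller promised that the slot just before the arguments may be
    # borrowed, so `self_obj` is written there and no new vector is built.
    saved = vector[start - 1]
    vector[start - 1] = self_obj
    try:
        return receive(vector, start - 1, count + 1)
    finally:
        # The borrowed slot must be restored before returning.
        vector[start - 1] = saved


# Slot 0 is the spare slot; the real arguments start at index 1.
buffer = [None, 1, 2]
print(bound_call_copying("self", buffer, 1, 2))   # ('self', 1, 2)
print(bound_call_in_place("self", buffer, 1, 2))  # ('self', 1, 2)
print(buffer)                                     # [None, 1, 2]
```

Both helpers hand the callee the same arguments; the in-place version simply avoids building a second vector, which is the saving a caller gives up if it never uses vectorcall.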
Originally reported on Stack Overflow.
CPython versions tested on:
3.12
Operating systems tested on:
Windows