gh-131675: mimalloc: fix mi_atomic_yield on 32-bit ARM#131784
gh-131675: mimalloc: fix mi_atomic_yield on 32-bit ARM#131784colesbury merged 4 commits intopython:mainfrom
Conversation
Previously, `mi_atomic_yield` for 32-bit ARM:
* Used a non-standard __ARM_ARCH__ macro to determine if the compiler
was targeting ARMv7+ in order to emit a `yield` instruction, however
it was often not defined so fell through to an `__armel__` branch
* Had a logic gap in the #ifdef where an `__arm__` target would be
expected to emit `mi_atomic_yield` but there was no condition to
define a function for big-endian targets
Now, the standard ACLE __ARM_ARCH macro [1] [2] is used which GCC and
Clang support.
The branching logic for `__armel__` has been removed so if the target
architecture supports v7+ instructions, a `yield` is emitted, otherwise
a `nop` is emitted. This covers both big and little endian scenarios.
[1]: https://arm-software.github.io/acle/main/acle.html#arm-architecture-level
[2]: https://arm-software.github.io/acle/main/acle.html#mapping-of-object-build-attributes-to-predefines
Signed-off-by: Vincent Fazio <vfazio@gmail.com>
|
@colesbury FYI |
|
I tested this on Python 3.13 builds via Buildroot which is where the original issue came from. The builds now work (I tested 32bit big-endian ARMv7+ and ARMv5+ builds). Note that I did not perform functional testing. Upstream mimalloc has a similar fix, but it does not fix issues with big-endian builds where I may make a a PR for that repository so nothing gets lost the next time CPython syncs the library internally. |
|
🤖 New build scheduled with the buildbot fleet by @colesbury for commit 7302058 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F131784%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
|
🤞 The RPi build seemed to pass, so that's a good sign. |
|
Can we use the same change as mimalloc upstream: microsoft/mimalloc@632eab9 ? |
We can, but it does not cover the situation where big endian armv6/5/4 is being targeted. |
|
I mentioned this here microsoft/mimalloc#1046 (comment) I'd prefer to not have that gap. I think the fix here is more complete. If that means I need to submit this change up stream, I will |
|
Just as an example, for Buildroot, this build would continue to fail otherwise https://autobuild.buildroot.org/results/c0b592969b67e0a6454ab0714ca9540d7bae498d/ I can obviously create a patch for us that does more than what we're willing to do here, it's just an additional maintenance burden. |
if we use this, arm big endian will revert to the default implementation I also prefer this patch as it groups together all the Arm variants together, it has a better coverage and it is easier to reason about. With what whatever solution go, we should keep it in sync with mimalloc repo. |
It's a little worse than that. I tested a mimalloc build with an armeb toolchain targeting armv6 and it fails to link because the function body isn't emitted. The way it's structured is: armeb is only handled if it's also armv7+ I'll make a PR for upstream and see if they accept it |
|
I've created microsoft/mimalloc#1050 in case either of you would like to chime in there |
|
The upstream commit was accepted. I've updated this branch to sync changes with what's in upstream for |
|
@colesbury @diegorusso you guys have been super responsive, I really appreciate that. Not all of my PRs here get feedback so quickly. |
Thanks for the PR and for the one on the mimalloc project as well. LGTM. I leave to @colesbury to approve/merge. |
|
Awesome, thanks! LMK if there's anything I can help with here. |
|
Thanks @vfazio! |
No problem! @colesbury Out of curiosity, what is the best way to get a reviewer assigned for future (or outstanding) PRs? Is there a place to post for help? |
|
Thanks @vfazio for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13. |
…hongh-131784) Use the standard `__ARM_ARCH` macro, which is supported by GCC and Clang. The branching logic for of `__ARMEL__` has been removed so if the target architecture supports v7+ instructions, a yield is emitted, otherwise a nop is emitted. This covers both big and little endian scenarios. (cherry picked from commit 03f6c8e) Co-authored-by: Vincent Fazio <vfazio@gmail.com> Signed-off-by: Vincent Fazio <vfazio@gmail.com>
|
GH-131954 is a backport of this pull request to the 3.13 branch. |
…-131784) (gh-131954) Use the standard `__ARM_ARCH` macro, which is supported by GCC and Clang. The branching logic for of `__ARMEL__` has been removed so if the target architecture supports v7+ instructions, a yield is emitted, otherwise a nop is emitted. This covers both big and little endian scenarios. (cherry picked from commit 03f6c8e) Signed-off-by: Vincent Fazio <vfazio@gmail.com> Co-authored-by: Vincent Fazio <vfazio@gmail.com>
|
You can post in the discussion forums, but ymmv depending on the PR. |
Previously,
mi_atomic_yieldfor 32-bit ARM:Used a non-standard
__ARM_ARCH__macro to determine if the compiler was targeting ARMv7+ in order to emit ayieldinstruction, however it was often not defined so fell through to an__ARMEL__branchHad a logic gap in the #ifdef where an
__arm__target would be expected to emitmi_atomic_yieldbut there was no condition to define a function for big-endian targetsNow, the standard ACLE __ARM_ARCH macro 1 2 is used which GCC and Clang support.
The branching logic for
__ARMEL__has been removed so if the target architecture supports v7+ instructions, ayieldis emitted, otherwise anopis emitted. This covers both big and little endian scenarios.