gh-60462: Fix locale.strxfrm() on Solaris #138242
serhiy-storchaka merged 4 commits into python:main
Conversation
locale.strxfrm() should interpret the result of wcsxfrm() as a sequence of abstract integers, not as a sequence of Unicode code points, and it should not use any other encoding scheme that does not preserve ordering.
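To illustrate why the encoding matters, here is a minimal sketch of one way ordering can be lost when integers are treated as code points. It is not taken from the PR: the helper `utf16_units` and the two sample values are hypothetical, and UTF-16 is used only as an example of an encoding that does not preserve ordering.

```python
def utf16_units(codes):
    """Encode integers as UTF-16 code units, as if they were code points."""
    units = []
    for c in codes:
        if c > 0xFFFF:
            c -= 0x10000
            units.append(0xD800 + (c >> 10))    # high surrogate
            units.append(0xDC00 + (c & 0x3FF))  # low surrogate
        else:
            units.append(c)
    return units


# Hypothetical collation key elements, chosen only for illustration.
a = [0xFF41]
b = [0x1F41E]

# Compared as abstract integers, a sorts before b.
abstract_order = a < b

# Compared as UTF-16 code units, the order flips, because the
# surrogate 0xD83D is smaller than 0xFF41.
utf16_order = utf16_units(a) < utf16_units(b)

print(abstract_order, utf16_order)
```

Separately, integers above 0x10FFFF are not code points at all, so they cannot be stored in a `str` without some remapping.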
ea8283a to 60a5481
!buildbot Solaris

🤖 New build scheduled with the buildbot fleet by @serhiy-storchaka for commit 60a5481 🤖
Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F138242%2Fmerge
The command will test the builders whose names match the following regular expression: The builders matched are:
Thanks! I tested the patch on Solaris on both SPARC and Intel, and the tests are happy with it. That said, I am unsure whether it's correct to split the codes only when they are longer than 16 bits. Couldn't that break the ordering? For example, with values
-> comparing element by element,

BTW, we are using a similar patch on Solaris:

Note
Yes, it is surprisingly similar. You don't need to add 0x10000 if you split every character. My implementation needs this because it leaves 16-bit codes unchanged (this saves memory and time). More importantly,
Oh, I completely overlooked that
That's true. I don't know if it can change the order in our case, but it certainly shouldn't go through that
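The exchange above about splitting only the wide codes can be sketched as follows. This is an assumption-laden illustration, not the code merged in the PR: the function names, the exact split (high and low 16-bit halves, each offset by 0x10000), and the sample values are all hypothetical. It shows why the offset matters when 16-bit codes are left unchanged.

```python
def split_wide(codes):
    """Hypothetical scheme: keep 16-bit codes, split wider ones in two.

    Both halves are offset by 0x10000, so every piece of a wide code
    sorts after every unchanged 16-bit code.
    Assumes 0 <= c < 2**31 for each code.
    """
    out = []
    for c in codes:
        if c <= 0xFFFF:
            out.append(c)
        else:
            out.append(0x10000 + (c >> 16))     # high half
            out.append(0x10000 + (c & 0xFFFF))  # low half
    return out


def split_wide_no_offset(codes):
    """Same split without the offset, shown only as a counterexample."""
    out = []
    for c in codes:
        if c <= 0xFFFF:
            out.append(c)
        else:
            out.append(c >> 16)
            out.append(c & 0xFFFF)
    return out


def order_preserved(encode, a, b):
    """True if encoding keeps the element-by-element order of a and b."""
    return (a < b) == (encode(a) < encode(b))


small, wide = [0x0002], [0x10000]
print(order_preserved(split_wide, small, wide))
print(order_preserved(split_wide_no_offset, small, wide))
```

Without the offset, 0x10000 becomes the pair 1, 0, which compares lower than the unchanged code 2, so the order flips. With the offset, the first half of any wide code is at least 0x10000 and therefore sorts after every unchanged 16-bit code.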
Thanks @serhiy-storchaka for the PR 🌮🎉. I'm working now to backport this PR to: 3.14.

Thanks @serhiy-storchaka for the PR 🌮🎉. I'm working now to backport this PR to: 3.13.
It should interpret the result of wcsxfrm() as a sequence of abstract integers, not as a sequence of Unicode code points, and it should not use any other encoding scheme that does not preserve ordering. (cherry picked from commit 482fd0c) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
GH-138448 is a backport of this pull request to the 3.14 branch.

GH-138449 is a backport of this pull request to the 3.13 branch.