GH-116554: Relax list.sort()'s notion of "descending" runs by tim-one · Pull Request #116578 · python/cpython

tim-one · 2024-03-10T23:55:36Z

Not yet done, but good enough for timing.

Issue: Relax list.sort()'s notion of "descending" runs #116554

`sortslice_reverse()`. The current `sortslice_advance(&slice, n - neq); reverse_sortslice(&slice, neq);` pair of lines is an affront to the soul ;-) Putting the compound name in <OBJECT TYPE NAME>_<METHOD NAME> order feels significantly more natural.

tim-one · 2024-03-11T03:58:40Z

This looks good to go to me. I wasn't able to provoke any errors in the code, and didn't find any significant timing surprises (compared to the current main branch). The OP's original "problem case" (on StackOverflow) runs much faster now, and, as expected, is down to about 1.5 compares per element (independent of list size - the first call of count_run() now sorts the whole thing in-place).
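For anyone who wants to check the compares-per-element figure locally, here is a minimal sketch. The `Counted` wrapper and both helper functions are hypothetical names made up for this illustration; they are not part of the PR or of test_sort.py. The ratio you see depends on the Python build: a value near 1.5 should only appear on a version that includes this change. On paper the figure looks plausible, as I read the new code: inside the descending run, recognizing an adjacent equal pair takes two `<` compares (neither element is smaller), and each strict step down takes one, so roughly three compares per two elements.

```python
class Counted:
    """Wraps a value and counts every `<` comparison made on instances."""
    __slots__ = ("value",)
    compares = 0  # class-wide counter, reset before each measurement

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.compares += 1
        return self.value < other.value


def paired_descending(n):
    """Build [n-1, n-1, ..., 1, 1, 0, 0], the shape of list from the SO question."""
    return [v for v in range(n - 1, -1, -1) for _ in range(2)]


def compares_per_element(values):
    """Sort wrapped copies of `values`; return `<` calls divided by list length."""
    Counted.compares = 0
    items = [Counted(v) for v in values]
    items.sort()
    # Sanity check: the wrapped sort must agree with sorting the raw values.
    assert [item.value for item in items] == sorted(values)
    return Counted.compares / len(items)


if __name__ == "__main__":
    # The result is version dependent, so no expected value is asserted here.
    print(compares_per_element(paired_descending(10_000)))
```

Running the same script under an older and a newer interpreter is the simplest way to see the difference; the list size can be varied to confirm that the ratio does not grow with it.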

Reviews always appreciated, but if a few days pass without one, I'll just commit it anyway.

I kept `lo` mostly to reduce the fiddly typing needed for IFLT arguments. But IFLT calls were tedious and error-prone too, so I introduced two tiny macros to capture the gibberish needed to spell "is the next element smaller/larger?" once and for all. There was no real use left for `lo` then, so I got rid of it. A variable named `lo` still exists, but with a different meaning: it's a const capturing `slo->keys`, viewed as an array for `n` to index. A modern optimizing compiler shouldn't need my help to realize that marching through an array one element at a time can be strength-reduced to pointer increments (which is what the old `lo` did).

…hon#116578) * pythonGH-116554: Relax list.sort()'s notion of "descending" run

Rewrote `count_run()` so that sub-runs of equal elements no longer end a descending run. Both ascending and descending runs can have arbitrarily many sub-runs of arbitrarily many equal elements now. This is tricky, because we only use ``<`` comparisons, so checking for equality doesn't come "for free". Surprisingly, it turned out there's a very cheap (one comparison) way to determine whether an ascending run consisted of all-equal elements. That sealed the deal.

In addition, after a descending run is reversed in-place, we now go on to see whether it can be extended by an ascending run that just happens to be adjacent. This succeeds in finding at least one additional element to append about half the time, and so appears to more than repay its cost (the savings come from getting to skip a binary search, when a short run is artificially forced to length MINRUN later, for each new element `count_run()` can add to the initial run).

While these have been in the back of my mind for years, a question on StackOverflow pushed it to action:

https://stackoverflow.com/questions/78108792/

They were wondering why it took about 4x longer to sort a list like:

    [999_999, 999_999, ..., 2, 2, 1, 1, 0, 0]

than "similar" lists. Of course that runs very much faster after this patch.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
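The run detection described in this commit message can be sketched in Python. This is an illustration of the idea only, not the C code in Objects/listobject.c: the `key` parameter and the `smaller` and `reverse` helpers are assumptions of the sketch, and the real routine works on a `sortslice` rather than a Python list. The one-comparison trick shows up where the first and last elements of a non-decreasing prefix are compared: if the first is not smaller than the last, every element in between must be equal. Equal sub-runs inside a descending run are reversed on the fly so that the final whole-run reversal puts them back in their original order, which keeps the sort stable.

```python
def count_run(a, key=lambda x: x):
    """Sketch: make the run at the start of `a` non-decreasing in place
    (stably) and return its length. Only `<` is applied to keys."""

    def smaller(i, j):
        # True if a[i] sorts strictly before a[j].
        return key(a[i]) < key(a[j])

    def reverse(lo, hi):
        # Reverse the slice a[lo:hi] in place.
        a[lo:hi] = a[lo:hi][::-1]

    total = len(a)
    assert total > 0
    n = 1
    # Try a non-decreasing run first.
    while n < total and not smaller(n, n - 1):
        n += 1
    if n == total:
        return n
    # Here a[n] is strictly smaller than a[n-1].
    if n > 1:
        if smaller(0, n - 1):
            return n  # the prefix rose somewhere, so it is an ascending run
        # One compare showed the prefix is all equal; pre-reverse it so the
        # final reversal restores its original order.
        reverse(0, n)
    n += 1  # a[n] has been resolved as part of the descending run
    neq = 0  # number of consecutive "equal to previous" steps just seen
    while n < total:
        if smaller(n, n - 1):
            if neq:  # a strict drop closes the current equal sub-run
                reverse(n - neq - 1, n)
                neq = 0
        elif smaller(n - 1, n):
            break  # strictly larger: the descending run is over
        else:
            neq += 1  # neither is smaller, so the two are equal
        n += 1
    if neq:
        reverse(n - neq - 1, n)
    reverse(0, n)  # turn the descending run into an ascending one
    # The reversed run may be extendable by an adjacent ascending stretch.
    while n < total and not smaller(n, n - 1):
        n += 1
    return n
```

For example, `count_run([3, 2, 3, 4, 1])` reverses the leading `[3, 2]` and then extends the run over `3, 4`, returning 4. Passing `key=lambda pair: pair[0]` with `(key, tag)` tuples is a convenient way to watch equal elements keep their original order.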

nanonyme · 2025-01-30T11:24:31Z

This seems to result in sorting changes between 3.12 and 3.13. Is the sort still Timsort after this? The criterion for descending runs now seems to conflict with the description at https://en.m.wikipedia.org/wiki/Timsort

tim-one added 2 commits March 10, 2024 18:32

pythonGH-116554: Relax list.sort()'s notion of "descending" run

1eb8ba1

Merge remote-tracking branch 'upstream/main' into descend

c485699

bedevere-app bot added the awaiting core review label Mar 10, 2024

bedevere-app bot mentioned this pull request Mar 10, 2024

Relax list.sort()'s notion of "descending" runs #116554

Closed

tim-one added DO-NOT-MERGE interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Mar 10, 2024

tim-one and others added 5 commits March 10, 2024 19:19

Repair assorted typos in comments.

fc25664

📜🤖 Added by blurb_it.

b3b39a8

Add exhaustive stability testing for all possible short lists

997649b

with few distinct elements. It's in the nature of the beast that this will catch nearly all plausible "off by 1" coding errors in this context.
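The idea behind that commit can be sketched in a few lines. This is not the test that was added to Lib/test/test_sort.py; `check_stability`, `max_len`, and `num_keys` are names assumed for this illustration. It enumerates every list up to a given length over a small set of keys, tags each element with its original position, sorts on the key alone, and checks that equal keys come out in their original order. Few distinct keys means lots of ties, which is exactly where an off-by-one in the equal-sub-run handling would show up.

```python
from itertools import product


def check_stability(max_len=6, num_keys=3):
    """Exhaustively check that list.sort() is stable on all short lists
    drawn from `num_keys` distinct keys. Returns the number of lists checked."""
    checked = 0
    for size in range(max_len + 1):
        for keys in product(range(num_keys), repeat=size):
            # Tag each key with its original index so ties are distinguishable.
            data = [(k, i) for i, k in enumerate(keys)]
            # Sorting the full (key, index) tuples gives the unique stable answer.
            expected = sorted(data)
            # Sort on the key only, so ties are left to the algorithm.
            data.sort(key=lambda pair: pair[0])
            if data != expected:
                raise AssertionError(f"unstable or wrong result for {keys!r}")
            checked += 1
    return checked
```

With the defaults this covers 1 + 3 + 9 + ... + 729 = 1093 lists, which runs almost instantly; raising `max_len` or `num_keys` grows the count exponentially.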

Update 2024-03-11-00-45-39.gh-issue-116554.gYumG5.rst

3b2bce6

Switching to what I guess are some tool's different spelling of GH's single backticks.

tim-one removed the DO-NOT-MERGE label Mar 11, 2024

tim-one added 2 commits March 10, 2024 23:23

While I was in the neighborhood, repaired several out-of-date

563aa8a

comments in binarysort(), and deleted a redundant computation. Sue me ;-)

Repaired more mistakes in comments.

e93703f

eendebakpt reviewed Mar 11, 2024

Misc/NEWS.d/next/Core and Builtins/2024-03-11-00-45-39.gh-issue-116554.gYumG5.rst (outdated)

AlexWaygood reviewed Mar 11, 2024

Lib/test/test_sort.py (outdated)

tim-one and others added 12 commits March 11, 2024 18:36

Update Lib/test/test_sort.py

f915279

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

Update Misc/NEWS.d/next/Core and Builtins/2024-03-11-00-45-39.gh-issu…

53a8033

…e-116554.gYumG5.rst Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>

Merge remote-tracking branch 'upstream/main' into descend

14c4c74

Merge remote-tracking branch 'upstream/main' into descend

0e7735b

Remove early-out for singletons at the start. They're the last

2c98515

thing we should be optimizing for ;-) Seriously, they'll return very early anyway, as a matter of course, after the first loop terminates without doing anything, and then the `n == nremaining` test will pass.

Arghgh - another spelling error in a comment.

96719c9

Merge branch 'main' into descend

be79636

And another spelling error in a comment - I'm tiring of this ;-)

a1a9f45

Repair comments that weren't updated when the meaning of lo changed.

a09902a

Merge branch 'main' into descend

a41d77b

Merge branch 'main' into descend

9c69c5a

tim-one merged commit bf121d6 into python:main Mar 13, 2024

bedevere-app bot removed the awaiting core review label Mar 13, 2024

tim-one deleted the descend branch March 13, 2024 01:01

mwtoews mentioned this pull request Mar 4, 2025

Unexpected sorted result #130823

Closed

tim-one merged 21 commits into python:main from tim-one:descend

4 participants
