gh-121795: Improve performance of set membership testing from set arguments by HarryLHW · Pull Request #121796 · python/cpython

HarryLHW · 2024-07-15T15:44:37Z

Issue: Improve performance of set membership testing from set arguments #121795

picnixz

Can we have some benchmarks?

Objects/setobject.c

HarryLHW · 2024-07-15T16:09:03Z

> Can we have some benchmarks?

Yes :)

Benchmark (on an M2 MacBook Air):

Script:

import timeit

for n in (1, 10, 100, 1000, 10000):
    setup = f'''
a = set(range({n}))
b = {{frozenset(a)}}
'''
    print(timeit.timeit("a in b", setup), f'N = {n}')

main branch result:

0.3688333749996673 N = 1
0.48004929200033075 N = 10
1.781020708000142 N = 100
11.256919416999153 N = 1000
152.03752795800028 N = 10000

my branch result:

0.30569541699878755 N = 1
0.3746081659992342 N = 10
1.2535620420021587 N = 100
7.2957435000025725 N = 1000
86.63279329200304 N = 10000

There will be no performance regression in the normal case.
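
To make the comparison easier to read, the timings above can be turned into speedup ratios. This is only a small arithmetic sketch over the numbers already posted; the dictionary names are made up for illustration, and the figures come from a single run on one machine.

```python
# Timings copied from the benchmark output above, in seconds.
# timeit.timeit() runs the statement 1,000,000 times by default.
main_branch = {
    1: 0.3688333749996673,
    10: 0.48004929200033075,
    100: 1.781020708000142,
    1000: 11.256919416999153,
    10000: 152.03752795800028,
}
pr_branch = {
    1: 0.30569541699878755,
    10: 0.3746081659992342,
    100: 1.2535620420021587,
    1000: 7.2957435000025725,
    10000: 86.63279329200304,
}

# Ratio > 1 means the PR branch is faster for that set size.
speedups = {n: main_branch[n] / pr_branch[n] for n in main_branch}

for n, ratio in speedups.items():
    print(f"N = {n:>5}: {ratio:.2f}x")
```

The ratio grows with N, from about 1.2x at N = 1 to about 1.75x at N = 10000.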

picnixz

Maybe like this? (for the rest, I'll leave it to Raymond)

Objects/setobject.c

erlend-aasland

This is a neat performance improvement, that contrary to most other such attempts, does not add to the complexity of the code; nice. AFAICS, this is good to go. I'll leave the landing of the PR to the code owner.

Objects/setobject.c

HarryLHW · 2024-07-17T17:07:20Z

> This is a neat performance improvement, that contrary to most other such attempts, does not add to the complexity of the code; nice. AFAICS, this is good to go. I'll leave the landing of the PR to the code owner.

Thank you so much for your review. I have made changes to resolve the issues.

rhettinger · 2024-07-21T06:46:19Z

We used to do something like this, but it caused bugs and had to be removed. See https://bugs.python.org/issue8757 and checkin 4d45c10. IIRC there was also a reason that the set had to be made temporarily immutable during the lookup. I don't remember all the details now. Perhaps @serhiy-storchaka does.

The original set_swap_bodies variants were very fast, so it was a bummer to have to use a set copy instead. The good news is that set copies are now much faster than they were. Also, we realized that the implicit frozenset conversion promised in the docs was an almost never used feature.
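
For readers who have not met the implicit frozenset conversion mentioned above, here is a short illustration of the documented behavior that this PR speeds up. It is a usage sketch only; the variable names are arbitrary.

```python
a = {1, 2, 3}            # a plain set: mutable and unhashable
b = {frozenset(a)}       # a set that contains an equal frozenset

# A set cannot be hashed directly...
try:
    hash(a)
    hashable = True
except TypeError:
    hashable = False

# ...but __contains__(), remove(), and discard() accept a set argument
# and search for an equivalent frozenset.
found = a in b
missing = {9} in b

b.discard({9})           # no match, so nothing happens
size_after_discard = len(b)

b.remove(a)              # removes the matching frozenset
size_after_remove = len(b)

# add() does not get the same treatment: a set still cannot be stored.
try:
    b.add(a)
    add_worked = True
except TypeError:
    add_worked = False
```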

serhiy-storchaka · 2024-07-21T07:55:57Z

It was before I started contributing to CPython, so I have no memories about this case.

The approach of this PR is better than the code used before bpo-8757. It leaves the original set key unmodified, so other threads are not affected if they only read it. The original reproducer should pass this test.

But what if the set key is modified in another thread? I think this is safe if set comparison is thread-safe (and AFAIK it is).

There is a new race condition: the set key can be changed while its hash is being calculated (currently creating a frozenset from a set is atomic, or it can be made atomic). But I do not think this is bad. We should not guarantee the correct result in such a case.

I think this change is safe. LGTM.
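
The copy-based fallback discussed in this thread can be modeled in pure Python. This is a rough sketch for intuition only, not the C code in Objects/setobject.c; `model_contains` is a hypothetical name. Judging from the commit titles and the comments above, the PR appears to avoid the temporary copy by hashing the set key in place, which is something Python code cannot express directly.

```python
def model_contains(container: set, key) -> bool:
    """Hypothetical pure-Python model of a set lookup with a set key.

    Models the copy-based fallback: an unhashable set key is
    snapshotted into a frozenset, and the snapshot is looked up.
    The original key is never modified.
    """
    try:
        hash(key)
    except TypeError:
        if not isinstance(key, set):
            raise                # other unhashable types stay errors
        key = frozenset(key)     # temporary copy; this is the cost being discussed
    return key in container


container = {frozenset({1, 2}), frozenset({3})}
key = {1, 2}

hit = model_contains(container, key)      # equal frozenset is present
miss = model_contains(container, {1, 3})  # no equal frozenset

# A list is unhashable too, but it gets no special treatment.
try:
    model_contains(container, [1, 2])
    list_rejected = False
except TypeError:
    list_rejected = True
```

Because the key is only read, other threads that merely read the same set are unaffected, which matches the point made above about leaving the original set key unmodified.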

HarryLHW added 3 commits July 15, 2024 22:42

optimize using a set argument in __contains__(), remove(), and discard() methods

af45196

add tests

53a009a

add critical section

2d46655

HarryLHW requested a review from rhettinger as a code owner July 15, 2024 15:44

bedevere-app bot mentioned this pull request Jul 15, 2024

Improve performance of set membership testing from set arguments #121795

Closed

bedevere-app bot added the awaiting review label Jul 15, 2024

rhettinger self-assigned this Jul 15, 2024

picnixz reviewed Jul 15, 2024

refactor frozenset_hash()

d5c5459

picnixz reviewed Jul 15, 2024

HarryLHW and others added 3 commits July 16, 2024 01:38

rename set_hash_func to compute_setobject_hash

c2fa92a

📜🤖 Added by blurb_it.

079ecd6

trim trailing whitespace

60f7cf7

erlend-aasland approved these changes Jul 17, 2024

bedevere-app bot added awaiting merge and removed awaiting review labels Jul 17, 2024

HarryLHW added 2 commits July 18, 2024 00:29

code style

2dbf214

compute_setobject_hash => frozenset_hash_impl, and comments

8e979d7

trim trailing whitespace

ce298a3

erlend-aasland approved these changes Jul 19, 2024

rhettinger added the DO-NOT-MERGE label Jul 21, 2024

serhiy-storchaka approved these changes Jul 21, 2024

rhettinger removed the DO-NOT-MERGE label Jul 22, 2024

rhettinger merged commit 2408a8a into python:main Jul 22, 2024

bedevere-app bot removed the awaiting merge label Jul 22, 2024

rhettinger merged 10 commits into python:main from HarryLHW:set-hash
