The Wayback Machine - https://web.archive.org/web/20251205112635/https://github.com/python/cpython/pull/31625
Conversation

@sweeneyde
Member

@sweeneyde sweeneyde commented Mar 1, 2022

@sweeneyde sweeneyde marked this pull request as draft March 1, 2022 02:04
@sweeneyde sweeneyde marked this pull request as ready for review March 1, 2022 03:13
@rumpelsepp
Contributor

I like this proposal better than mine: #31554.

@sweeneyde
Member Author

Just a couple of benchmarks on Windows:

from pyperf import Runner

runner = Runner()

# Each (haystack, needle) pair is kept as source text so it can be
# interpolated into both the benchmark name and the setup code.
for haystack, needle in [
    ("""b'x' * 100_000""", """b'y'"""),
    ("""b'x' * 100_000""", """b'yz'"""),
    ("""b'x' * 100_000""", """b'xy'"""),
    ("""b'x' * 100_000""", """b'yx'"""),
    ("""b'ab' * 100_000""", """b'abracadabra'"""),
    ("""b'a' * 10_000 + b'b' * 10_000 + b'a' * 10_000""",
     """b'a' * 10_001"""),
]:
    runner.timeit(
        f"{needle} in {haystack}",  # benchmark name
        setup=f"""\
haystack = {haystack}
needle = {needle}
import mmap
m = mmap.mmap(-1, len(haystack))  # anonymous mapping
m.write(haystack)
m.seek(0)  # find() starts at the current position by default
""",
        stmt="m.find(needle)",
    )

Slower (2):
- b'yx' in b'x' * 100_000: 101 us +- 2 us -> 141 us +- 1 us: 1.39x slower
- b'xy' in b'x' * 100_000: 126 us +- 1 us -> 134 us +- 1 us: 1.06x slower

Faster (4):
- b'a' * 10_001 in b'a' * 10_000 + b'b' * 10_000 + b'a' * 10_000: 15.9 ms +- 0.5 ms -> 71.2 us +- 0.7 us: 223.53x faster
- b'y' in b'x' * 100_000: 101 us +- 1 us -> 3.37 us +- 0.07 us: 30.02x faster
- b'yz' in b'x' * 100_000: 101 us +- 1 us -> 25.5 us +- 0.3 us: 3.95x faster
- b'abracadabra' in b'ab' * 100_000: 335 us +- 8 us -> 303 us +- 3 us: 1.11x faster

Geometric mean: 5.20x faster
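
The geometric mean summarizes the six per-benchmark ratios listed above. As a rough sketch of that arithmetic, the snippet below recomputes it from the rounded before/after means. The dictionary keys and the helper name `geometric_mean_speedup` are illustrative, and because the inputs are the rounded figures shown here, the result should only be expected to land close to the reported 5.20x.

```python
from math import exp, log

# (before, after) mean timings in microseconds, copied from the results above.
# 15.9 ms is written as 15_900 us so all entries share one unit.
timings_us = {
    "b'yx' in b'x' * 100_000": (101, 141),
    "b'xy' in b'x' * 100_000": (126, 134),
    "b'a' * 10_001 in a/b/a haystack": (15_900, 71.2),
    "b'y' in b'x' * 100_000": (101, 3.37),
    "b'yz' in b'x' * 100_000": (101, 25.5),
    "b'abracadabra' in b'ab' * 100_000": (335, 303),
}


def geometric_mean_speedup(pairs):
    """Geometric mean of before/after ratios (values above 1 mean faster)."""
    ratios = [before / after for before, after in pairs]
    return exp(sum(log(r) for r in ratios) / len(ratios))


speedup = geometric_mean_speedup(timings_us.values())
print(f"{speedup:.2f}x faster")
```

The two slower cases enter as ratios below 1, so they pull the mean down rather than being ignored, while the 223x case dominates the result.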

@sweeneyde
Member Author

No refleaks

0:00:00 [1/1] test_mmap
beginning 6 repetitions
123456
......

== Tests result: SUCCESS ==

1 test OK.
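
Alongside the refleak run, a quick behavioral check is to compare `mmap.find` against `bytes.find` on the same inputs as the benchmark. The sketch below is illustrative and not part of the PR's test suite. The helper `mmap_find` and the final "hit" case are additions for this example, since every benchmark input is a miss.

```python
import mmap

# Benchmark inputs from the script above, plus one added case that matches.
cases = [
    (b"x" * 100_000, b"y"),
    (b"x" * 100_000, b"yz"),
    (b"x" * 100_000, b"xy"),
    (b"x" * 100_000, b"yx"),
    (b"ab" * 100_000, b"abracadabra"),
    (b"a" * 10_000 + b"b" * 10_000 + b"a" * 10_000, b"a" * 10_001),
    (b"x" * 100_000 + b"y", b"xy"),  # added: the only case with a match
]


def mmap_find(haystack: bytes, needle: bytes) -> int:
    """Search an anonymous mmap holding `haystack`, as the benchmark does."""
    with mmap.mmap(-1, len(haystack)) as m:
        m.write(haystack)
        m.seek(0)  # write() moved the position to the end; search from 0
        return m.find(needle)


for haystack, needle in cases:
    # Both searches should report the same index (-1 when absent).
    assert mmap_find(haystack, needle) == haystack.find(needle)
```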

@sweeneyde sweeneyde added the performance Performance or resource usage label Mar 2, 2022
@sweeneyde sweeneyde merged commit 6ddb09f into python:main Mar 2, 2022
@sweeneyde sweeneyde deleted the mmap_fastsearch branch March 2, 2022 04:49