X Tutup
The Wayback Machine - https://web.archive.org/web/20250529150105/https://github.com/python/cpython/issues/131728
Skip to content

SelectorEventLoop sock_connect can return before cleaning up if sock.connect encounters BlockingIOError or InterruptedError on cancellation #131728

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bdraco opened this issue Mar 25, 2025 · 11 comments
Labels
stdlib Python modules in the Lib dir topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@bdraco
Copy link
Contributor

bdraco commented Mar 25, 2025

Bug report

Bug description:

Originally reported in aio-libs/aiohttp#10617 (comment)

To make this happen:

Call loop.sock_connect and cancel it before the socket is connected.

Because _sock_write_done is called via add_done_callback it gets called in the next iteration of the event loop on cancellation.

fut.add_done_callback(

sock_connect will return before _sock_write_done is called which means the writer has not been removed yet and can get reused.

Additionally, this makes the close in

sock.close()
unsafe because on cancellation the socket gets closed before the writer is removed so the fd can get reused

CPython versions tested on:

3.10 but the same code exists for 3.13+

Operating systems tested on:

Linux

@bdraco bdraco added the type-bug An unexpected behavior, bug, or error label Mar 25, 2025
@bdraco bdraco changed the title SelectorEventLoop sock_connect can return without cleaning up if sock.connect encounters BlockingIOError or InterruptedError on cancellation SelectorEventLoop sock_connect can return before cleaning up if sock.connect encounters BlockingIOError or InterruptedError on cancellation Mar 25, 2025
@github-project-automation github-project-automation bot moved this to Todo in asyncio Mar 25, 2025
bdraco added a commit to aio-libs/aiohappyeyeballs that referenced this issue Mar 26, 2025
bdraco added a commit to aio-libs/aiohappyeyeballs that referenced this issue Mar 26, 2025
@bdraco
Copy link
Contributor Author

bdraco commented Mar 27, 2025

I can't make a reproducer for this.

I tried

async def socket_runner(port: int) -> None:
    loop = asyncio.get_running_loop()
    while True:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        task = asyncio.current_task()
        loop.call_later(0.1, task.cancel)
        try:
            await loop.sock_connect(sock, ("1.1.1.1", port))
        except asyncio.CancelledError:
            sock.close()

But the callback always fires in time.

@bdraco
Copy link
Contributor Author

bdraco commented Mar 27, 2025

I thinkthe problem is actually happening in _create_connection_transport

transport = self._make_socket_transport(sock, protocol, waiter)
only loop.sock_connect that ends up losing the race

@bdraco
Copy link
Contributor Author

bdraco commented Mar 27, 2025

So if create_connection get cancelled, _create_connection_transport has already created the transport and protocol. The waiter gets cancelled at

which calls transport.close but the call_soon in
self._loop.call_soon(self._add_reader,
hasn't run yet. So in the race case, create_connection is done and the socket gets closed in the downstream code, than the call_soon to add the reader runs after

@bdraco
Copy link
Contributor Author

bdraco commented Mar 27, 2025

I'll keep working on a reproducer so this issue is actually actionable.

@bdraco
Copy link
Contributor Author

bdraco commented Mar 27, 2025

I can't come up with a working reproducer for that either. I'm going to need more information for the original reporter.

I'll close this for now and reopen if I can get enough information to make a reproducer.

Sorry for the noise.

@bdraco bdraco closed this as not planned Won't fix, can't repro, duplicate, stale Mar 27, 2025
@github-project-automation github-project-automation bot moved this from Todo to Done in asyncio Mar 27, 2025
@rajeevn1
Copy link

home-assistant/core#141855

related

@bdraco bdraco reopened this Mar 31, 2025
@github-project-automation github-project-automation bot moved this from Done to In Progress in asyncio Mar 31, 2025
@bdraco
Copy link
Contributor Author

bdraco commented Mar 31, 2025

Reproducer that works on 3.12 and 3.13

import asyncio
import socket
from contextlib import suppress


async def socket_runner() -> None:
    loop = asyncio.get_running_loop()
    while True:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        task = asyncio.create_task(
            loop.create_connection(asyncio.Protocol, None, None, sock=sock)
        )
        for _ in range(10):
            await asyncio.sleep(0)
        task.cancel()
        with suppress(asyncio.CancelledError):
            await task
        sock.close()


asyncio.run(socket_runner())

Will produce

RuntimeError: File descriptor <socket.socket fd=6, family=2, type=1, proto=0, laddr=('0.0.0.0', 0)> is used by transport <_SelectorSocketTransport fd=6 read=polling write=<idle, bufsize=0>>

After a few runs

@picnixz picnixz added the stdlib Python modules in the Lib dir label Mar 31, 2025
@bdraco
Copy link
Contributor Author

bdraco commented Mar 31, 2025

If I take the close it looks like the socket leaks even after gc because something is holding a strong ref. Not that its a concern for cpython, but if I switch to uvloop, the exception doesn't happen and it just leaks instead.

@bdraco
Copy link
Contributor Author

bdraco commented Mar 31, 2025

Finished tracing it and its still not a good reproducer.

@bdraco bdraco closed this as not planned Won't fix, can't repro, duplicate, stale Mar 31, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in asyncio Mar 31, 2025
@graingert
Copy link
Contributor

Can you see what refcycle says about why the socket is still alive? It should also be closed explicitly to avoid a ResourceWarning

@bdraco
Copy link
Contributor Author

bdraco commented Apr 1, 2025

Still looking into this one, if I come up with something useful, I will reopen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
Status: Done
Development

No branches or pull requests

5 participants
X Tutup