X Tutup
The Wayback Machine - https://web.archive.org/web/20221204095601/https://github.com/python/cpython/issues/87512
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[subprocess] run() sometimes ignores timeout in Windows #87512

Open
eryksun opened this issue Feb 28, 2021 · 2 comments
Open

[subprocess] run() sometimes ignores timeout in Windows #87512

eryksun opened this issue Feb 28, 2021 · 2 comments
Labels
3.10 3.11 3.12 OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@eryksun
Copy link
Contributor

eryksun commented Feb 28, 2021

BPO 43346
Nosy @pfmoore, @ericvsmith, @tjguk, @zware, @eryksun, @zooba

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-02-28.22:54:01.059>
labels = ['type-bug', '3.8', '3.9', '3.10', '3.11', 'library', 'OS-windows']
title = 'subprocess.run() sometimes ignores timeout in Windows'
updated_at = <Date 2021-06-28.18:54:15.735>
user = 'https://github.com/eryksun'

bugs.python.org fields:

activity = <Date 2021-06-28.18:54:15.735>
actor = 'zach.ware'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)', 'Windows']
creation = <Date 2021-02-28.22:54:01.059>
creator = 'eryksun'
dependencies = []
files = []
hgrepos = []
issue_num = 43346
keywords = []
message_count = 2.0
messages = ['387822', '387823']
nosy_count = 7.0
nosy_names = ['paul.moore', 'eric.smith', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'bugale bugale']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue43346'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']

@eryksun
Copy link
Contributor Author

eryksun commented Feb 28, 2021

subprocess.run() handles TimeoutExpired by terminating the process and waiting on it. On POSIX, the exception object contains the partially read stdout and stderr bytes. For example:

    cmd = 'echo spam; echo eggs >&2; sleep 2'
    try: p = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, timeout=1)
    except subprocess.TimeoutExpired as e: ex = e
     
    >>> ex.stdout, ex.stderr
    (b'spam\n', b'eggs\n')

On Windows, subprocess.run() has to finish reading output with a second communicate() call, after which it manually sets the exception's stdout and stderr attributes.

The poses the problem that the second communicate() call may block indefinitely, even though the child process has terminated.

The primary issue is that the pipe handles may be inherited by one or more descendant processes (e.g. via shell=True), which are all regarded as potential writers that keep the pipe from closing. Reading from an open pipe that's empty will block until data becomes available. This is generally desirable for efficiency, compared to polling in a loop. But in this case, the downside is that run() in Windows will effectively ignore the given timeout.

Another problem is that _communicate() writes the input to stdin on the calling thread with a single write() call. If the input exceeds the pipe capacity (4 KiB by default -- but a pipesize 'suggested' size could be supported), the write will block until the child process reads the excess data. This could block indefinitely, which will effectively ignore a given timeout. The POSIX implementation, in contrast, correctly handles a timeout in this case.

Also, Popen.__exit__() closes the stdout, stderr, and stdin files without regard to the _communicate() worker threads. This may seem innocuous, but if a worker thread is blocked on synchronous I/O with one of these files, WinAPI CloseHandle() will also block if it's closing the last handle for the file in the current process. (In this case, the kernel I/O manager has a close procedure that waits to acquire the file for the current thread before performing various housekeeping operations, primarily in the filesystem, such as clearing byte-range locks set by the current process.) A blocked close() is easy to demonstrate. For example:

    args = 'python -c "import time; time.sleep(99)"'
    p = subprocess.Popen(args, shell=True, stdout=subprocess.PIPE)
    try: p.communicate(timeout=1)
    except: pass

    p.kill() # terminates the shell process -- not python.exe
    with p: pass # stdout.close() blocks until python.exe exits

The Windows implementation of Popen._communicate() could be redesigned as follows:

* read in chunks, with a size from 1 byte up to the maximum available,
  as determined by `_winapi.PeekNamedPipe()`
* write to the child's `stdin` on a separate thread
* after `communicate()` has started, ensure that synchronous I/O in worker
  threads has been canceled via `CancelSynchronousIo()` before closing 
  the pipes.

The _winapi module would need to wrap OpenThread() and CancelSynchronousIo(), plus define the TERMINATE_THREAD (0x0001) access right.

With the proposed changes, subprocess.run() would no longer special case TimeoutExpired on Windows.

@eryksun eryksun added 3.10 3.8 3.9 stdlib Python modules in the Lib dir OS-windows type-bug An unexpected behavior, bug, or error labels Feb 28, 2021
@eryksun
Copy link
Contributor Author

eryksun commented Feb 28, 2021

Demo Popen() methods, for discussion:

    def _read_output(self, fileobj):
        handle = msvcrt.get_osfhandle(fileobj.fileno())
        output = self._fileobj2output[fileobj]
        
        while True:
            try:
                size = _winapi.PeekNamedPipe(handle)[0] or 1
                data = _winapi.ReadFile(handle, size)[0]
            except BrokenPipeError:
                break
            except OSError as e:
                if e.winerror == _winapi.ERROR_OPERATION_ABORTED:
                    # Should this be mapped to InterruptedError
                    # (EINTR) in PC/errmap.h?
                    break
                raise
            output.append(data)

        fileobj.close()


    def _communicate(self, input, endtime, orig_timeout):
        if not self._communication_started:
            self._fileobj2thread = {}
            self._fileobj2output = {}
            if self.stdout:
                self._fileobj2output[self.stdout] = []
            if self.stderr:
                self._fileobj2output[self.stderr] = []

        stdout = self._fileobj2output.get(self.stdout)
        stderr = self._fileobj2output.get(self.stderr)

        thread_list = []
        for fileobj in (self.stdin, self.stdout, self.stderr):
            if fileobj is None:
                continue
            if fileobj in self._fileobj2thread:
                thread = self._fileobj2thread[fileobj]
            else:
                if fileobj == self.stdin:
                    target, args = self._stdin_write, (input,)
                else:
                    target, args = self._read_output, (fileobj,)
                thread = threading.Thread(target=target, args=args,
                            daemon=True)
                thread.start()
                self._fileobj2thread[fileobj] = thread
            thread_list.append(thread)

        for thread in thread_list:
            thread.join(self._remaining_time(endtime))
            if thread.is_alive():
                self._check_timeout(endtime, orig_timeout,
                    stdout, stderr, skip_check_and_raise=True)

        # Join partial reads.
        if stdout is not None:
            stdout = b''.join(stdout)
        if stderr is not None:
            stderr = b''.join(stderr)

        if self.text_mode:
            if stdout is not None:
                stdout = self._translate_newlines(stdout,
                    self.stdout.encoding, self.stdout.errors)
            if stderr is not None:
                stderr = self._translate_newlines(stderr,
                    self.stderr.encoding, self.stderr.errors)

        return (stdout, stderr)


    def _cancel_io(self):
        if not hasattr(self, '_fileobj2thread'):
            return
        for fileobj in (self.stdin, self.stdout, self.stderr):
            thread = self._fileobj2thread.get(fileobj)
            if thread is None or not thread.is_alive():
                continue
            try:
                handle = _winapi.OpenThread(
                    _winapi.TERMINATE_THREAD, False, thread.ident)
            except OSError:
                pass
            else:
                try:
                    try:
                        _winapi.CancelSynchronousIo(handle)
                    except OSError:
                        pass
                finally:
                    _winapi.CloseHandle(handle)


    def __exit__(self, exc_type, value, traceback):
        if _mswindows:
            self._cancel_io()
        # rest unchanged...

@zware zware added the 3.11 label Jun 28, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@eryksun eryksun changed the title subprocess.run() sometimes ignores timeout in Windows [subprocess] run() sometimes ignores timeout in Windows Aug 8, 2022
@gpshead gpshead added 3.12 and removed 3.9 3.8 labels Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 3.11 3.12 OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants
X Tutup