
More fork safety #7380

Merged
youknowone merged 9 commits into RustPython:main from youknowone:implock
Mar 8, 2026

Conversation

@youknowone
Member

@youknowone youknowone commented Mar 7, 2026

Summary by CodeRabbit

  • New Features

  • Thread API: ThreadHandle adds ident, is_done, _set_done, and join(timeout); thread-start APIs accept a consolidated argument form.
    • Mutexes: lock wait points can run a user-provided wrapper while waiting.
  • Bug Fixes

    • Reduced spurious signal handling and tightened stop-the-world/shutdown synchronization to avoid lost wakeups and races.
    • Improved fork-time warnings when forking from multi-threaded processes.
  • Performance

    • Blocking I/O and OS waits now release the interpreter lock for better concurrency.
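The improved fork-time warning above depends on knowing whether the forking process currently has more than one OS thread. A minimal sketch of the Linux side of such a heuristic; the helper names are hypothetical, and the macOS/fallback paths are omitted:

```rust
use std::fs;

// Hypothetical helper: parse the "Threads:" field of /proc/self/status.
fn parse_thread_count(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("Threads:"))
        .and_then(|rest| rest.trim().parse().ok())
}

// Linux path of the heuristic; other platforms would query the kernel
// differently and fall back to None here.
fn os_thread_count() -> Option<u64> {
    fs::read_to_string("/proc/self/status")
        .ok()
        .and_then(|s| parse_thread_count(&s))
}

fn main() {
    // A synthetic status snippet keeps the example portable.
    let sample = "Name:\tpython\nThreads:\t3\nPid:\t42\n";
    assert_eq!(parse_thread_count(sample), Some(3));

    // On Linux, warn before fork() when more than one OS thread exists.
    if let Some(n) = os_thread_count() {
        if n > 1 {
            eprintln!("warning: fork() called from a process with {n} OS threads");
        }
    }
}
```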

@coderabbitai
Contributor

coderabbitai bot commented Mar 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a lock_wrapped API and uses it with vm.allow_threads to release the VM/GIL during blocking waits; refactors stop-the-world state to world_stopped plus a thread_countdown, adds per-thread stop_requested, tightens signal-exception gating, and overhauls thread start/join/shutdown lifecycles.

Changes

Cohort / File(s) / Summary

  • Lock wrapping (crates/common/src/lock/thread_mutex.rs)
    Added lock_wrapped to RawThreadMutex and ThreadMutex to run a user-provided wrapper (e.g., vm.allow_threads) around the blocking lock operation.
  • GIL release for OS waits & multiplexing (crates/stdlib/src/multiprocessing.rs, crates/stdlib/src/select.rs)
    Blocking OS calls (semaphores, select, poll, epoll) now execute inside vm.allow_threads closures so the VM/GIL is released during OS-level waits.
  • I/O locking and blocking behavior (crates/vm/src/stdlib/io.rs)
    Replaced direct mutex locks with lock_wrapped calls so the VM/GIL is released while blocking on I/O locks.
  • Signal handling (crates/vm/src/frame.rs, crates/vm/src/signal.rs)
    Signal-exception handling in ExecutingFrame::run is now gated by eval_breaker_tripped(); added pub(crate) fn is_triggered() and updated the signal-exception handler signature.
  • Stop-the-world redesign (crates/vm/src/vm/mod.rs, crates/vm/src/vm/thread.rs)
    Replaced detach-yields with world_stopped: AtomicBool and thread_countdown: AtomicI64; added a per-thread stop_requested bit, countdown helpers, requester/notify helpers, and simplified attach/detach/suspend/resume flows; changed the detach_thread signature.
  • Thread API & lifecycle overhaul (crates/vm/src/stdlib/thread.rs)
    Refactored start_new_thread/start_joinable_thread to accept FuncArgs, added strict argument validation and audit hook calls, added ThreadHandle methods (ident, is_done, _set_done, join) and join helpers; reworked shutdown and join semantics.
  • Posix fork/thread counting (crates/vm/src/stdlib/posix.rs)
    Added OS-thread-count heuristics (macOS/Linux/fallback) and refactored fork warning logic to use OS data when available and warn after the parent resumes.
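The stop-the-world redesign described above (a world_stopped flag plus a countdown of attached threads that still have to suspend) can be sketched roughly as follows. All names and the condvar-based wakeup are illustrative stand-ins, not the actual RustPython code:

```rust
use std::sync::atomic::{AtomicBool, AtomicI64, Ordering};
use std::sync::{Condvar, Mutex};

// Illustrative stand-in for the VM's stop-the-world state.
struct StopTheWorld {
    world_stopped: AtomicBool,
    thread_countdown: AtomicI64, // attached threads that still must suspend
    lock: Mutex<()>,
    cv: Condvar,
}

impl StopTheWorld {
    fn new(attached_threads: i64) -> Self {
        Self {
            world_stopped: AtomicBool::new(false),
            thread_countdown: AtomicI64::new(attached_threads),
            lock: Mutex::new(()),
            cv: Condvar::new(),
        }
    }

    // Requester: wait until every other attached thread has parked.
    fn stop_the_world(&self) {
        let mut guard = self.lock.lock().unwrap();
        while self.thread_countdown.load(Ordering::Acquire) > 0 {
            guard = self.cv.wait(guard).unwrap();
        }
        self.world_stopped.store(true, Ordering::Release);
    }

    // Each suspending (or detaching/exiting) thread counts itself down
    // and wakes the requester when the countdown reaches zero.
    fn notify_thread_parked(&self) {
        if self.thread_countdown.fetch_sub(1, Ordering::AcqRel) == 1 {
            let _guard = self.lock.lock().unwrap();
            self.cv.notify_all();
        }
    }
}

fn main() {
    let stw = std::sync::Arc::new(StopTheWorld::new(2));
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let stw = stw.clone();
            std::thread::spawn(move || stw.notify_thread_parked())
        })
        .collect();
    stw.stop_the_world();
    for h in handles {
        h.join().unwrap();
    }
    assert!(stw.world_stopped.load(Ordering::Acquire));
}
```

The while loop guards against spurious condvar wakeups, and the notifier takes the mutex before notifying so the requester cannot miss a wakeup between its countdown load and its wait.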

Sequence Diagram(s)

sequenceDiagram
    participant Py as Python code (caller)
    participant VM as VirtualMachine
    participant Mutex as ThreadMutex/RawThreadMutex
    participant OS as OS blocking syscall

    Py->>VM: request lock_wrapped(wrap_fn)
    VM->>Mutex: lock_wrapped(wrap_fn)
    Mutex->>VM: invoke wrap_fn(do_lock)
    VM->>OS: vm.allow_threads(do_lock)  -- releases VM/GIL
    OS-->>Mutex: blocking wait returns
    Mutex-->>VM: set owner and return
    VM-->>Py: return guard / success
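The diagrammed flow can be sketched as a simplified lock_wrapped helper. The types below are stand-ins for RawThreadMutex/ThreadMutex, the signature is invented for illustration, and the wrapper closure plays the role of vm.allow_threads:

```rust
use std::sync::{Mutex, MutexGuard};

// Simplified stand-in for ThreadMutex: the caller-supplied wrapper runs
// *around* the blocking lock call, so it can release the GIL (or do any
// other bookkeeping) for the duration of the wait.
struct WrappableMutex<T> {
    inner: Mutex<T>,
}

impl<T> WrappableMutex<T> {
    fn new(value: T) -> Self {
        Self { inner: Mutex::new(value) }
    }

    // The wrapper receives the blocking acquire as a closure and decides
    // how to run it (RustPython would pass vm.allow_threads here).
    fn lock_wrapped<'a>(
        &'a self,
        wrap: impl FnOnce(Box<dyn FnOnce() -> MutexGuard<'a, T> + 'a>) -> MutexGuard<'a, T>,
    ) -> MutexGuard<'a, T> {
        wrap(Box::new(move || self.inner.lock().unwrap()))
    }
}

fn main() {
    let m = WrappableMutex::new(0);
    let mut wrapper_ran = false;
    {
        let mut guard = m.lock_wrapped(|do_lock| {
            wrapper_ran = true; // e.g. release the GIL here
            do_lock()           // blocking acquire runs inside the wrapper
        });
        *guard += 1;
    }
    assert!(wrapper_ran);
    assert_eq!(*m.inner.lock().unwrap(), 1);
}
```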

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~110 minutes


Suggested reviewers

  • ShaharNaveh
  • fanninpm

Poem

🐇
I nudged the lock and let it sleep,
While other threads could dance and leap.
A breaker hums, a countdown peeps,
I boxed the wait and freed the heap,
Hooray — concurrency’s tiny beep!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check: ❓ Inconclusive. The title 'More fork safety' is vague and generic and doesn't clearly convey the specific changes in this pull request. Resolution: use a more descriptive title that captures the main focus, such as 'Add lock_wrapped for GIL-safe mutex operations' or 'Improve thread safety in fork and stop-the-world logic'.

✅ Passed checks (2 passed)

  • Description check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required 80.00% threshold.



@youknowone youknowone force-pushed the implock branch 3 times, most recently from db453fc to b29c3b8 (March 7, 2026 15:36)
@github-actions
Contributor

github-actions bot commented Mar 7, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] lib: cpython/Lib/asyncio
[ ] test: cpython/Lib/test/test_asyncio (TODO: 37)

dependencies:

  • asyncio (native: _asyncio, _overlapped, _pyrepl.console, _pyrepl.main, _pyrepl.simple_interact, _remote_debugging, _winapi, asyncio.tools, base_events, collections.abc, concurrent.futures, coroutines, errno, events, exceptions, futures, graph, itertools, locks, log, math, msvcrt, protocols, queues, readline, runners, streams, sys, taskgroups, tasks, threads, time, timeouts, transports, unix_events, windows_events)
    • collections (native: _collections, _weakref, itertools, sys)
    • logging (native: atexit, collections.abc, email.message, email.utils, errno, http.client, logging.handlers, multiprocessing.queues, select, sys, time, urllib.parse, win32evtlog, win32evtlogutil)
    • site (native: _io, _pyrepl.main, _pyrepl.pager, _pyrepl.readline, _pyrepl.unix_console, _pyrepl.windows_console, atexit, builtins, errno, readline, sitecustomize, sys, usercustomize)
    • tokenize (native: _tokenize, builtins, itertools, sys)
    • _colorize, argparse, ast, contextlib, contextvars, dataclasses, enum, functools, heapq, inspect, io, linecache, os, reprlib, rlcompleter, selectors, signal, socket, ssl, stat, struct, subprocess, tempfile, threading, traceback, types, warnings, weakref

dependent tests: (7 tests)

  • asyncio: test_asyncio test_contextlib_async test_inspect test_logging test_os test_sys_settrace test_unittest

[x] lib: cpython/Lib/threading.py
[x] lib: cpython/Lib/_threading_local.py
[ ] test: cpython/Lib/test/test_threading.py (TODO: 16)
[ ] test: cpython/Lib/test/test_threadedtempfile.py
[ ] test: cpython/Lib/test/test_threading_local.py (TODO: 3)

dependencies:

  • threading

dependent tests: (148 tests)

  • threading: test_android test_asyncio test_bz2 test_code test_concurrent_futures test_contextlib test_ctypes test_decimal test_docxmlrpc test_email test_enum test_fork1 test_frame test_ftplib test_functools test_gc test_hashlib test_httplib test_httpservers test_imaplib test_importlib test_inspect test_io test_itertools test_largefile test_linecache test_logging test_memoryview test_opcache test_pathlib test_poll test_queue test_robotparser test_sched test_signal test_smtplib test_socket test_socketserver test_sqlite3 test_ssl test_subprocess test_super test_sys test_syslog test_termios test_threadedtempfile test_threading test_threading_local test_time test_urllib2_localnet test_weakref test_winreg test_wsgiref test_xmlrpc test_zstd
    • asyncio: test_asyncio test_contextlib_async test_os test_sys_settrace test_unittest
    • bdb: test_bdb
    • concurrent.futures._base: test_concurrent_futures
    • concurrent.futures.process: test_compileall test_concurrent_futures
    • concurrent.futures.thread: test_genericalias
    • dummy_threading: test_dummy_threading
    • http.cookiejar: test_http_cookiejar test_urllib2
      • urllib.request: test_pathlib test_pydoc test_sax test_site test_urllib test_urllib2net test_urllibnet
    • importlib.util: test_ctypes test_doctest test_importlib test_pkgutil test_py_compile test_reprlib test_runpy test_zipfile test_zipimport
      • py_compile: test_argparse test_cmd_line_script test_importlib test_multiprocessing_main_handling
      • pyclbr: test_pyclbr
      • sysconfig: test_c_locale_coercion test_dtrace test_launcher test_osx_env test_posix test_pyexpat test_regrtest test_support test_sysconfig test_tools test_venv
      • zipfile: test_shutil test_zipapp test_zipfile test_zipfile64
    • logging: test_unittest
      • hashlib: test_hmac test_tarfile test_unicodedata
    • multiprocessing: test_fcntl test_re
    • queue: test_dummy_thread
    • subprocess: test_atexit test_audit test_cmd_line test_ctypes test_faulthandler test_file_eintr test_gzip test_json test_msvcrt test_ntpath test_platform test_plistlib test_quopri test_repl test_script_helper test_select test_tempfile test_traceback test_unittest test_utf8_mode test_wait3 test_webbrowser
      • ctypes.util: test_ctypes
      • ensurepip: test_ensurepip
      • platform: test__locale test__osx_support test_baseexception test_builtin test_cmath test_math test_mimetypes
    • sysconfig:
      • trace: test_trace
    • zipfile:
      • shutil: test_filecmp test_glob test_string_literals test_unicode_file test_zoneinfo

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone marked this pull request as ready for review March 7, 2026 16:28
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/vm/thread.rs (1)

187-209: ⚠️ Potential issue | 🔴 Critical

Detaching a requested thread needs to wake the stop-the-world requester.

When park_detached_threads() marks an ATTACHED thread with stop_requested = true, a subsequent call to detach_thread() (via allow_threads()) can transition it to THREAD_DETACHED without notifying the requester. This leaves the thread countdown outstanding. The blocked stop_the_world() loop will only advance when either the countdown reaches zero or park_detached_threads() is called again to rescan the detached thread, causing unnecessary stalling.
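A sketch of the shape of the requested fix follows; the state constants, slot layout, and helper names are illustrative, not the actual thread.rs code. The point is that a thread detaching after being marked stop_requested must perform the same countdown/notify step a suspending thread would:

```rust
use std::sync::atomic::{AtomicBool, AtomicI64, AtomicU8, Ordering};

const THREAD_ATTACHED: u8 = 0;
const THREAD_DETACHED: u8 = 1;

// Illustrative per-thread slot and global stop-the-world countdown.
struct ThreadSlot {
    state: AtomicU8,
    stop_requested: AtomicBool,
}

struct StopTheWorld {
    thread_countdown: AtomicI64,
}

impl StopTheWorld {
    fn notify_thread_parked(&self) {
        // Real code would also wake the blocked requester here.
        self.thread_countdown.fetch_sub(1, Ordering::AcqRel);
    }
}

fn detach_thread(slot: &ThreadSlot, stw: &StopTheWorld) {
    if slot
        .state
        .compare_exchange(
            THREAD_ATTACHED,
            THREAD_DETACHED,
            Ordering::AcqRel,
            Ordering::Acquire,
        )
        .is_ok()
    {
        // The fix: if a stop-the-world request already counted this thread,
        // detaching must advance the requester's countdown; otherwise the
        // requester stalls until the next park_detached_threads() rescan.
        if slot.stop_requested.swap(false, Ordering::AcqRel) {
            stw.notify_thread_parked();
        }
    }
}

fn main() {
    let slot = ThreadSlot {
        state: AtomicU8::new(THREAD_ATTACHED),
        stop_requested: AtomicBool::new(true), // set by park_detached_threads()
    };
    let stw = StopTheWorld {
        thread_countdown: AtomicI64::new(1),
    };
    detach_thread(&slot, &stw);
    assert_eq!(stw.thread_countdown.load(Ordering::Acquire), 0);
}
```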

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/vm/thread.rs` around lines 187 - 209, detaching an ATTACHED
thread can race with a prior park_detached_threads() that set
stop_requested=true and so detach_thread() must notify the stop-the-world
requester; update detach_thread() (inside CURRENT_THREAD_SLOT handling when
compare_exchange succeeds going from THREAD_ATTACHED to THREAD_DETACHED) to
check the thread's stop_requested flag and, if true, call the same notification
used by park_detached_threads()/stop_the_world (i.e., invoke the stop-the-world
wake/notify function used elsewhere) so the stop_the_world() countdown/loop is
advanced when a requested thread is detached.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/stdlib/io.rs`:
- Line 5339: The code currently uses std::io::Read::read_to_end(), which hides
EINTR and prevents vm.check_signals() from running on the unbounded-read path
(size < 0); replace that call with an explicit chunked loop that uses
crt_fd::read() to fill a Vec<u8> (read fixed-size chunks, e.g., 8KiB), append
each successful chunk to the buffer, break on EOF (0 bytes), and on read errors
return Err so EINTR is propagated to the outer error handler; make sure to
reference and replace the read_to_end() call site and keep vm.check_signals()
semantics by letting crt_fd::read() surface EINTR instead of swallowing it.

In `@crates/vm/src/vm/mod.rs`:
- Around line 334-343: The fast-return path when initial_countdown == 0 leaves
stats_last_wait_ns unchanged; update the code in the branch (inside the block
after self.world_stopped.store(...) and before returning) to reset the VM's
last-wait statistic (the atomic backing field used by
stats_last_wait_ns/stats_snapshot) to zero using the same atomic semantics used
elsewhere (e.g., an atomic store with appropriate Ordering), so stats_snapshot()
does not report the previous stop's wait time after a no-wait stop; locate the
change around init_thread_countdown and world_stopped in stop() in vm::mod.rs.
- Around line 1931-1949: The fast-path in eval_breaker_tripped() performs a
Relaxed load of state.finalizing which can bypass the later Acquire load in
check_signals(), breaking the Release→Acquire synchronization from the shutdown
write; change the atomic load on state.finalizing in eval_breaker_tripped() to
Ordering::Acquire (or otherwise ensure an Acquire-load or appropriate fence is
executed when checking finalizing) so that the shutdown write in interpreter.rs
(Release) becomes visible on other threads; references: eval_breaker_tripped,
state.finalizing, check_signals, and the shutdown write in interpreter.rs.

In `@crates/vm/src/vm/thread.rs`:
- Around line 426-462: The cleanup_current_thread_frames path can call
vm.state.stop_the_world.notify_thread_gone() for threads that were
THREAD_DETACHED before the stop-the-world began; only decrement the requester
countdown for threads that were counted by init_thread_countdown. Update the
final conditional in cleanup_current_thread_frames to also ensure the removed
slot's state is not THREAD_DETACHED (e.g. add slot.state.load(...) !=
THREAD_DETACHED alongside the existing THREAD_SUSPENDED check) before calling
stop_the_world.notify_thread_gone(); reference symbols:
cleanup_current_thread_frames, vm.state.stop_the_world.notify_thread_gone,
init_thread_countdown, THREAD_DETACHED, THREAD_SUSPENDED, slot.state.
- Around line 247-252: Reads of the stop-request bit in suspend_if_needed and
stop_requested_for_current_thread currently use Ordering::Relaxed; change those
.load(Ordering::Relaxed) calls on the stop_requested AtomicBool to
.load(Ordering::Acquire) so they synchronize with park_detached_threads() which
publishes the bit with Ordering::Release (ensuring eval_breaker_tripped() sees
the update and threads suspend correctly).

---

Outside diff comments:
In `@crates/vm/src/vm/thread.rs`:
- Around line 187-209: detaching an ATTACHED thread can race with a prior
park_detached_threads() that set stop_requested=true and so detach_thread() must
notify the stop-the-world requester; update detach_thread() (inside
CURRENT_THREAD_SLOT handling when compare_exchange succeeds going from
THREAD_ATTACHED to THREAD_DETACHED) to check the thread's stop_requested flag
and, if true, call the same notification used by
park_detached_threads()/stop_the_world (i.e., invoke the stop-the-world
wake/notify function used elsewhere) so the stop_the_world() countdown/loop is
advanced when a requested thread is detached.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 9daf3542-35f7-4fa9-8ecd-777f0f7c9a5d

📥 Commits

Reviewing files that changed from the base of the PR and between f680f8a and b29c3b8.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_asyncio/test_events.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
📒 Files selected for processing (9)
  • crates/common/src/lock/thread_mutex.rs
  • crates/stdlib/src/multiprocessing.rs
  • crates/stdlib/src/select.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/signal.rs
  • crates/vm/src/stdlib/io.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs

};
use crossbeam_utils::atomic::AtomicCell;
use std::io::{Read, Write};
use std::io::Read;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Repository: RustPython/RustPython

🏁 Scripts executed (outputs collapsed):

# Check the file and the lines cited in the review
wc -l crates/vm/src/stdlib/io.rs
sed -n '5335,5345p' crates/vm/src/stdlib/io.rs
sed -n '5760,5800p' crates/vm/src/stdlib/io.rs

# Locate crt_fd and read_to_end usage
rg "crt_fd" crates/vm/src/stdlib/io.rs | head -20
rg "read_to_end" crates/vm/src/stdlib/io.rs
rg "crt_fd::read" crates/vm/src/stdlib/io.rs
rg -B5 -A5 "crt_fd::read\(handle" crates/vm/src/stdlib/io.rs
rg "DEFAULT_BUFFER_SIZE" crates/vm/src/stdlib/io.rs | head -5

# Inspect the crt_fd module itself
fd "crt_fd" crates/
cat crates/common/src/crt_fd.rs

Replace read_to_end() with explicit crt_fd::read() loop to preserve signal checking on EINTR.

The current code uses std::io::Read::read_to_end(), which internally retries on EINTR, preventing the outer error handler from triggering vm.check_signals(). This breaks the unbounded-read path (size < 0) signal handling.

Use a chunked read loop with crt_fd::read() instead, which surfaces EINTR to the error handler:

Replace read_to_end() with chunked crt_fd::read() loop
-                let mut bytes = vec![];
-                // Loop on EINTR (PEP 475)
-                loop {
-                    match vm.allow_threads(|| {
-                        let mut h = handle;
-                        h.read_to_end(&mut bytes)
-                    }) {
-                        Ok(_) => break,
-                        Err(e) if e.raw_os_error() == Some(libc::EINTR) => {
-                            vm.check_signals()?;
-                            continue;
-                        }
-                        // Non-blocking mode: return None if EAGAIN (only if no data read yet)
-                        Err(e) if e.raw_os_error() == Some(libc::EAGAIN) => {
-                            if bytes.is_empty() {
-                                return Ok(None);
-                            }
-                            break;
-                        }
-                        Err(e) => return Err(Self::io_error(zelf, e, vm)),
-                    }
-                }
+                let mut bytes = vec![];
+                loop {
+                    let mut chunk = vec![0; super::DEFAULT_BUFFER_SIZE];
+                    match vm.allow_threads(|| crt_fd::read(handle, &mut chunk)) {
+                        Ok(0) => break,
+                        Ok(n) => bytes.extend_from_slice(&chunk[..n]),
+                        Err(e) if e.raw_os_error() == Some(libc::EINTR) => {
+                            vm.check_signals()?;
+                            continue;
+                        }
+                        Err(e) if e.raw_os_error() == Some(libc::EAGAIN) => {
+                            if bytes.is_empty() {
+                                return Ok(None);
+                            }
+                            break;
+                        }
+                        Err(e) => return Err(Self::io_error(zelf, e, vm)),
+                    }
+                }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/io.rs` at line 5339, The code currently uses
std::io::Read::read_to_end(), which hides EINTR and prevents vm.check_signals()
from running on the unbounded-read path (size < 0); replace that call with an
explicit chunked loop that uses crt_fd::read() to fill a Vec<u8> (read
fixed-size chunks, e.g., 8KiB), append each successful chunk to the buffer,
break on EOF (0 bytes), and on read errors return Err so EINTR is propagated to
the outer error handler; make sure to reference and replace the read_to_end()
call site and keep vm.check_signals() semantics by letting crt_fd::read()
surface EINTR instead of swallowing it.

Comment on lines +334 to +343
let initial_countdown = self.init_thread_countdown(vm);
stw_trace(format_args!("stop begin requester={requester_ident}"));
if initial_countdown == 0 {
self.world_stopped.store(true, Ordering::Release);
#[cfg(debug_assertions)]
self.debug_assert_all_non_requester_suspended(vm);
stw_trace(format_args!(
"stop end requester={requester_ident} wait_ns=0 polls=0"
));
return;

⚠️ Potential issue | 🟡 Minor

Reset last_wait_ns on the zero-thread fast path.

When Lines 336-343 return early, stats_last_wait_ns is left unchanged, so stats_snapshot() can report the previous stop's wait time after a no-wait stop.

Proposed fix
         let initial_countdown = self.init_thread_countdown(vm);
         stw_trace(format_args!("stop begin requester={requester_ident}"));
         if initial_countdown == 0 {
+            self.stats_last_wait_ns.store(0, Ordering::Relaxed);
             self.world_stopped.store(true, Ordering::Release);
             #[cfg(debug_assertions)]
             self.debug_assert_all_non_requester_suspended(vm);
             stw_trace(format_args!(
                 "stop end requester={requester_ident} wait_ns=0 polls=0"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/vm/mod.rs` around lines 334 - 343, The fast-return path when
initial_countdown == 0 leaves stats_last_wait_ns unchanged; update the code in
the branch (inside the block after self.world_stopped.store(...) and before
returning) to reset the VM's last-wait statistic (the atomic backing field used
by stats_last_wait_ns/stats_snapshot) to zero using the same atomic semantics
used elsewhere (e.g., an atomic store with appropriate Ordering), so
stats_snapshot() does not report the previous stop's wait time after a no-wait
stop; locate the change around init_thread_countdown and world_stopped in stop()
in vm::mod.rs.

Comment on lines +1931 to +1949
#[inline]
pub(crate) fn eval_breaker_tripped(&self) -> bool {
#[cfg(feature = "threading")]
if self.state.finalizing.load(Ordering::Relaxed) && !self.is_main_thread() {
return true;
}

#[cfg(all(unix, feature = "threading"))]
if thread::stop_requested_for_current_thread() {
return true;
}

#[cfg(not(target_arch = "wasm32"))]
if crate::signal::is_triggered() {
return true;
}

false
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (outputs collapsed):

# Find eval_breaker_tripped call sites and the atomics involved
rg -n -C2 '\beval_breaker_tripped\s*\(' crates/vm/src
rg -n -C2 'finalizing\.(load|store)\(Ordering::' crates/vm/src
rg -n -C2 'stop_requested\.(load|store)\(Ordering::' crates/vm/src/vm
rg -n -C3 '\bstop_requested_for_current_thread\s*\(' crates/vm/src/vm/thread.rs

# Locate the finalizing field, its writers, and check_signals
grep -n 'finalizing' crates/vm/src/vm/mod.rs | head -15
grep -n 'finalizing\.store\|finalizing\.swap' crates/vm/src --include='*.rs' -r
sed -n '550,620p' crates/vm/src/vm/mod.rs
sed -n '1930,1975p' crates/vm/src/vm/mod.rs
sed -n '1195,1215p' crates/vm/src/frame.rs

# Inspect the shutdown sequence around the Release store
sed -n '400,450p' crates/vm/src/vm/interpreter.rs
sed -n '345,365p' crates/vm/src/vm/thread.rs
grep -n 'finalize\|shutdown' crates/vm/src/stdlib/sys.rs | head -15



Review the memory ordering on the eval_breaker_tripped() fast-path gate.

eval_breaker_tripped() at line 1934 uses Ordering::Relaxed to read finalizing, but this check gates the only path to check_signals() (frame.rs:1202-1203), which uses Ordering::Acquire. Since a Relaxed load returning false prevents the Acquire load from ever executing, the Release→Acquire synchronization pair from the shutdown write (interpreter.rs:433) is broken on weak memory models. A stale Relaxed read can cause non-main threads to miss finalization for an indeterminate period, delaying stop-the-world visibility. Consider using Ordering::Acquire in the hot-path gate or restructuring to ensure the Acquire read always executes after finalization is signaled.
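The ordering pairing at issue can be shown in isolation. This is a minimal illustrative model, not the actual VM code: a Release store only synchronizes with a later Acquire load of the same atomic, so a Relaxed fast-path load that short-circuits the Acquire path provides no visibility guarantee for the shutdown-side writes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

struct State {
    finalizing: AtomicBool,
}

impl State {
    // Fast-path gate: uses Acquire (not Relaxed) so that observing `true`
    // synchronizes with the Release store on the shutdown path, making
    // shutdown-side writes visible before the slow path runs.
    fn eval_breaker_tripped(&self) -> bool {
        self.finalizing.load(Ordering::Acquire)
    }

    fn begin_finalizing(&self) {
        // Shutdown side: Release pairs with the Acquire load above.
        self.finalizing.store(true, Ordering::Release);
    }
}

fn main() {
    let state = State { finalizing: AtomicBool::new(false) };
    assert!(!state.eval_breaker_tripped());
    state.begin_finalizing();
    assert!(state.eval_breaker_tripped());
}
```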

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/vm/mod.rs` around lines 1931 - 1949, The fast-path in
eval_breaker_tripped() performs a Relaxed load of state.finalizing which can
bypass the later Acquire load in check_signals(), breaking the Release→Acquire
synchronization from the shutdown write; change the atomic load on
state.finalizing in eval_breaker_tripped() to Ordering::Acquire (or otherwise
ensure an Acquire-load or appropriate fence is executed when checking
finalizing) so that the shutdown write in interpreter.rs (Release) becomes
visible on other threads; references: eval_breaker_tripped, state.finalizing,
check_signals, and the shutdown write in interpreter.rs.

Comment on lines 247 to +252
pub fn suspend_if_needed(stw: &super::StopTheWorldState) {
if !stw.requested.load(Ordering::Relaxed) {
let should_suspend = CURRENT_THREAD_SLOT.with(|slot| {
slot.borrow()
.as_ref()
.is_some_and(|s| s.stop_requested.load(Ordering::Relaxed))
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n crates/vm/src/vm/thread.rs | sed -n '240,260p'

Repository: RustPython/RustPython

Length of output: 951


🏁 Script executed:

cat -n crates/vm/src/vm/thread.rs | sed -n '340,360p'

Repository: RustPython/RustPython

Length of output: 882


🏁 Script executed:

rg "stop_requested" crates/vm/src/vm/ -A 2 -B 2

Repository: RustPython/RustPython

Length of output: 5985


🏁 Script executed:

cat -n crates/vm/src/vm/mod.rs | sed -n '250,330p'

Repository: RustPython/RustPython

Length of output: 4299


🏁 Script executed:

rg "eval_breaker_tripped" crates/vm/src/vm/ -A 3 -B 1

Repository: RustPython/RustPython

Length of output: 402


🏁 Script executed:

cat -n crates/vm/src/vm/mod.rs | sed -n '470,495p'

Repository: RustPython/RustPython

Length of output: 1066


🏁 Script executed:

rg "eval_breaker_tripped" crates/vm/src/vm/mod.rs -A 10

Repository: RustPython/RustPython

Length of output: 421


🏁 Script executed:

rg "check_signals" crates/vm/src/ -l

Repository: RustPython/RustPython

Length of output: 306


🏁 Script executed:

fd "codeobject\|ceval" crates/vm/src/ --extension rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

rg "check_signals" crates/vm/src/vm/mod.rs -B 2 -A 5

Repository: RustPython/RustPython

Length of output: 1238


Use Acquire for the stop-bit loads to ensure proper synchronization.

Lines 251 and 352 read stop_requested with Ordering::Relaxed, but park_detached_threads() publishes that bit with Ordering::Release in crates/vm/src/vm/mod.rs (lines 281, 304, and elsewhere). Since eval_breaker_tripped() gates suspension on these reads, stale loads can leave a requested thread running past stop-the-world.

Minimal ordering fix
-            .is_some_and(|s| s.stop_requested.load(Ordering::Relaxed))
+            .is_some_and(|s| s.stop_requested.load(Ordering::Acquire))

Apply the same change to both suspend_if_needed() (line 251) and stop_requested_for_current_thread() (line 352).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-pub fn suspend_if_needed(stw: &super::StopTheWorldState) {
-    if !stw.requested.load(Ordering::Relaxed) {
-        let should_suspend = CURRENT_THREAD_SLOT.with(|slot| {
-            slot.borrow()
-                .as_ref()
-                .is_some_and(|s| s.stop_requested.load(Ordering::Relaxed))
-        });
+pub fn suspend_if_needed(stw: &super::StopTheWorldState) {
+    let should_suspend = CURRENT_THREAD_SLOT.with(|slot| {
+        slot.borrow()
+            .as_ref()
+            .is_some_and(|s| s.stop_requested.load(Ordering::Acquire))
+    });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/vm/thread.rs` around lines 247 - 252, Reads of the stop-request
bit in suspend_if_needed and stop_requested_for_current_thread currently use
Ordering::Relaxed; change those .load(Ordering::Relaxed) calls on the
stop_requested AtomicBool to .load(Ordering::Acquire) so they synchronize with
park_detached_threads() which publishes the bit with Ordering::Release (ensuring
eval_breaker_tripped() sees the update and threads suspend correctly).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 1210-1247: The join_internal helper currently enforces public join
validations (calls check_started and checks VM finalization/current-ident) which
breaks callers from _shutdown; update join_internal to accept an internal flag
(e.g., for_shutdown: bool) or a separate private helper so _shutdown can bypass
public-only checks: remove or guard the check_started(inner, vm) call and the
finalization/current-thread checks inside join_internal (references:
join_internal, check_started, _shutdown, ThreadHandle_join, get_ident,
ThreadHandleState::Running, vm.state.finalizing) and ensure the public join()
wrapper still performs check_started and finalization checks before delegating
to join_internal(for_shutdown=false) while _shutdown calls
join_internal(for_shutdown=true) to skip those validations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 300d9b9c-9e25-44bf-b7b6-e9cd4c933f3a

📥 Commits

Reviewing files that changed from the base of the PR and between b29c3b8 and dc18133.

📒 Files selected for processing (1)
  • crates/vm/src/stdlib/thread.rs

Comment on lines +1210 to +1247
fn join_internal(
inner: &Arc<parking_lot::Mutex<ThreadHandleInner>>,
done_event: &Arc<(parking_lot::Mutex<bool>, parking_lot::Condvar)>,
timeout_duration: Option<Duration>,
vm: &VirtualMachine,
) -> PyResult<()> {
// Convert timeout to Duration (None or negative = infinite wait)
let timeout_duration = match timeout.flatten() {
Some(Either::A(t)) if t >= 0.0 => Some(Duration::from_secs_f64(t)),
Some(Either::B(t)) if t >= 0 => Some(Duration::from_secs(t as u64)),
_ => None,
};
Self::check_started(inner, vm)?;

// Check for self-join first
{
let inner = self.inner.lock();
let current_ident = get_ident();
if inner.ident == current_ident && inner.state == ThreadHandleState::Running {
return Err(vm.new_runtime_error("cannot join current thread"));
}
}
let deadline =
timeout_duration.and_then(|timeout| std::time::Instant::now().checked_add(timeout));

// Wait for thread completion using Condvar (supports timeout)
// Loop to handle spurious wakeups
let (lock, cvar) = &*self.done_event;
let (lock, cvar) = &**done_event;
let mut done = lock.lock();

// ThreadHandle_join semantics: self-join/finalizing checks
// apply only while target thread has not reported it is exiting yet.
if !*done {
let inner_guard = inner.lock();
let current_ident = get_ident();
if inner_guard.ident == current_ident
&& inner_guard.state == ThreadHandleState::Running
{
return Err(vm.new_runtime_error("Cannot join current thread"));
}
if vm
.state
.finalizing
.load(core::sync::atomic::Ordering::Acquire)
{
return Err(vm.new_exception_msg(
vm.ctx.exceptions.python_finalization_error.to_owned(),
"cannot join thread at interpreter shutdown"
.to_owned()
.into(),
));
}

⚠️ Potential issue | 🔴 Critical

Split public join() validation from the internal shutdown wait path.

Because _shutdown() now goes through this helper, the checks at Lines 1216-1247 and Lines 1318-1324 are no longer just public-API validation. _shutdown() can legitimately see Starting handles, and once finalization is active this helper will raise instead of waiting. That sends _shutdown() down the run_unraisable() path and it returns without joining the remaining non-daemon threads. Keep check_started() / finalization validation in the public join() wrapper, or add an internal mode that skips them for shutdown callers.

Also applies to: 1314-1325

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/thread.rs` around lines 1210 - 1247, The join_internal
helper currently enforces public join validations (calls check_started and
checks VM finalization/current-ident) which breaks callers from _shutdown;
update join_internal to accept an internal flag (e.g., for_shutdown: bool) or a
separate private helper so _shutdown can bypass public-only checks: remove or
guard the check_started(inner, vm) call and the finalization/current-thread
checks inside join_internal (references: join_internal, check_started,
_shutdown, ThreadHandle_join, get_ident, ThreadHandleState::Running,
vm.state.finalizing) and ensure the public join() wrapper still performs
check_started and finalization checks before delegating to
join_internal(for_shutdown=false) while _shutdown calls
join_internal(for_shutdown=true) to skip those validations.

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
crates/vm/src/stdlib/thread.rs (1)

1210-1247: ⚠️ Potential issue | 🔴 Critical

Keep _shutdown() off the public join() validation path.

join_internal() still runs check_started() and the finalization guard even when _shutdown() is the caller. That means shutdown can legitimately hit Starting / NotStarted handles from the registry, or trip the finalizing check, and then _shutdown() logs the exception and returns early instead of waiting for the remaining non-daemon threads. The public checks need to stay in join(), or join_internal() needs an internal mode that skips them for shutdown/fork cleanup callers.

Possible direction
-        fn join_internal(
+        fn join_internal(
             inner: &Arc<parking_lot::Mutex<ThreadHandleInner>>,
             done_event: &Arc<(parking_lot::Mutex<bool>, parking_lot::Condvar)>,
             timeout_duration: Option<Duration>,
+            public_checks: bool,
             vm: &VirtualMachine,
         ) -> PyResult<()> {
-            Self::check_started(inner, vm)?;
+            if public_checks {
+                Self::check_started(inner, vm)?;
+            }

             let deadline =
                 timeout_duration.and_then(|timeout| std::time::Instant::now().checked_add(timeout));

             let (lock, cvar) = &**done_event;
             let mut done = lock.lock();

-            if !*done {
+            if public_checks && !*done {
                 let inner_guard = inner.lock();
                 let current_ident = get_ident();
                 if inner_guard.ident == current_ident
                     && inner_guard.state == ThreadHandleState::Running
                 {
                     return Err(vm.new_runtime_error("Cannot join current thread"));
                 }
                 if vm.state.finalizing.load(core::sync::atomic::Ordering::Acquire) {
                     return Err(...);
                 }
             }
-            Self::join_internal(&self.inner, &self.done_event, timeout_duration, vm)
+            Self::join_internal(&self.inner, &self.done_event, timeout_duration, true, vm)
-                    if let Err(exc) = ThreadHandle::join_internal(&inner, &done_event, None, vm) {
+                    if let Err(exc) =
+                        ThreadHandle::join_internal(&inner, &done_event, None, false, vm)
+                    {

Also applies to: 1314-1325

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/thread.rs` around lines 1210 - 1247, The join_internal()
path runs public validations (check_started() and the vm.state.finalizing guard)
even when invoked from internal shutdown/fork cleanup, so change the call
boundary: move the public checks back into the public join() method and/or add
an internal-only mode to join_internal() (e.g., add a bool flag like
skip_public_checks or is_shutdown_call to join_internal(inner, done_event,
timeout_duration, vm, skip_public_checks: bool)), have join() call
join_internal(..., false) and _shutdown()/fork cleanup call join_internal(...,
true), and inside join_internal() skip check_started() and the finalizing check
when skip_public_checks is true; update all callers (including _shutdown()) to
use the new flag and keep ThreadHandleState/ThreadHandle_join semantics intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 1210-1247: The join_internal() path runs public validations
(check_started() and the vm.state.finalizing guard) even when invoked from
internal shutdown/fork cleanup, so change the call boundary: move the public
checks back into the public join() method and/or add an internal-only mode to
join_internal() (e.g., add a bool flag like skip_public_checks or
is_shutdown_call to join_internal(inner, done_event, timeout_duration, vm,
skip_public_checks: bool)), have join() call join_internal(..., false) and
_shutdown()/fork cleanup call join_internal(..., true), and inside
join_internal() skip check_started() and the finalizing check when
skip_public_checks is true; update all callers (including _shutdown()) to use
the new flag and keep ThreadHandleState/ThreadHandle_join semantics intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: de154942-d61b-4972-bdb4-43be2e906d06

📥 Commits

Reviewing files that changed from the base of the PR and between dc18133 and d36a0b1.

📒 Files selected for processing (1)
  • crates/vm/src/stdlib/thread.rs

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
crates/vm/src/stdlib/thread.rs (1)

1211-1249: ⚠️ Potential issue | 🔴 Critical

Keep shutdown waits off the public join() validation path.

join_internal() still rejects Starting handles at Line 1217 and still raises PythonFinalizationError at Lines 1237-1248. _shutdown() legitimately hits both states while draining non-daemon threads, so this helper can still abort shutdown early and leave threads unjoined. Move those checks into the public join() wrapper, or add an internal for_shutdown mode that skips them.

Refactor sketch
-        fn join_internal(
+        fn join_internal(
             inner: &Arc<parking_lot::Mutex<ThreadHandleInner>>,
             done_event: &Arc<(parking_lot::Mutex<bool>, parking_lot::Condvar)>,
             timeout_duration: Option<Duration>,
+            for_shutdown: bool,
             vm: &VirtualMachine,
         ) -> PyResult<()> {
-            Self::check_started(inner, vm)?;
+            if !for_shutdown {
+                Self::check_started(inner, vm)?;
+            }

             let deadline =
                 timeout_duration.and_then(|timeout| std::time::Instant::now().checked_add(timeout));

             let (lock, cvar) = &**done_event;
             let mut done = lock.lock();

-            if !*done {
+            if !*done && !for_shutdown {
                 let inner_guard = inner.lock();
                 let current_ident = get_ident();
                 if inner_guard.ident == current_ident
                     && inner_guard.state == ThreadHandleState::Running
                 {
@@
-            Self::join_internal(&self.inner, &self.done_event, timeout_duration, vm)
+            Self::check_started(&self.inner, vm)?;
+            Self::join_internal(&self.inner, &self.done_event, timeout_duration, false, vm)
-                    if let Err(exc) = ThreadHandle::join_internal(&inner, &done_event, None, vm) {
+                    if let Err(exc) =
+                        ThreadHandle::join_internal(&inner, &done_event, None, true, vm)
+                    {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/thread.rs` around lines 1211 - 1249, join_internal
currently enforces "Starting" and interpreter-finalizing checks (via
Self::check_started and the PythonFinalizationError branch) which causes
_shutdown to abort while draining threads; modify join_internal to accept a
for_shutdown: bool parameter (or remove those checks here) and move the
Starting-state verification and the vm.state.finalizing check into the public
join() wrapper so normal joins still validate but calls from _shutdown can call
join_internal(..., for_shutdown=true) (or call a new internal no-validate
helper) to skip those two checks; update all callers (public join() and
_shutdown) to call the appropriate variant and ensure unique symbols referenced:
join_internal, check_started, ThreadHandle_join semantic check, and _shutdown
are adjusted accordingly.
crates/vm/src/vm/mod.rs (2)

1932-1937: ⚠️ Potential issue | 🔴 Critical

Use Acquire for the finalization gate in eval_breaker_tripped().

Line 1935 still does a Relaxed read even though this branch decides whether check_signals() runs at all. A stale false here skips the later Acquire load in check_signals(), so non-main threads can miss finalization on weak memory models.

Proposed fix
         #[cfg(feature = "threading")]
-        if self.state.finalizing.load(Ordering::Relaxed) && !self.is_main_thread() {
+        if self.state.finalizing.load(Ordering::Acquire) && !self.is_main_thread() {
             return true;
         }
#!/bin/bash
set -euo pipefail

printf '\n== eval_breaker_tripped ==\n'
sed -n '1932,1958p' crates/vm/src/vm/mod.rs

printf '\n== bytecode-loop gate ==\n'
sed -n '1198,1206p' crates/vm/src/frame.rs

printf '\n== finalizing loads/stores ==\n'
rg -n -C2 'finalizing\.(load|store)\(' crates/vm/src
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/vm/mod.rs` around lines 1932 - 1937, The read of the
finalization gate in eval_breaker_tripped() uses Ordering::Relaxed which can let
a stale false skip check_signals() on weak memory models; change the load to use
Ordering::Acquire (i.e., replace state.finalizing.load(Ordering::Relaxed) with
state.finalizing.load(Ordering::Acquire)) so that the subsequent
check_signals()'s synchronization is not bypassed, leaving all other logic in
eval_breaker_tripped() and callers like check_signals() unchanged.

332-339: ⚠️ Potential issue | 🟡 Minor

Reset stats_last_wait_ns on the zero-thread fast path.

When Line 332 returns immediately, stats_snapshot() keeps reporting the previous stop's wait time even though this stop waited 0ns. Clear stats_last_wait_ns before returning.

Proposed fix
         if initial_countdown == 0 {
+            self.stats_last_wait_ns.store(0, Ordering::Relaxed);
             self.world_stopped.store(true, Ordering::Release);
             #[cfg(debug_assertions)]
             self.debug_assert_all_non_requester_suspended(vm);
             stw_trace(format_args!(
                 "stop end requester={requester_ident} wait_ns=0 polls=0"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/vm/mod.rs` around lines 332 - 339, The zero-thread fast path
returns without resetting the last-stop duration, so stats_snapshot() reports
stale wait time; in the initial_countdown == 0 branch reset
self.stats_last_wait_ns to zero (use the same atomic store pattern and Ordering
used elsewhere in this module) before the stw_trace/return (after
debug_assert_all_non_requester_suspended(vm) if present) so the reported wait_ns
is correctly 0 for this stop.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/stdlib/posix.rs`:
- Around line 994-997: The code currently drops the result of
crate::stdlib::warnings::warn (using let _ = ...) which prevents
warning-as-error filters from taking effect; change the call site to propagate
the PyResult by removing the discard and using the try operator (i.e. return the
warn() result), change warn_if_multi_threaded to return PyResult<()> (update its
signature and return points) and propagate that result at each call site (e.g.,
where warn_if_multi_threaded("fork", num_os_threads, vm) is invoked), and update
the surrounding comment to state that CPython propagates warning errors rather
than silently ignoring them.

---

Duplicate comments:
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 1211-1249: join_internal currently enforces "Starting" and
interpreter-finalizing checks (via Self::check_started and the
PythonFinalizationError branch) which causes _shutdown to abort while draining
threads; modify join_internal to accept a for_shutdown: bool parameter (or
remove those checks here) and move the Starting-state verification and the
vm.state.finalizing check into the public join() wrapper so normal joins still
validate but calls from _shutdown can call join_internal(..., for_shutdown=true)
(or call a new internal no-validate helper) to skip those two checks; update all
callers (public join() and _shutdown) to call the appropriate variant and ensure
unique symbols referenced: join_internal, check_started, ThreadHandle_join
semantic check, and _shutdown are adjusted accordingly.

In `@crates/vm/src/vm/mod.rs`:
- Around line 1932-1937: The read of the finalization gate in
eval_breaker_tripped() uses Ordering::Relaxed which can let a stale false skip
check_signals() on weak memory models; change the load to use Ordering::Acquire
(i.e., replace state.finalizing.load(Ordering::Relaxed) with
state.finalizing.load(Ordering::Acquire)) so that the subsequent
check_signals()'s synchronization is not bypassed, leaving all other logic in
eval_breaker_tripped() and callers like check_signals() unchanged.
- Around line 332-339: The zero-thread fast path returns without resetting the
last-stop duration, so stats_snapshot() reports stale wait time; in the
initial_countdown == 0 branch reset self.stats_last_wait_ns to zero (use the
same atomic store pattern and Ordering used elsewhere in this module) before the
stw_trace/return (after debug_assert_all_non_requester_suspended(vm) if present)
so the reported wait_ns is correctly 0 for this stop.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: a2aa3b12-c448-410a-8e2d-4b744c66252d

📥 Commits

Reviewing files that changed from the base of the PR and between d36a0b1 and 026eb24.

📒 Files selected for processing (3)
  • crates/vm/src/stdlib/posix.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/mod.rs

Comment on lines +994 to +997
// Match PyErr_WarnFormat(..., stacklevel=1) in CPython.
// Best effort: ignore failures like CPython does in this path.
let _ =
crate::stdlib::warnings::warn(vm.ctx.exceptions.deprecation_warning, msg, 1, vm);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and read the relevant section of posix.rs
wc -l crates/vm/src/stdlib/posix.rs

Repository: RustPython/RustPython

Length of output: 99


🏁 Script executed:

# Read the specific lines mentioned in the review
sed -n '990,1010p' crates/vm/src/stdlib/posix.rs

Repository: RustPython/RustPython

Length of output: 866


🏁 Script executed:

# Find the warn_if_multi_threaded function definition to see its current signature
rg -n "fn warn_if_multi_threaded" crates/vm/src/stdlib/posix.rs -A 10

Repository: RustPython/RustPython

Length of output: 689


🏁 Script executed:

# Find the warnings::warn function to verify its return type
rg -n "fn warn" crates/vm/src/stdlib/warnings.rs -A 5

Repository: RustPython/RustPython

Length of output: 820


🏁 Script executed:

# Check where warn_if_multi_threaded is called to understand the context
rg -n "warn_if_multi_threaded" crates/vm/src/stdlib/posix.rs

Repository: RustPython/RustPython

Length of output: 225


🏁 Script executed:

# Search for any related discussions or TODOs about fork warning handling
rg -i "cpython|fork" crates/vm/src/stdlib/posix.rs -B 2 -A 2 | grep -A 5 -B 5 "warn\|error"

Repository: RustPython/RustPython

Length of output: 1656


🏁 Script executed:

# Check if there are any tests for the fork warning behavior
fd -e rs | xargs rg "test.*fork\|fork.*test" -l | head -5

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Let's check the broader context of the fork function to understand the error handling strategy
sed -n '1025,1060p' crates/vm/src/stdlib/posix.rs

Repository: RustPython/RustPython

Length of output: 1241


🏁 Script executed:

# Check if there are any other similar patterns where PyResult is dropped in critical functions
rg "let _ = .*warn" crates/vm/src/stdlib/posix.rs -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 47


🌐 Web query:

CPython fork() warning error handling deprecation posixmodule.c

💡 Result:

In CPython, the “fork() in a multi-threaded process” DeprecationWarning is implemented in Modules/posixmodule.c (helper often named like warn_about_fork_with_threads()), and it’s emitted when CPython can detect more than 1 thread in the current process:

  • It tries to detect the thread count via OS-specific means (e.g., Linux /proc/self/stat thread field), and if that fails it falls back to counting what the threading module knows about (threading._active + threading._limbo). (raw.githubusercontent.com)
  • If num_threads > 1, it calls PyErr_WarnFormat(PyExc_DeprecationWarning, ...) with the message:
    “This process (pid=...) is multi-threaded, use of fork() may lead to deadlocks in the child.” (raw.githubusercontent.com)
  • This behavior is documented as a Python 3.12 change (“os.fork() can now raise a DeprecationWarning…; use a different multiprocessing start method like spawn or forkserver”). (docs.python.org)

The “warning vs error” handling difference (why it sometimes doesn’t fail under -Werror)

There has been a real behavioral difference across versions/branches in how the C code handles the return/exception state from PyErr_WarnFormat:

  • In the 3.12.0 code, after issuing the warning, CPython clears the warning/error indicator unconditionally (PyErr_Clear()), meaning that even if your warnings filter turns DeprecationWarning into an exception (e.g. -W error), the exception state can get swallowed inside posixmodule.c rather than propagating to Python code. (raw.githubusercontent.com)
  • In current main, the helper returns the result of PyErr_WarnFormat(...) to the caller (instead of clearing it), allowing “warnings as errors” configurations to actually behave like errors. (raw.githubusercontent.com)

Practical takeaway

  • If you want this to be fatal in CI: be aware that some released versions (notably early 3.12 code paths) may clear the exception state in C and thus won’t reliably fail with -W error in all scenarios. (raw.githubusercontent.com)
  • If you want to fix the root cause: don’t fork() from a multi-threaded process—switch to multiprocessing start methods (spawn/forkserver) or use subprocess/posix_spawn style process creation where applicable. (docs.python.org)


Propagate warning errors instead of silently dropping them.

The current code uses let _ = to drop the PyResult from warnings::warn(), which prevents warning-as-error filters from working. If a user configures DeprecationWarning as an error via warning filters, os.fork() will still succeed instead of raising an exception. This diverges from CPython's current behavior, which propagates the warning result.

Suggested changes:

  1. Remove the error-dropping pattern and propagate the result:
crate::stdlib::warnings::warn(
    vm.ctx.exceptions.deprecation_warning,
    msg,
    1,
    vm,
)?;
  2. Update warn_if_multi_threaded to return PyResult<()> and propagate at the call site:
fn warn_if_multi_threaded(
    name: &str,
    num_os_threads: isize,
    vm: &VirtualMachine,
) -> PyResult<()> {
    // ...
}

warn_if_multi_threaded("fork", num_os_threads, vm)?;

Update the comment to accurately reflect the current CPython behavior (propagating rather than ignoring).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/posix.rs` around lines 994 - 997, The code currently
drops the result of crate::stdlib::warnings::warn (using let _ = ...) which
prevents warning-as-error filters from taking effect; change the call site to
propagate the PyResult by removing the discard and using the try operator (i.e.
return the warn() result), change warn_if_multi_threaded to return PyResult<()>
(update its signature and return points) and propagate that result at each call
site (e.g., where warn_if_multi_threaded("fork", num_os_threads, vm) is
invoked), and update the surrounding comment to state that CPython propagates
warning errors rather than silently ignoring them.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/vm/src/stdlib/posix.rs (1)

904-923: macOS: Calling task_for_pid on self is unnecessary overhead.

Using task_for_pid(mach_task_self(), getpid(), &mut proc_task) is redundant since mach_task_self() already returns the task port for the current process. Additionally, task_for_pid typically requires special entitlements (taskgated) when used for other processes, and while it works for self, it's an extra syscall.

♻️ Suggested simplification
-            let self_task = unsafe { mach_task_self() };
-            let mut proc_task: MachPortT = 0;
-            if unsafe { task_for_pid(self_task, libc::getpid(), &mut proc_task) } == KERN_SUCCESS {
-                let mut threads: ThreadActArrayT = core::ptr::null_mut();
-                let mut n_threads: MachMsgTypeNumberT = 0;
-                if unsafe { task_threads(proc_task, &mut threads, &mut n_threads) } == KERN_SUCCESS
+            let self_task = unsafe { mach_task_self() };
+            let mut threads: ThreadActArrayT = core::ptr::null_mut();
+            let mut n_threads: MachMsgTypeNumberT = 0;
+            if unsafe { task_threads(self_task, &mut threads, &mut n_threads) } == KERN_SUCCESS
+            {
+                if !threads.is_null() {
+                    let _ = unsafe {
+                        vm_deallocate(
+                            self_task,
+                            threads as libc::uintptr_t,
+                            (n_threads as usize * core::mem::size_of::<MachPortT>())
+                                as libc::uintptr_t,
+                        )
+                    };
+                }
+                return n_threads as isize;
+            }
+            0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/posix.rs` around lines 904 - 923, The code redundantly
calls task_for_pid(self_task, getpid(), &mut proc_task); replace that flow by
assigning proc_task = self_task (from mach_task_self()) and then call
task_threads(proc_task, &mut threads, &mut n_threads) as before; keep the
existing vm_deallocate call using self_task and threads, and return n_threads as
isize, removing the task_for_pid branch and its associated conditional while
preserving error handling around task_threads and vm_deallocate (symbols:
mach_task_self, proc_task, task_for_pid, task_threads, vm_deallocate,
n_threads).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: d8154526-8adb-41ea-84aa-3bb0378c4963

📥 Commits

Reviewing files that changed from the base of the PR and between 026eb24 and 939a1ff.

📒 Files selected for processing (1)
  • crates/vm/src/stdlib/posix.rs

Comment on lines +926 to +946
#[cfg(target_os = "linux")]
{
    use std::io::Read as _;
    let mut file = match std::fs::File::open("/proc/self/stat") {
        Ok(f) => f,
        Err(_) => return 0,
    };
    let mut buf = [0u8; 160];
    let n = match file.read(&mut buf) {
        Ok(n) => n,
        Err(_) => return 0,
    };
    let line = match core::str::from_utf8(&buf[..n]) {
        Ok(s) => s,
        Err(_) => return 0,
    };
    if let Some(field) = line.split_whitespace().nth(19) {
        return field.parse::<isize>().unwrap_or(0);
    }
    0
}

⚠️ Potential issue | 🟡 Minor

Linux: /proc/self/stat parsing is fragile when comm contains spaces.

The comm field (field 2 in /proc/self/stat) is enclosed in parentheses and can contain spaces, parentheses, or newlines. Using split_whitespace().nth(19) will yield incorrect results if the process name contains spaces.

For example, a process named "my app" produces:

123 (my app) S 456 ...

split_whitespace() splits this into ["123", "(my", "app)", "S", ...], making nth(19) point to the wrong field.

CPython handles this by finding the last ) to locate the end of the comm field, then counting fields from there.
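The difference between the two strategies can be reproduced on a synthetic stat line. The helper below is illustrative only (the name num_threads_from_stat is made up, not RustPython code), but it follows the same "find the last ')'" approach:

```rust
// Illustrative helper (not the RustPython code): extract num_threads
// (field 20 of /proc/<pid>/stat) while tolerating spaces in comm.
fn num_threads_from_stat(line: &str) -> Option<isize> {
    // comm (field 2) is wrapped in parentheses and may itself contain
    // spaces or ')' characters, so scan for the *last* ')' to find its end.
    let after_comm = &line[line.rfind(')')? + 1..];
    // Fields 3..=19 are the 17 fields between comm and num_threads,
    // so nth(17) lands on field 20.
    after_comm.split_whitespace().nth(17)?.parse::<isize>().ok()
}
```

On the "my app" example, a naive `split_whitespace().nth(19)` lands one field short because comm contributes two tokens, while the ')'-anchored version still finds the right field.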

🛡️ Suggested fix to handle comm field correctly
         #[cfg(target_os = "linux")]
         {
             use std::io::Read as _;
             let mut file = match std::fs::File::open("/proc/self/stat") {
                 Ok(f) => f,
                 Err(_) => return 0,
             };
             let mut buf = [0u8; 160];
             let n = match file.read(&mut buf) {
                 Ok(n) => n,
                 Err(_) => return 0,
             };
             let line = match core::str::from_utf8(&buf[..n]) {
                 Ok(s) => s,
                 Err(_) => return 0,
             };
-            if let Some(field) = line.split_whitespace().nth(19) {
-                return field.parse::<isize>().unwrap_or(0);
+            // Find end of comm field (last ')') to handle spaces in process name
+            let after_comm = match line.rfind(')') {
+                Some(idx) => &line[idx + 1..],
+                None => return 0,
+            };
+            // Field 20 (num_threads) is the 18th field after comm: nth(17) skips the 17 fields 3-19
+            if let Some(field) = after_comm.split_whitespace().nth(17) {
+                return field.parse::<isize>().unwrap_or(0);
             }
             0
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/posix.rs` around lines 926 - 946, The current parsing of
/proc/self/stat is fragile because split_whitespace() will break when the comm
field (inside parentheses) contains spaces; instead, after reading the file and
converting to string, find the index of the last ')' in the line, take the
substring after that character, then split_whitespace() on that remainder and
parse the desired field (the same field you previously accessed with nth(19))
from the resultant iterator (preserve the existing return behavior using
parse::<isize>().unwrap_or(0)); replace the direct
line.split_whitespace().nth(19) usage with this "find last ')', take substring
after it, then split" approach to reliably locate the field even when comm
contains spaces.

- Fix park_detached_threads: successful CAS no longer sets
  all_suspended=false, avoiding unnecessary polling rounds
- Replace park_timeout(50µs) with park() in wait_while_suspended
- Remove redundant self-suspension in attach_thread and detach_thread;
  the STW controller handles DETACHED→SUSPENDED via park_detached_threads
- Add double-check under mutex before condvar wait to prevent lost wakes
- Remove dead stats_detach_wait_yields field and add_detach_wait_yields
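The "double-check under mutex before condvar wait" pattern mentioned above can be sketched with std primitives. The types and names below are illustrative, not the actual stop-the-world code:

```rust
use std::sync::{Condvar, Mutex};

// Illustrative sketch: a waiter re-checks the predicate while holding
// the mutex, both before and inside the wait loop. Because the
// notifier flips the flag and signals under the same lock, a wakeup
// sent between the waiter's check and its wait cannot be lost.
struct WorldGate {
    stopped: Mutex<bool>,
    resumed: Condvar,
}

impl WorldGate {
    fn new(stopped: bool) -> Self {
        WorldGate {
            stopped: Mutex::new(stopped),
            resumed: Condvar::new(),
        }
    }

    fn wait_while_stopped(&self) {
        let mut stopped = self.stopped.lock().unwrap();
        // Double-check under the mutex: if the world already restarted,
        // return instead of sleeping past the notification.
        while *stopped {
            stopped = self.resumed.wait(stopped).unwrap();
        }
    }

    fn start_the_world(&self) {
        *self.stopped.lock().unwrap() = false;
        self.resumed.notify_all();
    }
}
```

The while loop also absorbs spurious wakeups, which POSIX condition variables permit.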
Like CPython's ThreadHandle_start, set RUNNING state in the parent
thread immediately after spawn() succeeds, rather than in the child.
This eliminates a race where join() could see Starting state if called
before the child thread executes.

Also reverts the macOS skip for test_start_new_thread_failed since the
root cause is fixed.
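The ordering fix can be sketched as follows; the state constants and function names are made up for illustration and are not the actual ThreadHandle fields:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU8, Ordering};

const STARTING: u8 = 0;
const RUNNING: u8 = 1;

// Illustrative sketch: the parent publishes RUNNING immediately after
// spawn() succeeds, so any join() issued after start_thread() returns
// can no longer observe STARTING for a successfully spawned thread.
fn start_thread(state: Arc<AtomicU8>) -> std::thread::JoinHandle<()> {
    let handle = std::thread::spawn(|| {
        // Thread body; the child no longer transitions the state itself.
    });
    // Parent-side transition, mirroring CPython's ThreadHandle_start.
    state.store(RUNNING, Ordering::Release);
    handle
}
```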
Add lock_wrapped to ThreadMutex for detaching thread state
while waiting on contended locks. Use it for buffered and
text IO locks. Wrap FileIO read/write in allow_threads via
crt_fd to prevent STW hangs on blocking file operations.
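The shape of such an API can be sketched on top of std's Mutex. The real ThreadMutex and vm.allow_threads are RustPython's own types; everything below is an illustrative stand-in:

```rust
use std::sync::{Mutex, MutexGuard, TryLockError};

// Illustrative sketch of the lock_wrapped idea: take the fast path
// when the lock is free, and only run the wrapper (which would be
// something like vm.allow_threads, detaching the thread state) around
// the blocking wait when the lock is actually contended.
fn lock_wrapped<'a, T>(
    mutex: &'a Mutex<T>,
    wrapper: impl FnOnce(&dyn Fn() -> MutexGuard<'a, T>) -> MutexGuard<'a, T>,
) -> MutexGuard<'a, T> {
    match mutex.try_lock() {
        // Uncontended: no need to detach the thread state.
        Ok(guard) => guard,
        // Contended: let the wrapper run around the blocking lock().
        Err(TryLockError::WouldBlock) => wrapper(&|| mutex.lock().unwrap()),
        Err(TryLockError::Poisoned(e)) => e.into_inner(),
    }
}
```

Keeping the try_lock fast path matters: detaching and re-attaching the thread state on every uncontended acquisition would make these IO locks needlessly expensive.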
Replace parking_lot Mutex/Condvar with std::sync (pthread-based)
for started_event and handle_ready_event. This prevents hangs
in forked children where parking_lot's global HASHTABLE may be
corrupted.
@youknowone youknowone merged commit 45d8129 into RustPython:main Mar 8, 2026
13 of 14 checks passed
@youknowone youknowone deleted the implock branch March 8, 2026 09:06
youknowone added a commit to youknowone/RustPython that referenced this pull request Mar 8, 2026
* apply more allow_threads

* Simplify STW thread state transitions

- Fix park_detached_threads: successful CAS no longer sets
  all_suspended=false, avoiding unnecessary polling rounds
- Replace park_timeout(50µs) with park() in wait_while_suspended
- Remove redundant self-suspension in attach_thread and detach_thread;
  the STW controller handles DETACHED→SUSPENDED via park_detached_threads
- Add double-check under mutex before condvar wait to prevent lost wakes
- Remove dead stats_detach_wait_yields field and add_detach_wait_yields

* Representable for ThreadHandle

* Set ThreadHandle state to Running in parent thread after spawn

Like CPython's ThreadHandle_start, set RUNNING state in the parent
thread immediately after spawn() succeeds, rather than in the child.
This eliminates a race where join() could see Starting state if called
before the child thread executes.

Also reverts the macOS skip for test_start_new_thread_failed since the
root cause is fixed.

* Set ThreadHandle state to Running in parent thread after spawn

* Add debug_assert for thread state in start_the_world

* Unskip now-passing test_get_event_loop_thread and test_start_new_thread_at_finalization

* Wrap IO locks and file ops in allow_threads

Add lock_wrapped to ThreadMutex for detaching thread state
while waiting on contended locks. Use it for buffered and
text IO locks. Wrap FileIO read/write in allow_threads via
crt_fd to prevent STW hangs on blocking file operations.

* Use std::sync for thread start/ready events

Replace parking_lot Mutex/Condvar with std::sync (pthread-based)
for started_event and handle_ready_event. This prevents hangs
in forked children where parking_lot's global HASHTABLE may be
corrupted.