Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Adds a lock_wrapped API and uses it with vm.allow_threads to release the VM/GIL during blocking waits; refactors stop-the-world state to world_stopped plus a thread_countdown, adds per-thread stop_requested, tightens signal-exception gating, and overhauls thread start/join/shutdown lifecycles.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Py as Python code (caller)
    participant VM as VirtualMachine
    participant Mutex as ThreadMutex/RawThreadMutex
    participant OS as OS blocking syscall
    Py->>VM: request lock_wrapped(wrap_fn)
    VM->>Mutex: lock_wrapped(wrap_fn)
    Mutex->>VM: invoke wrap_fn(do_lock)
    VM->>OS: vm.allow_threads(do_lock) -- releases VM/GIL
    OS-->>Mutex: blocking wait returns
    Mutex-->>VM: set owner and return
    VM-->>Py: return guard / success
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~110 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 inconclusive
Force-pushed: db453fc to b29c3b8
📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

- [ ] lib: cpython/Lib/asyncio, dependent tests: (7 tests)
- [x] lib: cpython/Lib/threading.py, dependent tests: (148 tests)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/vm/thread.rs (1)
187-209: ⚠️ Potential issue | 🔴 Critical
Detaching a requested thread needs to wake the stop-the-world requester.

When `park_detached_threads()` marks an ATTACHED thread with `stop_requested = true`, a subsequent call to `detach_thread()` (via `allow_threads()`) can transition it to THREAD_DETACHED without notifying the requester. This leaves the thread countdown outstanding. The blocked `stop_the_world()` loop will only advance when either the countdown reaches zero or `park_detached_threads()` is called again to rescan the detached thread, causing unnecessary stalling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/vm/thread.rs` around lines 187 - 209, detaching an ATTACHED thread can race with a prior park_detached_threads() that set stop_requested=true and so detach_thread() must notify the stop-the-world requester; update detach_thread() (inside CURRENT_THREAD_SLOT handling when compare_exchange succeeds going from THREAD_ATTACHED to THREAD_DETACHED) to check the thread's stop_requested flag and, if true, call the same notification used by park_detached_threads()/stop_the_world (i.e., invoke the stop-the-world wake/notify function used elsewhere) so the stop_the_world() countdown/loop is advanced when a requested thread is detached.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/stdlib/io.rs`:
- Line 5339: The code currently uses std::io::Read::read_to_end(), which hides
EINTR and prevents vm.check_signals() from running on the unbounded-read path
(size < 0); replace that call with an explicit chunked loop that uses
crt_fd::read() to fill a Vec<u8> (read fixed-size chunks, e.g., 8KiB), append
each successful chunk to the buffer, break on EOF (0 bytes), and on read errors
return Err so EINTR is propagated to the outer error handler; make sure to
reference and replace the read_to_end() call site and keep vm.check_signals()
semantics by letting crt_fd::read() surface EINTR instead of swallowing it.
In `@crates/vm/src/vm/mod.rs`:
- Around line 334-343: The fast-return path when initial_countdown == 0 leaves
stats_last_wait_ns unchanged; update the code in the branch (inside the block
after self.world_stopped.store(...) and before returning) to reset the VM's
last-wait statistic (the atomic backing field used by
stats_last_wait_ns/stats_snapshot) to zero using the same atomic semantics used
elsewhere (e.g., an atomic store with appropriate Ordering), so stats_snapshot()
does not report the previous stop's wait time after a no-wait stop; locate the
change around init_thread_countdown and world_stopped in stop() in vm::mod.rs.
- Around line 1931-1949: The fast-path in eval_breaker_tripped() performs a
Relaxed load of state.finalizing which can bypass the later Acquire load in
check_signals(), breaking the Release→Acquire synchronization from the shutdown
write; change the atomic load on state.finalizing in eval_breaker_tripped() to
Ordering::Acquire (or otherwise ensure an Acquire-load or appropriate fence is
executed when checking finalizing) so that the shutdown write in interpreter.rs
(Release) becomes visible on other threads; references: eval_breaker_tripped,
state.finalizing, check_signals, and the shutdown write in interpreter.rs.
In `@crates/vm/src/vm/thread.rs`:
- Around line 426-462: The cleanup_current_thread_frames path can call
vm.state.stop_the_world.notify_thread_gone() for threads that were
THREAD_DETACHED before the stop-the-world began; only decrement the requester
countdown for threads that were counted by init_thread_countdown. Update the
final conditional in cleanup_current_thread_frames to also ensure the removed
slot's state is not THREAD_DETACHED (e.g. add slot.state.load(...) !=
THREAD_DETACHED alongside the existing THREAD_SUSPENDED check) before calling
stop_the_world.notify_thread_gone(); reference symbols:
cleanup_current_thread_frames, vm.state.stop_the_world.notify_thread_gone,
init_thread_countdown, THREAD_DETACHED, THREAD_SUSPENDED, slot.state.
- Around line 247-252: Reads of the stop-request bit in suspend_if_needed and
stop_requested_for_current_thread currently use Ordering::Relaxed; change those
.load(Ordering::Relaxed) calls on the stop_requested AtomicBool to
.load(Ordering::Acquire) so they synchronize with park_detached_threads() which
publishes the bit with Ordering::Release (ensuring eval_breaker_tripped() sees
the update and threads suspend correctly).
---
Outside diff comments:
In `@crates/vm/src/vm/thread.rs`:
- Around line 187-209: detaching an ATTACHED thread can race with a prior
park_detached_threads() that set stop_requested=true and so detach_thread() must
notify the stop-the-world requester; update detach_thread() (inside
CURRENT_THREAD_SLOT handling when compare_exchange succeeds going from
THREAD_ATTACHED to THREAD_DETACHED) to check the thread's stop_requested flag
and, if true, call the same notification used by
park_detached_threads()/stop_the_world (i.e., invoke the stop-the-world
wake/notify function used elsewhere) so the stop_the_world() countdown/loop is
advanced when a requested thread is detached.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 9daf3542-35f7-4fa9-8ecd-777f0f7c9a5d
⛔ Files ignored due to path filters (2)
- Lib/test/test_asyncio/test_events.py is excluded by !Lib/**
- Lib/test/test_threading.py is excluded by !Lib/**
📒 Files selected for processing (9)
- crates/common/src/lock/thread_mutex.rs
- crates/stdlib/src/multiprocessing.rs
- crates/stdlib/src/select.rs
- crates/vm/src/frame.rs
- crates/vm/src/signal.rs
- crates/vm/src/stdlib/io.rs
- crates/vm/src/stdlib/thread.rs
- crates/vm/src/vm/mod.rs
- crates/vm/src/vm/thread.rs
```diff
 };
 use crossbeam_utils::atomic::AtomicCell;
-use std::io::{Read, Write};
+use std::io::Read;
```
Replace read_to_end() with explicit crt_fd::read() loop to preserve signal checking on EINTR.
The current code uses std::io::Read::read_to_end(), which internally retries on EINTR, preventing the outer error handler from triggering vm.check_signals(). This breaks the unbounded-read path (size < 0) signal handling.
Use a chunked read loop with crt_fd::read() instead, which surfaces EINTR to the error handler:
Replace read_to_end() with chunked crt_fd::read() loop
```diff
-        let mut bytes = vec![];
-        // Loop on EINTR (PEP 475)
-        loop {
-            match vm.allow_threads(|| {
-                let mut h = handle;
-                h.read_to_end(&mut bytes)
-            }) {
-                Ok(_) => break,
-                Err(e) if e.raw_os_error() == Some(libc::EINTR) => {
-                    vm.check_signals()?;
-                    continue;
-                }
-                // Non-blocking mode: return None if EAGAIN (only if no data read yet)
-                Err(e) if e.raw_os_error() == Some(libc::EAGAIN) => {
-                    if bytes.is_empty() {
-                        return Ok(None);
-                    }
-                    break;
-                }
-                Err(e) => return Err(Self::io_error(zelf, e, vm)),
-            }
-        }
+        let mut bytes = vec![];
+        loop {
+            let mut chunk = vec![0; super::DEFAULT_BUFFER_SIZE];
+            match vm.allow_threads(|| crt_fd::read(handle, &mut chunk)) {
+                Ok(0) => break,
+                Ok(n) => bytes.extend_from_slice(&chunk[..n]),
+                Err(e) if e.raw_os_error() == Some(libc::EINTR) => {
+                    vm.check_signals()?;
+                    continue;
+                }
+                Err(e) if e.raw_os_error() == Some(libc::EAGAIN) => {
+                    if bytes.is_empty() {
+                        return Ok(None);
+                    }
+                    break;
+                }
+                Err(e) => return Err(Self::io_error(zelf, e, vm)),
+            }
+        }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/io.rs` at line 5339, The code currently uses
std::io::Read::read_to_end(), which hides EINTR and prevents vm.check_signals()
from running on the unbounded-read path (size < 0); replace that call with an
explicit chunked loop that uses crt_fd::read() to fill a Vec<u8> (read
fixed-size chunks, e.g., 8KiB), append each successful chunk to the buffer,
break on EOF (0 bytes), and on read errors return Err so EINTR is propagated to
the outer error handler; make sure to reference and replace the read_to_end()
call site and keep vm.check_signals() semantics by letting crt_fd::read()
surface EINTR instead of swallowing it.
```rust
let initial_countdown = self.init_thread_countdown(vm);
stw_trace(format_args!("stop begin requester={requester_ident}"));
if initial_countdown == 0 {
    self.world_stopped.store(true, Ordering::Release);
    #[cfg(debug_assertions)]
    self.debug_assert_all_non_requester_suspended(vm);
    stw_trace(format_args!(
        "stop end requester={requester_ident} wait_ns=0 polls=0"
    ));
    return;
```
Reset last_wait_ns on the zero-thread fast path.
When Lines 336-343 return early, stats_last_wait_ns is left unchanged, so stats_snapshot() can report the previous stop's wait time after a no-wait stop.
Proposed fix:

```diff
     let initial_countdown = self.init_thread_countdown(vm);
     stw_trace(format_args!("stop begin requester={requester_ident}"));
     if initial_countdown == 0 {
+        self.stats_last_wait_ns.store(0, Ordering::Relaxed);
         self.world_stopped.store(true, Ordering::Release);
         #[cfg(debug_assertions)]
         self.debug_assert_all_non_requester_suspended(vm);
         stw_trace(format_args!(
             "stop end requester={requester_ident} wait_ns=0 polls=0"
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
let initial_countdown = self.init_thread_countdown(vm);
stw_trace(format_args!("stop begin requester={requester_ident}"));
if initial_countdown == 0 {
    self.stats_last_wait_ns.store(0, Ordering::Relaxed);
    self.world_stopped.store(true, Ordering::Release);
    #[cfg(debug_assertions)]
    self.debug_assert_all_non_requester_suspended(vm);
    stw_trace(format_args!(
        "stop end requester={requester_ident} wait_ns=0 polls=0"
    ));
    return;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/vm/mod.rs` around lines 334 - 343, The fast-return path when
initial_countdown == 0 leaves stats_last_wait_ns unchanged; update the code in
the branch (inside the block after self.world_stopped.store(...) and before
returning) to reset the VM's last-wait statistic (the atomic backing field used
by stats_last_wait_ns/stats_snapshot) to zero using the same atomic semantics
used elsewhere (e.g., an atomic store with appropriate Ordering), so
stats_snapshot() does not report the previous stop's wait time after a no-wait
stop; locate the change around init_thread_countdown and world_stopped in stop()
in vm::mod.rs.
```rust
#[inline]
pub(crate) fn eval_breaker_tripped(&self) -> bool {
    #[cfg(feature = "threading")]
    if self.state.finalizing.load(Ordering::Relaxed) && !self.is_main_thread() {
        return true;
    }

    #[cfg(all(unix, feature = "threading"))]
    if thread::stop_requested_for_current_thread() {
        return true;
    }

    #[cfg(not(target_arch = "wasm32"))]
    if crate::signal::is_triggered() {
        return true;
    }

    false
}
```
Review the memory ordering on the eval_breaker_tripped() fast-path gate.
eval_breaker_tripped() at line 1934 uses Ordering::Relaxed to read finalizing, but this check gates the only path to check_signals() (frame.rs:1202-1203), which uses Ordering::Acquire. Since a Relaxed load returning false prevents the Acquire load from ever executing, the Release→Acquire synchronization pair from the shutdown write (interpreter.rs:433) is broken on weak memory models. A stale Relaxed read can cause non-main threads to miss finalization for an indeterminate period, delaying stop-the-world visibility. Consider using Ordering::Acquire in the hot-path gate or restructuring to ensure the Acquire read always executes after finalization is signaled.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/vm/mod.rs` around lines 1931 - 1949, The fast-path in
eval_breaker_tripped() performs a Relaxed load of state.finalizing which can
bypass the later Acquire load in check_signals(), breaking the Release→Acquire
synchronization from the shutdown write; change the atomic load on
state.finalizing in eval_breaker_tripped() to Ordering::Acquire (or otherwise
ensure an Acquire-load or appropriate fence is executed when checking
finalizing) so that the shutdown write in interpreter.rs (Release) becomes
visible on other threads; references: eval_breaker_tripped, state.finalizing,
check_signals, and the shutdown write in interpreter.rs.
```rust
pub fn suspend_if_needed(stw: &super::StopTheWorldState) {
    if !stw.requested.load(Ordering::Relaxed) {
        let should_suspend = CURRENT_THREAD_SLOT.with(|slot| {
            slot.borrow()
                .as_ref()
                .is_some_and(|s| s.stop_requested.load(Ordering::Relaxed))
        });
```
Use Acquire for the stop-bit loads to ensure proper synchronization.
Lines 251 and 352 read stop_requested with Ordering::Relaxed, but park_detached_threads() publishes that bit with Ordering::Release in crates/vm/src/vm/mod.rs (lines 281, 304, and elsewhere). Since eval_breaker_tripped() gates suspension on these reads, stale loads can leave a requested thread running past stop-the-world.
Minimal ordering fix:

```diff
-            .is_some_and(|s| s.stop_requested.load(Ordering::Relaxed))
+            .is_some_and(|s| s.stop_requested.load(Ordering::Acquire))
```

Apply the same change to both `suspend_if_needed()` (line 251) and `stop_requested_for_current_thread()` (line 352).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 pub fn suspend_if_needed(stw: &super::StopTheWorldState) {
-    if !stw.requested.load(Ordering::Relaxed) {
-        let should_suspend = CURRENT_THREAD_SLOT.with(|slot| {
-            slot.borrow()
-                .as_ref()
-                .is_some_and(|s| s.stop_requested.load(Ordering::Relaxed))
-        });
+    let should_suspend = CURRENT_THREAD_SLOT.with(|slot| {
+        slot.borrow()
+            .as_ref()
+            .is_some_and(|s| s.stop_requested.load(Ordering::Acquire))
+    });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/vm/thread.rs` around lines 247 - 252, Reads of the stop-request
bit in suspend_if_needed and stop_requested_for_current_thread currently use
Ordering::Relaxed; change those .load(Ordering::Relaxed) calls on the
stop_requested AtomicBool to .load(Ordering::Acquire) so they synchronize with
park_detached_threads() which publishes the bit with Ordering::Release (ensuring
eval_breaker_tripped() sees the update and threads suspend correctly).
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 1210-1247: The join_internal helper currently enforces public join
validations (calls check_started and checks VM finalization/current-ident) which
breaks callers from _shutdown; update join_internal to accept an internal flag
(e.g., for_shutdown: bool) or a separate private helper so _shutdown can bypass
public-only checks: remove or guard the check_started(inner, vm) call and the
finalization/current-thread checks inside join_internal (references:
join_internal, check_started, _shutdown, ThreadHandle_join, get_ident,
ThreadHandleState::Running, vm.state.finalizing) and ensure the public join()
wrapper still performs check_started and finalization checks before delegating
to join_internal(for_shutdown=false) while _shutdown calls
join_internal(for_shutdown=true) to skip those validations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 300d9b9c-9e25-44bf-b7b6-e9cd4c933f3a
📒 Files selected for processing (1)
crates/vm/src/stdlib/thread.rs
```diff
+    fn join_internal(
+        inner: &Arc<parking_lot::Mutex<ThreadHandleInner>>,
+        done_event: &Arc<(parking_lot::Mutex<bool>, parking_lot::Condvar)>,
+        timeout_duration: Option<Duration>,
+        vm: &VirtualMachine,
+    ) -> PyResult<()> {
-        // Convert timeout to Duration (None or negative = infinite wait)
-        let timeout_duration = match timeout.flatten() {
-            Some(Either::A(t)) if t >= 0.0 => Some(Duration::from_secs_f64(t)),
-            Some(Either::B(t)) if t >= 0 => Some(Duration::from_secs(t as u64)),
-            _ => None,
-        };
+        Self::check_started(inner, vm)?;
-        // Check for self-join first
-        {
-            let inner = self.inner.lock();
-            let current_ident = get_ident();
-            if inner.ident == current_ident && inner.state == ThreadHandleState::Running {
-                return Err(vm.new_runtime_error("cannot join current thread"));
-            }
-        }
+        let deadline =
+            timeout_duration.and_then(|timeout| std::time::Instant::now().checked_add(timeout));

         // Wait for thread completion using Condvar (supports timeout)
         // Loop to handle spurious wakeups
-        let (lock, cvar) = &*self.done_event;
+        let (lock, cvar) = &**done_event;
         let mut done = lock.lock();

+        // ThreadHandle_join semantics: self-join/finalizing checks
+        // apply only while target thread has not reported it is exiting yet.
+        if !*done {
+            let inner_guard = inner.lock();
+            let current_ident = get_ident();
+            if inner_guard.ident == current_ident
+                && inner_guard.state == ThreadHandleState::Running
+            {
+                return Err(vm.new_runtime_error("Cannot join current thread"));
+            }
+            if vm
+                .state
+                .finalizing
+                .load(core::sync::atomic::Ordering::Acquire)
+            {
+                return Err(vm.new_exception_msg(
+                    vm.ctx.exceptions.python_finalization_error.to_owned(),
+                    "cannot join thread at interpreter shutdown"
+                        .to_owned()
+                        .into(),
+                ));
+            }
```
Split public join() validation from the internal shutdown wait path.
Because _shutdown() now goes through this helper, the checks at Lines 1216-1247 and Lines 1318-1324 are no longer just public-API validation. _shutdown() can legitimately see Starting handles, and once finalization is active this helper will raise instead of waiting. That sends _shutdown() down the run_unraisable() path and it returns without joining the remaining non-daemon threads. Keep check_started() / finalization validation in the public join() wrapper, or add an internal mode that skips them for shutdown callers.
Also applies to: 1314-1325
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/thread.rs` around lines 1210 - 1247, The join_internal
helper currently enforces public join validations (calls check_started and
checks VM finalization/current-ident) which breaks callers from _shutdown;
update join_internal to accept an internal flag (e.g., for_shutdown: bool) or a
separate private helper so _shutdown can bypass public-only checks: remove or
guard the check_started(inner, vm) call and the finalization/current-thread
checks inside join_internal (references: join_internal, check_started,
_shutdown, ThreadHandle_join, get_ident, ThreadHandleState::Running,
vm.state.finalizing) and ensure the public join() wrapper still performs
check_started and finalization checks before delegating to
join_internal(for_shutdown=false) while _shutdown calls
join_internal(for_shutdown=true) to skip those validations.
♻️ Duplicate comments (1)
crates/vm/src/stdlib/thread.rs (1)
1210-1247: ⚠️ Potential issue | 🔴 Critical
Keep `_shutdown()` off the public `join()` validation path.

`join_internal()` still runs `check_started()` and the finalization guard even when `_shutdown()` is the caller. That means shutdown can legitimately hit `Starting`/`NotStarted` handles from the registry, or trip the `finalizing` check, and then `_shutdown()` logs the exception and returns early instead of waiting for the remaining non-daemon threads. The public checks need to stay in `join()`, or `join_internal()` needs an internal mode that skips them for shutdown/fork cleanup callers.

Possible direction

```diff
     fn join_internal(
         inner: &Arc<parking_lot::Mutex<ThreadHandleInner>>,
         done_event: &Arc<(parking_lot::Mutex<bool>, parking_lot::Condvar)>,
         timeout_duration: Option<Duration>,
+        public_checks: bool,
         vm: &VirtualMachine,
     ) -> PyResult<()> {
-        Self::check_started(inner, vm)?;
+        if public_checks {
+            Self::check_started(inner, vm)?;
+        }
         let deadline =
             timeout_duration.and_then(|timeout| std::time::Instant::now().checked_add(timeout));
         let (lock, cvar) = &**done_event;
         let mut done = lock.lock();
-        if !*done {
+        if public_checks && !*done {
             let inner_guard = inner.lock();
             let current_ident = get_ident();
             if inner_guard.ident == current_ident
                 && inner_guard.state == ThreadHandleState::Running
             {
                 return Err(vm.new_runtime_error("Cannot join current thread"));
             }
             if vm.state.finalizing.load(core::sync::atomic::Ordering::Acquire) {
                 return Err(...);
             }
         }
```

```diff
-        Self::join_internal(&self.inner, &self.done_event, timeout_duration, vm)
+        Self::join_internal(&self.inner, &self.done_event, timeout_duration, true, vm)
```

```diff
-        if let Err(exc) = ThreadHandle::join_internal(&inner, &done_event, None, vm) {
+        if let Err(exc) =
+            ThreadHandle::join_internal(&inner, &done_event, None, false, vm)
+        {
```

Also applies to: 1314-1325
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/thread.rs` around lines 1210 - 1247, The join_internal() path runs public validations (check_started() and the vm.state.finalizing guard) even when invoked from internal shutdown/fork cleanup, so change the call boundary: move the public checks back into the public join() method and/or add an internal-only mode to join_internal() (e.g., add a bool flag like skip_public_checks or is_shutdown_call to join_internal(inner, done_event, timeout_duration, vm, skip_public_checks: bool)), have join() call join_internal(..., false) and _shutdown()/fork cleanup call join_internal(..., true), and inside join_internal() skip check_started() and the finalizing check when skip_public_checks is true; update all callers (including _shutdown()) to use the new flag and keep ThreadHandleState/ThreadHandle_join semantics intact.
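Stripped of VM details, the proposed split is just a flag-gated validation path. The following is a minimal stand-alone sketch of that shape; the names, states, and error type are illustrative, not RustPython's actual API:

```rust
// `join()` keeps the public validations; `_shutdown()`-style callers go
// through the same waiting logic with the checks skipped.
#[derive(Clone, Copy, PartialEq)]
enum State {
    Starting,
    Running,
}

struct Handle {
    state: State,
}

impl Handle {
    fn check_started(&self) -> Result<(), &'static str> {
        if self.state == State::Starting {
            Err("thread is not fully started")
        } else {
            Ok(())
        }
    }

    fn join_internal(&self, for_shutdown: bool) -> Result<(), &'static str> {
        if !for_shutdown {
            // Public-only validation; shutdown must tolerate Starting handles.
            self.check_started()?;
        }
        // A real implementation would wait on the done event here.
        Ok(())
    }

    fn join(&self) -> Result<(), &'static str> {
        self.join_internal(false)
    }

    fn shutdown_join(&self) -> Result<(), &'static str> {
        self.join_internal(true)
    }
}
```

With this split, a `Starting` handle still makes `join()` fail fast, while the shutdown path proceeds to wait.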
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 1210-1247: The join_internal() path runs public validations
(check_started() and the vm.state.finalizing guard) even when invoked from
internal shutdown/fork cleanup, so change the call boundary: move the public
checks back into the public join() method and/or add an internal-only mode to
join_internal() (e.g., add a bool flag like skip_public_checks or
is_shutdown_call to join_internal(inner, done_event, timeout_duration, vm,
skip_public_checks: bool)), have join() call join_internal(..., false) and
_shutdown()/fork cleanup call join_internal(..., true), and inside
join_internal() skip check_started() and the finalizing check when
skip_public_checks is true; update all callers (including _shutdown()) to use
the new flag and keep ThreadHandleState/ThreadHandle_join semantics intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: de154942-d61b-4972-bdb4-43be2e906d06
📒 Files selected for processing (1)
crates/vm/src/stdlib/thread.rs
Actionable comments posted: 1
♻️ Duplicate comments (3)
crates/vm/src/stdlib/thread.rs (1)
1211-1249: ⚠️ Potential issue | 🔴 Critical: Keep shutdown waits off the public `join()` validation path.
`join_internal()` still rejects `Starting` handles at Line 1217 and still raises `PythonFinalizationError` at Lines 1237-1248. `_shutdown()` legitimately hits both states while draining non-daemon threads, so this helper can still abort shutdown early and leave threads unjoined. Move those checks into the public `join()` wrapper, or add an internal `for_shutdown` mode that skips them.
Refactor sketch
  fn join_internal(
      inner: &Arc<parking_lot::Mutex<ThreadHandleInner>>,
      done_event: &Arc<(parking_lot::Mutex<bool>, parking_lot::Condvar)>,
      timeout_duration: Option<Duration>,
+     for_shutdown: bool,
      vm: &VirtualMachine,
  ) -> PyResult<()> {
-     Self::check_started(inner, vm)?;
+     if !for_shutdown {
+         Self::check_started(inner, vm)?;
+     }
      let deadline = timeout_duration.and_then(|timeout| std::time::Instant::now().checked_add(timeout));
      let (lock, cvar) = &**done_event;
      let mut done = lock.lock();
-     if !*done {
+     if !*done && !for_shutdown {
          let inner_guard = inner.lock();
          let current_ident = get_ident();
          if inner_guard.ident == current_ident && inner_guard.state == ThreadHandleState::Running {
@@
- Self::join_internal(&self.inner, &self.done_event, timeout_duration, vm)
+ Self::check_started(&self.inner, vm)?;
+ Self::join_internal(&self.inner, &self.done_event, timeout_duration, false, vm)

- if let Err(exc) = ThreadHandle::join_internal(&inner, &done_event, None, vm) {
+ if let Err(exc) =
+     ThreadHandle::join_internal(&inner, &done_event, None, true, vm)
+ {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/thread.rs` around lines 1211 - 1249, join_internal currently enforces "Starting" and interpreter-finalizing checks (via Self::check_started and the PythonFinalizationError branch) which causes _shutdown to abort while draining threads; modify join_internal to accept a for_shutdown: bool parameter (or remove those checks here) and move the Starting-state verification and the vm.state.finalizing check into the public join() wrapper so normal joins still validate but calls from _shutdown can call join_internal(..., for_shutdown=true) (or call a new internal no-validate helper) to skip those two checks; update all callers (public join() and _shutdown) to call the appropriate variant and ensure unique symbols referenced: join_internal, check_started, ThreadHandle_join semantic check, and _shutdown are adjusted accordingly.crates/vm/src/vm/mod.rs (2)
1932-1937: ⚠️ Potential issue | 🔴 Critical: Use `Acquire` for the finalization gate in `eval_breaker_tripped()`.
Line 1935 still does a Relaxed read even though this branch decides whether `check_signals()` runs at all. A stale `false` here skips the later Acquire load in `check_signals()`, so non-main threads can miss finalization on weak memory models.
Proposed fix
  #[cfg(feature = "threading")]
- if self.state.finalizing.load(Ordering::Relaxed) && !self.is_main_thread() {
+ if self.state.finalizing.load(Ordering::Acquire) && !self.is_main_thread() {
      return true;
  }
#!/bin/bash
set -euo pipefail
printf '\n== eval_breaker_tripped ==\n'
sed -n '1932,1958p' crates/vm/src/vm/mod.rs
printf '\n== bytecode-loop gate ==\n'
sed -n '1198,1206p' crates/vm/src/frame.rs
printf '\n== finalizing loads/stores ==\n'
rg -n -C2 'finalizing\.(load|store)\(' crates/vm/src
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/vm/mod.rs` around lines 1932 - 1937, The read of the finalization gate in eval_breaker_tripped() uses Ordering::Relaxed which can let a stale false skip check_signals() on weak memory models; change the load to use Ordering::Acquire (i.e., replace state.finalizing.load(Ordering::Relaxed) with state.finalizing.load(Ordering::Acquire)) so that the subsequent check_signals()'s synchronization is not bypassed, leaving all other logic in eval_breaker_tripped() and callers like check_signals() unchanged.
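To see why the review insists on `Acquire` here, the Release/Acquire publication pattern can be shown in isolation. In this sketch the `finalizing` flag and the state it guards are replaced by plain atomics; this is not RustPython code, only the memory-ordering contract it relies on:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

// The writer publishes `data` with a Release store on `flag`; a reader that
// gates on an Acquire load of `flag` is guaranteed to see the earlier write.
// A Relaxed load on the reader side would not give that guarantee.
fn acquire_gate_observes_published_data() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    let data = Arc::new(AtomicU32::new(0));

    let writer = {
        let (flag, data) = (Arc::clone(&flag), Arc::clone(&data));
        thread::spawn(move || {
            data.store(42, Ordering::Relaxed); // write before publication
            flag.store(true, Ordering::Release); // publish
        })
    };

    // Spin until the Acquire load observes the Release store.
    while !flag.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let seen = data.load(Ordering::Relaxed);
    writer.join().unwrap();
    seen == 42
}
```

On x86 a Relaxed load often appears to work anyway, which is exactly why these bugs only surface on weakly ordered hardware.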
332-339: ⚠️ Potential issue | 🟡 Minor: Reset `stats_last_wait_ns` on the zero-thread fast path.
When Line 332 returns immediately, `stats_snapshot()` keeps reporting the previous stop's wait time even though this stop waited 0 ns. Clear `stats_last_wait_ns` before returning.
Proposed fix
  if initial_countdown == 0 {
+     self.stats_last_wait_ns.store(0, Ordering::Relaxed);
      self.world_stopped.store(true, Ordering::Release);
      #[cfg(debug_assertions)]
      self.debug_assert_all_non_requester_suspended(vm);
      stw_trace(format_args!(
          "stop end requester={requester_ident} wait_ns=0 polls=0"
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/vm/mod.rs` around lines 332 - 339, The zero-thread fast path returns without resetting the last-stop duration, so stats_snapshot() reports stale wait time; in the initial_countdown == 0 branch reset self.stats_last_wait_ns to zero (use the same atomic store pattern and Ordering used elsewhere in this module) before the stw_trace/return (after debug_assert_all_non_requester_suspended(vm) if present) so the reported wait_ns is correctly 0 for this stop.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/stdlib/posix.rs`:
- Around line 994-997: The code currently drops the result of
crate::stdlib::warnings::warn (using let _ = ...) which prevents
warning-as-error filters from taking effect; change the call site to propagate
the PyResult by removing the discard and using the try operator (i.e. return the
warn() result), change warn_if_multi_threaded to return PyResult<()> (update its
signature and return points) and propagate that result at each call site (e.g.,
where warn_if_multi_threaded("fork", num_os_threads, vm) is invoked), and update
the surrounding comment to state that CPython propagates warning errors rather
than silently ignoring them.
---
Duplicate comments:
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 1211-1249: join_internal currently enforces "Starting" and
interpreter-finalizing checks (via Self::check_started and the
PythonFinalizationError branch) which causes _shutdown to abort while draining
threads; modify join_internal to accept a for_shutdown: bool parameter (or
remove those checks here) and move the Starting-state verification and the
vm.state.finalizing check into the public join() wrapper so normal joins still
validate but calls from _shutdown can call join_internal(..., for_shutdown=true)
(or call a new internal no-validate helper) to skip those two checks; update all
callers (public join() and _shutdown) to call the appropriate variant and ensure
unique symbols referenced: join_internal, check_started, ThreadHandle_join
semantic check, and _shutdown are adjusted accordingly.
In `@crates/vm/src/vm/mod.rs`:
- Around line 1932-1937: The read of the finalization gate in
eval_breaker_tripped() uses Ordering::Relaxed which can let a stale false skip
check_signals() on weak memory models; change the load to use Ordering::Acquire
(i.e., replace state.finalizing.load(Ordering::Relaxed) with
state.finalizing.load(Ordering::Acquire)) so that the subsequent
check_signals()'s synchronization is not bypassed, leaving all other logic in
eval_breaker_tripped() and callers like check_signals() unchanged.
- Around line 332-339: The zero-thread fast path returns without resetting the
last-stop duration, so stats_snapshot() reports stale wait time; in the
initial_countdown == 0 branch reset self.stats_last_wait_ns to zero (use the
same atomic store pattern and Ordering used elsewhere in this module) before the
stw_trace/return (after debug_assert_all_non_requester_suspended(vm) if present)
so the reported wait_ns is correctly 0 for this stop.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: a2aa3b12-c448-410a-8e2d-4b744c66252d
📒 Files selected for processing (3)
crates/vm/src/stdlib/posix.rs
crates/vm/src/stdlib/thread.rs
crates/vm/src/vm/mod.rs
// Match PyErr_WarnFormat(..., stacklevel=1) in CPython.
// Best effort: ignore failures like CPython does in this path.
let _ =
    crate::stdlib::warnings::warn(vm.ctx.exceptions.deprecation_warning, msg, 1, vm);
🧩 Analysis chain
🏁 Script executed:
# First, let's find and read the relevant section of posix.rs
wc -l crates/vm/src/stdlib/posix.rs
Repository: RustPython/RustPython
Length of output: 99
🏁 Script executed:
# Read the specific lines mentioned in the review
sed -n '990,1010p' crates/vm/src/stdlib/posix.rs
Length of output: 866
🏁 Script executed:
# Find the warn_if_multi_threaded function definition to see its current signature
rg -n "fn warn_if_multi_threaded" crates/vm/src/stdlib/posix.rs -A 10
Length of output: 689
🏁 Script executed:
# Find the warnings::warn function to verify its return type
rg -n "fn warn" crates/vm/src/stdlib/warnings.rs -A 5
Length of output: 820
🏁 Script executed:
# Check where warn_if_multi_threaded is called to understand the context
rg -n "warn_if_multi_threaded" crates/vm/src/stdlib/posix.rs
Length of output: 225
🏁 Script executed:
# Search for any related discussions or TODOs about fork warning handling
rg -i "cpython|fork" crates/vm/src/stdlib/posix.rs -B 2 -A 2 | grep -A 5 -B 5 "warn\|error"
Length of output: 1656
🏁 Script executed:
# Check if there are any tests for the fork warning behavior
fd -e rs | xargs rg "test.*fork\|fork.*test" -l | head -5
Length of output: 47
🏁 Script executed:
# Let's check the broader context of the fork function to understand the error handling strategy
sed -n '1025,1060p' crates/vm/src/stdlib/posix.rs
Length of output: 1241
🏁 Script executed:
# Check if there are any other similar patterns where PyResult is dropped in critical functions
rg "let _ = .*warn" crates/vm/src/stdlib/posix.rs -B 3 -A 3
Length of output: 47
🌐 Web query:
CPython fork() warning error handling deprecation posixmodule.c
💡 Result:
In CPython, the “fork() in a multi-threaded process” DeprecationWarning is implemented in Modules/posixmodule.c (helper often named like warn_about_fork_with_threads()), and it’s emitted when CPython can detect more than 1 thread in the current process:
- It tries to detect the thread count via OS-specific means (e.g., the Linux /proc/self/stat thread field), and if that fails it falls back to counting what the threading module knows about (threading._active + threading._limbo). (raw.githubusercontent.com)
- If num_threads > 1, it calls PyErr_WarnFormat(PyExc_DeprecationWarning, ...) with the message: "This process (pid=...) is multi-threaded, use of fork() may lead to deadlocks in the child." (raw.githubusercontent.com)
- This behavior is documented as a Python 3.12 change ("os.fork() can now raise a DeprecationWarning...; use a different multiprocessing start method like spawn or forkserver"). (docs.python.org)
The “warning vs error” handling difference (why it sometimes doesn’t fail under -Werror)
There has been a real behavioral difference across versions/branches in how the C code handles the return/exception state from PyErr_WarnFormat:
- In the 3.12.0 code, after issuing the warning, CPython clears the warning/error indicator unconditionally (PyErr_Clear()), meaning that even if your warnings filter turns DeprecationWarning into an exception (e.g. -W error), the exception state can get swallowed inside posixmodule.c rather than propagating to Python code. (raw.githubusercontent.com)
- In current main, the helper returns the result of PyErr_WarnFormat(...) to the caller (instead of clearing it), allowing "warnings as errors" configurations to actually behave like errors. (raw.githubusercontent.com)
Practical takeaway
- If you want this to be fatal in CI: be aware that some released versions (notably early 3.12 code paths) may clear the exception state in C and thus won't reliably fail with -W error in all scenarios. (raw.githubusercontent.com)
- If you want to fix the root cause: don't fork() from a multi-threaded process; switch to multiprocessing start methods (spawn/forkserver) or use subprocess/posix_spawn-style process creation where applicable. (docs.python.org)
Citations:
- 1: https://raw.githubusercontent.com/python/cpython/v3.12.0/Modules/posixmodule.c
- 2: https://raw.githubusercontent.com/python/cpython/v3.12.0/Modules/posixmodule.c
- 3: https://docs.python.org/3/whatsnew/3.12.html
- 4: https://raw.githubusercontent.com/python/cpython/v3.12.0/Modules/posixmodule.c
- 5: https://raw.githubusercontent.com/python/cpython/main/Modules/posixmodule.c
- 6: https://raw.githubusercontent.com/python/cpython/v3.12.0/Modules/posixmodule.c
- 7: https://docs.python.org/3/whatsnew/3.12.html
Propagate warning errors instead of silently dropping them.
The current code uses let _ = to drop the PyResult from warnings::warn(), which prevents warning-as-error filters from working. If a user configures DeprecationWarning as an error via warning filters, os.fork() will still succeed instead of raising an exception. This diverges from CPython's current behavior, which propagates the warning result.
Suggested changes:
- Remove the error-dropping pattern and propagate the result:
crate::stdlib::warnings::warn(
vm.ctx.exceptions.deprecation_warning,
msg,
1,
vm,
)?;
- Update warn_if_multi_threaded to return PyResult<()> and propagate at the call site:
fn warn_if_multi_threaded(
name: &str,
num_os_threads: isize,
vm: &VirtualMachine,
) -> PyResult<()> {
// ...
}
warn_if_multi_threaded("fork", num_os_threads, vm)?;
Update the comment to accurately reflect the current CPython behavior (propagating rather than ignoring).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/posix.rs` around lines 994 - 997, The code currently
drops the result of crate::stdlib::warnings::warn (using let _ = ...) which
prevents warning-as-error filters from taking effect; change the call site to
propagate the PyResult by removing the discard and using the try operator (i.e.
return the warn() result), change warn_if_multi_threaded to return PyResult<()>
(update its signature and return points) and propagate that result at each call
site (e.g., where warn_if_multi_threaded("fork", num_os_threads, vm) is
invoked), and update the surrounding comment to state that CPython propagates
warning errors rather than silently ignoring them.
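The suggested fix boils down to returning a `Result` and propagating it with `?` instead of discarding it with `let _ =`. A toy sketch of that pattern follows; the `WarnError` type and `error_filter` flag are stand-ins for RustPython's `PyResult` and warning filters, not the real API:

```rust
// A warning helper whose failure can be observed rather than discarded.
#[derive(Debug, PartialEq)]
struct WarnError(String);

// Stand-in for warnings::warn: `error_filter` models a warnings filter that
// turns the DeprecationWarning into an error.
fn warn(msg: &str, error_filter: bool) -> Result<(), WarnError> {
    if error_filter {
        Err(WarnError(msg.to_string()))
    } else {
        Ok(()) // warning emitted normally
    }
}

fn warn_if_multi_threaded(num_os_threads: isize, error_filter: bool) -> Result<(), WarnError> {
    if num_os_threads > 1 {
        // Propagate with `?` rather than discarding via `let _ = warn(...)`.
        warn("fork() in a multi-threaded process", error_filter)?;
    }
    Ok(())
}
```

With propagation in place, a warnings-as-errors configuration turns the fork warning into a real failure at the call site instead of being silently swallowed.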
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/vm/src/stdlib/posix.rs (1)
904-923: macOS: Calling `task_for_pid` on self is unnecessary overhead.
Using `task_for_pid(mach_task_self(), getpid(), &mut proc_task)` is redundant since `mach_task_self()` already returns the task port for the current process. Additionally, `task_for_pid` typically requires special entitlements (taskgated) when used for other processes, and while it works for self, it's an extra syscall.
♻️ Suggested simplification
- let self_task = unsafe { mach_task_self() };
- let mut proc_task: MachPortT = 0;
- if unsafe { task_for_pid(self_task, libc::getpid(), &mut proc_task) } == KERN_SUCCESS {
-     let mut threads: ThreadActArrayT = core::ptr::null_mut();
-     let mut n_threads: MachMsgTypeNumberT = 0;
-     if unsafe { task_threads(proc_task, &mut threads, &mut n_threads) } == KERN_SUCCESS
+ let self_task = unsafe { mach_task_self() };
+ let mut threads: ThreadActArrayT = core::ptr::null_mut();
+ let mut n_threads: MachMsgTypeNumberT = 0;
+ if unsafe { task_threads(self_task, &mut threads, &mut n_threads) } == KERN_SUCCESS
+ {
+     if !threads.is_null() {
+         let _ = unsafe {
+             vm_deallocate(
+                 self_task,
+                 threads as libc::uintptr_t,
+                 (n_threads as usize * core::mem::size_of::<MachPortT>())
+                     as libc::uintptr_t,
+             )
+         };
+     }
+     return n_threads as isize;
+ }
+ 0
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/posix.rs` around lines 904 - 923, The code redundantly calls task_for_pid(self_task, getpid(), &mut proc_task); replace that flow by assigning proc_task = self_task (from mach_task_self()) and then call task_threads(proc_task, &mut threads, &mut n_threads) as before; keep the existing vm_deallocate call using self_task and threads, and return n_threads as isize, removing the task_for_pid branch and its associated conditional while preserving error handling around task_threads and vm_deallocate (symbols: mach_task_self, proc_task, task_for_pid, task_threads, vm_deallocate, n_threads).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/stdlib/posix.rs`:
- Around line 926-946: The current parsing of /proc/self/stat is fragile because
split_whitespace() will break when the comm field (inside parentheses) contains
spaces; instead, after reading the file and converting to string, find the index
of the last ')' in the line, take the substring after that character, then
split_whitespace() on that remainder and parse the desired field (the same field
you previously accessed with nth(19)) from the resultant iterator (preserve the
existing return behavior using parse::<isize>().unwrap_or(0)); replace the
direct line.split_whitespace().nth(19) usage with this "find last ')', take
substring after it, then split" approach to reliably locate the field even when
comm contains spaces.
---
Nitpick comments:
In `@crates/vm/src/stdlib/posix.rs`:
- Around line 904-923: The code redundantly calls task_for_pid(self_task,
getpid(), &mut proc_task); replace that flow by assigning proc_task = self_task
(from mach_task_self()) and then call task_threads(proc_task, &mut threads, &mut
n_threads) as before; keep the existing vm_deallocate call using self_task and
threads, and return n_threads as isize, removing the task_for_pid branch and its
associated conditional while preserving error handling around task_threads and
vm_deallocate (symbols: mach_task_self, proc_task, task_for_pid, task_threads,
vm_deallocate, n_threads).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: d8154526-8adb-41ea-84aa-3bb0378c4963
📒 Files selected for processing (1)
crates/vm/src/stdlib/posix.rs
#[cfg(target_os = "linux")]
{
    use std::io::Read as _;
    let mut file = match std::fs::File::open("/proc/self/stat") {
        Ok(f) => f,
        Err(_) => return 0,
    };
    let mut buf = [0u8; 160];
    let n = match file.read(&mut buf) {
        Ok(n) => n,
        Err(_) => return 0,
    };
    let line = match core::str::from_utf8(&buf[..n]) {
        Ok(s) => s,
        Err(_) => return 0,
    };
    if let Some(field) = line.split_whitespace().nth(19) {
        return field.parse::<isize>().unwrap_or(0);
    }
    0
}
Linux: /proc/self/stat parsing is fragile when comm contains spaces.
The comm field (field 2 in /proc/self/stat) is enclosed in parentheses and can contain spaces, parentheses, or newlines. Using split_whitespace().nth(19) will yield incorrect results if the process name contains spaces.
For example, a process named "my app" produces:
123 (my app) S 456 ...
split_whitespace() splits this into ["123", "(my", "app)", "S", ...], making nth(19) point to the wrong field.
CPython handles this by finding the last ) to locate the end of the comm field, then counting fields from there.
🛡️ Suggested fix to handle comm field correctly
#[cfg(target_os = "linux")]
{
use std::io::Read as _;
let mut file = match std::fs::File::open("/proc/self/stat") {
Ok(f) => f,
Err(_) => return 0,
};
let mut buf = [0u8; 160];
let n = match file.read(&mut buf) {
Ok(n) => n,
Err(_) => return 0,
};
let line = match core::str::from_utf8(&buf[..n]) {
Ok(s) => s,
Err(_) => return 0,
};
- if let Some(field) = line.split_whitespace().nth(19) {
- return field.parse::<isize>().unwrap_or(0);
+ // Find end of comm field (last ')') to handle spaces in process name
+ let after_comm = match line.rfind(')') {
+ Some(idx) => &line[idx + 1..],
+ None => return 0,
+ };
+ // Field 20 (num_threads) is the 17th field after comm (fields 3-19 are 17 fields)
+ if let Some(field) = after_comm.split_whitespace().nth(17) {
+ return field.parse::<isize>().unwrap_or(0);
}
0
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/posix.rs` around lines 926 - 946, The current parsing of
/proc/self/stat is fragile because split_whitespace() will break when the comm
field (inside parentheses) contains spaces; instead, after reading the file and
converting to string, find the index of the last ')' in the line, take the
substring after that character, then split_whitespace() on that remainder and
parse the desired field (the same field you previously accessed with nth(19))
from the resultant iterator (preserve the existing return behavior using
parse::<isize>().unwrap_or(0)); replace the direct
line.split_whitespace().nth(19) usage with this "find last ')', take substring
after it, then split" approach to reliably locate the field even when comm
contains spaces.
- Fix park_detached_threads: successful CAS no longer sets all_suspended=false, avoiding unnecessary polling rounds
- Replace park_timeout(50µs) with park() in wait_while_suspended
- Remove redundant self-suspension in attach_thread and detach_thread; the STW controller handles DETACHED→SUSPENDED via park_detached_threads
- Add double-check under mutex before condvar wait to prevent lost wakes
- Remove dead stats_detach_wait_yields field and add_detach_wait_yields
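The "double-check under mutex before condvar wait" item above is the classic lost-wakeup guard: re-test the predicate after taking the lock, and always wait in a loop. A minimal sketch with `std::sync` (not the actual RustPython code; function names are illustrative):

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Re-test the predicate after taking the mutex and wait in a loop: if the
// signaler already flipped the flag before we got here, we never block, so
// a notify that raced with lock acquisition cannot be lost. The loop also
// handles spurious wakeups.
fn wait_for_flag(pair: &(Mutex<bool>, Condvar)) {
    let (lock, cvar) = pair;
    let mut ready = lock.lock().unwrap();
    while !*ready {
        ready = cvar.wait(ready).unwrap();
    }
}

fn signal_and_wait() -> bool {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let signaler = {
        let pair = Arc::clone(&pair);
        thread::spawn(move || {
            let (lock, cvar) = &*pair;
            // Set the flag under the mutex, then notify.
            *lock.lock().unwrap() = true;
            cvar.notify_all();
        })
    };
    wait_for_flag(&pair);
    signaler.join().unwrap();
    true
}
```

Waiting without the predicate re-check (a bare `cvar.wait`) can hang forever if the notify fires between the caller's unlocked check and the wait.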
Like CPython's ThreadHandle_start, set RUNNING state in the parent thread immediately after spawn() succeeds, rather than in the child. This eliminates a race where join() could see Starting state if called before the child thread executes. Also reverts the macOS skip for test_start_new_thread_failed since the root cause is fixed.
…ad_at_finalization
Add lock_wrapped to ThreadMutex for detaching thread state while waiting on contended locks. Use it for buffered and text IO locks. Wrap FileIO read/write in allow_threads via crt_fd to prevent STW hangs on blocking file operations.
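The commit describes `lock_wrapped` only at a high level. The following is a minimal sketch of the shape such an API could take; all names are illustrative, `std::sync::Mutex` stands in for RustPython's `ThreadMutex`, and the closure argument stands in for `vm.allow_threads`:

```rust
use std::sync::{Mutex, MutexGuard};

struct WrappedMutex<T> {
    inner: Mutex<T>,
}

impl<T> WrappedMutex<T> {
    fn new(value: T) -> Self {
        Self { inner: Mutex::new(value) }
    }

    // The blocking acquire is handed to a caller-supplied wrapper, so the
    // caller decides the context it runs in, e.g. with the thread detached
    // from the VM so a stop-the-world pause cannot deadlock on this wait.
    fn lock_wrapped<'a>(
        &'a self,
        wrap: impl FnOnce(&dyn Fn() -> MutexGuard<'a, T>) -> MutexGuard<'a, T>,
    ) -> MutexGuard<'a, T> {
        wrap(&|| self.inner.lock().unwrap())
    }
}

fn demo() -> (bool, i32) {
    let m = WrappedMutex::new(0);
    let mut detached = false;
    {
        let mut guard = m.lock_wrapped(|do_lock| {
            // A real VM would run `do_lock` inside something like
            // vm.allow_threads(do_lock) here.
            detached = true;
            do_lock()
        });
        *guard += 1;
    }
    let value = *m.lock_wrapped(|do_lock| do_lock());
    (detached, value)
}
```

The point of the indirection is that the mutex itself stays oblivious to the VM: only the call site knows it must release the VM around the blocking wait.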
Replace parking_lot Mutex/Condvar with std::sync (pthread-based) for started_event and handle_ready_event. This prevents hangs in forked children where parking_lot's global HASHTABLE may be corrupted.
* apply more allow_threads
* Simplify STW thread state transitions
  - Fix park_detached_threads: successful CAS no longer sets all_suspended=false, avoiding unnecessary polling rounds
  - Replace park_timeout(50µs) with park() in wait_while_suspended
  - Remove redundant self-suspension in attach_thread and detach_thread; the STW controller handles DETACHED→SUSPENDED via park_detached_threads
  - Add double-check under mutex before condvar wait to prevent lost wakes
  - Remove dead stats_detach_wait_yields field and add_detach_wait_yields
* Representable for ThreadHandle
* Set ThreadHandle state to Running in parent thread after spawn
  Like CPython's ThreadHandle_start, set RUNNING state in the parent thread immediately after spawn() succeeds, rather than in the child. This eliminates a race where join() could see Starting state if called before the child thread executes. Also reverts the macOS skip for test_start_new_thread_failed since the root cause is fixed.
* Set ThreadHandle state to Running in parent thread after spawn
* Add debug_assert for thread state in start_the_world
* Unskip now-passing test_get_event_loop_thread and test_start_new_thread_at_finalization
* Wrap IO locks and file ops in allow_threads
  Add lock_wrapped to ThreadMutex for detaching thread state while waiting on contended locks. Use it for buffered and text IO locks. Wrap FileIO read/write in allow_threads via crt_fd to prevent STW hangs on blocking file operations.
* Use std::sync for thread start/ready events
  Replace parking_lot Mutex/Condvar with std::sync (pthread-based) for started_event and handle_ready_event. This prevents hangs in forked children where parking_lot's global HASHTABLE may be corrupted.
Summary by CodeRabbit
New Features
Bug Fixes
Performance