Clear frame locals and stack on generator close + Add dir_fd support for rmdir, remove/unlink, scandir#7222
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughMoves cell/free storage into FrameState with new accessors and cleanup calls; adds fd-based scandir/rmdir/unlink support with OwnedDir and ScandirIter/DirEntry fd fields; switches WASI OsStrExt UTF‑8 validation to core::str::from_utf8. Changes
Sequence DiagramsequenceDiagram
participant Py as Python
participant VM as "VM (os module)"
participant OwnedDir as "OwnedDir"
participant LibC as "libc"
participant DirEntry as "DirEntry"
Py->>VM: scandir(dir_fd=fd)
VM->>OwnedDir: OwnedDir::from_fd(fd) / dup fd
OwnedDir->>LibC: fdopendir(fd)
LibC-->>OwnedDir: DIR*
loop for each entry
VM->>LibC: readdir(dir_ptr)
LibC-->>VM: dirent*
VM->>DirEntry: build(name, d_type?, dir_fd?, stat info)
VM->>Py: yield DirEntry
end
Py->>VM: close iterator
VM->>OwnedDir: drop -> rewinddir/closedir(dir_ptr)
OwnedDir->>LibC: closedir(dir_ptr)
LibC-->>OwnedDir: ok
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin gen-os |
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [ ] test: cpython/Lib/test/test_generators.py (TODO: 12) dependencies: dependent tests: (no tests depend on generator) [x] lib: cpython/Lib/os.py dependencies:
dependent tests: (165 tests)
Legend:
|
915a617 to
7737a37
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/os.rs (1)
608-623:⚠️ Potential issue | 🔴 CriticalThe safety comment at line 643-645 is incorrect and misleading.
The comment claims the fd is "borrowed for the lifetime of this DirEntry reference," but
DirEntryis aPyPayloadwith no lifetime parameter—it stores a barecrt_fd::Raw(a primitivei32), not aBorrowed<'fd>. This means:
- Python code can hold
DirEntryobjects indefinitely after the iterator closes.- If the user's original file descriptor is closed (either by dropping the iterator's
OwnedDiror by the user closingfnodirectly), subsequent calls toDirEntry.stat()orDirEntry.is_*()will fail withEBADFwhen they attemptfstatat().- The current
stat_dir_fd()implementation (lines 640-649) creates aBorrowedfrom the raw fd, but this has no runtime enforcement—the fd could be invalid.While this mirrors CPython's behavior (fd-based
DirEntry.stat()fails if the fd is closed), the misleading safety comment should be removed or replaced with a clear warning. Consider adding either a runtime check (e.g., validate the fd before use) or explicit documentation thatDirEntryobjects are unsafe to use after the iterator or originating fd is closed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 608 - 623, Replace the incorrect safety comment on DirEntry (referencing raw crt_fd::Raw and Borrowed) with a clear warning: state that DirEntry stores a raw fd (crt_fd::Raw) with no lifetime guarantees, so DirEntry.stat(), DirEntry.is_*(), and stat_dir_fd() may fail with EBADF if the original fd (OwnedDir or the user's fd) is closed; update the comment near stat_dir_fd(), DirEntry, and any usages to reflect this behavior and suggest that callers must not rely on the fd remaining valid or should validate the fd before use (optionally implement a runtime fd-validity check if desired).
🧹 Nitpick comments (3)
crates/vm/src/stdlib/os.rs (3)
1026-1034: Therewinddircall inDropappears unnecessary and the comment is misleading.The comment in
listdir(lines 437-438) states thatOwnedDir::drop"restores the original fd's directory position," but this is incorrect. TheOwnedDirwraps a duplicated fd (fromnix::unistd::dup), andrewinddiron the dup'd directory stream does not affect the original fd's position—they are independent.If the intent is just cleanup,
rewinddirbeforeclosedirserves no purpose. Consider removing it:♻️ Proposed simplification
impl Drop for OwnedDir { fn drop(&mut self) { unsafe { - libc::rewinddir(self.0.as_ptr()); libc::closedir(self.0.as_ptr()); } } }Also update the misleading comment in
listdir(lines 437-438).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 1026 - 1034, The Drop impl for OwnedDir currently calls libc::rewinddir then libc::closedir, but rewinddir on the duplicated DIR does not affect the original FD and is unnecessary; remove the unsafe libc::rewinddir(self.0.as_ptr()) call from impl Drop for OwnedDir and leave only libc::closedir(self.0.as_ptr()) for cleanup, and update the comment in the listdir function that currently claims OwnedDir::drop "restores the original fd's directory position" to reflect that the dup'd directory stream is independent and OwnedDir::drop only closes the duplicated DIR.
637-649: Clarify the comment about fd origin.The comment on line 643-644 states "the fd came from os.open()" but for fd-based scandir,
orig_fdis the fd passed toscandir(), not necessarily fromos.open(). Consider clarifying:- // Safety: the fd came from os.open() and is borrowed for - // the lifetime of this DirEntry reference. + // Safety: the fd is the original fd passed to scandir() and must + // remain valid for the lifetime of this DirEntry.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 637 - 649, Clarify the comment inside stat_dir_fd: the stored fd (self.dir_fd) may be the fd passed into scandir (orig_fd) rather than always coming from os.open(); update the comment to state that the fd was provided by the caller or produced by fd-based scandir (e.g., "the fd was provided to scandir/originated from the caller, not necessarily from os.open()"), and keep the safety justification about borrowing via crt_fd::Borrowed::borrow_raw intact; reference the stat_dir_fd method and the dir_fd field in your wording.
1043-1048: TheSend/Syncimplementation relies on external synchronization invariants.The
unsafe impl Send/Sync for OwnedDiris sound only because all accesses withinScandirIteratorFdare protected byPyMutex. However,OwnedDiris also used directly inlistdir(lines 433-459) without synchronization—though that's safe since it's stack-local and single-threaded.Consider adding a more explicit safety comment documenting these invariants, or making
OwnedDir!Send + !Syncby default and only wrapping it in aSend+Syncnewtype where synchronization is guaranteed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 1043 - 1048, The current unsafe impls for Send and Sync on OwnedDir rely on external synchronization (ScandirIteratorFd's PyMutex) and on stack-local use in listdir, which isn't obvious; either document this invariant clearly or change the type to be non-Send/Sync and only expose a synchronized wrapper: remove or avoid unsafe impl Send/Sync for OwnedDir, add a detailed safety comment on OwnedDir explaining that it must only be accessed under the PyMutex in ScandirIteratorFd (and that listdir uses a stack-local OwnedDir), and if you prefer stricter type safety create a new wrapper type (e.g., ScopedOwnedDirSync or OwnedDirGuard) that implements Send+Sync and is the only type used inside ScandirIteratorFd so the synchronization invariant is enforced by types rather than comments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/frame.rs`:
- Around line 226-234: clear_locals_and_stack currently clears stack and
fastlocals but leaves Frame::cells_frees populated so cell-held PyCellRef
objects remain alive; update clear_locals_and_stack to also release/clear the
frame's cell references (Frame::cells_frees) when closing a frame: acquire the
appropriate lock/guard for cells_frees and drop or replace its entries (or clear
the container) so cell-held objects are released, analogous to how fastlocals is
zeroed, ensuring all frame-owned references are freed on close.
In `@crates/vm/src/stdlib/os.rs`:
- Around line 651-672: test_mode_via_stat currently calls
self.stat(DirFd::default(), ...) which ignores the stored directory FD and
causes stat to fail for fd-based scandir entries; change the call to use
self.stat_dir_fd() as the DirFd argument so stat is performed relative to the
stored directory FD (keep FollowSymlinks(follow_symlinks), same error handling),
updating the invocation inside test_mode_via_stat to pass self.stat_dir_fd()
instead of DirFd::default().
---
Outside diff comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 608-623: Replace the incorrect safety comment on DirEntry
(referencing raw crt_fd::Raw and Borrowed) with a clear warning: state that
DirEntry stores a raw fd (crt_fd::Raw) with no lifetime guarantees, so
DirEntry.stat(), DirEntry.is_*(), and stat_dir_fd() may fail with EBADF if the
original fd (OwnedDir or the user's fd) is closed; update the comment near
stat_dir_fd(), DirEntry, and any usages to reflect this behavior and suggest
that callers must not rely on the fd remaining valid or should validate the fd
before use (optionally implement a runtime fd-validity check if desired).
---
Nitpick comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 1026-1034: The Drop impl for OwnedDir currently calls
libc::rewinddir then libc::closedir, but rewinddir on the duplicated DIR does
not affect the original FD and is unnecessary; remove the unsafe
libc::rewinddir(self.0.as_ptr()) call from impl Drop for OwnedDir and leave only
libc::closedir(self.0.as_ptr()) for cleanup, and update the comment in the
listdir function that currently claims OwnedDir::drop "restores the original
fd's directory position" to reflect that the dup'd directory stream is
independent and OwnedDir::drop only closes the duplicated DIR.
- Around line 637-649: Clarify the comment inside stat_dir_fd: the stored fd
(self.dir_fd) may be the fd passed into scandir (orig_fd) rather than always
coming from os.open(); update the comment to state that the fd was provided by
the caller or produced by fd-based scandir (e.g., "the fd was provided to
scandir/originated from the caller, not necessarily from os.open()"), and keep
the safety justification about borrowing via crt_fd::Borrowed::borrow_raw
intact; reference the stat_dir_fd method and the dir_fd field in your wording.
- Around line 1043-1048: The current unsafe impls for Send and Sync on OwnedDir
rely on external synchronization (ScandirIteratorFd's PyMutex) and on
stack-local use in listdir, which isn't obvious; either document this invariant
clearly or change the type to be non-Send/Sync and only expose a synchronized
wrapper: remove or avoid unsafe impl Send/Sync for OwnedDir, add a detailed
safety comment on OwnedDir explaining that it must only be accessed under the
PyMutex in ScandirIteratorFd (and that listdir uses a stack-local OwnedDir), and
if you prefer stricter type safety create a new wrapper type (e.g.,
ScopedOwnedDirSync or OwnedDirGuard) that implements Send+Sync and is the only
type used inside ScandirIteratorFd so the synchronization invariant is enforced
by types rather than comments.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
Lib/test/test_generators.pyis excluded by!Lib/**Lib/test/test_os.pyis excluded by!Lib/**
📒 Files selected for processing (5)
crates/common/src/os.rscrates/vm/src/coroutine.rscrates/vm/src/frame.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rs
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
crates/vm/src/stdlib/os.rs (2)
1187-1195: Consider extracting shareddup+fdopendirsetup logic.This block and the
listdirfd branch (Line 431-Line 436) duplicate the same setup/error-handling pattern. A tiny helper would reduce drift risk.As per coding guidelines: When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 1187 - 1195, Extract the duplicated "dup the fd, convert to raw_fd, call OwnedDir::from_fd and close the dup on error" sequence into a small helper (e.g. dup_fd_to_owned_dir or into_owned_dir) that takes the original file descriptor (fno) and the VM/context and returns Result<OwnedDir, PyException>; replace both places (the current block using nix::unistd::dup + IntoRawFd + OwnedDir::from_fd and the listdir fd branch) to call this helper so the dup/cleanup/error mapping logic is centralized and the branches only provide the differing fd value.
370-383: MoveCStringconversion into thedir_fdsyscall branch.Line 370 does conversion even when
fs::remove_dirfallback is used. Scoping it to theunlinkatpath avoids unnecessary conversion work and keeps target-specific builds cleaner.♻️ Suggested refactor
- let c_path = path.clone().into_cstring(vm)?; #[cfg(not(target_os = "redox"))] if let Some(fd) = dir_fd.raw_opt() { + let c_path = path.clone().into_cstring(vm)?; let res = unsafe { libc::unlinkat(fd, c_path.as_ptr(), libc::AT_REMOVEDIR) }; return if res < 0 { let err = crate::common::os::errno_io_error(); Err(OSErrorBuilder::with_filename(&err, path, vm)) } else { Ok(()) }; }As per coding guidelines: Always run clippy to lint Rust code (
cargo clippy) before completing tasks and fix any warnings or lints that are introduced by your changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 370 - 383, The CString conversion (path.clone().into_cstring(vm)) is done unconditionally but is only needed for the unlinkat syscall path; move the conversion into the non-Redox branch that handles dir_fd.raw_opt() so you only call path.clone().into_cstring(vm) when preparing the c_path for unsafe { libc::unlinkat(...) } and return the same OSErrorBuilder::with_filename(...) on error; leave the target_os = "redox" handling and the fs::remove_dir fallback untouched and ensure you still call into_cstring on a cloned path inside the if let Some(fd) = dir_fd.raw_opt() block to avoid unnecessary work and clippy warnings.crates/vm/src/stdlib/posix.rs (1)
515-516: Scope CString conversion to theunlinkatpath only.Line 515 converts
pathtoCStringunconditionally, but the value is only needed for theunlinkatbranch. Moving the conversion inside that branch keeps fallback behavior leaner and avoids target-specific unused-work patterns.♻️ Suggested refactor
- let c_path = path.clone().into_cstring(vm)?; #[cfg(not(target_os = "redox"))] if let Some(fd) = dir_fd.raw_opt() { + let c_path = path.clone().into_cstring(vm)?; let res = unsafe { libc::unlinkat(fd, c_path.as_ptr(), 0) }; return if res < 0 { let err = crate::common::os::errno_io_error(); Err(OSErrorBuilder::with_filename(&err, path, vm)) } else { Ok(()) }; }As per coding guidelines: Always run clippy to lint Rust code (
cargo clippy) before completing tasks and fix any warnings or lints that are introduced by your changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/posix.rs` around lines 515 - 516, Move the CString conversion so it only happens when needed by the unlinkat code path: remove the unconditional let c_path = path.clone().into_cstring(vm)?; and instead perform path.clone().into_cstring(vm)? inside the #[cfg(not(target_os = "redox"))] branch before calling unlinkat (referencing c_path and unlinkat). This avoids doing work on targets that don't use unlinkat and keeps fallback behavior unchanged; run cargo clippy and fix any new lints after the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 727-739: The is_symlink path currently returns a synthetic
Err("file_type unavailable") which discards the original IO error; change it to
propagate the underlying file_type error (convert it to a PyException via
into_pyexception(vm)) and preserve the NotFound -> Ok(false) behavior used by
is_dir/is_file; specifically, adjust the fallback in the is_symlink method that
currently hits the #[cfg(not(unix))] branch so that if self.file_type is Err(e)
you convert and return that e.into_pyexception(vm), and only produce Ok(false)
for NotFound errors before falling back to test_mode_via_stat(self, false,
libc::S_IFLNK as _, vm) on unix as currently done.
---
Nitpick comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 1187-1195: Extract the duplicated "dup the fd, convert to raw_fd,
call OwnedDir::from_fd and close the dup on error" sequence into a small helper
(e.g. dup_fd_to_owned_dir or into_owned_dir) that takes the original file
descriptor (fno) and the VM/context and returns Result<OwnedDir, PyException>;
replace both places (the current block using nix::unistd::dup + IntoRawFd +
OwnedDir::from_fd and the listdir fd branch) to call this helper so the
dup/cleanup/error mapping logic is centralized and the branches only provide the
differing fd value.
- Around line 370-383: The CString conversion (path.clone().into_cstring(vm)) is
done unconditionally but is only needed for the unlinkat syscall path; move the
conversion into the non-Redox branch that handles dir_fd.raw_opt() so you only
call path.clone().into_cstring(vm) when preparing the c_path for unsafe {
libc::unlinkat(...) } and return the same OSErrorBuilder::with_filename(...) on
error; leave the target_os = "redox" handling and the fs::remove_dir fallback
untouched and ensure you still call into_cstring on a cloned path inside the if
let Some(fd) = dir_fd.raw_opt() block to avoid unnecessary work and clippy
warnings.
In `@crates/vm/src/stdlib/posix.rs`:
- Around line 515-516: Move the CString conversion so it only happens when
needed by the unlinkat code path: remove the unconditional let c_path =
path.clone().into_cstring(vm)?; and instead perform
path.clone().into_cstring(vm)? inside the #[cfg(not(target_os = "redox"))]
branch before calling unlinkat (referencing c_path and unlinkat). This avoids
doing work on targets that don't use unlinkat and keeps fallback behavior
unchanged; run cargo clippy and fix any new lints after the change.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_os.pyis excluded by!Lib/**
📒 Files selected for processing (3)
crates/common/src/os.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/common/src/os.rs
Add Frame::clear_locals_and_stack() to release references held by closed generators/coroutines, matching _PyFrame_ClearLocals behavior. Call it from Coro::close() after marking the coroutine as closed. Update test_generators.py expectedFailure markers accordingly.
- rmdir: use unlinkat(fd, path, AT_REMOVEDIR) when dir_fd given - remove/unlink: use unlinkat(fd, path, 0) when dir_fd given - scandir: accept fd via fdopendir, add ScandirIteratorFd - listdir: rewrite fd path to use raw readdir instead of nix::dir::Dir - DirEntry: add d_type and dir_fd fields for fd-based scandir - Update supports_fd/supports_dir_fd entries accordingly
8d71bb1 to
fc51504
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)
222-227: Doc comment is misleading: fastlocals still holds cell references after this call.
clear_stack_and_cellsclearsstate.cells_frees, but the samePyCellRefobjects are also stored infastlocals[nlocals..](set at frame creation, lines 191-193). After calling this method, cells remain reachable throughfastlocals.Consider updating the doc comment to reflect the actual behavior:
📝 Suggested doc comment fix
- /// Clear the evaluation stack and drop all cell/free variable references. + /// Clear the evaluation stack and release the `cells_frees` array. + /// Note: Cell objects remain accessible via fastlocals until `clear_locals_and_stack` is called.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` around lines 222 - 227, The doc comment on clear_stack_and_cells is inaccurate: while the method clears state.stack and replaces state.cells_frees, it does not drop cell references still stored in fastlocals (the fastlocals[nlocals..] slice set at frame creation), so cells remain reachable; update the comment for clear_stack_and_cells to state precisely that it clears the evaluation stack and clears the frame's cells_frees storage but does not remove cell references from fastlocals, and mention the residual reachability via fastlocals to avoid implying all cell/free refs are dropped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 1056-1058: Two different Rust types are being exported to Python
with the same pyclass name "ScandirIter", causing ambiguous os.ScandirIter
resolution; change the duplicate #[pyclass(name = "ScandirIter")] to a unique
exported name (e.g., #[pyclass(name = "ScandirIterFd")] or remove the name
attribute to use the Rust type name) and update any constructors, factory
functions, type checks, and isinstance usages that reference the fd-based
iterator to the new name (symbols to update include the duplicate pyclass
declaration at ScandirIter, its impl/constructor functions, and any code that
returns or matches on that type). Ensure the public API and tests reflect the
new exported name so path-based and fd-based iterators are distinct in Python.
- Around line 437-438: Update the misleading comment in the OwnedDir::drop
block: replace the phrase that claims rewinddir() "restores the original fd's
directory position" with a precise statement that rewinddir() resets the
directory stream to the beginning (and that preserving/restoring a prior
position would require telldir()/seekdir(), which are not used here); keep the
note about closedir() but correct the behavior description to accurately reflect
rewinddir() resetting to the start.
---
Nitpick comments:
In `@crates/vm/src/frame.rs`:
- Around line 222-227: The doc comment on clear_stack_and_cells is inaccurate:
while the method clears state.stack and replaces state.cells_frees, it does not
drop cell references still stored in fastlocals (the fastlocals[nlocals..] slice
set at frame creation), so cells remain reachable; update the comment for
clear_stack_and_cells to state precisely that it clears the evaluation stack and
clears the frame's cells_frees storage but does not remove cell references from
fastlocals, and mention the residual reachability via fastlocals to avoid
implying all cell/free refs are dropped.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
Lib/test/test_generators.pyis excluded by!Lib/**Lib/test/test_os.pyis excluded by!Lib/**
📒 Files selected for processing (8)
crates/common/src/os.rscrates/vm/src/builtins/frame.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/super.rscrates/vm/src/coroutine.rscrates/vm/src/frame.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/coroutine.rs
crates/vm/src/stdlib/os.rs
Outdated
| // OwnedDir::drop calls rewinddir then closedir, | ||
| // restoring the original fd's directory position. |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the code at the specified lines
head -n 50 crates/vm/src/stdlib/os.rs | tail -n 20Repository: RustPython/RustPython
Length of output: 1730
🏁 Script executed:
# Get better context around lines 437-438
sed -n '420,450p' crates/vm/src/stdlib/os.rsRepository: RustPython/RustPython
Length of output: 1692
🏁 Script executed:
# Get context around lines 1035-1036
sed -n '1020,1050p' crates/vm/src/stdlib/os.rsRepository: RustPython/RustPython
Length of output: 975
🏁 Script executed:
# Let's also search for any telldir/seekdir usage in the file
rg 'telldir|seekdir' crates/vm/src/stdlib/os.rs -nRepository: RustPython/RustPython
Length of output: 47
🌐 Web query:
POSIX rewinddir() behavior - does it restore a saved directory position or reset to beginning
💡 Result:
rewinddir(dirp) resets the directory stream to the beginning of the directory (and also makes the stream reflect the directory’s current state, as if you’d just reopened it). It does not restore some previously-saved position. [1][2]
If you want to save/restore a position, use telldir() to get a location and later seekdir() with that value—but a call to rewinddir() invalidates locations previously returned by telldir() for that stream (so seeking back with an old value after rewinding is not portable). [3]
Sources:
[1] POSIX rewinddir() description (man7 POSIX Programmer’s Manual): (man7.org)
[2] glibc manual, “Random Access in a Directory Stream”: (sourceware.org)
[3] QNX seekdir() docs on telldir() values becoming invalid after rewinddir(): (qnx.com)
Citations:
- 1: https://man7.org/linux/man-pages/man3/rewinddir.3p.html
- 2: https://sourceware.org/glibc/manual/latest/html_node/Random-Access-Directory.html?utm_source=openai
- 3: https://www.qnx.com/developers/docs/8.0/com.qnx.doc.neutrino.lib_ref/topic/s/seekdir.html?utm_source=openai
Correct the misleading comment about directory position restoration.
The comment at lines 437-438 states that rewinddir() restores the original directory position, but rewinddir() actually resets the directory stream to the beginning. To save and restore a prior position, telldir() and seekdir() would need to be used, which are not present in this code. Update the comment to accurately reflect that rewinddir() resets the directory stream.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/os.rs` around lines 437 - 438, Update the misleading
comment in the OwnedDir::drop block: replace the phrase that claims rewinddir()
"restores the original fd's directory position" with a precise statement that
rewinddir() resets the directory stream to the beginning (and that
preserving/restoring a prior position would require telldir()/seekdir(), which
are not used here); keep the note about closedir() but correct the behavior
description to accurately reflect rewinddir() resetting to the start.
| #[pyattr] | ||
| #[pyclass(name = "ScandirIter")] | ||
| #[derive(Debug, PyPayload)] |
There was a problem hiding this comment.
Avoid exporting two different types as ScandirIter.
Line 1057 introduces a second #[pyclass(name = "ScandirIter")] while the existing path-based iterator already uses that same exported name. This can make os.ScandirIter resolution and isinstance behavior inconsistent between path-based and fd-based iterators.
Suggested minimal fix
- #[pyattr]
#[pyclass(name = "ScandirIter")]
#[derive(Debug, PyPayload)]
struct ScandirIteratorFd {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #[pyattr] | |
| #[pyclass(name = "ScandirIter")] | |
| #[derive(Debug, PyPayload)] | |
| #[pyclass(name = "ScandirIter")] | |
| #[derive(Debug, PyPayload)] |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/os.rs` around lines 1056 - 1058, Two different Rust
types are being exported to Python with the same pyclass name "ScandirIter",
causing ambiguous os.ScandirIter resolution; change the duplicate #[pyclass(name
= "ScandirIter")] to a unique exported name (e.g., #[pyclass(name =
"ScandirIterFd")] or remove the name attribute to use the Rust type name) and
update any constructors, factory functions, type checks, and isinstance usages
that reference the fd-based iterator to the new name (symbols to update include
the duplicate pyclass declaration at ScandirIter, its impl/constructor
functions, and any code that returns or matches on that type). Ensure the public
API and tests reflect the new exported name so path-based and fd-based iterators
are distinct in Python.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
crates/vm/src/stdlib/os.rs (2)
1059-1062:⚠️ Potential issue | 🟠 MajorAvoid exporting two different iterator types as
ScandirIter.Line 1060 introduces a second
#[pyclass(name = "ScandirIter")]while Line 899 already exports that name. This can makeos.ScandirIterresolution andisinstancebehavior inconsistent between path-based and fd-based iterators.#!/bin/bash # Verify duplicate exported ScandirIter class declarations in os.rs rg -n -C2 '^\s*#\[pyclass\(name = "ScandirIter"\)\]|^\s*struct ScandirIterator(Fd)?\b' crates/vm/src/stdlib/os.rsMinimal collision fix
- #[pyattr] - #[pyclass(name = "ScandirIter")] + #[pyclass(name = "ScandirIterFd")] #[derive(Debug, PyPayload)] struct ScandirIteratorFd {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 1059 - 1062, The file defines two distinct Rust types both annotated as #[pyclass(name = "ScandirIter")] (ScandirIterator and ScandirIteratorFd) causing duplicate exports; fix by giving the fd-based type a distinct Python class name (e.g., change the attribute on ScandirIteratorFd from #[pyclass(name = "ScandirIter")] to #[pyclass(name = "ScandirIterFd")] or another unique name), then update any module exports, factory/constructor code, and isinstance/type-check usage that expect the fd variant to use the new name so there are no conflicting Python class names.
1038-1039:⚠️ Potential issue | 🟡 Minor
rewinddir()does not restore prior position.At Line 1038,
rewinddir()resets the stream to the beginning. It cannot restore an arbitrary prior directory position, so the claim at Line 437-Line 438 is inaccurate.POSIX rewinddir semantics: does rewinddir() restore a previously saved directory position, or does it reset to the beginning of the directory stream?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 1038 - 1039, The code is calling libc::rewinddir(self.0.as_ptr()) but the surrounding comment/claim that rewinddir restores a previously saved directory position is incorrect: rewinddir always resets to the beginning of the stream and cannot restore an arbitrary prior position. Update the implementation and comments around the use of rewinddir/closedir (the directory stream handling code that calls libc::rewinddir and libc::closedir) to state the correct POSIX semantics (rewinddir resets to start); if the original intent was to restore a saved position, replace the logic to save and restore positions using libc::telldir to record a position and libc::seekdir to restore it (or remove the restore attempt entirely if unnecessary).
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)
222-227:clear_stack_and_cellscurrently overstates what it clears.At Line 226,
state.cells_freesis dropped, butfastlocalsstill retains cell/freePyCellRefs initialized at Line 191-Line 193. If this method is called directly, cell/free refs are still reachable.Suggested adjustment
- /// Clear the evaluation stack and drop all cell/free variable references. + /// Clear evaluation stack and state-owned cell/free references. + /// For full local/cell cleanup, call `clear_locals_and_stack()`. pub(crate) fn clear_stack_and_cells(&self) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` around lines 222 - 227, clear_stack_and_cells currently drops state.cells_frees but leaves cell/free PyCellRef references in fastlocals alive; update the method (clear_stack_and_cells) to also remove or take/clear the frame's fastlocals storage that holds those PyCellRef entries (the same place initialized around fastlocals at Lines ~191-193) so that all cell/free references are dropped when clearing the stack — e.g., take or clear the fastlocals container on the locked state in addition to state.cells_frees and state.stack to ensure no cell/free refs remain reachable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 1059-1062: The file defines two distinct Rust types both annotated
as #[pyclass(name = "ScandirIter")] (ScandirIterator and ScandirIteratorFd)
causing duplicate exports; fix by giving the fd-based type a distinct Python
class name (e.g., change the attribute on ScandirIteratorFd from #[pyclass(name
= "ScandirIter")] to #[pyclass(name = "ScandirIterFd")] or another unique name),
then update any module exports, factory/constructor code, and
isinstance/type-check usage that expect the fd variant to use the new name so
there are no conflicting Python class names.
- Around line 1038-1039: The code is calling libc::rewinddir(self.0.as_ptr())
but the surrounding comment/claim that rewinddir restores a previously saved
directory position is incorrect: rewinddir always resets to the beginning of the
stream and cannot restore an arbitrary prior position. Update the implementation
and comments around the use of rewinddir/closedir (the directory stream handling
code that calls libc::rewinddir and libc::closedir) to state the correct POSIX
semantics (rewinddir resets to start); if the original intent was to restore a
saved position, replace the logic to save and restore positions using
libc::telldir to record a position and libc::seekdir to restore it (or remove
the restore attempt entirely if unnecessary).
---
Nitpick comments:
In `@crates/vm/src/frame.rs`:
- Around line 222-227: clear_stack_and_cells currently drops state.cells_frees
but leaves cell/free PyCellRef references in fastlocals alive; update the method
(clear_stack_and_cells) to also remove or take/clear the frame's fastlocals
storage that holds those PyCellRef entries (the same place initialized around
fastlocals at Lines ~191-193) so that all cell/free references are dropped when
clearing the stack — e.g., take or clear the fastlocals container on the locked
state in addition to state.cells_frees and state.stack to ensure no cell/free
refs remain reachable.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
crates/vm/src/builtins/frame.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/super.rscrates/vm/src/frame.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- crates/vm/src/builtins/super.rs
- crates/vm/src/builtins/frame.rs
- crates/vm/src/stdlib/posix.rs
There was a problem hiding this comment.
♻️ Duplicate comments (2)
crates/vm/src/stdlib/os.rs (2)
437-438:⚠️ Potential issue | 🟡 MinorFix the
rewinddir()behavior comment.Line 437/Line 438 says
rewinddir()restores the original directory position, butrewinddir()resets to the beginning of the directory stream. The implementation note should reflect that behavior precisely.POSIX rewinddir() semantics: does rewinddir restore a prior saved position, or reset to the beginning of the directory stream?Suggested comment-only diff
- // OwnedDir::drop calls rewinddir then closedir, - // restoring the original fd's directory position. + // OwnedDir::drop calls rewinddir then closedir. + // rewinddir() resets the stream to the beginning.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 437 - 438, The comment next to OwnedDir::drop incorrectly states that rewinddir() "restores the original fd's directory position"; update the comment to accurately describe POSIX rewinddir() semantics by saying rewinddir() resets the directory stream to the beginning (start of the directory), not restoring a previously saved cursor position — keep the note tied to OwnedDir::drop and mention that rewinddir() seeks to the start of the directory stream before closedir().
899-901:⚠️ Potential issue | 🟠 MajorResolve duplicate exported
ScandirIterclass names.Line 899 and Line 1060 both export distinct Rust types as
ScandirIter. This can makeos.ScandirIteridentity/lookup ambiguous and produce inconsistentisinstancebehavior across path-based vs fd-based iterators.Suggested minimal direction
- #[pyattr] - #[pyclass(name = "ScandirIter")] + #[pyattr] + #[pyclass(name = "ScandirIterFd")] #[derive(Debug, PyPayload)] struct ScandirIteratorFd {#!/bin/bash # Verify duplicate exported pyclass names in os.rs rg -n '#\[pyclass\(name = "ScandirIter"\)\]|struct ScandirIterator|struct ScandirIteratorFd' crates/vm/src/stdlib/os.rs -C 2Also applies to: 1059-1062
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 899 - 901, Two different Rust types are being exported with the same Python class name "ScandirIter" (the structs ScandirIterator and ScandirIteratorFd), causing ambiguous runtime identity; fix by giving the fd-based iterator a distinct pyclass name (e.g., change the #[pyclass(name = "ScandirIter")] on the ScandirIteratorFd definition to something like #[pyclass(name = "ScandirIterFd")] or similar) and update all references/usages that expect the old name (any isinstance/type checks or constructor exposures that reference ScandirIter, and the ScandirIteratorFd symbol) so the two exported classes are uniquely named and consistent.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/os.rs (1)
675-723: Extract sharedd_type/stat fallback logic used byis_dirandis_file.Line 675-723 duplicates almost the same control flow with only mode/type constants differing. Consider a shared helper (target d_type + target mode bit) to keep behavior aligned and reduce future drift.
As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/os.rs` around lines 675 - 723, Extract the duplicated d_type/stat fallback control flow from is_dir and is_file into a shared helper (e.g., a method on the same struct like test_type_via_dtype_or_stat) that accepts the follow_symlinks flag, the d_type target constant (libc::DT_DIR or libc::DT_REG) and the stat mode mask (libc::S_IFDIR as _ or libc::S_IFREG as _), plus &VirtualMachine; move the common checks for self.file_type, self.d_type, the DT_UNKNOWN/DT_LNK logic and the fs_metadata fallback into that helper, and replace the bodies of is_dir and is_file to call it with the appropriate constants (keeping FollowSymlinks wrapper usage and vm forwarding) so behavior remains identical and duplication is removed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 437-438: The comment next to OwnedDir::drop incorrectly states
that rewinddir() "restores the original fd's directory position"; update the
comment to accurately describe POSIX rewinddir() semantics by saying rewinddir()
resets the directory stream to the beginning (start of the directory), not
restoring a previously saved cursor position — keep the note tied to
OwnedDir::drop and mention that rewinddir() seeks to the start of the directory
stream before closedir().
- Around line 899-901: Two different Rust types are being exported with the same
Python class name "ScandirIter" (the structs ScandirIterator and
ScandirIteratorFd), causing ambiguous runtime identity; fix by giving the
fd-based iterator a distinct pyclass name (e.g., change the #[pyclass(name =
"ScandirIter")] on the ScandirIteratorFd definition to something like
#[pyclass(name = "ScandirIterFd")] or similar) and update all references/usages
that expect the old name (any isinstance/type checks or constructor exposures
that reference ScandirIter, and the ScandirIteratorFd symbol) so the two
exported classes are uniquely named and consistent.
---
Nitpick comments:
In `@crates/vm/src/stdlib/os.rs`:
- Around line 675-723: Extract the duplicated d_type/stat fallback control flow
from is_dir and is_file into a shared helper (e.g., a method on the same struct
like test_type_via_dtype_or_stat) that accepts the follow_symlinks flag, the
d_type target constant (libc::DT_DIR or libc::DT_REG) and the stat mode mask
(libc::S_IFDIR as _ or libc::S_IFREG as _), plus &VirtualMachine; move the
common checks for self.file_type, self.d_type, the DT_UNKNOWN/DT_LNK logic and
the fs_metadata fallback into that helper, and replace the bodies of is_dir and
is_file to call it with the appropriate constants (keeping FollowSymlinks
wrapper usage and vm forwarding) so behavior remains identical and duplication
is removed.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
crates/vm/src/builtins/frame.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/super.rscrates/vm/src/frame.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/vm/src/builtins/frame.rs
- crates/vm/src/builtins/super.rs
Summary by CodeRabbit
New Features
Bug Fixes
Refactor