Add method cache for MRO lookups #7313
Conversation
Caution: Review failed. The pull request is closed. ℹ️ Recent review info: Configuration used: Path: .coderabbit.yml | Review profile: CHILL | Plan: Pro. 📒 Files selected for processing (3)
📝 Walkthrough: Adds a lock-free, direct-mapped MCACHE for PyType attribute/MRO lookups with version-based invalidation, integrates cache-aware lookup/population into attribute resolution paths, exposes …
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client (get_attr)
    participant Type as PyType
    participant MCACHE as MCACHE
    participant MRO as MRO Search
    Client->>Type: get_attr(name)
    Type->>Type: find_name_in_mro(name)
    Type->>MCACHE: Check entry using tp_version_tag
    alt Cache Hit
        MCACHE-->>Type: Return cached value
    else Cache Miss
        Type->>MRO: find_name_in_mro_uncached(name)
        MRO-->>Type: Search result (found / not found)
        alt Found and version valid
            Type->>MCACHE: Populate cache entry (if allowed)
            MCACHE-->>Type: Cache stored
        end
    end
    Type-->>Client: Return attribute or NotFound
```
```mermaid
sequenceDiagram
    participant GC as Garbage Collector
    participant Type as Type System
    participant MCACHE as MCACHE
    GC->>GC: collect_inner()
    GC->>Type: type_cache_clear()
    Type->>MCACHE: Set TYPE_CACHE_CLEARING flag
    MCACHE->>MCACHE: Invalidate entries and drop refs
    MCACHE-->>Type: Clearing complete
    Type-->>GC: continue collection
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
🚥 Pre-merge checks: ✅ 3 passed checks (3 passed)
Code has been automatically formatted. The code in this PR has been formatted using:

```shell
git pull origin mcache
```
Force-pushed from 2e39e80 to 3ccbe51.
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 2
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/builtins/type.rs (1)
316-341: ⚠️ Potential issue | 🔴 Critical: Prevent `tp_version_tag` overwrite races during concurrent assignment.

When multiple threads call `assign_version_tag()` concurrently and find `tp_version_tag == 0`, they all proceed past the initial check and allocate distinct version numbers from `NEXT_TYPE_VERSION`. The unconditional `store()` at line 338 then causes a race where the last thread's write overwrites earlier writes, leaving cache entries tied to the first-assigned version orphaned with a stale tag.

Use `compare_exchange()` on `self.tp_version_tag` to atomically serialize assignment: only the first thread stores its allocated version; subsequent threads return the existing assigned version.

Fix:
```diff
 if NEXT_TYPE_VERSION
     .compare_exchange_weak(current, next, Ordering::Relaxed, Ordering::Relaxed)
     .is_ok()
 {
-    self.tp_version_tag.store(current, Ordering::Release);
-    return current;
+    match self.tp_version_tag.compare_exchange(
+        0,
+        current,
+        Ordering::AcqRel,
+        Ordering::Acquire,
+    ) {
+        Ok(_) => return current,
+        Err(existing) if existing != 0 => return existing,
+        Err(_) => continue,
+    }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 316 - 341, assign_version_tag races because threads can each allocate a version from NEXT_TYPE_VERSION and then unconditionally overwrite self.tp_version_tag; change the final write to an atomic compare-and-exchange so only the first setter wins. After successfully advancing NEXT_TYPE_VERSION (the compare_exchange_weak that yields `current`), attempt self.tp_version_tag.compare_exchange(0, current, Ordering::Release, Ordering::Relaxed); if that CAS succeeds return current, otherwise load and return the existing tp_version_tag value (or the CAS-provided existing) so subsequent threads return the already-assigned version; keep the same Acquire/Release semantics for tp_version_tag and use Relaxed for failures where appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 710-714: The get_attr path uses find_name_in_mro and relies on an
MRO cache, but direct mutations via
self.attributes.write().insert/remove/swap_remove/shift_remove bypass
self.modified() and leave stale cache entries; add a single helper on the class
(e.g., a private method like modify_attribute / set_attribute_and_invalidate)
that performs attribute mutations (insert/remove/swap_remove/shift_remove) under
the write lock and calls self.modified() / invalidates the MRO/type cache, then
replace all direct calls to self.attributes.write().* across the file (including
places currently doing insert, remove, swap_remove, shift_remove) to use this
helper so all runtime mutations consistently update caches and avoid stale
get_attr results that rely on find_name_in_mro.
- Around line 64-73: Remove the decorative separator comment "// --- Type
attribute cache (MCACHE) ---" and replace it with a concise non-decorative
comment or doc-comment that briefly states the purpose (e.g., "Type attribute
cache (MCACHE): direct-mapped cache keyed by (tp_version_tag,
interned_name_ptr)"), keeping the explanatory SeqLock notes (Read/Write
behavior) as short `//` lines or a `///` doc-comment; ensure references to
"MCACHE", "type_cache", and the SeqLock behavior remain but without repeated
decorative punctuation.
```diff
 /// Internal get_attr implementation for fast lookup on a class.
 /// Searches the full MRO (including self) with type cache acceleration.
 pub fn get_attr(&self, attr_name: &'static PyStrInterned) -> Option<PyObjectRef> {
     flame_guard!(format!("class_get_attr({:?})", attr_name));

-    self.get_direct_attr(attr_name)
-        .or_else(|| self.get_super_attr(attr_name))
+    self.find_name_in_mro(attr_name)
 }
```
Cache invalidation is incomplete for direct attribute mutations.
Now that Line 713 and Line 827 use the cached MRO path, direct self.attributes.write().insert/remove call sites that do not call self.modified() can leave stale cache entries visible (e.g., special setters outside this hunk). Please centralize attribute mutation + invalidation behind one helper and route all runtime mutations through it.
Proposed direction:

```diff
+    fn mutate_attributes<R>(&self, f: impl FnOnce(&mut PyAttributes) -> R) -> R {
+        let mut attrs = self.attributes.write();
+        let out = f(&mut attrs);
+        drop(attrs);
+        self.modified();
+        out
+    }
+
     pub fn set_attr(&self, attr_name: &'static PyStrInterned, value: PyObjectRef) {
-        self.attributes.write().insert(attr_name, value);
-        self.modified();
+        self.mutate_attributes(|attrs| {
+            attrs.insert(attr_name, value);
+        });
     }
```

Then migrate other runtime `self.attributes.write().insert/swap_remove/shift_remove` paths to this helper.
Also applies to: 825-856
Actionable comments posted: 1
♻️ Duplicate comments (1)
crates/vm/src/builtins/type.rs (1)
710-714: ⚠️ Potential issue | 🟠 Major: Cache-backed class lookups still need centralized mutation invalidation.
Now that these paths always read through MCACHE, direct `self.attributes.write().insert/remove/swap_remove/shift_remove` sites that skip `self.modified()` can expose stale results. Please funnel all runtime attribute mutations through one helper that always invalidates after mutation.

Suggested direction:

```diff
+    fn mutate_attributes<R>(&self, f: impl FnOnce(&mut PyAttributes) -> R) -> R {
+        let mut attrs = self.attributes.write();
+        let out = f(&mut attrs);
+        drop(attrs);
+        self.modified();
+        out
+    }
```

Then replace direct runtime `self.attributes.write().*` mutators with this helper.

Also applies to: 825-828
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 710 - 714, The class attribute mutations currently call self.attributes.write().insert/remove/swap_remove/shift_remove directly (seen around get_attr/find_name_in_mro and also lines ~825-828), which can leave MCACHE stale; create a single helper method on the class (e.g., fn mutate_attribute_and_invalidate(&self, op: impl FnOnce(&mut AttrMap)) or two simple helpers like fn insert_attr(&self, key, value) and fn remove_attr(&self, key)) that performs the write, then calls self.modified() (or the existing invalidation routine) after the mutation, and replace all direct self.attributes.write().* sites with calls to this helper so every runtime attribute mutation always invalidates the type cache.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 723-726: The MCACHE SeqLock in find_name_in_mro has an
unsynchronized writer: the writer uses a Relaxed swap for the shared `value`
which can race with the reader's Acquire load and allow an ABA/use-after-free;
change the writer's `value.swap(...)` to use Ordering::Release so the reader's
Acquire load pairs with it, or alternatively replace the single `version` field
with a monotonic sequence counter (a dedicated seq counter) to avoid the
assigned→0→assigned ABA window—update the writer side where `value.swap` is
called and ensure the reader still uses Acquire and rechecks the version/seq
after cloning.
```rust
/// Uses a lock-free SeqLock pattern keyed by version:
/// Read: load version → check name → load value → clone → re-check version
/// Write: version=0 → swap value → set name → version=assigned
fn find_name_in_mro(&self, name: &'static PyStrInterned) -> Option<PyObjectRef> {
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify the current protocol shape (single version field reused, no independent seq counter)
rg -n "struct TypeCacheEntry|version: AtomicU32|fn find_name_in_mro|fn has_name_in_mro|entry.version.store\(0|entry.version.store\(assigned|let v2 = entry.version.load" crates/vm/src/builtins/type.rs -C3
```

Repository: RustPython/RustPython
Length of output: 4022

🏁 Script executed:

```shell
# Verify the memory ordering details and drop timing
rg -n "entry.value.load\(Ordering::" crates/vm/src/builtins/type.rs -A1 -B1
rg -n "entry.value.swap\(|drop\(PyObjectRef::from_raw\(" crates/vm/src/builtins/type.rs -B1 -A1
rg -n "fn set_in_cache" crates/vm/src/builtins/type.rs -A 40 | head -50
```

Repository: RustPython/RustPython
Length of output: 620

🏁 Script executed:

```shell
# Check the full write sequence and synchronization context
rg -n "fn set_in_cache" crates/vm/src/builtins/type.rs -A 35 | head -60
```

Repository: RustPython/RustPython
Length of output: 47

🏁 Script executed:

```shell
# Get the full set_in_cache function including line numbers
sed -n '768,802p' crates/vm/src/builtins/type.rs
```

Repository: RustPython/RustPython
Length of output: 1498
MCACHE read/write protocol has unsynchronized memory ordering and ABA issues that can cause use-after-free.
The reader loads value with Acquire (line 738), but the writer swaps in new value with Relaxed (line 786), which don't synchronize. The reader's Acquire load only orders with Release stores of the same atomic; it doesn't guarantee visibility of the Relaxed swap between the version invalidation and restoration.
Combined with the ABA window (the version cycles assigned → 0 → assigned if the type version is unchanged), a reader can:

- Load `v1 = assigned` before the writer invalidates
- Load the old pointer before seeing the new value (no ordering between the Acquire load and the Relaxed swap)
- Observe `v2 == v1` (same version restored)
- Clone and return the stale pointer that is immediately dropped (line 795)
Use Ordering::Release on the value.swap() call (line 786), or replace the single version field with a dedicated monotonic sequence counter to ensure safe publication and reclamation.
Actionable comments posted: 2
♻️ Duplicate comments (2)
crates/vm/src/builtins/type.rs (2)
704-707: ⚠️ Potential issue | 🟠 Major: Invalidation is still not centralized for all type-attribute mutations.

`set_attr()` is now correct, but other direct `attributes.write().insert/swap_remove/shift_remove` paths in this file can still bypass `modified()`, leaving stale MCACHE entries.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 704 - 707, Multiple places in this file mutate the type's attributes directly (e.g., attributes.write().insert, .swap_remove, .shift_remove) and thus bypass the cache invalidation done by modified(); centralize invalidation by introducing and using a single helper (or ensure every mutation calls modified()) instead of direct writes: replace all direct attribute mutations with calls to set_attr/explicitly call self.modified() immediately after any attributes.write().insert/swap_remove/shift_remove, and/or refactor those mutation sites to use a new private method (e.g., set_attr_internal or invalidate_and_mutate) so modified() is always invoked when attributes are changed.
785-790: ⚠️ Potential issue | 🔴 Critical: Previously reported SeqLock ABA/publication risk is still present.

The writer still uses `value.swap(..., Ordering::Relaxed)` while restoring the same version value, so the earlier ABA/UAF concern remains unresolved.

```shell
#!/bin/bash
# Verify MCACHE read/write ordering points in find_name_in_mro.
rg -n -C3 'entry\.version\.store\(0, Ordering::Release\)|entry\.value\.swap\(new_ptr, Ordering::Relaxed\)|entry\.version\.store\(assigned, Ordering::Release\)|let v2 = entry\.version\.load\(Ordering::Acquire\)' crates/vm/src/builtins/type.rs
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 785 - 790, The writer currently does entry.value.swap(new_ptr, Ordering::Relaxed) which can re-publish the identical pointer and reintroduce the SeqLock ABA/UAF risk; change the update to a CAS loop using entry.value.compare_exchange(old_ptr, new_ptr, Ordering::Release, Ordering::Relaxed) (retrying if it returns Err) so we only publish when we actually replace the pointer, keep entry.name.store(...) and entry.version.store(assigned, Ordering::Release) as-is, and apply this change around the same code paths referenced by find_name_in_mro / the writer that uses entry.value.swap to ensure the publication semantics and ordering are fixed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 771-774: The lookup currently only caches hits inside the block
that checks "if let Some(ref found) = result", so misses (result == None) are
not recorded; modify the lookup's cache update logic to record negative entries
when result is None by adding an explicit branch that inserts a
sentinel/negative marker into the same MCACHE/attribute cache using the same
cache key (instead of skipping it), and ensure invalidation paths that already
clear the positive cache (e.g., type mutation / MRO change handlers) also remove
or bump versioning for negative entries so they don't become stale; concretely,
replace the "if let Some(ref found) = result" pattern with a match or if/else
that writes either the found value or a negative-cache marker into the cache
(and use the existing cache APIs used for positive entries so keys and eviction
behave consistently).
- Around line 155-168: In type_cache_clear(), set TYPE_CACHE_CLEARING to true
before Phase 1 so concurrent lookups cannot repopulate the cache while you
invalidate entries: move the TYPE_CACHE_CLEARING.store(true, Ordering::Release)
to immediately before the loop that iterates TYPE_CACHE.iter(), perform the
version.store(0) and entry.take_value() collection while the flag is true, then
drop the collected values and finally store false; preserve the same atomic
ordering (Ordering::Release) for the stores so memory ordering remains
consistent.
```rust
// Only cache positive results. Negative results are not cached to
// avoid stale entries from transient MRO walk failures during
// concurrent type modifications.
if let Some(ref found) = result
```
Negative caching is not implemented in the current lookup path.
The code explicitly skips caching misses, so repeated not-found lookups still do full MRO walks and this diverges from the stated MCACHE behavior.
4096-entry direct-mapped cache keyed by (tp_version_tag, interned_name_ptr) using a lock-free SeqLock pattern for concurrent access.

- find_name_in_mro() checks the cache before the MRO walk
- has_name_in_mro() avoids cloning on a cache hit
- Auto-assigns version tags on cache miss
- Invalidation via modified() (tp_version_tag=0) with value drops
- TYPE_CACHE_CLEARING flag suppresses re-population during GC drops
- Clear method cache during GC collection to break reference cycles
4096-entry direct-mapped cache keyed by (tp_version_tag, interned_name_ptr), mirroring CPython's type_cache in Objects/typeobject.c.