X Tutup
Skip to content

Use borrowed pointers in TYPE_CACHE instead of strong references#7384

Merged
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:type-cache
Mar 8, 2026
Merged

Use borrowed pointers in TYPE_CACHE instead of strong references#7384
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:type-cache

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Mar 8, 2026

The method cache (TYPE_CACHE) was storing strong references (Arc clones) to cached attribute values, which inflated sys.getrefcount(). This caused intermittent test_memoryview failures where refcount assertions would fail depending on GC collection timing.

Store borrowed raw pointers instead. Safety is guaranteed because:

  • type_cache_clear() nullifies all entries during GC collection, before the collector breaks cycles
  • type_cache_clear_version() nullifies entries when a type is modified, before the source dict entry is removed
  • Readers use try_to_owned_from_ptr (safe_inc) to atomically validate and increment the refcount on cache hit

Summary by CodeRabbit

  • Refactor
    • Optimized type cache memory management in the virtual machine for improved efficiency.

The method cache (TYPE_CACHE) was storing strong references (Arc
clones) to cached attribute values, which inflated sys.getrefcount().
This caused intermittent test_memoryview failures where refcount
assertions would fail depending on GC collection timing.

Store borrowed raw pointers instead. Safety is guaranteed because:
- type_cache_clear() nullifies all entries during GC collection,
  before the collector breaks cycles
- type_cache_clear_version() nullifies entries when a type is
  modified, before the source dict entry is removed
- Readers use try_to_owned_from_ptr (safe_inc) to atomically
  validate and increment the refcount on cache hit
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

📝 Walkthrough

Walkthrough

The PR converts the type cache from storing owned PyObject references to borrowed pointers, reducing reference counting overhead. The take_value method is replaced with clear_value for cache invalidation. Cache insertion and clearing operations are updated to work with borrowed pointers while maintaining correctness with GC behavior.

Changes

Cohort / File(s) Summary
Type Cache Reference Semantics
crates/vm/src/builtins/type.rs
Changed TypeCacheEntry value field from owned PyObjectRef to borrowed pointer; replaced take_value() with clear_value() for cache invalidation; updated cache clearing and insertion paths to work with borrowed pointers; adjusted PyType.modified to nullify pointers instead of releasing strong references; added safety notes on GC behavior and cycle collection interactions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Add method cache for MRO lookups #7313: Directly related—this PR modifies the MCACHE structures and cache management functions originally introduced in that PR, changing the ownership semantics from owned to borrowed references.

Poem

🐰 Hop, hop! A cache so clever,
From owned refs to borrowed—light as a feather!
No counters to count, no drops to fear,
Just borrowed whispers in memory's ear. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately summarizes the main change: replacing strong references with borrowed pointers in the TYPE_CACHE, which is the core motivation and implementation of this pull request.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review March 8, 2026 12:58
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 84-89: Update the unsafe lifetime doc for the type cache to cover
both roots: state that entries referencing heap types are cleared by GC via
type_cache_clear() and type_cache_clear_version(), while static
(untracked/immortal) types are not collected and remain valid for the program
lifetime (refer to the static-type untracking logic). Mention both behaviors
explicitly so callers understand that cached pointers may be nulled during GC
for heap types but are permanently valid for static types.
- Around line 805-809: The type dict mutation pathway must always call the
cache-invalidation path before changing attributes; create and use a helper like
mutate_type_dict(&self, f: impl FnOnce(&mut PyAttributes)) that calls
self.modified() then invokes f on self.attributes.write(), and replace all
direct mutations (e.g., calls to self.attributes.write().insert, swap_remove,
remove) in the annotation/cache setters and the sites around the symbols
mentioned (lines handling __annotations_cache__, __annotate_func__, and the
blocks currently using insert/swap_remove) to use this helper so every
post-construction write funnels through the invalidation path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 35c55559-b67c-4e53-ad9e-c8a48f63fa18

📥 Commits

Reviewing files that changed from the base of the PR and between 46abff6 and b7b23e7.

📒 Files selected for processing (1)
  • crates/vm/src/builtins/type.rs

Comment on lines +84 to +89
/// The cache holds a **borrowed** pointer (no refcount increment).
/// Safety: `type_cache_clear()` nullifies all entries during GC,
/// and `type_cache_clear_version()` nullifies entries when a type
/// is modified — both before the source dict entry is removed.
/// Types are always part of reference cycles (via `mro` self-reference)
/// so they are always collected by the cyclic GC (never refcount-freed).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

The unsafe lifetime note needs the static-type case too.

Lines 689-696 explicitly untrack static types, so "Types are always collected by the cyclic GC" is not true for every cache owner here. Please document both supported roots explicitly: heap types are cleared by GC, while static types are immortal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/builtins/type.rs` around lines 84 - 89, Update the unsafe
lifetime doc for the type cache to cover both roots: state that entries
referencing heap types are cleared by GC via type_cache_clear() and
type_cache_clear_version(), while static (untracked/immortal) types are not
collected and remain valid for the program lifetime (refer to the static-type
untracking logic). Mention both behaviors explicitly so callers understand that
cached pointers may be nulled during GC for heap types but are permanently valid
for static types.

Comment on lines +805 to 809
// Invalidate caches BEFORE modifying attributes so that borrowed
// pointers in cache entries are nullified while the source objects
// are still alive.
self.modified();
self.attributes.write().insert(attr_name, value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Borrowed-cache safety is still bypassed by other type-dict mutations.

After this switch, ordinary keys like __annotations_cache__ and __annotate_func__ can be cached as borrowed pointers, but at minimum Lines 1379, 1406-1408, 1470-1472, and 1489-1525 still replace/remove those entries with raw self.attributes.write().insert/swap_remove() and no modified() call. A simple t.__annotations_cache__ → mutate annotations → t.__annotations_cache__ sequence can leave the cache pointing at freed storage. Please funnel every post-construction attributes mutation through the same invalidation path.

Suggested direction
fn mutate_type_dict(&self, f: impl FnOnce(&mut PyAttributes)) {
    self.modified();
    f(&mut self.attributes.write());
}

Use that helper for the annotation/cache setters and any other runtime self.attributes writes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/builtins/type.rs` around lines 805 - 809, The type dict
mutation pathway must always call the cache-invalidation path before changing
attributes; create and use a helper like mutate_type_dict(&self, f: impl
FnOnce(&mut PyAttributes)) that calls self.modified() then invokes f on
self.attributes.write(), and replace all direct mutations (e.g., calls to
self.attributes.write().insert, swap_remove, remove) in the annotation/cache
setters and the sites around the symbols mentioned (lines handling
__annotations_cache__, __annotate_func__, and the blocks currently using
insert/swap_remove) to use this helper so every post-construction write funnels
through the invalidation path.

@youknowone youknowone merged commit ce8952b into RustPython:main Mar 8, 2026
12 of 13 checks passed
@youknowone youknowone deleted the type-cache branch March 8, 2026 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

X Tutup