Reduce per-instruction overhead in interpreter loop#7315
youknowone merged 2 commits into RustPython:main
Conversation
📝 Walkthrough

Inlines inline-cache handling into the VM frame dispatcher, removing the separate `skip_caches_if_fallthrough` helper.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant VM as VM (dispatcher)
    participant Frame as Frame
    participant Op as Opcode
    participant Cache as InlineCache
    participant Trace as Tracer
    VM->>Frame: dispatch instruction
    Frame->>Op: resolve original/instrumented opcode
    Op-->>Frame: execute / may jump
    Frame->>Cache: op.cache_entries()
    alt no jump (fallthrough)
        Frame->>Frame: lasti += cache_count
    end
    alt tracing enabled
        Frame->>Trace: emit opcode trace (filtered)
        Trace-->>Frame: ack
    end
```
🚥 Pre-merge checks | ✅ 5 passed
Force-pushed from be78ddd to 236c1cb.
📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:
- [x] lib: cpython/Lib/doctest.py
  dependent tests: (33 tests)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/frame.rs (1)
1193-1193: ⚠️ Potential issue | 🟠 Major

Remove `check_signals` call from `execute_instruction` and add it only to backward jump handling.

The PR objective includes "Move check_signals from every instruction to JumpBackward only," but `vm.check_signals()?;` remains in `execute_instruction()` (line 1193), which is invoked for every instruction. This call defeats the optimization goal. To complete the PR objective, remove this call and add `check_signals` only to the `JumpBackward` and `JumpBackwardNoInterrupt` instruction handlers to reduce per-instruction overhead while maintaining signal handling in loops.
vm.check_signals()?;remains inexecute_instruction()(line 1193), which is invoked for every instruction. This call defeats the optimization goal. To complete the PR objective, remove this call and addcheck_signalsonly toJumpBackwardandJumpBackwardNoInterruptinstruction handlers to reduce per-instruction overhead while maintaining signal handling in loops.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` at line 1193, Remove the per-instruction vm.check_signals()? call from execute_instruction() so it no longer runs for every opcode; instead, insert vm.check_signals()? only inside the handlers for JumpBackward and JumpBackwardNoInterrupt (immediately before performing the backward jump/pc update and preserving the error propagation with ?). Update JumpBackward and JumpBackwardNoInterrupt handlers to call vm.check_signals()? at the point where you detect/perform a backward jump so loop interruptions are still handled, and ensure you remove the redundant call from execute_instruction to avoid per-instruction overhead.
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)
3419-3422: Consider extracting cache-skip logic into a helper (optional).

The cache-adjustment pattern appears in three places: the main loop (lines 786-789), `InstrumentedLine` (lines 3419-3422), and `InstrumentedInstruction` (lines 3454-3457). While the instrumented paths are cold and compute `orig_caches` differently, a small helper could reduce duplication:

```rust
#[inline]
fn skip_caches_if_fallthrough(&mut self, caches: u8, lasti_before: u32) {
    if caches > 0 && self.lasti() == lasti_before {
        self.update_lasti(|i| *i += caches as u32);
    }
}
```

This is optional given the cold path context and slight differences in how `caches` is computed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` around lines 3419 - 3422, Extract the duplicated cache-adjustment block into a small helper (e.g., fn skip_caches_if_fallthrough(&mut self, caches: u8, lasti_before: u32)) and replace the three occurrences (main loop, InstrumentedLine block using orig_caches from original_op, and InstrumentedInstruction) with calls to that helper; ensure the helper checks caches > 0 and self.lasti() == lasti_before and then calls self.update_lasti(|i| *i += caches as u32), keep it #[inline] if desired, and adjust call sites to pass the computed caches and lasti_before values (matching the types used where orig_caches is computed).
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_doctest/test_doctest.py is excluded by `!Lib/**`
📒 Files selected for processing (1)
crates/vm/src/frame.rs
- Defer prev_line update to when tracing is active
- Inline skip_caches_if_fallthrough, compute cache_entries once per instruction
The prev_line deferral fix corrects pdb line tracking, making these doctests pass.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/frame.rs`:
- Around line 755-763: The prev_line update logic in frame.rs currently skips
Instruction::Resume but not Instruction::InstrumentedResume, which allows
InstrumentedResume to pre-set prev_line and suppress the first real Line trace;
update the conditional inside the vm.use_tracing.get() branch (the match against
op) to also exclude Instruction::InstrumentedResume alongside Resume,
ExtendedArg, and InstrumentedLine so prev_line is not updated for instrumented
resume operations (look for the match that checks Instruction::Resume { .. } |
Instruction::ExtendedArg | Instruction::InstrumentedLine and add
Instruction::InstrumentedResume { .. } to that set).
The reviewed hunk (excerpted; the gap marks non-contiguous context):

```rust
// Update prev_line only when tracing or monitoring is active.
// When neither is enabled, prev_line is stale but unused.
if vm.use_tracing.get() {
    if !matches!(
        op,
        Instruction::Resume { .. }
            | Instruction::InstrumentedResume
            | Instruction::ExtendedArg
    )
    {
        vm.trace_event(crate::protocol::TraceEvent::Opcode, None)?;
// ...
            | Instruction::InstrumentedLine
    ) && let Some((loc, _)) = self.code.locations.get(idx)
```
Exclude InstrumentedResume when updating prev_line.
Line 758 excludes Resume but not InstrumentedResume. Under tracing + monitoring instrumentation, this can pre-set prev_line on resume and suppress the first real Line trace event.
🔧 Proposed fix

```diff
-// Update prev_line only when tracing or monitoring is active.
-// When neither is enabled, prev_line is stale but unused.
+// Update prev_line only when tracing is active.
+// Keep RESUME variants excluded so the first real instruction
+// can still trigger a line event.
 if vm.use_tracing.get() {
     if !matches!(
         op,
         Instruction::Resume { .. }
+            | Instruction::InstrumentedResume
             | Instruction::ExtendedArg
             | Instruction::InstrumentedLine
     ) && let Some((loc, _)) = self.code.locations.get(idx)
     {
```
Closes #5281.
Note: Moving check_signals to JumpBackward only was attempted but reverted — it broke signal delivery in non-loop code (test_signal, test_io, test_faulthandler).