X Tutup
The Wayback Machine - https://web.archive.org/web/20240907033744/https://github.com/python/cpython/issues/121794
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_Py_MergeZeroLocalRefcount should not set ob_tid to zero in fast-path dealloc #121794

Closed
colesbury opened this issue Jul 15, 2024 · 0 comments
Closed
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@colesbury
Copy link
Contributor

colesbury commented Jul 15, 2024

Bug report

The _Py_MergeZeroLocalRefcount function is called when the local refcount field reaches zero. We generally maintain the invariant 1 that ob_tid == 0 implies that the refcount fields are merged (i.e., ob_ref_shared flags are _Py_REF_MERGED) and vice versa.

The current implementation breaks the invariant by setting ob_tid to zero when the refcount fields aren't merged. Typically, this isn't a problem because:

  • Most commonly, the object is deallocated so the values do not matter
  • If the object is resurrected in subtype_dealloc (e.g., for a finalizer), we use Py_SET_REFCNT, which will mark the fields as merged, restoring the invariant

However, if resurrection is done slightly differently, such as by Py_INCREF(), then things can break in very strange ways:

  • The GC may restore ob_tid from the allocator (because it's not merged), but ob_ref_local is still zero. The next Py_DECREF then leads to a "negative refcount" error.

Summary

We should maintain the invariant that ob_tid == 0 <=> _Py_REF_IS_MERGED() and check it with assertions when possible.

cpython/Objects/object.c

Lines 373 to 398 in 8303d32

void
_Py_MergeZeroLocalRefcount(PyObject *op)
{
assert(op->ob_ref_local == 0);
_Py_atomic_store_uintptr_relaxed(&op->ob_tid, 0);
Py_ssize_t shared = _Py_atomic_load_ssize_acquire(&op->ob_ref_shared);
if (shared == 0) {
// Fast-path: shared refcount is zero (including flags)
_Py_Dealloc(op);
return;
}
// Slow-path: atomically set the flags (low two bits) to _Py_REF_MERGED.
Py_ssize_t new_shared;
do {
new_shared = (shared & ~_Py_REF_SHARED_FLAG_MASK) | _Py_REF_MERGED;
} while (!_Py_atomic_compare_exchange_ssize(&op->ob_ref_shared,
&shared, new_shared));
if (new_shared == _Py_REF_MERGED) {
// i.e., the shared refcount is zero (only the flags are set) so we
// deallocate the object.
_Py_Dealloc(op);
}
}

Originally reported by @albanD

Linked PRs

Footnotes

  1. There are a few places where we deliberately re-use ob_tid for other purposes, such as the trashcan mechanism and during GC, but these objects are not visible to other parts of the program.

@colesbury colesbury added type-bug An unexpected behavior, bug, or error 3.13 bugs and security fixes topic-free-threading 3.14 new features, bugs and security fixes labels Jul 15, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Jul 15, 2024
We should maintain the invariant that a zero `ob_tid` implies the
refcount fields are merged.

* Move the assignment in `_Py_MergeZeroLocalRefcount` to immediately
  before the refcount merge.
* Update `_PyTrash_thread_destroy_chain` to set `ob_ref_shared` to
  `_Py_REF_MERGED` when setting `ob_tid` to zero.

Also check this invariant with assertions in the GC in debug builds.
That uncovered a bug when running out of memory during GC.
colesbury added a commit that referenced this issue Jul 15, 2024
We should maintain the invariant that a zero `ob_tid` implies the
refcount fields are merged.

* Move the assignment in `_Py_MergeZeroLocalRefcount` to immediately
  before the refcount merge.
* Update `_PyTrash_thread_destroy_chain` to set `ob_ref_shared` to
  `_Py_REF_MERGED` when setting `ob_tid` to zero.

Also check this invariant with assertions in the GC in debug builds.
That uncovered a bug when running out of memory during GC.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 15, 2024
…honGH-121799)

We should maintain the invariant that a zero `ob_tid` implies the
refcount fields are merged.

* Move the assignment in `_Py_MergeZeroLocalRefcount` to immediately
  before the refcount merge.
* Update `_PyTrash_thread_destroy_chain` to set `ob_ref_shared` to
  `_Py_REF_MERGED` when setting `ob_tid` to zero.

Also check this invariant with assertions in the GC in debug builds.
That uncovered a bug when running out of memory during GC.
(cherry picked from commit d23be39)

Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury added a commit that referenced this issue Jul 15, 2024
…-121799) (#121821)

We should maintain the invariant that a zero `ob_tid` implies the
refcount fields are merged.

* Move the assignment in `_Py_MergeZeroLocalRefcount` to immediately
  before the refcount merge.
* Update `_PyTrash_thread_destroy_chain` to set `ob_ref_shared` to
  `_Py_REF_MERGED` when setting `ob_tid` to zero.

Also check this invariant with assertions in the GC in debug builds.
That uncovered a bug when running out of memory during GC.
(cherry picked from commit d23be39)

Co-authored-by: Sam Gross <colesbury@gmail.com>
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…hon#121799)

We should maintain the invariant that a zero `ob_tid` implies the
refcount fields are merged.

* Move the assignment in `_Py_MergeZeroLocalRefcount` to immediately
  before the refcount merge.
* Update `_PyTrash_thread_destroy_chain` to set `ob_ref_shared` to
  `_Py_REF_MERGED` when setting `ob_tid` to zero.

Also check this invariant with assertions in the GC in debug builds.
That uncovered a bug when running out of memory during GC.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

1 participant
X Tutup