X Tutup
Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 34 additions & 0 deletions Lib/test/test_itertools.py
Original file line number Diff line number Diff line change
Expand Up @@ -754,6 +754,40 @@ def keys():
next(g)
next(g) # must pass with address sanitizer

def test_grouper_next_reentrant_eq_does_not_crash(self):
# regression test for gh-145678: _grouper_next() did not protect
# gbo->currkey / igo->tgtkey before calling PyObject_RichCompareBool,
# so a reentrant __eq__ that advanced the parent groupby could free
# those objects while they were still being compared (use-after-free).
Comment on lines +758 to +761
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference is enough.

Suggested change
# regression test for gh-145678: _grouper_next() did not protect
# gbo->currkey / igo->tgtkey before calling PyObject_RichCompareBool,
# so a reentrant __eq__ that advanced the parent groupby could free
# those objects while they were still being compared (use-after-free).
# regression test for gh-145678

outer_grouper = None

class Key:
def __init__(self, val, do_advance):
self.val = val
self.do_advance = do_advance

def __eq__(self, other):
if self.do_advance:
self.do_advance = False
# Advance the parent groupby iterator from inside __eq__,
# which calls groupby_step() and frees the old currkey.
try:
next(outer_grouper)
except StopIteration:
pass
Comment on lines +772 to +777
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try/except is not necessary; we don't get here with an exhausted iterator.

Lose the comment.

return NotImplemented
return self.val == other.val

def __hash__(self):
return hash(self.val)

values = [1, 1, 2]
keys_iter = iter([Key(1, True), Key(1, False), Key(2, False)])
g = itertools.groupby(values, lambda _: next(keys_iter))
outer_grouper = g
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just call it outer_grouper, or g. No need for two names.

k, grp = next(g)
list(grp) # must not crash with address sanitizer

def test_filter(self):
self.assertEqual(list(filter(isEven, range(6))), [0,2,4])
self.assertEqual(list(filter(None, [0,1,0,2,0])), [1,2])
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
Fix a use-after-free in :func:`itertools.groupby`: the internal
``_grouper_next()`` function did not hold strong references to
``gbo->currkey`` and ``igo->tgtkey`` before calling
:c:func:`PyObject_RichCompareBool`, so a re-entrant ``__eq__`` that advanced
the parent iterator (triggering ``groupby_step()`` and ``Py_XSETREF`` on
``currkey``) could free those objects while they were still being compared.
The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``.
Comment on lines +1 to +7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Fix a use-after-free in :func:`itertools.groupby`: the internal
``_grouper_next()`` function did not hold strong references to
``gbo->currkey`` and ``igo->tgtkey`` before calling
:c:func:`PyObject_RichCompareBool`, so a re-entrant ``__eq__`` that advanced
the parent iterator (triggering ``groupby_step()`` and ``Py_XSETREF`` on
``currkey``) could free those objects while they were still being compared.
The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``.
Fix a use-after-free crash in :func:`itertools.groupby` when a user-defined
``__eq__`` advanced the parent iterator while the iterator of groups was advanced.
The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``.

12 changes: 11 additions & 1 deletion Modules/itertoolsmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -678,7 +678,17 @@ _grouper_next(PyObject *op)
}

assert(gbo->currkey != NULL);
rcmp = PyObject_RichCompareBool(igo->tgtkey, gbo->currkey, Py_EQ);
/* A user-defined __eq__ can re-enter groupby (via the parent iterator)
and call groupby_step(), which frees gbo->currkey via Py_XSETREF while
we are still comparing it. Take local snapshots with strong references
so INCREF/DECREF apply to the same objects even under re-entrancy. */
PyObject *tgtkey = igo->tgtkey;
PyObject *currkey = gbo->currkey;
Py_INCREF(tgtkey);
Py_INCREF(currkey);
Comment on lines +681 to +688
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to record historical issues in the source. General reasons for holding refs while calling user code are clear -- the trick is finding cases where this was forgotten.

Suggested change
/* A user-defined __eq__ can re-enter groupby (via the parent iterator)
and call groupby_step(), which frees gbo->currkey via Py_XSETREF while
we are still comparing it. Take local snapshots with strong references
so INCREF/DECREF apply to the same objects even under re-entrancy. */
PyObject *tgtkey = igo->tgtkey;
PyObject *currkey = gbo->currkey;
Py_INCREF(tgtkey);
Py_INCREF(currkey);
PyObject *tgtkey = Py_NewRef(igo->tgtkey);
PyObject *currkey = Py_NewRef(gbo->currkey);

(cc @picnixz, reviewer of the earlier PR)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to record historical issues in the source

Victor sometimes asks me to do that, while Serhiy usually prefers not to, and I personally avoid but I did so in the past so I'm not that consistent either. I think I did add one such comment in OrderedDict so I don't mind.

+1 for Py_NewRef, it slipped through my review my bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, comments are good when it's not clear why the code does something. Here, it's relatively clear that we're calling user code. The exact way user code can clear references isn't that important -- and may change in the future.


Py_NewRef: no worries, that's a style nitpick.

rcmp = PyObject_RichCompareBool(tgtkey, currkey, Py_EQ);
Py_DECREF(tgtkey);
Py_DECREF(currkey);
if (rcmp <= 0)
/* got any error or current group is end */
return NULL;
Expand Down
Loading
X Tutup