-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-122179: Fix hashlib.file_digest and non-blocking I/O #122183
base: main
Are you sure you want to change the base?
Conversation
|
While this is technically an API change, I think this should be backported to releases that are still in bugfix mode, as the current behavior can cause hard to find bugs instead of failing fast. |
| @@ -231,6 +231,8 @@ def file_digest(fileobj, digest, /, *, _bufsize=2**18): | |||
| view = memoryview(buf) | |||
| while True: | |||
| size = fileobj.readinto(buf) | |||
| if size is None: | |||
| raise BlockingIOError("I/O operation would block.") | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use IOError instead of BlockingIOError
BlockingIOError means
Raised when an operation would block on an object (e.g. socket) set for non-blocking operation. Corresponds to errno EAGAIN, EALREADY, EWOULDBLOCK and EINPROGRESS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IOError is the same as OSError since Python 3.3, and I don't think we use those aliases now (EDIT: we don't use it in the codebase). For me, BlockingIOError is correct in the sense that None is being returned by readinto so it's as if we were simply re-raising the exception (for buffered IO, they may raise BlockingIOError; for raw IO, they may return None to indicate the same thing).
| def readable(self): | ||
| return True | ||
|
|
||
| with self.assertRaises(BlockingIOError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is good for checking that the non-blocking error is raised, but doesn't validate that the hash changes in this case. Is it possible to add a test that the hash changes from before this change to after?

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large amount of related projects for saving online and digital history.

hashlib.file_digest()can't handle non-blocking I/O #122179