gh-67022: Document bytes/str inconsistency in email.header.decode_header() and add .decode_header_to_string() as a sane alternative #92900


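The possible return types of `email.header.decode_header()` have been surprising and error-prone
for the entirety of its Python 3.x history. It can return either:

- a single-element list `[(str, None)]`, when the header contains no encoded words, or
- a list of `(bytes, charset)` pairs, when the header contains at least one encoded word (`charset` is `None` for the unencoded parts).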
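
The two return shapes can be reproduced with the existing standard-library function. This is a minimal illustration; the sample header values are made up for the demonstration.

```python
from email.header import decode_header

# A header with no RFC 2047 encoded words comes back as a str.
plain_result = decode_header("Hello World")
print(plain_result)    # [('Hello World', None)]

# A header containing an encoded word comes back as bytes plus a charset name.
encoded_result = decode_header("=?iso-8859-1?q?p=F6stal?=")
print(encoded_result)  # [(b'p\xf6stal', 'iso-8859-1')]

# Callers therefore have to handle both str and bytes in the first tuple slot.
print(type(plain_result[0][0]).__name__, type(encoded_result[0][0]).__name__)
```
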
This function can't be rewritten to be more consistent in a backwards-compatible way, because some users of this function depend on the existing return type(s).
This PR addresses the inconsistency as suggested by @JelleZijlstra in #67022 (comment):
The "sane", Pythonic way to handle the decoding of an email/MIME message header value is simply to convert the whole header to a `str`; the details of exactly which parts of that header were encoded in which charsets are not relevant to users. Fortunately, the `email.header` module already contains a mechanism to do this, via the `__str__` method of `email.header.Header`, so we can simply create a wrapper function to guide users in the right direction.

Example of the old/inconsistent (`decode_header`) vs. new/sane (`decode_header_to_string`) functions:

(Closes #30548 and replaces it.)
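
For readers who want to compare the two approaches without this patch applied, here is a rough sketch. It assumes that `decode_header_to_string` behaves like `str(make_header(decode_header(...)))`; the local function below is a stand-in written for illustration and is not necessarily the exact implementation in this PR.

```python
from email.header import decode_header, make_header


def decode_header_to_string(header):
    """Hypothetical stand-in for the proposed wrapper: always returns a str.

    Assumption: the wrapper relies on Header.__str__, reached here through
    the existing make_header() helper.
    """
    return str(make_header(decode_header(header)))


plain = "Hello World"
encoded = "=?iso-8859-1?q?p=F6stal?="

# Old: the element type depends on whether the header had encoded words.
print(decode_header(plain))    # [('Hello World', None)]
print(decode_header(encoded))  # [(b'p\xf6stal', 'iso-8859-1')]

# New: a str in both cases.
print(decode_header_to_string(plain))    # Hello World
print(decode_header_to_string(encoded))  # pöstal
```

The point of the wrapper is that callers never need to branch on `bytes` versus `str` or decode the individual parts themselves.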