bpo-11671: add header validation from http.client to wsgiref.headers.Headers#15299
bpo-11671: add header validation from http.client to wsgiref.headers.Headers#15299epicfaace wants to merge 8 commits intopython:mainfrom
Conversation
…, and validate only when inserting a header
Lib/wsgiref/headers.py
Outdated
| if not _is_legal_header_name(header.encode('ascii')): | ||
| raise ValueError('Invalid header name %r' % (header,)) | ||
| if _is_illegal_header_value(value.encode('ascii')): | ||
| raise ValueError('Invalid header value %r' % (value,)) |
There was a problem hiding this comment.
I would prefer to initialize self.headers to an empty list and then call "for header, value in headers: self.add_header(header, value)". Then remove self._convert_string_type(k) calls below.
There was a problem hiding this comment.
Added this. Why would we want to remove self._convert_string_type(k) calls though?
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
I have made the requested changes; please review again |
|
Thanks for making the requested changes! @vstinner: please review the changes made to this pull request. |
| ('ValidName', 'InvalidValue\r'), | ||
| ('ValidName', 'InvalidValue\n'), | ||
| (b'InvalidName', 'ValidValue', AssertionError), | ||
| ('ValidName', b'InvalidValue', AssertionError) |
There was a problem hiding this comment.
I would prefer to get a TypeError rather than an AssertionError.
| ('ValidName', b'InvalidValue', AssertionError) | ||
| ) | ||
|
|
||
| def test_add_header_validation(self): |
There was a problem hiding this comment.
I suggest to merge the two tests a single test, to factorize code.
I suggest to remove the b'InvalidName' and b'InvalidValue' from validation_cases. Maybe create a second list, maybe put it directly in the test method to avoid with self.assertRaises(exception[0]) if len(exception) else self.assertRaisesRegex(ValueError, 'Invalid header'): complexity.
| '\r\n' | ||
| ) | ||
|
|
||
| validation_cases = ( |
There was a problem hiding this comment.
Please add a comment mentioning bpo-11671.
Rather than an hardcoded list, why not building such list using a few invalid characters. For example for header names:
invalid_names = []
invalid_name_chars = ['\r', '\n', '\r\n', ':']
for invalid_chars in invalid_name_chars:
invalid_names.append(f'{invalid_chars}InvalidName')
invalid_names.append(f'Invalid{invalid_chars}Name')
invalid_names.append(f'InvalidName{invalid_chars}')
invalid_names.append(' InvalidName')
Are whitespaces (" ") and colon (":") only invalid at the beginning of the name?
| v = self._convert_string_type(v) | ||
| parts.append(_formatparam(k.replace('_', '-'), v)) | ||
| self._headers.append((self._convert_string_type(_name), "; ".join(parts))) | ||
| header = self._convert_string_type(_name) |
There was a problem hiding this comment.
Can you rename the variable to header_name (or maybe name, I didn't check if it's already used)?
| value = "; ".join(parts) | ||
| if not _is_legal_header_name(header.encode('ascii')): | ||
| raise ValueError('Invalid header name %r' % (header,)) | ||
| if _is_illegal_header_value(value.encode('ascii')): |
There was a problem hiding this comment.
http/client.py encodes the value to Latin1, not ASCII
| ('ValidName', 'InvalidValue\n'), | ||
| (b'InvalidName', 'ValidValue', AssertionError), | ||
| ('ValidName', b'InvalidValue', AssertionError) | ||
| ) |
There was a problem hiding this comment.
Are there already tests for header names which cannot be encoded to ASCII? And tests for values not encodable to Latin1?
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
@epicfaace, please address the code review comments. Thank you! |
|
@epicfaace ping |
|
Closing because superseded by #143917 |
Adds header validation to disallow invalid characters in header names and values. This also fixes https://bugs.python.org/issue28778
I based this off the initial patch in bpo-11671, with a few changes:
I know that http.client's regex (added in bpo-22928, a112a8a) is not RFC compliant, but it seemed better to at least make both libraries have the same behavior with regards to header validation.
https://bugs.python.org/issue11671