Fix http_parser_parse_url to handle very long URLs #480
Previously, http_parser_parse_url would parse very long URLs incorrectly: the resulting off and len values would simply be truncated. Even though servers typically treat very long URLs as invalid, such URLs could still end up being passed to http_parser_parse_url, so it is better to handle this situation gracefully. Unfortunately, this change is not backwards ABI compatible because it changes the field types of the http_parser_url structure, hence the major version number bump.
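To illustrate the kind of truncation described here, the following is a minimal sketch, not the library's code. It assumes the pre-change off and len fields are 16 bits wide; `struct field16`, `record_field`, and `field_fits` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one field entry of http_parser_url,
 * assuming 16-bit off and len fields as in the pre-change layout. */
struct field16 {
  uint16_t off; /* offset of the field within the URL buffer */
  uint16_t len; /* length of the field */
};

/* Store an offset and length the way a parser with 16-bit fields would.
 * Converting size_t to uint16_t keeps only the low 16 bits, so values
 * above 65535 wrap around silently. */
struct field16 record_field(size_t off, size_t len) {
  struct field16 f;
  f.off = (uint16_t)off;
  f.len = (uint16_t)len;
  return f;
}

/* Return 1 if both values survive the 16-bit round trip, 0 otherwise. */
int field_fits(size_t off, size_t len) {
  struct field16 f = record_field(off, len);
  return f.off == off && f.len == len;
}
```

Under these assumptions, a field that really starts at offset 70000 is recorded at offset 4464 (70000 modulo 65536), so a caller slicing the buffer with the stored values reads a different part of the URL than the parser actually matched.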
I'd also note that failing to parse very long URLs properly might have a security impact. For example, security-critical code might use http_parser_parse_url to parse the path of a request and then examine that path to decide whether the request should be passed through some filter. Because of this issue the parsed path and the actual path could differ, which might result in a filter bypass. To be exploitable, however, the actual implementation would still need to accept the very long URL. If this PR is deemed too drastic a change, alternatively adding checks for

