http: decode username and password before encoding #31450
Conversation
|
I think this PR should also change the docs: https://nodejs.org/api/url.html#url_url_username. Technically this can be seen as a patch or semver-major change, and I'm uncertain between the two. |
Done, PTAL |
|
semver-patch in my opinion. |
|
lgtm |
| - version: REPLACEME | ||
| pr-url: https://github.com/nodejs/node/pull/31450 | ||
| description: When used with `http.request()`, this field will now be | ||
| percent-decoded. |
bnoordhuis
Jan 22, 2020
Member
The section on http.request() in http.md seems like a more logical place to me for this note.
| - version: REPLACEME | ||
| pr-url: https://github.com/nodejs/node/pull/31450 | ||
| description: When used with `http.request()`, this field will now be | ||
| percent-decoded. |
bnoordhuis
Jan 22, 2020
Member
The section on http.request() in http.md seems like a more logical place to me for this note.
| @@ -1284,7 +1284,8 @@ function urlToOptions(url) { | |||
| options.port = Number(url.port); | |||
| } | |||
| if (url.username || url.password) { | |||
| options.auth = `${url.username}:${url.password}`; | |||
| options.auth = | |||
| `${decodeURIComponent(url.username)}:${decodeURIComponent(url.password)}`; | |||
bnoordhuis
Jan 22, 2020
Member
This introduces a spec violation. From RFC 7617, section 2:
Furthermore, a user-id containing a colon character is invalid, as
the first colon in a user-pass string separates user-id and password
from one another; text after the first colon is part of the password.
User-ids containing colons cannot be encoded in user-pass strings.
With this change, I can now write something like:
https.get('https://not%3Agood:god@example.com')And the server will interpret that as user="not" and pass="good:god" instead of pass="god" (which is the password we still all use for our root accounts, right? Right!)
izonder
Jan 22, 2020
•
User-ids containing colons cannot be encoded in user-pass strings.
But what would be proper behavior in case of passing : in username? Encoding username - the wrong way for the computation Authorization header. Pre-validation with throwing an exception? Maybe...
addaleax
Jan 22, 2020
Author
Member
I don’t really want to introduce new exceptions here … maybe just leave it encoded?
The same question also pops up for e.g. %2F for / in the password.
bnoordhuis
Jan 23, 2020
Member
Yes. I think : and / are the only problematic ones; for @ it's "last at-sign wins."
It's probably okay to leave those encoded, that's no worse from how it was before.
|
I'm generally unconvinced we should do this; and if we do make any changes here, it should be done in a manner that is consistent with the WHATWG URL parser. For instance, given the input: URL {
href: 'https://not%3Agood:g%3aod@example.com/',
origin: 'https://example.com',
protocol: 'https:',
username: 'not%3Agood',
password: 'g%3aod',
host: 'example.com',
hostname: 'example.com',
port: '',
pathname: '/',
search: '',
searchParams: URLSearchParams {},
hash: ''
}(note that |
@jasnell I’m not 100 % sure what “consistent with the WHATWG URL parser” means here – the parser itself is not changed in any way. If you do pass in special characters as part of the original URL that are not > url = new URL('https://not"good:g^od@example.com')
URL {
href: 'https://not%22good:g%5Eod@example.com/',
origin: 'https://example.com',
protocol: 'https:',
username: 'not%22good',
password: 'g%5Eod',
host: 'example.com',
hostname: 'example.com',
port: '',
pathname: '/',
search: '',
searchParams: URLSearchParams {},
hash: ''
}
> urlToOptions(url)
{
protocol: 'https:',
hostname: 'example.com',
hash: '',
search: '',
pathname: '/',
path: '/',
href: 'https://not%22good:g%5Eod@example.com/',
auth: 'not"good:g^od'
} |
|
(needs a rebase) |
|
@addaleax, what I mean is that any decoding/encoding that happens with the username and password here should be identical to what is produced when parsing/serializing with the WHATWG URL parser. This is close, the URL standard does specify some specifics for username and password that are not covered by decodeURIComponent and encodeURIComponent. |
|
Any updates on the MR? |
| @@ -1265,6 +1265,10 @@ function domainToUnicode(domain) { | |||
| return _domainToUnicode(`${domain}`); | |||
| } | |||
|
|
|||
| function decodeAuth(str) { | |||
| return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F'); | |||
| @@ -1265,6 +1265,10 @@ function domainToUnicode(domain) { | |||
| return _domainToUnicode(`${domain}`); | |||
| } | |||
|
|
|||
| function decodeAuth(str) { | |||
| return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F'); | |||
szmarczak
Jul 16, 2020
Contributor
Is it possible for url.username to be undefined? Isn't it an empty string instead?
|
@addaleax ... I took another look at this and I want to remove any objection I have on this. The change looks good |
8ae28ff
to
2935f72
|
@addaleax: Is this still blocked? |
|
@ronag No, I don’t think so, although it’s not ready to land either (there’s a few valid comments above and this needs a rebase) |
| @@ -1265,6 +1265,10 @@ function domainToUnicode(domain) { | |||
| return _domainToUnicode(`${domain}`); | |||
| } | |||
|
|
|||
| function decodeAuth(str) { | |||
| return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F'); | |||
szmarczak
Jul 16, 2020
Contributor
const forwardSlashRegEx = /\//g;
+const colonRegEx = /:/g;| return decodeURIComponent(str).replace(':', '%3A').replace('/', '%2F'); | |
| return decodeURIComponent(str).replace(colonRegEx, '%3A').replace(forwardSlashRegEx, '%2F'); |
|
Friendly ping :) |
|
Conflicts |
|
I'm removing the blocked label as I suspect it might make collaborators less likely to engage.
|
|
Adding a request changes to avoid landing while there are still things to address. |
|
@addaleax Would you be able to address the remaining feedback? |
|
@addaleax friendly reminder |
|
Friendly ping :) |

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large amount of related projects for saving online and digital history.


@nodejs/url @nodejs/security I’m marking this as blocked, because I’d like somebody to take a look at this who is somewhat familiar with the security implications of this – there are no obvious downsides to me here, but I know this can be quite a sensitive area.
Fixes: #31439
/cc @izonder
Checklist
make -j4 test(UNIX), orvcbuild test(Windows) passes