bpo-25324: copy tok_name before changing it #1608
@albertjan, thanks for your PR! By analyzing the history of the files in this pull request, we identified @serhiy-storchaka, @1st1 and @tpn to be potential reviewers.
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. Thanks again for your contribution and we look forward to looking at it!
Lib/test/test_tokenize.py
Outdated
| @@ -1,7 +1,7 @@ | |||
| from test import support | |||
| from tokenize import (tokenize, _tokenize, untokenize, NUMBER, NAME, OP, | |||
| STRING, ENDMARKER, ENCODING, tok_name, detect_encoding, | |||
| open as tokenize_open, Untokenizer) | |||
| open as tokenize_open, Untokenizer, tok_name as tokenize_tok_name) | |||
Too long line.
I'm not sure that it's the right way to fix the issue: see the discussion at http://bugs.python.org/issue25324. I also have comments on the change itself, but I will wait until we agree on the way to fix the issue before reviewing the change.
Due to a new release of Sphinx, we had to fix the documentation to build on Travis again. Please do a merge to get these changes to help get Travis passing on your PR.
Lib/token.py
Outdated
| @@ -66,8 +66,11 @@ | |||
| OP = 53 | |||
| AWAIT = 54 | |||
| ASYNC = 55 | |||
| ERRORTOKEN = 56 | |||
| N_TOKENS = 57 | |||
| COMMENT = 56 | |||
Please don't change ERRORTOKEN value.
Please copy token.h comment here.
Include/token.h
Outdated
| @@ -66,8 +66,12 @@ extern "C" { | |||
| #define OP 53 | |||
| #define AWAIT 54 | |||
| #define ASYNC 55 | |||
| #define ERRORTOKEN 56 | |||
| #define N_TOKENS 57 | |||
| /* These aren't used by the c tokenizer but are needed for tokenize.py */ | |||
Nitpick: replace c with C.
Updated the PR. Comments are now copied from token.h to token.py automatically, and I moved ERRORTOKEN back to where it was.
LGTM, but I added a new (last, I hope) series of nitpicking comments :-p
Lib/test/test_tokenize.py
Outdated
| @@ -1417,7 +1417,6 @@ def test_pathological_trailing_whitespace(self): | |||
| # See http://bugs.python.org/issue16152 | |||
| self.assertExactTypeEqual('@ ', token.AT) | |||
|
|
|||
|
|
|||
This change is not PEP 8 compliant :-)
Lib/token.py
Outdated
| @@ -104,13 +110,23 @@ def _main(): | |||
| prog = re.compile( | |||
| "#define[ \t][ \t]*([A-Z0-9][A-Z0-9_]*)[ \t][ \t]*([0-9][0-9]*)", | |||
| re.IGNORECASE) | |||
| comment = re.compile( | |||
| "^\s*/\*\s*(.+)\s*\*/\s*$", | |||
This string emits:
<stdin>:1: DeprecationWarning: invalid escape sequence \s
I suggest to always use raw strings for regular expressions.
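To illustrate the suggestion, here is a minimal sketch of the comment pattern written as a raw string. The name `comment_re` is hypothetical, and the lazy `(.+?)` group is a small variation on the pattern in the diff (so trailing spaces before `*/` are not captured); it is not necessarily what the PR ended up with.

```python
import re

# Raw string: backslashes such as \s and \* reach the regex engine untouched,
# so Python never tries to interpret them as string escape sequences.
comment_re = re.compile(r"^\s*/\*\s*(.+?)\s*\*/\s*$")

line = "/* These aren't used by the C tokenizer but are needed for tokenize.py */"
match = comment_re.match(line)
text = match.group(1) if match else None
print(text)
```

With a non-raw string, `"\s"` only works because Python leaves unknown escapes in place, which is what triggers the warning quoted above.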
Lib/token.py
Outdated
| val = int(val) | ||
| tokens[val] = name # reverse so we can sort them... | ||
| prev_val = int(val) | ||
| tokens[prev_val] = {'token': name} # reverse so we can sort them... |
I would prefer to use val here, but set prev_val after the tokens assignment. Here we use the current value, not the previous value.
Lib/token.py
Outdated
| tokens[prev_val] = {'token': name} # reverse so we can sort them... | ||
| else: | ||
| comment_match = comment.match(line) | ||
| if comment_match and prev_val: |
"prev_val is not None" to support prev_val == 0 (ENDMARKER = 0).
Lib/token.py
Outdated
| else: | ||
| comment_match = comment.match(line) | ||
| if comment_match and prev_val: | ||
| val = comment_match.group(1) |
Nitpick: I suggest renaming the variable to "comment".
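Taken together, the review comments on this loop (use val for the current token, set prev_val afterwards, test `prev_val is not None`, and name the captured text "comment") could look roughly like the sketch below. This is one reading of the diff fragments, not the merged code: `parse_header`, `define_re`, `comment_re` and the `'comment'` key are hypothetical names, and attaching a comment to the preceding token is an assumption.

```python
import re

define_re = re.compile(
    r"#define[ \t]+([A-Z0-9][A-Z0-9_]*)[ \t]+([0-9]+)", re.IGNORECASE)
comment_re = re.compile(r"^\s*/\*\s*(.+?)\s*\*/\s*$")


def parse_header(lines):
    """Map token values to their names, plus any comment that follows them."""
    tokens = {}
    prev_val = None  # value of the most recent #define, if any
    for line in lines:
        define_match = define_re.match(line)
        if define_match:
            name, val = define_match.group(1, 2)
            val = int(val)
            tokens[val] = {'token': name}  # keyed by value so we can sort
            prev_val = val                 # only now does it become "previous"
        else:
            comment_match = comment_re.match(line)
            # "is not None" rather than truthiness: ENDMARKER has value 0.
            if comment_match and prev_val is not None:
                comment = comment_match.group(1)
                tokens[prev_val]['comment'] = comment
    return tokens


# Made-up header lines, only to exercise the prev_val == 0 case.
header = ["#define ENDMARKER 0", "/* hypothetical note */", "#define NAME 1"]
tokens = parse_header(header)
print(tokens)
```

With a plain `if comment_match and prev_val:` check, the comment after `ENDMARKER` in this example would be dropped silently, because `0` is falsy.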
Lib/token.py
Outdated
| @@ -128,7 +144,9 @@ def _main(): | |||
| sys.exit(3) | |||
| lines = [] | |||
| for val in keys: | |||
nitpick: I suggest to rename "val" to "key" to be more consistent.
This should address your comments. Thanks for taking the time to review my PR.
LGTM. I will merge the change once tests pass.
Please wait with merging. I'm finishing my patch for generating
Wait? Do you expect conflicts?
Yes, conflicts, and maybe this will lead to redesigning both patches.
Misc/NEWS
Outdated
| @@ -10,6 +10,10 @@ What's New in Python 3.7.0 alpha 1? | |||
| Core and Builtins | |||
| ----------------- | |||
|
|
|||
| - bpo-25324: Tokens needed for parsing in python moved to C. ``COMMENT``, | |||
| ``NL`` AND ``ENCODING``. This way the tokens and tok_names in token.py | |||
and: lower case
in token.py: in the token module
import tokenize.py: import the tokenize module.
Should I do a merge or a rebase to resolve the conflicts?
As you want.
f02a84e to e7113fa
Misc/NEWS
Outdated
| @@ -10,6 +10,10 @@ What's New in Python 3.7.0 alpha 1? | |||
| Core and Builtins | |||
| ----------------- | |||
|
|
|||
| - bpo-25324: Tokens needed for parsing in python moved to C. ``COMMENT``, | |||
Python: title case.
Doc/library/token.rst
Outdated
| N_TOKENS | ||
| NT_OFFSET | ||
|
|
||
| .. versionchanged:: 3.5 | ||
| Added :data:`AWAIT` and :data:`ASYNC` tokens. Starting with | ||
| Python 3.7, "async" and "await" will be tokenized as :data:`NAME` | ||
| tokens, and :data:`AWAIT` and :data:`ASYNC` will be removed. | ||
|
|
||
| .. versionchanged:: 3.7 | ||
| Added :data:`COMMENT`, :data:`NL` and :data:`ENCODING`. To bring |
Isn't a period redundant here?
Doc/library/token.rst
Outdated
| .. versionchanged:: 3.7 | ||
| Added :data:`COMMENT`, :data:`NL` and :data:`ENCODING`. To bring | ||
| the tokens in the C code in line with the tokens needed in | ||
| tokenize.py. These tokens aren't used by the C tokenizer. |
tokenize.py -> the :mod:`tokenize` module
LGTM. Thanks @albertjan!
Nevermind @albertjan, looks like the issue I described here was caused by a merge issue. Thanks anyway :)


Saw this open bug report, and since I was looking at tokenize.py anyway, I figured I'd address it.
This may catch people off guard, though, because they may be relying on tok_name containing ENCODING, COMMENT and NL. 🤷‍♂️
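For context, this is a small sketch of the kind of lookup that existing code might be doing. It assumes a Python where `tokenize.tok_name` has entries for these three token types; the sample source is made up for illustration.

```python
import io
import tokenize

# A comment line followed by a statement: enough to produce
# ENCODING, COMMENT and NL tokens alongside the ordinary ones.
source = b"# note\nx = 1\n"

names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.tokenize(io.BytesIO(source).readline)
]
print(names)
```

Code written like this raises `KeyError` if the mapping it consults lacks the tokenize-only entries, which is the compatibility concern here.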