gh-126807: pygettext: Do not attempt to extract messages from function definitions. #126808
serhiy-storchaka merged 8 commits into python:main
Fixes a bug where pygettext would attempt to extract a message from code like this: def _(x): pass. This happens because pygettext only looks at one token at a time, so '_(x)' looks like a function call. However, since 'x' is not a string literal, pygettext would erroneously issue a warning. This commit fixes that by keeping track of the previous token and checking whether it is 'def' or 'class'.
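To make the description above concrete, here is a small sketch using the standard `tokenize` module (not pygettext itself) that lists the tokens produced for the problematic line. The variable names are illustrative only.

```python
import io
import tokenize

source = "def _(x): pass\n"

# Collect (token type name, token string) pairs for names and operators only.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.NAME, tokenize.OP)
]

for type_name, string in tokens:
    print(type_name, repr(string))
```

Seen in isolation, the `_` and `(` tokens are indistinguishable from the start of a call such as `_('hello')`; only the preceding `def` token reveals that this is a definition.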
@@ -1,11 +1,10 @@
#! /usr/bin/env python3
# -*- coding: iso-8859-1 -*-
No other file uses this encoding, so I think it's safe (and more practical) to use utf-8.
This is not a related change, so please keep the coding cookie.
Got it! I'll revert :) Would you accept a separate (perhaps not backported) PR that removes the coding cookie and the commented-out code, or do you think it's not worth it?
I'll accept it if there are pygettext tests for files with non-UTF-8 encoding.
Fair enough, I'll add it to my todo list :)
Tools/i18n/pygettext.py
Outdated
if (
    ttype == tokenize.NAME and tstring in opts.keywords
    and (not self.__prev_token or not _is_def_or_class_keyword(self.__prev_token))
):
The new logic: we transition to __keywordseen only if we see one of the gettext keywords and the previous token is not def or class.
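A minimal standalone sketch of that previous-token check, written against the standard `tokenize` module rather than pygettext's internals. The function name `keyword_candidates` and its `keywords` parameter are hypothetical and exist only for illustration.

```python
import io
import tokenize


def keyword_candidates(source, keywords=("_",)):
    """Return line numbers where a gettext keyword appears outside a definition.

    Hypothetical helper for illustration; not part of pygettext.
    """
    found = []
    prev = None  # the token seen immediately before the current one
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        preceded_by_definition = (
            prev is not None
            and prev.type == tokenize.NAME
            and prev.string in ("def", "class")
        )
        if (
            tok.type == tokenize.NAME
            and tok.string in keywords
            and not preceded_by_definition
        ):
            found.append(tok.start[0])
        prev = tok
    return found
```

For example, `keyword_candidates("def _(x): pass\nprint(_('hello'))\n")` skips the definition on line 1 and reports only the call on line 2.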
serhiy-storchaka
left a comment
Note that no warnings are emitted if the --docstrings option is used. I think that we can use a similar approach. We can add
if ttype == tokenize.NAME and tstring in ('class', 'def'):
    self.__state = self.__ignorenext
    return
where __ignorenext simply sets self.__state = self.__waiting.
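The suggestion above can be sketched as a toy state machine. This is an assumption-laden illustration, not the real pygettext TokenEater: the class `MiniEater`, the function `scan`, and the single-underscore state names are all hypothetical.

```python
import io
import tokenize


class MiniEater:
    """Toy token-driven state machine; not the real pygettext implementation."""

    def __init__(self, keywords=("_",)):
        self.keywords = keywords
        self.seen = []  # line numbers where a keyword was seen
        self.state = self.waiting

    def __call__(self, ttype, tstring, lineno):
        self.state(ttype, tstring, lineno)

    def waiting(self, ttype, tstring, lineno):
        if ttype == tokenize.NAME and tstring in ("class", "def"):
            # The next token is the name being defined, so skip it.
            self.state = self.ignorenext
            return
        if ttype == tokenize.NAME and tstring in self.keywords:
            self.seen.append(lineno)

    def ignorenext(self, ttype, tstring, lineno):
        # Consume exactly one token, then resume normal scanning.
        self.state = self.waiting


def scan(source):
    """Feed every token of `source` to a MiniEater and return its hits."""
    eater = MiniEater()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        eater(tok.type, tok.string, tok.start[0])
    return eater.seen
```

Compared with tracking the previous token, this keeps the decision inside the state machine: the token after `def` or `class` is swallowed by a dedicated state, so the keyword check never sees it.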
Thanks @tomasr8 for the PR, and @serhiy-storchaka for merging it 🌮🎉. I'm working now to backport this PR to: 3.12, 3.13.
…unction definitions. (pythonGH-126808) Fixes a bug where pygettext would attempt to extract a message from a code like this: def _(x): pass This is because pygettext only looks at one token at a time and '_(x)' looks like a function call. However, since 'x' is not a string literal, it would erroneously issue a warning. (cherry picked from commit 9a45638) Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
GH-126846 is a backport of this pull request to the 3.13 branch.
GH-126847 is a backport of this pull request to the 3.12 branch.
…function definitions. (GH-126808) (GH-126847) Fixes a bug where pygettext would attempt to extract a message from a code like this: def _(x): pass This is because pygettext only looks at one token at a time and '_(x)' looks like a function call. However, since 'x' is not a string literal, it would erroneously issue a warning. (cherry picked from commit 9a45638) Co-authored-by: Tomas R <tomas.roun8@gmail.com>
…function definitions. (GH-126808) (GH-126846) Fixes a bug where pygettext would attempt to extract a message from a code like this: def _(x): pass This is because pygettext only looks at one token at a time and '_(x)' looks like a function call. However, since 'x' is not a string literal, it would erroneously issue a warning. (cherry picked from commit 9a45638) Co-authored-by: Tomas R <tomas.roun8@gmail.com>