unittest cannot load module whose name starts with Unicode #68451

sih4sing5hong5 · 2015-05-22T06:49:01Z

BPO	24263
Nosy	@vstinner, @rbtcollins, @ezio-melotti, @abadger, @bitdancer, @serhiy-storchaka, @mlouielu, @tonybaloney
PRs	bpo-24263: Fix unittest can not load unicode pattern test #1338; bpo-24263: Fix unittest to discover tests named with non-ascii characters #13149
Files	test_dir.tar.gz: failure example; VALID_MODULE_NAME.patch; VALID_MODULE_NAME2.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2015-05-22.06:49:00.729>
labels = ['easy', 'type-bug', 'library']
title = 'unittest cannot load module whose name starts with Unicode'
updated_at = <Date 2019-05-10.16:59:09.302>
user = 'https://bugs.python.org/sih4sing5hong5'

bugs.python.org fields:

activity = <Date 2019-05-10.16:59:09.302>
actor = 'a.badger'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2015-05-22.06:49:00.729>
creator = 'sih4sing5hong5'
dependencies = []
files = ['39779', '39789', '39794']
hgrepos = []
issue_num = 24263
keywords = ['patch', 'easy']
message_count = 17.0
messages = ['245662', '245663', '245667', '245668', '245675', '245677', '245692', '245718', '248928', '248930', '261822', '292519', '341572', '341703', '341952', '342015', '342078']
nosy_count = 9.0
nosy_names = ['vstinner', 'rbcollins', 'ezio.melotti', 'a.badger', 'r.david.murray', 'serhiy.storchaka', 'sih4sing5hong5', 'louielu', 'anthonypjshaw']
pr_nums = ['1338', '13149']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue24263'
versions = ['Python 3.5', 'Python 3.6']

sih4sing5hong5 · 2015-06-23T03:50:50Z

This happens because VALID_MODULE_NAME is r'[_a-z]\w*\.py$' in unittest/loader.py, so a file whose name starts with a non-ASCII character never matches.

I suggest using r'[^\\W\\d]\w*\.py$' instead.
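The difference between the two patterns can be checked directly. The following is a minimal sketch, not the code in unittest/loader.py: the re.IGNORECASE flag is an assumption, the proposed pattern is read with single backslashes (the doubled ones look like an escaping artifact), and the helper names old_matches and proposed_matches are hypothetical.

```python
import re

# Pattern quoted in the report. The IGNORECASE flag is an assumption.
OLD_PATTERN = re.compile(r'[_a-z]\w*\.py$', re.IGNORECASE)

# Proposed pattern, read with single backslashes (assumption):
# first character is any word character that is not a digit.
PROPOSED_PATTERN = re.compile(r'[^\W\d]\w*\.py$')


def old_matches(filename):
    """Hypothetical helper: does the current pattern accept the file name?"""
    return bool(OLD_PATTERN.match(filename))


def proposed_matches(filename):
    """Hypothetical helper: does the proposed pattern accept the file name?"""
    return bool(PROPOSED_PATTERN.match(filename))


for name in ('test_x.py', 'Test試驗.py', '試驗.py', '1abc.py'):
    print(name, old_matches(name), proposed_matches(name))
```

Only the name that starts with a non-ASCII character ('試驗.py') is treated differently by the two patterns; a non-ASCII character later in the name is already accepted by \w.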

rbtcollins · 2015-06-23T04:04:53Z

Are the module names valid in import statements?

It would help if you could attach a small tar/zip file with an example failure.

sih4sing5hong5 · 2015-06-23T04:46:53Z

There is an attached file for examples.

I ran
{{{
cd test_dir
python -m unittest -v
}}}

and got
"Ran 1 test in 0.000s"

sih4sing5hong5 · 2015-06-23T05:36:53Z

By the way, I ran with Python 3.4.0.

serhiy-storchaka · 2015-06-23T10:00:05Z

r'[^\\W\\d]\w*' doesn't match all valid Python identifiers. It would be more correct to write the check as:

root, ext = os.path.splitext(basename)
if not (ext == '.py' and root.isidentifier()):
    # valid Python identifiers only
    return None, False
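The check above can be tried outside the loader. This is a minimal sketch that wraps the same two lines in a standalone function; the name is_valid_module_file is hypothetical and is not part of unittest.

```python
import os


def is_valid_module_file(basename):
    """Hypothetical helper: True for '<identifier>.py' file names."""
    root, ext = os.path.splitext(basename)
    # str.isidentifier() follows the language definition of identifiers,
    # so it accepts names that start with non-ASCII letters.
    return ext == '.py' and root.isidentifier()


for name in ('試驗.py', 'test_x.py', '1abc.py', 'test-x.py', 'notes.txt'):
    print(name, is_valid_module_file(name))
```

Note that str.isidentifier() also returns True for keywords such as 'class', so a stricter check would additionally consult keyword.iskeyword().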

bitdancer · 2015-06-23T10:57:35Z

Yes, I bet that regex is left over from python2, where we didn't have isidentifier.

sih4sing5hong5 · 2015-06-23T14:27:43Z

Thank you.
I updated my patch in VALID_MODULE_NAME.patch.

sih4sing5hong5 · 2015-06-24T04:29:34Z

Updated the patch by adding `except AttributeError:`.

rbtcollins · 2015-08-21T00:11:44Z

Thank you very much for writing your patch in backwards compatible style - it will make backporting to unittest2 much easier.

rbtcollins · 2015-08-21T00:16:58Z

I'm torn on whether this needs a test or not. It would be hard to regress, but testing this properly really wants hypothesis with a valid-python-identifier-strategy.

I think on balance we do need one.

So - we need a test in test_discover that mocks the presence of a file with a name containing e.g. \u2603.

rbtcollins · 2016-03-15T19:27:27Z

sih4sing5hong5 - I think we do need a test in fact - it can be done using mocks, but right now I think the patch has a bug - it looks for isidentifier on $thing.py, but not on just $thing (which we need to do to handle packages, vs modules).
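A sketch of what such a check and test could look like follows. It is not the code from the attached patches or from unittest itself: the helper is_importable_name and the test class are hypothetical, and os.path.isdir is mocked so that no files need to exist on disk.

```python
import os
import unittest
from unittest import mock


def is_importable_name(path):
    """Hypothetical helper: could the last path component be imported?"""
    basename = os.path.basename(path)
    if os.path.isdir(path):
        # Package: the directory name itself must be an identifier.
        return basename.isidentifier()
    # Module: strip '.py' and check what is left.
    root, ext = os.path.splitext(basename)
    return ext == '.py' and root.isidentifier()


class ImportableNameTests(unittest.TestCase):
    def check(self, path, is_dir):
        # Pretend the path is (or is not) a directory.
        with mock.patch('os.path.isdir', return_value=is_dir):
            return is_importable_name(path)

    def test_module_starting_with_non_ascii(self):
        self.assertTrue(self.check('tests試驗/試驗.py', is_dir=False))

    def test_package_starting_with_non_ascii(self):
        self.assertTrue(self.check('試驗包', is_dir=True))

    def test_snowman_is_not_an_identifier(self):
        self.assertFalse(self.check('\u2603.py', is_dir=False))
        self.assertFalse(self.check('\u2603', is_dir=True))
```

Saved to a file, the tests can be run with `python -m unittest <file>`. The snowman case is included as a negative test because \u2603 is a symbol rather than a letter, so it cannot start an identifier.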

mlouielu · 2017-04-28T08:54:00Z

Add PR: #1338

rbcollins: I need help reviewing the patch. I think that neither $thing nor $thing.py can be used in Python (or as a UNIX directory name). As for \u2603 (☃), a file named ☃.py can be created, but ☃ is not a valid identifier in Python either.

tonybaloney · 2019-05-06T17:36:27Z

The original PR refers to a branch that no longer exists, but the behaviour documented still applies to master. There were some changes to the test loader, but none that fixed this issue.

abadger · 2019-05-07T10:35:54Z

I've opened a new PR at #13149 with the commit from #1338 and some additional changes to address the review comments given by serhiy.storchaka and rbcollins

tonybaloney · 2019-05-09T01:14:29Z

Thanks, I will wait for a review from Serhiy, rbcollins, or Ezio.

vstinner · 2019-05-10T00:04:52Z

What is the current error on test_dir.tar.gz? I'm not sure which problem this is trying to solve.

Why does PR 13149 use the str.isidentifier() method? Doesn't unittest allow arbitrary Unicode in filenames?

abadger · 2019-05-10T16:59:09Z

From the description, I think the bug is that filenames that *begin* with non-ascii are not searched for tests. Looking at the test_dir.tar.gz contents, this is the test case that I'd use:

Broken:

$ python3 -m unittest discover -vv -p '*.py'
test_走 (tests試驗.Test試驗.試驗) ... ok
test_走 (tests試驗.test試驗.試驗) ... ok

Ran 2 tests in 0.000s

OK

Corrected:
$ /srv/python/cpython/python -m unittest discover -vv -p '*.py'
test_走 (tests試驗.Test試驗.試驗) ... ok
test_走 (tests試驗.test試驗.試驗) ... ok
test_走 (tests試驗.試驗.試驗) ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.000s

OK

isidentifier() is used because filenames to be discovered must be importable and thus valid identifiers: https://docs.python.org/3/library/unittest.html#test-discovery
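The link between "importable" and "valid identifier" can be illustrated by asking the compiler whether an import statement for a given name parses. This is only a sketch under that assumption; the function importable_by_statement is hypothetical and nothing is actually imported.

```python
def importable_by_statement(name):
    """Hypothetical helper: does 'import <name>' compile as valid syntax?"""
    try:
        # compile() only parses the statement; the module is never loaded.
        compile('import ' + name, '<check>', 'exec')
    except SyntaxError:
        return False
    return True


for name in ('試驗', 'test_x', '1abc', 'test-x', '\u2603'):
    print(name, importable_by_statement(name), name.isidentifier())
```

For these names the two columns agree, which is why checking str.isidentifier() on the file name is a reasonable stand-in for "can be imported". Keywords such as 'class' are one case where the two differ.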

sih4sing5hong5 mannequin added the topic-unicode and type-bug labels May 22, 2015

ned-deily removed the topic-unicode label May 22, 2015

sih4sing5hong5 mannequin changed the title ~~Why VALID_MODULE_NAME in unittest/loader.py is r'[_a-z]\w*\.py$' not r'\w+\.py$' ?~~ unittest cannot load module whose name starts with Unicode Jun 23, 2015

ezio-melotti added the easy and stdlib labels Jan 1, 2016

ezio-melotti transferred this issue from another repository Apr 10, 2022

ezio-melotti mentioned this issue May 6, 2022

bpo-24263: Fix unittest to discover tests named with non-ascii characters #13149

Open
