The Wayback Machine - https://web.archive.org/web/20241215155321/https://github.com/python/cpython/issues/58992
pkgutil.walk_packages returns extra modules #58992

Open
cjerdonek opened this issue May 12, 2012 · 11 comments
Labels
docs (Documentation in the Doc dir), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

@cjerdonek
Member

BPO 14787
Nosy @ncoghlan, @ericvsmith, @merwok, @cjerdonek, @scorphus

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2012-05-12.09:01:10.723>
labels = ['type-bug', 'library', 'docs']
title = 'pkgutil.walk_packages returns extra modules'
updated_at = <Date 2020-01-29.00:16:41.772>
user = 'https://github.com/cjerdonek'

bugs.python.org fields:

activity = <Date 2020-01-29.00:16:41.772>
actor = 'brett.cannon'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation', 'Library (Lib)']
creation = <Date 2012-05-12.09:01:10.723>
creator = 'chris.jerdonek'
dependencies = []
files = []
hgrepos = []
issue_num = 14787
keywords = []
message_count = 11.0
messages = ['160464', '160469', '165094', '165537', '165605', '165612', '165618', '165627', '205021', '221986', '261589']
nosy_count = 10.0
nosy_names = ['ncoghlan', 'eric.smith', 'eric.araujo', 'Arfrever', 'chris.jerdonek', 'docs@python', 'gennad', 'faassen', 'scorphus', 'Andrey Nehaychik']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue14787'
versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

@cjerdonek
Member Author

pkgutil.walk_packages(paths) seems to return incorrect results when the name of a subpackage of a path in paths matches the name of a package in the standard library. It both excludes modules it should include, and includes modules it should exclude. Here is an example:

mkdir temp
touch temp/__init__.py
touch temp/foo.py
mkdir temp/logging
touch temp/logging/__init__.py
touch temp/logging/bar.py
python
Python 3.2.3 (default, Apr 29 2012, 01:19:06)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from pkgutil import walk_packages
>>> for info in walk_packages(['temp']):
...   print(info[1], info[0].path)
... 
foo temp
logging temp
logging.config /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/logging
logging.handlers /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/logging
>>> 

Observe that logging.bar is absent from the list, and logging.config and logging.handlers are included.
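
For anyone who wants to re-run the report on another interpreter, here is a minimal sketch that builds the same layout in a throwaway directory and prints whatever walk_packages yields. The helper names build_tree and walk_names are made up for this sketch, and the output is deliberately not shown because it depends on the Python version in use.

```python
import os
import tempfile
from pkgutil import walk_packages


def build_tree(root):
    """Create the layout from the report: foo.py plus a 'logging' package."""
    os.makedirs(os.path.join(root, "logging"))
    for rel in ("__init__.py", "foo.py",
                os.path.join("logging", "__init__.py"),
                os.path.join("logging", "bar.py")):
        open(os.path.join(root, rel), "w").close()


def walk_names(root):
    """Return the module names that walk_packages reports for one directory."""
    # info[1] is the module name, as in the interactive session above.
    return [info[1] for info in walk_packages([root])]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_tree(tmp)
        for name in walk_names(tmp):
            print(name)
```

Note that running this imports the standard library's logging package as a side effect, which is part of what the report describes.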

@cjerdonek added the stdlib and type-bug labels on May 12, 2012
@gennad
Mannequin

gennad mannequin commented May 12, 2012

I confirm this behavior in 2.7 and 3.2 versions. In my 3.3.0a3+ it actually outputs nothing.
Also note that if you rename logging to logging2, you actually get

foo temp
logging2 temp

@brettcannon
Member

So the lack of output in 3.3 is not surprising as walk_packages() won't work with the new import implementation as it relies on a non-standard method on loaders that import does not provide.

@cjerdonek
Member Author

For the record, this issue is still present after Nick's pkgutil changes documented in bpo-15343 (not that I expected it to be resolved since this issue is a bit different).

@ncoghlan
Contributor

Right, this is a separate bug in pkgutil. Specifically, when it goes to import a package in order to check it for submodules, it invokes the global import system via __import__() rather than constraining the import to the path argument supplied to walk_packages.

This means that it will only find the intended package if the path being walked is already on sys.path. In your example it isn't: temp is a subdirectory of a sys.path entry, not an entry itself.

The reason my new tests didn't pick this up is that they're built on the test_runpy infrastructure, and one of the steps in that infrastructure is to add the new package path to sys.path so it can be imported.

This isn't an easy one to fix - you basically need something along the lines of a PEP-406 style import engine API in order to do the import without having potentially adverse effects on the state in the sys module.
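
Until something like that exists, one possible workaround is to avoid the import step entirely. The sketch below is an assumption on my part, not a pkgutil API: walk_without_import is a hypothetical helper that recurses by joining directory names on top of pkgutil.iter_modules, so nothing is imported and sys.modules is never consulted. It only handles ordinary filesystem packages that contain an __init__.py and ignores any custom __path__ manipulation a package might do at import time.

```python
import os
import pkgutil


def walk_without_import(paths, prefix=""):
    """Yield (name, ispkg, directory) for modules found under `paths`.

    Hypothetical helper, not part of pkgutil. Subpackages are located by
    joining directory names rather than by importing the parent package.
    """
    for directory in paths:
        for _finder, name, ispkg in pkgutil.iter_modules([directory], prefix):
            yield name, ispkg, directory
            if ispkg:
                # `name` carries the dotted prefix; strip it to get the
                # on-disk directory name of the package.
                subdir = os.path.join(directory, name[len(prefix):])
                for item in walk_without_import([subdir], name + "."):
                    yield item
```

Called on the temp directory from the original report, this should report logging.bar and never touch the standard library's logging package, at the cost of not seeing zip imports or namespace packages.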

@ncoghlan
Contributor

At the very least, the pkgutil docs need to state clearly that walk_packages only works properly with sys.path entries, and the constraint feature may not descend into packages correctly if an entry is shadowed by a sys.modules entry or an entry earlier on sys.meta_path or sys.path.
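
As a rough illustration of that caveat, the following sketch lists the packages in a directory that the import system would resolve from somewhere else (or not at all). packages_resolved_elsewhere is a hypothetical diagnostic, not something pkgutil provides. It relies on importlib.util.find_spec, which for a top-level name consults sys.modules first and then sys.meta_path without importing the module.

```python
import importlib.util
import os
import pkgutil


def packages_resolved_elsewhere(directory):
    """Map package name -> origin the import system would actually use.

    Hypothetical diagnostic, not part of pkgutil. A value of None means
    the name cannot currently be resolved at all.
    """
    report = {}
    for _finder, name, ispkg in pkgutil.iter_modules([directory]):
        if not ispkg:
            continue  # walk_packages only imports packages
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        locations = list(spec.submodule_search_locations or []) if spec else []
        expected = os.path.realpath(os.path.join(directory, name))
        if not any(os.path.realpath(loc) == expected for loc in locations):
            report[name] = spec.origin if spec else None
    return report
```

A non-empty result suggests that walk_packages would descend into the wrong package, or fail to descend at all, for those names.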

@ncoghlan added the docs label on Jul 16, 2012
@ncoghlan
Contributor

I just realised this is going to behave strangely with namespace packages as well: the __import__ step will pick up *every* portion of the namespace package, not just those defined in the identified subset of sys.path.
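
A small sketch of what that looks like in practice, assuming Python 3.3 or later (PEP 420 namespace packages). Both helper names are hypothetical: namespace_portions temporarily puts several roots on sys.path and returns the __path__ of the imported namespace package, and portions_under filters that list down to one root, which is roughly the constraint walk_packages does not apply.

```python
import importlib
import os
import sys


def namespace_portions(roots, namespace):
    """Import `namespace` with `roots` on sys.path and return its __path__."""
    saved_path = list(sys.path)
    sys.path[:0] = roots
    importlib.invalidate_caches()
    try:
        module = importlib.import_module(namespace)
        # Copy the path while the roots are still on sys.path.
        return list(module.__path__)
    finally:
        # Undo the sys.path change and forget the imported namespace.
        sys.path[:] = saved_path
        sys.modules.pop(namespace, None)


def portions_under(portions, root):
    """Keep only the portions that live directly inside `root`."""
    root = os.path.realpath(root)
    return [p for p in portions
            if os.path.dirname(os.path.realpath(p)) == root]
```

With two roots that each contain a directory of the same name and no __init__.py, namespace_portions returns one entry per root, while portions_under narrows that back to the root the caller asked about.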

@cjerdonek
Member Author

> This isn't an easy one to fix - you basically need something along the lines of a PEP-406 style import engine API in order to do the import without having potentially adverse effects on the state in the sys module.

By adverse, do you just mean side effects? If so, since the documentation doesn't explicitly say so, is there any reason for the user to think there shouldn't be side effects? For example, I tried this in Python 2.7:

>>> import os, sys, pkgutil, unittest
>>> len(sys.modules)
86
>>> g = pkgutil.walk_packages([os.path.dirname(unittest.__file__)])
>>> len(sys.modules)
86
>>> for i in g:
...   pass
... 
>>> len(sys.modules)
95

Or maybe this isn't what you mean. If not, can you provide an example?
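
To see which modules a walk pulls in rather than just how many, the session above can be wrapped in a small helper. This is only a sketch: modules_imported_by_walk is a hypothetical name, and the no-op onerror callback is my choice so that packages that fail to import do not abort the walk.

```python
import pkgutil
import sys


def modules_imported_by_walk(paths):
    """Return (names yielded by the walk, modules newly added to sys.modules)."""
    before = set(sys.modules)
    names = [info[1]
             for info in pkgutil.walk_packages(paths, onerror=lambda name: None)]
    return names, sorted(set(sys.modules) - before)
```

The second element of the result lists the entries that the walk added to sys.modules, which makes the side effect visible by name.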

@faassen
Mannequin

faassen mannequin commented Dec 2, 2013

I just ran into this bug myself with namespace packages (in Python 2.7). When you have multiple packages (ns.a, ns.b) under a namespace package (ns), and constrain the paths in walk_packages so it should only pick up ns.a, it will pick up ns.b as well.

Any hope for a fix or workaround?

@BreamoreBoy
Mannequin

BreamoreBoy mannequin commented Jun 30, 2014

Note that this is referenced from bpo-15358.

@AndreyNehaychik
Mannequin

AndreyNehaychik mannequin commented Mar 11, 2016

Any hope to add the warning in pkgutil docs about this problem?

For example:
Warning: The walk_packages function uses the regular import system (sys.path and sys.modules) to import the nested packages it finds under the provided paths, and it descends into subpackages through those imported packages. If you pass a path that is not on sys.path as an argument, the result may be incorrect.
