X Tutup
The Wayback Machine - https://web.archive.org/web/20240930223320/https://github.com/python/cpython/issues/76823
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for path-like objects in sys.path #76823

Closed
cjerdonek opened this issue Jan 24, 2018 · 30 comments
Closed

add support for path-like objects in sys.path #76823

cjerdonek opened this issue Jan 24, 2018 · 30 comments
Labels
3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@cjerdonek
Copy link
Member

BPO 32642
Nosy @malemburg, @brettcannon, @jaraco, @ncoghlan, @merwok, @cjerdonek, @ericsnowcurrently, @serhiy-storchaka, @eryksun, @jayyyin, @noamcohen97
PRs
  • bpo-32642: adding compatibility for pathlike objects in sys.path #5691
  • bpo-32642: Allow for PathLike objects in sys.path #22000
  • bpo-32642: add support for path-like objects in sys.path #32118
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2018-01-24.05:27:12.840>
    labels = ['type-feature', 'library', '3.11']
    title = 'add support for path-like objects in sys.path'
    updated_at = <Date 2022-03-26.18:25:50.267>
    user = 'https://github.com/cjerdonek'

    bugs.python.org fields:

    activity = <Date 2022-03-26.18:25:50.267>
    actor = 'eryksun'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2018-01-24.05:27:12.840>
    creator = 'chris.jerdonek'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32642
    keywords = ['patch']
    message_count = 28.0
    messages = ['310560', '311693', '311864', '311895', '311986', '312070', '312072', '313283', '313337', '313388', '313460', '313883', '313885', '313887', '313923', '314065', '340950', '409732', '415012', '416037', '416042', '416048', '416050', '416052', '416053', '416057', '416065', '416078']
    nosy_count = 12.0
    nosy_names = ['lemburg', 'brett.cannon', 'jaraco', 'ncoghlan', 'eric.araujo', 'chris.jerdonek', 'python-dev', 'eric.snow', 'serhiy.storchaka', 'eryksun', 'jayyin11043', 'ncohen']
    pr_nums = ['5691', '22000', '32118']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue32642'
    versions = ['Python 3.11']

    @cjerdonek
    Copy link
    Member Author

    This issue is to suggest enhancing sys.path to recognize path-like objects, per PEP-519.

    I recently ran into an issue where a path was getting added to sys.path, but the corresponding imports weren't working as they should, even though sys.path showed the path as being present. Eventually I tracked this down to the path being a pathlib.Path object rather than a string. This wasn't obvious because Path objects appear as strings in normal debug output, etc.

    The sys.path docs currently say this:

    Only strings and bytes should be added to sys.path; all other data types are ignored during import.

    @cjerdonek cjerdonek added 3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jan 24, 2018
    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Feb 5, 2018

    This looks a lot like https://bugs.python.org/issue32446, I'd like to tackle this, if we are going through with it.

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Feb 9, 2018

    what file(s) is/are the sys.path code located in?

    @yan12125
    Copy link
    Mannequin

    yan12125 mannequin commented Feb 9, 2018

    what file(s) is/are the sys.path code located in?

    If I understand it correctly, sys.path is handled in importlib._bootstrap_external.PathFinder.find_spec(). I can patch PathFinder for handling path-like objects: https://github.com/yan12125/cpython/commit/e3fd473b54cbb368533e651fd896bbc813a43924

    Here's an example usage:

    # t.py
    import pathlib
    import sys
    
    sys.path.append(pathlib.Path('foo'))
    
    import s
    
    # foo/s.py
    print(123)

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Feb 11, 2018

    Thanks for the reply

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Feb 12, 2018

    ok, so I found the PathFinder class you referenced, just making sure, this issue pertains to changing "self.path"'s usage and declaration to be a path-like object instead of the hard coded 'sys', 'path' returns? or is that part of it already?, also since this uses a tuple to keep track of the parent path, is there functionality in the path-like objects to find parent paths?

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Feb 12, 2018

    class _NamespacePath:
    """Represents a namespace package's path. It uses the module name
    to find its parent module, and from there it looks up the parent's
    __path__. When this changes, the module's own path is recomputed,
    using path_finder. For top-level modules, the parent module's path
    is sys.path."""
    def __init__(self, name, path, path_finder):
    self._name = name
    self._path = path
    self._last_parent_path = tuple(self._get_parent_path())
    self._path_finder = path_finder
    def _find_parent_path_names(self):
    """Returns a tuple of (parent-module-name, parent-path-attr-name)"""
    parent, dot, me = self._name.rpartition('.')
    if dot == '':
    # This is a top-level module. sys.path contains the parent path.
    return 'sys', 'path'
    # Not a top-level module. parent-module.__path__ contains the
    # parent path.
    return parent, '__path__'

    those are the particular class and lines I'm referring to

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Mar 5, 2018

    I'm unsure how to regenerate the files that interact with the code for sys.path as travisCI states
    "
    Generated files not up to date
    M Python/importlib_external.h
    "
    since my code changes some of how the importing is handled

    @brettcannon
    Copy link
    Member

    'make regen-all' should do it (but so will just running 'make -s -j' like
    any old build of CPython).

    On Mon, Mar 5, 2018, 12:43 Jay Yin, <report@bugs.python.org> wrote:

    Jay Yin <jayyin11043@gmail.com> added the comment:

    I'm unsure how to regenerate the files that interact with the code for
    sys.path as travisCI states
    "
    Generated files not up to date
    M Python/importlib_external.h
    "
    since my code changes some of how the importing is handled

    ----------


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue32642\>


    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Mar 7, 2018

    I've been stuck on "test_poplib" for over 149661 sec now, that's about 41 hours... I don't think this is working correctly, although I'm unsure what test_poplib is doing that has been affected by what I've changed here.

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Mar 8, 2018

    The issue was resolved by updating my version of the rest of the package apparently and remaking the whole thing, must have been some outdated stuff on my end causing the issue

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Mar 15, 2018

    I'm having issues with my local changes for my PR, I'm unsure why my local machine takes a seemingly infinite amount of time on test_poplib, so again I think something to do with my local environment is causing issues again, any help would be appreciated...

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Mar 15, 2018

    https://pastebin.com/q4FKnPZH

    the trace for the test_poplib

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Mar 15, 2018

    it seems to me like the issue in my tests is that some SSL thing is failing?, anyone have any experience with this?

    @brettcannon
    Copy link
    Member

    Please open a separate issue for the test_poplib issue.

    @jayyyin
    Copy link
    Mannequin

    jayyyin mannequin commented Mar 18, 2018

    srry I opened another issue bpo-33099

    @brettcannon
    Copy link
    Member

    This seems fine to me and I can't think of any negatives. Can anyone think of why this might be a bad idea? (Messing with how sys.path is used is just so fundamental I'm being paranoid. :)

    @merwok
    Copy link
    Member

    merwok commented Jan 5, 2022

    I’m not an import expert but would have misgivings about having fancy types in sys.path too! It seems so fundamental, and used from C and Python, with a simple interface of a direct list (plus importers and finders etc), that I would understand it pure strings were required and not Paths (and not all path-likes).

    @merwok merwok added 3.11 only security fixes and removed 3.7 (EOL) end of life labels Jan 5, 2022
    @jaraco
    Copy link
    Member

    jaraco commented Mar 12, 2022

    I'm in support of adding Path support for sys.path, but I also agree with Eric, there are innumerable consumers of sys.path beyond importlib. and since pathlib.Path isn't a str, it would likely introduce incompatibility. On the other hand, users introducing Path objects to sys.path could be warned that although importlib supports Path objects, other consumers may not, and that support for it in importlib isn't endorsement of the use of those types and the consequences aren't necessarily supported.

    As an aside, it's too bad a Path object couldn't have been a str subclass (as it is for path, which would have made problems like this one much safer to solve.

    @serhiy-storchaka
    Copy link
    Member

    Are there any problems with converting a Path to string before adding it to sys.path? You do this one time, and any users of sys.path will not need to bother about conversion. It is better even for performance.

    Otherwise we will need to revise and update every code that uses sys.path, and many bugs will left in third-party code for years. For example, in unittest the code

            if not top_level_dir in sys.path:
                sys.path.insert(0, top_level_dir)

    should be replaced with more cumbersome

            for path in sys.path:
                if os.fsdecode(path) == top_level_dir:
                    break
            else:
                sys.path.insert(0, top_level_dir)

    In multiprocessing the code

        sys_path=sys.path.copy()
        try:
            i = sys_path.index('')
        except ValueError:
            pass
        else:
            sys_path[i] = process.ORIGINAL_DIR

    should be replaced with

        sys_path=sys.path.copy()
        for i, path in enumerate(sys_path):
            if os.fsdecode(path) == '':
                sys_path[i] = process.ORIGINAL_DIR
                break

    It is just two examples. I did not review the whole stdlib, and there should be more third-party code.

    @noamcohen97
    Copy link
    Mannequin

    noamcohen97 mannequin commented Mar 26, 2022

    You've got a point.
    But in my opinion path-like object usage should be intuitive and users should not worry about converting it into string in some of the places.

    What do you feel about sub-classing list for sys.path and converting path-like objects upon setting items / reading?

    @ncoghlan
    Copy link
    Contributor

    The import system is already complex, so I think the hesitation about allowing arbitrary Path-like objects is warranted. (For example: are importlib's caching semantics really valid for *arbitrary* path-like objects? An object can be path-like without being immutable)

    Coercion on input (as Noam suggests) would have a much lower risk of unwanted side effects, as any dynamic behaviour would be resolved at insertion time.

    @noamcohen97
    Copy link
    Mannequin

    noamcohen97 mannequin commented Mar 26, 2022

    So I've got in mind a PyListObject subclass with calls to PyOS_FSPath before insert, append, extend and __getitem__.

    But I feel like this is an overkill; also, extend function might be trickier to implement.
    Do any of you have better idea?

    @serhiy-storchaka
    Copy link
    Member

    I think you are trying to solve a wrong problem.

    This wasn't obvious because Path objects appear as strings in normal debug output, etc.

    How is it?

    >>> pathlib.Path('/usr/lib')
    PosixPath('/usr/lib')
    >>> [pathlib.Path('/usr/lib')]
    [PosixPath('/usr/lib')]

    I think the problem is with the debug output. Always use repr() for it, otherwise you will trick yourself.

    but the corresponding imports weren't working as they should

    It would be easier to catch if it was never working with non-string. I think we should deprecate any non-string elements in sys.path.

    As I shown above some code only works correctly with strings anyway.

    @serhiy-storchaka
    Copy link
    Member

    pkgutil just skips non-string elements in sys.path.

        for dir in search_path:
            if not isinstance(dir, str):
                continue

    @noamcohen97
    Copy link
    Mannequin

    noamcohen97 mannequin commented Mar 26, 2022

    that's why pre-insert conversion was suggested.
    path-like objects could still be added to sys.path, while converting to str upon insert, preserving sys.path's bytes and strings only policy.

    @malemburg
    Copy link
    Member

    On 26.03.2022 08:59, Nick Coghlan wrote:

    The import system is already complex, so I think the hesitation about allowing arbitrary Path-like objects is warranted. (For example: are importlib's caching semantics really valid for *arbitrary* path-like objects? An object can be path-like without being immutable)

    Coercion on input (as Noam suggests) would have a much lower risk of unwanted side effects, as any dynamic behaviour would be resolved at insertion time.

    This is not only about the import system. Lots of Python code out there
    manipulates sys.path or reads sys.path for various reasons and does not
    expect Path objects as list members, since only strings and bytes
    are allowed:

    https://docs.python.org/3/library/sys.html#sys.path

    Conversion to strings sounds like a good way to get the best out of
    both worlds.

    I'm just curious on how this would work. You'd like have to create a
    list subclass for use with sys.path which applies the conversion
    whenever a non-string member gets added. Or perhaps add helper methods
    to Path objects to safely add their value to sys.path.

    @eryksun
    Copy link
    Contributor

    eryksun commented Mar 26, 2022

    I've got in mind a PyListObject subclass with calls to PyOS_FSPath
    before insert, append, extend and __getitem__.

    The sys module doesn't prevent rebinding sys.path. Most code I think is careful to update sys.path. But surely some code replaces it with a new list instead of updating via insert(), append(), extend(), or updating a slice such as sys.path[:] = new_path. For example, ModifiedInterpreter.transfer_path() in Lib/idlelib/pyshell.py rebinds sys.path. It doesn't have to, but it does.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @zxrobinson
    Copy link

    Why use a subclass of PyListObject when you could use composition? Create a singleton SysPath object containing a list. Add pass through methods for all of the List behavior you want. Any methods that modify the list by adding an item can convert the item to an str. This might make sys.path much easier to understand and maintain. Letting people completely replace the whole thing would seem to make it very fragile.

    @merwok
    Copy link
    Member

    merwok commented Jun 28, 2022

    This change requires a discussion on python-dev, maybe even a PEP.

    Closing this ticket for now, please reopen if a discussion reaches positive agreement on this.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    9 participants
    X Tutup