X Tutup
The Wayback Machine - https://web.archive.org/web/20240819090553/https://github.com/python/cpython/issues/73991
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support moving across filesystems in pathlib.Path, as shutil.move() does #73991

Open
LaurentMazuel mannequin opened this issue Mar 13, 2017 · 65 comments
Open

Support moving across filesystems in pathlib.Path, as shutil.move() does #73991

LaurentMazuel mannequin opened this issue Mar 13, 2017 · 65 comments
Labels
3.14 new features, bugs and security fixes stdlib Python modules in the Lib dir topic-pathlib type-feature A feature request or enhancement

Comments

@LaurentMazuel
Copy link
Mannequin

LaurentMazuel mannequin commented Mar 13, 2017

BPO 29805
Nosy @brettcannon, @pfmoore, @ericvsmith, @tjguk, @zware, @eryksun, @zooba

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2017-03-13.21:03:44.694>
labels = ['3.8', 'type-feature', 'library', '3.9', '3.10']
title = 'Support moving across filesystems in pathlib.Path, as shutil.move() does'
updated_at = <Date 2021-03-15.21:26:30.922>
user = 'https://bugs.python.org/LaurentMazuel'

bugs.python.org fields:

activity = <Date 2021-03-15.21:26:30.922>
actor = 'eryksun'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2017-03-13.21:03:44.694>
creator = 'Laurent.Mazuel'
dependencies = []
files = []
hgrepos = []
issue_num = 29805
keywords = []
message_count = 6.0
messages = ['289549', '289552', '289559', '289630', '289687', '289688']
nosy_count = 8.0
nosy_names = ['brett.cannon', 'paul.moore', 'eric.smith', 'tim.golden', 'Laurent.Mazuel', 'zach.ware', 'eryksun', 'steve.dower']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue29805'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

Linked PRs

@LaurentMazuel
Copy link
Mannequin Author

LaurentMazuel mannequin commented Mar 13, 2017

Trying to use Pathlib and Path.replace on Windows if drive are different leads to an issue:

  File "D:\\myscript.py", line 184, in update
    client_generated_path.replace(destination_folder)
  File "c:\\program files (x86)\\python35-32\\Lib\\pathlib.py", line 1273, in replace
    self.\_accessor.replace(self, target)
  File "c:\\program files (x86)\\python35-32\\Lib\\pathlib.py", line 377, in wrapped
    return strfunc(str(pathobjA), str(pathobjB), \*args)
OSError: [WinError 17] The system cannot move the file to a different disk drive: 'C:\\\\MyFolder' -\> 'D:\\\\MyFolderNewName'

This is a known situation of os.rename, and workaround I found is to use shutil or to copy/delete manually in two steps (e.g. http://stackoverflow.com/questions/21116510/python-oserror-winerror-17-the-system-cannot-move-the-file-to-a-different-d)

When using Pathlib, it's not that easy to workaround using shutil (even if thanks to Brett Cannon now shutil accepts Path in Py3.6, not everybody has Py3.6). At least this should be documented with a recommendation for that situation. I love Pathlib and it's too bad my code becomes complicated when it was so simple :(

@brettcannon brettcannon added stdlib Python modules in the Lib dir and removed topic-IO labels Mar 13, 2017
@LaurentMazuel
Copy link
Mannequin Author

LaurentMazuel mannequin commented Mar 13, 2017

Just to confirm, I was able to workaround it with Py3.6:

    # client_generated_path.replace(destination_folder)
    shutil.move(client_generated_path, destination_folder)

It is reasonable to think about adding a similar feature to pathlib if it doesn't support it and just special-case the drive-to-drive scenario for Windows

@eryksun
Copy link
Contributor

eryksun commented Mar 14, 2017

Moving a file across volumes isn't atomic. Getting an exception in this case can be useful. There could be a "strict" keyword-only parameter that defaults to False. If it's true, then replace() won't try to move the file.

@ericvsmith
Copy link
Member

I agree this needs to be different from replace(), due to not being atomic. That makes it an enhancement, so I'm removing 3.5.

I'm +1 on Path supporting something like shutil.move().

@ericvsmith ericvsmith added the type-feature A feature request or enhancement label Mar 15, 2017
@brettcannon
Copy link
Member

I also support the idea of getting something like shutil.move() into pathlib.

@brettcannon
Copy link
Member

I should also mention that rename() (https://docs.python.org/3/library/pathlib.html#pathlib.Path.rename) and replace() (https://docs.python.org/3/library/pathlib.html#pathlib.Path.replace) already do exist, so it might be best to add a keyword-only flag to one of those for this use-case.

@eryksun eryksun added 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes and removed OS-windows labels Mar 15, 2021
@eryksun eryksun changed the title Pathlib.replace cannot move file to a different drive on Windows if filename different Support moving across filesystems in pathlib.Path, as shutil.move() does Mar 15, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@barneygale
Copy link
Contributor

I'm also +1 on adding pathlib.Path.move(). However I'd prefer we move the shutil implementation into pathlib, as proposed here. The implementation in shutil would become a stub that calls into pathlib, roughly:

def move(src, dst, copy_function=copy2):
    return pathlib.Path(src).move(dst, copy_function=copy2)

I'm interested in this approach because I'm hoping to add a pathlib.AbstractPath class in future. Users would be able to subclass AbstractPath, and by implementing a small handful of low-level abstract methods like stat(), open() and iterdir(), benefit from high-level methods like move() in their objects. This would work even if the source and destination are different kinds of path (e.g. a local filesystem path and a path in a tar archive).

To achieve this we'd need to make pathlib accept bytes paths, as shutil does. Personally I'm fine with this - it's not much work and perfectly reasonable.

We'd also need to move other shutil functions that move() relies on to pathlib, notably the copy*() functions, excluding copyfileobj() and friends.

On rename() / replace(): I'd like these to call through to move(atomic=True). IIRC these methods are identical on POSIX and only differ slightly on Windows. That's a wrinkle we should smooth out in pathlib IMO.

@zooba
Copy link
Member

zooba commented Jan 31, 2023

I'd prefer we move the shutil implementation into pathlib

Sounds reasonable to me, though would we want to include copy_function in the API? It seems like an opportunity to say that Path is opinionated about the "most move-like way to copy", and if you need to override that then handle it yourself or stick with shutil.

@ericvsmith
Copy link
Member

I think @barneygale 's plan is sound.

@pfmoore
Copy link
Member

pfmoore commented Feb 1, 2023

I agree with @zooba, exposing the copy_function in the pathlib API seems incompatible with the general style of pathlib. Otherwise the plan looks good.

@barneygale
Copy link
Contributor

barneygale commented Feb 1, 2023

I agree about copy_function. Perhaps we could do something like this?

    def copy(self, dst, *, follow_symlinks=True, copy_mode=True, copy_stat=False):
        ...

    def move(self, dst, *, atomic=False, **kwargs):
        if atomic:
            return os.rename(self, dst)
        self.copy(dst, **kwargs)
        self.unlink()

@zooba
Copy link
Member

zooba commented Feb 1, 2023

I don't like the atomic argument, and especially its default. Implementations should get to implement it as efficiently as possible without needing to respect that kind of argument, and callers shouldn't expect that level of control over semantics.

For example, on Windows we explicitly disallow os.rename from moving files to other drives, because it wouldn't match our documented behaviour. But a new WindowsPath.move function should be able to allow that, and take advantage of the OS optimisations (yes, they exist!) for moving a file rather than copying it across drives.

If we were to define semantics for atomic, the only way to allow this would be "atomic=True will try to move, and atomic=False won't bother trying to move", at which point why would anybody ever pass False? Especially since there's no back-compat here, and so code on older versions will have to stick with try: ...rename ... except: ... copy.

We should make these functions an improvement in semantics, not just discoverability, and ideally more appropriate for whatever platform they're being used on than the POSIX semantics inherent to posixmodule.

So I think the only option we want on move is follow_symlinks=True,1 and the method "does whatever is necessary to move the file or directory as efficiently as possible into the new location while matching the semantics of an in-place rename" (roughly, copy_mode=True, copy_stat=True)

Similarly, I think we should simplify copy down to only have follow_symlinks=True,2 and the method "does whatever is necessary to copy the file or directory as efficiently as possible into the new location while matching the semantics of a newly created file" (roughly, copy_mode=some bits, copy_stat=some bits - and this one is why the vague definition is important 😉 )

Footnotes

  1. Where follow_symlinks=False is carefully defined to mean "if self is a link, it moves the link unless that would cause the new link to fail to refer to the same target as the original link, in which case the target is copied into the new location and the original link is removed".

  2. I think the same definition as for move works here, but I haven't thought through this one as thoroughly.

@barneygale
Copy link
Contributor

barneygale commented Feb 1, 2023

My example was a bit rubbish - here's a corrected version:

    def copy(self, dst, *, follow_symlinks=True, copy_mode=True, copy_stat=True):
        ...

    def move(self, dst, *, atomic=False, **kwargs):
        try:
            return os.rename(self, dst)
        except OSError:
            if atomic:
                raise  # Move can't be achieved atomically
        self.copy(dst, **kwargs)
        self.unlink()

So move(atomic=True) would call os.rename() only and allow exceptions to propagate, whereas atomic=False will try a rename, and failing that use a copy/delete. I think there's common use cases for both.

I think I agree with you on removing copy_mode and copy_stat.

@barneygale
Copy link
Contributor

I suppose one could argue that folks should just use os.rename() if they want an atomic rename! :)

@zooba
Copy link
Member

zooba commented Feb 2, 2023

I suppose one could argue that folks should just use os.rename() if they want an atomic rename! :)

Yeah, I'll argue that one ;)

We'll have to explain that move may result in both old and new names existing simultaneously while the move takes place, depending on the operating system and file systems in use. That's a perfectly fine place to say that if you'd rather that not happen, and are willing to accept exceptions if it's unavoidable, then you should use rename.

@barneygale
Copy link
Contributor

Should we deprecate pathlib.Path.rename() and replace() when we add move()?

noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
…thon#121444)

Follow-up to python#120806. Use `os_helper.skip_unless_xattr` to skip testing
xattr preservation when unsupported.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
Add a `Path.copy()` method that copies the content of one file to another.

This method is similar to `shutil.copyfile()` but differs in the following ways:

- Uses `fcntl.FICLONE` where available (see pythonGH-81338)
- Uses `os.copy_file_range` where available (see pythonGH-81340)
- Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.

The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.

Incorporates code from pythonGH-81338 and pythonGH-93152.

Co-authored-by: Eryk Sun <eryksun@gmail.com>
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…ython#120517)

Preparatory work for moving `_rmtree_unsafe()` and `_rmtree_safe_fd()` to
`pathlib._os` so that they can be used from both `shutil` and `pathlib`.

Move implementation-specific setup from `rmtree()` into the safe/unsafe
functions, and give them the same signature `(path, dir_fd, onexc)`.

In the tests, mock `os.open` rather than `_rmtree_safe_fd()` to ensure the
FD-based walk is used, and replace a couple references to
`shutil._use_fd_functions` with `shutil.rmtree.avoids_symlink_attacks`
(which has the same value).

No change of behaviour.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…r()` (python#120715)

In preparation for the addition of `PathBase.rmtree()`, implement
`DummyPath.unlink()` and `rmdir()`, and move corresponding tests into
`test_pathlib_abc` so they're run against `DummyPath`.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…python#120519)

Add support for not following symlinks in `pathlib.Path.copy()`.

On Windows we add the `COPY_FILE_COPY_SYMLINK` flag is following symlinks is disabled. If the source is symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on old Windowses, which we note in the docs.

No news as `copy()` was only just added.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
Add `pathlib.Path.copytree()` method, which recursively copies one
directory to another.

This differs from `shutil.copytree()` in the following respects:

1. Our method has a *follow_symlinks* argument, whereas shutil's has a
   *symlinks* argument with an inverted meaning.
2. Our method lacks something like a *copy_function* argument. It always
   uses `Path.copy()` to copy files.
3. Our method lacks something like a *ignore_dangling_symlinks* argument.
   Instead, users can filter out danging symlinks with *ignore*, or
   ignore exceptions with *on_error*
4. Our *ignore* argument is a callable that accepts a single path object,
   whereas shutil's accepts a path and a list of child filenames.
5. We add an *on_error* argument, which is a callable that accepts
   an `OSError` instance. (`Path.walk()` also accepts such a callable).

Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…ython#120807)

Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and
raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and
fall back to the `PathBase.copy()` implementation.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…ython#120806)

Add *preserve_metadata* keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied.

Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We *might* make these public in future, but it's hard to justify while the ABCs are still private.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…thon#121444)

Follow-up to python#120806. Use `os_helper.skip_unless_xattr` to skip testing
xattr preservation when unsupported.
barneygale added a commit to barneygale/cpython that referenced this issue Jul 20, 2024
AA-Turner added a commit to barneygale/cpython that referenced this issue Jul 20, 2024
barneygale added a commit that referenced this issue Jul 20, 2024
Add a `Path.rmtree()` method that removes an entire directory tree, like
`shutil.rmtree()`. The signature of the optional *on_error* argument
matches the `Path.walk()` argument of the same name, but differs from the
*onexc* and *onerror* arguments to `shutil.rmtree()`. Consistency within
pathlib is probably more important.

In the private pathlib ABCs, we add an implementation based on `walk()`.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
barneygale added a commit that referenced this issue Jul 20, 2024
…121438)

Add *preserve_metadata* keyword-only argument to `pathlib.Path.copytree()`,
defaulting to false. When set to true, we copy timestamps, permissions,
extended attributes and flags where available, like `shutil.copystat()`.
barneygale added a commit to barneygale/cpython that referenced this issue Jul 21, 2024
Add a `Path.move()` method that moves a file or directory tree and returns
a new `Path` instance.

This method is similar to `shutil.move()`, except that it doesn't accept a
*copy_function* argument, and it doesn't support copying into an existing
directory.
barneygale added a commit to barneygale/cpython that referenced this issue Jul 21, 2024
Add a `Path.move()` method that moves a file or directory tree and returns
a new `Path` instance.

This method is similar to `shutil.move()`, except that it doesn't accept a
*copy_function* argument, and it doesn't support copying into an existing
directory.
@barneygale
Copy link
Contributor

barneygale commented Jul 27, 2024

I've been closely following the shutil API so far, resulting in this:

accepts delete copy move
file unlink() copy()
dir (empty) rmdir()
dir (any) rmtree() copytree()
anything move()

(new methods shown in italics)

Having given it some thought, I have the following concerns:

  1. copy() and move() are the shortest/most obvious methods, but the former accepts only files, whereas the latter accepts both files and directories
  2. move() can be used to move any kind of file, but there are no corresponding methods for copying and deletion.

And so I'd like to propose that:

  1. We rename copy() to copyfile()
  2. We rename rmtree() to delete(), and copytree() to copy(), and add support for files to both methods.

This would result in:

accepts delete copy move
file unlink() copyfile()
dir (empty) rmdir()
dir (any)
anything delete() copy() move()

Thoughts?

@barneygale
Copy link
Contributor

(Also, if we wanted to add support for moving/copying into an existing directory via new methods, it would be better to have move_into() and copy_into(), rather than move_into(), copyfile_into() and copytree_into() IMO.)

@pfmoore
Copy link
Member

pfmoore commented Jul 27, 2024

+1 from me on the revised names and the _into forms.

Do we even need a copyfile() function, if copy() will do the same?

@barneygale
Copy link
Contributor

Do we even need a copyfile() function, if copy() will do the same?

Good point - perhaps not!

barneygale added a commit to barneygale/cpython that referenced this issue Jul 27, 2024
Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
barneygale added a commit to barneygale/cpython that referenced this issue Jul 27, 2024
Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)

Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)
barneygale added a commit that referenced this issue Aug 7, 2024
Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
barneygale added a commit that referenced this issue Aug 11, 2024
Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)

Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
barneygale added a commit to barneygale/cpython that referenced this issue Aug 12, 2024
barneygale added a commit to barneygale/cpython that referenced this issue Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.14 new features, bugs and security fixes stdlib Python modules in the Lib dir topic-pathlib type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

9 participants
X Tutup