tempfile.TemporaryDirectory() removes entire directory tree even if it's a mount-point #80603

riccardomurri · 2019-03-25T11:16:46Z

BPO	36422
Nosy	@giampaolo, @tarekziade, @MojoVampire, @websurfer5
PRs	gh-81483: Add an "onerror" callback parameter to the tempfile.TemporaryDirectory member functions #14292
Files	test-mounted-image.py: onerror test program for mount points (macOS)

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2019-06-29.11:57:21.088>
created_at = <Date 2019-03-25.11:16:45.703>
labels = ['type-bug', 'library', '3.9']
title = "tempfile.TemporaryDirectory() removes entire directory tree even if it's a mount-point"
updated_at = <Date 2019-06-29.11:57:21.074>
user = 'https://bugs.python.org/riccardomurri'

bugs.python.org fields:

activity = <Date 2019-06-29.11:57:21.074>
actor = 'giampaolo.rodola'
assignee = 'none'
closed = True
closed_date = <Date 2019-06-29.11:57:21.088>
closer = 'giampaolo.rodola'
components = ['Library (Lib)']
creation = <Date 2019-03-25.11:16:45.703>
creator = 'riccardomurri'
dependencies = []
files = ['48429']
hgrepos = []
issue_num = 36422
keywords = ['patch']
message_count = 11.0
messages = ['338795', '338804', '338806', '344775', '344776', '344785', '344851', '345022', '345173', '346107', '346878']
nosy_count = 5.0
nosy_names = ['giampaolo.rodola', 'tarek', 'riccardomurri', 'josh.r', 'Jeffrey.Kintscher']
pr_nums = ['14292']
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue36422'
versions = ['Python 3.9']

riccardomurri · 2019-03-25T11:16:45Z

The behavior of tempfile.TemporaryDirectory() is to delete the
temporary directory when done; this behavior cannot be turned off
(there's no delete=Falselike NamedTemporaryFile has instead).

However, in case a filesystem has been mounted on the temporary
directory, this can lead to the entire filesystem being removed.

While I agree that it would be responsibility of the programmer to
ensure that anything that has been mounted on the temp dir is
unmounted, the current behavior makes it quite easy to shoot oneself
in the foot. Consider the following code::

    @contextmanager
    def mount_sshfs(localdir, remote):
        subprocess.run(f"sshfs {remote} {localdir}")
        yield
        subprocess.run(f"fusermount -u {localdir}", check=True)

    with TemporaryDirectory() as tmpdir:
         with mount_sshfs(tmpdir, remote):
              # ... do stuff ...

Now, even if the fusermount call fails, cleanup of
TemporaryDirectory() will be performed and the remote filesystem
will be erased!

Is there a way this pattern can be prevented or at least mitigated?
Two options that come to mind:

add a `delete=True/False` option to `TemporaryDirectory` like `NamedTemporaryFile` already has
add a `delete_on_error` option to avoid performing cleanup during error exit from a `with:` block

I have seen this happen with Py 3.6 but it's likely there in the entire 3.x series since TemporaryDirectory was added to stdlib.

Thanks,
Riccardo

MojoVampire · 2019-03-25T15:43:07Z

Allowing delete_on_error=False is kind of defeating the purpose here; the whole point of the with statement is guaranteed, consistent cleanup in both normal and error cases. If you knew enough to know you needed to pass delete_on_error, you'd also know enough to know you should be handling errors properly in the first place, e.g. by changing your mount_sshfs manager to:

    @contextmanager
    def mount_sshfs(localdir, remote):
        subprocess.run(f"sshfs {remote} {localdir}")
        try:
            yield
        finally:
            subprocess.run(f"fusermount -u {localdir}", check=True)

so it actually performed the guaranteed cleanup you expected from it.

I don't see anything wrong with adding a delete=True argument to TemporaryDirectory, though I'm not seeing it as being as useful as it is with TemporaryFile (since the "rewrite file to tempfile, atomic replace old file with new file" pattern for updating a file safely doesn't transfer directly to directories, where atomic renames aren't an option).

It just seems like your fundamental problem is code that doesn't properly handle failure, and I don't think the solution is to make TemporaryDirectory handle failure badly as well.

riccardomurri · 2019-03-25T15:53:24Z

you should be handling errors properly in the first place,
e.g. by changing your mount_sshfs manager to:
@contextmanager
def mount_sshfs(localdir, remote):
    subprocess.run(f"sshfs {remote} {localdir}")
    try:
        yield
    finally:
        subprocess.run(f"fusermount -u {localdir}", check=True)
so it actually performed the guaranteed cleanup you expected from it.

This would fix the case where errors occur in the "yield" part of the
mount_sshfs context manager, but would not protect from errors in
the fusermount -u call itself: if fusermount -u fails and throws
an exception, the entire mounted filesystem will be erased.

I would contend that, in general, TemporaryDirectory.cleanup()
should stop at filesystem boundaries and not descend filesystems
mounted in the temporary directory tree (whether the mount has been
done via a context manager as in the example above or by any other
means is irrelevant).

websurfer5 · 2019-06-05T22:56:16Z

Only deleting from the local filesystem seems reasonable to me. In the context of a temporary directory tree, I don't see a semantic difference between "unmounting a mount point" and "removing a subdirectory entry" -- i.e. they both remove the offending subdirectory tree from the temporary directory tree.

Internally, tempfile.TemporaryDirectory() calls shutil.rmtree() with a custom error handler. The handler tries to reset the permissions on the offending entry and calls os.unlink(). If this throws an error, it recursively calls shutil.rmtree() on the offending entry (ignoring the temp directory entry itself). This seems like where a mounted directory tree would get deleted.

shutil.rmtree() follows the "rm -rf" semantics and refuses to delete a mount point. It seems reasonable to me to add special handling for mount points to the error handler in tempfile.TemporaryDirectory(). The high-level logic would be something like:

if os.path.ismount(path):
    try:
        unmount_func(path)
        _shutil.rmtree(path)
    except FileNotFoundError:
        pass

The tricky part is implementing unmount_func() in a portable manner since there is no such functionality currently implemented in the Python library.

Also, what would be the appropriate behavior when unmount_func() fails (e.g. for a permission error)? Right now, any exceptions other than IsADirectoryError, IsADirectoryError, and FileNotFoundError are passed on to the caller.

websurfer5 · 2019-06-05T22:59:11Z

(correcting typos)

What would be the appropriate behavior when unmount_func() fails (e.g. for a permission error)? Right now, any exceptions other than IsADirectoryError, PermissionError, and FileNotFoundError are passed on to the caller.

websurfer5 · 2019-06-06T03:26:26Z

Since tempfile.TemporaryDirectory() already returns some exceptions to the caller when it can't figure out how to delete a directory item, I don't see a problem with throwing PermissionError when encountering a mount point. This would preserve the 'rm -rf' semantics of the underlying shutil.rmtree() calls.

websurfer5 · 2019-06-06T18:37:03Z

Since having tempfile.TemporaryDirectory() automatically unmount mount points is difficult to implement in a portable way (Windows, BSD/macOS, and Linux are all different), a shorter-term solution could be to add an 'onerror' callback function as a class initialization parameter. This would allow the caller to inspect and try to handle any directory entries that the cleanup() function can't handle (like unmounting a mount point).

websurfer5 · 2019-06-08T05:32:38Z

Another data point:

On macOS 10.14.4, unlink() returns EBUSY when it tries to delete the mount point. tempfile.TemporaryDirectory() doesn't handle EBUSY and ends up skipping over the mount point and propagates the OSError exception up to the caller. It leaves the mounted directory tree untouched.

websurfer5 · 2019-06-11T01:53:32Z

tempfile.TemporaryDirectory() relies upon shutil.rmtree() to do the actual cleanup. Up through 3.7, it simply calls shutil.rmtree(). 3.8 adds some more logic using the onerror callback parameter of shutil.rmtree() to try changing the permissions on otherwise undeletable directory entries. In order to make tempfile.TemporaryDirectory() handle mount points, shutil.rmtree() requires some enhancements so that mount points can be detected before their contents are deleted.

I am working on some generic enhancements to tempfile.TemporaryDirectory() and shutil.rmtree() that add (hopefully) useful functionality that covers mount points as well as other corner cases:

Add an "onitem" callback paramter to shutil.rmtree() that, if provided, gets called for each directory entry as it is encountered. This allows the caller to perform any required special handling of individual directory entries (e.g. unmounting a mount point, closing a file, shutting down a named pipe, etc.).
Add an "onerror" callback parameter to the tempfile.TemporaryDirectory member functions so that the caller can perform special handling for directory items that it can't automatically delete. The caller created the undeletable directory entries, so it is reasonable to believe the caller may know how to unmake what they made.
Add an "onitem" callback parameter to the tempfile.TemporaryDirectory member functions that they pass on to shutil.rmtree().

I debated implementing "ondelete" instead of "onitem". "ondelete" would be called just prior to deleting an item, while "onitem" is called when the item is encountered. The difference in semantics is that a subdirectory entry triggers a call to "onitem" immediately, while "ondelete" would get called after the subdirectory tree had been traversed and all of its items deleted. "onitem" seems more useful because it allows the caller to implement special handling per item.

I will have a pull request ready soon. In the mean time, I am open to any other suggestions for handling non-portable corner cases.

websurfer5 · 2019-06-20T08:04:55Z

I implemented an onerror callback for tempfile.TemporaryDirectory() and confirmed that it cannot be used to unmount a mount point. Attempting to unmount the mount point from within the callback results in a resource busy error. It may be related to shutil.rmtree() holding an open file descriptor to the parent directory of the mount point.

I uploaded the program I used for testing on macOS (test-mounted-image.py). The subprocess.run() calls can be modified to run it on other platforms.

giampaolo · 2019-06-29T11:57:21Z

in case a filesystem has been mounted on the temporary directory, this can lead to the entire filesystem being removed

-1

That is expected behavior and the use case looks pretty unusual. Such a new parameter wouldn't even be supported by other "batteries" since there's no portable/standard way to either mount or unmount a directory (in fact you had to use a subprocess in your unit-tests).

A delete=bool parameter would be a more reasonable proposal in principle but:

if you want to keep the directory around then you can just use tempfile.mkdtemp() (see "there should preferably be only one way to do it")
it would conflict with the context manager usage which is expected to delete the dir on ctx manager exit

In summary, I think this would over-complicate the API for no good reason.
I'm closing this out as rejected.

riccardomurri mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Mar 25, 2019

giampaolo added the 3.9 only security fixes label Jun 29, 2019

giampaolo closed this as completed Jun 29, 2019

ezio-melotti transferred this issue from another repository Apr 10, 2022

This was referenced Jan 15, 2024

Add an "onerror" callback parameter to the tempfile.TemporaryDirectory member functions #81483

Closed

gh-81483: Add an "onerror" callback parameter to the tempfile.TemporaryDirectory member functions #14292

Closed

Mar	APR	May
	12
2023	2024	2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tempfile.TemporaryDirectory() removes entire directory tree even if it's a mount-point #80603

tempfile.TemporaryDirectory() removes entire directory tree even if it's a mount-point #80603

riccardomurri mannequin commented Mar 25, 2019

riccardomurri mannequin commented Mar 25, 2019

MojoVampire mannequin commented Mar 25, 2019

riccardomurri mannequin commented Mar 25, 2019

websurfer5 mannequin commented Jun 5, 2019

websurfer5 mannequin commented Jun 5, 2019

websurfer5 mannequin commented Jun 6, 2019

websurfer5 mannequin commented Jun 6, 2019

websurfer5 mannequin commented Jun 8, 2019

websurfer5 mannequin commented Jun 11, 2019

websurfer5 mannequin commented Jun 20, 2019

giampaolo commented Jun 29, 2019

tempfile.TemporaryDirectory() removes entire directory tree even if it's a mount-point #80603

tempfile.TemporaryDirectory() removes entire directory tree even if it's a mount-point #80603

Comments

riccardomurri mannequin commented Mar 25, 2019

riccardomurri mannequin commented Mar 25, 2019

MojoVampire mannequin commented Mar 25, 2019

riccardomurri mannequin commented Mar 25, 2019

websurfer5 mannequin commented Jun 5, 2019

websurfer5 mannequin commented Jun 5, 2019

websurfer5 mannequin commented Jun 6, 2019

websurfer5 mannequin commented Jun 6, 2019

websurfer5 mannequin commented Jun 8, 2019

websurfer5 mannequin commented Jun 11, 2019

websurfer5 mannequin commented Jun 20, 2019

giampaolo commented Jun 29, 2019