Trouble with dir_util created dir cache #55157
Comments
|
There is a problem with the dir_util cache (defined by the "_path_created" global variable). It appears to be useful, but it isn't. Just repeat these steps to understand the problem I'm facing:
Expected behavior: What happens: I'm working with parallel applications that deal with files asynchronously, and this problem gave me a headache. Anyway, the solution is easy: remove the cache. |
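The behavior reported above can be sketched with a simplified model of the cache. This is not the distutils source: `cached_mkpath` is a hypothetical stand-in written for illustration, and only the `_path_created` name comes from the report. It assumes the cache is consulted whenever the directory is missing on disk, which is the behavior the report describes.

```python
import os
import shutil
import tempfile

# Hypothetical stand-in for the module-level cache named in the report.
_path_created = {}


def cached_mkpath(name):
    """Simplified model: create `name`, but skip paths already in the cache."""
    abs_name = os.path.abspath(name)
    if os.path.isdir(abs_name):
        return []
    if _path_created.get(abs_name):
        # Cache hit: the directory is assumed to still exist, so nothing is done.
        return []
    os.makedirs(abs_name)
    _path_created[abs_name] = True
    return [abs_name]


base = tempfile.mkdtemp()
target = os.path.join(base, "data")

first = cached_mkpath(target)             # creates the directory
exists_after_first = os.path.isdir(target)

shutil.rmtree(target)                     # removed behind the cache's back

second = cached_mkpath(target)            # cache hit, nothing is created
exists_after_second = os.path.isdir(target)

shutil.rmtree(base)                       # clean up the temporary tree
```

In this model the second call returns an empty list and the directory stays missing, because the cache is never invalidated when the directory is removed by something else.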
|
Thanks for the report and diagnosis. Why does your application randomly remove files created by distutils? |
|
Well. My application does not actually remove the folders randomly; it just can't guarantee, for a given process, how the folder it created will be deleted. I have many tasks running on a cluster using the same disk. Some tasks create the folders/files and some of them remove them after processing. What each task will do depends on the availability of computational resources. The application is also aware of possible user interaction, that is, I need to be able to manipulate folders manually (adding or removing) without crashing the application or corrupting data. |
|
Maybe I’m tired, but I don’t understand why your application would remove directories that distutils creates. We’ve fixed a bug related to a race condition when *creating* directories (bpo-9281), but behaving sanely on an unstable tree seems something different to me. |
|
Suppose the application creates one folder and adds some data to it:
While the application is still running (it is not using the folder anymore), you see the data, copy it somewhere and delete everything manually using the terminal. After some time (maybe a week or a month later, it doesn't really matter), the application wants to write to that folder again, but oops, the folder was removed. As the application is very well coded :-), it checks for that folder, notes that it doesn't exist anymore and needs to be recreated. But when the application tries to do so, nothing happens, because the cache is not updated. ;/ Maybe the distutils package was not designed for the purpose I am using it for (I am not using it to install python modules or anything), but this behavior is not well documented anyway. If you really think the cache is important, two things need to be done:
|
|
“Maybe distutils package was not designed for the purpose I am using it (I am not using it to install python modules or anything), but this behavior is not well documented anyway.” Aaaah, I had no idea you were using the function directly for something unrelated to distutils’s purpose. There is no clear distinction between public and private functions in distutils, so I understand how you could find this seemingly useful function and use it in your code. The solution is to use a public function like os.makedirs. For distutils, I don’t think a doc change is needed: the cache is an implementation detail. |
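A minimal sketch of the os.makedirs approach suggested above, for the remove-then-recreate scenario from this thread. The `ensure_dir` helper and the directory names are hypothetical, and the `exist_ok` parameter assumes Python 3.2 or later.

```python
import os
import shutil
import tempfile


def ensure_dir(path):
    """Hypothetical helper: create `path` if missing, checking the filesystem each time."""
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


base = tempfile.mkdtemp()
target = os.path.join(base, "results", "run1")

created = ensure_dir(target)                   # first creation

shutil.rmtree(os.path.join(base, "results"))   # someone deletes the tree manually

recreated = ensure_dir(target)                 # recreated, no stale state involved

shutil.rmtree(base)                            # clean up the temporary tree
```

Because os.makedirs keeps no record of earlier calls, the second call looks at the disk again and recreates the directory.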
|
You were right, "os.makedirs" fits my needs. :-) Anyway, I still think the change in the documentation is needed. The user should be aware of the behavior when they call this function twice. In my opinion, the documentation should be clear about everything. We could call this an implementation detail iff it does not affect anything externally, but this is not the case (it affects subsequent calls). This function does almost the same thing as "os.makedirs", but the reason for the difference is described only in a comment inside the code. We know this is poor programming style. This information needs to be available in the documentation too. |
|
“This is not an implementation detail, it is part of the way the function works. The user should be aware of the behavior when [they] call this function twice.” I would agree if mkpath were a public function. I think it’s an implementation detail used by other distutils code, especially commands. Considering that dir_util is gone in distutils2, I see no benefit in editing the doc. |
|
"I would agree if mkpath were a public function." If new people get involved in distutils development they will need to read all the code, line by line and every comment, because the old developers decided not to document the inner workings of its functions. "Considering that dir_util is gone in distutils2, I see no benefit in editing the doc." |
I am now closing this issue. If I have misunderstood your last message and you’re not satisfied with that, please reopen. Thanks again for your report, and don’t hesitate to report any bug you may find in the future. |
|
Please reopen this issue. The distutils2 project has now been abandoned, so that's no longer a justification for taking no action. At the very least, the documentation should be fixed to either warn about this surprising behavior, or make it clear that the dir_util functions are for distutils internal use only. |
|
distutils.dir_util is easily found in the documentation. If this behaviour is not fixed, at least the docs should state dir_util is not recommended for public use. |
|
Agreed, a doc PR to warn against using any of the distutils *util modules would be useful. |
|
distutils is deprecated now. |

