The Wayback Machine - https://web.archive.org/web/20230125125723/https://github.com/python/cpython/issues/55157
Trouble with dir_util created dir cache #55157

Closed
diegoqueiroz mannequin opened this issue Jan 19, 2011 · 14 comments
Labels
3.7 3.8 docs Documentation in the Doc dir type-bug An unexpected behavior, bug, or error

Comments

@diegoqueiroz
Mannequin

diegoqueiroz mannequin commented Jan 19, 2011

BPO 10948
Nosy @tarekziade, @merwok, @mhsmith, @ivtashev

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2011-01-19.17:07:10.607>
labels = ['3.8', 'type-bug', '3.7', 'docs']
title = 'Trouble with dir_util created dir cache'
updated_at = <Date 2019-03-14.00:58:58.366>
user = 'https://bugs.python.org/diegoqueiroz'

bugs.python.org fields:

activity = <Date 2019-03-14.00:58:58.366>
actor = 'eric.araujo'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2011-01-19.17:07:10.607>
creator = 'diegoqueiroz'
dependencies = []
files = []
hgrepos = []
issue_num = 10948
keywords = []
message_count = 13.0
messages = ['126540', '126548', '126550', '126551', '126554', '126564', '126568', '126569', '126625', '126773', '332972', '337779', '337892']
nosy_count = 5.0
nosy_names = ['tarek', 'eric.araujo', 'diegoqueiroz', 'Malcolm Smith', 'ivtashev']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'pending'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue10948'
versions = ['Python 2.7', 'Python 3.7', 'Python 3.8']

@diegoqueiroz
Mannequin Author

diegoqueiroz mannequin commented Jan 19, 2011

There is a problem with dir_util cache (defined by "_path_created" global variable).

It appears to be useful, but it isn't. Repeat these steps to see the problem I'm facing:

  1. Use mkpath to create any path (e.g. /home/user/a/b/c)
  2. Open the terminal and manually delete the directory "/home/user/a" and its contents
  3. Try to create "/home/user/a/b/c" again using mkpath

Expected behavior:
mkpath should create the folder tree again.

What happens:
Nothing. mkpath "thinks" the folder already exists because its creation was cached. Moreover, if you try to create one more folder level (e.g. /home/user/a/b/c/d), it raises an exception: it thinks that part of the tree was already created, so it fails to create the last folder.
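
The failure described above can be reproduced without distutils using a simplified model of a cached directory creator. This is only a sketch: `cached_mkpath` and the `demo` helper are hypothetical names, and the logic is an approximation of the caching behavior reported here, not a copy of the distutils source.

```python
import os
import shutil
import tempfile

# Hypothetical stand-in for the module-level cache described in the report.
_path_created = {}


def cached_mkpath(name):
    """Create `name` and its missing parents, remembering what was created."""
    name = os.path.normpath(name)
    created = []
    if os.path.isdir(name):
        return created
    if _path_created.get(os.path.abspath(name)):
        # Stale hit: the cache says the directory exists, the disk disagrees.
        return created

    # Walk up until an existing ancestor is found.
    head, tail = os.path.split(name)
    tails = [tail]
    while head and tail and not os.path.isdir(head):
        head, tail = os.path.split(head)
        tails.insert(0, tail)

    # Walk back down, creating every level that is not in the cache.
    for part in tails:
        head = os.path.join(head, part)
        abs_head = os.path.abspath(head)
        if _path_created.get(abs_head):
            continue  # skipped even if the directory was deleted meanwhile
        os.mkdir(head)
        created.append(head)
        _path_created[abs_head] = True
    return created


def demo():
    root = tempfile.mkdtemp()
    try:
        target = os.path.join(root, "a", "b", "c")
        cached_mkpath(target)                    # step 1: create a/b/c
        shutil.rmtree(os.path.join(root, "a"))   # step 2: delete "a" behind its back
        second = cached_mkpath(target)           # step 3: try again
        recreated = os.path.isdir(target)
        try:
            cached_mkpath(os.path.join(target, "d"))
            deeper_failed = False
        except OSError:
            deeper_failed = True
        return second, recreated, deeper_failed
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

In this model the second call creates nothing and the tree stays missing, and the attempt to add one more level fails because the cached parents are skipped and `os.mkdir` is called on a path whose parent no longer exists.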

I'm working with parallel applications that deal with files asynchronously, and this problem gave me a headache.

Anyway, the solution is easy: remove the cache.

@diegoqueiroz diegoqueiroz mannequin assigned tarekziade Jan 19, 2011
@diegoqueiroz diegoqueiroz mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 19, 2011
@merwok
Member

merwok commented Jan 19, 2011

Thanks for the report and diagnosis. Why does your application randomly remove files created by distutils?

@merwok merwok assigned merwok and unassigned tarekziade Jan 19, 2011
@diegoqueiroz
Mannequin Author

diegoqueiroz mannequin commented Jan 19, 2011

Well, my application does not actually remove the folders randomly; it just can't guarantee, for a given process, how the folders it created will be deleted.

I have many tasks running on a cluster using the same disk. Some tasks create the folders/files and some remove them after processing. What each task does depends on the availability of computational resources.

The application is also aware of possible user interaction, that is, I need to be able to manipulate folders manually (adding or removing) without crashing the application or corrupting data.

@merwok
Member

merwok commented Jan 19, 2011

Maybe I’m tired, but I don’t understand why your application would remove directories that distutils creates. We’ve fixed a bug related to a race condition when *creating* directories (bpo-9281), but behaving sanely on an unstable tree seems something different to me.

@diegoqueiroz
Mannequin Author

diegoqueiroz mannequin commented Jan 19, 2011

Suppose the application creates one folder and adds some data to it:

  • /scratch/a/b/c

While the application is still running (it is not using the folder anymore), you see the data, copy it somewhere and delete everything manually using the terminal.

After some time (maybe a week or a month later, it doesn't really matter), the application wants to write to that folder again, but oops, the folder was removed. As the application is very well coded :-), it checks for that folder, notes that it doesn't exist anymore and that it needs to be recreated.

But when the application tries to do so, nothing happens, because the cache is not updated. ;/

Maybe distutils package was not designed for the purpose I am using it (I am not using it to install python modules or anything), but this behavior is not well documented anyway.

If you really think the cache is important, two things need to be done:

  1. Implement a way to update/clear the cache
  2. Include details about the cache and its implications in the distutils documentation
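
As an illustration of the first suggestion, here is a minimal sketch of a directory creator whose cache can be cleared explicitly. `CachedDirMaker`, `clear_cache` and `demo` are hypothetical names introduced only for this example; they are not part of distutils or any other library.

```python
import os
import shutil
import tempfile


class CachedDirMaker:
    """Hypothetical helper: caches created directories, but can forget them."""

    def __init__(self):
        self._created = set()

    def mkpath(self, path):
        path = os.path.abspath(path)
        if path in self._created:
            return False  # trusted from the cache; the disk is not consulted
        os.makedirs(path, exist_ok=True)
        self._created.add(path)
        return True

    def clear_cache(self):
        """Forget everything, so the next mkpath() checks the disk again."""
        self._created.clear()


def demo():
    root = tempfile.mkdtemp()
    maker = CachedDirMaker()
    try:
        target = os.path.join(root, "a", "b", "c")
        maker.mkpath(target)
        shutil.rmtree(os.path.join(root, "a"))
        maker.mkpath(target)              # stale cache: nothing happens
        before = os.path.isdir(target)
        maker.clear_cache()
        maker.mkpath(target)              # cache is empty: tree is recreated
        after = os.path.isdir(target)
        return before, after
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

The trade-off is that callers must know when the tree may have changed underneath them, which is exactly the kind of detail the second suggestion asks to have documented.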

@merwok
Member

merwok commented Jan 19, 2011

“Maybe distutils package was not designed for the purpose I am using it (I am not using it to install python modules or anything), but this behavior is not well documented anyway.”

Aaaah, I had no idea you were using the function directly for something unrelated to distutils’s purpose. There is no clear distinction between public and private functions in distutils, so I understand how you could find this seemingly useful function and use it in your code.

The solution is to use a public function like os.makedirs. For distutils, I don’t think a doc change is needed: the cache is an implementation detail.
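
A minimal sketch of the suggested alternative, assuming Python 3.2 or later (where `os.makedirs` accepts `exist_ok`). The `ensure_dir` and `demo` helpers are illustrative names, not an existing API. Because nothing is cached, every call looks at the filesystem, so a tree deleted by another process or by hand is simply created again.

```python
import os
import shutil
import tempfile


def ensure_dir(path):
    """Create `path` and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


def demo():
    root = tempfile.mkdtemp()
    try:
        target = os.path.join(root, "a", "b", "c")
        first = ensure_dir(target)                      # create a/b/c
        shutil.rmtree(os.path.join(root, "a"))          # delete "a" and its contents
        second = ensure_dir(target)                     # recreated from scratch
        third = ensure_dir(os.path.join(target, "d"))   # one more level also works
        return first, second, third
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

Passing `exist_ok=True` also keeps the call from failing when another task has already created the directory.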

@diegoqueiroz
Mannequin Author

diegoqueiroz mannequin commented Jan 19, 2011

You were right, "os.makedirs" fits my needs. :-)

Anyway, I still think the change in the documentation is needed.
This is not an implementation detail, it is part of the way the function works.

The user should be aware of the behavior when he calls this function twice. In my opinion, the documentation should be clear about everything. We could call this an implementation detail iff it does not affect anything externally, but this is not the case (it affects subsequent calls).

This function does exactly the same as "os.makedirs", but the why is described only in a comment inside the code. We know this is poor programming style. This information needs to be available in the documentation too.

@merwok
Member

merwok commented Jan 19, 2011

“This is not an implementation detail, it is part of the way the function works. The user should be aware of the behavior when [they] call this function twice.”

I would agree if mkpath were a public function. I think it’s an implementation detail used by other distutils code, especially commands. Considering that dir_util is gone in distutils2, I see no benefit in editing the doc.

@diegoqueiroz
Mannequin Author

diegoqueiroz mannequin commented Jan 20, 2011

"I would agree if mkpath were a public function."
So it is better to define what a "public function" is. Any function in any module of any project, if it is intended to be used by other modules, is public by definition.

If new people get involved in distutils development they will need to read all the code, line by line and every comment, because the old developers decided not to document the inner workings of its functions.

"Considering that dir_util is gone in distutils2, I see no benefit in editing the doc."
Well, I know nothing about this. However, if you tell me that distutils2 will replace distutils, I may agree with you and distutils just needs to be deprecated. Otherwise, I keep my opinion.

@merwok
Member

merwok commented Jan 21, 2011

“So it is better to define what a "public function" is.”

That is no easy task. See bpo-10894 for a general discussion. For the particular case of distutils, there is no distinction between internal helpers that we should be free to change and public functions provided to third-party code. That’s one of the reasons we had to fork under a new name to have a chance to clean things up (i.e. make nearly everything private).

“If new people get involved in distutils development they will need to read all the code, line by line and every comment, because the old developers decided not to document the inner workings of its functions.”

A lot of people have been learning distutils internals in recent years: Tarek, the current maintainer; hackers from Montreal; Google Summer of Code students like me. So it is possible to get involved with distutils, starting with one area (network code, or versions and dependencies, or commands, or compilers...). That said, I agree the doc is very lacking, and improving it is one of my big goals for distutils2 in Python 3.3. (I will give priority to important user-facing functions and classes over helpers like mkpath, however.)

“Well, I know nothing about this. However, if you tell me that distutils2 will replace distutils, I may agree with you and distutils just needs to be deprecated. Otherwise, I keep my opinion.”

distutils is frozen and only gets bug fixes; distutils2 is a fork where we can break compatibility to fix the design and behavior. More information on http://tarekziade.wordpress.com/2010/03/03/the-fate-of-distutils-pycon-summit-packaging-sprint-detailed-report/

I am now closing this issue. If I have misunderstood your last message and you’re not satisfied with that, please reopen. Thanks again for your report, and don’t hesitate to report any bug you may find in the future.

@merwok merwok closed this as completed Jan 21, 2011
@mhsmith
Mannequin

mhsmith mannequin commented Jan 4, 2019

Please reopen this issue. The distutils2 project has now been abandoned, so that's no longer a justification for taking no action.

At the very least, the documentation should be fixed to either warn about this surprising behavior, or make it clear that the dir_util functions are for distutils internal use only.

@mhsmith mhsmith mannequin added the 3.7 label Jan 4, 2019
@ivtashev
Mannequin

ivtashev mannequin commented Mar 12, 2019

distutils.dir_util is easily found in the documentation. If this behaviour is not fixed, at least the docs should state dir_util is not recommended for public use.

@merwok
Member

merwok commented Mar 14, 2019

Agreed, a doc PR to warn against using any of the distutils *util modules would be useful.

@merwok merwok added 3.8 docs Documentation in the Doc dir and removed stdlib Python modules in the Lib dir labels Mar 14, 2019
@merwok merwok reopened this Mar 14, 2019
@merwok merwok removed their assignment Mar 14, 2019
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel
Member

distutils is deprecated now.
