bpo-37936: Systematically distinguish rooted vs. unrooted in .gitignore.#15823
bpo-37936: Systematically distinguish rooted vs. unrooted in .gitignore.#15823zware merged 10 commits intopython:masterfrom
Conversation
A root cause of bpo-37936 is that it's easy to write a .gitignore rule that's intended to apply to a specific file (e.g., the `pyconfig.h` generated by `./configure`) but actually applies to all similarly-named files in the tree (e.g., `PC/pyconfig.h`.) Specifically, any rule with no non-trailing slashes is applied in an "unrooted" way, to files anywhere in the tree. This means that if we write the rules in the most obvious-looking way, then * for specific files we want to ignore that happen to be in subdirectories (like `Modules/config.c`), the rule will work as intended, staying "rooted" to the top of the tree; but * when a specific file we want to ignore happens to be at the root of the repo (like `platform`), then the obvious rule (`platform`) will apply much more broadly than intended: if someone tries to add a file or directory named `platform` somewhere else in the tree, it will unexpectedly get ignored. That's surprising behavior that can make the .gitignore file's behavior feel finicky and unpredictable. To avoid it, we can simply always give a rule "rooted" behavior when that's what's intended, by systematically using leading slashes. Further, to help make the pattern obvious when looking at the file and minimize any need for thinking about the syntax when adding new rules: separate the rules into one group for each type, with brief comments identifying them. For most of these rules it's clear whether they're meant to be rooted or unrooted, but in a handful of cases I've only guessed. In that case the safer default (the choice that won't hide information) is the narrower, rooted meaning, with a leading slash. If for some of these the unrooted meaning is desired after all, it'll be easy to move them to the unrooted section at the top.
zware
left a comment
There was a problem hiding this comment.
This mostly looks good, but I have several suggestions for further cleanup.
.gitignore
Outdated
| /config.status | ||
| /config.status.lineno | ||
| /db_home | ||
| /.hg/ |
There was a problem hiding this comment.
This can safely be ignored globally; we're not going to check in a mercurial repository anywhere :)
There was a problem hiding this comment.
Solid prediction, I think :)
.gitignore
Outdated
| /python-gdb.py | ||
| /python.exe-gdb.py | ||
| /reflog.txt | ||
| /.svn/ |
There was a problem hiding this comment.
This should also be globally ignored.
.gitignore
Outdated
| /python.exe-gdb.py | ||
| /reflog.txt | ||
| /.svn/ | ||
| /.coverage |
.gitignore
Outdated
| /config.status.lineno | ||
| /db_home | ||
| /.hg/ | ||
| /ipch/ |
There was a problem hiding this comment.
I don't think this was meant to be rooted at /; IIRC it's intermediate files in the Windows build rooted in PCbuild/<something>, and thus can probably just go away since those entire directories are ignored now.
.gitignore
Outdated
| /libpython*.a | ||
| /libpython*.so* | ||
| /libpython*.dylib | ||
| /libpython*.dll |
There was a problem hiding this comment.
Each of these 4 can be generalized to ignore anything with these extensions; we don't have anything with these extensions checked in and likely never will.
.gitignore
Outdated
| pybuilddir.txt | ||
| /autom4te.cache | ||
| /build/ | ||
| /buildno |
There was a problem hiding this comment.
I believe this is an ancient leftover from before the buildno section of sys.version referred to a VCS hash; it can just go away.
.gitignore
Outdated
| /python-config | ||
| /python-config.py | ||
| /python.bat | ||
| /python.exe |
There was a problem hiding this comment.
We could globally ignore *.exe, with a specific exception for !Lib/distutils/command/*.exe.
.gitignore
Outdated
| /config.log | ||
| /config.status | ||
| /config.status.lineno | ||
| /db_home |
There was a problem hiding this comment.
This is a holdover from the bsddb package, and can go away (so can the explicit callout of db_home in Lib/test/libregrtest/runtest.py).
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be put in the comfy chair! |
.gitignore
Outdated
| /config.log | ||
| /config.status | ||
| /config.status.lineno | ||
| /db_home |
.gitignore
Outdated
| /libpython*.a | ||
| /libpython*.so* | ||
| /libpython*.dylib | ||
| /libpython*.dll |
.gitignore
Outdated
| /python-config | ||
| /python-config.py | ||
| /python.bat | ||
| /python.exe |
.gitignore
Outdated
| /config.status | ||
| /config.status.lineno | ||
| /db_home | ||
| /.hg/ |
There was a problem hiding this comment.
Solid prediction, I think :)
|
@zware Also just edited the PR description (aka draft commit message) to credit you as a coauthor, because that seems appropriate given all the new substantive changes you contributed 🙂 (If you disagree, though, please feel free to take it out when merging.) |
|
Sorry, @gnprice and @zware, I could not cleanly backport this to |
|
Sorry @gnprice and @zware, I had trouble checking out the |
…itignore (pythonGH-15823) A root cause of bpo-37936 is that it's easy to write a .gitignore rule that's intended to apply to a specific file (e.g., the `pyconfig.h` generated by `./configure`) but actually applies to all similarly-named files in the tree (e.g., `PC/pyconfig.h`.) Specifically, any rule with no non-trailing slashes is applied in an "unrooted" way, to files anywhere in the tree. This means that if we write the rules in the most obvious-looking way, then * for specific files we want to ignore that happen to be in subdirectories (like `Modules/config.c`), the rule will work as intended, staying "rooted" to the top of the tree; but * when a specific file we want to ignore happens to be at the root of the repo (like `platform`), then the obvious rule (`platform`) will apply much more broadly than intended: if someone tries to add a file or directory named `platform` somewhere else in the tree, it will unexpectedly get ignored. That's surprising behavior that can make the .gitignore file's behavior feel finicky and unpredictable. To avoid it, we can simply always give a rule "rooted" behavior when that's what's intended, by systematically using leading slashes. Further, to help make the pattern obvious when looking at the file and minimize any need for thinking about the syntax when adding new rules: separate the rules into one group for each type, with brief comments identifying them. For most of these rules it's clear whether they're meant to be rooted or unrooted, but in a handful of cases I've only guessed. In that case the safer default (the choice that won't hide information) is the narrower, rooted meaning, with a leading slash. If for some of these the unrooted meaning is desired after all, it'll be easy to move them to the unrooted section at the top.. (cherry picked from commit 455122a) Co-authored-by: Greg Price <gnprice@gmail.com>
|
GH-15900 is a backport of this pull request to the 3.8 branch. |
|
The 3.7 backport is just complicated enough that I'm going to leave it alone. Thanks for the patch, @gnprice! |
…itignore (GH-15823) (GH-15900) A root cause of bpo-37936 is that it's easy to write a .gitignore rule that's intended to apply to a specific file (e.g., the `pyconfig.h` generated by `./configure`) but actually applies to all similarly-named files in the tree (e.g., `PC/pyconfig.h`.) Specifically, any rule with no non-trailing slashes is applied in an "unrooted" way, to files anywhere in the tree. This means that if we write the rules in the most obvious-looking way, then * for specific files we want to ignore that happen to be in subdirectories (like `Modules/config.c`), the rule will work as intended, staying "rooted" to the top of the tree; but * when a specific file we want to ignore happens to be at the root of the repo (like `platform`), then the obvious rule (`platform`) will apply much more broadly than intended: if someone tries to add a file or directory named `platform` somewhere else in the tree, it will unexpectedly get ignored. That's surprising behavior that can make the .gitignore file's behavior feel finicky and unpredictable. To avoid it, we can simply always give a rule "rooted" behavior when that's what's intended, by systematically using leading slashes. Further, to help make the pattern obvious when looking at the file and minimize any need for thinking about the syntax when adding new rules: separate the rules into one group for each type, with brief comments identifying them. For most of these rules it's clear whether they're meant to be rooted or unrooted, but in a handful of cases I've only guessed. In that case the safer default (the choice that won't hide information) is the narrower, rooted meaning, with a leading slash. If for some of these the unrooted meaning is desired after all, it'll be easy to move them to the unrooted section at the top. (cherry picked from commit 455122a) Co-authored-by: Greg Price <gnprice@gmail.com>
| *.iml | ||
| *.o | ||
| *.a | ||
| *.so* |
There was a problem hiding this comment.
fwiw, this is causing a minor inconvenience in packaging for debian as this now matches debian/README.source
the original pattern was libpython*.so* could that be restored?
There was a problem hiding this comment.
debian/README.source does not exist in the cpython repo, so presumably you can unignore it wherever you're patching it in. I would think you would want to explicitly unignore anything added to avoid issues like this.
I prefer to keep *.so*, because we we don't want to accidentally check in any compiled extension modules (even though that shouldn't happen anyway and any *.so other than libpython* should theoretically only appear in the ignored build directory, ignoring them anyway seems better to me).
However, I could be persuaded to expand that rule to
*.so
*.so.*
If that works for you @asottile, please submit a PR and ping me.
In python#15823 the pattern was changed from `libpython*.so*` to `*.so*` which matches a bit too greedily for some packagers. For instance this trips up `debian/README.source`. A more specific pattern fixes this issue.
In GH-15823 the pattern was changed from `libpython*.so*` to `*.so*` which matches a bit too greedily for some packagers. For instance this trips up `debian/README.source`. A more specific pattern fixes this issue.
In pythonGH-15823 the pattern was changed from `libpython*.so*` to `*.so*` which matches a bit too greedily for some packagers. For instance this trips up `debian/README.source`. A more specific pattern fixes this issue.
In pythonGH-15823 the pattern was changed from `libpython*.so*` to `*.so*` which matches a bit too greedily for some packagers. For instance this trips up `debian/README.source`. A more specific pattern fixes this issue.
…re (pythonGH-15823) A root cause of bpo-37936 is that it's easy to write a .gitignore rule that's intended to apply to a specific file (e.g., the `pyconfig.h` generated by `./configure`) but actually applies to all similarly-named files in the tree (e.g., `PC/pyconfig.h`.) Specifically, any rule with no non-trailing slashes is applied in an "unrooted" way, to files anywhere in the tree. This means that if we write the rules in the most obvious-looking way, then * for specific files we want to ignore that happen to be in subdirectories (like `Modules/config.c`), the rule will work as intended, staying "rooted" to the top of the tree; but * when a specific file we want to ignore happens to be at the root of the repo (like `platform`), then the obvious rule (`platform`) will apply much more broadly than intended: if someone tries to add a file or directory named `platform` somewhere else in the tree, it will unexpectedly get ignored. That's surprising behavior that can make the .gitignore file's behavior feel finicky and unpredictable. To avoid it, we can simply always give a rule "rooted" behavior when that's what's intended, by systematically using leading slashes. Further, to help make the pattern obvious when looking at the file and minimize any need for thinking about the syntax when adding new rules: separate the rules into one group for each type, with brief comments identifying them. For most of these rules it's clear whether they're meant to be rooted or unrooted, but in a handful of cases I've only guessed. In that case the safer default (the choice that won't hide information) is the narrower, rooted meaning, with a leading slash. If for some of these the unrooted meaning is desired after all, it'll be easy to move them to the unrooted section at the top.
A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
pyconfig.hgenerated by./configure) but actually applies to allsimilarly-named files in the tree (e.g.,
PC/pyconfig.h.)Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree. This means that if we
write the rules in the most obvious-looking way, then
for specific files we want to ignore that happen to be in
subdirectories (like
Modules/config.c), the rule will workas intended, staying "rooted" to the top of the tree; but
when a specific file we want to ignore happens to be at the root of
the repo (like
platform), then the obvious rule (platform) willapply much more broadly than intended: if someone tries to add a
file or directory named
platformsomewhere else in the tree, itwill unexpectedly get ignored.
That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.
To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.
Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.
For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed. In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash. If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.
While here, also made some rules more general where that seems
most natural, and deleted some rules that appear long obsolete.
Co-Authored-By: Zachary Ware zach@python.org
https://bugs.python.org/issue37936