gitignore_parser's People

Contributors

ahernsean, butteredptarmigan, ericwb, excitoon, inverse, jherland, julienpalard, matthieumarrast, mherrmann, pidgeybe, ruancomelli, szczeles, thebaptiste, wsams, x-way

gitignore_parser's Issues

Negation rules don't work

It's as simple as that. With the following ignore file, nothing is matched:

*
!/keepme

The file "keepme" is matched by the * and is not inverted by the ! rule.

Looking through the source, I can see that the concept of a negation rule is in there. The ! is parsed, and 'negation' is set in an IgnoreRule. But it's never used.

More specifically, as lines are read from the ignore_file, each line is turned into a rule using rule_from_pattern. Those rules are then all evaluated via:

lambda file_path: any(r.match(file_path) for r in rules)

This treats all rules as equal, but negation rules are not: they negate any match made by the rules above them, so they need separate processing.
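A minimal sketch of the "separate processing" described above, assuming the usual gitignore semantics where the last matching rule decides the outcome. The `Rule` class and `is_ignored` function are hypothetical stand-ins for illustration (they use `fnmatch` rather than the library's own regex translation) and are not gitignore_parser's actual code:

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class Rule:
    """Hypothetical stand-in for an ignore rule (not the library's IgnoreRule)."""
    pattern: str
    negation: bool

    def match(self, path: str) -> bool:
        return fnmatch(path, self.pattern)


def is_ignored(path, rules):
    """Walk the rules in file order; the last rule that matches wins."""
    ignored = False
    for rule in rules:
        if rule.match(path):
            # A plain rule ignores the path, a negation rule un-ignores it.
            ignored = not rule.negation
    return ignored


# Rules corresponding to the ignore file "*" followed by "!/keepme"
# (the leading slash is dropped here because fnmatch has no anchoring).
rules = [Rule("*", negation=False), Rule("keepme", negation=True)]
```

With `any(...)`, the `*` rule alone would be enough to ignore `keepme`; with the ordered loop, the later negation rule overrides it.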

leading slash directory doesn't work. you will see all files within the root directory

Hey @mherrmann, thanks for this Python module.

We discovered that a .gitignore entry for a directory with a leading slash doesn't work: all files within the root directory are still listed.

For instance, given this layout:

root
--node_modules
--etc

Whenever a .gitignore with a leading slash is read, the matcher is still not able to pick up those paths.

An example .gitignore may look like the following:

/node_modules

Matching against it returns False.

directories should be ignored even without trailing slash

To put it in code, I think this test should pass:

    def test_ignore_directory_no_slash(self):
        matches = _parse_gitignore_string('.venv', fake_base_dir='/home/michael')
        self.assertTrue(matches('/home/michael/.venv'))
        self.assertTrue(matches('/home/michael/.venv/folder'))
        self.assertTrue(matches('/home/michael/.venv/file.txt'))

Because according to man gitignore:

$ man gitignore | grep 'If there is a separator at the end of the pattern'
       •   If there is a separator at the end of the pattern then the pattern will only match directories, otherwise the pattern can match both files and directories.

foo/**/Bar/ not working

I tried to use ** to indicate that I want all folders called 'Bar' at any level below foo/ to be ignored. Works fine for git, but given:

.gitignore

foo/**/Bar/

and

ignorePatterns = gitignore_parser.parse_gitignore(ignorefile)
# Raw strings keep the backslashes literal (otherwise "\f" and "\t" are escape sequences)
ignorePatterns(r"c:\foo\Bar\foofile")   # returns False
ignorePatterns(r"c:\foo\test\Bar\foofile")   # returns False
ignorePatterns(r"c:\foo\test\test\Bar\foofile")  # returns False

deprecation: gitignore-parser is being installed using the legacy 'setup.py install' method

When installing one of our projects:

 DEPRECATION: gitignore-parser is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559

Just to notify you.

Unsafe string slicing in IgnoreRule.match method. It fails with Path type.

Synopsis

After upgrading from 0.1.0 to 0.1.1, I hit a failure in an existing test. There appears to be an unsafe operation on abs_path, which is assumed to be a string. That assumption is wrong, because most libraries accept both types: str and Path.

Code impacted: https://github.com/mherrmann/gitignore_parser/blob/v0.1.1/gitignore_parser.py#L143

Steps to reproduce

import tempfile
from pathlib import Path

from gitignore_parser import parse_gitignore


def reproduce_issue():
    with tempfile.TemporaryDirectory() as git_dir:
        git_dir = Path(git_dir)
        gitignore_file_path = git_dir / ".gitignore"
        gitignore_file_path.write_text("file1\n!file2")
        matches = parse_gitignore(gitignore_file_path)

        assert matches(git_dir / "file1")
        assert not matches(git_dir / "file2")


if __name__ == '__main__':
    reproduce_issue()

Observed behaviour

Traceback (most recent call last):
  File "/Users/kukusan2/Workspace/gitignore_issue/reproduce_issue.py", line 20, in <module>
    reproduce_issue()
  File "/Users/kukusan2/Workspace/gitignore_issue/reproduce_issue.py", line 15, in reproduce_issue
    assert matches(git_dir / "file1")
  File "/Users/kukusan2/Workspace/gitignore_issue/venv2/lib/python3.9/site-packages/gitignore_parser.py", line 36, in <lambda>
    return lambda file_path: handle_negation(file_path, rules)
  File "/Users/kukusan2/Workspace/gitignore_issue/venv2/lib/python3.9/site-packages/gitignore_parser.py", line 11, in handle_negation
    if rule.match(file_path):
  File "/Users/kukusan2/Workspace/gitignore_issue/venv2/lib/python3.9/site-packages/gitignore_parser.py", line 143, in match
    if self.negation and abs_path[-1] == '/':
TypeError: 'PosixPath' object is not subscriptable
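One way to make the trailing-slash check tolerate both `str` and `Path` input is to convert through `os.fspath` first. This is only a sketch: `has_trailing_slash` is a hypothetical helper, not part of gitignore_parser. Note that `Path` objects drop a trailing slash on construction, so the check can only ever be true for string input:

```python
import os
from pathlib import PurePosixPath


def has_trailing_slash(path) -> bool:
    """Return True if the path text ends with '/', for str or path-like input."""
    # os.fspath returns str input unchanged and converts path-like objects to str.
    # str.endswith is also safe on an empty string, unlike indexing with [-1].
    return os.fspath(path).endswith("/")


from_string = has_trailing_slash("repo/dir/")
from_path = has_trailing_slash(PurePosixPath("repo/dir/"))
```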

unit tests fail on windows runners

Hello, unit tests fail on Windows runners, specifically the trailing whitespace tests. This started with the recent changes made to fix issues with resolving symlinks. The Path call in _normalize_path retains trailing whitespace on Linux, but removes it on Windows.

A simple (but not elegant) fix I found was to re-apply the trailing whitespace on Windows systems.

# At bottom of file
def _count_trailing_whitespace(text: str):
    count = 0
    for char in reversed(str(text)):
        if char.isspace():
            count += 1
        else:
            break
    return count
# In IgnoreRule
def match(self, abs_path):
    """Returns True or False if the path matches the rule."""
    matched = False
    if self.base_path:
        rel_path = str(_normalize_path(abs_path).relative_to(self.base_path))
    else:
        rel_path = str(_normalize_path(abs_path))
    
    # Path() strips trailing spaces on windows
    if sys.platform.startswith('win'):        
        rel_path += " " * _count_trailing_whitespace(abs_path)
    
    # Path() strips the trailing slash, so we need to preserve it
    # in case of directory-only negation
    if self.negation and isinstance(abs_path, str) and abs_path[-1] == '/':
        rel_path += '/'
    if rel_path.startswith('./'):
        rel_path = rel_path[2:]
    if re.search(self.regex, rel_path):
        matched = True
    return matched

Use spaces consistently

While Python allows you to use tabs or spaces, the community usually chooses spaces.

I noticed this repo contains a mixture, with the majority being tabs.

I would argue that spaces are more in line with the community, and they are the indentation method recommended by PEP 8.

See: https://www.python.org/dev/peps/pep-0008/

If you agree, I can open a PR with these changes and, once #15 is merged, add a linter to enforce this on CI.

Codacy integration

Would you be interested in setting up codacy integration into this project?

It's free and relatively painless, aside from tuning the tools that are used to lint the project. It can scan for multiple things, such as:

  • security issues
  • code quality
  • linting

It can also be used to collect code coverage metrics from the test suite, which I can help set up.

support parsing all .gitignore files in specified directory tree

The gitignore specification states:

when deciding whether to ignore a path, Git normally checks gitignore patterns from multiple sources [...] patterns read from a .gitignore file in the same directory as the path, or in any parent directory, with patterns in the higher level files (up to the top level of the work tree) being overridden by those in lower level files down to the directory containing the file

So it would be nice to have a way to build a list of ignore rules not only from a specific .gitignore file, but from all .gitignore files in a given directory tree.

For my work project I reused your code and built a custom ignore function for shutil.copytree. This work can be backported to your project if you think the idea fits your vision for the package. I'm waiting to hear back from you before opening a PR.
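As a rough illustration of the first step such a feature would need, the sketch below collects every .gitignore under a root directory, parents before children, so that rules from lower-level files can later override those from higher-level ones. `find_gitignore_files` is a hypothetical helper using only the standard library; it is not the code mentioned in this issue, and it does not parse the files:

```python
import os
import tempfile
from pathlib import Path


def find_gitignore_files(root):
    """Return every .gitignore under root, with parent directories first."""
    found = []
    # os.walk is top-down by default, so a directory is visited before its children.
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into the repository metadata
        if ".gitignore" in filenames:
            found.append(Path(dirpath) / ".gitignore")
    return found


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    (root / ".gitignore").write_text("*.log\n")
    (root / "sub" / "deep" / ".gitignore").write_text("!keep.log\n")
    ordered = [p.relative_to(root).as_posix() for p in find_gitignore_files(root)]
```

Each returned file could then be parsed with its own directory as the base, and the resulting rule lists evaluated in this order.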

Clear release notes

What's your opinion on providing clear release notes with each release, so that people know what has changed between versions?

I've seen some projects adopting a CHANGELOG.md file in the repo to track these and others leverage github to handle this.

I don't have a strong opinion on either.

ValueError raised on symlinks that point to a different path

I have a file that is a symbolic link to another location, e.g., /one/two/three -> /four/five/six. When using parse_gitignore on the symlink path (/one/two/three) it throws a ValueError:

raise ValueError("{!r} is not in the subpath of {!r}"

I believe it's caused by this line:

rel_path = str(Path(abs_path).resolve().relative_to(self.base_path))

According to the Path() documentation, resolve() will follow symlinks and remove "..". Was this intentional behavior to resolve symlinks? If so, some additional logic may be needed to catch this situation.
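The sketch below reproduces the failure in a temporary directory and shows one possible alternative: normalizing the path lexically with `os.path.abspath`, which removes ".." but does not follow symlinks. The helper name `relative_without_following_symlinks` is hypothetical, and whether the library should treat symlinks this way is a design question this example does not settle. It assumes a platform where creating symlinks is permitted:

```python
import os
import tempfile
from pathlib import Path


def relative_without_following_symlinks(abs_path, base_path):
    """Make abs_path relative to base_path without resolving symlinks."""
    # os.path.abspath normalizes the path text only; it never reads link targets.
    return Path(os.path.abspath(abs_path)).relative_to(base_path)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()  # the temp dir itself may sit behind a symlink
    repo = base / "repo"
    outside = base / "outside"
    repo.mkdir()
    outside.mkdir()
    target = outside / "six"
    target.write_text("data")
    link = repo / "three"
    link.symlink_to(target)  # repo/three -> outside/six

    try:
        # resolve() follows the link, so the result is no longer under repo.
        link.resolve().relative_to(repo)
        resolved_ok = True
    except ValueError:
        resolved_ok = False

    lexical = relative_without_following_symlinks(link, repo)
```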

Not working without base_path given

Hi! Thank you for the implementation. It's great, but I had some trouble using it in https://github.com/seluj78/potodo

Here's what is happening:

.potodoignore content

venv/

from gitignore_parser import parse_gitignore
from pathlib import Path

potodoignore_path = Path("/Users/seluj78/Projects/potodo/.potodoignore")
bad_file = Path("/Users/seluj78/Projects/potodo/venv/bin/python")
# Pass the path defined above (the original snippet referenced an undefined name)
matches = parse_gitignore(potodoignore_path)
matches(bad_file)  # False

But if I change it to this, then it works

from gitignore_parser import parse_gitignore
from pathlib import Path

base_path = Path("/Users/seluj78/Projects/potodo")
bad_file = Path("/Users/seluj78/Projects/potodo/venv/bin/python")
matches = parse_gitignore(".potodoignore", base_dir=base_path)
matches(bad_file)  # True

Hope this helps!

Tag the source

Could you please tag the source again? This allows distributions to get the complete source from GitHub if they want.

0.1.2 was tagged but not 0.1.3.

Thanks

Allow string input rather than a path

Hi,

The premise of this issue is that I want to ignore patterns from .gitignore and .git/info/exclude as well (for local ignores). For performance reasons, it's probably better to use a single matcher for both files.

I noticed in #1 that you don't want to spend time on the project. Would you accept a PR with a small change that adds a function that takes a file-like object as an argument and does what's in parse_gitignore's with open(): block?
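To illustrate the idea, here is a minimal sketch of reading pattern lines from several file-like objects into one list, which a single matcher could then be built from. `read_patterns` is a hypothetical name, not an existing gitignore_parser function, and the sketch only skips blank lines and comments; it does not turn patterns into rules:

```python
import io


def read_patterns(*streams):
    """Collect pattern lines from any number of file-like objects, in order."""
    patterns = []
    for stream in streams:
        for line in stream:
            line = line.rstrip("\n")
            # Blank lines and lines starting with '#' carry no pattern.
            if not line.strip() or line.startswith("#"):
                continue
            patterns.append(line)
    return patterns


# StringIO objects stand in for .gitignore and .git/info/exclude.
gitignore = io.StringIO("# build output\n*.pyc\n\nbuild/\n")
exclude = io.StringIO("notes.txt\n")
combined = read_patterns(gitignore, exclude)
```

Because the function accepts any iterable of lines, the same code works for open files, `io.StringIO` objects, or plain lists of strings.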

DeprecationWarning: Flags not at the start of the expression

For anchored matches, the ^ is being inserted before the flags:

>>> import gitignore_parser
>>> from pathlib import Path
>>> gitignore_parser.rule_from_pattern("/foo", Path(".").resolve()).match("42")
/home/mdk/clones/gitignore_parser/gitignore_parser.py:143: DeprecationWarning: Flags not at the start of the expression '^(?ms)foo$'
  if re.search(self.regex, rel_path):
False
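One way to avoid the ordering problem is to keep inline flags out of the pattern text and pass them to `re.compile` instead, so the anchor can be prepended freely. This is a sketch under that assumption; `compile_rule` is a hypothetical helper, not the library's code. Recent Python versions (3.11 and later) reject global flags that are not at the start of the expression rather than merely warning:

```python
import re


def compile_rule(body, anchored):
    """Compile a rule regex, adding '^' only for anchored patterns."""
    prefix = "^" if anchored else ""
    # Flags are passed as an argument, so nothing has to precede '^' in the text.
    return re.compile(prefix + body + "$", re.MULTILINE | re.DOTALL)


anchored_rule = compile_rule("foo", anchored=True)
unanchored_rule = compile_rule("foo", anchored=False)
```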

Single Wildcard Error

While processing an open-source project (obs-studio), I came across this .gitignore file:

*
!.gitignore
!data/
!exec32/
!exec32r/
!exec32d/
!exec64/
!exec64r/
!exec64d/
!libs32/
!libs32r/
!libs32d/
!libs64/
!libs64r/
!libs64d/
!misc/

The library raises an error when it sees the single * wildcard.

~/.local/lib/python3.8/site-packages/gitignore_parser.py in parse_gitignore(full_path, base_dir)
     16     return matched
---> 17 
     18 def parse_gitignore(full_path, base_dir=None):

~/.local/lib/python3.8/site-packages/gitignore_parser.py in rule_from_pattern(pattern, base_path, source)
     67         start_index = m.start()
---> 68         if (start_index != 0 and start_index != len(pattern) - 2 and
     69                 (pattern[start_index - 1] != '/' or

IndexError: string index out of range

I think the single-character pattern is what causes the failure. I will look into a fix.

Bug with matching paths within .gitignore directory

Is this correct? It feels like a bug to me since it works when you specify the rule in .gitignore.

Imagine .gitignore contents including

.venv/

And a script evaluating this

gitignore_matcher('/path/to/repo/.venv/')
True
gitignore_matcher('/path/to/repo/.venv/bin')
False

To me the second one should return True

problem with '*' in .gitignore

Hi,

I tried to set up a .gitignore file like this one:

https://stackoverflow.com/questions/8024924/gitignore-ignore-all-files-then-recursively-allow-foo/8025106#8025106

It doesn't work with this library.

If you try a simple .gitignore file containing only:

*

you get something like this:

  File "/home/fab/metwork/mfext/build/opt/python3_core/lib/python3.5/site-packages/gitignore_parser.py", line 18, in parse_gitignore
    source=(full_path, counter))
  File "/home/fab/metwork/mfext/build/opt/python3_core/lib/python3.5/site-packages/gitignore_parser.py", line 68, in rule_from_pattern
    if pattern[0] == '*' and pattern[1] == '*':
IndexError: string index out of range
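The traceback points at indexing `pattern[1]` on a one-character pattern. A length-safe way to express the same check is `str.startswith`, as in the sketch below. Both function names are hypothetical and only mirror the line shown in the traceback; this is not the fix that was applied to the library:

```python
def unsafe_starts_with_double_star(pattern):
    # Mirrors the failing line: pattern[1] does not exist when pattern == "*".
    return pattern[0] == '*' and pattern[1] == '*'


def starts_with_double_star(pattern):
    # startswith never indexes past the end, so short patterns are fine.
    return pattern.startswith("**")


try:
    unsafe_starts_with_double_star("*")
    raised = False
except IndexError:
    raised = True

single = starts_with_double_star("*")
double = starts_with_double_star("**/foo")
```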
