
Comments (11)

GoogleCodeExporter commented on July 1, 2024

Original comment by [email protected] on 15 Nov 2013 at 12:51

  • Changed state: Accepted

from gitinspector.

imposeren commented on July 1, 2024

Could this be connected to the following problem?

After some massive changes in the repo, gitinspector consumes all the CPU with a lot of "git blame ..." processes and has not finished processing the repo after at least 4 hours.

Can anything be tuned to speed up processing, or at least to reduce CPU usage?


adam-waldenberg commented on July 1, 2024

Hi @imposeren.

Gitinspector is quite slow on very large repos with a big history, as it blames every single file.

This would partly solve it, yes. When this feature is implemented (assuming it's possible) it would mean gitinspector would not have to re-blame every single file each time you run it. Instead, it would only process the files that changed since last time, making it substantially less painful.
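A rough sketch of the idea described above, in Python. This is not gitinspector's implementation; the function name `files_to_reblame` and the cache layout (a mapping from file path to git blob hash) are assumptions made for illustration, and it assumes history is not rewritten between runs.

```python
def files_to_reblame(previous, current):
    """Return the paths whose blame results need to be recomputed.

    Both arguments map a file path to the blob hash recorded for it
    (hypothetical cache layout). A path is re-blamed when it is new or
    when its blob hash differs from the cached one.
    """
    return sorted(
        path for path, blob in current.items()
        if previous.get(path) != blob
    )


# Hashes shortened for readability; real blob hashes are longer.
cached = {"a.py": "1111", "b.py": "2222", "old.py": "3333"}
latest = {"a.py": "1111", "b.py": "9999", "new.py": "4444"}

print(files_to_reblame(cached, latest))  # ['b.py', 'new.py']
```

Unchanged files (here `a.py`) would reuse their cached blame data, and files that no longer exist (`old.py`) simply drop out.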

The only thing you can really do to speed up processing is to not use the "-H" (hard) option. If you were not using it, you are out of luck. The only other option would be to optimize git itself :).


imposeren commented on July 1, 2024

Thanks for the reply. Maybe there are some options for optimizing the history?

Something like this:
http://stevelorek.com/how-to-shrink-a-git-repository.html

But I do not know if removing unused files from the history will affect gitinspector, as it seems to operate only on existing files (is this correct?).


adam-waldenberg commented on July 1, 2024

@imposeren

Yes. Just running "git gc" will speed things up, sometimes quite significantly. If you have never done it before, passing the --aggressive switch might be a good idea. The following is from the git docs:

--aggressive
           Usually git gc runs very quickly while providing good disk space utilization and performance. This option will cause git gc to more
           aggressively optimize the repository at the expense of taking much more time. The effects of this optimization are persistent, so this option
           only needs to be used occasionally; every few hundred changesets or so.
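
For anyone who wants to script this step before a gitinspector run, here is a minimal Python sketch. The helper names `gc_command` and `run_gc` are made up for illustration; the only git features used are "git gc" and its --aggressive switch quoted above.

```python
import subprocess


def gc_command(aggressive=False):
    """Build the argument list for "git gc"."""
    command = ["git", "gc"]
    if aggressive:
        command.append("--aggressive")
    return command


def run_gc(repo_path, aggressive=False):
    """Run "git gc" inside repo_path and return git's exit code."""
    completed = subprocess.run(gc_command(aggressive), cwd=repo_path, check=False)
    return completed.returncode


# Example (assumes git is installed and "." is a git repository):
# run_gc(".", aggressive=True)
```

Since --aggressive can take much more time, it probably makes sense to run it once and use plain "git gc" afterwards.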

I'm not sure how much the other stuff in that article will affect processing speed, but I guess it's always worth a try.

The blame section of gitinspector only operates on existing files, yes. However, with the -H flag, git still scans the whole history in order to correctly blame each row to its author. So I guess even "git blame" should run faster once the history is optimized. A blamed row can also, for example, come from one of those big files, so git still needs to take them into account to some extent (even without -H passed to gitinspector).

Hard to say without a deeper investigation into the inner workings of git itself.


imposeren commented on July 1, 2024

@adam-waldenberg
One more question: does gitinspector blame files excluded by the '-x' option?


adam-waldenberg commented on July 1, 2024

@imposeren

No. It does not.


adam-waldenberg commented on July 1, 2024

@imposeren

Neither does it blame any files that have an invalid extension. Binary files are also skipped.
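
To illustrate the kind of filtering described in the last two replies, here is a hedged Python sketch. It is not gitinspector's code: `should_blame`, its parameters, and the use of regular expressions as a stand-in for '-x' are assumptions, and binary detection is left out entirely.

```python
import os
import re


def should_blame(path, extensions, exclude_patterns=()):
    """Decide whether a file would be handed to "git blame"."""
    # Skip files matching any exclusion pattern (stand-in for '-x').
    if any(re.search(pattern, path) for pattern in exclude_patterns):
        return False
    # Skip files whose extension is not in the accepted set.
    extension = os.path.splitext(path)[1].lstrip(".")
    return extension in extensions


print(should_blame("src/main.py", {"py"}))                  # True
print(should_blame("img/logo.png", {"py"}))                 # False
print(should_blame("vendor/lib.py", {"py"}, [r"^vendor/"]))  # False
```

The fewer files that pass such a filter, the fewer "git blame" runs are needed, so excluding large generated or vendored directories is one way to cut processing time.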


imposeren commented on July 1, 2024

Is there any way to reduce the concurrency of git blame? I can see up to 8 git blame processes when gitinspector runs, and each consumes 40-99% of a processor core.


imposeren commented on July 1, 2024

I can already see that there is no such option:
https://github.com/ejwa/gitinspector/blob/master/gitinspector/blame.py#L31

I'll create a separate issue for this and maybe make a pull request later.


adam-waldenberg commented on July 1, 2024

@imposeren

Gitinspector starts as many processes as there are threads/cores. There is no configuration option for it, and never will be. However, there is a constant at the top of changes.py and blame.py that controls the number of threads.
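
As a generic illustration of how a module-level constant can cap the number of simultaneous "git blame" runs, here is a small Python sketch. It does not reproduce the contents of changes.py or blame.py; the names `NUM_THREADS`, `blame_line_count` and `run_limited` are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

NUM_THREADS = 2  # hypothetical constant; lower it to reduce CPU load


def blame_line_count(path):
    """Run "git blame" on one file and return how many lines it printed."""
    completed = subprocess.run(
        ["git", "blame", "--", path],
        capture_output=True, check=False,
    )
    return len(completed.stdout.splitlines())


def run_limited(worker, items, num_threads=NUM_THREADS):
    """Apply worker to every item using at most num_threads threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map returns results in the same order as the input items.
        return list(pool.map(worker, items))


# Example (assumes the current directory is a git repository):
# run_limited(blame_line_count, ["README.md", "setup.py"])
```

Because each worker thread waits on one "git blame" subprocess, the thread cap is also the cap on concurrent git processes.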

