Giter Site home page Giter Site logo

libgit2's Introduction

libgit2 - the Git linkable library

libgit2 is a portable, pure C implementation of the Git core methods provided as a re-entrant linkable library with a solid API, allowing you to write native speed custom Git applications in any language with bindings.

libgit2 is licensed under a very permissive license (GPLv2 with a special Linking Exception). This basically means that you can link it (unmodified) with any kind of software without having to release its source code.

What It Can Do

libgit2 is already very usable.

  • SHA conversions, formatting and shortening
  • object reading (loose and packed)
  • object writing (loose)
  • commit, tag, tree and blob parsing and write-back
  • tree traversal
  • revision walking
  • index file (staging area) manipulation
  • custom ODB backends
  • reference management (including packed references)
  • ...and more

Building libgit2 - External dependencies

libgit2 builds cleanly on most platforms without any external dependencies. Under Unix-like systems, like Linux, *BSD and Mac OS X, libgit2 expects pthreads to be available; they should be installed by default on all systems. Under Windows, libgit2 uses the native Windows API for threading.

Additionally, he following libraries may be used as replacement for built-in functionality:

libgit2 can be built using the SHA1 implementation of LibSSL-Crypto, instead of the built-in custom implementations. Performance wise, they are quite similar.

Building libgit2 - Using waf

Waf is a minimalist build system which only requires a Python 2.5+ interpreter to run. This is the default build system for libgit2.

To build libgit2 using waf, first configure the build system by running:

$ ./waf configure

Then build the library, either in its shared (libgit2.so) or static form (libgit2.a):

$ ./waf build-static
$ ./waf build-shared

You can then run the full test suite with:

$ ./waf test

And finally you can install the library with (you may need to sudo):

$ sudo ./waf install

The waf build system for libgit2 accepts the following flags:

--debug
	build the library with debug symbols.
	Defaults to off.

--sha1=[builtin|ppc|openssl]
	use the builtin SHA1 functions, the optimized PPC versions
	or the SHA1 functions from LibCrypto (OpenSSL).
	Defaults to 'builtin'.

--msvc=[7.1|8.0|9.0|10.0]
	Force a specific version of the MSVC compiler, if more than
	one version is installed.

--arch=[ia64|x64|x86|x86_amd64|x86_ia64]
	Force a specific architecture for compilers that support it.

--with-sqlite
	Enable sqlite support.

You can run ./waf --help to see a full list of install options and targets.

Building libgit2 - Using CMake

The libgit2 library can also be built using CMake 2.6+ (http://www.cmake.org) on all platforms.

On most systems you can build the library using the following commands

$ mkdir build && cd build
$ cmake ..
$ cmake --build .

Alternatively you can point the CMake GUI tool to the CMakeLists.txt file and generate platform specific build project or IDE workspace.

To install the library you can specify the install prefix by setting:

$ cmake .. -DCMAKE_INSTALL_PREFIX=/install/prefix
$ cmake --build . --target install

For more advanced use or questions about CMake please read http://www.cmake.org/Wiki/CMake_FAQ.

Language Bindings

Here are the bindings to libgit2 that are currently available:

If you start another language binding to libgit2, please let us know so we can add it to the list.

How Can I Contribute

Fork libgit2/libgit2 on GitHub, add your improvement, push it to a branch in your fork named for the topic, send a pull request.

You can also file bugs or feature requests under the libgit2 project on GitHub, or join us on the mailing list by sending an email to:

[email protected]

License

libgit2 is under GPL2 with linking exemption. This means you can link to the library with any program, commercial, open source or other. However, you cannot modify libgit2 and distribute it without supplying the source.

See the COPYING file for the full license text.

libgit2's People

Contributors

ageric avatar alexbudgh avatar arhibot avatar basepi avatar bnoordhuis avatar carlosmn avatar chobie avatar ctimmerm avatar dborowitz avatar drahosp avatar enzbang avatar hausdorff avatar icefox avatar ingmarv avatar joychenhuanhui avatar julioes avatar justinlove avatar jwiegley avatar madcoder avatar marvil07 avatar neopallium avatar nud avatar nulltoken avatar przemoc avatar sakari avatar schacon avatar spearce avatar tclem avatar vimalloc avatar vmg avatar

Watchers

 avatar  avatar  avatar

libgit2's Issues

merge dev-algo-[patience core]

I've merged core into patience, but I think it is at a point where we can merge them into dev-diff-algo.
My branch has all of core and patience in it without merge conflicts, and compiles.

Decide on bitshift vs division/mutiplication by 2

Recently we changed some of the places where we divide or multiply by a power of 2 to use bitshifts. My impression was that this was implemented automatically by the compiler.

We need to decide which to use.

patience: fillhashmap possible improvements

We are passing in data1 and data2, but also the env variable that already has pointers to data1/2 (though I'm not sure if those pointers point to data yet in the pipeline)

static int fill_hashmap(diff_mem_data *data1, diff_mem_data *data2,
        git_diffresults_conf const *results_conf,
        diff_environment *env, struct hashmap *result,
        int line1, int count1, int line2, int count2)
{
    /*
     * If env already has data1/2, then there is no reason to pass
     * in two data structs
     */
    result->file1 = data1; /* maybe? result->file1 = env->data1 */
    result->file2 = data2; /* "" */
    result->results_conf = results_conf;
    result->env = env;

extraneous spaces in libdiff.c

There are some places where spaces should be tabs in libdiff.c. Anytime there are 4 spaces, that can be a tab, even if those 4 spaces are for lining things up. If there are 6 spaces, four of them should be 1 tab and the other 2 just spaces. I can fix it if you want.

Build public-facing diff API function interface

We need to settle on, and implement, the functions that will face outwards, directly to the users. This includes:

  1. Designing the functions and function signatures to be used directly by the users
  2. Define, implement, and typedef all structs associated with these functions.

Check off documentation for diff

Particularly, we need to:

  • Check off comments in public-facing API functions before deployment
  • Check off the main comment that explicitly says where we got the ideas for the diff code, and who to contact for problems with it.

This needs to be done before we send it out.

EDIT:

We also need to

  • Make sure the comments match up with the things they describe. Right now there's some dissonance between the two.

Use or define the libdiff malloc

We're currently just calling free() and malloc(). This isn't extensible. We should define some sort of macro that specifies which malloc we're using, and it should probably default to libgit's malloc. I propose we call it ld_malloc()

Style compliancy problems

Structs

Previously, I thought it was the case that typedef'ing structs was mainly for readability. Recent readings indicate that this is incorrect, and libgit in general seems to agree.

We need to:

  • Remove typedefs in structs that are not opaque.
  • For structs that are opaque, we want to access them only via functions that are designed to handle them.

Error Codes

We often return -1 after an error problem; we should return error codes instead. Per @crakdmirror's request, these error codes seem to be defined in include/common.h. Not sure if that's all of them.

Function declarations

Should we declare functions? Where? What sort (e.g., static)?

Line wrap

Probably 80 columns.

Tabs vs Spaces

libgit2 uses tabs. End of story. Spaces are unacceptable. Additionally, it is not true that any instance of 4 spaces should become a tab. Consider the following:

   int some_func(int param,
                 int param2)

Note that the initial indentation for line 1 and 2 are the same -- we use 1 tab. BUT, the second line must use spaces after this 1-tab indentation so that the params line up.

Another example is here; we indent both the first and the second line with 1 tab, and then use spaces to line up the xdl_change_compact() calls. There are more than 4 spaces.

Comments

This is C99, so either a // Comment or `/* Comment */ format are acceptable.

Spaces

for(i=0; i<len; i++)
if(...)

Should be...

for (i = 0; i < len; i++)
if (...)

Notice: spaces after for, if, else, else if and between the following symbols:

=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

Braces

Braces are placed on the same lines in control-flow statements, e.g., conditionals, loops, and switches. The are NOT placed on the same lines of functions. See style guide for more details.

YES:

int has_cow(struct farmer *bob)
{
    if(bob->cow) {
        return 0;
    }
    else {
        return NO_COW;
    }
}

NO:

int has_cow(struct farmer *bob) {
    if(bob->cow)
    {
        return 0;
    }
    else
    {
        return NO_COW;
    }
}

Patience diff

Do we want patience diff in a separate file? I don't think it would hurt to do so. We could model it after JGit, where the both algos are in different files.

Use OS-agnostic IO handlers

PROBLEM:

Current implementation of src/diff.c uses a method called load_file(). This opens a file given a directory. Arguably, though, this is a function that belongs in a file whose job it is to supply OS-agnostic IO handlers. As it turns out, this instinct is correct, as there is such a file, which you will see in the "resources" section.

OBJECTIVES:

  • Transition diff.c to OS-agnostic IO handlers, to the exclusion of diff.c's load_file().
  • Transition any other file that does this which I overlooked in this bug report.

RESOURCES:

  • fileops.h and fileops.c supplies the OS-agnostic IO handlers required to complete this bug report. They're pretty easy to use.

use c89 style comments

According to the wiki:

Linux style for comments is the C89 "/* ... */" style. Don't use C99-style "// ..." comments.

Are we following the same? If so, we need to change a bunch of stuff.

Consolidate records-centric functionality to a record-handling file

Right now, libdiff.c is a hodgepodge of functions that deal with records seamlessly alongside the the diffing algorithm functions. This should change to provide us with better abstraction. Specifically, in this new file should go:

  • Functions that deal with classification (e.g., those that deal with record_classifier and classd_record structs)
  • Functions that build and administrate the memstore struct
  • Functions that deal with actual records (e.g., diff_record, etc.)
  • Possibly functions that prepare the metadata and context of records and diff metadata (e.g., prepare_data_ctx())

Decide between size_t and long

We need to know whether to use size_t and, say, int and long. libgit2 uses size_t all over the place, so the question is likely really a question about when to use it.

NOTES:

  • We will need to pay special attention to converting the stuff in all structs when doing this.

Prepare backlogs and tasks in ScrumWorks

Project creation, backlog item creation, and task creation within each backlog item

Some of these will come from, or at least be clarified by, input from Vicent when we get that. However, due to the necessity of showing this setup to the TA tomorrow, we need to get this done tonight.

Implement core diff protocols

git produces both raw diff output (as in git diff) and a raw diff internal representation used to apply merges, patches, etc. This task will be composed of:

  1. Nailing down what that protocol is.
  2. Seeing if we can implement it faster.

The next step is to implement it inside and around the core diff function.

Flags variable is getting passed in uninitialized (full 'o garbage)

PROBLEM: We're sending in a flags parameter as part of the diff query. Normally, it's the job of a set of helper function (e.g. the hypothetical function print_std_out()) to create a git_diffresults_conf struct that configures the result of the diff -- whether we are printing it or merging it, etc. This does not exist yet, and so the flags param is not set, and thus it tends to be full of garbage.

Review and complete TODOs and FIXMEs before ship of diff

Often we'll write something like "FIXME/TODO: change this function name to something that makes more sense then int if_i_see_one_more_uncommented_function_in_libxdiff_i_am_sending_davide_libenzi_a_bomb_in_the_mail" in the code to remind ourselves that we have some shit to do.

We should:

  • Complete as many of these as possible
  • Before shipping
  • Unless we can't do it for some reason

Explore bogosqrt vs q3sqrt

We use square roots for a number of things. Our current implementation is here -- just the standard approximation method.

One of the team suggested that we look into using Quake 3's sqrt hack. Initial drawbacks seem to be that it's for floats, not longs. Perhaps we can adapt it to fit longs.

Create diff error codes

We must add error codes to common.h specific to diff results.
Discussion here for what error codes are needed go here.

Implement core diff function

Tentatively leaning towards Patience. We will need to look at both that and the classical diff algorithm.

This step will mostly be composed of writing the core diff function, possibly in both algorithms, and profiling each. The next step will be integrating it with the protocols (e.g., the internal diff protocol used to merge etc.).

The goals are:

  • Fast. Particularly, to make it parallel if we can.

Scrumworks Sprint 2 update

Everyone please go into scrumworks and take a look. I figure once we actually start sprint 2, I'll move the unfinished tasks over to it from Sprint 1. Doesn't allow me to add pieces of backlog items, requires whole backlock item, so I just added all of the main merge implementation to sprint 2, even though I doubt we'll get it done. When you look at how things have worked out, this is really only a 3 week project, not 5. (Since we have all the beginning and end administrative stuff) We're doing the final presentation 2 weeks from tomorrow. Crazy stuff.

patience: insert_record possible improvements

    while (map->entries[index].line1) {
        /*
         * Set other to the record corresponding to the line we are on
         * This seems to be comparing file1 to file1 at times
         * If we are on pass = 1, then diff_record will be equal to
         * data_ctx1->recs[line-1], which other gets set to here
         * TODO: see if this can be bypassed once
         */
        other = map->env->data_ctx1.recs[map->entries[index].line1 - 1];

We need to rehearse the final presentation and what-not.

Just need to decide what all we need to do to prepare for this. I suppose we can just give each person stuff to talk about decide ordering, and wing it. I'm not too worried about the public speaking part of it, once we have all the content.

This is obviously low-priority until we have the actual code done, but thought I'd throw up an issue.

EDIT: So when are we going to do it? I suppose we'll talk more at the meeting on Saturday.

diff_no_index() fails to read files

PROBLEM:

Calls to diff_no_index() fail even when provided with correct paths for files.

RESOURCES:

Code ran is here:

int main() {
    git_diffresults_conf *conf;
    git_repository *repo;
    git_repository_open(&repo, "");
    git_index *index;
    git_commit *commit1;
    git_commit *commit2;

    //git_diff(diffdata, commit1, repo);

    printf("MAIN\n");
    printf("%d\n", git_diff_no_index(&conf, "difftest_before", "difftest_after"));

    //git_diff_cached(diffdata, commit1, index);
    //git_diff_commits(diffdata, commit1, commit2);
}

ls gives us this:

a.out           difftest_after  difftest_before main.c          tests.c

tests.c is unrelated.

git diff with newly added files.

In git, if you use git diff and there is a newly added file in your file system but not in the repository, then the contents of that file are not part of the diff. If you delete a file from the file system that was in the repository, then that file is still included in the diff.

Should we follow this behavior? If so it would make the git_diff() function virtually done, minus actually doing the diffs on the files that have changed.

xdlclassifier and xdlclass do not appear to have any function whatsoever

A lot of preparations happen in the diff pipeline. One of these preparations is to build xdlclassifiers and xdlclasses for every record. They're niftily constructed and the code is nicely written. The only problem is that I can't find a place where they're actually used. In our code, we build it in algo_environment() and in their code, they build it in xdl_prepare_env(). In both cases, the classifier and its classes are created and then simply thrown away. What does this code do?

If we didn't have to actually build these things, we could cut out probably 1/8th of the diff running time. We need to find out:

  • If it's the case that these are not used.
  • And un-implement them if it is

We free a pointer that wasn't malloc'd

PROBLEM: A call to git_diff_no_index() will allocate the contents of file at params filepath1 and filepath2 to a char *buffer1 and char *buffer2. Running this method will produce the following error:
a.out(5679) malloc: *** error for object 0x100800000: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug [1] 5679 abort ./a.out
What's going on here is that we're free'ing here without actually having malloc'd the memory to begin with.

NOTES:

I would've just fixed this, but I wasn't sure if that would conflict with resolution of #14.

Memory-profile diff algorithm

We will need to make sure that we're deleting everything and not causing memory leaks. This is a production library, and that would be bad.

@trane seems good at this sort of thing.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.