basepi / libgit2

This project forked from libgit2/libgit2

The Library

Home Page: http://libgit2.github.com

License: Other

C 100.00%

libgit2's Introduction

libgit2 - the Git linkable library

libgit2 is a portable, pure C implementation of the Git core methods provided as a re-entrant linkable library with a solid API, allowing you to write native speed custom Git applications in any language with bindings.

libgit2 is licensed under a very permissive license (GPLv2 with a special Linking Exception). This basically means that you can link it (unmodified) with any kind of software without having to release its source code.

Mailing list: [email protected]
Website: http://libgit2.github.com
API documentation: http://libgit2.github.com/libgit2/modules.html
Usage guide: http://libgit2.github.com/api.html

What It Can Do

libgit2 is already very usable.

SHA conversions, formatting and shortening
object reading (loose and packed)
object writing (loose)
commit, tag, tree and blob parsing and write-back
tree traversal
revision walking
index file (staging area) manipulation
custom ODB backends
reference management (including packed references)
...and more

Building libgit2 - External dependencies

libgit2 builds cleanly on most platforms without any external dependencies. Under Unix-like systems, like Linux, *BSD and Mac OS X, libgit2 expects pthreads to be available; they should be installed by default on all systems. Under Windows, libgit2 uses the native Windows API for threading.

Additionally, the following libraries may be used as replacements for built-in functionality:

LibSSL (optional) http://www.openssl.org/

libgit2 can be built using the SHA1 implementation of LibSSL-Crypto instead of the built-in custom implementation. Performance-wise, the two are quite similar.

Building libgit2 - Using waf

Waf is a minimalist build system which only requires a Python 2.5+ interpreter to run. This is the default build system for libgit2.

To build libgit2 using waf, first configure the build system by running:

$ ./waf configure

Then build the library, either in its shared (libgit2.so) or static form (libgit2.a):

$ ./waf build-static
$ ./waf build-shared

You can then run the full test suite with:

$ ./waf test

And finally you can install the library with (you may need to sudo):

$ sudo ./waf install

The waf build system for libgit2 accepts the following flags:

--debug
	build the library with debug symbols.
	Defaults to off.

--sha1=[builtin|ppc|openssl]
	use the builtin SHA1 functions, the optimized PPC versions
	or the SHA1 functions from LibCrypto (OpenSSL).
	Defaults to 'builtin'.

--msvc=[7.1|8.0|9.0|10.0]
	Force a specific version of the MSVC compiler, if more than
	one version is installed.

--arch=[ia64|x64|x86|x86_amd64|x86_ia64]
	Force a specific architecture for compilers that support it.

--with-sqlite
	Enable sqlite support.

You can run ./waf --help to see a full list of install options and targets.

Building libgit2 - Using CMake

The libgit2 library can also be built using CMake 2.6+ (http://www.cmake.org) on all platforms.

On most systems you can build the library using the following commands:

$ mkdir build && cd build
$ cmake ..
$ cmake --build .

Alternatively, you can point the CMake GUI tool to the CMakeLists.txt file and generate a platform-specific build project or IDE workspace.

To install the library you can specify the install prefix by setting:

$ cmake .. -DCMAKE_INSTALL_PREFIX=/install/prefix
$ cmake --build . --target install

For more advanced use or questions about CMake please read http://www.cmake.org/Wiki/CMake_FAQ.

Language Bindings

Here are the bindings to libgit2 that are currently available:

Rugged (Ruby bindings) https://github.com/libgit2/rugged
objective-git (Objective-C bindings) https://github.com/libgit2/objective-git
pygit2 (Python bindings) https://github.com/libgit2/pygit2
libgit2sharp (.NET bindings) https://github.com/libgit2/libgit2sharp
php-git (PHP bindings) https://github.com/libgit2/php-git
luagit2 (Lua bindings) https://github.com/libgit2/luagit2
GitForDelphi (Delphi bindings) https://github.com/libgit2/GitForDelphi
node-gitteh (Node.js bindings) https://github.com/libgit2/node-gitteh
nodegit (Node.js bindings) https://github.com/tbranyen/nodegit
go-git (Go bindings) https://github.com/str1ngs/go-git
libqgit2 (C++ Qt bindings) https://projects.kde.org/projects/playground/libs/libqgit2/
libgit2-ocaml (OCaml bindings) https://github.com/burdges/libgit2-ocaml
Geef (Erlang bindings) https://github.com/schacon/geef

If you start another language binding to libgit2, please let us know so we can add it to the list.

How Can I Contribute

Fork libgit2/libgit2 on GitHub, add your improvement, push it to a branch in your fork named for the topic, and send a pull request.

You can also file bugs or feature requests under the libgit2 project on GitHub, or join us on the mailing list by sending an email to:

[email protected]

License

libgit2 is under GPLv2 with a linking exception. This means you can link to the library with any program, commercial, open source or other. However, you cannot modify libgit2 and distribute it without supplying the source.

See the COPYING file for the full license text.

libgit2's Issues

merge dev-algo-[patience core]

I've merged core into patience, but I think it is at a point where we can merge them into dev-diff-algo.
My branch has all of core and patience in it without merge conflicts, and compiles.

Decide on bitshift vs division/multiplication by 2

Recently we changed some of the places where we divide or multiply by a power of 2 to use bitshifts. My impression was that this was implemented automatically by the compiler.

We need to decide which to use.
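To make the trade-off concrete, here is a minimal sketch with hypothetical helper names. For unsigned operands the two spellings give the same result, and optimizing compilers typically lower a division by a power of two to a shift on their own, so the choice is mostly about readability. For signed operands the two are not interchangeable.

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only. */

/* Halve a size with plain arithmetic; optimizing compilers
 * typically lower this to a shift on their own. */
static size_t half_div(size_t n)
{
	return n / 2;
}

/* The same operation written as an explicit shift. */
static size_t half_shift(size_t n)
{
	return n >> 1;
}

/* Signed division truncates toward zero in C99: -7 / 2 == -3.
 * Right-shifting a negative value is implementation-defined
 * (commonly an arithmetic shift yielding -4), so the two forms
 * are not interchangeable for signed operands. */
static int half_signed(int n)
{
	return n / 2;
}
```

If any of the affected values are signed and can go negative, keeping the arithmetic form avoids the implementation-defined case.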

Core protocol team's commits break the build

Merge introduced in 833e409f527993bb8726 breaks build. If you need help building, take a look at this.

patience: fillhashmap possible improvements

We are passing in data1 and data2, but also the env variable that already has pointers to data1/2 (though I'm not sure if those pointers point to data yet in the pipeline)

static int fill_hashmap(diff_mem_data *data1, diff_mem_data *data2,
        git_diffresults_conf const *results_conf,
        diff_environment *env, struct hashmap *result,
        int line1, int count1, int line2, int count2)
{
    /*
     * If env already has data1/2, then there is no reason to pass
     * in two data structs
     */
    result->file1 = data1; /* maybe? result->file1 = env->data1 */
    result->file2 = data2; /* "" */
    result->results_conf = results_conf;
    result->env = env;

extraneous spaces in libdiff.c

There are some places where spaces should be tabs in libdiff.c. Anytime there are 4 spaces, that can be a tab, even if those 4 spaces are for lining things up. If there are 6 spaces, four of them should be 1 tab and the other 2 just spaces. I can fix it if you want.

Build public-facing diff API function interface

We need to settle on, and implement, the functions that will face outwards, directly to the users. This includes:

Designing the functions and function signatures to be used directly by the users
Define, implement, and typedef all structs associated with these functions.

Check off documentation for diff

Particularly, we need to:

Check off comments in public-facing API functions before deployment
Check off the main comment that explicitly says where we got the ideas for the diff code, and who to contact for problems with it.

This needs to be done before we send it out.

EDIT:

We also need to

Make sure the comments match up with the things they describe. Right now there's some dissonance between the two.

Use or define the libdiff malloc

We're currently just calling free() and malloc() directly, which isn't extensible. We should define a macro that specifies which malloc we're using, and it should probably default to libgit2's malloc. I propose we call it ld_malloc().
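One possible shape for such a macro is sketched below. To stay self-contained it defaults to the standard malloc() rather than libgit2's allocator, which is an assumption of this example; ld_free and ld_copy_line are hypothetical names that do not come from the code base.

```c
#include <stdlib.h>

/*
 * Sketch only: ld_malloc is the name proposed above; ld_free is a
 * hypothetical companion. A build could define both before this
 * point to route allocations through another allocator.
 */
#ifndef ld_malloc
#define ld_malloc(size) malloc(size)
#endif

#ifndef ld_free
#define ld_free(ptr) free(ptr)
#endif

/* Example caller: allocate a zero-terminated copy of a line. */
static char *ld_copy_line(const char *line, size_t len)
{
	char *copy = ld_malloc(len + 1);
	size_t i;

	if (copy == NULL)
		return NULL;
	for (i = 0; i < len; i++)
		copy[i] = line[i];
	copy[len] = '\0';
	return copy;
}
```

Because callers only ever name ld_malloc and ld_free, switching allocators later means changing the two definitions and nothing else.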

Style compliancy problems

Structs

Previously, I thought that typedef'ing structs was mainly for readability. Recent reading indicates that this is incorrect, and libgit2 in general seems to agree.

We need to:

Remove typedefs in structs that are not opaque.
For structs that are opaque, we want to access them only via functions that are designed to handle them.
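A minimal sketch of the distinction, using hypothetical names (ld_range, ld_record and its functions are not from the code base): a plain struct is used by its tag without a typedef, while an opaque struct is typedef'd, keeps its layout private, and is touched only through accessor functions.

```c
#include <stdlib.h>

/* Not opaque: callers may read the fields, so no typedef. */
struct ld_range {
	long start;
	long end;
};

/* --- what a public header would expose (names hypothetical) --- */
typedef struct ld_record ld_record;	/* opaque: layout hidden */

ld_record *ld_record_new(long line);
long ld_record_line(const ld_record *rec);
void ld_record_free(ld_record *rec);

/* --- what stays private in the .c file --- */
struct ld_record {
	long line;
};

ld_record *ld_record_new(long line)
{
	ld_record *rec = malloc(sizeof(*rec));

	if (rec != NULL)
		rec->line = line;
	return rec;
}

long ld_record_line(const ld_record *rec)
{
	return rec->line;
}

void ld_record_free(ld_record *rec)
{
	free(rec);
}
```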

Error Codes

We often return -1 after an error; we should return specific error codes instead. Per @crakdmirror's request, these error codes seem to be defined in include/common.h. Not sure if that's all of them.
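As a rough illustration of the difference, the sketch below uses a made-up enum. The LD_ names and values are assumptions for this example only; the real codes would come from the project's common header.

```c
#include <stddef.h>

/* Hypothetical codes for illustration; the real values would come
 * from the project's common header. */
enum ld_error {
	LD_OK = 0,
	LD_EINVALID = -1,	/* bad argument */
	LD_EEMPTY = -2		/* nothing to diff */
};

/* Returns a specific code per failure instead of a bare -1, so the
 * caller can tell the failures apart. */
static int ld_check_input(const char *buf, size_t len)
{
	if (buf == NULL)
		return LD_EINVALID;
	if (len == 0)
		return LD_EEMPTY;
	return LD_OK;
}
```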

Function declarations

Should we declare functions? Where? What sort (e.g., static)?

Line wrap

Probably 80 columns.

Tabs vs Spaces

libgit2 uses tabs. End of story. Spaces are unacceptable. Additionally, it is not true that any instance of 4 spaces should become a tab. Consider the following:

   int some_func(int param,
                 int param2)

Note that the initial indentation for lines 1 and 2 is the same -- we use 1 tab. BUT, the second line must use spaces after this 1-tab indentation so that the params line up.

Another example is here; we indent both the first and the second line with 1 tab, and then use spaces to line up the xdl_change_compact() calls. There are more than 4 spaces.

Comments

This is C99, so either the // Comment or the /* Comment */ format is acceptable.

Spaces

for(i=0; i<len; i++)
if(...)

Should be...

for (i = 0; i < len; i++)
if (...)

Notice: spaces after for, if, else, else if and between the following symbols:

=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

Braces

Braces are placed on the same line in control-flow statements, e.g., conditionals, loops, and switches. They are NOT placed on the same line as a function signature. See the style guide for more details.

YES:

int has_cow(struct farmer *bob)
{
    if (bob->cow) {
        return 0;
    }
    else {
        return NO_COW;
    }
}

NO:

int has_cow(struct farmer *bob) {
    if (bob->cow)
    {
        return 0;
    }
    else
    {
        return NO_COW;
    }
}

Patience diff

Do we want patience diff in a separate file? I don't think it would hurt to do so. We could model it after JGit, where the two algorithms are in different files.

Use OS-agnostic IO handlers

PROBLEM:

Current implementation of src/diff.c uses a method called load_file(). This opens a file given a directory. Arguably, though, this is a function that belongs in a file whose job it is to supply OS-agnostic IO handlers. As it turns out, this instinct is correct, as there is such a file, which you will see in the "resources" section.

OBJECTIVES:

Transition diff.c to OS-agnostic IO handlers, to the exclusion of diff.c's load_file().
Transition any other file that does this which I overlooked in this bug report.

RESOURCES:

fileops.h and fileops.c supply the OS-agnostic IO handlers required to complete this bug report. They're pretty easy to use.

Prepare final presentation slides

@trane mentioned he might be able to start on these since he's waiting on @DrSleep for something. Thus, he gets assigned the issue! =P

Implement diff unit tests for libgit2

Try to re-use as many use cases as possible from git diff unit tests. Use the libgit2 test framework.

use c89 style comments

According to the wiki:

Linux style for comments is the C89 "/* ... */" style. Don't use C99-style "// ..." comments.

Are we following the same? If so, we need to change a bunch of stuff.

Memory-profile diff interface

This is critical because this is a production library.

Seems like something for @kyeana

Consolidate records-centric functionality to a record-handling file

Right now, libdiff.c is a hodgepodge of functions that deal with records sitting right alongside the diffing algorithm functions. This should change to give us better abstraction. Specifically, the following should go in the new file:

Functions that deal with classification (e.g., those that deal with record_classifier and classd_record structs)
Functions that build and administrate the memstore struct
Functions that deal with actual records (e.g., diff_record, etc.)
Possibly functions that prepare the metadata and context of records and diff metadata (e.g., prepare_data_ctx())

C doesn't allow function overloading, and this is annoying

I think you got this, @crakdmirror.

patience: find_longest_common_sequence possible improvements

    /*
     * Could we use fewer comparisons by making this a while loop?
     * entry = map->first
     * while (entry->next) {...
     * ?
     */
    for (entry = map->first; entry; entry = entry->next) {
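The two loop forms can be compared on a stand-in list. This is a sketch with a hypothetical struct entry, not the map's real entry type. Both forms perform one pointer test per iteration, so the while form saves at most a single comparison, and it changes behavior: it stops before the last entry and needs a separate guard for an empty list.

```c
#include <stddef.h>

/* Hypothetical stand-in for the map's entry list. */
struct entry {
	struct entry *next;
};

/* for-loop form from the excerpt: visits every entry and is safe
 * when the list is empty (first == NULL). */
static int count_for(struct entry *first)
{
	struct entry *e;
	int n = 0;

	for (e = first; e; e = e->next)
		n++;
	return n;
}

/* while (entry->next) form from the comment: stops before the last
 * entry, and would dereference NULL on an empty list, hence the
 * extra guard here. */
static int count_while_next(struct entry *first)
{
	struct entry *e = first;
	int n = 0;

	if (e == NULL)
		return 0;
	while (e->next) {
		n++;
		e = e->next;
	}
	return n;
}
```

Whether skipping the last entry is acceptable depends on the loop body, which is not shown in the excerpt.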

Function Declarations

We are declaring some functions at the top of .c files like so: https://github.com/crakdmirror/libgit2/blob/development/src/diff.c#L13

I understand why we would do this, but glancing around it doesn't look like the rest of the project is doing this. Is it ok if we remove these to keep the same feel as the rest of the project?

Decide between size_t and long

We need to decide when to use size_t and when to use, say, int or long. libgit2 uses size_t all over the place, so the question is really about when to use it.

NOTES:

We will need to pay special attention to converting the stuff in all structs when doing this.
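Two cases that tend to come up during such a conversion are sketched below, with hypothetical function names. One is a countdown loop over an unsigned index; the other is a difference that can legitimately be negative.

```c
#include <stddef.h>

/* Sum the elements in reverse order with a size_t index.
 * A loop written as "for (i = len - 1; i >= 0; i--)" would never
 * terminate, because an unsigned i is always >= 0; counting down
 * with "i-- > 0" avoids that. */
static long sum_reverse(const long *vals, size_t len)
{
	size_t i = len;
	long total = 0;

	while (i-- > 0)
		total += vals[i];
	return total;
}

/* A difference of two positions can be negative, so it is
 * computed in a signed type rather than size_t. */
static long line_delta(long line1, long line2)
{
	return line2 - line1;
}
```

A workable rule of thumb, offered as a suggestion only: size_t for sizes, counts and array indexes, a signed type for anything that can go below zero.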

Prepare backlogs and tasks in ScrumWorks

Project creation, backlog item creation, and task creation within each backlog item

Some of these will come from, or at least be clarified by, input from Vicent when we get that. However, due to the necessity of showing this setup to the TA tomorrow, we need to get this done tonight.

Implement core diff protocols

git produces both raw diff output (as in git diff) and a raw diff internal representation used to apply merges, patches, etc. This task will be composed of:

Nailing down what that protocol is.
Seeing if we can implement it faster.

The next step is to implement it inside and around the core diff function.

Flags variable is getting passed in uninitialized (full 'o garbage)

PROBLEM: We're sending in a flags parameter as part of the diff query. Normally, it's the job of a set of helper functions (e.g. the hypothetical function print_std_out()) to create a git_diffresults_conf struct that configures the result of the diff -- whether we are printing it or merging it, etc. These helpers do not exist yet, so the flags param is never set, and thus it tends to be full of garbage.
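Until those helpers exist, one stopgap is to zero the configuration before it is passed in. The sketch below uses a hypothetical stand-in struct, since the real fields of git_diffresults_conf are not shown in this issue, and results_conf_init is likewise an invented name.

```c
#include <string.h>

/* Hypothetical stand-in for git_diffresults_conf; the real
 * struct's fields are not shown in this issue. */
struct results_conf {
	unsigned int flags;
	int context_lines;
};

/* Hypothetical helper: put the struct in a known state so that
 * flags never carries stack garbage into the diff call. */
static void results_conf_init(struct results_conf *conf)
{
	memset(conf, 0, sizeof(*conf));
}
```

Calling the initializer right after declaring the struct gives every later helper a defined starting point, whatever flags it then sets.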

Review and complete TODOs and FIXMEs before ship of diff

Often we'll write something like "FIXME/TODO: change this function name to something that makes more sense than int if_i_see_one_more_uncommented_function_in_libxdiff_i_am_sending_davide_libenzi_a_bomb_in_the_mail" in the code to remind ourselves that we have some shit to do.

We should:

Complete as many of these as possible
Before shipping
Unless we can't do it for some reason

Explore bogosqrt vs q3sqrt

We use square roots for a number of things. Our current implementation is here -- just the standard approximation method.

One of the team suggested that we look into using Quake 3's sqrt hack. Initial drawbacks seem to be that it's for floats, not longs. Perhaps we can adapt it to fit longs.
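For integer inputs there is an alternative worth weighing. The sketch below is not the Quake 3 hack (that trick approximates the inverse square root of a float); it is a bit-by-bit integer square root that uses only shifts, additions and comparisons and returns the exact floor of the root. The name isqrt_ul is hypothetical.

```c
#include <limits.h>

/* Bit-by-bit integer square root: returns floor(sqrt(n)). */
static unsigned long isqrt_ul(unsigned long n)
{
	unsigned long root = 0;
	/* Highest power of four representable in unsigned long. */
	unsigned long bit = 1UL << (sizeof(unsigned long) * CHAR_BIT - 2);

	/* Start from the largest power of four not exceeding n. */
	while (bit > n)
		bit >>= 2;

	while (bit != 0) {
		if (n >= root + bit) {
			n -= root + bit;
			root = (root >> 1) + bit;
		} else {
			root >>= 1;
		}
		bit >>= 2;
	}
	return root;
}
```

Whether this beats the current approximation method would need profiling on the inputs the diff code actually sees.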

Create diff error codes

We must add error codes to common.h specific to diff results.
Discussion of which error codes are needed goes here.

Implement core diff function

Tentatively leaning towards Patience. We will need to look at both that and the classical diff algorithm.

This step will mostly be composed of writing the core diff function, possibly in both algorithms, and profiling each. The next step will be integrating it with the protocols (e.g., the internal diff protocol used to merge etc.).

The goals are:

Fast. Particularly, to make it parallel if we can.

Scrumworks Sprint 2 update

Everyone please go into scrumworks and take a look. I figure once we actually start sprint 2, I'll move the unfinished tasks over to it from Sprint 1. It doesn't allow me to add pieces of backlog items and requires the whole backlog item, so I just added all of the main merge implementation to sprint 2, even though I doubt we'll get it done. When you look at how things have worked out, this is really only a 3 week project, not 5 (since we have all the beginning and end administrative stuff). We're doing the final presentation 2 weeks from tomorrow. Crazy stuff.

patience: insert_record possible improvements

    while (map->entries[index].line1) {
        /*
         * Set other to the record corresponding to the line we are on
         * This seems to be comparing file1 to file1 at times
         * If we are on pass = 1, then diff_record will be equal to
         * data_ctx1->recs[line-1], which other gets set to here
         * TODO: see if this can be bypassed once
         */
        other = map->env->data_ctx1.recs[map->entries[index].line1 - 1];

Get Alex's private branch merged

Code freeze is tonight and I know that @DrSleep has been working on his own private branch. Can we have this integrated into dev-testing or something? @crakdmirror @kyeana

Figure out Git's diff unit tests, implement for libgit2

We need to rehearse the final presentation and what-not.

Just need to decide what all we need to do to prepare for this. I suppose we can just give each person stuff to talk about, decide the ordering, and wing it. I'm not too worried about the public speaking part of it, once we have all the content.

This is obviously low-priority until we have the actual code done, but thought I'd throw up an issue.

EDIT: So when are we going to do it? I suppose we'll talk more at the meeting on Saturday.

diff_no_index() fails to read files

PROBLEM:

Calls to diff_no_index() fail even when provided with correct paths for files.

RESOURCES:

The code that was run is here:

int main() {
    git_diffresults_conf *conf;
    git_repository *repo;
    git_repository_open(&repo, "");
    git_index *index;
    git_commit *commit1;
    git_commit *commit2;

    //git_diff(diffdata, commit1, repo);

    printf("MAIN\n");
    printf("%d\n", git_diff_no_index(&conf, "difftest_before", "difftest_after"));

    //git_diff_cached(diffdata, commit1, index);
    //git_diff_commits(diffdata, commit1, commit2);
}

ls gives us this:

a.out           difftest_after  difftest_before main.c          tests.c

tests.c is unrelated.

Set up scrumworks first sprint

Should be a 10-second job, but still on the list of things to-do.

git diff with newly added files.

In git, if you use git diff and there is a newly added file in your file system but not in the repository, then the contents of that file are not part of the diff. If you delete a file from the file system that was in the repository, then that file is still included in the diff.

Should we follow this behavior? If so it would make the git_diff() function virtually done, minus actually doing the diffs on the files that have changed.

xdlclassifier and xdlclass do not appear to have any function whatsoever

A lot of preparations happen in the diff pipeline. One of these preparations is to build xdlclassifiers and xdlclasses for every record. They're niftily constructed and the code is nicely written. The only problem is that I can't find a place where they're actually used. In our code, we build it in algo_environment() and in their code, they build it in xdl_prepare_env(). In both cases, the classifier and its classes are created and then simply thrown away. What does this code do?

If we didn't have to actually build these things, we could cut out probably 1/8th of the diff running time. We need to find out:

If it's the case that these are not used.
And un-implement them if it is

We free a pointer that wasn't malloc'd

PROBLEM: A call to git_diff_no_index() will read the contents of the files at params filepath1 and filepath2 into a char *buffer1 and a char *buffer2. Running this function produces the following error:

a.out(5679) malloc: *** error for object 0x100800000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
[1] 5679 abort ./a.out

What's going on here is that we're free'ing memory without actually having malloc'd it to begin with.
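This message typically appears when free() is handed a pointer that malloc() never returned: a stack array, a string literal, a pointer that was advanced past the start of its block, or one that was already freed. The offending code is not shown here, so the sketch below is generic and every name in it is hypothetical. It keeps the invariant that the buffer's data pointer is either NULL or exactly what malloc() returned, which makes releasing it always valid.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type: data is either NULL or a pointer
 * returned by malloc, so releasing it is always valid. */
struct file_buf {
	char *data;
	size_t len;
};

/* Copy len bytes of src into a freshly allocated buffer.
 * Returns 0 on success, -1 if the allocation fails. */
static int file_buf_set(struct file_buf *buf, const char *src, size_t len)
{
	buf->data = malloc(len + 1);
	if (buf->data == NULL)
		return -1;
	memcpy(buf->data, src, len);
	buf->data[len] = '\0';
	buf->len = len;
	return 0;
}

/* Release the buffer and reset it, so a second release is harmless. */
static void file_buf_release(struct file_buf *buf)
{
	free(buf->data);	/* free(NULL) is a no-op */
	buf->data = NULL;
	buf->len = 0;
}
```

Resetting the pointer to NULL after the release also guards against the double-free variant of the same abort.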