
duffer's People

Contributors

porterjamesj, vaibhavsagar


duffer's Issues

Merging/Diffing

This library doesn't currently support any merging/diffing operations. There are existing libraries that provide text diffs; maybe we could use those?

Fuzzy finder for objects.

There should be a function that takes one of: a SHA1, a symref (HEAD, ORIG_HEAD, etc.), or a branch/lightweight tag name (master, test, etc.) and provides the corresponding object if the provided information unambiguously identifies one. This is what git cat-file -p does.
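
As a rough illustration of the lookup described above, here is a minimal sketch. Everything in it is hypothetical: `RepoRefs`, `ResolveError`, `resolve`, and `isSha1` are names invented for this example, and the repository is modelled as plain association lists rather than the library's real types. It treats a name that matches several different objects as an error, which is one possible reading of "unambiguously"; git itself applies precedence rules instead.

```haskell
import Data.Char (isHexDigit)
import Data.List (nub)
import Data.Maybe (maybeToList)

-- Hypothetical stand-in for an object hash (40 hex characters).
type Ref = String

-- Hypothetical, simplified view of the names a repository knows about.
data RepoRefs = RepoRefs
  { symrefs  :: [(String, String)] -- symref name -> branch name, e.g. ("HEAD", "master")
  , branches :: [(String, Ref)]    -- branch name -> commit hash
  , tags     :: [(String, Ref)]    -- lightweight tag name -> object hash
  , objects  :: [Ref]              -- hashes of all known objects
  }

data ResolveError = NotFound | Ambiguous [Ref]
  deriving (Eq, Show)

-- A full SHA1 is exactly 40 hexadecimal characters.
isSha1 :: String -> Bool
isSha1 s = length s == 40 && all isHexDigit s

-- Collect every object the name could refer to and succeed only if
-- all interpretations agree on a single hash.
resolve :: RepoRefs -> String -> Either ResolveError Ref
resolve repo name =
  case nub candidates of
    [ref] -> Right ref
    []    -> Left NotFound
    refs  -> Left (Ambiguous refs)
  where
    candidates = concat
      [ [name | isSha1 name, name `elem` objects repo]
      , maybeToList (lookup name (symrefs repo) >>= \b -> lookup b (branches repo))
      , maybeToList (lookup name (branches repo))
      , maybeToList (lookup name (tags repo))
      ]
```

With `symrefs = [("HEAD", "master")]` and a `master` branch, `resolve repo "HEAD"` yields the hash that `master` points to. A name shared by a branch and a tag that point at different objects yields `Left (Ambiguous ...)`.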

More unit tests.

We have reasonable integration tests that check that the system as a whole seems to be doing the right thing, but only minimal unit tests that check that individual units are working correctly. Unit tests would be useful if/when things break, so that we can isolate the change that is causing the problem, something that the current tests are very bad at.

Thin Packfiles

The packfile resolution strategies do not handle thin packfiles. Maybe we should extend them to be able to?

Unpack single object from packfile.

When generating a packfile, git heuristics account for the fact that older objects are infrequently accessed. These objects are more likely to be stored later in the packfile as deltas against newer objects. This means that we should provide a way of getting just a single object out of a packfile instead of unpacking all of them.

HTTP API

We should be able to query repositories over the network. This could be fun to implement. Maybe we should use servant?

Getting Started Guide

Would be useful to explain how to get started with this project, i.e. basic build/test steps. Info on stack would be useful for non-Haskellers who are interested.

Example App

Nobody knows how to use this library. I think we should have a sample app, such as a to-do list app that keeps track of all previous versions.

Pluggable backends

My vision for this library is to support pluggable backends like libgit2 does. I think this should be implemented as a Repository typeclass with the minimum methods to support using an arbitrary backend as a database (e.g. writeObject and readObject, perhaps a hasObject method?).
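
One possible shape for such a typeclass is sketched below. This is not the library's actual API: `Ref`, `GitObject`, `MemoryRepo`, and `newMemoryRepo` are hypothetical placeholders, and `writeObject` takes the key explicitly here rather than computing it from the object's content hash, purely to keep the example short.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.Maybe (isJust)

-- Hypothetical stand-ins for the library's real hash and object types.
type Ref = String

newtype GitObject = Blob String
  deriving (Eq, Show)

-- The minimum a backend must provide to act as an object database.
class Repository r where
  readObject  :: r -> Ref -> IO (Maybe GitObject)
  writeObject :: r -> Ref -> GitObject -> IO ()

  -- Default implementation; a backend can override it with a cheaper check.
  hasObject :: r -> Ref -> IO Bool
  hasObject repo ref = isJust <$> readObject repo ref

-- An in-memory backend: an association list held in a mutable reference.
newtype MemoryRepo = MemoryRepo (IORef [(Ref, GitObject)])

newMemoryRepo :: IO MemoryRepo
newMemoryRepo = MemoryRepo <$> newIORef []

instance Repository MemoryRepo where
  readObject (MemoryRepo store) ref = lookup ref <$> readIORef store
  writeObject (MemoryRepo store) ref obj = modifyIORef' store ((ref, obj) :)
```

A filesystem-backed or network-backed instance would implement the same two required methods, so application code written against `Repository` would not need to change.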

Handling GPG-signed objects.

Commits and tags can be GPG-signed. These might parse due to a quirk in the parser implementation but we should handle this (and other commit extras) like hs-git does.

GraphQL Interface

Since git commits form a directed acyclic graph, we should be able to query a repository using a graph query language instead of git log's arcane format. I think this would be awesome to implement.

Tests fail after `git repack -ad`

After git repack -ad the tests fail because an offset delta representing highlight.js is encoded differently by our logic, which means that the CRC is different. Strangely, the original delta and our incorrect encoding both decode to the same representation.

This test does too many things.

In an attempt to reduce duplication, this one test checks that:

  1. A decoded packfile can be encoded to be equal to the input.
  2. A decoded re-encoded packfile is equal to a decoded packfile.
  3. The CRCs of an encoded packfile match the CRCs provided in the pack index.
  4. A list of objects resolved with reference to the index matches the list of objects resolved without reference to the index.
  5. The hashes of the resolved objects match the ones found in the pack index.
  6. The resolved objects can be written to disk (writeObject works correctly).
  7. We can generate a list of pack index entries that matches our input.

This is absurd. It would be better to split this test into many smaller tests and test writeObject separately so that it is explicit that the rest of the tests depend on the presence of the correct loose objects.

The API needs to be better.

The library as currently implemented provides the absolute minimum API for writing applications using git as a database/storage layer. This needs to be improved. So far my best idea for this is an in-memory repository representation.

Can't generate a pack index from a packfile.

Objective

This library supports reading packfile contents only if the corresponding pack index is also present. In some situations (e.g. git clone) we will be streamed a packfile and expected to generate the index ourselves. This should be supported by the library.

Solution

Generate a map of offsets to bytestrings representing each entry.

Approach

git packfiles contain some content that is compressed and some content that is not compressed. Parsing the uncompressed content is straightforward but the length of each compressed section is unknown from previous input. Instead the length of the decompressed output is provided, which isn't helpful for our purposes as none of the libraries currently in use support streaming decompression.

The solution is to use a streaming IO library with zlib decompression support such as Pipes or Conduit to separate a packfile into entries and generate the offsets of each pack entry. This can then be processed using our existing functions to generate the necessary pack index.
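
To make the uncompressed part concrete, the sketch below decodes the header that precedes each packfile entry, following the layout in git's pack format documentation: the first byte carries a continuation bit, a 3-bit object type, and the low 4 bits of the inflated size, and each following byte contributes 7 more size bits. `parseEntryHeader` is a hypothetical name, and the input is a plain list of bytes rather than whatever representation the library actually uses. Note that the size it returns is the decompressed length, which is exactly why the end of the compressed data still has to be found by streaming decompression.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word8)

-- Decode one entry header. Returns the object type tag, the inflated
-- (decompressed) size, and the remaining bytes, or Nothing if the input
-- ends in the middle of a header.
parseEntryHeader :: [Word8] -> Maybe (Int, Int, [Word8])
parseEntryHeader [] = Nothing
parseEntryHeader (b : bs) = go (testBit b 7) 4 size0 bs
  where
    -- Bits 6-4 of the first byte hold the object type.
    typ = fromIntegral ((b `shiftR` 4) .&. 0x07)
    -- Bits 3-0 of the first byte hold the lowest 4 bits of the size.
    size0 = fromIntegral (b .&. 0x0f)

    -- While the continuation bit is set, each further byte supplies
    -- 7 more size bits, least significant group first.
    go False _ size rest = Just (typ, size, rest)
    go True _ _ [] = Nothing
    go True shift size (c : cs) =
      go (testBit c 7)
         (shift + 7)
         (size .|. (fromIntegral (c .&. 0x7f) `shiftL` shift))
         cs
```

For example, the bytes `0x95 0x0a` decode to type 1 (commit) with an inflated size of 165: the first byte supplies the low bits 5, and the second supplies 10 shifted left by 4.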
