vaibhavsagar / duffer
A git-compatible content tracker in Haskell.
Home Page: http://vaibhavsagar.com/duffer
This library doesn't currently support any merging/diffing operations. There are existing libraries that provide text diffs; maybe we could use those?
There should be a function that takes one of: a SHA1, a symref (HEAD, ORIG_HEAD, etc.), or a branch/lightweight tag name (master, test, etc.) and provides the corresponding object if the provided information unambiguously identifies one. This is what `git cat-file -p` does.
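A minimal sketch of the first half of such a function, which classifies the input and lists where to look for it. The names `Revision`, `classify`, `candidatePaths`, and `unambiguous` are hypothetical and not part of the library, and the lookup paths are a simplification of what git itself checks.

```haskell
import Data.Char (isHexDigit)

-- The three kinds of input described above (hypothetical type).
data Revision
    = ObjectId String   -- a full 40-character hexadecimal SHA1
    | SymRef String     -- a symbolic ref such as HEAD or ORIG_HEAD
    | ShortName String  -- a branch or lightweight tag name
    deriving (Eq, Show)

-- Decide which kind of revision a string most plausibly names.
classify :: String -> Revision
classify s
    | length s == 40 && all isHexDigit s = ObjectId s
    | s `elem` ["HEAD", "ORIG_HEAD"]     = SymRef s
    | otherwise                          = ShortName s

-- Paths, relative to the git directory, where the ref might live.
-- An object ID needs no ref lookup, so it has no candidate paths.
candidatePaths :: Revision -> [FilePath]
candidatePaths (ObjectId _)  = []
candidatePaths (SymRef s)    = [s]
candidatePaths (ShortName n) = ["refs/heads/" ++ n, "refs/tags/" ++ n]

-- Succeed only when exactly one candidate matched.
unambiguous :: [a] -> Maybe a
unambiguous [x] = Just x
unambiguous _   = Nothing
```

A caller would read whichever candidate paths exist, pass the results to `unambiguous`, and only then fetch the object.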
We have alright integration tests to check that the system as a whole seems to be doing the right thing, but we have minimal unit tests that check that individual units are working correctly. This would be useful if/when things break so we can isolate the change that is causing the problem, something that the current tests are very bad at.
The packfile resolution strategies do not handle thin packfiles, in which a delta's base object may be missing from the pack itself. Maybe we should extend them to handle these?
I have calls to `error` everywhere; this should be refactored to use `Maybe`/`Either` as appropriate.
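As an illustration of the kind of refactoring meant here, the sketch below shows a partial function next to a total one. `ObjectType`, `parseTypeUnsafe`, and `parseType` are made-up names for this example and may not match anything in the codebase.

```haskell
data ObjectType = Blob | Tree | Commit | Tag
    deriving (Eq, Show)

-- Partial version: unexpected input crashes the whole program.
parseTypeUnsafe :: String -> ObjectType
parseTypeUnsafe "blob"   = Blob
parseTypeUnsafe "tree"   = Tree
parseTypeUnsafe "commit" = Commit
parseTypeUnsafe "tag"    = Tag
parseTypeUnsafe other    = error ("unknown object type: " ++ other)

-- Total version: the failure is a value the caller must handle.
parseType :: String -> Either String ObjectType
parseType "blob"   = Right Blob
parseType "tree"   = Right Tree
parseType "commit" = Right Commit
parseType "tag"    = Right Tag
parseType other    = Left ("unknown object type: " ++ other)
```

`Either` keeps the error message, which `Maybe` would discard, so it is probably the better fit wherever the reason for failure matters.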
When generating a packfile, `git`'s heuristics account for the fact that older objects are infrequently accessed. These objects are more likely to be stored later in the packfile as deltas against newer objects. This means that we should provide a way of getting just a single object out of a packfile instead of unpacking all of them.
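A rough sketch of resolving one object by following only its own delta chain. It assumes a heavily simplified model: entries are keyed by absolute offset, a delta names its base by absolute offset (git stores a relative offset), and a delta is just a function on the base content. `Entry` and `resolveAt` are hypothetical names.

```haskell
import qualified Data.Map.Strict as Map

-- Simplified pack entries (hypothetical model, not git's binary layout).
data Entry
    = Full String                   -- a complete object
    | Delta Int (String -> String)  -- base offset and a patch to apply to the base

-- Resolve the object at one offset, touching only the entries in its chain.
resolveAt :: Map.Map Int Entry -> Int -> Maybe String
resolveAt entries offset = case Map.lookup offset entries of
    Nothing             -> Nothing
    Just (Full content) -> Just content
    Just (Delta base patch)
        -- An offset delta's base precedes it in the pack; insisting on
        -- that here also guarantees the recursion terminates.
        | base < offset -> patch <$> resolveAt entries base
        | otherwise     -> Nothing
```

Entries that are not on the chain are never inspected, which is the property the paragraph above asks for.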
A rudimentary implementation of v2 is available here.
This thing has zero documentation. Why is that?
We should be able to query repositories over the network. This could be fun to implement. Maybe we should use `servant`?
It would be useful to explain how to get started with this project, i.e. basic build/test steps. Info on `stack` would be useful for non-Haskellers who are interested.
Nobody knows how to use this library. I think we should have a sample app, such as a to-do list app that keeps track of all previous versions.
My vision for this library is to support pluggable backends like `libgit2` does. I think this should be implemented as a `Repository` typeclass with the minimum methods to support using an arbitrary backend as a database (e.g. `writeObject` and `readObject`, perhaps a `hasObject` method?).
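One possible shape for such a typeclass, sketched with an in-memory backend. The method signatures, the `Ref` and `Object` stand-in types, and `MemoryRepo` are all assumptions made for illustration. In particular, the caller supplies the key here, whereas a git-style store would derive it by hashing the object.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map
import Data.Maybe (isJust)

type Ref    = String  -- stand-in for a SHA1
type Object = String  -- stand-in for a parsed git object

-- Minimal interface a storage backend would need to provide (hypothetical).
class Repository r where
    readObject  :: r -> Ref -> IO (Maybe Object)
    writeObject :: r -> Ref -> Object -> IO ()
    -- Default in terms of readObject; backends may override it
    -- with something cheaper.
    hasObject   :: r -> Ref -> IO Bool
    hasObject repo ref = isJust <$> readObject repo ref

-- An in-memory backend backed by a mutable map.
newtype MemoryRepo = MemoryRepo (IORef (Map.Map Ref Object))

newMemoryRepo :: IO MemoryRepo
newMemoryRepo = MemoryRepo <$> newIORef Map.empty

instance Repository MemoryRepo where
    readObject  (MemoryRepo store) ref     = Map.lookup ref <$> readIORef store
    writeObject (MemoryRepo store) ref obj = modifyIORef' store (Map.insert ref obj)
```

A filesystem or network backend would be another instance of the same class, leaving code written against `Repository` unchanged.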
Commits and tags can be GPG-signed. These might parse due to a quirk in the parser implementation but we should handle this (and other commit extras) like `hs-git` does.
Since `git` commits form a directed acyclic graph, we should be able to query a repository using a graph query language instead of `git log`'s arcane format. I think this would be awesome to implement.
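To give a feel for what graph queries over commits might look like, here is a small sketch that treats history as a map from each commit to its parents. `CommitGraph`, `ancestors`, and `isAncestorOf` are hypothetical names, and a real query language would sit on top of primitives like these.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type CommitId    = String
type CommitGraph = Map.Map CommitId [CommitId]  -- commit -> its parents

-- Every commit reachable from the given commit by following parent links.
ancestors :: CommitGraph -> CommitId -> Set.Set CommitId
ancestors graph start = go Set.empty (parentsOf start)
  where
    parentsOf c = Map.findWithDefault [] c graph
    go seen [] = seen
    go seen (c : rest)
        | c `Set.member` seen = go seen rest  -- already visited via another path
        | otherwise           = go (Set.insert c seen) (parentsOf c ++ rest)

-- Is the first commit part of the history of the second?
isAncestorOf :: CommitGraph -> CommitId -> CommitId -> Bool
isAncestorOf graph a b = a `Set.member` ancestors graph b
```

The `seen` set matters because merge commits make the same ancestor reachable along several paths.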
Duffer.Loose.Objects has `concat`s and `append`s everywhere. This probably isn't good for performance and we should refactor this to use ByteString.Builder. We can roughly measure the difference using `stack test --profile`.
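A sketch of what the `Builder`-based style could look like for the uncompressed form of a loose object (type, space, decimal length, NUL byte, content). `looseObject` and `render` are hypothetical helpers and not the module's current functions.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Builder
    (Builder, byteString, char7, intDec, toLazyByteString, word8)

-- Describe the header and payload as one Builder; nothing is
-- copied until the Builder is finally run.
looseObject :: B.ByteString -> B.ByteString -> Builder
looseObject objectType content =
    byteString objectType
        <> char7 ' '
        <> intDec (B.length content)
        <> word8 0
        <> byteString content

-- Run the Builder once, at the edge of the program.
render :: Builder -> B.ByteString
render = BL.toStrict . toLazyByteString
```

For example, `render (looseObject (BC.pack "blob") (BC.pack "hello"))` yields the bytes of `"blob 5\0hello"`, which would then be compressed before being written to disk.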
After `git repack -ad` the tests fail because an offset delta representing `highlight.js` is encoded differently by our logic, which means that the CRC is different. Strangely, the decoded representations of the original delta and our incorrect encoding are the same.
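This is not a diagnosis of the bug, only an illustration of how two different delta encodings can decode to the same bytes. The sketch uses a simplified instruction set (`DeltaOp` and `applyDelta` are hypothetical, and this is not git's binary delta format).

```haskell
-- Simplified delta instructions.
data DeltaOp
    = Copy Int Int   -- copy this many bytes (second field) from the base at this offset (first field)
    | Insert String  -- insert literal data
    deriving (Eq, Show)

-- Rebuild the target by running each instruction against the base.
applyDelta :: String -> [DeltaOp] -> String
applyDelta base = concatMap step
  where
    step (Copy offset len) = take len (drop offset base)
    step (Insert literal)  = literal
```

With the base `"hello world"`, the instruction lists `[Copy 0 11]`, `[Copy 0 6, Copy 6 5]`, and `[Copy 0 6, Insert "world"]` all reproduce the base, yet they would serialize to different bytes and therefore have different checksums.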
The packfile types and bit-twiddling helpers are currently intermingled. Separating them would also aid a refactoring to e.g. a Vector of Word8s.
In an attempt to reduce duplication, this one test checks that:
`writeObject` works correctly).
This is absurd. It would be better to split this test into many smaller tests and test `writeObject` separately so that it is explicit that the rest of the tests depend on the presence of the correct loose objects.
The library as currently implemented provides the absolute minimum API for writing applications using `git` as a database/storage layer. This needs to be improved. So far my best idea for this is an in-memory repository representation.
The library currently handles only loose references, but it needs to handle at least reading packed references.
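A minimal sketch of reading a `packed-refs` file, assuming its usual layout: an optional `#` header line, one `<sha1> <refname>` pair per line, and `^<sha1>` lines giving the peeled target of the preceding tag. `parsePackedRefs` is a hypothetical name, and peeled lines are skipped here to keep the example small.

```haskell
import Data.Char (isHexDigit)
import Data.Maybe (mapMaybe)

-- Parse the contents of a packed-refs file into (ref name, SHA1) pairs.
parsePackedRefs :: String -> [(String, String)]
parsePackedRefs = mapMaybe parseLine . lines
  where
    parseLine line = case line of
        ('#' : _) -> Nothing  -- header comment
        ('^' : _) -> Nothing  -- peeled tag target, ignored in this sketch
        _ -> case break (== ' ') line of
            (sha, ' ' : name)
                | length sha == 40, all isHexDigit sha, not (null name) ->
                    Just (name, sha)
            _ -> Nothing      -- malformed line
```

Loose refs take precedence over packed ones, so a lookup would consult the loose ref first and fall back to this list.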
This library supports reading packfile contents only if the corresponding pack index is also present. In some situations (e.g. `git clone`) we will be streamed a packfile and expected to generate the index ourselves. This should be supported by the library.
Generate a map of offsets to bytestrings representing each entry.
`git` packfiles contain some content that is compressed and some content that is not compressed. Parsing the uncompressed content is straightforward but the length of each compressed section is unknown from previous input. Instead the length of the decompressed output is provided, which isn't helpful for our purposes as none of the libraries currently in use support streaming decompression.
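The uncompressed part of each entry is a small header. Its first byte holds a continuation bit, a 3-bit object type, and the low 4 bits of the decompressed size, and each following byte adds 7 more size bits. A sketch of parsing it follows. `parseEntryHeader` is a hypothetical name, and it works on a list of bytes rather than a `ByteString` for brevity.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word8)

-- Parse a pack entry header, returning the object type code, the
-- decompressed size, and the remaining (compressed) bytes.
parseEntryHeader :: [Word8] -> Maybe (Int, Int, [Word8])
parseEntryHeader []       = Nothing
parseEntryHeader (b : bs) = go (testBit b 7) size0 4 bs
  where
    objType = fromIntegral ((b `shiftR` 4) .&. 7) :: Int
    size0   = fromIntegral (b .&. 15) :: Int

    -- The Bool says whether another size byte follows.
    go :: Bool -> Int -> Int -> [Word8] -> Maybe (Int, Int, [Word8])
    go False size _ rest = Just (objType, size, rest)
    go True _ _ []       = Nothing  -- input ended mid-header
    go True size shift (c : cs) =
        go (testBit c 7)
           (size .|. (fromIntegral (c .&. 127) `shiftL` shift))
           (shift + 7)
           cs
```

The size recovered here is the decompressed length mentioned above, so it still does not say where the compressed data ends.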
The solution is to use a streaming IO library with zlib decompression support such as Pipes or Conduit to separate a packfile into entries and generate the offsets of each pack entry. This can then be processed using our existing functions to generate the necessary pack index.
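Once the entries have been separated, which is the hard part that needs streaming decompression, building the offset map is simple. The sketch below assumes each entry is already available as the raw bytes it occupies in the pack, and relies on the first entry starting after the 12-byte pack header. `offsetMap` is a hypothetical name.

```haskell
import qualified Data.ByteString as B
import qualified Data.Map.Strict as Map

-- Map each entry's starting offset in the packfile to its raw bytes.
-- Entries must be given in the order they appear in the pack.
offsetMap :: [B.ByteString] -> Map.Map Int B.ByteString
offsetMap entries = Map.fromList (zip offsets entries)
  where
    -- Running total of entry lengths, starting after the 12-byte header.
    -- scanl yields one extra trailing offset, which zip discards.
    offsets = scanl (+) 12 (map B.length entries)
```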
Cannot reproduce and fix locally: https://travis-ci.org/vaibhavsagar/duffer/builds/160462242