patchitup
Back up your file to a cloud server using minimal bandwidth.

patchitup keeps a remote copy of a file up-to-date through incremental patches. In a nutshell, it is a pure-Golang library and CLI tool for running a client and server that exchange incremental gzipped patches, so that the remote copy always matches the client's local file.

Why? I wrote this program to reduce bandwidth usage when backing up SQLite databases from Raspberry Pis to a remote server. I have deployed some software on Raspberry Pis that periodically dumps the database to SQL text. Since Raspberry Pis can die sometimes, I want to keep their data stored remotely. The databases can get fairly large, but a patch between two SQL text dumps contains only the changed/new records. patchitup lets the client send only those changed/new records to the cloud while still maintaining an exact copy there. This can massively reduce bandwidth between the client and the cloud.

Why not git? While git basically does this already, it's not terribly easy to set up a git server to support multiple users (though gitolite does a great job of simplifying the process). Also, most of the features of git are not necessary for my use case.

Quickstart

In addition to being a Golang library, patchitup is a server+client. To try it, first install patchitup with Go:

$ go get -u -v github.com/schollz/patchitup/...

Then start a patchitup server:

$ patchitup -host
Running at http://0.0.0.0:8002

Then you can patch a file:

$ patchitup -u me -s http://localhost:8002 -f SOMEFILE
2018-02-23 08:56:44 [INFO] patched 2.4 kB (62.8%) to remote 'SOMEFILE' for 'me'
2018-02-23 08:56:44 [INFO] remote server is up-to-date

$ vim SOMEFILE # make some edits

$ patchitup -u me -s http://localhost:8002 -f SOMEFILE
2018-02-23 08:57:40 [INFO] patched 408 B (9.9%) to remote 'SOMEFILE' for 'me'
2018-02-23 08:57:40 [INFO] remote server is up-to-date

The first patch basically just sends up the gzipped file. Subsequent patches send up only the changes. The percentage (e.g. 9.9%) is the size of the data sent relative to the entire file size, which gives an idea of the bandwidth savings. The server also logs bandwidth usage.

How does it work?

Note: patchitup does not work for binary files (yet).

Why not just do "diff -u old new > patch && rsync patch your@server:"? Well, patchitup keeps things organized a lot better and uses gzip by default to reduce the bandwidth cost even further. Also, in order to patch a remote file you first need a copy of the remote file to create the patch against. In patchitup, if a local copy of the remote file is not available, it is reconstructed in a way that can massively reduce bandwidth compared with simply downloading the remote file. To reconstruct a local copy of the remote file:

  1. The client asks the remote server for a hash of every line and its corresponding line number in the remote file.
  2. The client checks to see if any lines are needed (i.e. the set of line hashes that do not exist in the current local file). The client then asks the remote server for the actual lines corresponding to the missing hashes.
  3. The client uses these data (the local line hashes, the remote line hashes, and the hash line numbers) to reconstruct a copy of the remote file for doing the patching.

Once the local copy of the remote file is established, a patch is created, gzipped, and sent to the server to overwrite the current remote copy. A copy of the current remote file is cached locally so that it need not be reconstructed the next time.

A more detailed flow chart:

Roadmap

I would love PRs.

Some ideas I'd like to add:

  • Built-in security (authentication tokens?)
  • Encryption option (to keep data on server private)

License

MIT

Thanks

Logo designed by www.Vecteezy.com
