Comments (15)

ThomasWaldmann commented on May 22, 2024

I tried it with SystemRescueCd (which includes borg, ntfsclone and other useful tools), an MBR-partitioned disk, and Windows on 2 partitions:

Please verify the commands before you use them, especially the disk identifiers!

# backup
sfdisk -d /dev/sdx > sfdisk.txt
dd if=/dev/sdx of=mbr_gap count=2048  # all 512-byte sectors before the 1st partition (here: 2048); check the start offset in the sfdisk output
borg create --compression lz4 repo::hostname-partinfo-mbr-gap sfdisk.txt mbr_gap
ntfsclone -s -o - /dev/sdx1 | borg create --compression lz4 repo::hostname-part1 -
ntfsclone -s -o - /dev/sdx2 | borg create --compression lz4 repo::hostname-part2 -
# restore
borg extract repo::hostname-partinfo-mbr-gap
sfdisk /dev/sdx < sfdisk.txt      # partly redundant with restoring mbr_gap below, but makes the kernel re-read the partition table
dd if=mbr_gap of=/dev/sdx bs=1M
borg extract --stdout repo::hostname-part1 | ntfsclone -r -O /dev/sdx1 -
borg extract --stdout repo::hostname-part2 | ntfsclone -r -O /dev/sdx2 -

Note: mbr_gap contains the MBR (1 sector) plus the gap after it (which may or may not be in use, e.g. by a boot loader).
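
A quick optional sanity check of a saved partition image (same hypothetical repo/archive names as above) is to compare the checksum of a fresh ntfsclone stream with that of the stored archive; the sums should match as long as the partition was not modified in between:

# verify (optional): both checksums should be identical
ntfsclone -s -o - /dev/sdx1 | sha256sum
borg extract --stdout repo::hostname-part1 | sha256sum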

ThomasWaldmann commented on May 22, 2024

TODO: verify it, check dedup, maybe optimize chunker params, create nice rst docs.
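
Regarding the chunker params: for huge image streams one could experiment with a coarser minimum chunk size at create time. A sketch only (the values below are not a tested recommendation; the borg 1.x defaults are 19,23,21,4095):

# --chunker-params CHUNK_MIN_EXP,CHUNK_MAX_EXP,HASH_MASK_BITS,HASH_WINDOW_SIZE
ntfsclone -s -o - /dev/sdx1 | borg create --compression lz4 --chunker-params 20,23,21,4095 repo::hostname-part1 -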

HomerSlated commented on May 22, 2024

I just tried this and it works great.

585GB ntfsclone image, piped to borg, LZ4 compressed down to 475GB, then deduplicated down to 445GB.

Took about 12 hours.

ThomasWaldmann commented on May 22, 2024

@HomerSlated I had the impression that at least the first run is quite a lot slower than just writing the image to disk. Maybe it gets better once one saves another version of that image to the same repo.

HomerSlated commented on May 22, 2024

I'll test this when I make my first incremental, in a few days' time.

However, I believe this is expected behaviour for any deduplication system, and Borg is very far from being the slowest. I'm currently doing an initial CrashPlan backup of exactly the same dataset, except half of it is excluded, and CrashPlan is predicting -> 3 DAYS <- for completion. That's 3 days to back up 250GB, versus Borg backing up the full 500GB in only 12 hours. I haven't tested OpenDedup yet, but I've heard that it's even slower than CrashPlan.

Also, CrashPlan's size estimate would suggest, bizarrely, that its final archive will actually be bigger than the input data, whereas Borg's was markedly smaller even on the initial run.

I'll let you know the final results when they're ready.

HomerSlated commented on May 22, 2024

Here are the results.

In summary, a full backup of ~550GB took ~10 hours, and the first incremental (three weeks later) took ~4 hours, deduplicating down to ~20GB.

# On 2016-11-20
Name: Winders-20161119-230121.nfc
Fingerprint: 371062a4f5f0399485889d08bec4ab7ae037989a7ea565f92cdef98cfcfbe7f8
Hostname: lexy
Username: root
Time (start): Sat, 2016-11-19 23:01:22
Time (end):   Sun, 2016-11-20 08:44:01
Command line: /usr/lib/python-exec/python3.4/borg create --progress --stats --compression lz4 ::Winders-20161119-230121.nfc -
Number of files: 1

                       Original size      Compressed size    Deduplicated size
This archive:              534.74 GB            473.29 GB            451.56 GB
All archives:              546.89 GB            482.92 GB            460.50 GB

                       Unique chunks         Total chunks
Chunk index:                  538671               591406

# On 2016-12-09
Name: Winders-20161209-085926.nfc
Fingerprint: c8b91c276ca6834788b04205d4b37872cd0a8f9feb08614d25774b2154db5afc
Hostname: lexy
Username: root
Time (start): Fri, 2016-12-09 08:59:26
Time (end):   Fri, 2016-12-09 12:58:08
Command line: /usr/lib/python-exec/python3.4/borg create --progress --stats --compression lz4 ::Winders-20161209-085926.nfc -
Number of files: 1

                       Original size      Compressed size    Deduplicated size
This archive:              549.57 GB            475.01 GB             19.63 GB
All archives:                1.10 TB            957.93 GB            480.13 GB

                       Unique chunks         Total chunks
Chunk index:                  551119               798954

# On 2016-12-09
Name: Winders-20161119-230121.nfc
Fingerprint: 371062a4f5f0399485889d08bec4ab7ae037989a7ea565f92cdef98cfcfbe7f8
Hostname: lexy
Username: root
Time (start): Sat, 2016-11-19 23:01:22
Time (end):   Sun, 2016-11-20 08:44:01
Command line: /usr/lib/python-exec/python3.4/borg create --progress --stats --compression lz4 ::Winders-20161119-230121.nfc -
Number of files: 1

                       Original size      Compressed size    Deduplicated size
This archive:              534.74 GB            473.29 GB             16.51 GB
All archives:                1.10 TB            957.93 GB            480.13 GB

                       Unique chunks         Total chunks
Chunk index:                  551119               798954
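
For reference, per-archive stats like the above can be re-queried at any time, which is presumably why the November archive now shows a much smaller deduplicated size: most of its chunks are shared with the December archive. E.g.:

# assuming BORG_REPO is set, as in the create command lines above
borg info ::Winders-20161119-230121.nfc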

I'm also testing UrBackup with its CBT (Changed Block Tracker) on the same dataset. So far it's predicting ~7 hours for the backup, which seems wrong unless the CBT isn't working properly, although I don't yet know the final size of the differential.

I'm also planning to test burp2 at some point, for comparison.

ThomasWaldmann commented on May 22, 2024

Hint: using a more recent Python (e.g. 3.5, if possible) may give better speed.

The standalone borg binary bundles the latest Python release (currently 3.5.2).
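
If switching the system Python is not an option, the standalone binary from the GitHub releases page is an alternative (a sketch; the exact asset name and version may differ):

# download the single-file build, make it executable and check it runs
wget https://github.com/borgbackup/borg/releases/download/1.0.8/borg-linux64
chmod +x borg-linux64
./borg-linux64 --version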

ThomasWaldmann commented on May 22, 2024

@HomerSlated keep us updated about performance and other results of your comparison (I wanted to set up something to compare backup tools' performance, but haven't gotten around to it yet).

HomerSlated commented on May 22, 2024

Yes, a defined testing environment for controlled results would be useful. The problem is that at least one of those tests would require running from a different OS (Windows), thus breaking the control conditions (UrBackup imaging and CBT currently only work under Windows).

Nonetheless, a "real-world" speed and size comparison is useful.

HomerSlated commented on May 22, 2024

Also, Python 3.5 is currently masked as unstable on Gentoo, and emerging it would cause all kinds of dependency issues, so I'll leave it for now.

[I] dev-lang/python
     Available versions:
     (2.7)  2.7.10-r1 2.7.12
     (3.4)  3.4.3-r1 3.4.5(3.4/3.4m)
     (3.5)  ~3.5.2(3.5/3.5m)
       {-berkdb build doc examples gdbm hardened ipv6 libressl +ncurses +readline sqlite +ssl +threads tk +wide-unicode wininst +xml ELIBC="uclibc"}
     Installed versions:  2.7.12(2.7)(13:54:00 04/12/16)(gdbm ipv6 ncurses readline ssl threads wide-unicode xml -berkdb -build -doc -examples -hardened -libressl -sqlite -tk -wininst ELIBC="-uclibc") 3.4.5(3.4)(14:02:59 04/12/16)(gdbm ipv6 ncurses readline ssl threads xml -build -examples -hardened -libressl -sqlite -tk -wininst ELIBC="-uclibc")
     Homepage:            http://www.python.org/
     Description:         An interpreted, interactive, object-oriented programming language

* app-backup/borgbackup
     Available versions:  ~1.0.7 ~1.0.8 **9999 {+fuse libressl PYTHON_TARGETS="python3_4 python3_5"}
     Homepage:            https://borgbackup.readthedocs.io/
     Description:         Deduplicating backup program with compression and authenticated encryption.
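
(For completeness, accepting the testing keyword for just that slot would look roughly like the sketch below on Gentoo, though as noted above it may pull in unwanted dependency changes.)

# sketch only, not a recommendation
echo 'dev-lang/python:3.5 ~amd64' >> /etc/portage/package.accept_keywords
emerge -av dev-lang/python:3.5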

HomerSlated commented on May 22, 2024

Well, it turned out that UrBackup took 222 minutes (3 hours 42 mins) to complete, with an incremental size of just 4.94GB, and that's without CBT functioning correctly (which would have reduced the time to maybe 5 minutes).

enkore commented on May 22, 2024

15 MB/s is indeed rather slow (even for Borg 1.0). UrBackup's 40 MB/s is a lot better, but still kinda slow. I'm not sure how UrBackup works; it seems to be a classic full+delta backup system, so it probably uses much less CPU (likely the limiting factor in your case?).
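
(For context, those figures follow from the stats above, assuming decimal GB: roughly 535 GB in 9 h 43 min for the initial borg run, and ~550 GB read in 222 min for the UrBackup run.)

# back-of-the-envelope throughput
echo '534740 / (9*3600 + 42*60 + 39)' | bc -l   # ~15.3 MB/s (borg, 23:01:22 -> 08:44:01)
echo '550000 / (222*60)' | bc -l                # ~41.3 MB/s (UrBackup)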

CBT is indeed an interesting technology. It would seem to enable behaviour similar to Borg's file metadata cache (which allows Borg to back up an unchanged file near-instantly, regardless of size), just for images rather than file systems -- quite interesting!

HomerSlated commented on May 22, 2024

I should add that the backup storage is on USB 3.0, which on my hardware benches at ~200 MB/s sequential read/write; that's clearly more than 40 MB/s, so it's probably not the bottleneck.

Actually I'm not really bothered about the speed (except when the CBT client I paid for doesn't seem to work). I'm more interested in saving storage space, and both borg and UrBackup work well in that regard, although the latter seems to have the edge right now (in my tests so far).

I just tried burp 2.x, but can't figure out how to get it to work. It seems I need to RTFM on OpenSSL certificates before I can even get started.

prnrrgxf commented on May 22, 2024

@ThomasWaldmann
You said in the two-hour borg video from 2017 that users should comment at the end of a bug in the bug tracker when they need a feature, so that's what I'm doing now.
I need exactly this. I have to back up a Windows machine with a 2TB HDD of which only 200GB is used. At the moment I boot a live Linux, connect a second 2TB HDD to the computer and run dd to copy everything (MBR, just all of it).
This is a huge waste of time and resources: every time, a full 2TB backup that is ~90% empty.
Could you please make a production-ready borg solution for this? That would be awesome!

dragetd commented on May 22, 2024

Please note: if your deleted sectors contain random old data, deduplication will not be able to deduplicate everything, and the backup will still be slow and large. 'dd' is not the best choice here.

ntfsclone is designed to copy only the used sectors. Even without borg, you should benefit from using ntfsclone. Using borg+ntfsclone also seems to work without issues, giving even smaller backups (only the performance issues discussed above remain to be investigated).
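
(If one is stuck with a raw dd image for some reason, zero-filling the NTFS free space beforehand lets compression and deduplication collapse the unused area, approximating part of what ntfsclone does natively. A sketch with a hypothetical mount point:)

# fill the free space with zeros, then remove the filler file
mount -t ntfs-3g /dev/sdx1 /mnt/ntfs
dd if=/dev/zero of=/mnt/ntfs/zerofill bs=1M || true   # dd stops when the filesystem is full
rm /mnt/ntfs/zerofill
umount /mnt/ntfs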
