fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix

License: MIT License

C++ 99.96% Makefile 0.03% Batchfile 0.01%
zpaq compression deduplication backup solaris

zpaqfranz's Introduction

zpaqfranz: advanced multiversioned archiver, with HW acceleration and SFX (on Windows)

Swiss army knife for backup and disaster recovery, like 7z or RAR on steroids, with deduplicated "snapshots" (versions). Conceptually similar to the Mac's Time Machine, but much more efficient. zpaq 7.15 fork.

Platform OS package Version Video
Windows 32/64bit
OpenBSD pkg_add zpaqfranz
FreeBSD pkg install zpaqfranz
MacOS brew install zpaqfranz
OpenSUSE sudo zypper install zpaqfranz
Debian (Ubuntu etc) 58.10i Desktop
Linux generic 58.10g
Arch AUR user repository 58.10i Terminal
Solaris
OmniOS
QNAP (Annapurna)
Haiku
ESXi
Freeware GUI for Windows latest
ZpaqTreeView Third Party Python software

Freeware GUI for Windows Wiki Help Sourceforge

Classic archivers (tar, 7z, RAR etc.) are obsolete for repeated backups (daily etc.) compared to the ZPAQ technology, which maintains "snapshots" (versions) of the data. This is even more true for ASCII dumps of databases (e.g. MySQL/MariaDB).

Let's see. Archiving a folder multiple times (5), simulating a daily run Monday-to-Friday, with 7z

7Z-1.mp4

Same, but with zpaqfranz

Zpaq-1.mp4

As you can see, the five "daily" .7z backups take ~5x the space of the .zpaq

compare

Seeing is believing ("real world")

I thought it best to show the difference with a more realistic example.

Physical (small fileserver) Xeon machine with 8 cores, 64GB RAM and NVMe disks, plus Solaris-based NAS, 1Gb ethernet

Rsync update from filesystem to filesystem (real speed)

Rsync.Backup-1.mp4

Rsync update to Solaris NAS (real speed)

Rsync.Nas-1.mp4

Backup update from file system with zpaqfranz (real speed)

Zpaq.Backup-1.mp4

Backup update via zfsbackup (real speed)

Zfs.Backup-1.mp4

What?

On every run, only the data changed since the last execution is added, creating a new version (the "snapshot"). It is then possible to restore the data as of any single version, just like zfs or virtual machine snapshots, but at single-file level.
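The idea of "add only what changed, keep every version" can be sketched in a few lines. This is a toy model, not zpaqfranz's actual format: the `VersionedStore` class, the tiny fixed-size `CHUNK` and the file names are all assumptions made for illustration (the real archiver cuts fragments differently and compresses them).

```python
import hashlib

CHUNK = 4  # tiny fixed-size fragments, only to keep the demo readable


class VersionedStore:
    """Toy append-only store: each run adds a version that references fragments by hash."""

    def __init__(self):
        self.fragments = {}   # SHA-1 hex digest -> fragment bytes (stored once)
        self.versions = []    # one {filename: [digest, ...]} mapping per run

    def add(self, files):
        """Record a new version; return how many bytes of new fragments were stored."""
        added = 0
        snapshot = {}
        for name, data in files.items():
            digests = []
            for i in range(0, len(data), CHUNK):
                piece = data[i:i + CHUNK]
                digest = hashlib.sha1(piece).hexdigest()
                if digest not in self.fragments:   # only unseen data costs space
                    self.fragments[digest] = piece
                    added += len(piece)
                digests.append(digest)
            snapshot[name] = digests
        self.versions.append(snapshot)
        return added

    def extract(self, version):
        """Rebuild every file as it was at the given 1-based version number."""
        snapshot = self.versions[version - 1]
        return {name: b"".join(self.fragments[d] for d in digests)
                for name, digests in snapshot.items()}


store = VersionedStore()
monday = {"notes.txt": b"aaaabbbbcccc"}
tuesday = {"notes.txt": b"aaaabbbbccccdddd"}   # one fragment appended
added_monday = store.add(monday)     # stores all three fragments
added_tuesday = store.add(tuesday)   # stores only the new fragment
```

The second run costs only the bytes that were not already in the store, yet `store.extract(1)` still returns Monday's file unchanged.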

  • Keeps every version forever (even thousands of them), conceptually similar to the Mac's Time Machine, but much more efficient.
  • Ideal for virtual machine disk storage (e.g. backup of vmdk), virtual disks (VHDx) and even TrueCrypt containers.
  • Easily handles millions of files and tens of TBs of data.
  • Allows rsync (or zfs replica) copies to the cloud with minimal data transfer and encryption.
  • Multiple possibilities of data verification, fast, advanced and even paranoid.
  • Some optimizations for modern hardware (aka: SSD, NVMe, multithread).
  • By default triple-check with "chunked" SHA-1, XXHASH64 and CRC-32 (!).
For an even higher level of paranoia, it is possible to use other hash algorithms, such as
  • MD5
  • SHA-1 of the full-file (NIST FIPS 180-4)
  • XXH3-128
  • BLAKE3 128
  • SHA-2-256 (NIST FIPS 180-4)
  • SHA-3-256 (NIST FIPS 202)
  • WHIRLPOOL (ISO/IEC 10118-3)
  • HIGHWAY (64,128,256)
...and much more.
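Why check with more than one algorithm? A rough sketch of the principle, using only hashes available in Python's standard library (CRC-32, SHA-1, SHA-256, SHA3-256). The helper names `fingerprint` and `same_content` are made up for this example, XXHASH64 is left out because it needs a third-party module, and this says nothing about how zpaqfranz organizes its own checks internally.

```python
import hashlib
import zlib


def fingerprint(data: bytes) -> dict:
    """Return several independent checksums of the same bytes."""
    return {
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha3_256": hashlib.sha3_256(data).hexdigest(),
    }


def same_content(a: bytes, b: bytes) -> bool:
    """Treat two byte strings as identical only if every checksum agrees."""
    return fingerprint(a) == fingerprint(b)
```

A collision or a silent bug in one algorithm would have to be matched by all the others at the same time to go unnoticed, which is the point of a multi-hash check.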

No complex (and fragile) repository folders with hundreds of "whatever": just a single file!

Windows client? Minimum size (without software) VSS backups

It is often important to copy the %desktop% folder, Thunderbird's data, %download% and generally the data folders of a Windows system, leaving out the programs

Real speed (encrypted) update of C: without software (-frugal)

vss.mp4

Are you a really paranoid Windows user (like me)? You can get sector-level copies of C:, too.

In this case the space used is obviously larger, as is the execution time, but even the "most difficult" folders are included. The bitmap of occupied clusters is deliberately ignored: if you are paranoid, be paranoid all the way down!

It is just like a dd. You can't (for now) restore with zpaqfranz. You have to extract to a temporary folder and then use other software (e.g., 7z, OSFMount) to extract the files directly from the image.

Accelerated speed (encrypted) every-sector update of a 256GB C: @ ~150MB/s

dd.mp4

To date, there is no software, free or paid, that matches these characteristics

AFAIK of course
10+ years of development (2009-now).

Who did that?

One of the world's leading scientists in compression.

No, not me, but this guy ZPAQ - Wikipedia

When?

From 2009 to 2016.

Where?

On a Russian compression forum, one of the most famous, but obviously super-niche

Why is it not as well known as 7z or RAR, despite being enormously superior?

Because of the lack of users who ... try it!

Who are you?

A user (and a developer) who has proposed and made various improvements that have been implemented over the years. When the author left the project, I made my fork to add the functions I need as a data storage manager.

Why is it no longer developed? Why should I use your fork?

Because Dr. Mahoney is now retired and no longer supports it (he... runs!)

Why should I trust it? It will be one of 1000 other programs that silently fail and cause problems

As the Russians (and Italians) say, trust me, but check.

Archiving data requires safety. How can I be sure that I can then extract them without problems?

It is precisely the portion of the program that I have evolved, implementing a barrage of controls up to the paranoid level, and more. Let's say there are verification mechanisms which you have probably never seen. Do you want to use SHA-2/SHA-3 to be very confident? You can.

Accelerated speed of real world testing of archive >1GB/s

test.mp4

ZPAQ (zpaqfranz) lets you NEVER delete stored data, which stays available forever (in practice you typically start from scratch every 1,000 or 2,000 versions on HDD, for speed reasons; 10K+ on SSD), and restore the files as they were at each archived version, whether a month or three years ago.

Real-speed updating (on QNAP NAS) of a small server (300GB); ~7GB of Thunderbird mbox become ~6MB (!) in ~4 minutes.

update-nas.mp4

In this "real world" example (a ~500.000 files / ~500GB file server of a mid-sized enterprise), you will see 1042 "snapshots", stored in 877GB.

root@f-server:/copia1/copiepaq/spaz2020 # zpaqfranz i fserver_condivisioni.zpaq
zpaqfranz v51.27-experimental snapshot archiver, compiled May 26 2021
fserver_condivisioni.zpaq:
1042 versions, 1.538.727 files, 15.716.105 fragments, 877.457.003.477 bytes (817.20 GB)
Long filenames (>255)     4.526

Version(s) enumerator
-------------------------------------------------------------------------
< Ver  > <  date  > < time >  < added > <removed>    <    bytes added   >
-------------------------------------------------------------------------
00000001 2018-01-09 16:56:02  +00308608 -00000000 ->      229.882.913.501
00000002 2018-01-09 18:06:28  +00007039 -00000340 ->           47.356.864
00000003 2018-01-10 15:06:25  +00007731 -00000159 ->            7.314.709
00000004 2018-01-10 15:17:44  +00007006 -00000000 ->              612.584
00000005 2018-01-10 15:47:03  +00007005 -00000000 ->              611.980
00000006 2018-01-10 18:03:08  +00008135 -00000829 ->        2.698.417.427
(...)
00000011 2018-01-10 19:20:30  +00007007 -00000000 ->              613.273
00000012 2018-01-11 07:00:36  +00007008 -00000000 ->              613.877
(...)
00000146 2018-03-27 17:08:39  +00001105 -00000541 ->          164.399.767
00000147 2018-03-28 17:08:28  +00000422 -00000134 ->          277.237.055
00000148 2018-03-29 17:12:02  +00011953 -00011515 ->          826.218.948
(...)
00001039 2021-05-02 17:17:42  +00030599 -00031135 ->       12.657.155.316
00001040 2021-05-03 17:14:03  +00000960 -00000095 ->          398.358.496
00001041 2021-05-04 17:13:40  +00000605 -00000004 ->           95.909.988
00001042 2021-05-05 17:15:13  +00000579 -00000008 ->           82.487.415

54.799 seconds (all OK)

Do you want to restore @ 2018-03-28?

00000147 2018-03-28 17:08:28  +00000422 -00000134 ->          277.237.055

Version 147 =>

zpaqfranz x ... -until 147

Do you want 2021-03-05? Version 984 =>

zpaqfranz x ... -until 984
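Finding the right version number for a given date can be automated from the listing. A small sketch, assuming the listing keeps the layout shown above (version, date, time, counters) and is in ascending order; `version_at` is a hypothetical helper name, not a zpaqfranz function.

```python
from datetime import datetime

STAMP = "%Y-%m-%d %H:%M:%S"


def version_at(listing: str, when: str) -> int:
    """Return the last version whose timestamp is at or before `when`."""
    limit = datetime.strptime(when, STAMP)
    chosen = None
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip headers, rulers and "(...)" gaps
        try:
            stamp = datetime.strptime(fields[1] + " " + fields[2], STAMP)
        except ValueError:
            continue  # a line starting with a number that is not a version row
        if stamp <= limit:
            chosen = int(fields[0])
    if chosen is None:
        raise ValueError("no version exists at or before " + when)
    return chosen


sample = """\
00000146 2018-03-27 17:08:39  +00001105 -00000541 ->          164.399.767
00000147 2018-03-28 17:08:28  +00000422 -00000134 ->          277.237.055
00000148 2018-03-29 17:12:02  +00011953 -00011515 ->          826.218.948
"""
```

With the three sample rows, `version_at(sample, "2018-03-28 23:59:59")` picks version 147, which is the number to pass to `-until`.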

Another real world example: 4900 versions, from mid-2017

zpaqfranz v51.10-experimental journaling archiver, compiled Apr  5 2021
franz:use comment
old_aserver.zpaq:

4904 versions, 385.830 files, 3.515.679 fragments, 199.406.200.193 bytes (185.71 GB)

Version comments enumerator
------------
00000001 2017-08-16 19:26:15  +00090863 -00000000 ->       79.321.339.869
00000002 2017-08-17 13:29:25  +00000026 -00000000 ->              629.055
00000003 2017-08-17 13:30:41  +00000005 -00000000 ->               18.103
00000004 2017-08-17 14:34:12  +00000005 -00000000 ->               18.149
00000005 2017-08-17 15:28:42  +00000008 -00000000 ->               99.062
00000006 2017-08-17 19:30:03  +00000008 -00000000 ->            1.013.616
00000007 2017-08-18 19:33:14  +00000021 -00000001 ->            2.556.335
00000008 2017-08-19 19:29:23  +00000025 -00000000 ->            1.377.082
00000009 2017-08-20 19:29:56  +00000002 -00000000 ->               24.153
00000010 2017-08-21 19:34:35  +00000031 -00000000 ->            2.554.582
(...)
00004890 2021-02-16 16:40:51  +00000190 -00000005 ->           99.051.540
00004891 2021-02-16 19:30:17  +00000065 -00000006 ->           16.467.364
00004892 2021-02-17 19:34:04  +00000381 -00000257 ->           95.354.305
(...)
00004900 2021-02-25 19:35:47  +00000755 -00000611 ->          132.241.557
00004901 2021-02-26 19:57:16  +00000406 -00000253 ->          122.669.868
00004902 2021-02-27 20:33:45  +00000029 -00000002 ->           12.677.932
00004903 2021-02-28 20:34:00  +00000027 -00000001 ->            6.978.088
00004904 2021-03-01 20:33:52  +00000174 -00000019 ->           77.113.147

until 2021 (4 years later)

This is a ~200GB server

(...)
- 2019-09-23 10:14:44       2.943.578.106  0666 /tank/mboxstorico/inviata_spazzatura__2017_2018
- 2021-02-18 10:16:25           4.119.172  0666 /tank/mboxstorico/inviata_spazzatura__2017_2018.msf
- 2019-10-25 15:39:15       1.574.715.392  0666 /tank/mboxstorico/nstmp
- 2020-11-28 20:33:22           2.038.165  0666 /tank/mboxstorico/nstmp.msf
- 2021-02-25 17:48:11               8.802  0644 /tank/mboxstorico/sha1.txt

214.379.664.412 (199.66 GB) of 214.379.664.412 (199.66 GB) in 154.975 files shown

so for 4900 versions you need 200GB*4900 = ~980TB with something like tar, 7z, RAR etc (yes, 980 terabytes), versus ~200GB (yes, 200GB) with zpaq.
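The arithmetic behind that comparison, spelled out. It assumes (as the sentence above does) a constant ~200GB server and one full, non-deduplicated archive per version; the function name is made up for the example.

```python
def full_copies_tb(server_gb: float, versions: int) -> float:
    """Space in (decimal) TB if every version were kept as a full, separate archive."""
    return server_gb * versions / 1000


naive_tb = full_copies_tb(200, 4900)        # 200GB * 4900 runs, in TB
archive_gb = 199_406_200_193 / 1e9          # size of old_aserver.zpaq from the listing
```

`naive_tb` comes out at 980 TB, against an actual archive of roughly 199.4 GB (decimal) holding all 4904 versions.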

The same goes for virtual machines (vmdks)

Why do you say it is unique? We've got hashbackup (hb), borg, restic, bupstash etc.

Because other software (sometimes very, very good) runs on complex "repositories", very fragile and way too hard to manage (at least for my tastes).
It may happen that you have to worry about backing up ... the backup, because maybe some files were lost during a transfer, corrupted etc.
If it's simple, maybe it will work

Too good to be true?

Obviously this is not "magic", it is simply the "chaining" of a block deduplicator with a compressor and an archiver. There are faster compressors. There are better compressors. There are faster archivers. There are more efficient deduplicators.

But what I have never found is a combination of these that is so simple to use and reliable, with excellent handling of non-Latin filenames (Chinese, Russian etc).

This is the key: you don't have to use a complex "pipe" of tar | srep | zstd | something, hoping that everything will run fine, but a single ~4MB executable, with 7z-like commands.
You don't even have to install a complex program with many dependencies that will have to read a folder (the repository) with maybe thousands of files, hoping that they are all fully functional.

There are also many great features for backup, I mention only the greatest.
The ZPAQ file is append-only: data is only ever added, and what is already there is never modified

So rsync --append will copy only the portion actually added, for example over an ssh tunnel to a remote server or to a local NAS (QNAP etc.), in very little time.
TRANSLATION
You can pay ~$4 a month for 1TB cloud-storage-space to store just about everything

You don't have to copy or synchronize let's say 700GB of tar.gz,7z or whatever, but only (say) the 2GB added in the last copy, the first 698GB are untouched.
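What "copy only the added portion" means in practice can be sketched locally. This is not rsync itself, only an illustration of the append-only principle with plain files; `append_sync` is a hypothetical helper, and it assumes the destination's existing bytes are already identical to the start of the source.

```python
import os


def append_sync(source: str, destination: str, block: int = 1 << 20) -> int:
    """Copy only the bytes of `source` that lie beyond the current size of `destination`.

    Safe only for files that grow by appending and are never rewritten.
    Returns the number of bytes transferred.
    """
    done = os.path.getsize(destination) if os.path.exists(destination) else 0
    sent = 0
    with open(source, "rb") as src, open(destination, "ab") as dst:
        src.seek(done)                 # skip everything the destination already has
        while True:
            chunk = src.read(block)
            if not chunk:
                break
            dst.write(chunk)
            sent += len(chunk)
    return sent
```

Called after each backup run, it transfers the whole archive the first time and only the newly appended tail afterwards.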

This opens up the concrete possibility of using VDSL connections (upload ~2-4 MB/s) to back up even virtual servers of hundreds of gigabytes in a few minutes.
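A back-of-the-envelope check of those upload times, using the example figures from the text (2GB added, 700GB total) and treating 1GB as 1000MB. The function name is an assumption of this sketch.

```python
def transfer_minutes(size_mb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to upload `size_mb` megabytes at a steady rate."""
    return size_mb / rate_mb_per_s / 60


slow_link = transfer_minutes(2_000, 2)      # daily 2GB delta at 2 MB/s
fast_link = transfer_minutes(2_000, 4)      # daily 2GB delta at 4 MB/s
full_copy = transfer_minutes(700_000, 4)    # re-sending all 700GB at 4 MB/s
```

The daily delta takes on the order of 8 to 17 minutes, while re-uploading the full 700GB at the same speed would take about two days.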

In this (accelerated) video the rsync transfer of 2 remote backups: "standard" .zpaq archive (file level) AND zfsbackup (bit-level) for a small real-world server 1 day-update of work

download.mp4

Bonus: for a developer it's just like a "super-git-versioning"

Just put a zpaq-save-everything step at the top of the makefile and you will keep all the versions of your software, including libraries, SQL dumps etc. A single archive keeps everything, forever, with just one command (or two, to verify).

Defects?

Some.

The main one is that the listing of files is not very fast, when there are many versions (thousands), due to the structure of the archiver-file-format. I could get rid of it, but at the cost of breaking the backward compatibility of the file format, so I don't want to. On 52+ there is a workaround (-filelist)

It is not the fastest tool out there, with real world performance of 80-200MB/s (depending on the case and HW of course). Not a big deal for me (I have very powerful HW, and/or run nightly cron-tasks)

Extraction can require a number of seeks (due to various deduplicated blocks), which can slow down extraction on magnetic disks (but not on SSDs).
If you have plenty of RAM, it is now possible to bypass this with the w command

No other significant ones come to mind, except that it is known and used by few

Very hard to use?
It is a tool for power users and administrators, who are used to the command line. A text-based GUI is being developed to make data selection and complex extraction easier (!).

In this example we want to extract all the .cpp files as .bak from the 1.zpaq archive. This is something you typically cannot do with other archives such as tar, 7z, rar etc.

With a "sort of" WYSIWYG 'composer'

First f key (find) and entering .cpp
Then s (search) every .cpp substring
Then r (replace) with .bak
Then t (to) for the z:\example folder
Finally x to run the extraction
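What those five keystrokes compose is, in effect, a filter plus a rename plus a target folder. A minimal sketch of that intent follows; `plan_extraction` and the sample file names are invented for illustration, and it does not claim to reproduce how the GUI builds its command internally.

```python
def plan_extraction(names, find, search, replace, to):
    """Mirror the f / s / r / t steps: filter, rewrite, and prefix archived names."""
    plan = {}
    for name in names:
        if find not in name:
            continue                                  # f: keep only matching names
        renamed = name.replace(search, replace)       # s + r: substring substitution
        plan[name] = to.rstrip("/\\") + "/" + renamed.lstrip("/\\")   # t: target folder
    return plan


archived = ["src/main.cpp", "src/util.cpp", "README.md"]
plan = plan_extraction(archived, ".cpp", ".cpp", ".bak", "z:\\example")
```

`plan` maps each archived .cpp file to its .bak name under z:\example, and README.md is left out.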

gui.mp4

I do not trust you, but I am becoming curious. So?

On FreeBSD you can try to build the port (of paq, inside archivers) but it is very, very, very old (v 6.57 of 2014)
You can get a "not too old" zpaqfranz with a pkg install zpaqfranz

On OpenBSD pkg_add zpaqfranz is usually fairly up to date

On Debian there is a zpaq 7.15 package
You can download the original version (7.15 of 2016) directly from the author's website, and compile it, or get the same from github.
In this case be careful, because the source is divided into 3 source files, but nothing difficult for the compilation.

OK, let's assume I want to try out zpaqfranz. How?

From branch 51 all source code is merged in one zpaqfranz.cpp aiming to make it as easy as possible to compile on "strange" systems (NAS, vSphere etc).
Updating, compilation and Makefile are now trivial.

How to build

My main development platforms are INTEL Windows (non-Intel Windows (arm) currently unsupported) and FreeBSD.

I rarely use Linux or MacOS or whatever (for compiling), so fixing may be needed.

As explained the program is single file, be careful to link the pthread library.
You need it for ESXi too, even if it doesn't work. Don't be afraid, zpaqfranz knows!

zpaqfranz's People

Contributors

aleksandrmelnikov, dertuxmalwieder, fcorbelli, justinormont, omar-polo, tansy, zhongruoyu

zpaqfranz's Issues

macOS build failure for v57.5 (Homebrew)

Hello 👋 . I'm a maintainer for the Homebrew project. While packaging v57.5 we are encountering build failures. The relevant GitHub Actions run can be found here: https://github.com/Homebrew/homebrew-core/actions/runs/4557146919/jobs/8038523624

==> make install -f NONWINDOWS/Makefile BINDIR=/opt/homebrew/Cellar/zpaqfranz/57.5/bin/zpaqfranz
  objc[41729]: Class AMSupportURLConnectionDelegate is implemented in both /usr/lib/libauthinstall.dylib (0x1fbffb480) and /System/Library/PrivateFrameworks/MobileDevice.framework/Versions/A/MobileDevice (0x110e902b8). One of the two will be used. Which one is undefined.
  objc[41729]: Class AMSupportURLSession is implemented in both /usr/lib/libauthinstall.dylib (0x1fbffb4d0) and /System/Library/PrivateFrameworks/MobileDevice.framework/Versions/A/MobileDevice (0x110e90308). One of the two will be used. Which one is undefined.
  clang  -DNOJIT -O3 -Dunix zpaqfranz.cpp -o zpaqfranz -pthread -lstdc++ -lm
  zpaqfranz.cpp:37744:128: error: a space is required between consecutive right angle brackets (use '> >')
          int64_t         franzparallelhashfiles(string i_hashtype,int64_t i_totalsize,vector<string> i_thefiles,vector<std::pair<string,string>>& o_hashname);
                                                                                                                                               ^~
                                                                                                                                               > >
  zpaqfranz.cpp:64457:34: error: a space is required between consecutive right angle brackets (use '> >')
          vector<std::pair<uint64_t,string>> bysize;
                                          ^~
                                          > >
  zpaqfranz.cpp:64493:37: error: a space is required between consecutive right angle brackets (use '> >')
                                  vector<std::pair<uint64_t,string>>::iterator basso      = std::lower_bound(bysize.begin(), bysize.end(), p->second.size,mycompareuint64string());
                                                                  ^~
                                                                  > >
  zpaqfranz.cpp:64502:38: error: a space is required between consecutive right angle brackets (use '> >')
                                          vector<std::pair<uint64_t,string>>::iterator alto       = std::upper_bound(bysize.begin(), bysize.end(), p->second.size,mycompareuint64string());
                                                                          ^~
                                                                          > >
  zpaqfranz.cpp:64503:43: error: a space is required between consecutive right angle brackets (use '> >')
                                          for (vector<std::pair<uint64_t,string>>::iterator a=basso; a!=alto; a++)
                                                                               ^~
                                                                               > >
  zpaqfranz.cpp:64529:32: error: a space is required between consecutive right angle brackets (use '> >')
          vector<std::pair<string,string>> tobehashedsource_hashname;
                                        ^~
                                        > >
  zpaqfranz.cpp:64544:32: error: a space is required between consecutive right angle brackets (use '> >')
          vector<std::pair<string,string>> tobehasheddest_hashname;
                                        ^~
                                        > >
  zpaqfranz.cpp:64575:33: error: a space is required between consecutive right angle brackets (use '> >')
                  vector<std::pair<string,string>>::iterator basso        = std::lower_bound(tobehasheddest_hashname.begin(), tobehasheddest_hashname.end(), cerco,mycomparestringstring());
                                                ^~
                                                > >
  zpaqfranz.cpp:64578:34: error: a space is required between consecutive right angle brackets (use '> >')
                          vector<std::pair<string,string>>::iterator alto                 = std::upper_bound(tobehasheddest_hashname.begin(), tobehasheddest_hashname.end(), cerco,mycomparestringstring());
                                                        ^~
                                                        > >
  zpaqfranz.cpp:64580:39: error: a space is required between consecutive right angle brackets (use '> >')
                          for (vector<std::pair<string,string>>::iterator a=basso; a<alto; a++)
                                                             ^~
                                                             > >
  zpaqfranz.cpp:64606:33: error: a space is required between consecutive right angle brackets (use '> >')
                  vector<std::pair<string,string>>::iterator basso        = std::lower_bound(tobehasheddest_hashname.begin(), tobehasheddest_hashname.end(), cerco,mycomparestringstring());
                                                ^~
                                                > >
  zpaqfranz.cpp:64609:34: error: a space is required between consecutive right angle brackets (use '> >')
                          vector<std::pair<string,string>>::iterator alto                 = std::upper_bound(tobehasheddest_hashname.begin(), tobehasheddest_hashname.end(), cerco,mycomparestringstring());
                                                        ^~
                                                        > >
  zpaqfranz.cpp:64613:39: error: a space is required between consecutive right angle brackets (use '> >')
                          for (vector<std::pair<string,string>>::iterator a=basso; a<alto; a++)
                                                             ^~
                                                             > >
  zpaqfranz.cpp:64712:133: error: a space is required between consecutive right angle brackets (use '> >')
  int64_t Jidac::franzparallelhashfiles(string i_hashtype,int64_t i_totalsize,vector<string> i_thefiles,vector<std::pair<string,string>>& o_hashname)
                                                                                                                                      ^~
                                                                                                                                      > >
  14 errors generated.
  make: *** [zpaqfranz] Error 1

Relates to Homebrew/homebrew-core#126991

Code style (questions?)

From the source code:

From branch 51 all source code merged in one zpaqfranz.cpp
aiming to make it as easy as possible to compile on "strange" systems (NAS, vSphere etc).

Does that really help? (Because it makes reading the code much harder.)

The source is composed of the fusion of different software
from different authors, therefore there is no uniform style of programming.

Wouldn't it make sense to run a tool like astyle over it to have a coherent indentation style, at least? (That would not violate any licenses.)

Zpaq 7.15 (2016) is 5.62% faster than Zpaqfranz v54.9-experimental (2021) (can u improve it?)

zpaq64

zpaq v7.15 journaling archiver, compiled Aug 17 2016
Creating util-zpaq64.zpaq at offset 0 + 0
Adding 15.834624 MB in 1 files -method 56 -threads 4 at 2021-12-07 23:44:21.
100.00% 0:00:00 + Util.tar 15834624
100.00% 0:00:00 [1..235] 15835572 -method 56,18,0
1 +added, 0 -removed.

0.000000 + (15.834624 -> 15.834624 -> 14.094668) = 14.094668 MB
109.547 seconds (all OK)

zpaqfranz

zpaqfranz v54.9-experimental (HW BLAKE3), SFX64 v52.15, compiled Nov 4 2021
Integrity check type: XXHASH64+CRC-32 + CRC-32 by fragments
Creating util-zpaqfranz.zpaq at offset 0 + 0
Adding 15.834.624 (15.10 MB) in 1 files at 2021-12-07 23:46:10

1 +added, 0 -removed.

0 + (15.834.624 -> 15.834.624 -> 14.094.701) = 14.094.701

115.703 seconds (000:01:55) (all OK)
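For reference, the figure in the title can be recomputed from the two timings above. The helper name is made up, and this is a single run on a ~15MB file, so it is only a rough indication.

```python
def percent_slower(reference_s: float, candidate_s: float) -> float:
    """Extra time of `candidate_s` relative to `reference_s`, in percent."""
    return (candidate_s - reference_s) / reference_s * 100


extra_time = percent_slower(109.547, 115.703)          # zpaqfranz vs zpaq 7.15
time_saved = (115.703 - 109.547) / 115.703 * 100       # zpaq 7.15 vs zpaqfranz
```

zpaqfranz takes about 5.62% longer than zpaq 7.15 here; seen the other way, zpaq 7.15 needs about 5.32% less time.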

Multipart file corruption recovery

Say I'm using multipart files and inadvertently delete a version.
What's the supposed route to recover? Operations on the file give errors:

❯ zpaqfranz l 'mainbackup_????.zpaq' -index mainbackup_0000.zpaq -filelist
zpaqfranz v55.10f archiver,  compiled Jan  1 1980
franz:-flagfilelist
11896:mainbackup_0001.zpaq: No such file or directory
23013: zpaqfranz error: mainbackup_0001.zpaq

0.039 seconds (00:00:00)  (with errors)

How should I proceed to remove the dependency on that version?

When substituting the missing file with the index, I can list but not extract files:

❯ zpaqfranz l 'mainbackup_????.zpaq' -index mainbackup_0000.zpaq -filelist
zpaqfranz v55.10f-experimental-5078cf5dcd0b04c7f362a9ee406d077c77732d02 archiver,  compiled Jan  1 1980
franz:-flagfilelist
mainbackup_????.zpaq:
Unordered fragment tables: expected >= 7 found 3
Unordered fragment tables: expected >= 7 found 5
5 versions, 15 files, 6 fragments, 15 blocks, 5.168 bytes (5.05 KB)

- 2022-09-04 15:46:26                 217  0644 VFILE-l-filelist.txt
- 2022-09-04 15:46:25                   0 d0755 bkp/
- 2022-09-04 15:45:22                   3  0644 bkp/01
- 2022-09-04 15:46:18                   3  0644 bkp/02
- 2022-09-04 15:46:25                   3  0644 bkp/03

                  226 (226.00 B) of 226 (226.00 B) in 5 files shown
                6.695 compressed
Note: 1.027 of 2.554 compressed bytes are in archive

0.001 seconds (00:00:00)  (all OK)
❯ zpaqfranz x 'mainbackup_????.zpaq' -to extraction/
zpaqfranz v55.10f-experimental-5078cf5dcd0b04c7f362a9ee406d077c77732d02 archiver,  compiled Jan  1 1980
mainbackup_????.zpaq:
Unordered fragment tables: expected >= 7 found 3
Unordered fragment tables: expected >= 7 found 5
5 versions, 15 files, 6 fragments, 15 blocks, 5.168 bytes (5.05 KB)
23013: zpaqfranz error: unordered frags

0.001 seconds (00:00:00)  (with errors)

❯ ls extraction/

❯ 

Also, how do I recreate an index file for such use?

zpaq gui numkey issue

Hi there.
First of all I need to say that I simply love the new GUI, even if it is still not complete! What a beautiful idea!
While playing with it, I found a little issue that I want to report.

Under the filter/search/replace/extract dialog, these two keys from the numpad don't work: / and *
And these other two keys have strange effects: + and -
Actually the + and - keys are bound to another function, but what if I need to insert these symbols as part of a filename? Like "zpaq+.txt" or "zpaq-advanced.txt"

Thank you!

Compiling version 52.15 on Fedora-34

The build completed with some warnings.

I will archive the same data set (as done with zpaq v7.15 yesterday) with this compile version and compare the results and update here. (using defaults)

./zpaqfranz --help
zpaqfranz v52.15-experimental snapshot archiver, compiled Aug  7 2021
Usage: zpaqfranz command archive[.zpaq] files|directory... -switches...
Multi-part archive name               | ????? => "test_?????.zpaq"
             a: Append files          |          t: Fast test
             x: Extract versions      |          l: List files
 l d0 d1 d2...: Compare (test) content|          i: how archive's versions
                                   Various
 c d0 d1 d2...: Compare d0 to d1,d2...| s d0 d1 d2: cumulative size of d0,d1,d2
 r d0 d1 d2...: Mirror  d0 in d1...   |       d d0: deduplicate d0 WITHOUT MERCY
 z d0 d1 d2...: Delete empty dirs     |        m X: merge multipart archive
          f d0: Fill (wipe) free space|     utf d0: Detox filenames in d0
 sum  d0 d1...: Hashing/deduplication |     dir d0: Win dir (/s /a /os /od)
 n d0 -n X    : Keep X files in d0    |          b: Benchmarking
 zfsadd zfslist zfspurge              =>  zfs-specific commands (typically FreBSD)
                                Main switches
      -all [N]: All versions N digit  |     -key X: archive password X
 -mN -method N: 0..5= faster..better  |     -force: always overwrite
         -test: Verify (extract/add)  |      -kill: allow destructive operations
    -to out...: Prefix files to out   |   -until N: Roll back to N'th version
 -h -? (param): Long help    -examples (param): common examples   -diff: against 7.15
Help    param : a b c d dir f i k l m n r s sum t utf x z zfs franz main normal voodoo 
Usage full:
CMD   a (add)
DIFF:               zpaqfranz store CRC-32/XXH of each file, detecting SHA-1 collisions,
                    while zpaq cannot by design. Can be disabled by -crc32 or -715,
                    on modern CPU slow down ~10%.
-- More (q, Q or control C to exit) --
ls -l
total 2636
-rw-rw-r--. 1 fcc  fcc     3828 Aug  6 20:51 CHANGELOG.md
-rw-rw-r--. 1 fcc  fcc    15391 Aug  6 20:51 differences715.txt
-rw-rw-r--. 1 fcc  fcc      484 Aug  6 20:51 Makefile
-rw-rw-r--. 1 fcc  fcc    43609 Aug  6 20:51 notes.txt
-rw-rw-r--. 1 fcc  fcc    29141 Aug  6 20:51 README.md
-rw-rw-r--. 1 fcc  fcc   209486 Aug  6 20:51 zpaq206.pdf
-rwxr-xr-x. 1 root root 1197584 Aug  7 13:18 zpaqfranz
-rw-rw-r--. 1 fcc  fcc  1180666 Aug  6 20:51 zpaqfranz.cpp
g++ --version
g++ (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ g++ -O3 -Dunix zpaqfranz.cpp -pthread -o zpaqfranz
In function ‘size_t compress_parents_parallel(const uint8_t*, size_t, const uint32_t*, uint8_t, uint8_t*)’,
    inlined from ‘void compress_subtree_to_parent_node(const uint8_t*, size_t, const uint32_t*, uint64_t, uint8_t, uint8_t*)’ at zpaqfranz.cpp:9406:34,
    inlined from ‘void blake3_hasher_update.part.0(blake3_hasher*, const void*, size_t)’ at zpaqfranz.cpp:9590:38:
zpaqfranz.cpp:9294:11: warning: ‘void* memcpy(void*, const void*, size_t)’ writing 32 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
 9294 |     memcpy(&out[parents_array_len * BLAKE3_OUT_LEN],
      |     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9295 |            &child_chaining_values[2 * parents_array_len * BLAKE3_OUT_LEN],
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9296 |            BLAKE3_OUT_LEN);
      |            ~~~~~~~~~~~~~~~
zpaqfranz.cpp: In function ‘void blake3_hasher_update.part.0(blake3_hasher*, const void*, size_t)’:
zpaqfranz.cpp:9403:11: note: at offset 32 into destination object ‘out_array’ of size 32
 9403 |   uint8_t out_array[MAX_SIMD_DEGREE_OR_2 * BLAKE3_OUT_LEN / 2];
      |           ^~~~~~~~~
In function ‘void compress_subtree_to_parent_node(const uint8_t*, size_t, const uint32_t*, uint64_t, uint8_t, uint8_t*)’,
    inlined from ‘void blake3_hasher_update.part.0(blake3_hasher*, const void*, size_t)’ at zpaqfranz.cpp:9590:38:
zpaqfranz.cpp:9407:11: warning: ‘void* memcpy(void*, const void*, size_t)’ reading between 64 and 96 bytes from a region of size 32 [-Wstringop-overread]
 9407 |     memcpy(cv_array, out_array, num_cvs * BLAKE3_OUT_LEN);
      |     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zpaqfranz.cpp: In function ‘void blake3_hasher_update.part.0(blake3_hasher*, const void*, size_t)’:
zpaqfranz.cpp:9403:11: note: source object ‘out_array’ of size 32
 9403 |   uint8_t out_array[MAX_SIMD_DEGREE_OR_2 * BLAKE3_OUT_LEN / 2];

zpaq over ssh?

Hello, first of all this is an awesome fork.
I am looking for a way to create a zpaq archive on my local PC containing folders from remote PCs. (I don't want to create archives on the remote PCs and download them, since that would take a long time and I cannot guarantee the remote disks have enough space.)
Is there a way I can target a remote SSH/PowerShell directory with zpaqfranz?

Or is there a better recommended way to do it?

Thanks

Slow dedup speed

A major limitation of the original ZPAQ is its slow dedup speed. If I remember correctly, it is single-threaded. For example, the maximum dedup speed on a Core i7-6700 is about 120MiB/s. This has become a major bottleneck in the era of SSD/NVMe.
I'm trying to run a daily backup of 10TB, but due to the slow dedup speed the backup is never able to complete within 24h.

Is there an easy way to make the dedup algorithm faster, or are there any other workarounds?
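
For scale, the figures quoted above can be checked with a little arithmetic. The sketch below assumes that "10TB" means 10 TiB and that the quoted 120MiB/s is sustained for the whole run; the function names are illustrative only and are not part of zpaq or zpaqfranz.

```cpp
#include <cassert>

// Assumption: sizes are in TiB and rates in MiB/s (binary units).
constexpr double kMiBPerTiB = 1024.0 * 1024.0;

// Hours needed to process `tib` TiB at a sustained `mib_per_s` MiB/s.
double hours_needed(double tib, double mib_per_s) {
    assert(mib_per_s > 0.0);
    return tib * kMiBPerTiB / mib_per_s / 3600.0;
}

// Sustained rate (MiB/s) needed to process `tib` TiB within `hours` hours.
double rate_needed(double tib, double hours) {
    assert(hours > 0.0);
    return tib * kMiBPerTiB / (hours * 3600.0);
}
```

Under those assumptions, hours_needed(10, 120) is a little over 24 hours and rate_needed(10, 24) is roughly 121 MiB/s, so a stage capped near 120 MiB/s falls just short of a 24-hour window.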

Acting like "dir" does not work

From the source code:

If the executable is named "dir" act (just about) like Windows' dir

I renamed zpaqfranz.exe to dir.exe and it still does not list directory contents...?

SFX GUI (or more interactivity)

I consider using zpaqfranz for the sfx archives of my Vim builds as it squeezes one or two MB out of the file size (I tried it with v54). Yet, the usability is limited as the .exe files don’t provide a way to (easily) set the target directory.

I know that low-level Windows I/O is a mess, but I’d like to suggest that the SFX module lets the user choose the target directory, either with a pop-up or with a terminal input.

minor typo

Currently, zpaqfranz h a displays the following:

+ : -xxh3 Very fast an dstrong, zpaqfranz default for file comparing

The space and d should be transposed in an dstrong so that it reads and strong instead.

Key differences

Dear Franco,
Could you be so kind as to lay out the key differences between your version and the original,
either on the front page or in the wiki section, so users can easily grasp your mission?

Am I using the extract command wrong?

I'm trying to use .\zpaqfranz.exe x D:\ZPAQ.zpaq D:/Restore/zpaqfranz.exe -to "D:\SW" -until 2 in PowerShell, since my archive is D:\ZPAQ.zpaq. That archive contains the file D:/Restore/zpaqfranz.exe, and I want to extract it to D:\SW. However, the output I get is

zpaqfranz v54.11-experimental (HW BLAKE3), SFX64 v52.15, compiled Dec 28 2021

D:/ZPAQ.zpaq -until 2:

2 versions, 15.396 files, 147.586 fragments, 7.943.033.988 bytes (7.40 GB)

Non-latin (UTF-8) 4

000001 ?existing files skipped (-force overwrites).

Extracting 0 bytes (0.00 B) in 0 files -threads 8

What am I doing wrong?

Issues with OneDrive

Hi.
I'm having problems archiving anything from OneDrive. The only workaround is to copy the files to another directory and archive them from there. The same problem occurs with the original zpaq.

C:\Users\akumi
λ dir %OneDrive%\test
 Volume in drive C has no label.
 Volume Serial Number is 1AEE-DAA3

 Directory of C:\Users\akumi\OneDrive\test

2022-11-10  14:37    <DIR>          .
2022-11-10  14:37    <DIR>          ..
2022-10-10  02:10             7 690 _R9A8517.xmp
2022-01-07  11:14        45 299 798 _R9A8518.CR3
               2 File(s)     45 307 488 bytes
               2 Dir(s)  1 450 850 488 320 bytes free

C:\Users\akumi
λ xcopy %OneDrive%\test z:
C:\Users\akumi\OneDrive\test\_R9A8517.xmp
C:\Users\akumi\OneDrive\test\_R9A8518.CR3
2 File(s) copied

C:\Users\akumi
λ zpaqfranzhw.exe a z:\test.zpaq %OneDrive%\test
zpaqfranz v55.15q-experimental-JIT-L (HW BLAKE3,SHA1), SFX64 v55.1, (Sep 19 2022)

QUIT: total size,file/folder count == zero. Already archived/wrong/inaccessible source?

0.016 seconds (00:00:00)  (with warnings)

C:\Users\akumi
λ zpaqfranzhw.exe a z:\test.zpaq "%OneDrive%\test"
zpaqfranz v55.15q-experimental-JIT-L (HW BLAKE3,SHA1), SFX64 v55.1, (Sep 19 2022)

QUIT: total size,file/folder count == zero. Already archived/wrong/inaccessible source?

0.016 seconds (00:00:00)  (with warnings)

Same thing happens if I access OneDrive directly (c:\Users\akumi\OneDrive).

Windows: error shown with symlinked files

First of all, thank you for maintaining this excellent tool!

Here is a small issue with zpaqfranz v54.12:
when packing symlinked files under NTFS I get

"38385: WARN expected 0 getted for ".

Later on, this results in
"38721: HOUSTON something seems wrong: expected , done + "
"..."
"... almost certainly incomplete"
(that is definitely wrong, the archive is fine)

Kind regards

Jolly in archivename and bad frag IDs

Hello Mr. Corbelli,

I managed to create a simple delta / patch system using Zpaqfranz but I encountered some problems.

The purpose is to create a delta file from the wim archives below captured with wimlib :

Source file, the base of the delta :

Hp-ProBook-850-G8-Win10-2022-H2-SysPrep.wim , size 10.881.277.996 bytes

This is the new file, the delta will be created from this one :

Hp-450-G7-Win10-2022-H2-Sysprep.wim, size 11.122.396.929 bytes

Processing the base file :

D:\Tools>zpaqfranz.exe a Archive?.zpaq Hp-ProBook-850-G8-Win10-2022-H2-SysPrep.wim -index archive-index.zpaq
zpaqfranz v57.3f-JIT-L (HW BLAKE3),SFX64 v55.1, (15 Feb 2023)
franz:-index                     archive-index.zpaq
Creating Archive1.zpaq at offset 0 + 0
Adding 10.881.277.996 (10.13 GB) in 1 files (0 dirs), 32 threads @ 2023-03-06 23:25:11
1 +added, 0 -removed.

0 + (10.881.277.996 -> 10.639.858.136 -> 10.616.381.783) = 10.616.381.783 @ 80.13 MB/s

129.750 seconds (000:02:09) (all OK)

The new Archive2.zpaq file, which is smaller than the first one, will be the delta / patch file:

D:\Tools>zpaqfranz.exe a Archive?.zpaq Hp-450-G7-Win10-2022-H2-Sysprep.wim -index archive-index.zpaq
zpaqfranz v57.3f-JIT-L (HW BLAKE3),SFX64 v55.1, (15 Feb 2023)
franz:-index                     archive-index.zpaq
archive-index.zpaq:
1 versions, 1 files, 151.245 frags, 639 blks, 4.303.304 bytes (4.10 MB)
Creating Archive2.zpaq at offset 0 + 10.616.381.783
Adding 11.122.396.929 (10.36 GB) in 1 files (0 dirs), 32 threads @ 2023-03-06 23:28:10
1 +added, 0 -removed.

0 + (11.122.396.929 -> 3.594.298.431 -> 3.587.561.008) = 3.587.561.008 @ 88.28 MB/s

120.390 seconds (000:02:00) (all OK)

Archive1.zpaq, 10.616.381.783 bytes
Archive2.zpaq , 3.587.561.008 bytes

Archive2.zpaq is copied to the remote server.

Calculating the MD5 signature of the wim file to be regenerated on the remote server :

D:\Tools>md5sum.exe Hp-450-G7-Win10-2022-H2-Sysprep.wim
8d87b8003518e67c5ba0cc1f216aabd4 *Hp-450-G7-Win10-2022-H2-Sysprep.wim

On the command prompt of the remote system :

Creating the same preliminary archive from the base file, Hp-ProBook-850-G8-Win10-2022-H2-SysPrep.wim, which has already been transferred to this server:

F:\Tools>zpaqfranz.exe a Archive?.zpaq Hp-ProBook-850-G8-Win10-2022-H2-SysPrep.wim -index archive-index.zpaq
zpaqfranz v57.3f-JIT-L (HW BLAKE3),SFX64 v55.1, (15 Feb 2023)
franz:-index                       archive-index.zp
Creating Archive1.zpaq at offset 0 + 0
Adding 10.881.277.996 (10.13 GB) in 1 files (0 dirs), 2 threads @ 2023-03-07 09:55:19
1 +added, 0 -removed.

0 + (10.881.277.996 -> 10.639.858.136 -> 10.616.381.783) = 10.616.381.783 @ 70.65 MB/s

146.922 seconds (000:02:26) (all OK)

Extracting the new wim file, Hp-450-G7-Win10-2022-H2-Sysprep.wim :

F:\Tools>zpaqfranz.exe x Archive?.zpaq -only Hp-450-G7-Win10-2022-H2-Sysprep.wim
zpaqfranz v57.3f-JIT-L (HW BLAKE3),SFX64 v55.1, (15 Feb 2023)
43787: Jolly in archivename!
franz:-only                 Hp-450-G7-Win10-2022-H2-Sysprep.wim
--------------------------------------------------------------------------------------------------------------------
Searching for archive(s) in <<Archive?.zpaq>>
Founded 2 archive(s), extracting
Archive1.zpaq:
1 versions, 1 files, 151.245 frags, 639 blks, 10.616.381.783 bytes (9.89 GB)
Extracting 0 bytes (0.00 B) in 0 files (0 folders) with 2 threads

Archive2.zpaq:
1 versions, 1 files, 198.198 frags, 218 blks, 3.587.561.008 bytes (3.34 GB)
Hp-450-G7-Win10-2022-H2-Sysprep.wim: bad frag IDs, skipping...
Extracting / bytes (-1.00 B) in 1 files (0 folders) with 2 threads

0.328 seconds (00:00:00) (all OK)

Here I receive two messages: Jolly in archivename! and bad frag IDs.
What does Jolly in archivename mean?

I managed to extract Hp-450-G7-Win10-2022-H2-Sysprep.wim using Mr.Mahoney's zpaq tool :

F:\Tools>zpaq64 x Archive?.zpaq -only Hp-450-G7-Win10-2022-H2-Sysprep.wim
.
.
99.54% 0:00:00 [197505..197704] -> 16641940
99.69% 0:00:00 [197705..197924] -> 16755357
99.84% 0:00:00 [198187..198198] -> 1330347
99.85% 0:00:00 [197925..198186] -> 16726187
131.328 seconds (all OK)

The MD5 value of the file is correct and it's the same as the original one :

F:\Tools>md5sum.exe Hp-450-G7-Win10-2022-H2-Sysprep.wim
8d87b8003518e67c5ba0cc1f216aabd4 *Hp-450-G7-Win10-2022-H2-Sysprep.wim

I guess both of these issues should be reviewed, as Mr. Mahoney's zpaq does not complain about my .zpaq archives.

Thanks,

Erol

How to use this on Linux ??

I'm trying to run the code without compiling it, and I got this error:

--❯ ./zpaqfranz.cpp
./zpaqfranz.cpp: line 1: /bin: Is a directory
./zpaqfranz.cpp: line 2: __: command not found
./zpaqfranz.cpp: line 3: _____: command not found
./zpaqfranz.cpp: line 3: _: command not found
./zpaqfranz.cpp: line 4: syntax error near unexpected token `|'
./zpaqfranz.cpp: line 4: `          |_  / '_ \ / _` |/ _` | |_| '__/ _` | '_ \|_  /'

But when I try to compile it, I always get this error
(I have tried compiling it with gcc, g++ and clang, and the error is the same):

zpaqfranz.cpp: In function ‘size_t compress_parents_parallel(const uint8_t*, size_t, const uint32_t*, uint8_t, uint8_t*)’:
zpaqfranz.cpp:10763:17: warning: ‘void* memcpy(void*, const void*, size_t)’ writing 32 bytes into a region of size 8 overflows the destination [-Wstringop-overflow=]
10763 |           memcpy(&out+(parents_array_len * BLAKE3_OUT_LEN),
      |           ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10764 |            &child_chaining_values+(2 * parents_array_len * BLAKE3_OUT_LEN),
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10765 |            BLAKE3_OUT_LEN); /// FAKE COMPILER WARNING
      |            ~~~~~~~~~~~~~~~
zpaqfranz.cpp:10744:50: note: destination object ‘out’ of size 8
10744 |                                         uint8_t *out) {
      |                                         ~~~~~~~~~^~~
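
Reading only the diagnostic quoted above: the note says the destination object 'out' has size 8, which is the size of a pointer on a typical 64-bit target. That would be consistent with `&out+(...)` doing arithmetic on the address of the pointer parameter itself rather than indexing into the buffer it points to. The sketch below contrasts the two forms in isolation; the function names are hypothetical, and it is not a statement about what the zpaqfranz source intends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Address of element `offset` inside the buffer that `out` points to.
// &out[offset] is equivalent to out + offset.
uint8_t* element_address(uint8_t* out, std::size_t offset) {
    return &out[offset];
}

// &out (no index) is the address of the pointer variable itself.
// Its type is uint8_t**, and the object it designates is only
// sizeof(uint8_t*) bytes long, whatever the size of the buffer.
bool address_of_pointer_differs_from_buffer(uint8_t* out) {
    static_assert(std::is_same<decltype(&out), uint8_t**>::value,
                  "&out is a pointer to the pointer, not to the buffer");
    return static_cast<void*>(&out) != static_cast<void*>(out);
}
```

With a 64-byte buffer `buf`, element_address(buf, 32) equals buf + 32, while `&out` inside the second function never points into `buf` at all.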

cannot read all files from deduplicated volume under Windows

Hi, I'm trying to back up Hyper-V virtual machines. They are stored on ReFS and deduplicated. Neither ZPAQ nor ZPAQFRANZ can see all the files; most files are effectively invisible to ZPAQ.

I found that the problem is with the deduplication, regardless of whether the volume is ReFS or NTFS: ZPAQ cannot see files that are deduplicated.
I tested on different machines with Server 2019 and Server 2022.

Do you have any clues about this problem?

You can easily reproduce the problem as follows:

  1. Create a VHDX via Disk Management
  2. Mount it, format it
  3. Copy some files there - files that can be deduplicated
  4. Enable deduplication on that volume
  5. Run Start-DedupJob -Path -Type Optimization
  6. Wait until the job is done (check with Get-DedupJob)
  7. Try to compress: zpaqfranz a aa .

D:\vmware\PC64\pc64\Virtual Hard Disks>dir
Volume in drive D is dedup refs
Volume Serial Number is DA5A-F2BE

Directory of D:\vmware\PC64\pc64\Virtual Hard Disks

02.09.2021 г. 14:18 <DIR> .
02.09.2021 г. 14:18 <DIR> ..
31.08.2021 г. 21:26 29 699 866 624 pc64.vhdx
07.09.2021 г. 19:32 13 155 434 496 pc64_4B42846B-2C9F-49E6-A161-4959FE3B469E.avhdx
02.09.2021 г. 10:05 21 609 054 208 pc64_779B26A0-8C9F-44C7-9722-89D3EE1D2242.avhdx
3 File(s) 64 464 355 328 bytes
2 Dir(s) 575 747 760 128 bytes free

D:\vmware\PC64\pc64\Virtual Hard Disks>zpaqfranz a aa .
zpaqfranz v54.3-experimental (HW BLAKE3), SFX64 v52.15, compiled Sep 8 2021
Integrity check type: XXHASH64+CRC-32 + CRC-32 by fragments
Creating aa.zpaq at offset 0 + 0
Adding 0 (0.00 B) in 0 files at 2021-09-18 17:21:12

0 +added, 0 -removed.
deleted aa.zpaq

0 + (0 -> 0 -> 0) = 0

0.032 seconds (all OK)

D:\vmware\PC64\pc64\Virtual Hard Disks>

CRC-32 used as undocumented default

I have a job running using the following syntax:

zpaqfranz a foo.zpaq foo -m5 -verbose -xxh3 -pakka

The running job is reporting:

Integrity check type: XXH3+CRC-32

The use of CRC-32 isn't specified on the command line, and seems to occur whether or not I specify a specific xxhash or chunked format. For example, leaving off -xxh3 -pakka just results in the output line changing to:

Integrity check type: XXHASH64+CRC-32

instead, which is not what the documentation seems to define as the default either. While I can see why the default of -xxhash would default to xxhash64 on a 64-bit system, I'm not sure why CRC-32 is being calculated or why it is a default, especially on a 64-bit system where 32 bits would seem to invite collisions.

If you just want a fast default to add, why not use MD5, which (while cryptographically weak) is at least 128 bits? This seems like either an error in the documentation, an error in the defaults, or a sub-optimal choice for a fast and well-supported checksum.
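
How much 32 bits "invites collisions" depends on how the checksum is used, which the output above does not establish. The sketch below separates two cases under the idealized assumption of a uniformly distributed checksum: checking one file against its own stored value, and looking for any two files among many that share a value (the birthday bound). The function names are illustrative and nothing here describes zpaqfranz internals.

```cpp
#include <cassert>
#include <cmath>

// Chance that a random corruption of ONE file leaves an ideal
// `bits`-bit checksum unchanged: 2^-bits.
double undetected_corruption_probability(int bits) {
    assert(bits > 0 && bits < 1024);
    return std::ldexp(1.0, -bits);
}

// Birthday approximation: chance that at least two of `items`
// distinct inputs share the same ideal `bits`-bit checksum,
// 1 - exp(-items*(items-1) / 2^(bits+1)).
double birthday_collision_probability(double items, int bits) {
    assert(items >= 0.0 && bits > 0 && bits < 1024);
    const double pairs = items * (items - 1.0) / 2.0;
    return -std::expm1(-pairs * std::ldexp(1.0, -bits));
}
```

For a per-file integrity check, 32 bits gives about 2.3e-10 per corrupted file. For uniqueness across 100,000 files the same width collides with probability around 0.69, whereas a 128-bit value stays negligible, so the concern applies mainly if the checksum is used to tell files apart rather than to verify each one.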

GUI: PDCurses could be optional

Suggestion: Do not make zpaqfranz's code larger than required - provide a way to use our own *Curses installation. That would be recommended for OpenBSD (and OmniOS?) installations anyway.

-DHWSHA2 segfaults on OpenBSD

$ uname -a
OpenBSD rosaelefanten.org 7.3 GENERIC.MP#1125 amd64
$ c++ -O2 -pipe   -Dunix -DHWSHA2 -o zpaqfranz zpaqfranz.cpp -lm -lpthread

Well...

$ zpaqfranz b -debug
48149: array franz flag size 57
48150: -715                   0  <<Runs just about like zpaq 7.15>>
48150: -append                0  <<Append-only (antiransomware, slow)>>
48150: -big                   0  <<Big>>
48150: -checksum              0  <<Do checksums>>
48150: -checktxt              0  <<Checktxt (MD5)>>
48150: -comment               0  <<Comment version>>
48150: -debug                 1  <<Activate debug mode>>
48150: -debug -zero           0  <<Add files but zero-filled (debugging)>>
48150: -debug -zero -kill     0  <<Add 0-byte long file (debugging)>>
48150: -desc                  0  <<Orderby desc>>
48150: -filelist              0  <<Create a filelist .txt>>
48150: -fix255                0  <<Fix 255>>
48150: -fixeml                0  <<Fix eml filenames>>
48150: -flat                  0  <<Flat filenames>>
48150: -force                 0  <<Force>>
48150: -forcewindows          0  <<Store ADS stuff                (default: NO)>>
48150: -forcezfs              0  <<Enforce using .zfs>>
48150: -frugal                0  <<Frugal backup>>
48150: -hashdeep              0  <<Hashdeep>>
48150: -hw                    0  <<Use HW SHA1>>
48150: -kill                  0  <<Kill>>
48150: -mm                    0  <<Memory mapped>>
48150: -noattributes          0  <<Nessun attributo>>
48150: -nodedup               0  <<Turn off deduplicator>>
48150: -noeta                 0  <<Do not show ETA>>
48150: -nopath                0  <<Do not store path>>
48150: -noqnap                0  <<Exclude QNAP snap & trash>>
48150: -norecursion           0  <<Do not recurse into folders (default: YES)>>
48150: -nosort                0  <<Do not sort file>>
48150: -pakka                 0  <<New output>>
48150: -paranoid              0  <<Paranoid>>
48150: -quiet                 0  <<Do not show filesystem errors>>
BROKEN-sparc64 = SIGBUS due to unaligned access when running tests
48150: -ramdisk               0  <<ramdisk>>
48150: -rename                0  <<Rename>>
48150: -sfxall                0  <<Sfx all>>
48150: -sfxforce              0  <<Sfx force>>
48150: -silent                0  <<Silent>>
48150: -space                 0  <<Do not check space/writeability>>
48150: -ssd                   0  <<SSD>>
48150: -stat                  0  <<Statistics>>
48150: -stdin                 0  <<stdin>>
48150: -stdout                0  <<stdout>>
48150: -store                 0  <<Store mode: no deduplication, no compression>>
48150: -tar                   0  <<TAR mode (store posix)>>
48150: -test                  0  <<Only do test>>
48150: -touch                 0  <<Force 'touch' on date (7.15 to zpaqfranz)>>
48150: -utc                   0  <<Use UTC time>>
48150: -utf                   0  <<UTF-8>>
48150: -verbose               0  <<Verbose output>>
48150: -verify                0  <<Verify>>
48150: -vss                   0  <<Enable Volume Shadow Copies>>
48150: -xls                   0  <<Do NOT force adding of XLS/PPT (default: NO)>>
48150: -zero                  0  <<Flag zero>>
48150: -zfs                   0  <<Do NOT ignore .zfs             (default: YES)>>
48150: /od                    0  <<Order by date>>
48150: /on                    0  <<Order by name>>
48150: /os                    0  <<Order by size>>
zpaqfranz v58.1e-JIT-L,HW SHA1/2,(2023-03-21)
FULL exename <</usr/local/pobj/zpaqfranz-58.1/zpaqfranz-58.1/zpaqfranz>>
42993: The chosen algo 3 SHA-1
1838: new ecx -380413
1843: new ebx -778088517
SSSE3 :OK
SSE41 :OK
SHA   :NO
Segmentation fault

As it does not segfault without -DHWSHA2, I presume that there is a bug hidden somewhere, probably related to "SHA: NO"?
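
One common way to avoid this class of crash is to gate hardware SHA code on a runtime CPUID check and fall back to software otherwise. The sketch below assumes GCC or Clang on x86 (it uses `__get_cpuid_count` from `<cpuid.h>`) and a hypothetical `choose_sha1_backend` dispatcher; whether zpaqfranz already performs a check like this, or where the actual fault lies, is not known from the report.

```cpp
#include <cassert>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>  // GCC/Clang wrapper around the CPUID instruction
#define SKETCH_HAVE_CPUID 1
#endif

// True when the CPU reports the SHA extensions:
// CPUID leaf 7, subleaf 0, EBX bit 29.
bool cpu_reports_sha() {
#ifdef SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return false;  // leaf 7 not supported
    }
    return ((ebx >> 29) & 1u) != 0;
#else
    return false;      // unknown platform: assume no hardware SHA
#endif
}

// Hypothetical dispatcher: use the hardware path only when it was
// compiled in AND the CPU reports support for it.
enum class Sha1Backend { Software, Hardware };

Sha1Backend choose_sha1_backend(bool built_with_hw, bool cpu_has_sha) {
    return (built_with_hw && cpu_has_sha) ? Sha1Backend::Hardware
                                          : Sha1Backend::Software;
}
```

On a machine that prints "SHA :NO", choose_sha1_backend(true, cpu_reports_sha()) would select the software path instead of executing instructions the CPU does not have.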

Request: add some real-world usage examples to the README?

Hi,

I learnt about zpaq during my search for some zfs-related issue. zpaq sounds very interesting. Can you please add some real-world examples where zpaq is superior to conventional methods (like Apple TimeMachine, git, zfs snapshots etc.)?

Cannot allocate memory on M1 using higher compression levels

The command: zpaqfranz a backup.zpaq backup fails with


Integrity check type: XXHASH64+CRC-32
all.zpaq: 
0 versions, 0 files, 0 fragments, 0 blocks, 0 bytes (0.00 B)
Updating all.zpaq at offset 0 + 0                                           
Adding 19.857.884.160 (18.49 GB) in 1 files (0 dirs), 8 threads -method 56 @ 2022-11-27 13:05:16 
(001%)   0.34% 00:01:41 (  64.04 MB)->(    0.00 B) of (  18.49 GB)   64.04 MB/sejob 1: allocx failed

Windows - Parameters are added to folder name on x?

Using the current version v56.4 on Windows Server 2022 x64

This runs fine on the first run:

zpaqfranz.exe x "D:\!XYplorer\!Archive\17\40\v17.40.zpaq" "D:/!XYplorer/!Archive/17/40/XYplorerFree 17.40.0100" -to "D:\!XYplorer\!Fresh\XYplorerFree 17.40.0100\"

Now imagine I'd like to extract again and overwrite stuff:
In this case the -force parameter isn't applied; it's just appended to the destination folder name, which becomes
XYplorerFree 17.40.0100 -force in D:\!XYplorer\!Fresh

zpaqfranz.exe x "D:\!XYplorer\!Archive\17\40\v17.40.zpaq" "D:/!XYplorer/!Archive/17/40/XYplorerFree 17.40.0100" -to "D:\!XYplorer\!Fresh\XYplorerFree 17.40.0100\" -force

Another issue? (or maybe I'm just doing it wrong...)
E.g. extract only a specific version (the archive has 29 versions):
It extracts the whole archive into XYplorerFree 17.40.0100 -until 29, not only the 29th version...

zpaqfranz.exe x "D:\!XYplorer\!Archive\17\40\v17.40.zpaq" -to "D:\!XYplorer\!Fresh\XYplorerFree 17.40.0100\" -until 29
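
A plausible cause, not confirmed in this report, is the way Windows C/C++ runtimes split the command line: a backslash immediately before a double quote escapes the quote, so a quoted path ending in `\"` does not close and swallows whatever follows. The sketch below is a simplified re-implementation of those documented rules (it ignores the special handling of the program name and of doubled quotes inside a quoted string); `split_like_msvc` is a hypothetical name and the paths in the usage note are shortened for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified sketch of the Microsoft argv splitting rules:
//  - 2n backslashes + quote   -> n backslashes, quote toggles quoting
//  - 2n+1 backslashes + quote -> n backslashes and a literal quote
//  - backslashes not followed by a quote are kept as they are
std::vector<std::string> split_like_msvc(const std::string& cmd) {
    std::vector<std::string> args;
    std::string cur;
    bool in_quotes = false, have_arg = false;
    std::size_t i = 0;
    while (i < cmd.size()) {
        const char c = cmd[i];
        if (c == '\\') {
            std::size_t n = 0;
            while (i < cmd.size() && cmd[i] == '\\') { ++n; ++i; }
            if (i < cmd.size() && cmd[i] == '"') {
                cur.append(n / 2, '\\');
                if (n % 2 == 1) { cur.push_back('"'); ++i; }  // escaped quote
            } else {
                cur.append(n, '\\');
            }
            have_arg = true;
        } else if (c == '"') {
            in_quotes = !in_quotes;  // opening or closing quote
            have_arg = true;
            ++i;
        } else if ((c == ' ' || c == '\t') && !in_quotes) {
            if (have_arg) { args.push_back(cur); cur.clear(); have_arg = false; }
            ++i;
        } else {
            cur.push_back(c);
            have_arg = true;
            ++i;
        }
    }
    if (have_arg) args.push_back(cur);
    return args;
}
```

Fed `x a.zpaq -to "D:\Fresh\X 1\" -force`, this yields four arguments, the last being `D:\Fresh\X 1" -force`, which matches the symptom of the switch ending up in the folder name. Dropping the trailing backslash, or doubling it as `"D:\Fresh\X 1\\"`, keeps `-force` as a separate argument under these rules.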

Text based GUI

You can make it like DOS RAR 2.50, Midnight Commander or Far Manager, or a Midnight Commander/Far Manager plugin (a proxy to the main program).

SIGBUS on sparc64

Hello,

this issue was brought to my attention by another OpenBSD developer -- I don't have any sparc64 machine to test on. Sparc64 is a strict-alignment architecture, and MD5::processBlock() seems to access data that's not aligned on a 4-byte boundary, hence getting a SIGBUS and crashing.

This is the stacktrace when running zpaqfranz autotest -all:

(the lines are relative to v56.2)

Program terminated with signal SIGBUS, Bus error.
#0  0x000000084aa44a10 in MD5::processBlock (this=0xfffffffffffc7100, data=0xb17af402b)
    at zpaqfranz.cpp:10243
10243     uint32_t word0  = MYLITTLEENDIAN(words[ 0]);
(gdb) bt
#0  0x000000084aa44a10 in MD5::processBlock (this=0xfffffffffffc7100, data=0xb17af402b)
    at zpaqfranz.cpp:10243
#1  0x000000084aa46a74 in MD5::add (this=0xfffffffffffc7100, data=0xb17af4000, numBytes=333290)
    at zpaqfranz.cpp:10357
#2  0x000000084aad8020 in Jidac::autotest (this=0xfffffffffffc8900) at zpaqfranz.cpp:50097
#3  0x000000084aa97e70 in Jidac::doCommand (this=0xfffffffffffc8900, argc=3,
    argv=0xfffffffffffc8f18) at zpaqfranz.cpp:42649
#4  0x000000084aadd26c in main (argc=3, argv=0xfffffffffffc8f18) at zpaqfranz.cpp:50586

data = 0x...2b is not aligned on a 4-byte boundary. It's cast to a uint32_t pointer on line 10235:

const uint32_t* words = (uint32_t*)data;

and then accessed as words[0] a few lines later.
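
A common portable way around this is to avoid dereferencing a cast pointer and instead assemble each word from bytes, or copy it into an aligned local with memcpy. The sketch below shows both; `load_le32` and `load_native32` are hypothetical helper names, not functions from zpaqfranz, and how they would be wired into MD5::processBlock is left open.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a 32-bit little-endian word from a possibly unaligned address.
// Byte-wise access has no alignment requirement, and the result is the
// same on little-endian and big-endian hosts.
inline uint32_t load_le32(const void* p) {
    const unsigned char* b = static_cast<const unsigned char*>(p);
    return  static_cast<uint32_t>(b[0])
         | (static_cast<uint32_t>(b[1]) << 8)
         | (static_cast<uint32_t>(b[2]) << 16)
         | (static_cast<uint32_t>(b[3]) << 24);
}

// Alternative: copy the bytes into an aligned local variable.
// The value is in host byte order, so a byte swap is still needed
// on big-endian machines when little-endian input is expected.
inline uint32_t load_native32(const void* p) {
    uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers typically turn either form into a single load on architectures that allow unaligned access, so the cost on x86 is usually negligible.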

Help prints twice on Ubuntu 20.04 using g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

zpaq --help
zpaqfranz v50.13-experimental journaling archiver, compiled Mar 22 2021
Unknown option ignored: --help
Usage: zpaqfranz command archive[.zpaq] files|directory... -options...
Use ???? in archive name for multi-part ex "test_????.zpaq
Commands:
   a             Append files (-checksum -force -crc32 -comment foo -test)
   x             Extract versions of files (-until N -comment foo)
   l             List files (-all -checksum -find something -comment foo)
   l d0 d1 d2... Compare content agains directory d0, d1, d2...
   c d0 d1 d2... Quickly compare dir d0 to d1,d2... (-all M/T scan -crc32)
   s d0 d1 d2... Cumulative size (excluding .zfs) of d0,d1,d2 (-all M/T)
   t             Fast test of most recent version of files (-verify -verbose)
   p             Paranoid test. Use lots of RAM (-verbose)
   sha1          Calculate hash (-sha256 -crc32 -crc32c -xxhash)
   dir           Like Windows dir/find duplicates (/s /a /os /od, -crc32)
   help          Long help (-h, -he examples, -diff differences)
Optional switch(es):
  -all [N]       Extract/list versions in N [4] digit directories
  -f -force      Add: append files if contents have changed
                 Extract: overwrite existing output files
  -key X         Create or access encrypted archive with password X
  -mN  -method N Compress level N (0..5 = faster..better, default 1)
  -test          Extract: verify but do not write files
                 Add: Post-check of files (-verify for CRC-32)
  -to out...     Prefix files... to out... or all to out/all
  -until N       Roll back archive to N'th update or -N from end
  -comment foo   Add/find ASCII comment string to versions
Usage: zpaqfranz command archive[.zpaq] files|directory... -options...
Use ???? in archive name for multi-part ex "test_????.zpaq
Commands:
   a             Append files (-checksum -force -crc32 -comment foo -test)
   x             Extract versions of files (-until N -comment foo)
   l             List files (-all -checksum -find something -comment foo)
   l d0 d1 d2... Compare content agains directory d0, d1, d2...
   c d0 d1 d2... Quickly compare dir d0 to d1,d2... (-all M/T scan -crc32)
   s d0 d1 d2... Cumulative size (excluding .zfs) of d0,d1,d2 (-all M/T)
   t             Fast test of most recent version of files (-verify -verbose)
   p             Paranoid test. Use lots of RAM (-verbose)
   sha1          Calculate hash (-sha256 -crc32 -crc32c -xxhash)
   dir           Like Windows dir/find duplicates (/s /a /os /od, -crc32)
   help          Long help (-h, -he examples, -diff differences)
Optional switch(es):
  -all [N]       Extract/list versions in N [4] digit directories
  -f -force      Add: append files if contents have changed
                 Extract: overwrite existing output files
  -key X         Create or access encrypted archive with password X
  -mN  -method N Compress level N (0..5 = faster..better, default 1)
  -test          Extract: verify but do not write files
                 Add: Post-check of files (-verify for CRC-32)
  -to out...     Prefix files... to out... or all to out/all
  -until N       Roll back archive to N'th update or -N from end
  -comment foo   Add/find ASCII comment string to versions

0.021 seconds (all OK)

[info] OpenBSD support

FYI, re:supported systems,

unsurprisingly, zpaqfranz 54.15 works just fine on OpenBSD 7.1. :-)

(I built a port for it, but it's not published yet.)

Too much help

If I accidentally set the switches and the other parameters in the wrong order, zpaqfranz tells me three times what the correct order is:

> ..\..\Downloads\zpaqfranz.exe a -m5 upload.zpaq .\upload
zpaqfranz v55.6-experimental (HW BLAKE3), SFX64 v55.1, compiled Jul 26 2022
Usage: zpaqfranz command archive[.zpaq] files|directory... -switches...
             h: **** Help on Help ****|  With great power comes great  ...  help!
             a: Append files          |          t: Test (integrity)
             x: Extract versions      |          l: List files
             v: Verify on filesystem  |          i: Info (show versions)
           sfx: Create SFX (Windows)  |         rd: Remove  'highlander'  folders
 c d0 d1 d2...: Compare d0 to d1,d2.. | s d0 d1 d2: Cumulative size of d0, d1, d2
 r d0 d1 d2...: Mirror  d0 in d1...   |       d d0: Deduplicate d0  WITHOUT MERCY
 z d0 d1 d2...: Delete empty dirs     |        m X: Merge multipart archive
          f d0: Fill /wipe free space |     utf d0: Detox filenames  in  d0
  sum d0 d1...: Hashing/deduplication |     dir d0: Win dir (/s /a /os /od)
     n d0 -n X: Keep X files in d0    |          b: CPU benchmarking (-all)
   rsync d0 d1: Purge temporary rsync | cp X -to Y: Copy files (w/wildcards) to Y
          trim: Trim incomplete add() |    dirsize: Get size of archive's folders
      password: Add/remove/change pwd |   q z:\foo: VSS-archive C: in z:\foo.zpaq
                                 Main | switches
      -all [N]: All versions N digit  |     -key X: Use/set archive password to X
 -mN -method N: 0..5= faster..better  |     -force: Always overwrite (extraction)
         -test: Test (extract/add)    |      -kill: Allow destructive (NO dryrun)
    -to out...: Prefix files to out   |   -until N: Roll back to N'th version
Usage: zpaqfranz command archive[.zpaq] files|directory... -switches...
             h: **** Help on Help ****|  With great power comes great  ...  help!
             a: Append files          |          t: Test (integrity)
             x: Extract versions      |          l: List files
             v: Verify on filesystem  |          i: Info (show versions)
           sfx: Create SFX (Windows)  |         rd: Remove  'highlander'  folders
 c d0 d1 d2...: Compare d0 to d1,d2.. | s d0 d1 d2: Cumulative size of d0, d1, d2
 r d0 d1 d2...: Mirror  d0 in d1...   |       d d0: Deduplicate d0  WITHOUT MERCY
 z d0 d1 d2...: Delete empty dirs     |        m X: Merge multipart archive
          f d0: Fill /wipe free space |     utf d0: Detox filenames  in  d0
  sum d0 d1...: Hashing/deduplication |     dir d0: Win dir (/s /a /os /od)
     n d0 -n X: Keep X files in d0    |          b: CPU benchmarking (-all)
   rsync d0 d1: Purge temporary rsync | cp X -to Y: Copy files (w/wildcards) to Y
          trim: Trim incomplete add() |    dirsize: Get size of archive's folders
      password: Add/remove/change pwd |   q z:\foo: VSS-archive C: in z:\foo.zpaq
                                 Main | switches
      -all [N]: All versions N digit  |     -key X: Use/set archive password to X
 -mN -method N: 0..5= faster..better  |     -force: Always overwrite (extraction)
         -test: Test (extract/add)    |      -kill: Allow destructive (NO dryrun)
    -to out...: Prefix files to out   |   -until N: Roll back to N'th version
Usage: zpaqfranz command archive[.zpaq] files|directory... -switches...
             h: **** Help on Help ****|  With great power comes great  ...  help!
             a: Append files          |          t: Test (integrity)
             x: Extract versions      |          l: List files
             v: Verify on filesystem  |          i: Info (show versions)
           sfx: Create SFX (Windows)  |         rd: Remove  'highlander'  folders
 c d0 d1 d2...: Compare d0 to d1,d2.. | s d0 d1 d2: Cumulative size of d0, d1, d2
 r d0 d1 d2...: Mirror  d0 in d1...   |       d d0: Deduplicate d0  WITHOUT MERCY
 z d0 d1 d2...: Delete empty dirs     |        m X: Merge multipart archive
          f d0: Fill /wipe free space |     utf d0: Detox filenames  in  d0
  sum d0 d1...: Hashing/deduplication |     dir d0: Win dir (/s /a /os /od)
     n d0 -n X: Keep X files in d0    |          b: CPU benchmarking (-all)
   rsync d0 d1: Purge temporary rsync | cp X -to Y: Copy files (w/wildcards) to Y
          trim: Trim incomplete add() |    dirsize: Get size of archive's folders
      password: Add/remove/change pwd |   q z:\foo: VSS-archive C: in z:\foo.zpaq
                                 Main | switches
      -all [N]: All versions N digit  |     -key X: Use/set archive password to X
 -mN -method N: 0..5= faster..better  |     -force: Always overwrite (extraction)
-- More (q, Q or control C to exit) --

I think one time would be enough?

Error with -vss switch

Hi there. I was trying to use zpaqfranz to back up a folder containing files held open by another running application.
The command I used is:

zpaqfranz a test.zpaq D:\LOCKEDFOLDER -vss

but I'm getting this as the result:

zpaqfranz v56.2g-JIT-L (HW BLAKE3), SFX64 v55.1, (01 Dec 2022)
franz: VSS Volume Shadow Copy (-vss)
02/12/2022 01:30:35 VSS: starting
02/12/2022 01:30:39 VSS: intermediate
42360: something wrong preparing

3.360 seconds (000:00:03) (with errors)

Any clue?

Windows -longpath switch, inconsistently reports number of files

Noticed an inconsistency, not sure if this is documented?

I have a folder with 187,372 files.

  • 1466 folders

I cannot seem to find the right combination of flags to pick up all of these files, because:

  • Some of the paths are longer than 255 characters
  • UNC or //?/, -longpath, doesn't seem to pick up all the files either.

I am attaching output of various commands I've tried.

PS C:\Users\Alex\_tempworkdir> zpaqfranz.exe a EmailProcessingBucket.zpaq .\EmailProcessing\ -test -longpath
zpaqfranz v55.14b-experimental-JIT-L (HW BLAKE3), SFX64 v55.1, (Sep  5 2022)
franz:Long path (on Windows)
38992: INFO: getting Windows' long filenames
EmailProcessingBucket.zpaq:
3 versions, 352.056 files, 299.464 fragments, 6.983 blocks, 9.940.897.057 bytes (9.26 GB)

15367: Windows error # 53
       //?/UNC/EmailProcessing/EMLProcessing/*


MAYBE OUT OF FREE SPACE OR INVALID PATH? 15367

QUIT: total size,file/folder count == zero. Already archived/wrong/inaccessible source?

5.922 seconds (000:00:05)  (with warnings)
PS C:\Users\Alex\_tempworkdir> zpaqfranz.exe a EmailProcessingBucket.zpaq \\TAICHI\C$\Users\Alex\_tempworkdir\EmailProcessing\ -test
16672: path not found : maybe length 00000273 >255? //TAICHI/C$/Users/Alex/_tempworkdir/EmailProcessing/unzip/FastmailEmls3/FastmailEmls 3/ProcessToPDF-Nov92019/2018/12/2018-12-10 17.49.51Z  UPDATE  Door issue @ Civic Center cleared. Resuming #subwaysvc. Residual congestion slow svc possible while we work to balance svc.pdf

16672: path not found : maybe length 00000277 >255? //TAICHI/C$/Users/Alex/_tempworkdir/EmailProcessing/unzip/FastmailEmls3/FastmailEmls 3/ProcessToPDF-Nov92019/2018/12/2018-12-10 21.31.21Z  HEAL Initiative  Preventing Opioid Use Disorder in Older Adolescents and Young Adults (ages 16–30) (UG3 UH3 Clinical Trial Required).pdf

[...thousands more lines...]
PS C:\Users\Alex\_tempworkdir> zpaqfranz.exe a EmailProcessingBucket.zpaq \\TAICHI\C$\Users\Alex\_tempworkdir\EmailProcessing\ -test -longpath
zpaqfranz v55.14b-experimental-JIT-L (HW BLAKE3), SFX64 v55.1, (Sep  5 2022)
franz:Long path (on Windows)
38992: INFO: getting Windows' long filenames
EmailProcessingBucket.zpaq:
3 versions, 352.056 files, 299.464 fragments, 6.983 blocks, 9.940.897.057 bytes (9.26 GB)
Updating EmailProcessingBucket.zpaq at offset 9.940.897.057 + 0
Adding 212.584 (207.60 KB) in 2 files (50 dirs), 16 threads @ 2022-09-07 10:09:52
Long filenames (>255)       991 *** WARNING *** (-fix255)
2 +added, 1 -removed.

9.940.897.057 + (212.584 -> 0 -> 1.101) = 9.940.898.158 @ 154.58 KB/s
=============================================================================================================================================================================================================
Compare archive content of:EmailProcessingBucket.zpaq:
4 versions, 352.058 files, 299.464 fragments, 6.985 blocks, 9.940.898.158 bytes (9.26 GB)
Scanning     10.000 2.69s      3.720 file/s (        1.160.305.159)
   10.641 in <<//?/UNC/TAICHI/C$/Users/Alex/_tempworkdir/EmailProcessing/EMLProcessing/>>
Total files found: 10.641
PS C:\Users\Alex\_tempworkdir> zpaqfranz.exe a EmailProcessingBucket.zpaq \\TAICHI\C$\Users\Alex\_tempworkdir\EmailProcessing\ -test -longpath -fix255
zpaqfranz v55.14b-experimental-JIT-L (HW BLAKE3), SFX64 v55.1, (Sep  5 2022)
franz:-fix255 Long path (on Windows)
38992: INFO: getting Windows' long filenames
EmailProcessingBucket.zpaq:
4 versions, 352.058 files, 299.464 fragments, 6.985 blocks, 9.940.898.158 bytes (9.26 GB)
Updating EmailProcessingBucket.zpaq at offset 9.940.898.158 + 0
Adding 0 (0.00 B) in 0 files (50 dirs), 16 threads @ 2022-09-07 10:11:31
Long filenames (>255)       991 *** WARNING *** (-fix255)
0 +added, 0 -removed.

9.940.898.158 + (0 -> 0 -> 0) = 9.940.898.158 @ 0.00 B/s
=============================================================================================================================================================================================================
Compare archive content of:EmailProcessingBucket.zpaq:
4 versions, 352.058 files, 299.464 fragments, 6.985 blocks, 9.940.898.158 bytes (9.26 GB)
Scanning     10.000 2.69s      3.720 file/s (        1.160.305.159)
   10.641 in <<//?/UNC/TAICHI/C$/Users/Alex/_tempworkdir/EmailProcessing/EMLProcessing/>>
Total files found: 10.641
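
For reference, the three runs above differ only in how the source path reaches the Windows extended-length form (printed in the log with forward slashes as //?/UNC/...). Below is a minimal sketch of that mapping, assuming the documented Windows convention of a `\\?\` prefix for drive paths and `\\?\UNC\` for network paths. `to_extended_path` is a hypothetical helper, not zpaqfranz's own code. It rejects relative paths, because the extended prefix is only meaningful on absolute paths, which may be related to the failure of the first `.\EmailProcessing\` run.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Hypothetical helper (not zpaqfranz's implementation): add the Windows
// extended-length prefix to an absolute path.
std::string to_extended_path(std::string path) {
    // The extended prefix turns off path normalization, so convert '/' ourselves.
    std::replace(path.begin(), path.end(), '/', '\\');
    if (path.rfind("\\\\?\\", 0) == 0)              // already \\?\...
        return path;
    if (path.rfind("\\\\", 0) == 0)                 // UNC: \\server\share\...
        return "\\\\?\\UNC\\" + path.substr(2);
    if (path.size() >= 3 && path[1] == ':' && path[2] == '\\')  // C:\...
        return "\\\\?\\" + path;
    // Relative paths must be resolved to absolute ones before prefixing.
    throw std::invalid_argument("relative path: resolve to an absolute path first");
}
```

With this sketch, `\\TAICHI\C$\Users\Alex` becomes `\\?\UNC\TAICHI\C$\Users\Alex`, which corresponds to the //?/UNC/TAICHI/C$/... form shown in the successful runs.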

Feedback: Hashing superstrong with PRVhash and superfast with DoubleDeuceAES

Hi Franco,
seeing how you search for faster alternatives, here is my feedback/suggestion:

Speaking of SHA-like strength, in Nakamichi I use @avaneev PRVhash (along with SHA3-224):
avaneev/prvhash#1

For a superfast preliminary "looks-like" check (which avoids invoking a SHA-like hash at all), in Nakamichi I use my own DoubleDeuceAES, inspired by @jandrewrogers.
I use it to reduce keys that are 18, 36 or 64 bytes long down to 16 bytes. If a collision occurs, the hash can be strengthened by adding 4, 8 or even 16 bytes; that is not much of a pain, since those 32 bytes will be generated brutally fast:

#ifdef _NAquaHash
// https://github.com/jandrewrogers/AquaHash/blob/master/aquahash.h
#include <wmmintrin.h>
// or may be just use #include <x86intrin.h> for all
#endif

#ifdef _NAquaHash
void DoubleDeuceAES(const uint8_t *buffer, const size_t length)
{
	uint32_t i;
	char MaxTo64a[64];
	char MaxTo64b[64];
	char MaxTo64c[64/1];
	char MaxTo64d[64/1];
        __m128i hashA = _mm_setzero_si128();
        __m128i hashB = _mm_setzero_si128();
        __m128i hashC = _mm_setzero_si128();
        __m128i hashD = _mm_setzero_si128();
        const __m128i *ptr128a;
        const __m128i *ptr128b;
        const __m128i *ptr128c;
        const __m128i *ptr128d;

	memset(MaxTo64a,0,4*(128/8)); // padding the keys to be multiples of 128, up to 64 bytes
	memset(MaxTo64b,0,4*(128/8)); // padding the keys to be multiples of 128, up to 64 bytes
	memcpy( MaxTo64a, buffer, length ); // Relax, no problema with padding since all the keys in Leprechaun/Nakamichi are to be put in respective pools with fixed length, so a key of len 4 AAAA{padded with 60 ASCII 000} is not as of len 5 AAAA{ASCII 000}{padded with 59 ASCII 000}, despite having same hashes they won't collide!
	for (i = 0; i < length; i++) {
		MaxTo64b[63-i]=MaxTo64a[i];
	}
	// Make C a derivative of A, interleaved HALFWARD left-to-right, with BYTE granularity - [BYTE00][BYTE01]...[BYTE31] | [BYTE32]...[BYTE63] as [BYTE00][BYTE32]...[BYTE31][BYTE63]
	// Make D a derivative of B, interleaved HALFWARD left-to-right, with BYTE granularity - [BYTE00][BYTE01]...[BYTE31] | [BYTE32]...[BYTE63] as [BYTE00][BYTE32]...[BYTE31][BYTE63]
	for (i = 0; i < (64>>1)/1; i++) { // 64/2/BYTE = 32 iterations, i.e. i = 0..31
		MaxTo64c[(i<<1)+0]=MaxTo64a[i+0]; // a: 00,32 / 01,33 / ...31,63
		MaxTo64c[(i<<1)+1]=MaxTo64a[i+32]; // c: 0*2+0,0*2+1 / 1*2+0,1*2+1 / 2*2+0,2*2+1 which is 0,1 / 2,3 / 4,5
		MaxTo64d[(i<<1)+0]=MaxTo64b[i+0]; 
		MaxTo64d[(i<<1)+1]=MaxTo64b[i+32]; 
	}
	ptr128a=(__m128i *)MaxTo64a;
	ptr128b=(__m128i *)MaxTo64b;
	ptr128c=(__m128i *)MaxTo64c;
	ptr128d=(__m128i *)MaxTo64d;
            for (i = 0; i < 64 / 16; i++) {
                __m128i a = _mm_loadu_si128(ptr128a++);
                __m128i b = _mm_loadu_si128(ptr128b++);
                __m128i c = _mm_loadu_si128(ptr128c++);
                __m128i d = _mm_loadu_si128(ptr128d++);
                hashA = _mm_aesenc_si128(hashA, a);
                hashB = _mm_aesenc_si128(hashB, b);
                hashC = _mm_aesenc_si128(hashC, c);
                hashD = _mm_aesenc_si128(hashD, d);
            }
	hashA = _mm_aesenc_si128(hashA, hashB);
	hashA = _mm_aesenc_si128(hashA, hashC);
	hashA = _mm_aesenc_si128(hashA, hashD);

	SlowCopy128bit( (const char *)(&hashA), (char *)&DDAES[0]);
	//void SlowCopy128bit (const char *SOURCE, char *TARGET) { _mm_storeu_si128((__m128i *)(TARGET), _mm_loadu_si128((const __m128i *)(SOURCE))); }
}
#endif

It compiles to 625 bytes of code. I'm not aware of any other 128-bit hash that is this fast; I hope you find it as useful as I do:

; mark_description "Intel(R) C++ Compiler XE for applications running on Intel(R) 64, Version 15.0.0.108 Build 20140726";

?DoubleDeuceAES@@YAXPEBE_K@Z	PROC 
; parameter 1: rcx
; parameter 2: rdx
.B14.1::                        

  00000 41 57            push r15                               
  00002 48 81 ec 00 01 
        00 00            sub rsp, 256                           
  00009 49 89 d7         mov r15, rdx                           
  0000c 44 0f 29 bc 24 
        a0 00 00 00      movaps XMMWORD PTR [160+rsp], xmm15    
  00015 66 0f ef c0      pxor xmm0, xmm0                        
  00019 44 0f 29 b4 24 
        b0 00 00 00      movaps XMMWORD PTR [176+rsp], xmm14    
  00022 66 0f ef c9      pxor xmm1, xmm1                        
  00026 44 0f 29 ac 24 
        c0 00 00 00      movaps XMMWORD PTR [192+rsp], xmm13    
  0002f 66 0f ef d2      pxor xmm2, xmm2                        
  00033 44 0f 29 a4 24 
        d0 00 00 00      movaps XMMWORD PTR [208+rsp], xmm12    
  0003c 66 0f ef db      pxor xmm3, xmm3                        
  00040 44 0f 29 9c 24 
        e0 00 00 00      movaps XMMWORD PTR [224+rsp], xmm11    
  00049 48 89 ca         mov rdx, rcx                           
  0004c 44 0f 29 94 24 
        f0 00 00 00      movaps XMMWORD PTR [240+rsp], xmm10    
  00055 0f 29 44 24 20   movaps XMMWORD PTR [32+rsp], xmm0      
  0005a 0f 29 4c 24 30   movaps XMMWORD PTR [48+rsp], xmm1      
  0005f 0f 29 54 24 40   movaps XMMWORD PTR [64+rsp], xmm2      
  00064 0f 29 5c 24 50   movaps XMMWORD PTR [80+rsp], xmm3      

.B14.2::                        

  00069 66 0f ef c0      pxor xmm0, xmm0                        
  0006d 66 0f ef c9      pxor xmm1, xmm1                        
  00071 0f 29 44 24 60   movaps XMMWORD PTR [96+rsp], xmm0      
  00076 66 0f ef d2      pxor xmm2, xmm2                        
  0007a 0f 29 4c 24 70   movaps XMMWORD PTR [112+rsp], xmm1     
  0007f 66 0f ef db      pxor xmm3, xmm3                        
  00083 0f 29 94 24 80 
        00 00 00         movaps XMMWORD PTR [128+rsp], xmm2     
  0008b 0f 29 9c 24 90 
        00 00 00         movaps XMMWORD PTR [144+rsp], xmm3     

.B14.3::                        

  00093 4d 89 f8         mov r8, r15                            
  00096 48 8d 4c 24 20   lea rcx, QWORD PTR [32+rsp]            
  0009b e8 fc ff ff ff   call _intel_fast_memcpy                

.B14.4::                        

  000a0 4d 85 ff         test r15, r15                          
  000a3 76 5b            jbe .B14.11 

.B14.5::                        
  000a5 4c 89 f8         mov rax, r15                           
  000a8 b9 01 00 00 00   mov ecx, 1                             
  000ad 48 d1 e8         shr rax, 1                             
  000b0 33 d2            xor edx, edx                           
  000b2 48 85 c0         test rax, rax                          
  000b5 76 33            jbe .B14.9 

.B14.7::                        

  000b7 8d 0c 12         lea ecx, DWORD PTR [rdx+rdx]           
  000ba 41 89 ca         mov r10d, ecx                          
  000bd f7 d9            neg ecx                                
  000bf 46 8a 5c 14 20   mov r11b, BYTE PTR [32+rsp+r10]        
  000c4 44 8d 49 3f      lea r9d, DWORD PTR [63+rcx]            
  000c8 83 c1 3e         add ecx, 62                            
  000cb 46 88 5c 0c 60   mov BYTE PTR [96+rsp+r9], r11b         
  000d0 44 8d 4c 12 01   lea r9d, DWORD PTR [1+rdx+rdx]         
  000d5 ff c2            inc edx                                
  000d7 48 3b d0         cmp rdx, rax                           
  000da 46 8a 4c 0c 20   mov r9b, BYTE PTR [32+rsp+r9]          
  000df 44 88 4c 0c 60   mov BYTE PTR [96+rsp+rcx], r9b         
  000e4 72 d1            jb .B14.7 

.B14.8::                        
  000e6 8d 4c 12 01      lea ecx, DWORD PTR [1+rdx+rdx]         

.B14.9::                        
  000ea ff c9            dec ecx                                
  000ec 89 c8            mov eax, ecx                           
  000ee 49 3b c7         cmp rax, r15                           
  000f1 73 0d            jae .B14.11 

.B14.10::                       
  000f3 f7 d9            neg ecx                                
  000f5 83 c1 3f         add ecx, 63                            
  000f8 8a 44 04 20      mov al, BYTE PTR [32+rsp+rax]          
  000fc 88 44 0c 60      mov BYTE PTR [96+rsp+rcx], al          

.B14.11::                       

  00100 66 0f 6f 44 24 
        20               movdqa xmm0, XMMWORD PTR [32+rsp]      
  00106 66 0f 6f 4c 24 
        40               movdqa xmm1, XMMWORD PTR [64+rsp]      
  0010c 66 44 0f 6f f8   movdqa xmm15, xmm0                     
  00111 66 44 0f 60 f9   punpcklbw xmm15, xmm1                  
  00116 66 45 0f ef f6   pxor xmm14, xmm14                      
  0011b 66 0f 68 c1      punpckhbw xmm0, xmm1                   
  0011f 66 0f 6f 4c 24 
        60               movdqa xmm1, XMMWORD PTR [96+rsp]      
  00125 66 0f 6f 94 24 
        80 00 00 00      movdqa xmm2, XMMWORD PTR [128+rsp]     
  0012e 66 44 0f 6f e9   movdqa xmm13, xmm1                     
  00133 66 0f 6f 5c 24 
        30               movdqa xmm3, XMMWORD PTR [48+rsp]      
  00139 66 0f 6f 64 24 
        50               movdqa xmm4, XMMWORD PTR [80+rsp]      
  0013f 66 0f 6f 6c 24 
        70               movdqa xmm5, XMMWORD PTR [112+rsp]     
  00145 66 44 0f 60 ea   punpcklbw xmm13, xmm2                  
  0014a 66 0f 68 ca      punpckhbw xmm1, xmm2                   
  0014e 66 0f 6f d3      movdqa xmm2, xmm3                      
  00152 44 0f 10 5c 24 
        20               movups xmm11, XMMWORD PTR [32+rsp]     
  00158 44 0f 10 64 24 
        60               movups xmm12, XMMWORD PTR [96+rsp]     
  0015e 66 44 0f 6f 94 
        24 90 00 00 00   movdqa xmm10, XMMWORD PTR [144+rsp]    
  00168 66 0f 60 d4      punpcklbw xmm2, xmm4                   
  0016c 66 0f 68 dc      punpckhbw xmm3, xmm4                   
  00170 66 0f 6f e5      movdqa xmm4, xmm5                      
  00174 66 41 0f 60 e2   punpcklbw xmm4, xmm10                  
  00179 66 41 0f 68 ea   punpckhbw xmm5, xmm10                  
  0017e 66 45 0f ef d2   pxor xmm10, xmm10                      
  00183 66 45 0f 38 dc 
        d3               aesenc xmm10, xmm11                    
  00189 66 45 0f ef db   pxor xmm11, xmm11                      
  0018e 66 45 0f 38 dc 
        dc               aesenc xmm11, xmm12                    
  00194 66 45 0f ef e4   pxor xmm12, xmm12                      
  00199 66 45 0f 38 dc 
        e7               aesenc xmm12, xmm15                    
  0019f 44 0f 10 7c 24 
        30               movups xmm15, XMMWORD PTR [48+rsp]     
  001a5 66 45 0f 38 dc 
        f5               aesenc xmm14, xmm13                    
  001ab 44 0f 10 6c 24 
        70               movups xmm13, XMMWORD PTR [112+rsp]    
  001b1 66 45 0f 38 dc 
        d7               aesenc xmm10, xmm15                    
  001b7 44 0f 10 7c 24 
        40               movups xmm15, XMMWORD PTR [64+rsp]     
  001bd 66 45 0f 38 dc 
        dd               aesenc xmm11, xmm13                    
  001c3 44 0f 10 ac 24 
        80 00 00 00      movups xmm13, XMMWORD PTR [128+rsp]    
  001cc 66 45 0f 38 dc 
        d7               aesenc xmm10, xmm15                    
  001d2 44 0f 10 7c 24 
        50               movups xmm15, XMMWORD PTR [80+rsp]     
  001d8 66 45 0f 38 dc 
        dd               aesenc xmm11, xmm13                    
  001de 44 0f 10 ac 24 
        90 00 00 00      movups xmm13, XMMWORD PTR [144+rsp]    
  001e7 66 44 0f 38 dc 
        e0               aesenc xmm12, xmm0                     
  001ed 66 45 0f 38 dc 
        d7               aesenc xmm10, xmm15                    
  001f3 66 45 0f 38 dc 
        dd               aesenc xmm11, xmm13                    
  001f9 66 44 0f 38 dc 
        e2               aesenc xmm12, xmm2                     
  001ff 66 44 0f 38 dc 
        f1               aesenc xmm14, xmm1                     
  00205 66 45 0f 38 dc 
        d3               aesenc xmm10, xmm11                    
  0020b 66 44 0f 38 dc 
        e3               aesenc xmm12, xmm3                     
  00211 66 44 0f 38 dc 
        f4               aesenc xmm14, xmm4                     
  00217 66 45 0f 38 dc 
        d4               aesenc xmm10, xmm12                    
  0021d 66 44 0f 38 dc 
        f5               aesenc xmm14, xmm5                     
  00223 66 45 0f 38 dc 
        d6               aesenc xmm10, xmm14                    
  00229 44 0f 11 15 00 
        00 00 00         movups XMMWORD PTR [?DDAES@@3PAEA], xmm10 
  00231 44 0f 28 94 24 
        f0 00 00 00      movaps xmm10, XMMWORD PTR [240+rsp]    
  0023a 44 0f 28 9c 24 
        e0 00 00 00      movaps xmm11, XMMWORD PTR [224+rsp]    
  00243 44 0f 28 a4 24 
        d0 00 00 00      movaps xmm12, XMMWORD PTR [208+rsp]    
  0024c 44 0f 28 ac 24 
        c0 00 00 00      movaps xmm13, XMMWORD PTR [192+rsp]    
  00255 44 0f 28 b4 24 
        b0 00 00 00      movaps xmm14, XMMWORD PTR [176+rsp]    
  0025e 44 0f 28 bc 24 
        a0 00 00 00      movaps xmm15, XMMWORD PTR [160+rsp]    
  00267 48 81 c4 00 01 
        00 00            add rsp, 256                           
  0026e 41 5f            pop r15                                
  00270 c3               ret                                    
  00271 0f 1f 84 00 00 
        00 00 00 0f 1f 
        80 00 00 00 00   ALIGN     16

.B14.12::
?DoubleDeuceAES@@YAXPEBE_K@Z ENDP

Currently I am running heavy compression on 4GB and 10GB corpora using my DoubleDeuceAES, very glad so far...

55.3 on Ubuntu (gcc version 9.4.0)

g++ -Dunix -O3 -march=native zpaqfranz.cpp -o zpaqfranz -pthread
zpaqfranz.cpp: In function ‘std::string makelongpath(std::string)’:
zpaqfranz.cpp:20379:9: error: ‘i_string’ was not declared in this scope
20379 | return i_string;
| ^~~~~~~~
zpaqfranz.cpp:20382:6: error: ‘flaglongpath’ was not declared in this scope; did you mean ‘makelongpath’?
20382 | if (flaglongpath)
| ^~~~~~~~~~~~
| makelongpath
make: *** [Makefile:10: zpaqfranz] Error 1

Missing option to run BAT after making VSS snapshot.

Many backup apps have an option to run scripts BEFORE and AFTER taking the VSS snapshot.
An example use case is to stop a database, take the VSS snapshot, restart the database, and only then continue with the backup process.
ZPAQFRANZ is missing the AFTER-snapshot script.

Example in Drive Snapshot:
--exec:"NET START ORACLE" 
--exec:"RestartExchange.bat"

For use with the internal driver: after snapshot creation (just a few seconds after starting the backup), you can execute an external command. It has to be the last command on the command line!

Feature request: -silent switch

Sometimes, even -summary is too much. I'd like to see a -silent switch that only echoes anything if an error has occurred.

9802: ERR kind 123 + Some questions

Hello :)

When I enter...

zpaqfranz.exe a test_????.zpaq *.jpg -index test_0000.zpaq

...and then add another version, with zpaqfranz I get this error:

zpaqfranz.exe a test_????.zpaq *.jpg -index test_0000.zpaq
zpaqfranz v54.11-experimental (HW BLAKE3), SFX64 v52.15, compiled Dec 28 2021
9802: ERR <test_????.zpaq> kind 123

When using zpaq 7.15, it works fine.

I also have some questions... I'm currently testing if zpaq works for my scenario.

zpaq and zpaqfranz restore files with the correct modification time, but with -list they report a time that seems to be UTC. Is that expected?
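
To illustrate the question: if the listing really shows UTC, then restoring the right local time and printing a UTC-looking time are consistent, and the difference is only a display-side conversion. The sketch below assumes timestamps of the decimal form YYYYMMDDHHMMSS in UTC (an assumption about the stored format, not something confirmed in this thread). `decimal_utc_to_epoch` and `format_local` are hypothetical names.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// Days since 1970-01-01 for a Gregorian calendar date (pure integer arithmetic).
std::int64_t days_from_civil(std::int64_t y, int m, int d) {
    y -= m <= 2;                                            // treat Jan/Feb as months of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const std::int64_t yoe = y - era * 400;                 // year within the 400-year era
    const std::int64_t doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    const std::int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

// Hypothetical helper: decimal YYYYMMDDHHMMSS (assumed to be UTC) -> Unix epoch seconds.
std::int64_t decimal_utc_to_epoch(std::int64_t stamp) {
    const int sec  = static_cast<int>(stamp % 100); stamp /= 100;
    const int min  = static_cast<int>(stamp % 100); stamp /= 100;
    const int hour = static_cast<int>(stamp % 100); stamp /= 100;
    const int day  = static_cast<int>(stamp % 100); stamp /= 100;
    const int mon  = static_cast<int>(stamp % 100); stamp /= 100;
    return days_from_civil(stamp, mon, day) * 86400 + hour * 3600 + min * 60 + sec;
}

// Format epoch seconds in the machine's local time zone for display.
std::string format_local(std::int64_t epoch) {
    const std::time_t t = static_cast<std::time_t>(epoch);
    char buf[32] = "";
    if (const std::tm* local = std::localtime(&t))
        std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", local);
    return buf;
}
```

Calling `format_local(decimal_utc_to_epoch(20220907100952))` prints the same instant shifted into the local zone, which is what a file manager would show after extraction.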

In my test I have an index file locally and multi-part versions stored remotely. The first version is 18 MiB.
Is it possible to access individual files inside the archive without downloading the full archive?
I have it mounted via rclone vfs.

When I list "test_0001.zpaq" which is 18 MiB from the remote it downloads the whole file before showing the list.
When I list "test_0002.zpaq" which is just 9 MiB, it downloaded some 65 MiB even though all versions combined are just 29 MiB.
Is this a flaw in rclone/remote or is this a limitation of zpaq?

safer handling of passwords

Would you consider adding features for safer handling of passwords?

At the moment, it appears that the only supported way for the user to provide a password is to pass it directly on the command line.

This approach is insecure, for a variety of reasons, including the following:

  • The text of the password is printed on the console.
  • The text of the password is visible to any user on the system, by listing the process table.
  • The text of the password is recorded in the command history maintained by the user's shell.

Common, safer ways for an application to allow passwords to be passed are as follows:

  • Reading character input from an interactive prompt that suppresses echoing of the input text.
  • Reading a file specified by the user.
  • Invoking a command specified by the user that outputs the password, for capture by the application.
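
As a rough illustration of the second and third options, here is a minimal sketch of reading a password from a file and from the output of a command. The function names are hypothetical, nothing here is an existing zpaqfranz option, and `popen`/`pclose` are assumed to be available (POSIX; Windows would need `_popen`). An interactive no-echo prompt is platform-specific and is left out.

```cpp
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

// Drop trailing CR/LF so "secret\r\n" and "secret\n" both yield "secret".
static std::string strip_eol(std::string s) {
    while (!s.empty() && (s.back() == '\n' || s.back() == '\r')) s.pop_back();
    return s;
}

// Hypothetical helper: the first line of the file is the password.
std::string password_from_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open password file: " + path);
    std::string line;
    std::getline(in, line);
    return strip_eol(line);
}

// Hypothetical helper (POSIX): run a command and use the first line of its output.
std::string password_from_command(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) throw std::runtime_error("cannot run password command");
    std::string out;
    char buf[256];
    while (std::fgets(buf, sizeof buf, pipe)) out += buf;
    if (pclose(pipe) != 0) throw std::runtime_error("password command failed");
    const std::string::size_type nl = out.find('\n');
    if (nl != std::string::npos) out.erase(nl);   // keep only the first line
    return strip_eol(out);
}
```

Either way the secret never appears in the process table or the shell history. The password file itself should of course be readable only by its owner.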

Better program name (zpaq 7.xx or fpaq 1.xx)

Hello, this is an excellent improvement on the already good zpaq 7.15. The name "zpaqfranz" seems very long, though. Why didn't you just keep calling it zpaq v7.xx, or fpaq, or something like that?

longpath

Hi Franco,
I'm not sure if I'm using the -longpath switch on Windows correctly.
It seems to me that when it is used, only the first file/folder is added to the archive,
e.g.:
d:\main\1file.txt
d:\main\2folder ...
d:\main\3folder ...
zpaqfranz.exe a test.zpaq d:\main -longpath

  • only d:\main\1file.txt was added to the archive
    Without the -longpath attribute, the entire structure is added (1file.txt, 2folder ..., 3folder ...)

How to exclude some files

Is there any way to exclude some files from archiving? For example, do not include temporary directories like /tmp or /var/tmp when archiving /.

NOJIT and non-x86 platforms

I saw this elsewhere:

Short version: on non-Intel CPUs, or without SSE2, interpretation is required (-DNOJIT). On "modern" Intel CPUs, native machine code will be used by default.

It has been at least 3 years, maybe more, since I last verified the interpreted functionality. Nothing prevents other users from making a new fork and proposing it on other architectures.

My use case consists of an aarch64 server appending some files to an archive. To compile correctly and run without segfaults, I needed to build with -DNOJIT.

Is there some suspicion of bugs in the NOJIT "branch", or does this merely mean "not tested"?

Can you use zpaqfranz to fully backup windows 10?

Hello, I found this project while browsing online. It seems to be advertised as superior to other similar tools, but can it be used to FULLY back up a Windows 10 system, in the same way that programs like Acronis True Image and Macrium Reflect do? And when I restore from the backup, will everything be 1:1?

Default to an archive with same name as the target directory

Expected Behavior

I intuitively expect that when passed a directory named foo that zpaqfranz would create a foo.zpaq archive using the name of the directory as a default. For example:

zpaqfranz a foo

Actual Behavior

zpaqfranz a foo just prints the usage information, likely because it's not obvious what a default output archive should be named. Instead, I have to specify both the directories and files after naming an archive, like so:

zpaqfranz a foo.zpaq foo

This is in contrast to tools like lrztar which will do the "sensible thing" and simply create a sensibly-named archive unless one already exists in which case I need to specify the -f flag to force an overwrite, or the -o flag to specify the output file.
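
The naming rule I have in mind is simple enough to sketch. `default_archive_name` is a hypothetical helper, and the "archive" fallback for paths with no usable last component is an arbitrary choice of mine:

```cpp
#include <string>

// Hypothetical helper: derive "<last path component>.zpaq" from a directory argument.
std::string default_archive_name(std::string dir) {
    // Drop trailing separators, so "foo/" and "foo" behave the same.
    while (dir.size() > 1 && (dir.back() == '/' || dir.back() == '\\')) dir.pop_back();
    const std::string::size_type cut = dir.find_last_of("/\\");
    std::string base = (cut == std::string::npos) ? dir : dir.substr(cut + 1);
    if (base.empty() || base == "." || base == "..")
        base = "archive";   // arbitrary fallback when there is no usable name
    return base + ".zpaq";
}
```

So `zpaqfranz a foo` would behave like `zpaqfranz a foo.zpaq foo`. A tool following the lrztar convention would additionally refuse to reuse an existing name unless told to.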

Side Note

It would be nice if zpaqfranz played nicely with lrztar, which many other formats do (including zpaq, although zpipe doesn't currently work correctly on the M2 processor making lrztar with standard zpaq currently unusable).

Broken help

Hi,
I found that somewhere around version 55.10 the full help (h h) stopped working,
and it seems that -diff has not been available for even longer.

Would it be possible to consider supporting some helpful functions on Linux in the future?

  • preserving some file metadata, as tar does
    (file owner, timestamp, symlink, ...)

thanks
Ratay

Folder exclusion issue

Hello,

Is it possible to restrict the recursion depth of the -not option while creating archives with zpaqfranz?

Consider this batch file, sample.bat, which creates a simple folder hierarchy:

md C:\Documents\Department1\Fred\Archive

md C:\Documents\Department1\Fred\Docs

md C:\Documents\Department1\Fred\test1\Archive

md C:\Documents\Department2\George\Docs

md C:\Documents\Department2\George\Archive

md C:\Documents\Department2\George\folder\sample\Archive

echo test1 > C:\Documents\Department1\Fred\Archive\file1.txt

echo test2 > C:\Documents\Department2\George\Archive\file2.txt

echo test3 > C:\Documents\Department1\Fred\file3.txt

echo test4 > C:\Documents\Department2\George\file4.txt

echo test5 > C:\Documents\Department2\George\folder\sample\Archive\file5.txt

echo test6 > C:\Documents\Department2\George\Docs\test6.txt

echo test7 > C:\Documents\Department1\Fred\test7.txt

echo test8 > C:\Documents\Department2\George\test8.txt

echo test9 > C:\Documents\Department1\Fred\test1\Archive\test9.txt

Creating the backup:

zpaqfranz.exe a backup c:\Documents -not C:\Documents\*\*\Archive

zpaqfranz v55.16h-experimental-JIT-L (HW BLAKE3), SFX64 v55.1, (03 Oct 2022)
Creating backup.zpaq at offset 0 + 0
Adding 40 (40.00 B) in 5 files (10 dirs), 4 threads @ 2022-10-19 23:22:28
15 +added, 0 -removed.

0 + (40 -> 40 -> 1.561) = 1.561 @ 512.00 B/s

0.094 seconds (00:00:00)  (all OK)

In this example, I expected only the following folders to be excluded:

C:\Documents\Department1\Fred\Archive
C:\Documents\Department2\George\Archive

However, zpaqfranz is also excluding these folders:

C:\Documents\Department1\Fred\test1\Archive
C:\Documents\Department2\George\folder\sample\Archive

In other words, zpaqfranz is excluding every folder named Archive. I see the same behavior with Matt Mahoney's zpaq archiver.

I guess I didn't construct the parameters properly to exclude only the necessary folders:

zpaqfranz.exe a backup c:\Documents -not C:\Documents\*\*\Archive

Could you please point me in the right direction? Thanks.
sample.txt
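
The behavior I observe would be explained if `*` in the pattern matches any run of characters, including the path separator. The sketch below contrasts that with the matching I expected, where `*` stops at a separator. Both functions are hypothetical illustrations, not the matcher actually used by zpaq or zpaqfranz. Paths are written lowercase with forward slashes for simplicity.

```cpp
// Variant 1: '*' matches any run of characters, including '/'.
// The pattern may also match a prefix of the path that ends at a '/' boundary.
bool star_crosses_dirs(const char* p, const char* s) {
    for (; *p; ++p, ++s) {
        if (*p == '*') {
            for (;; ++s) {
                if (star_crosses_dirs(p + 1, s)) return true;
                if (!*s) return false;
            }
        }
        if (*p != *s) return false;
    }
    return *s == '\0' || *s == '/';
}

// Variant 2: '*' matches within a single path component only.
bool star_within_segment(const char* p, const char* s) {
    for (; *p; ++p, ++s) {
        if (*p == '*') {
            for (;; ++s) {
                if (star_within_segment(p + 1, s)) return true;
                if (!*s || *s == '/') return false;   // do not cross a separator
            }
        }
        if (*p != *s) return false;
    }
    return *s == '\0' || *s == '/';
}
```

With the pattern `c:/documents/*/*/archive`, variant 1 also matches `c:/documents/department1/fred/test1/archive/test9.txt` (the second `*` absorbs `fred/test1`), which reproduces what I see. Variant 2 matches only the two Archive folders directly under Fred and George.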

Incomplete Transactions using space

If I start adding new files, but something goes wrong and it can't write, and results in an incomplete transaction, there is no way to remove the incomplete files and reclaim storage other than overwriting the incomplete data with new data. It would be very useful to add a "trim" command that purges incomplete transactions.

Issue with the -stdin option

Hello Mr. Corbelli,

The new -stdin option introduced with release v56.1j does not work. Here is a quick test.

Creating a simple file to test dd for Windows:

D:\>echo This is a test. > sample.txt

D:\>dd if=sample.txt of=test.txt
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <[email protected]>
This program is covered by terms of the GPL Version 2.

0+1 records in
0+1 records out

D:\>fc sample.txt test.txt
Comparing files sample.txt and TEST.TXT
FC: no differences encountered

Archiving the text file with zpaqfranz has no effect; I only get the usage text for zpaqfranz.exe:

D:\>dd if=sample.txt | zpaqfranz.exe a output.zpaq -stdin
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <[email protected]>
This program is covered by terms of the GPL Version 2.

zpa0+1 records in
q0+1 records out
franz v56.1j-JIT-L (HW BLAKE3), SFX64 v55.1, (15 Nov 2022)
franz:-stdin
Usage: zpaqfranz command archive[.zpaq] files|directory... -switches...
             h: **** Help on Help ****|  With great power comes great  ...  help!
             a: Append files          |          t: Test (integrity)
             x: Extract versions      |          l: List files
             v: Verify on filesystem  |          i: Info (show versions)
           sfx: Create SFX (Windows)  |         rd: Remove  'highlander'  folders
 c d0 d1 d2...: Compare d0 to d1,d2.. | s d0 d1 d2: Cumulative size of d0, d1, d2
 r d0 d1 d2...: Mirror  d0 in d1...   |       d d0: Deduplicate d0  WITHOUT MERCY
 z d0 d1 d2...: Delete empty dirs     |        m X: Merge multipart archive
          f d0: Fill /wipe free space |     utf d0: Detox filenames  in  d0
  sum d0 d1...: Hashing/deduplication |     dir d0: Win dir (/s /a /os /od)
     n d0 -n X: Keep X files in d0    |          b: CPU benchmarking (-all)
  find d0 what: Search file(s)        | cp X -to Y: Copy files (w/wildcards) to Y
          trim: Trim incomplete add() |    dirsize: Get size of archive's folders
      password: Add/remove/change pwd |   q z:\foo: VSS-archive C: in z:\foo.zpaq
      autotest: Check internals       |      pause: Pause script
                                 Main | switches
      -all [N]: All versions N digit  |     -key X: Use/set archive password to X
 -mN -method N: 0..5= faster..better  |     -force: Always overwrite (extraction)
         -test: Test (extract/add)    |      -kill: Allow destructive (NO dryrun)
    -to out...: Prefix files to out   |   -until N: Roll back to N'th version

output.zpaq is not generated. The statements:
zpa0+1 records in
q0+1 records out

are not very meaningful, since no output is produced.

Thanks for your support.
