
temisu / ancient


Decompression routines for ancient formats

License: BSD 2-Clause "Simplified" License

C++ 96.21% Makefile 1.72% Shell 1.49% M4 0.58%
data-compression amiga gzip bzip2 decompressor decompression-library atari retrogaming retrocomputing compression

ancient's Introduction

Ancient - Modern decompressor for old data compression formats

This is a collection of decompression routines for old formats that were popular on the Amiga, Atari computers and some other systems in the 80's and 90's, as well as for some formats still in use today that were used in a specific way on these old systems.

Even though most of these algorithms are still available for download, scavenging and using them might prove to be a challenge. Thus the purpose of this project is to:

  • Provide a clean, modern implementation of the algorithms - Typically the implementations were not meant to be used outside of the original systems they were made for. Some other ported implementations are incomplete, of poor quality, or direct translations of old M68K assembly code.
  • Provide clean BSD-style licensing - Original implementations or their ports might have a strange license or no visible license at all. There are also implementations that have been ripped off from some other source, so their legality is questionable at best.
  • Provide a tested implementation - The code is no good if it does not work properly, and the old code has a lot of corner cases. These implementations are tested using a cache of available files (~10k) that used these algorithms. Although this does not offer any guarantee, especially when we are talking about undocumented formats, it gives hope that there are fewer "stupid errors" in the code. I have also generated a small batch of test files for different formats for testing. The source files are from known public domain sources.

For simple usage, both a command line application and a simple API for the decompressors are provided. The compression algorithm is automatically detected in most cases; however, there are some corner cases where detection is not entirely reliable due to weaknesses in the old format used. Please see main.cpp and ancient.hpp to get an idea.

This code should compile cleanly on most C++17 capable compilers, and it is tested on clang and MSVC.

Some formats have incorporated weak password protection which can be bypassed. However, this project does not attempt to do any real cryptography.

Currently the project does not support any archival files nor self extracting executables.

Decompression algorithms provided:

  • bzip2
    • both normal and randomized bitstreams
  • Compact (Unix)
  • Compress (Unix)
    • Supports both old and new formats
  • CrunchMania by Thomas Schwarz
    • CrM!: Crunch-Mania standard-mode
    • Crm!: Crunch-Mania standard-mode, sampled
    • CrM2: Crunch-Mania LZH-mode
    • Crm2: Crunch-Mania LZH-mode, sampled
    • ID 0x18051973 (CrunchMania CrM2 Clone)
    • ID CD³¹ (CrunchMania CrM2 Clone)
    • ID DCS! (CrunchMania CrM! Clone)
    • ID Iron (CrunchMania CrM2 Clone)
    • ID MSS! (CrunchMania CrM2 Clone)
    • ID mss! (CrunchMania Crm2 Clone)
  • Disk Masher System a.k.a. DMS
    • Supports all different compression methods (NONE,SIMPLE,QUICK,MEDIUM,DEEP,HEAVY1,HEAVY2)
    • Supports password bypassing
  • File Imploder
    • ID ATN! (Imploder Clone)
    • ID BDPI (Imploder Clone)
    • ID CHFI (Imploder Clone)
    • ID EDAM (Imploder Clone)
    • ID M.H. (Imploder Clone)
    • ID RDC9 (Imploder Clone)
    • ID FLT! (Imploder Clone) (verification missing)
    • ID Dupa (Imploder Clone) (verification missing)
    • ID PARA (Imploder Clone) (verification missing)
  • Freeze/Melt
    • Supports both old and new formats
  • gzip
  • LOB's File Compressor (Also known as a Multipak)
    • Supports all original 6 modes and their combinations (BMC, HUF, LZW, LZB, MSP, MSS)
    • Does not support mode 8 (as defined by some game files)
  • Pack (Unix)
    • Supports both old and new formats
  • PowerPacker
    • PP 1.1 (verification missing)
    • PP 2.0
    • PX20: Supports bypassing password protected files.
    • ID CHFC (PowerPacker Clone)
    • ID DEN! (PowerPacker Clone)
    • ID DXS9 (PowerPacker Clone)
    • ID H.D. (PowerPacker Clone)
    • ID RVV! (PowerPacker Clone)
  • Quasijarus Strong Compression
  • Rob Northen compressors.
    • RNC1: Supports both new and old format of RNC1
    • RNC2: Supports both new and old format of RNC2
    • ID ...1 (RNC1 Clone)
  • Turbo Packer by Wolfgang Mayerle.
  • MMCMP: Music Module Compressor
  • SCO Compress LZH
  • StoneCracker
    • SC: StoneCracker v2.69 - v2.81
    • SC: StoneCracker v2.92, v2.99
    • S300: StoneCracker v3.00
    • S310: StoneCracker v3.10, v3.11b
    • S400: StoneCracker pre v4.00
    • S401: StoneCracker v4.01
    • S403: StoneCracker v4.02a
    • S404: StoneCracker v4.10
    • ID 1AM (StoneCracker S300 Clone)
    • ID 2AM (StoneCracker S401 Clone)
    • ID AYS! (StoneCracker S404 Clone)
    • ID Z&G! (StoneCracker S403 Clone)
    • ID ZULU (StoneCracker S403 Clone)
  • Vice / Vic2 Huffman compressor with RLE
  • XPK-encapsulated files
    • ACCA: Andre's Code Compression Algorithm
    • ARTM: Arithmetic encoding compressor
    • BLZW: LZW-compressor
    • BZP2: Bzip2 backend for XPK
    • CBR0: RLE compressor
    • CBR1: RLE compressor
    • CRM2: CrunchMania backend for XPK
    • CRMS: CrunchMania backend for XPK, sampled
    • CYB2: xpkCybPrefs container
    • DLTA: Delta encoding
    • DUKE: NUKE with Delta encoding
    • ELZX: LZX-compressor
    • FAST: LZ77-compressor
    • FBR2: CyberYAFA compressor
    • FRHT: LZ77-compressor
    • FRLE: RLE compressor
    • GZIP: Deflate backend for XPK
    • HUFF: Huffman modeling compressor
    • HFMN: Huffman modeling compressor
    • ILZR: Incremental Lempel-Ziv-Renau compressor
    • IMPL: File Imploder backend for XPK
    • LHLB: LZRW-compressor
    • LIN1: Lino packer
    • LIN2: Lino packer
    • LIN3: Lino packer
    • LIN4: Lino packer
    • LZBS: CyberYAFA compressor
    • LZCB: LZ-compressor
    • LZW2: CyberYAFA compressor
    • LZW3: CyberYAFA compressor
    • LZW4: CyberYAFA compressor
    • LZW5: CyberYAFA compressor
    • MASH: LZRW-compressor
    • NONE: Null compressor
    • NUKE: LZ77-compressor
    • PPMQ: PPM compressor
    • PWPK: PowerPacker backend for XPK
    • RAKE: LZ77-compressor
    • RDCN: Ross Data Compression
    • RLEN: RLE compressor
    • SASC: LZ-compressor with arithmetic encoding
    • SDHC: Sample delta huffman compressor
    • SHR3: LZ-compressor with arithmetic encoding
    • SHRI: LZ-compressor with arithmetic encoding
    • SHSC: Context modeling compressor
    • SLZ3: CyberYAFA compressor
    • SLZX: LZX-compressor with delta encoding
    • SMPL: Huffman compressor with delta encoding
    • SQSH: Compressor for sampled sounds
    • TDCS: LZ77-compressor
    • ZENO: LZW-compressor

There is some support for archival decompressors. However, these are not built in at the moment, but the code can be used as a reference:

  • Zip decompressor backend (decompressor only, no Zip file format reading yet)
    • Shrink
    • Reduce
    • Implode
    • Deflate
    • Deflate64
    • Bzip2
  • Lha/Lzh decompressor backend (decompressor only, no Lha file format reading yet)
    • LH0: Null compressor
    • LH1: LZRW-compressor with 4kB window
    • LH2: LZRW-compressor with Dynamic Huffman Encoding (experimental)
    • LH3: LZRW-compressor (experimental)
    • LH4: LZRW-compressor with 4kB window
    • LH5: LZRW-compressor with 8kB window
    • LH6: LZRW-compressor with 32kB window
    • LH7: LZRW-compressor with 64kB window
    • LH8: LZRW-compressor with 64kB window (Joe Jared extension)
    • LHX: LZRW-compressor with up to 512kB window (UnLHX extension)
    • LZ4: Null compressor
    • LZ5: LZ-compressor
    • LZS: LZ-compressor
    • PM0: Null compressor
    • PM1: LZ-compressor
    • PM2: LZ-compressor

Special thanks go to Cholok for providing me references to many of the XPK-compressors.

BZIP2 tables for randomization have been included; they are under the BZIP2 license.

SASC/SHSC decompressors have been re-implemented by using the original HA code from Harri Hirvola as reference. (No code re-used)

Some of the rare Lzh-compressors have been re-implemented by using Lhasa as a reference. (No code re-used)

I'm slowly adding new stuff. If your favorite is not listed contact me and maybe I can add it.

Currently not planned to be supported:

  • PPC only XPK compressors. XPK implementation is now considered complete in practical terms for classic Amiga.

Wishlist:

  • More files for my testbench.

Feedback: tz at iki dot fi

ancient's People

Contributors

invisibleup, manxorist, sagamusix, temisu


ancient's Issues

A cleaner public API

This issue is a follow-up to things already partially discussed in #13 (3).

  1. I'd prefer a true facade-like public API that does not leak any implementation detail. The current implementation leaks Decompressor::Registry (and related functionality), and in particular also leaks all classes derived from Decompressor. For RTTI to work in client code, all classes derived from Decompressor would also need to have public visibility, which I do not think is feasible or desirable. (This matters when ancient is compiled with RTTI. The current Makefile does not do so, but a shared library should, for best interoperability with client code.) As the currently exposed interfaces are base classes with a vtable, the size of the vtable becomes part of the ABI. This makes extending these interfaces (which are also used internally) difficult without breaking the ABI.
  2. I do not think Buffer needs to be exposed at all. I think in the public API, std::vector<std::byte> (or std::vector<uint8_t>) is all that is needed. For internal use, this can be wrapped in a Buffer at the API boundary for ease of use.
  3. Is OutOfMemoryError really needed? I think standard C++ std::bad_alloc has exactly the same meaning. OutOfMemoryError can be annoying for client code: Anything in the standard library will already throw std::bad_alloc in case of allocation failure, and ancient will itself throw OutOfMemoryError in the same situation. This means client code would need to always handle both, as ancient internally uses standard library features that throw std::bad_alloc (the obvious one being std::make_unique here).
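The three points above can be sketched together as a small pimpl-style facade. This is only an illustration under stated assumptions: the namespace `demo`, the class layout and the "stored data" stand-in are hypothetical and are not the real ancient API. It shows a public class with no virtual functions (so no vtable in the ABI), `std::vector<std::uint8_t>` at the boundary instead of `Buffer`, and allocation failures left to surface as `std::bad_alloc`.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical facade, NOT the real ancient API: all names are illustrative.
namespace demo {

class Decompressor {
public:
    explicit Decompressor(std::vector<std::uint8_t> packed);
    ~Decompressor();
    Decompressor(Decompressor&&) noexcept;
    Decompressor& operator=(Decompressor&&) noexcept;

    // Allocation failure simply propagates as std::bad_alloc,
    // so client code has a single exception type to handle for it.
    std::vector<std::uint8_t> decompress() const;

private:
    struct Impl;                  // defined only inside the library
    std::unique_ptr<Impl> impl_;  // no vtable, no internal types exposed
};

// ---- everything below would live in the library's .cpp file ----
struct Decompressor::Impl {
    std::vector<std::uint8_t> packed;
};

Decompressor::Decompressor(std::vector<std::uint8_t> packed)
    : impl_(std::make_unique<Impl>(Impl{std::move(packed)})) {}
Decompressor::~Decompressor() = default;
Decompressor::Decompressor(Decompressor&&) noexcept = default;
Decompressor& Decompressor::operator=(Decompressor&&) noexcept = default;

std::vector<std::uint8_t> Decompressor::decompress() const {
    // Stand-in for a "stored" format: a real implementation would wrap the
    // vector in an internal buffer type here and dispatch to a decoder.
    return impl_->packed;
}

} // namespace demo
```

With this shape, internal classes can gain or lose virtual functions freely, because client code only ever sees the size of one pointer.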

I am currently working on this and will suggest a pull request when I am done.

Version number confusion between release version and autotools version

When I contributed the Autotools build system in #21, I noted that I added version information to configure.ac:

Introduce package versioning (using SemVer) and soname versioning via libtool.
Details are documented at the top of configure.ac.

As the updated API which is incompatible with the older ad-hoc API also went in, I did set the version to 2.0.0-pre, as required by SemVer.

Yet, the update to ancient as a whole you recently released has version 1.1, and the Autotools build system still advertises a pre-release package version 2.0.0-pre.2.

This likely confuses users. I suggest these versions should match and ancient should follow the versioning guidelines that I did add at the top of configure.ac.

Setting the Autotools package version to 2.0.0, and publishing a release that is also versioned 2.0.0 in a timely fashion, should clean up the situation for now, I think.

I am sorry if I might have caused confusion about the intentions that my versioning changes implied, and I maybe did not communicate the changes properly.

Consider implementing some self-extracting executables

Hi!,

I'm working for the ScummVM project to re-implement the freescape engine. This engine was used to produce a variety of classic video games such as Driller, Total Eclipse and Castle Master on different computers from the 80s, including the Atari ST and Amiga.

There are many cases of compressed executables among the Atari ST/Amiga releases of the freescape games. For instance, there is a self-extracting Atari ST Driller demo that we cannot run since it seems to be packed with some unknown algorithm. Could you please consider implementing some code to decompress self-extracting executables like this one, or at least take a look and point us in the right direction on how to do it?

Thanks!

warning in LH3Decompressor with GCC 9 (autotools build)

src/Lzh/LH3Decompressor.cpp: In member function ‘virtual void ancient::internal::LH3Decompressor::decompressImpl(ancient::internal::Buffer&, bool)’:
src/Lzh/LH3Decompressor.cpp:138:52: warning: ‘distanceDecoder.ancient::internal::OptionalHuffmanDecoder<unsigned char>::_emptyValue’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  138 |    uint32_t distance=distanceDecoder.decode(readBit);
      |                                                    ^
src/Lzh/LH3Decompressor.cpp:137:4: warning: ‘decoder.ancient::internal::OptionalHuffmanDecoder<unsigned int>::_emptyValue’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  137 |    if (code==285) code+=readBits(8);
      |    ^~
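Whether GCC's diagnostic above is a false positive cannot be told from the log alone. A common way to silence `-Wmaybe-uninitialized` for an optional member is a default member initializer, sketched here on a hypothetical reduction. `OptionalDecoder` and its members are illustrative and are not the real `OptionalHuffmanDecoder`.

```cpp
#include <cstdint>

// Hypothetical reduction of a decoder that may hold a single "empty" value.
template<typename T>
class OptionalDecoder {
public:
    void setEmpty(T value) {
        _emptyValue = value;
        _isEmpty = true;
    }
    bool isEmpty() const { return _isEmpty; }
    T emptyValue() const { return _emptyValue; }

private:
    bool _isEmpty = false;
    // Value-initialized, so no code path can read an indeterminate value.
    T _emptyValue{};
};
```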

Incorrect checksum calculation for older MMCMP files

Creagaia.zip

This is the demo song coming with Impulse Tracker 1.06, compressed with MMCMP. When checksum verification is enabled, it fails to load (actual checksum 0x418b9db4, expected checksum 0x02df3fbc). Since OpenMPT never verified checksums, I'm not sure if the checksum is simply incorrect or if the checksum calculation is wrong. As far as I can tell, the uncompressed data is correct when ignoring checksum verification.

Patents

Do any implementations here have patents on them?

am I using it wrong?

Hi

Thanks for the update, I can see you're shipping pkg-config file and library now:

# ancient -h
/usr/bin/ancient: error: '/usr/bin/.libs/ancient' does not exist
This script is just a wrapper for ancient.
See the libtool documentation for more information.
root@phd-sid:/var/www/debian/ancient/2022# dpkg -L ancient
/.
/usr
/usr/bin
/usr/bin/ancient
/usr/include
/usr/include/ancient
/usr/include/ancient/ancient.hpp
/usr/lib
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/libancient.a
/usr/lib/x86_64-linux-gnu/libancient.la
/usr/lib/x86_64-linux-gnu/libancient.so.2.0.0
/usr/lib/x86_64-linux-gnu/pkgconfig
/usr/lib/x86_64-linux-gnu/pkgconfig/libancient.pc
/usr/share
/usr/share/doc
/usr/share/doc/ancient
/usr/share/doc/ancient/LICENSE
/usr/share/doc/ancient/README.md.gz
/usr/share/doc/ancient/changelog.Debian.gz
/usr/share/doc/ancient/copyright
/usr/share/man
/usr/share/man/man1
/usr/share/man/man1/ancient.1.gz
/usr/lib/x86_64-linux-gnu/libancient.so
/usr/lib/x86_64-linux-gnu/libancient.so.2

Seems some things are missing

Here's a build attempt after a fresh checkout:

$ make
clang++ -Os -Wall -Wsign-compare -Wshorten-64-to-32 -Wno-error=multichar -Wno-multichar -Isrc -std=c++14 -fno-rtti -o Buffer.o -c src/Buffer.cpp
make: *** No rule to make target 'Common.o', needed by 'ancient'.  Stop.

Ok so let's remove Common.o from the Makefile.

In file included from src/LHLBDecompressor.cpp:7:
src/DynamicHuffmanDecoder.hpp:19:5: error: no member named 'memset' in the global namespace; did you mean 'wmemset'?
                ::memset(_nodes,0xff,sizeof(_nodes));

Well, let's get rid of that too.

LZBSDecompressor.o:(.text+0x2b8): undefined reference to `rotateBits(unsigned int, unsigned int)'

Seems there's some stuff missing.
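The `memset` error in this report is the kind a missing header produces: `<cstring>` guarantees `std::memset`, while a global `::memset` is only guaranteed by `<string.h>`. A minimal sketch of the portable spelling follows, where `NodeTable` is a hypothetical stand-in for the decoder's node array, not the project's code.

```cpp
#include <cstdint>
#include <cstring>  // guarantees std::memset

// Hypothetical stand-in for a decoder's node table.
struct NodeTable {
    std::uint16_t nodes[8];

    NodeTable() {
        // Qualify with std:: instead of relying on a global ::memset.
        std::memset(nodes, 0xff, sizeof(nodes));
    }
};
```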

Surprising increase in distfile size in 2.1.0

The GitHub generated tarball of ancient 2.0.0 was 102KB. The tarball of ancient 2.1.0 is 15.7MB. This increase is surprising. I think most/all of this is coming from the new testing directory. Is this directory required to build/install ancient? I tried deleting it and ./configure and make and make install worked fine.

fails to decompress some MMCMP files

https://manx.datengang.de/openmpt/temp/mmcmp.zip contains 3 files, only mmcmp-ELECTR~1.MOD is decompressed successfully by ancient.

They have been compressed with MMCMP 1.34 (available here: http://cd.textfiles.com/scene96-2/programs/mmcmp134/).

manx@quadratus:~/projects/ancient/ancient.git$ ./ancient decompress /home/manx/c-home/stuff/mmcmp/compressed/BENJAM.IT /home/manx/c-home/stuff/mmcmp/ancient/BENJAM.IT
Decompression failed for /home/manx/c-home/stuff/mmcmp/compressed/BENJAM.IT
manx@quadratus:~/projects/ancient/ancient.git$ ./ancient decompress /home/manx/c-home/stuff/mmcmp/compressed/MELTED.XM /home/manx/c-home/stuff/mmcmp/ancient/MELTED.XM
Verify (raw) failed for /home/manx/c-home/stuff/mmcmp/compressed/MELTED.XM
manx@quadratus:~/projects/ancient/ancient.git$ ./ancient decompress /home/manx/c-home/stuff/mmcmp/compressed/mmcmp-ELECTR~1.MOD /home/manx/c-home/stuff/mmcmp/ancient/mmcmp-ELECTR~1.MOD
manx@quadratus:~/projects/ancient/ancient.git$

OpenMPT/libopenmpt can decompress them successfully and match mmuncmp output exactly.

Our current implementation (BSD-3-Clause licensed, so only as reference): https://github.com/OpenMPT/openmpt/blob/master/soundlib/ContainerMMCMP.cpp.

I am no expert on actual compression formats, so I have not looked in any further detail.

Allow for better packaging and system integration


OpenMPT and libopenmpt (https://openmpt.org/ and https://lib.openmpt.org/) currently implement PP20, XPK, and MMCMP decompression to allow for loading old module formats. However, in the long term, we probably want to get rid of these implementations in our codebase and use an external library to handle these in OpenMPT (probably actually not in libopenmpt itself because compression formats are better handled by libopenmpt client code in order to reduce dependencies).

In order to be able to use ancient, we would ultimately need it to be packaged in Linux distributions (so that openmpt123 can pick it up as a dependency). Many distributions (in particular Debian) do not allow 3rd party packages (ancient in this case) to be distributed inside of other packages' source trees (libopenmpt in our case), but instead require to use distribution-packages for all dependencies.

In order to allow for friction-free packaging in the context of Linux distributions, there are a couple of things that would need to be (or at least would be great to have) done in ancient:

  1. namespace all internal C++ symbols in namespace ancient

    This avoids any potential symbol conflicts when linking ancient statically.

  2. proper symbol visibility for internal and public APIs

    This avoids any potential symbol conflicts when linking ancient dynamically.

  3. public C++ API header, preferably without inline functions or templates

    ancient should clarify which headers belong to the public API and which do not. I would suggest a minimal subset of Decompressor.hpp and Buffer.hpp here.

  4. public C API (not strictly required for libopenmpt)

    To facilitate using ancient from other programming languages, a C API wrapper would probably be useful.

  5. distribution-friendly build system

    Candidates here are most likely autotools, cmake, or meson. This would be in addition and as an alternative to the current Makefile. I am only familiar with autotools.

  6. stable ABI, in particular also proper soname versioning

    The build system should take care of properly incrementing the library soname when the ABI and/or API changes.

  7. stable API

    The public C and C++ APIs should be designed to be as stable as possible.

As we already have experience with all of these concerns (having done the complete work for libopenmpt here), we would be willing to provide initial implementations and pull requests for all of them.

This issue is intended to gauge whether you would be willing to evolve ancient into that direction. In case you want us to go forward, we can create additional issues and/or pull request to discuss any details.

Cheers,

@manxorist and @sagamusix

PMC. format request

Hi,
I've stumbled upon a couple of tracker music files compressed by PowerPlayer Music Cruncher, a proprietary thing apparently employing lh.library. Then I noticed "LHLB" compression also depends on the library. Since you have XPK-LHLB, I thought it'd be trivial for you to adapt it to unpack them (although I couldn't find any other open project that can).
Here they are. I strongly suspect they're files whose unpacked versions start with "MMD1" (or MMD2 at least). The beginning of the format is like this: uint32be fourcc, uint32 rawsz, uint32 filesz-12 (the entire header would be 12 bytes long).
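The 12-byte header described above can be read with a few lines of code. This is a sketch under assumptions: only the fourcc is stated to be big-endian, so treating the two size fields as big-endian too is a guess, and `PmcHeader`, `readBE32` and `parsePmcHeader` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical view of the 12-byte header described in the request.
struct PmcHeader {
    std::array<char, 4> fourcc;
    std::uint32_t rawSize;     // size of the unpacked data
    std::uint32_t packedSize;  // described as "filesz-12"
};

inline std::uint32_t readBE32(const std::uint8_t* p) {
    return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
           (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Assumption: all three fields are big-endian.
inline std::optional<PmcHeader> parsePmcHeader(const std::vector<std::uint8_t>& file) {
    constexpr std::size_t headerSize = 12;
    if (file.size() < headerSize) return std::nullopt;

    PmcHeader h;
    for (std::size_t i = 0; i < 4; i++) h.fourcc[i] = static_cast<char>(file[i]);
    h.rawSize = readBE32(file.data() + 4);
    h.packedSize = readBE32(file.data() + 8);

    // The third field should equal the number of bytes after the header.
    if (h.packedSize != file.size() - headerSize) return std::nullopt;
    return h;
}
```

The consistency check on the third field doubles as a cheap format detector before any decompression is attempted.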

StuffIt Support

This is an amazing project! I’d love to see StuffIt support for those 80s and 90s Mac archives. As far as I can tell, only The Unarchiver https://theunarchiver.com/ offers decompression support for these files.

warnings with VS2019 /W3

1>ACCADecompressor.cpp
1>API.cpp
1>ARTMDecompressor.cpp
1>BLZWDecompressor.cpp
1>BZIP2Decompressor.cpp
1>CBR0Decompressor.cpp
1>CRMDecompressor.cpp
1>CYB2Decoder.cpp
1>DEFLATEDecompressor.cpp
1>DLTADecode.cpp
1>DMSDecompressor.cpp
1>Decompressor.cpp
1>FASTDecompressor.cpp
1>FBR2Decompressor.cpp
1>FRLEDecompressor.cpp
1>HFMNDecompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\FBR2Decompressor.cpp(61,11): warning C4146: unary minus operator applied to unsigned type, result still unsigned
1>HUFFDecompressor.cpp
1>ILZRDecompressor.cpp
1>IMPDecompressor.cpp
1>InputStream.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\ILZRDecompressor.cpp(63,46): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
1>LHLBDecompressor.cpp
1>LIN1Decompressor.cpp
1>LIN2Decompressor.cpp
1>LZBSDecompressor.cpp
1>LZCBDecompressor.cpp
1>LZW2Decompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\LZBSDecompressor.cpp(66,49): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
1>LZW4Decompressor.cpp
1>LZW5Decompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\LZCBDecompressor.cpp(279,12): warning C4244: 'initializing': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\LZCBDecompressor.cpp(314,31): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>LZXDecompressor.cpp
1>LH1Decompressor.cpp
1>LH2Decompressor.cpp
1>LH3Decompressor.cpp
1>LHXDecompressor.cpp
1>LZ5Decompressor.cpp
1>LZHDecompressor.cpp
1>LZSDecompressor.cpp
1>PMDecompressor.cpp
1>MASHDecompressor.cpp
1>MMCMPDecompressor.cpp
1>NONEDecompressor.cpp
1>NUKEDecompressor.cpp
1>OutputStream.cpp
1>PPDecompressor.cpp
1>RAKEDecompressor.cpp
1>RDCNDecompressor.cpp
1>RLENDecompressor.cpp
1>RNCDecompressor.cpp
1>RangeDecoder.cpp
1>SDHCDecompressor.cpp
1>SHR3Decompressor.cpp
1>SHRIDecompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SDHCDecompressor.cpp(75,13): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SDHCDecompressor.cpp(92,13): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SDHCDecompressor.cpp(94,13): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>SLZ3Decompressor.cpp
1>SMPLDecompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SHRIDecompressor.cpp(48,12): warning C4146: unary minus operator applied to unsigned type, result still unsigned
1>SQSHDecompressor.cpp
1>SXSCDecompressor.cpp
1>StoneCrackerDecompressor.cpp
1>TDCSDecompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SXSCDecompressor.cpp(200,46): warning C4244: 'argument': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SXSCDecompressor.cpp(203,56): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SXSCDecompressor.cpp(213,115): warning C4267: 'initializing': conversion from 'size_t' to 'uint16_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SXSCDecompressor.cpp(650,27): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SXSCDecompressor.cpp(734,24): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SXSCDecompressor.cpp(746,26): warning C4244: 'argument': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\SXSCDecompressor.cpp(751,21): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>TPWMDecompressor.cpp
1>XPKDecompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\StoneCrackerDecompressor.cpp(181,25): warning C4101: 'e': unreferenced local variable
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\StoneCrackerDecompressor.cpp(520,25): warning C4244: 'argument': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\StoneCrackerDecompressor.cpp(581,41): warning C4244: 'argument': conversion from 'uint16_t' to 'uint8_t', possible loss of data
1>XPKMain.cpp
1>ZENODecompressor.cpp
1>ImplodeDecompressor.cpp
1>ReduceDecompressor.cpp
1>ShrinkDecompressor.cpp
1>ZIPDecompressor.cpp
1>C:\Users\manx\projects\openmpt\wc\trunk-8\include\ancient\src\Zip\ImplodeDecompressor.cpp(79,41): warning C4018: '>': signed/unsigned mismatch
1>Buffer.cpp
1>CRC16.cpp
1>CRC32.cpp
1>Common.cpp
1>MemoryBuffer.cpp
1>StaticBuffer.cpp
1>SubBuffer.cpp
1>WrappedVectorBuffer.cpp
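Two of the warning classes in this log, C4146 (unary minus on an unsigned type) and C4334 (32-bit shift widened to 64 bits), usually have mechanical fixes. The helpers below are hypothetical illustrations of those fixes, not the code at the reported lines.

```cpp
#include <cstddef>
#include <cstdint>

// Two's-complement negation without unary minus on an unsigned operand,
// which is the construct C4146 complains about.
inline std::uint32_t negateU32(std::uint32_t v) {
    return std::uint32_t{0} - v;
}

// Perform the shift in the destination width instead of shifting a 32-bit
// literal and widening afterwards, which is what C4334 points at.
// Precondition: bits is smaller than the bit width of std::size_t.
inline std::size_t bitMask(unsigned bits) {
    return (std::size_t{1} << bits) - 1;
}
```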

trivial warnings with clang-cl

src/common/MemoryBuffer.hpp(18,35): warning : class with destructor marked 'final' cannot be inherited from [-Wfinal-dtor-non-final-class]
src/common/MemoryBuffer.hpp(13,7): note: mark 'ancient::internal::MemoryBuffer' as 'final' to silence this warning

src/common/WrappedVectorBuffer.hpp(20,42): warning : class with destructor marked 'final' cannot be inherited from [-Wfinal-dtor-non-final-class]
src/common/WrappedVectorBuffer.hpp(16,7): note: mark 'ancient::internal::WrappedVectorBuffer' as 'final' to silence this warning
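The note in the compiler output already names the fix: mark the class itself `final` rather than only its destructor. A minimal sketch with hypothetical class names follows.

```cpp
#include <type_traits>

// Hypothetical base class standing in for the project's buffer interface.
class BufferBase {
public:
    virtual ~BufferBase() = default;
};

// 'final' on the class makes the intent explicit and covers the destructor too.
class MemoryBufferLike final : public BufferBase {
public:
    ~MemoryBufferLike() override = default;
};

static_assert(std::is_final<MemoryBufferLike>::value,
              "the class should not be derivable");
```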

warning in main.cpp with GCC 9 (Ubuntu 20.04)

main.cpp: In lambda function:
main.cpp:182:45: warning: ignoring attributes on template argument ‘int (*)(DIR*)’ {aka ‘int (*)(__dirstream*)’} [-Wignored-attributes]
  182 |    std::unique_ptr<DIR,decltype(&::closedir)> dir{::opendir(inputDir.c_str()),::closedir};
      |                                             ^

I guess it warns because in <dirent.h>, closedir is declared as extern int closedir (DIR *__dirp) __nonnull ((1)); with __nonnull being # define __nonnull(params) __attribute__ ((__nonnull__ params)).

Not sure how to fix that, short of just ignoring the warning.
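One commonly used way around this warning is to keep the attributed function pointer type out of the template arguments by using a small deleter type. The sketch below is POSIX only, and `DirCloser`, `DirPtr` and `openDir` are hypothetical names.

```cpp
#include <dirent.h>
#include <memory>
#include <string>

// Deleter as a class type: the template argument is now DirCloser,
// not 'int (*)(DIR*)' with its nonnull attribute.
struct DirCloser {
    void operator()(DIR* d) const noexcept { ::closedir(d); }
};

using DirPtr = std::unique_ptr<DIR, DirCloser>;

// Returns an empty pointer if the directory cannot be opened.
inline DirPtr openDir(const std::string& path) {
    return DirPtr{::opendir(path.c_str())};
}
```

`std::unique_ptr` never invokes the deleter on a null pointer, so `closedir` is only called for directories that were actually opened.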

Directory walking could also be rewritten with std::filesystem (if available on all platforms that ancient cares about).
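A rough sketch of what a `std::filesystem` based walk could look like, assuming C++17 library support on the target platforms. `listRegularFiles` is a hypothetical helper, not a function from main.cpp, and it uses the non-throwing overloads so an unreadable directory yields an empty result.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// Returns the names of regular files directly inside 'dir' (not recursive).
inline std::vector<std::string> listRegularFiles(const std::filesystem::path& dir) {
    namespace fs = std::filesystem;
    std::vector<std::string> names;
    std::error_code ec;
    for (fs::directory_iterator it{dir, ec}, end; !ec && it != end; it.increment(ec)) {
        std::error_code statError;  // errors on single entries are skipped
        if (it->is_regular_file(statError)) {
            names.push_back(it->path().filename().string());
        }
    }
    return names;
}
```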

I do not consider this high priority at all, though.

Quarterback backup sets

Would you consider supporting Quarterback (amiga backup program) sets for this? Happy to supply test material if so.

new VS2022 warnings in 2.1.0

src\BLZWDecompressor.cpp(66,38): warning C4267: 'argument': conversion from 'size_t' to 'uint32_t', possible loss of data
src\CompressDecompressor.cpp(117,47): warning C4018: '>=': signed/unsigned mismatch
src\CompactDecompressor.cpp(89,33): warning C4244: 'argument': conversion from 'uint16_t' to 'uint8_t', possible loss of data
src\DMSDecompressor.cpp(555,24): warning C4267: 'initializing': conversion from 'size_t' to 'uint32_t', possible loss of data
src\DMSDecompressor.cpp(571,32): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
src\DMSDecompressor.cpp(587,23): warning C4267: 'initializing': conversion from 'size_t' to 'uint32_t', possible loss of data
src\FreezeDecompressor.cpp(52,16): warning C4244: '=': conversion from 'uint16_t' to 'uint8_t', possible loss of data
src\LOBDecompressor.cpp(228,54): warning C4018: '>=': signed/unsigned mismatch
src\PPDecompressor.cpp(266,26): warning C4267: 'initializing': conversion from 'size_t' to 'uint32_t', possible loss of data
src\PPDecompressor.cpp(306,18): warning C4267: 'initializing': conversion from 'size_t' to 'uint32_t', possible loss of data
src\PPDecompressor.cpp(326,36): warning C4267: '=': conversion from 'size_t' to 'uint32_t', possible loss of data
src\PPDecompressor.cpp(387,53): warning C4267: 'argument': conversion from 'size_t' to 'uint32_t', possible loss of data
src\PackDecompressor.cpp(171,27): warning C4244: 'argument': conversion from 'uint16_t' to 'uint8_t', possible loss of data
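Most entries in this list are C4267 or C4244 narrowing conversions. Where the value is known to fit, an explicit `static_cast` documents that. Where it is not, a checked helper is one option. `toU32` below is a hypothetical example of such a helper, not code from the project.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Explicit, checked narrowing from size_t to uint32_t.
inline std::uint32_t toU32(std::size_t v) {
    if (v > std::numeric_limits<std::uint32_t>::max()) {
        throw std::range_error("value does not fit in 32 bits");
    }
    return static_cast<std::uint32_t>(v);
}
```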

Found via fuzzing: File takes really long to unpack

This was found while fuzzing ancient. Not sure if much can be done here, since the file in question appears to be identified as an encrypted file, so ancient appears to brute-force an encryption key. However, given the size of the file (9KB), the time to open the file (ancient eventually gives up) is disproportionate (it took several minutes here). Maybe this can be improved?

fuzzing result.zip

g++

g++ -Wno-multichar -Isrc -std=c++14 -fno-rtti -o LZCBDecompressor.o -c src/LZCBDecompressor.cpp
src/LZCBDecompressor.cpp:141:59: error: conflicting declaration 'constexpr const std::array<unsigned int, FrequencyTree<T>::levels()> FrequencyTree<T>::_levelOffsets'
  141 | constexpr std::array<uint32_t,FrequencyTree<T>::levels()> FrequencyTree<T>::_levelOffsets;
      |                                                           ^~~~~~~~~~~~~~~~
src/LZCBDecompressor.cpp:134:49: note: previous declaration as 'constexpr const std::array<unsigned int, FrequencyTree<T>::levels()> FrequencyTree<T>::_levelOffsets'
  134 |  static constexpr std::array<uint32_t,levels()> _levelOffsets=makeArray(makeLevelOffsetSequence(std::make_integer_sequence<uint32_t,levels()>{}));
      |                                                 ^~~~~~~~~~~~~
src/LZCBDecompressor.cpp:144:59: error: conflicting declaration 'constexpr const std::array<unsigned int, FrequencyTree<T>::levels()> FrequencyTree<T>::_levelSizes'
  144 | constexpr std::array<uint32_t,FrequencyTree<T>::levels()> FrequencyTree<T>::_levelSizes;
      |                                                           ^~~~~~~~~~~~~~~~
src/LZCBDecompressor.cpp:135:49: note: previous declaration as 'constexpr const std::array<unsigned int, FrequencyTree<T>::levels()> FrequencyTree<T>::_levelSizes'
  135 |  static constexpr std::array<uint32_t,levels()> _levelSizes=makeArray(makeLevelSizeSequence(std::make_integer_sequence<uint32_t,levels()>{}));
      |                                                 ^~~~~~~~~~~

BTW https://stackoverflow.com/questions/45918292/gcc-equivalent-of-wshorten-64-to-32

Registry does not work when built as a static library

Ancient works fine when compiled directly into the resulting executable or when compiled as a shared library (dylib/so/dll).

However, it fails when built as a static library that gets linked into an executable or shared library. The symptom is ancient::internal::decompressors being nullptr because ancient::internal::Decompressor::registerDecompressor never gets called.

The cause is how linkers handle static libraries. A linker treats a static library as a collection of individual object files and pulls in an object file only if it defines a symbol that some previously processed object file requires. This decision is made per object file. Each individual Decompressor's object file in ancient references the registration functions in Decompressor.cpp, but no other object file inside the library references that Decompressor's object file in turn. It is therefore not linked in, and the global static initializer for its Registry never runs.
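The mechanism described above can be illustrated with a minimal sketch of the self-registration pattern. All names here (`registry_sketch`, `Registration`, `fooRegistration`) are hypothetical and are not ancient's actual identifiers, and the sketch uses a function-local static rather than a pointer for the registry. In a single translation unit the registration always runs; the problem only appears when the registering object lives in its own object file inside a static library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a decompressor registry; not ancient's real code.
namespace registry_sketch {

// A function-local static sidesteps initialization-order problems between
// translation units: the vector is created on first use.
inline std::vector<std::string> &registry()
{
    static std::vector<std::string> entries;
    return entries;
}

// Constructing an object of this type registers a name as a side effect.
struct Registration
{
    explicit Registration(const std::string &name)
    {
        assert(!name.empty());
        registry().push_back(name);
    }
};

// Imagine this definition sitting alone in FooDecompressor.cpp. Nothing else
// refers to fooRegistration by name, so when FooDecompressor.o is a member of
// a static library, the linker has no reason to pull that object file in, and
// this constructor never runs.
static Registration fooRegistration{"Foo"};

} // namespace registry_sketch
```

When this is compiled directly into an executable, `registry_sketch::registry()` contains the "Foo" entry by the time `main` starts, which matches the observation that direct compilation works.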

This behavior is documented for example here:

I do consider this behavior an outright bug and a violation of the intent of the C++ standard; sadly, reality disagrees with my opinion. As far as I know it happens on all major platforms, although I have not yet tried to reproduce it on non-Windows systems.

I am not aware of a simple, clean solution to this problem. One thing that will work is making all Decompressor::Registry<> instantiations inline in their respective header files (a C++17 feature) and then including all individual Decompressor header files in a single cpp file that gets referenced when using the library (obvious candidates would be API.cpp or Decompressor.cpp). This somewhat defeats the purpose of the Registry.
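A rough sketch of the inline-variable approach, assuming C++17 and using hypothetical names (`inline_sketch`, `FooDecompressor`, `BarDecompressor`, `registeredCount`) rather than ancient's real classes. The "header" and "cpp" boundaries are only marked by comments so that the example fits in one file. Note that this sketch uses plain classes; a class template such as `Registry<>` has its own instantiation rules that are not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace inline_sketch {

inline std::vector<std::string> &registry()
{
    static std::vector<std::string> entries;
    return entries;
}

struct Registration
{
    explicit Registration(const std::string &name)
    {
        assert(!name.empty());
        registry().push_back(name);
    }
};

// --- would live in FooDecompressor.hpp ---
struct FooDecompressor
{
    // C++17 inline static data member: defined in the header itself, so every
    // translation unit that includes the header carries the initializer.
    static inline Registration registration{"Foo"};
};

// --- would live in BarDecompressor.hpp ---
struct BarDecompressor
{
    static inline Registration registration{"Bar"};
};

// --- would live in one cpp file that users of the library always reference,
// --- and that includes every decompressor header (assumption: something
// --- like API.cpp). Because this object file is linked in, so are the
// --- registrations above.
std::size_t registeredCount()
{
    return registry().size();
}

} // namespace inline_sketch
```

The trade-off is the one already mentioned: one file has to know about every decompressor header, so the registry is no longer fully decentralized.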

This bug actually bit us when integrating ancient into OpenMPT, which is Windows-only, links as much as possible statically for simplicity, and splits the individual libraries into their own static libraries in the build system. While we can work around the issue by using /WHOLEARCHIVE, I think ancient still needs a solution so that this issue does not bite other future users.

Vice/Vic2

Are there any plans to support this type of crunched file in a future release?
