fx-cmix

fx-cmix is an updated implementation of fast-cmix.

Prize awarded on February 2, 2024. http://prize.hutter1.net/

Submission Description

This submission contains the following modifications on top of the recent fast-cmix Hutter Prize winner:

  • The paq8hp model is replaced with the fxcmv1 model, with the following notable additions:
    • Multiple state tables are used in the predictors, which improves prediction.
    • Most contexts are divided among 30 main predictors, which makes memory usage per context more efficient.
    • Added contexts for bracket, quote, first char, char in paragraph, column, table, template, and word stream/paragraph. These contexts are parsed from the input, which includes parsing wiki links, HTTP links, tables, columns, paragraphs, quotes, brackets, lists, and templates.
    • Some contexts are swapped in the predictors depending on the current input (table, column mode, word/paragraph, list).
    • Some predictors are switched on or off depending on the last char, link, or current bracket, which improves compression.
    • Predictions are mixed using contexts that are more aware of what the predictors are outputting.
    • A match model is added (not present in the paq8hp model).
    • Predictors are faster, allowing more complex contexts.
  • A new dictionary.
  • A small change in the phda9 preprocessor and in two tables in cmix.
  • Memory usage is larger in the fxcmv1 model than in the old paq8hp model.
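
To make the idea of a parsed bracket context more concrete, here is a minimal sketch of how a model might track open brackets and expose them as a context value. It is not taken from the fxcmv1 source: the class name BracketContext, its methods, and the depth limit are all hypothetical, and the real model may parse the input quite differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: names and limits are illustrative, not from fxcmv1.
class BracketContext {
 public:
  // Consume one input byte and update the stack of currently open brackets.
  void Update(unsigned char c) {
    if (c == '[' || c == '{' || c == '(' || c == '<') {
      // Assumed depth limit: drop the outermost bracket to bound memory.
      if (stack_.size() == kMaxDepth) stack_.erase(stack_.begin());
      stack_.push_back(c);
    } else if (!stack_.empty() && c == Closer(stack_.back())) {
      stack_.pop_back();
    }
  }

  // Innermost open bracket, or 0 when no bracket is open.
  unsigned char Innermost() const { return stack_.empty() ? 0 : stack_.back(); }

  // Number of brackets currently open (at most kMaxDepth).
  std::size_t Depth() const { return stack_.size(); }

  // True when the two innermost open brackets are "[[", as in a wiki link.
  bool InsideWikiLink() const {
    const std::size_t n = stack_.size();
    return n >= 2 && stack_[n - 1] == '[' && stack_[n - 2] == '[';
  }

  // Small integer a predictor could combine with other contexts.
  std::uint32_t Context() const {
    return (static_cast<std::uint32_t>(Innermost()) << 8) |
           static_cast<std::uint32_t>(Depth());
  }

 private:
  static constexpr std::size_t kMaxDepth = 16;

  // Closing character that matches a given opening bracket.
  static unsigned char Closer(unsigned char open) {
    switch (open) {
      case '[': return ']';
      case '{': return '}';
      case '(': return ')';
      default:  return '>';
    }
  }

  std::vector<unsigned char> stack_;
};
```

A predictor could, for example, consult InsideWikiLink() to decide whether a link-specific context is enabled for the next byte. That gating rule is an assumption made for illustration only.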

Below is the fx-cmix result:

| Metric | Value |
|---|---|
| fx-cmix compressor's executable file size (S1) | 436707 bytes |
| fx-cmix self-extracting archive size (S2) | 112148343 bytes |
| Total size (S) | 112585050 bytes |
| Previous record (L) | 114156155 bytes |
| fx-cmix improvement (1 - S/L) | 1.38% |
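
The total size and the improvement figure follow directly from the reported values. As a worked check, using only the numbers above:

```latex
\begin{aligned}
S &= S_1 + S_2 = 436707 + 112148343 = 112585050 \\
1 - \frac{S}{L} &= 1 - \frac{112585050}{114156155}
  = \frac{1571105}{114156155} \approx 0.01376 \approx 1.38\%
\end{aligned}
```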

Experiment platform

| Item | Value |
|---|---|
| Operating system | Ubuntu 20.04 |
| Processor | Intel(R) Xeon(R) CPU @ 2.20GHz, Geekbench score 706 |
| Memory | 32 GB |
| Decompression running time | 85 hours |
| Decompression RAM max usage | 9575124 KiB |
| Decompression disk usage | ~35 GB |

Time, disk, and RAM usage are approximately symmetric for compression and decompression.

Instructions

The installation and usage instructions for fx-cmix are the same as for fast-cmix.

One important note: for normal use, it is recommended to change one variable in the PPM source code, setting mmap_to_disk to false. From line 26 in src/models/ppmd.cpp:

// If mmap_to_disk is set to false (recommended setting), PPM will only use RAM
// for memory.
// If mmap_to_disk is set to true, PPM memory will be saved to disk using mmap.
// This will reduce RAM usage, but will be slower as well. *Warning*: this will
// write a *lot* of data to disk, so can reduce the lifespan of SSDs. Not
// recommended for normal usage.
bool mmap_to_disk = true;

This variable is set to true by default, to comply with the Hutter Prize RAM limit.
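
For readers unfamiliar with the trade-off, the sketch below shows in general terms what a flag like this can select between: an anonymous mapping that lives in RAM, or a file-backed mapping whose pages the kernel can write out to disk. It is an assumption-laden illustration for Linux, not the code in ppmd.cpp. The function names AllocateModelMemory and ReleaseModelMemory and the backing-file path parameter are hypothetical.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns a writable region of `bytes` bytes, or nullptr
// on failure. With mmap_to_disk == false the region is anonymous (RAM only).
// With mmap_to_disk == true it is backed by the file at `path`.
void* AllocateModelMemory(std::size_t bytes, bool mmap_to_disk, const char* path) {
  if (!mmap_to_disk) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
  }
  int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
  if (fd < 0) return nullptr;
  // Grow the backing file to the requested size before mapping it.
  if (ftruncate(fd, static_cast<off_t>(bytes)) != 0) {
    close(fd);
    return nullptr;
  }
  // MAP_SHARED lets the kernel write modified pages back to the file.
  void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  close(fd);  // The mapping stays valid after the descriptor is closed.
  return p == MAP_FAILED ? nullptr : p;
}

// Unmaps a region previously returned by AllocateModelMemory.
void ReleaseModelMemory(void* p, std::size_t bytes) {
  if (p != nullptr) munmap(p, bytes);
}
```

With a file-backed mapping, heavy random writes to model memory turn into disk writes, which is consistent with the warning above about SSD wear and slower operation.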

Installing packages required for compiling fx-cmix compressor from sources on Ubuntu

Building the fx-cmix compressor from sources requires the clang-17, upx-ucl, and make packages. On Ubuntu, these packages can be installed by running the following scripts:

./install_tools/install_upx.sh
./install_tools/install_clang-17.sh

Compiling fx-cmix compressor from sources

A bash script is provided for compiling the fx-cmix compressor from sources on Ubuntu. The script places the executable file, named cmix, in the ./run directory. The script can be run as

./build_and_construct_comp.sh

Running fx-cmix compressor

To run the fx-cmix compressor, use

cd ./run
./cmix -e <PATH_TO_ENWIK9> enwik9.comp

Running fx-cmix decompressor

The compressor is expected to output an executable file named archive9 in the same directory (./run). When executed, archive9 is expected to reproduce the original enwik9 as a file named enwik9_restored. The executable archive9 should be launched without arguments from the directory containing it.

cd ./run
./archive9
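
Once decompression finishes, it is worth confirming that enwik9_restored matches the original byte for byte. The following is a minimal sketch of such a check. The function name FilesIdentical is hypothetical and is not part of fx-cmix, and any standard file comparison tool would serve equally well.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: returns true only if both files can be opened and
// contain exactly the same bytes.
bool FilesIdentical(const std::string& path_a, const std::string& path_b) {
  std::ifstream a(path_a, std::ios::binary);
  std::ifstream b(path_b, std::ios::binary);
  if (!a || !b) return false;

  static char buf_a[1 << 16];
  static char buf_b[1 << 16];
  while (true) {
    a.read(buf_a, sizeof(buf_a));
    b.read(buf_b, sizeof(buf_b));
    const std::streamsize na = a.gcount();
    const std::streamsize nb = b.gcount();
    if (na != nb) return false;  // One file ended before the other.
    if (na == 0) return true;    // Both files ended with no mismatch.
    if (std::memcmp(buf_a, buf_b, static_cast<std::size_t>(na)) != 0) return false;
  }
}
```

A call such as FilesIdentical("<PATH_TO_ENWIK9>", "enwik9_restored"), with the placeholder replaced by the real path to the original file, would then report whether the round trip succeeded.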

Contributors

kaitz
