A safe Go interface to liblzma, the "xz" compression library.

License: BSD 3-Clause "New" or "Revised" License

safexz

This Go package handles compression in the xz / lzma format. It provides a safer way to call liblzma, without type-safety worries, and reduces the complexity of lzma compression options to a set of common use cases.

Installation (UNIX and Mac)

Install the liblzma-dev package from your favorite package manager. For Debian/Ubuntu, for instance:

sudo apt install liblzma-dev

then:

go get -u github.com/christoofar/safexz

At runtime you must ensure that liblzma.so (UNIX) or liblzma.dll (MSYS2) can be found by your operating system. If you're on a Linux distro that uses systemd, don't worry: you already have liblzma.

This package uses CGo, so your own project must not disable CGo (at least not directly). If you're willing to stick around, my next GitHub project will demonstrate how you can set CGO_ENABLED=0 but still vendor CGo support back into your program, a technique you can use to make your program hot-swappable between the musl C standard library and the GNU C library. I use this technique to make tiny CGo-supporting containers!
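If you are unsure whether a binary was built with CGo enabled, the Go toolchain records the effective setting in the binary's build information. The following is a minimal sketch using only the standard library; the helper names `cgoSetting` and `describeCgo` are made up for illustration and are not part of safexz.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// cgoSetting returns the CGO_ENABLED value the Go toolchain recorded when it
// built this binary, or "" if no value was recorded.
func cgoSetting() string {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		return ""
	}
	for _, s := range info.Settings {
		if s.Key == "CGO_ENABLED" {
			return s.Value
		}
	}
	return ""
}

// describeCgo turns a recorded CGO_ENABLED value into a readable verdict.
func describeCgo(value string) string {
	switch value {
	case "1":
		return "cgo enabled: code that imports \"C\" is part of the build"
	case "0":
		return "cgo disabled: code that imports \"C\" is excluded from the build"
	default:
		return "cgo setting not recorded in this binary"
	}
}

func main() {
	fmt.Println(describeCgo(cgoSetting()))
}
```

Running `go env CGO_ENABLED` in your project directory reports the same setting before you build.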

Installation on Windows

The easiest way to get this done is to use MSYS2. You will compile your Go program with CGo using the GCC compiler in MinGW64 (which comes with MSYS2).

After installing MSYS2, open a MinGW64 command prompt and issue:

pacman -S base-devel git mingw-w64-x86_64-gcc mingw-w64-x86_64-lzma

In the MinGW64 command prompt, run nano /etc/profile and add the appropriate paths to the Go compiler at the bottom of this file. In my case I did this:

export GOPATH="/c/Users/chris/go"
export PATH="/go/bin:${PATH}:${GOPATH}/bin"

The /go/bin folder should point to the location of your go.exe compiler. To make this easier, I symlinked the default Go for Windows install folder to /go like this:

ln -s /c/Program\ Files/Go /go

Restart your machine so all the paths can be found. Open a MinGW64 command prompt and try go version to make sure the Go compiler is reachable.

If you are using Visual Studio Code as your IDE, it can help to make MSYS2 your default terminal program instead of PowerShell. Go to File -> Preferences -> Settings and incorporate this into your settings.json:

"terminal.integrated.profiles.windows": {
    "MSYS2": {
        "path": "C:\\msys64\\usr\\bin\\bash.exe",
        "args": ["--login", "-i"],
        "env": {
            "MSYSTEM": "MINGW64",
            "CHERE_INVOKING": "1"
        }
    }
},
"terminal.integrated.defaultProfile.windows": "MSYS2"

Restart VS Code. You should now be able to compile a Go program that uses this library. Incorporate it into your Go project:

go get -u github.com/christoofar/safexz

Eliminating the MSYS2 dependency for Windows

When you build a program that uses safexz with CGo on Windows, you don't need to install MSYS2 on the target machine. Compile your program this way:

go build -ldflags "-extldflags \"-static\""

This will statically link the lzma dependency from MSYS2 into your binary.

Making a Windows binary from Linux/Docker

You don't need Windows to make a Windows binary if you are developing in CGo. To do this, you need the MinGW64 GCC compiler:

sudo apt-get install gcc-multilib
sudo apt-get install gcc-mingw-w64

Then, to build, pass along the environment settings that select the MinGW compiler:

GOOS=windows GOARCH=amd64 CGO_ENABLED=1 CC=x86_64-w64-mingw32-gcc CXX=x86_64-w64-mingw32-g++ go build -ldflags "-extldflags \"-static\""

Usage

// Compressing a file
safexz.CompressFile("data/mydata.dat", "data/mydata.xz")

// Compressing a string
compressedString := safexz.CompressString("Hello World!")

// Direct-reading a compressed `xz` archive into a decompressed slice of bytes
myPicture := safexz.DecompressFileToMemory("images/monalisa.png.xz")

// Raising the compression level and asking for all cores to be
// engaged in compression
myPictureBytes := safexz.CompressFileToMemory("images/monalisa.png", safexz.CompressionFullPowerBetter)

// Forwarding an uncompressed data stream into a compressed writer stream,
// compressing as it goes
safexz.CompressStream(networkLogSource, compressedStreamWriter)  // XZWriter is the writer

// Forwarding an XZ-compressed stream into an io.Writer stream,
// decompressing as it goes
safexz.DecompressStream(compressedStream, streamToWriteTo) // XZReader is the reader

// Wrapping an existing reader behind XZReader, which will decompress the xz/lzma stream
// inside
myXZReader := safexz.NewReader(myCompressedDataReader)  // type XZReader

// Wrapping an existing writer behind XZWriter, so the underlying writer sees and will
// save or send a valid `xz` data stream
myXZWriter := safexz.NewWriter(file) // type XZWriter

// Compressing a stream with XZWriter while reading its contents
// (`t` is a *testing.T, as in a Go test function)
resp, err := http.Get("https://media.istockphoto.com/id/1453319272/photo/columbus-ohio-usa-skyline-on-the-scioto-river.jpg?s=2048x2048&w=is&k=20&c=tgQ4HAX-dX7A1XTanxHMrkFOg5Fpa2kW87m96JKLcUM=")
if err != nil {
	// Fatal rather than Error: resp is nil here, so continuing would panic
	t.Fatal("Error downloading image:", err)
}
defer resp.Body.Close()

// Compress the image
f, err := os.Create("test.jpg.xz")
if err != nil {
	t.Fatal("Error creating compressed file:", err)
}
defer f.Close()
compressedImageWriter := safexz.NewWriter(f)
if _, err = io.Copy(compressedImageWriter, resp.Body); err != nil {
	t.Fatal("Error compressing image:", err)
}

// Close the XZWriter before the deferred f.Close() runs
compressedImageWriter.Close()

Full API documentation, with even more functions than shown here, can be found at go.dev. Please review the Go test cases, as they test over 90% of all the code.

xz backdoor

In late March 2024, CVE-2024-3094 was issued against the liblzma compression library for a supply chain attack. The episode began with the attacker gaining a maintainer role on the FOSS project in a two-year campaign to inject a highly complicated and stealthy backdoor into the software by slipping their own pre-compiled, ready-to-be-linked .o file into the build stream.

Since then, systemd and OpenSSH, the two primary projects the xz backdoor exploited to gain access to sshd, have made code changes so that liblzma.so is no longer linked in directly.

I'm not a firm believer that dlopen() is more of a cure than it is a quick excuse to ignore your supply chain. And still: dynamic linking in software is fraught with other problems, one of which is security.

I am pinning my version of lzma to post-backdoor versions as Lasse Collin refactors out the work of Jia Tan. Similarly, there have been recent commits by Sam James (@thesamesam) made in the spirit of simplifying maintenance of liblzma so that the build chain is more compatible with the expectations of modern C developers. The software is healing, the backdoor in xz is dead, and by the time you've found this project, new versions of liblzma will be ready for production use.
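If you want to confirm that the liblzma on a given machine is not one of the releases named in CVE-2024-3094 (5.6.0 and 5.6.1), you can inspect the output of `xz --version`. The following is a rough sketch, not part of safexz: the function names are hypothetical, and it assumes the usual two-line `xz --version` output in which the second line starts with `liblzma`.

```go
package main

import (
	"fmt"
	"regexp"
)

// backdoored lists the xz/liblzma releases named in CVE-2024-3094.
var backdoored = map[string]bool{"5.6.0": true, "5.6.1": true}

// liblzmaLine matches the "liblzma X.Y.Z" line of `xz --version` output.
var liblzmaLine = regexp.MustCompile(`(?m)^liblzma\s+(\d+\.\d+\.\d+)`)

// liblzmaVersion extracts the liblzma version from `xz --version` output.
// The second result is false when no version line is found.
func liblzmaVersion(xzVersionOutput string) (string, bool) {
	m := liblzmaLine.FindStringSubmatch(xzVersionOutput)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// isBackdoored reports whether the given version is one of the affected releases.
func isBackdoored(version string) bool {
	return backdoored[version]
}

func main() {
	// Sample text in the shape printed by `xz --version`.
	sample := "xz (XZ Utils) 5.6.1\nliblzma 5.6.1\n"
	if v, ok := liblzmaVersion(sample); ok {
		fmt.Printf("liblzma %s affected: %v\n", v, isBackdoored(v))
	}
}
```

In a real check you would feed the function the output of the `xz --version` command captured on the target system instead of the sample string.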

limited but easy interface

Rather than expose the call-chain dependencies of lzma directly to you, safexz provides a simpler Go interface for your integration projects. The interface breaks your calls down into simple one-way chan transfers that stream data to lzma.

Direct file and []byte I/O functions are provided as syntactic sugar, so that you can avoid standing up a stream monitor and input feeder.

safexz wraps the lzma library in isolated goroutines

This project's goal is to abide by a chaos/containment theory: unless you wrote the C code, tread carefully.

To make it harder for someone with a suspicious C library to gain control over your own Go program that uses that library, safexz:

  • Hides C call access and data structures in internal packages
  • Makes no call to a C library without boxing it inside 2 layers of goroutines
  • Requires all communication of data to and from the C library's functions to take place from a separate goroutine, with data passing only via channels using packed types (fields without pointers) wherever possible
  • Makes Go pointers into lzma impossible
  • Ensures that no one using this library ever has to manage liblzma's internal state, and that liblzma cannot "see" the data types of your own code, only the []byte stream of data to be compressed/decompressed
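The containment rules in the list above can be sketched in plain Go. This is an illustration of the general pattern only, not safexz's internal code: `chunk`, `toChunk`, `foreignTransform` and `boxed` are hypothetical names, and `foreignTransform` is a pure-Go stand-in for a call into a C library.

```go
package main

import "fmt"

// chunk is a pointer-free ("packed") value: a fixed-size array plus a
// length. Sending it over a channel copies the bytes instead of sharing memory.
type chunk struct {
	data [8]byte
	n    int
}

// toChunk copies up to 8 bytes of b into a chunk.
func toChunk(b []byte) chunk {
	var c chunk
	c.n = copy(c.data[:], b)
	return c
}

// foreignTransform stands in for the call into the C library.
// Here it simply inverts every byte.
func foreignTransform(c chunk) chunk {
	for i := 0; i < c.n; i++ {
		c.data[i] ^= 0xFF
	}
	return c
}

// boxed runs the foreign call behind two layers of goroutines. The outer
// goroutine owns the channels; the inner one makes the call and recovers
// from Go panics so a failure drops that chunk instead of the whole program.
func boxed(in <-chan chunk) <-chan chunk {
	out := make(chan chunk)
	go func() { // layer 1: supervisor
		defer close(out)
		for c := range in {
			result := make(chan chunk, 1)
			go func(c chunk) { // layer 2: the only place the call happens
				defer func() {
					if r := recover(); r != nil {
						close(result) // signal failure without a value
					}
				}()
				result <- foreignTransform(c)
			}(c)
			if r, ok := <-result; ok {
				out <- r
			}
		}
	}()
	return out
}

func main() {
	in := make(chan chunk)
	out := boxed(in)
	go func() {
		defer close(in)
		in <- toChunk([]byte("xz"))
	}()
	for r := range out {
		fmt.Printf("%x\n", r.data[:r.n])
	}
}
```

Note that `recover` only catches Go panics; a crash inside real C code would still take the process down, which is why the channel boundary and pointer-free types matter as a separate line of defense.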

More detail about this technique is over here.

credits

This work is based off the direct liblzma library stubs published by @jamespfennell under the MIT license. liblzma is the published work of Lasse Collin and many others and is published under the 0BSD software license.

safexz is the published work of Christopher Sawyer and is made available under the BSD 3-Clause license.

safexz's Issues

[TODO] Set up cgo-minimal container repo in DockerHub

  • Make 2 versions, one with musl and one glibc
  • add libpthread.so
  • add libc.so
  • add ld.so
  • add liblzma.so
  • make it follow the alpine/debian base images it's getting the deps from
  • so people know how to use it, write CGo Hello World and use it in the go-minimal, and for fun it will use safexz to showcase doing CGo as a minimal container type

Building: MSYS2

Need to write a step by step instruction list for bringing liblzma.dll from MSYS2 into a Go project so you can then use safexz

Retrogression: Slackware

Retrogression Test

  • Set up VMs for earlier LTS
  • Match with Go release appropriate to that era
  • git clone and go test
  • If BREAK=YES, move up Go release to where it works again and answer this:
    • Can go.mod be moved backwards to go back to earlier Go release?
    • Was the breakage caused by liblzma.so itself?
    • Anything else needs to be done to make safexz work at that release level?
    • Repeat these questions, but roll Slackware back even further until PONR is found.

Retrogression: Debian

Retrogression Test

  • Set up VMs for earlier LTS
  • Match with Go release appropriate to that era
  • git clone and go test
  • If BREAK=YES, move up Go release to where it works again and answer this:
    • Can go.mod be moved backwards to go back to earlier Go release?
    • Was the breakage caused by liblzma.so itself?
    • Anything else needs to be done to make safexz work at that release level?
    • Repeat these questions, but roll Debian back even further until PONR is found.

Retrogression: First-ever release of Rocky Linux

Retrogression Test

Note
First Rocky essentially covers the entire supported commercial install base of RHEL and variants from when it forked CentOS.

  • Set up VMs for earlier LTS
  • Match with Go release appropriate to that era
  • git clone and go test
  • If BREAK=YES, move up Go release to where it works again and answer this:
    • Can go.mod be moved backwards to go back to earlier Go release?
    • Was the breakage caused by liblzma.so itself?
    • Anything else needs to be done to make safexz work at that release level?
