
Archive Patcher Documentation

Copyright 2016 Google Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Introduction

Archive-patcher is an open-source project that allows space-efficient patching of zip archives. Many common distribution formats (such as jar and apk) are valid zip archives; archive-patcher works with all of them.

Because the patching process examines each individual file within the input archives, we refer to the process as File-by-File patching and an individual patch generated by that process as a File-by-File patch. Archive-patcher processes almost all zip files, but it is most efficient for zip files created with "standard" tools like PKWARE's 'zip', Oracle's 'jar', and Google's 'aapt'.

By design, File-by-File patches are uncompressed. This allows freedom in choosing the best compression algorithms for a given use case. It is usually best to compress the patches for storage or transport.

Note: Archive-patcher does not currently handle 'zip64' archives (archives supporting more than 65,535 files or containing files larger than 4GB in size).

How It Works

Archive-patcher transforms archives into a delta-friendly space to generate and apply a delta. This transformation involves uncompressing the compressed content that has changed, while leaving everything else alone. The patch applier then recompresses the changed content to create a perfect binary copy of the new archive that was the original input to the patch generator. In v1, bsdiff is the delta algorithm used within the delta-friendly space. Much more information on this subject is available in the Appendix.

Diagrams and examples follow. In these examples we will use an old archive and a new archive, each containing 3 files: foo.txt, bar.xml, and baz.lib:

  • foo.txt has changed its content between the old and new archives. It is uncompressed from both the old and new archives during transformation to the delta-friendly space. This will allow the delta between v1 and v2 of the file to be encoded efficiently.
  • bar.xml has also changed its content between the old and new archives. It is already uncompressed in the old and new archives, so it is left alone during transformation to the delta-friendly space. The delta between v1 and v2 of the file can already be encoded efficiently.
  • baz.lib has not changed between the old and new archives. It is left alone during transformation to the delta-friendly space because it has not changed and the delta for an unchanged file is trivially empty.

Generating a Patch

  1. Determine which files in the new archive have changed from the old archive.
  2. Determine which of the changed files from (1) have deflate settings that can be determined and record those settings.
  3. Determine the original offsets and lengths of all files in (2) in both the old and new archives.
  4. Create delta-friendly versions of both the old and new archives, uncompressing the files from (2). The resulting intermediate artifacts are called delta-friendly blobs; they are no longer valid zip archives.
  5. Generate a delta between the old and new delta-friendly blobs from (4).
  6. Output the patch carrying the data from (2), (3) and (5).

File-by-File v1: Patch Generation Overview


                      Delta-Friendly       Delta-Friendly
   Old Archive           Old Blob             New Blob            New Archive
 ----------------    ----------------     ----------------    ----------------
 |   foo.txt    |    |   foo.txt    |     |   foo.txt    |    |   foo.txt    |
 |   version 1  |    |   version 1  |     |   version 2  |    |   version 2  |
 | (compressed) |    |(uncompressed)|     |(uncompressed)|    | (compressed) |
 |--------------|    |              |     |              |    |--------------|
 |   bar.xml    |    |              |     |              |    |   bar.xml    |
 |   version 1  |    |--------------|     |--------------|    |   version 2  |
 |(uncompressed)|--->|   bar.xml    |--┬--|   bar.xml    |<---|(uncompressed)|
 |--------------|    |   version 1  |  |  |   version 2  |    |--------------|
 |   baz.lib    |    |(uncompressed)|  |  |(uncompressed)|    |   baz.lib    |
 |   version 1  |    |--------------|  |  |--------------|    |   version 1  |
 | (compressed) |    |   baz.lib    |  |  |   baz.lib    |    | (compressed) |
 ----------------    |   version 1  |  |  |   version 1  |    ----------------
        |            | (compressed) |  |  | (compressed) |            |
        |            ----------------  |  ----------------            |
        v                              v                              v
 ----------------                 ----------                  ----------------
 |Uncompression |                 | delta  |                  |Recompression |
 |   metadata   |                 ----------                  |   metadata   |
 ----------------                      |                      ----------------
        |                              v                              |
        |                   ----------------------                    |
        └------------------>|  File-by-File v1   |<-------------------┘
                            |       Patch        |
                            ----------------------

Applying a Patch

  1. Reconstruct the delta-friendly old blob using information from the patch.
  2. Apply the delta to the delta-friendly old blob generated in (1). This generates the delta-friendly new blob.
  3. Recompress the files in the delta-friendly new blob using information from the patch. The result is the "new archive" that was the original input to the patch generator.

File-by-File v1: Patch Application Overview


                      Delta-Friendly       Delta-Friendly
   Old Archive           Old Blob             New Blob           New Archive
 ----------------    ----------------     ---------------     ----------------
 |   foo.txt    |    |   foo.txt    |     |   foo.txt    |    |   foo.txt    |
 |   version 1  |    |   version 1  |     |   version 2  |    |   version 2  |
 | (compressed) |    |(uncompressed)|     |(uncompressed)|    | (compressed) |
 |--------------|    |              |     |              |    |--------------|
 |   bar.xml    |    |              |     |              |    |   bar.xml    |
 |   version 1  |    |--------------|     |--------------|    |   version 2  |
 |(uncompressed)|-┬->|   bar.xml    |     |   bar.xml    |-┬->|(uncompressed)|
 |--------------| |  |   version 1  |     |   version 2  | |  |--------------|
 |   baz.lib    | |  |(uncompressed)|     |(uncompressed)| |  |   baz.lib    |
 |   version 1  | |  |--------------|     |--------------| |  |   version 1  |
 | (compressed) | |  |   baz.lib    |     |   baz.lib    | |  | (compressed) |
 ---------------- |  |   version 1  |     |   version 1  | |  ----------------
                  |  | (compressed) |     | (compressed) | |
                  |  ----------------     ---------------- |
                  |         |                    ^         |
 ---------------- |         |     ----------     |         |  ----------------
 |Uncompression |-┘         └---->| delta  |-----┘         └--|Recompression |
 |   metadata   |                 ----------                  |   metadata   |
 ----------------                      ^                      ----------------
        ^                              |                              ^
        |                   ----------------------                    |
        └-------------------|  File-by-File v1   |--------------------┘
                            |       Patch        |
                            ----------------------

Handled Cases

The examples above used two simple archives with 3 common files to help explain the process, but there is significantly more nuance in the implementation. The implementation searches for and handles changes of many types, including some trickier edge cases such as a file that changes compression level, becomes compressed or becomes uncompressed, or is renamed without changes.

Files that are only in the new archive are always left alone, and the delta usually encodes them as a literal copy. Files that are only in the old archive are similarly left alone, and the delta usually just discards their bytes completely. And of course, files whose deflate settings cannot be inferred are left alone, since they cannot be recompressed and are therefore required to remain in their existing compressed form.

Note: The v1 implementation does not detect files that are renamed and changed at the same time. This is the domain of similar-file detection, a feature deemed desirable - but not critical - for v1.

Sample Code: Generating a Patch

The following code snippet illustrates how to generate a patch and compress it with deflate compression. The example in the subsequent section shows how to apply such a patch.

import com.google.archivepatcher.generator.FileByFileV1DeltaGenerator;
import com.google.archivepatcher.shared.DefaultDeflateCompatibilityWindow;
import java.io.File;
import java.io.FileOutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/** Generate a patch; args are old file path, new file path, and patch file path. */
public class SamplePatchGenerator {
  public static void main(String... args) throws Exception {
    if (!new DefaultDeflateCompatibilityWindow().isCompatible()) {
      System.err.println("zlib not compatible on this system");
      System.exit(-1);
    }
    File oldFile = new File(args[0]); // must be a zip archive
    File newFile = new File(args[1]); // must be a zip archive
    Deflater compressor = new Deflater(9, true); // to compress the patch
    try (FileOutputStream patchOut = new FileOutputStream(args[2]);
        DeflaterOutputStream compressedPatchOut =
            new DeflaterOutputStream(patchOut, compressor, 32768)) {
      new FileByFileV1DeltaGenerator().generateDelta(oldFile, newFile, compressedPatchOut);
      compressedPatchOut.finish();
      compressedPatchOut.flush();
    } finally {
      compressor.end();
    }
  }
}
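
With the classes compiled and on the classpath, the generator can then be invoked as, for example, 'java SamplePatchGenerator old.zip new.zip patch.deflated' (the file names here are illustrative). Note that the patch written to the third path is deflate-compressed; it must be uncompressed while applying, as in the next example.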

Sample Code: Applying a Patch

The following code snippet illustrates how to apply a patch that was compressed with deflate compression, as in the previous example.

import com.google.archivepatcher.applier.FileByFileV1DeltaApplier;
import com.google.archivepatcher.shared.DefaultDeflateCompatibilityWindow;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

/** Apply a patch; args are old file path, patch file path, and new file path. */
public class SamplePatchApplier {
  public static void main(String... args) throws Exception {
    if (!new DefaultDeflateCompatibilityWindow().isCompatible()) {
      System.err.println("zlib not compatible on this system");
      System.exit(-1);
    }
    File oldFile = new File(args[0]); // must be a zip archive
    Inflater uncompressor = new Inflater(true); // to uncompress the patch
    try (FileInputStream compressedPatchIn = new FileInputStream(args[1]);
        InflaterInputStream patchIn =
            new InflaterInputStream(compressedPatchIn, uncompressor, 32768);
        FileOutputStream newFileOut = new FileOutputStream(args[2])) {
      new FileByFileV1DeltaApplier().applyDelta(oldFile, patchIn, newFileOut);
    } finally {
      uncompressor.end();
    }
  }
}
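
The corresponding invocation, again with illustrative file names, is 'java SamplePatchApplier old.zip patch.deflated new.zip'. The Inflater is constructed with nowrap set to true, matching the Deflater settings used when the patch was generated.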

Background

Patching software exists primarily to make updating software or data files spatially efficient. This is accomplished by figuring out what has changed between the inputs (usually an old version and a new version of a given file) and transmitting only the changes instead of transmitting the entire file. For example, if we wanted to update a dictionary with one new definition, it's much more efficient to send just the one updated definition than to send along a brand new dictionary! A number of excellent algorithms exist to do just this - diff, bsdiff, xdelta and many more.

In order to generate spatially efficient patches for zip archives, the content within the zip archives needs to be uncompressed. This necessitates recompressing after applying a patch, and this in turn requires knowing the settings that were originally used to compress the data within the zip archive and being able to reproduce them exactly. These three problems are what make patching zip archives a unique challenge, and their solutions are what make archive-patcher interesting. If you'd like to read more about this now, skip down to Interesting Obstacles to Patching Archives.

The File-by-File v1 Patch Format

The v1 patch format is a sequence of bytes described below. Care has been taken to make the format friendly to streaming, so the order of fields in the patch is intended to reflect the order of operations needed to apply the patch. Unless otherwise noted, the following constraints apply:

  • All integer fields contain unsigned, big endian values. However:
  • 32-bit integer fields have a maximum value of 2^31 - 1 (due to limitations in Java)
  • 64-bit integer fields have a maximum value of 2^63 - 1 (due to limitations in Java)
|------------------------------------------------------|
| Versioned Identifier (8 bytes) (UTF-8 text)          | Literal: "GFbFv1_0"
|------------------------------------------------------|
| Flags (4 bytes) (currently unused, but reserved)     |
|------------------------------------------------------|
| Delta-friendly old archive size (8 bytes) (uint64)   |
|------------------------------------------------------|
| Num old archive uncompression ops (4 bytes) (uint32) |
|------------------------------------------------------|
| Old archive uncompression op 1...n (variable length) | (see definition below)
|------------------------------------------------------|
| Num new archive recompression ops (4 bytes) (uint32) |
|------------------------------------------------------|
| New archive recompression op 1...n (variable length) | (see definition below)
|------------------------------------------------------|
| Num delta descriptor records (4 bytes) (uint32)      |
|------------------------------------------------------|
| Delta descriptor record 1...n (variable length)      | (see definition below)
|------------------------------------------------------|
| Delta 1...n (variable length)                        | (see definition below)
|------------------------------------------------------|

Old Archive Uncompression Op

The number of these entries is determined by the "Num old archive uncompression ops" field previously defined. Each entry consists of an offset (from the beginning of the file) and a number of bytes to uncompress. Important notes:

  • Entries must be ordered in ascending order by offset. This allows the transformation of the old archive into the delta-friendly space to be performed by reading the old archive as a stream, instead of requiring random access.
  • Entries must not overlap (for sanity)
  • Areas of the old archive that are not included in any uncompression op are left alone; they represent arbitrary data that should not be uncompressed, such as zip structural components or blocks of data that are already stored without compression.
|------------------------------------------------------|
| Offset of first byte to uncompress (8 bytes) (uint64)|
|------------------------------------------------------|
| Number of bytes to uncompress (8 bytes) (uint64)     |
|------------------------------------------------------|
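
To make the layout concrete, the following minimal sketch reads the fixed header and the old archive uncompression ops using java.io.DataInputStream, which reads big-endian values as the format requires. This is an illustration only, not the library's parser; the class name and error handling are invented.

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch: reads the patch header and old archive uncompression ops. */
public class PatchHeaderSketch {
  public static void main(String... args) throws IOException {
    try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
      byte[] identifier = new byte[8];
      in.readFully(identifier);
      if (!"GFbFv1_0".equals(new String(identifier, StandardCharsets.UTF_8))) {
        throw new IOException("not a File-by-File v1 patch");
      }
      in.readInt(); // flags (currently unused, but reserved)
      long deltaFriendlyOldSize = in.readLong();
      System.out.println("delta-friendly old archive size: " + deltaFriendlyOldSize);
      int numUncompressionOps = in.readInt();
      for (int i = 0; i < numUncompressionOps; i++) {
        long offset = in.readLong(); // offset of first byte to uncompress
        long length = in.readLong(); // number of bytes to uncompress
        System.out.println("uncompress [" + offset + ", " + (offset + length) + ")");
      }
      // ... recompression ops, delta descriptor records, and deltas follow.
    }
  }
}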

New Archive Recompression Op

The number of these entries is determined by the "Num new archive recompression ops" field previously defined. Like an old archive uncompression op, each entry consists of an offset - but this time from the beginning of the delta-friendly new blob. This is followed by the number of bytes to compress, and finally a compression settings field. Important notes:

  • Entries must be ordered in ascending order by offset. This allows the output from the delta apply process (which creates the delta-friendly new blob) to be piped to an intelligent partially-compressing stream that is seeded with the knowledge of which ranges to recompress and the settings to use for each. This avoids the need to write the delta-friendly new blob to persistent storage, an important optimization.
  • Entries must not overlap (for sanity)
  • Areas of the new archive that are not included in any recompression op will be copied through from the delta-friendly new blob without modification. These represent arbitrary data that should not be compressed, such as zip structural components or blocks of data that are stored without compression in the new archive.
|------------------------------------------------------|
| Offset of first byte to compress (8 bytes) (uint64)  |
|------------------------------------------------------|
| Number of bytes to compress (8 bytes) (uint64)       |
|------------------------------------------------------|
| Compression settings (4 bytes)                       | (see definition below)
|------------------------------------------------------|

Compression Settings

The compression settings define the deflate level (in the range 1 to 9, inclusive), the deflate strategy (in the range 0 to 2, inclusive) and the wrapping mode (wrap or nowrap). The settings are specific to a compatibility window, discussed in the next section in more detail.

In practice almost all entries in zip archives have strategy 0 (the default) and wrapping mode 'nowrap'. The other strategies are primarily used in-situ, e.g., the compression used within the PNG format; wrapping, on the other hand, is almost exclusively used in gzip operations.

|------------------------------------------------------|
| Compatibility window ID (1 byte) (uint8)             | (see definition below)
|------------------------------------------------------|
| Deflate level (1 byte) (uint8) (range: [1,9])        |
|------------------------------------------------------|
| Deflate strategy (1 byte) (uint8) (range: [0,2])     |
|------------------------------------------------------|
| Wrap mode (1 byte) (uint8) (0=wrap, 1=nowrap)        |
|------------------------------------------------------|
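
In terms of java.util.zip, these four bytes map directly onto a Deflater. The sketch below shows one plausible mapping (not the library's code; the class and method names are invented). Note that the wrap mode flag inverts into the Deflater constructor's 'nowrap' argument.

import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.Deflater;

/** Illustrative sketch: builds a Deflater from the compression settings field. */
public class CompressionSettingsSketch {
  static Deflater readCompressionSettings(DataInputStream in) throws IOException {
    int windowId = in.readUnsignedByte(); // compatibility window ID; 0 is the only valid value in v1
    if (windowId != 0) {
      throw new IOException("unknown compatibility window: " + windowId);
    }
    int level = in.readUnsignedByte();    // deflate level, in [1,9]
    int strategy = in.readUnsignedByte(); // deflate strategy, in [0,2]
    int wrapMode = in.readUnsignedByte(); // 0=wrap, 1=nowrap
    Deflater deflater = new Deflater(level, /* nowrap= */ wrapMode == 1);
    deflater.setStrategy(strategy); // 0=DEFAULT_STRATEGY, 1=FILTERED, 2=HUFFMAN_ONLY
    return deflater;
  }
}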

Compatibility Window

A compatibility window specifies a compression algorithm along with a range of versions and platforms upon which it is known to produce predictable and consistent output. That is, all implementations within a given compatibility window must produce identical output for any identical inputs consisting of bytes to be compressed along with the compression settings (level, strategy, wrapping mode).

In File-by-File v1, there is only one compatibility window defined. It is the default deflate compatibility window, having ID=0 (all other values reserved for future expansion), and it specifies the following configuration:

  • Algorithm: deflate (zlib)
  • Window length: 32,768 bytes (hardcoded and implied, not explicitly set)
  • Valid compression levels: 1 through 9 (0 means store, and is unused)
  • Valid strategies: 0, 1, or 2 (java.util.zip does not support any later strategies)
  • Valid wrapping modes: wrap, nowrap

The default compatibility window is compatible with the following runtime environments based on empirical testing. Other environments may be compatible, but the ones in this table are known to be.

  • Sun/Oracle JRE (including OpenJDK)
    OS: Linux. Hardware architectures: x64. Min version: 1.7 (07 Jul, 2011). Max version: none known as of September 2016.
    Notes: Still compatible as of 1.8, the latest as of August 2016. Versions prior to 1.7 have different level_flags (see zlib change).
  • Dalvik/ART
    OS: Android. Hardware architectures: armeabi-v7a, arm64-v8a, x86. Min version: API 15 (19 Oct, 2011). Max version: none known as of September 2016.
    Notes: Still compatible as of API 24 (Nougat), the latest as of September 2016. Versions prior to API 15 (Ice Cream Sandwich) used a smaller sliding window size (see AOSP change).

Delta Descriptor Record

Delta descriptor records are grouped together before any of the actual deltas. In File-by-File v1 there is always exactly one delta, so there is exactly one delta descriptor record followed immediately by the delta data. Conceptually, the descriptor defines input and output regions of the archives along with a delta to be applied to those regions (reading from one, and writing to the other).

In subsequent versions there may be arbitrarily many deltas. When there is more than one delta, all the descriptors are listed in a contiguous block followed by all of the deltas themselves, also in a contiguous block. This allows the patch applier to pre­process the list of all deltas that are going to be applied and allocate resources accordingly. As with the other descriptors, these must be ordered by ascending offset and overlaps are not allowed.

|------------------------------------------------------|
| Delta format ID (1 byte) (uint8)                     |
|------------------------------------------------------|
| Old delta-friendly region start (8 bytes) (uint64)   |
|------------------------------------------------------|
| Old delta-friendly region length (8 bytes) (uint64)  |
|------------------------------------------------------|
| New delta-friendly region start (8 bytes) (uint64)   |
|------------------------------------------------------|
| New delta-friendly region length (8 bytes) (uint64)  |
|------------------------------------------------------|
| Delta length (8 bytes) (uint64)                      |
|------------------------------------------------------|

The fields within this record are a little more complex than those in the other parts of the patch (a reading sketch follows the list):

  • Delta format: The only delta format in File-by-File v1 is bsdiff, having ID=0.
  • Old delta-friendly region start: The offset into the old archive (after transformation into the delta-friendly space) to which the delta applies. In File-by-File v1, this is always zero.
  • Old delta-friendly region length: The number of bytes in the old archive (again, after transformation into the delta-friendly space) to which the delta applies. In File-by-File v1, this is always the length of the old archive in the delta-friendly space.
  • New delta-friendly region start: The offset into the new archive (before transformation out of the delta-friendly space) to which the delta applies. In File-by-File v1, this is always zero.
  • New delta-friendly region length: The number of bytes in the new archive (again, before transformation out of the delta-friendly space) to which the delta applies. In File-by-File v1, this is always the length of the new archive in the delta-friendly space.
  • Delta length: The number of bytes in the actual delta (e.g., a bsdiff patch) that needs to be applied to the regions defined above. The type of the delta is determined by the delta format, also defined above.
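
Reading a descriptor follows the same pattern as the other records. A minimal sketch, with invented names, assuming the same big-endian DataInputStream approach as in the earlier header sketch:

import java.io.DataInputStream;
import java.io.IOException;

/** Illustrative sketch: reads one delta descriptor record. */
public class DeltaDescriptorSketch {
  static void readDescriptor(DataInputStream in) throws IOException {
    int deltaFormat = in.readUnsignedByte(); // 0 = bsdiff, the only format in v1
    long oldStart = in.readLong();    // always 0 in v1
    long oldLength = in.readLong();   // length of the delta-friendly old blob in v1
    long newStart = in.readLong();    // always 0 in v1
    long newLength = in.readLong();   // length of the delta-friendly new blob in v1
    long deltaLength = in.readLong(); // bytes of delta data following the descriptors
    System.out.println("format " + deltaFormat + ": " + deltaLength + " bytes of delta for old["
        + oldStart + ", " + (oldStart + oldLength) + ") -> new["
        + newStart + ", " + (newStart + newLength) + ")");
  }
}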

Appendix

Interesting Obstacles to Patching Archives

Problem #1: Spatial Efficiency

Problem: Zip files make patching hard because compression obscures the changes. Deflate, the compression algorithm used most widely in zip archives, uses a 32k "sliding window" to compress, carrying state with it as it goes. Because state is carried along, even small changes to the data that is being compressed can result in drastic changes to the bytes that are output - even if the size remains similar. If you change the definition of 'aardvark' in our imaginary dictionary (from back in the Background section) and zip both the old and new copies, the resulting zip files will be about the same size, but will have very different bytes. If you try to generate a patch between the two zip files with the same algorithm you used before (e.g., bsdiff), you'll find that the resulting patch file is much, much larger - probably about the same size as one of the zip files. This is because the files are too dissimilar to express any changes succinctly, so the patching algorithm ends up having to embed a copy of almost the entire file.

Solution: Archive-patcher transforms the input archives into what we refer to as delta-friendly space where changed files are stored uncompressed, allowing diffing algorithms like bsdiff to function far more effectively.

Note: There are techniques that can be applied to deflate to isolate changes and stop them from causing the entire output to be different, such as those used in rsync-friendly gzip. However, zip archives created with such techniques are uncommon - and tend to be slightly larger in size.

Problem #2: Correctness When Generating Patches

Problem: In order for the generated patch to be correct, we need to know the original deflate settings that were used for any changed content that we plan to uncompress during the transformation to the delta-friendly space. This is necessary so that the patch applier can recompress that changed content after applying the delta, such that the resulting archive is exactly the same as the input to the patch generator. The deflate settings we care about are the level, strategy, and wrap mode.

Solution: Archive-patcher iteratively recompresses each piece of changed content with different deflate settings, looking for a perfect match. The search is ordered based on empirical data and one of the first 3 guesses is extremely likely to succeed. Because deflate has a stateful and small sliding window, mismatches are quickly identified and discarded. If a match is found, the corresponding settings are added to the patch stream and the content is uncompressed in-place as previously described; if a match is not found then the content is left compressed (because we lack any way to tell the patch applier how to recompress it later).
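
The following sketch illustrates the search in its simplest form. It assumes in-memory byte arrays and compresses each candidate in full; the real implementation is smarter, streaming the comparison and abandoning a candidate at the first mismatched byte. Class and method names are invented for this example.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/** Illustrative sketch: divines deflate settings by exhaustive recompression. */
public class SettingsDivinerSketch {
  /** Returns {level, strategy, wrapMode} that reproduce compressedData, or null if none match. */
  static int[] divine(byte[] uncompressedData, byte[] compressedData) throws IOException {
    for (int wrapMode = 0; wrapMode <= 1; wrapMode++) {
      for (int strategy = 0; strategy <= 2; strategy++) {
        for (int level = 1; level <= 9; level++) {
          byte[] recompressed = deflate(uncompressedData, level, strategy, wrapMode == 1);
          if (Arrays.equals(recompressed, compressedData)) {
            return new int[] {level, strategy, wrapMode};
          }
        }
      }
    }
    return null; // settings could not be divined; leave the entry compressed
  }

  static byte[] deflate(byte[] input, int level, int strategy, boolean nowrap) throws IOException {
    Deflater deflater = new Deflater(level, nowrap);
    deflater.setStrategy(strategy); // must be set before input is provided
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DeflaterOutputStream deflaterOut = new DeflaterOutputStream(out, deflater)) {
      deflaterOut.write(input);
    } finally {
      deflater.end();
    }
    return out.toByteArray();
  }
}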

Note: While it is possible to change other settings for deflate (like the size of its sliding window), in practice this is almost never done. Content that has been compressed with other settings changed will be left compressed during patch generation.

Problem #3: Correctness When Applying Patches

Problem: The patch applier needs to know that it can reproduce deflate output in exactly the same way as the patch generator did. If this is not possible, patching will fail. The biggest risk is that the patch applier's implementation of deflate differs in some way from that of the patch generator that detected the deflate settings. Any deviation will cause the output to diverge from the original input to the patch generator. Archive-patcher relies on the java.util.zip package which in turn wraps a copy of zlib that ships with the JRE. It is this version of zlib that provides the implementation of deflate.

Solution: Archive-patcher contains a ~9000 byte corpus of text that produces a unique output for every possible combination of deflate settings that are exposed through the java.util.zip interface (level, strategy, and wrapping mode). These outputs are digested to produce "fingerprints" for each combination of deflate settings on a given version of the zlib library; these fingerprints are then hard-coded into the application. The patch applier checks the local zlib implementation's suitability by repeating the process, deflating the corpus with each combination of java.util.zip settings and digesting the results, then checks that the resulting fingerprints match the hard-coded values.
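
A sketch of the fingerprinting idea follows. The corpus (supplied here as a file argument) and the choice of digest algorithm are placeholders for this illustration; the real logic lives behind DefaultDeflateCompatibilityWindow.isCompatible().

import java.io.ByteArrayOutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/** Illustrative sketch: fingerprints the local deflate implementation. */
public class DeflateFingerprintSketch {
  public static void main(String... args) throws Exception {
    byte[] corpus = Files.readAllBytes(Paths.get(args[0])); // stand-in for the built-in corpus
    MessageDigest digest = MessageDigest.getInstance("SHA-256"); // digest choice is illustrative
    for (int level = 1; level <= 9; level++) {
      for (int strategy = 0; strategy <= 2; strategy++) {
        for (int wrapMode = 0; wrapMode <= 1; wrapMode++) {
          Deflater deflater = new Deflater(level, wrapMode == 1);
          deflater.setStrategy(strategy);
          ByteArrayOutputStream out = new ByteArrayOutputStream();
          try (DeflaterOutputStream deflaterOut = new DeflaterOutputStream(out, deflater)) {
            deflaterOut.write(corpus);
          } finally {
            deflater.end();
          }
          byte[] fingerprint = digest.digest(out.toByteArray());
          // A real check would compare each fingerprint to a hard-coded expected value.
          System.out.printf("level=%d strategy=%d wrapMode=%d -> %s%n",
              level, strategy, wrapMode, toHex(fingerprint));
        }
      }
    }
  }

  private static String toHex(byte[] bytes) {
    StringBuilder hex = new StringBuilder();
    for (byte b : bytes) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }
}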

Note: At the time of this writing (September, 2016), all zlib versions since 1.2.0.4 (dated 10 August 2003) have identical fingerprints. This includes every version of Sun/Oracle Java from 1.6.0 onwards on x86 and x86_64 as well as all versions of the Android Open Source Project from 4.0 onward on x86, arm32 and arm64. Other platforms may also be compatible but have not been tested.

Note: This solution is somewhat brittle, but is demonstrably suitable to cover 13 years of zlib updates. Compatibility may be extended in a future version by bundling specific versions of zlib with the application to avoid a dependency upon the zlib in the JRE as necessary.

Areas For Improvement

The File-by-File v1 patching process dramatically improves the spatial efficiency of patches for zip archives, but there are many improvements that can still be made. Here are a few of the more obvious ones that did not make it into v1, but are good candidates for inclusion into later versions:

  • Support for detecting "similar" files between the old and new archives to handle renames that are coupled with content changes.
  • Support for additional versions of zlib or other implementations of deflate.
  • Support for other archive formats.
  • Support for other delta algorithms.
  • Support for more than one delta (i.e., applying different algorithms to different regions of the archives).
  • Support for file-specific transformations (i.e., delta-friendly optimization of different files types during the transformation into the delta-friendly space).
  • Support for partial decompression (i.e., only undoing the Huffman coding steps of deflate and operating on the LZ77 instruction stream during patch generation, allowing a much faster "recompression" step that skips LZ77).

Acknowledgements

Major software contributions, in alphabetical order:

  • Andrew Hayden - design, implementation, documentation
  • Anthony Morris - code reviews, cleanups, div suffix sort port, and invaluable suggestions
  • Glenn Hartmann - code reviews, initial bsdiff port and quick suffix sort port, bsdiff cleanups

Additionally, we wish to acknowledge the following, also in alphabetical order:

  • Colin Percival - the author of bsdiff
  • Mark Adler - the author of zlib
  • N. Jesper Larsson and Kunihiko Sadakane - for their paper "Faster Suffix Sorting", basis of the quick suffix sort algorithm
  • PKWARE, Inc. - creators and stewards of the zip specification
  • Yuta Mori - for the C implementation of the "div suffix sort" (libdivsufsort)

Contact

[email protected]


archive-patcher's Issues

How to build code on branch v2

Hi,
I'm not very familiar with java and when I want to build the code on branch v2, I get the following error:

soheil@soheil-thinkpad:~/projects/github/archive-patcher$ gradle build

Starting a Gradle Daemon (subsequent builds will be faster)

FAILURE: Build failed with an exception.

* Where:
Build file '/home/soheil/projects/github/archive-patcher/applier/build.gradle' line: 9

* What went wrong:
A problem occurred evaluating project ':applier'.
> Could not find method annotationProcessor() for arguments [com.google.auto.value:auto-value:1.6.2] on object of type org.gradle.api.internal.artifacts.dsl.dependencies.DefaultDependencyHandler.

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 3s

Could anyone help me with how I can fix this error?

PS: I can build the source code from the master branch, but since there are lots of files that I need to create patches for daily, I really need a huge improvement in the algorithm's performance. I would be really thankful if you would give me some tips on how to improve the algorithm's speed.

Thanks in advance.

On Android 12 devices, the file generated with the patch is different from the original file

On Android 12 devices (almost all Android 12 devices), I found that in certain cases the file generated with the patch was different from the original file.

env: android os version:android12,archive-patcher version:v1.1

My test files are as follows:
old file: 19onlyjar.zip
patch file: patch-onlyjar.zip
original file: 20onlyjar-ori.zip

How to reproduce:
1. Unzip the patch file first.
2. Apply the patch with the following code:
public static void applyPatch(File oldFile, File patchFile, File newFile, File tempDir) throws IOException {
  if (!tempDir.exists()) {
    tempDir.mkdirs();
  }
  BufferedInputStream bufferedPatchIn = null;
  BufferedOutputStream bufferedNewOut = null;
  try {
    FileByFileV1DeltaApplier applier = new FileByFileV1DeltaApplier(tempDir);
    FileInputStream patchIn = new FileInputStream(patchFile);
    bufferedPatchIn = new BufferedInputStream(patchIn);
    FileOutputStream newOut = new FileOutputStream(newFile);
    bufferedNewOut = new BufferedOutputStream(newOut);
    applier.applyDelta(oldFile, bufferedPatchIn, bufferedNewOut);
    bufferedNewOut.flush();
  } finally {
    if (bufferedPatchIn != null) {
      bufferedPatchIn.close();
    }
    if (bufferedNewOut != null) {
      bufferedNewOut.close();
    }
  }
}

3. Compare the MD5 of the original file and the file generated in step 2.

19onlyjar.zip
patch-onlyjar.zip
20onlyjar-ori.zip

Some zlib versions fail the DefaultDeflateCompatibilityWindow compatibility test

The new DefaultDeflateCompatibilityWindow().isCompatible() test currently fails on many Linux distros, including Debian 10 and 11, Fedora 32, and Arch, but passes on at least Ubuntu 18.04 and 20.04. Tested on OpenJDK 11. This is due to a change in zlib:

commit 43bfaba3d718a27c1b137b1d1aa90d9427ab4a4f
Author: Mark Adler <[email protected]>
Date:   Sun Aug 2 00:02:07 2015 -0700

    Align deflateParams() and its documentation in zlib.h.

    This updates the documentation to reflect the behavior of
    deflateParams() when it is not able to compress all of the input
    data provided so far due to insufficient output space.  It also
    assures that data provided is compressed before the parameter
    changes, even if at the beginning of the stream.

 deflate.c |  3 +--
 zlib.h    | 38 ++++++++++++++++++++++++++------------
 2 files changed, 27 insertions(+), 14 deletions(-)

which is present in zlib v1.2.9, v1.2.10, and v1.2.11.

The original behavior was restored upstream in the develop branch in

commit f9694097dd69354b03cb8af959094c7f260db0a1
Author: Mark Adler <[email protected]>
Date:   Mon Jan 16 09:49:35 2017 -0800

    Permit a deflateParams() parameter change as soon as possible.

    This commit allows a parameter change even if the input data has
    not all been compressed and copied to the application output
    buffer, so long as all of the input data has been compressed to
    the internal pending output buffer. This also allows an immediate
    deflateParams change so long as there have been no deflate calls
    since initialization or reset.

which is unreleased - however Ubuntu is manually patching zlib with it in at least 18.04 and 20.04 - which is why those work.

We have additionally discovered that disabling caching on the deflate compressor in DefaultDeflateCompatibilityWindow fixes this too, e.g.:

    DeflateCompressor compressor = new DeflateCompressor();
    compressor.setCaching(false);

however I don't know if this is a safe change to make.

Support variable record length control fields properly

As described in the "internal file attributes" section of the spec:
https://www.pkware.com/documents/APPNOTE/APPNOTE-6.3.2.TXT

These should be trivially supported, and take the form of 4-byte headers prior 
to each "logical record". It is not clear from first glance what constitutes a 
"logical record", but presumably each Part (e.g., DataDescriptor, 
LocalFileEntry, FileData, etc...) is a "logical record".

Support for "dark bits" should fix the immediate concern, which is in 
manipulating archives that have these fields (see issue 1).

Original issue reported on code.google.com by [email protected] on 3 Aug 2014 at 8:30

Security Policy violation Binary Artifacts

This issue was automatically created by Allstar.

Security Policy Violation
Project is out of compliance with Binary Artifacts policy: binaries present in source code

Rule Description
Binary Artifacts are an increased security risk in your repository. Binary artifacts cannot be reviewed, allowing the introduction of possibly obsolete or maliciously subverted executables. For more information see the Security Scorecards Documentation for Binary Artifacts.

Remediation Steps
To remediate, remove the generated executable artifacts from the repository.

Artifacts Found

  • gradle/wrapper/gradle-wrapper.jar

Additional Information
This policy is drawn from Security Scorecards, which is a tool that scores a project's adherence to security best practices. You may wish to run a Scorecards scan directly on this repository for more details.


Allstar has been installed on all Google managed GitHub orgs. Policies are gradually being rolled out and enforced by the GOSST and OSPO teams. Learn more at http://go/allstar

This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.

Very slow to apply patch on Android device

Hi, I am trying to use the applier on an Android device. However, for an old APK of about 15 MB and a 2 MB patch file, it takes ~8 s to complete. Is this normal? (It seems very slow...) Thanks!

Better support for non-MSDOS "external" file attributes

As described in the "external file attributes" section of the spec:
https://www.pkware.com/documents/APPNOTE/APPNOTE-6.3.2.TXT

Currently, the library doesn't even attempt to parse these so extraction on any 
modern OS will almost certainly result in default (and possibly incorrect) file 
system attributes. This is relatively minor, since the library isn't intended 
to replace well-known command-line tools like "zip" and "jar", but it should 
still be fixed.

Original issue reported on code.google.com by [email protected] on 3 Aug 2014 at 8:44

Output Stream is closed during patch apply

The patch applier is close()ing the stream that is passed in because it traps the PartiallyDeflatingOutputStream in a finally block and close()'s there for safety. This should be a flush(), not a close(), as we don't own the output stream that is being used and other operations may need to be performed prior to close().

Don't close() the stream!

Update README to explain that bsdiff is used without compression

The README talks about the delta descriptor mentioning bsdiff, but omits the important fact that compression is not used in this version of bsdiff. That's really important, and we should update the docs to make this clear for anyone trying to write a compatible patch applier. By default bsdiff uses bzip2 compression; using ANY compression inside the patch itself will yield an incompatible patch.

For completeness, to help this item stand alone in search results: each delta within the patch is deliberately left uncompressed. This is so that an arbitrary compression algorithm can be applied to the entire patch, decoupling the compression technology from the patch generation and patch application technology. This allows file-by-file patch generators/consumers to take advantage of better compression technology as it comes along without any code changes inside the patch generator or patch applier. For example, you can trivially use bzip2, zstd, brotli, zopfli, lzma, etc as alternatives to deflate when storing and/or transmitting the patches to patch consumers.

Input too large

Hi there, I'm getting the following error when attempting to generate a patch for a large archive (just under 2GB). I've tried to use the master and v2 branches - both fail. It looks like there is a hardcoded size limit in DivSuffixSorter.

How can I make a patch for a file this large?

Thanks for your help!

Exception in thread "main" java.lang.IllegalArgumentException: Input too large (1973952678 bytes)
        at com.google.archivepatcher.generator.bsdiff.DivSuffixSorter.suffixSort(DivSuffixSorter.java:92)
        at com.google.archivepatcher.generator.bsdiff.BsDiffPatchWriter.generatePatch(BsDiffPatchWriter.java:370)
        at com.google.archivepatcher.generator.bsdiff.BsDiffPatchWriter.generatePatch(BsDiffPatchWriter.java:336)
        at com.google.archivepatcher.generator.bsdiff.BsDiffDeltaGenerator.generateDelta(BsDiffDeltaGenerator.java:52)
        at com.google.archivepatcher.generator.PatchWriter.writeDeltaEntry(PatchWriter.java:157)
        at com.google.archivepatcher.generator.PatchWriter.writePatch(PatchWriter.java:123)
        at com.google.archivepatcher.generator.FileByFileDeltaGenerator.generateDelta(FileByFileDeltaGenerator.java:120)
        at com.google.archivepatcher.generator.DeltaGenerator.generateDelta(DeltaGenerator.java:38)
        at com.google.archivepatcher.sample.SamplePatchGenerator.main(SamplePatchGenerator.java:43)

Failed to generate patch from APK: java.util.zip.ZipException: EOCD record not found in last 32k of archive, giving up

I am using this lib to generate a patch from an APK file. My environment is Docker on macOS. Previous versions of my APK file worked fine, but this version does not :(

I would appreciate it if any suggestions could be given. Now I cannot deploy new versions of my app!

STDOUT: Exception in thread "main" java.util.zip.ZipException: EOCD record not found in last 32k of archive, giving up
STDOUT: 	at com.google.archivepatcher.generator.MinimalZipArchive.listEntriesInternal(MinimalZipArchive.java:72)
STDOUT: 	at com.google.archivepatcher.generator.MinimalZipArchive.listEntries(MinimalZipArchive.java:56)
STDOUT: 	at com.google.archivepatcher.generator.PreDiffExecutor.generatePreDiffPlan(PreDiffExecutor.java:210)
STDOUT: 	at com.google.archivepatcher.generator.PreDiffExecutor.prepareForDiffing(PreDiffExecutor.java:157)
STDOUT: 	at com.google.archivepatcher.generator.FileByFileV1DeltaGenerator.generateDelta(FileByFileV1DeltaGenerator.java:81)
STDOUT: 	at PatchGenerator.main(PatchGenerator.java:36)

Massive performance regression in v2 non-native generator

Even if you supply MmapByteSource instances to FileByFileDeltaGenerator, it seems to replace them with RandomAccessFileByteSource instances.

https://github.com/google/archive-patcher/blob/v2/generator/src/main/java/com/google/archivepatcher/generator/FileByFileDeltaGenerator.java#L108-L109

https://github.com/google/archive-patcher/blob/v2/shared/src/main/java/com/google/archivepatcher/shared/bytesource/ByteSource.java#L61-L63

The current RandomAccessFileByteSource implementation seems to suffer badly from seeks, an extremely common operation.

Simply swapping ByteSource.fromFile to always return a MmapByteSource greatly improves performance.

Circular dependency

It seems that the share module depends on sharetest, and sharetest also depends on share.

java.lang.NullPointerException: Inflater has been closed

I am using archive-patcher on Android. I am getting the exception below while generating a patch; please help me.

2020-07-24 16:19:56.145 920-920/? E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.example.apkpatcher, PID: 920
java.lang.RuntimeException: Unable to start activity ComponentInfo{com.example.apkpatcher/com.example.apkpatcher.MainActivity}:java.lang.NullPointerException: Inflater has been closed
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3623)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3775)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2261)
at android.os.Handler.dispatchMessage(Handler.java:107)
at android.os.Looper.loop(Looper.java:237)
at android.app.ActivityThread.main(ActivityThread.java:8107)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:496)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1100)

Caused by: java.lang.NullPointerException: Inflater has been closed
at java.util.zip.Inflater.ensureOpen(Inflater.java:416)
at java.util.zip.Inflater.reset(Inflater.java:370)
at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:173)
at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:114)
at com.google.archivepatcher.generator.PreDiffExecutor.generatePreDiffPlan(PreDiffExecutor.java:216)
at com.google.archivepatcher.generator.PreDiffExecutor.prepareForDiffing(PreDiffExecutor.java:157)
at com.google.archivepatcher.generator.FileByFileV1DeltaGenerator.generateDelta(FileByFileV1DeltaGenerator.java:81)
at com.example.apkpatcher.DeltaCounter.generatePatch(DeltaCounter.java:97)
at com.example.apkpatcher.MainActivity.onCreate(MainActivity.java:37)
at android.app.Activity.performCreate(Activity.java:7957)
at android.app.Activity.performCreate(Activity.java:7946)
at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1307)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3598)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3775) 
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83) 
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135) 
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95) 
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2261) 
at android.os.Handler.dispatchMessage(Handler.java:107) 
at android.os.Looper.loop(Looper.java:237) 
at android.app.ActivityThread.main(ActivityThread.java:8107) 
at java.lang.reflect.Method.invoke(Native Method) 
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:496) 
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1100) 
2020-07-24 16:21:01.356 1578-1578/? E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.example.apkpatcher, PID: 1578
java.lang.RuntimeException: Unable to start activity ComponentInfo{com.example.apkpatcher/com.example.apkpatcher.MainActivity}: java.lang.NullPointerException: Inflater has been closed
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3623)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3775)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2261)
at android.os.Handler.dispatchMessage(Handler.java:107)
at android.os.Looper.loop(Looper.java:237)
at android.app.ActivityThread.main(ActivityThread.java:8107)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:496)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1100)
Caused by: java.lang.NullPointerException: Inflater has been closed
at java.util.zip.Inflater.ensureOpen(Inflater.java:416)
at java.util.zip.Inflater.reset(Inflater.java:370)
at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:173)
at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:114)
at com.google.archivepatcher.generator.PreDiffExecutor.generatePreDiffPlan(PreDiffExecutor.java:216)
at com.google.archivepatcher.generator.PreDiffExecutor.prepareForDiffing(PreDiffExecutor.java:157)
at com.google.archivepatcher.generator.FileByFileV1DeltaGenerator.generateDelta(FileByFileV1DeltaGenerator.java:81)
at com.example.apkpatcher.DeltaCounter.generatePatch(DeltaCounter.java:97)

Truth incompatible with android

Google Truth imports Unsafe and other APIs unavailable on Android.

Only test projects (and a single annotation in generator) rely on Google Truth.

Can be fixed by updating dependencies a little as in: PokeMMO@41b276a

Need more test data with externally-generated APKs, JARs, and ZIPs of varying types

The library should have more extensive test data generated with a variety of 
tools. Offhand, some of the following things make sense as common use cases:

1. Self-extracting ZIP archive
2. Executable Java Archives (JARs)
3. Android package files (APKs)

Beyond this, there are several edge case permutations that should be added:
1. Corrupt ZIP
2. ZIP containing the EOCD marker bytes in a comment block (correct seeking 
behavior should skip these bytes and find the "real" EOCD)
3. Archives with "opaque bits" (see issue 1)
4. Empty archives
5. Non-empty archives with empty resources inside

For correctness, these archives should not be generated using this library, but 
rather with external tools such as "zip", "jar", and "aapt". This will help 
guarantee compatibility.

Original issue reported on code.google.com by [email protected] on 31 Jul 2014 at 12:12

java.lang.IllegalStateException: attempt to use Deflater after calling end

When I try to generate a patch with the Generating a Patch sample code,
the app crashes with the following exception:
FATAL EXCEPTION: main
Process: com.rotem.deltaupdatepoc2, PID: 6614
java.lang.RuntimeException: Unable to start activity ComponentInfo{com.rotem.deltaupdatepoc2/com.rotem.deltaupdatepoc2.MainActivity}: java.lang.IllegalStateException: attempt to use Deflater after calling end
    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2416)
    at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2476)
    at android.app.ActivityThread.-wrap11(ActivityThread.java)
    at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1344)
    at android.os.Handler.dispatchMessage(Handler.java:102)
    at android.os.Looper.loop(Looper.java:148)
    at android.app.ActivityThread.main(ActivityThread.java:5417)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)
Caused by: java.lang.IllegalStateException: attempt to use Deflater after calling end
    at java.util.zip.Deflater.checkOpen(Deflater.java:476)
    at java.util.zip.Deflater.reset(Deflater.java:355)
    at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:183)
    at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:104)
    at com.google.archivepatcher.generator.PreDiffExecutor.generatePreDiffPlan(PreDiffExecutor.java:216)
    at com.google.archivepatcher.generator.PreDiffExecutor.prepareForDiffing(PreDiffExecutor.java:157)
    at com.google.archivepatcher.generator.FileByFileV1DeltaGenerator.generateDelta(FileByFileV1DeltaGenerator.java:81)
    at com.rotem.deltaupdatepoc2.SamplePatchGenerator.generatePatch(SamplePatchGenerator.java:39)
    at com.rotem.deltaupdatepoc2.MainActivity.cheackForPermissions(MainActivity.java:48)
    at com.rotem.deltaupdatepoc2.MainActivity.onCreate(MainActivity.java:20)
    at android.app.Activity.performCreate(Activity.java:6237)
    at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1107)
    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2369)
    at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2476)
    at android.app.ActivityThread.-wrap11(ActivityThread.java)
    at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1344)
    at android.os.Handler.dispatchMessage(Handler.java:102)
    at android.os.Looper.loop(Looper.java:148)
    at android.app.ActivityThread.main(ActivityThread.java:5417)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)

Can anyone help please?

C++/Objective-C implementation?

Thanks for this repo; before finding it we used bsdiff, while archive-patcher works much better with zip files. The problem is that we have clients on both Android and iOS, and it is not a bad idea to do the same diff and patch jobs on both platforms.

However, I cannot find any similar implementation in C++/Objective-C. Could anyone offer a way? I am trying to translate the patching job with my poor C++.

Support For iOS

Thanks for archive-patcher. It solved my problem on Android. Now I want to use archive-patcher on iOS, and I would like to know whether there is a sister project that could support me.

Android: java.lang.IllegalStateException: setLevel cannot be called after setInput

This is the stack trace:

java.lang.IllegalStateException: setLevel cannot be called after setInput
at java.util.zip.Deflater.setLevel(Deflater.java:428)
at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:172)
at com.google.archivepatcher.generator.DefaultDeflateCompressionDiviner.divineDeflateParameters(DefaultDeflateCompressionDiviner.java:114)
at com.google.archivepatcher.generator.PreDiffExecutor.generatePreDiffPlan(PreDiffExecutor.java:216)
at com.google.archivepatcher.generator.PreDiffExecutor.prepareForDiffing(PreDiffExecutor.java:157)
at com.google.archivepatcher.generator.FileByFileV1DeltaGenerator.generateDelta(FileByFileV1DeltaGenerator.java:81)
at com.xingbianli.app.MainActivity$1$1.run(MainActivity.java:51)
at java.lang.Thread.run(Thread.java:818)

This is my code:

 findViewById(R.id.diff).setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                new Thread(new Runnable() {
                    @Override
                    public void run() {
                        if (!new DefaultDeflateCompatibilityWindow().isCompatible()) {
                            System.err.println("zlib not compatible on this system");
                            System.exit(-1);
                        }
                        File oldFile = new File(old); // must be a zip archive
                        File newFile = new File(newversion); // must be a zip archive
                        Deflater compressor = new Deflater(9, true); // to compress the patch
                        // try-with-resources closes the streams even if generateDelta throws
                        try (FileOutputStream patchOut = new FileOutputStream(pat);
                             DeflaterOutputStream compressedPatchOut =
                                     new DeflaterOutputStream(patchOut, compressor, 32768)) {
                            new FileByFileV1DeltaGenerator().generateDelta(oldFile, newFile, compressedPatchOut);
                            compressedPatchOut.finish();
                        } catch (IOException | InterruptedException e) {
                            e.printStackTrace();
                        } finally {
                            compressor.end();
                        }
                    }
                }).start();
            }
        });

java.util.zip.ZipException: EOCD record not found in last 32k of archive, giving up

Getting the above exception while using a tar.gz file of size 81MB.
I tried changing the file permissions, but it didn't help.
Is there a size limitation in the current implementation? Any other leads are appreciated. Thanks.

Stack trace:

Exception in thread "main" java.util.zip.ZipException: EOCD record not found in last 32k of archive, giving up
    at com.google.archivepatcher.generator.MinimalZipArchive.listEntriesInternal(MinimalZipArchive.java:72)
    at com.google.archivepatcher.generator.MinimalZipArchive.listEntries(MinimalZipArchive.java:56)
    at com.google.archivepatcher.generator.PreDiffExecutor.generatePreDiffPlan(PreDiffExecutor.java:210)
    at com.google.archivepatcher.generator.PreDiffExecutor.prepareForDiffing(PreDiffExecutor.java:157)
    at com.google.archivepatcher.generator.FileByFileV1DeltaGenerator.generateDelta(FileByFileV1DeltaGenerator.java:81)
    at com.google.archivepatcher.sample.SamplePatchGenerator.main(SamplePatchGenerator.java:41)
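
For what it's worth: a tar.gz is a gzip stream, not a zip archive, so there is no End Of Central Directory record anywhere in the file for the scanner to find; the 81MB size is not the problem. A minimal pre-flight check could reject such inputs up front (the class below is a sketch, not part of archive-patcher; gzip streams always begin with the magic bytes 0x1f 0x8b):

    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Heuristic pre-flight check: a gzip stream (and therefore any tar.gz)
    // always starts with the magic bytes 0x1f 0x8b, so it can be rejected
    // before archive-patcher's EOCD scan fails at diff time.
    class InputSniffer {
      static boolean isGzip(String path) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
          return f.length() >= 2 && f.read() == 0x1f && f.read() == 0x8b;
        }
      }
    }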

Memory-mapped temp files used for bsdiff delta generation are not being deleted on Windows

This is a result of the following bug:
http://bugs.java.com/view_bug.do?bug_id=4715154

We already knew that the only way to force the JVM to munmap is to do an explicit GC() after nulling all references to the map obtained from the file channel in RandomAccessMmapObject (http://bugs.java.com/view_bug.do?bug_id=4469299), but this is worse. Not only is the mmap reference retained but in addition the open file handle prevents the deletion of the file itself, meaning that all bsdiff temp files stay around forever on win32. Even deleteOnExit() doesn't catch them. This is a serious problem because it means that the temp files will build up forever until they are manually purged, filling the host's disk eventually. As bsdiff needs to allocate 16x the size of the inputs, this is Very Bad (TM).

The only workaround here that I am aware of is to do a full gc() on Windows when closing a RandomAccessMmapObject, which is extremely heavyweight.
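
A minimal sketch of that workaround, with illustrative names standing in for the real RandomAccessMmapObject internals (this is not the library's actual code):

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Sketch of the heavyweight workaround: drop every reference to the
    // mapping, then request a full GC so the JVM unmaps the buffer and
    // releases the file handle before the temp file is deleted.
    class MmapTempFile implements AutoCloseable {
      private final File file;
      private RandomAccessFile raf;
      private MappedByteBuffer mapped;

      MmapTempFile(File file, long length) throws IOException {
        this.file = file;
        this.raf = new RandomAccessFile(file, "rw");
        this.mapped = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, length);
      }

      @Override
      public void close() throws IOException {
        mapped = null;             // null all references to the map
        raf.close();
        raf = null;
        System.gc();               // beg the JVM to munmap (see JDK-4715154)
        System.runFinalization();  // give the cleaner a chance to run now
        if (!file.delete()) {
          file.deleteOnExit();     // last resort; known not to fire on win32
        }
      }
    }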

Add support for opaque binary sections

Background: to read a zip-like archive, code generally starts by seeking backwards from the end of the file looking for the End Of Central Directory (EOCD) marker bytes. Once this is located, the central directory can be read in, yielding a listing of all the file entries in the zip. At this point most engines will begin processing one or more entries in the zip file.

There is no standard that defines what comes before the start of the first "Local File" entry in the zip, nor what comes after the final byte of the End Of Central Directory record. These sections can contain arbitrary data that may be needed by some tools, although it seems rare to encounter them in practice.

Empirically, none of the ZIP and APK files that I have run the tool against so far have had any such bits; this is known because applying a patch to such a file would produce incorrect results if any such "dark bits" were present (since they are not copied by the patching structure).

There is mention of such files "in the wild", e.g. executable JARs:
http://mesosphere.io/2013/12/07/executable-jars/

... and in older PKZIP-created stuff there is apparently always a prefix of the ASCII chars 'PK', potentially followed by a bunch of stuff specific to whatever tool is intended to interpret it, e.g. PKLITE, PKSFX, and so on:
http://www.garykessler.net/library/file_sigs.html

This implies that the library needs a few modifications:
1. A new "OpaqueBits" (or similar) subclass of Part.
2. The ability for such OpaqueBits to be present at the start and end of an archive.
3. The ability to send these parts along in a patch.

Since such bits are by definition opaque, there's probably nothing special we can do for them; running them through the configured delta provider seems the only sensible thing to do.

Extending this thought further, it may also be the case that some archives contain interstitial bits between entries. Again, this is undefined behavior; even if the spec declared that there should be no such bytes, it is an almost-certainty that every nontrivial ZIP implementation uses the central directory to find all the offsets for all the entries, meaning that it should be possible to inject extra bits between entries with no ill effects in most cases.

The fix for this latter problem is to generalize it and identify any and all gaps:
1. Gap between the start of the file and the first local file entry.
2. Gap between the end of one file entry and the start of the next.
3. Gap between the final file entry and the start of the central directory.
4. Gap between the final bytes of the central directory and the first byte of the End Of Central Directory record.
5. Gap between the final byte of the End Of Central Directory record and the end of the file.

There's a hidden bonus to doing this, which is that it will automagically enhance the library to support ZIP records for which it has no specific support, since any such records would take the form of opaque bits by this definition. These would correctly be included in the patch.

This should be a fairly straightforward change; all that is required is to generate an offset-based linear ordering of all the entries and find their gaps. Since the opaque bits have no discernible structure, they are just binary blobs from the perspective of the library.
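
As a sketch of that ordering-and-gap pass (the Entry and Gap types here are hypothetical stand-ins for the library's parsed parts, not its actual API):

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Hypothetical representation of a parsed zip record: its absolute file
    // offset and its total length in bytes (headers included).
    record Entry(long offset, long length) {}

    // A half-open byte range [start, end) of "opaque bits" between records.
    record Gap(long start, long end) {}

    class GapFinder {
      // Orders all records by offset and reports every gap, including any
      // prefix before the first record and any suffix after the last one,
      // which covers all five gap categories listed above.
      static List<Gap> findGaps(List<Entry> entries, long fileLength) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong(Entry::offset));
        List<Gap> gaps = new ArrayList<>();
        long cursor = 0; // first byte not yet accounted for
        for (Entry e : sorted) {
          if (e.offset() > cursor) {
            gaps.add(new Gap(cursor, e.offset()));
          }
          cursor = Math.max(cursor, e.offset() + e.length());
        }
        if (cursor < fileLength) {
          gaps.add(new Gap(cursor, fileLength)); // trailing opaque bits
        }
        return gaps;
      }
    }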

Original issue reported on code.google.com by [email protected] on 31 Jul 2014 at 12:08

How to run the sample code?

Hi, I am not very familiar with Java compilation, so I wonder how I should run the code in the sample dir. If I compile it directly, java complains that many classes are not defined. I also tried things like javac -cp archive-patcher-1.0.jar ./SamplePatchGenerator.java && java SamplePatchGenerator (with the jar downloaded and put in the right location), but still no luck. Thanks!
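
For reference, a hedged guess at a working invocation, assuming the sample keeps its package declaration (com.google.archivepatcher.sample, as seen in the stack traces elsewhere on this page) and the jar sits in the current directory: the class must be compiled into its package directory and then run by its fully qualified name, with the jar on the classpath for both steps:

    javac -cp archive-patcher-1.0.jar -d . SamplePatchGenerator.java
    java -cp archive-patcher-1.0.jar:. com.google.archivepatcher.sample.SamplePatchGenerator

(On Windows, use ; instead of : as the classpath separator.)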

Android 11 (R) [API 30]: Zlib not compatible on this system

isCompatible is returning false on Android R. Is there a known reason?

if (!DefaultDeflateCompatibilityWindow().isCompatible) {
    Logger.W("Zlib not compatible on this system")
    return false
}

Also, could we bundle zlib with archive-patcher as a library, instead of using the system zlib, to avoid this kind of issue during OS upgrades?

Bad patch generated if new archive contains multiple entries that trigger decompression of one entry in the old archive

If the new archive contains two or more entries that BOTH require uncompression of THE SAME entry in the old archive, two uncompress instructions are generated in the patch stream for the same range. The patch applier expects ranges to be sorted in increasing order, so when it encounters the second range in the stream, the offset it specifies has already been processed and the patch is treated as invalid.

The fix is to switch from a List to a Set in the PreDiffPlanner; if any entry in the new archive requires decompression of an entry in the old archive, then the old entry should be uncompressed and the instruction should be output exactly once. A new test needs to be added for this case as well.
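
A minimal sketch of that deduplication, with hypothetical names standing in for the actual PreDiffPlanner types (not the library's real code):

    import java.util.LinkedHashSet;
    import java.util.Set;

    // Hypothetical stand-in for the range describing one old-archive entry
    // that must be uncompressed before diffing. A record's value-based
    // equals/hashCode on (offset, length) is what lets the Set deduplicate.
    record UncompressRange(long offset, long length) {}

    class PreDiffPlanSketch {
      // A Set guarantees that even if several new entries map back to the
      // same old entry, the uncompress instruction for that range is kept
      // exactly once; LinkedHashSet preserves insertion order so the ranges
      // can still be written out sorted.
      private final Set<UncompressRange> oldRangesToUncompress = new LinkedHashSet<>();

      void require(UncompressRange range) {
        oldRangesToUncompress.add(range); // duplicates are silently ignored
      }
    }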

This didn't come up before because it's a little weird to both clone a resource AND change its compression level at the same time when building a new archive, but it is a valid use case and it is currently broken.
