Giter Site home page Giter Site logo

murmur's Introduction

murmur

Build Status Coverage Status license Maven Central

murmur is a pure Java implementation of all Murmur hashes, namely, Murmur1, Murmur2 and Murmur3. The library is a direct Java implementation of the C++ source code. Hash generation has been 100% unit tested against the hashes generated using the C++ code. The library should help in building out bloom filters, or to just compute the hash for checking sanity of data, as Murmur3 is much faster than MD5 and SHA computations.

The library is tested on the following JDK versions:

  • Oracle JDK 11
  • Oracle JDK 8
  • Open JDK 11
  • Open JDK 10
  • Open JDK 9
  • Open JDK 8

Why murmur?

murmur was developed as we could not find pure Java implementations for Murmur1 and Murmur2 hashes. Implementations were available for Murmur3 but for some of the legacy code that I maintain, I needed the Murmur1 and Murmur2 hashes. Thus, I ported the original implementations.

You may find the hash inconsistent with Google Guava library. The hash value is the same, it is the endian-ness of the hash that makes it look different. Refer to Issue #3 for more details.

To convert the hash into byte[] or a hex-string you may use the following code:

/**
 * Convert a given long value to byte-array.
 * 
 * @param x the long value
 * 
 * @return the byte[] array representation of it
 */
public static byte[] longToBytes(long x) {
	ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);

	// The ByteOrder.LITTLE_ENDIAN format matches the Google Guava toString() format
	buffer.order(ByteOrder.LITTLE_ENDIAN);

	buffer.putLong(x);
	return buffer.array();
}

/**
 * Convert a byte-array to hex string.
 * 
 * @param bytes the byte-array
 * 
 * @return the hex string
 */
public static String bytesToHex(byte[] bytes) {
	char[] hexChars = new char[bytes.length * 2];
	for (int j = 0; j < bytes.length; j++) {
		int v = bytes[j] & 0xFF;
		hexChars[j * 2] = hexArray[v >>> 4];
		hexChars[j * 2 + 1] = hexArray[v & 0x0F];
	}
	return new String(hexChars);
}

Features

  • Pure Java implementations of various Murmur hashes
  • 100% hash compatibility with original C++ code
  • No dependencies
  • Ultra light-weight, just 7KB in size

Performance

The MurmurPerformanceTests.java file contains tests to compute hashes of 1-million random type-4 UUIDs between various Murmur hashes, and MD5, SHA-1, SHA-256, and SHA-512 hashes.

The results of a sample run on my dev machine are as under:

Windows
-------
Intel i7-2660 CPU @ 3.40Ghz
16-GB RAM
Windows 7, 64-bit, Service Pack 1
Oracle JDK 1.7.0_51 build 13, 64-bit Server VM

OS X
----
Intel i7-4870HQ CPU @ 2.50GHz
16-GB RAM
macOS Sierra 10.12.1
Oracle JDK 1.8.0_101 build 13, 64-bit Server VM
Algorithm Time Taken Windows (ms) Time Taken OSX (ms)
MD5 369 338
SHA-1 482 415
SHA-256 677 642
SHA-512 906 782
Murmur-1 143 101
Murmur-2 135 123
Murmur-2-64 102 92
Murmur-3 168 119
Murmur-3-128 160 261

Builds

1.0.0

  • First release with Murmur 1/2/3 hashes

Downloads

The library can be downloaded from Maven Central using:

<dependency>
    <groupId>com.sangupta</groupId>
    <artifactId>murmur</artifactId>
    <version>1.0.0</version>
</dependency>

Versioning

For transparency and insight into our release cycle, and for striving to maintain backward compatibility, murmur will be maintained under the Semantic Versioning guidelines as much as possible.

Releases will be numbered with the follow format:

<major>.<minor>.<patch>

And constructed with the following guidelines:

  • Breaking backward compatibility bumps the major
  • New additions without breaking backward compatibility bumps the minor
  • Bug fixes and misc changes bump the patch

For more information on SemVer, please visit http://semver.org/.

License

murmur - Pure Java implementation of the Murmur Hash algorithms
Copyright (c) 2014-2018, Sandeep Gupta

The project uses various other libraries that are subject to their
own license terms. See the distribution libraries or the project
documentation for more details.

The entire source is licensed under the Apache License, Version 2.0 
(the "License"); you may not use this work except in compliance with
the LICENSE. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

murmur's People

Contributors

dependabot[bot] avatar sangupta avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.