
yahoo / halodb


A fast, log structured key-value store.

Home Page: https://yahoodevelopers.tumblr.com/post/178250134648/introducing-halodb-a-fast-embedded-key-value

License: Apache License 2.0

Java 100.00%
storage-engine java embedded-database key-value-store big-data

halodb's Introduction

HaloDB


HaloDB is a fast and simple embedded key-value store written in Java. HaloDB is suitable for IO bound workloads, and is capable of handling high throughput reads and writes at submillisecond latencies.

HaloDB was written for a high-throughput, low-latency distributed key-value database that powers multiple ad platforms at Yahoo; all its design choices and optimizations were therefore made primarily for this use case.

Basic design principles employed in HaloDB are not new. Refer to this document for more details about the motivation for HaloDB and its inspirations.

HaloDB comprises two main components: an index in memory which stores all the keys, and append-only log files on the persistent layer which store all the data. To reduce Java garbage collection pressure, the index is allocated in native memory, outside the Java heap.


Basic Operations.

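            // Imports assumed by this snippet:
            //   import com.oath.halodb.*;
            //   import com.google.common.primitives.Ints; // from Guava

            // Open a db with default options.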
            HaloDBOptions options = new HaloDBOptions();
    
            // Size of each data file will be 1GB.
            options.setMaxFileSize(1024 * 1024 * 1024);

            // Size of each tombstone file will be 64MB
            // A larger file size means fewer files but slows down db open time. If the file size
            // is too small, a large number of tombstone files will accumulate under the db folder.
            options.setMaxTombstoneFileSize(64 * 1024 * 1024);

            // Set the number of threads used to scan index and tombstone files in parallel
            // to build in-memory index during db open. It must be a positive number which is
            // not greater than Runtime.getRuntime().availableProcessors().
            // It is used to speed up db open time.
            options.setBuildIndexThreads(8);

            // The threshold at which page cache is synced to disk.
            // data will be durable only if it is flushed to disk, therefore
            // more data will be lost if this value is set too high. Setting
            // this value too low might interfere with read and write performance.
            options.setFlushDataSizeBytes(10 * 1024 * 1024);
    
            // The percentage of stale data in a data file at which the file will be compacted.
            // This value helps control write and space amplification. Increasing this value will
            // reduce write amplification but will increase space amplification.
            // This along with the compactionJobRate below is the most important setting
            // for tuning HaloDB performance. If this is set to x then write amplification 
            // will be approximately 1/x. 
            options.setCompactionThresholdPerFile(0.7);
    
            // Controls how fast the compaction job should run.
            // This is the amount of data which will be copied by the compaction thread per second.
            // Optimal value depends on the compactionThresholdPerFile option.
            options.setCompactionJobRate(50 * 1024 * 1024);
    
            // Setting this value is important as it helps to preallocate enough
            // memory for the off-heap cache. If the value is too low the db might
            // need to rehash the cache. For a db of size n set this value to 2*n.
            options.setNumberOfRecords(100_000_000);
            
            // A delete operation for a key writes a tombstone record to a tombstone file.
            // The tombstone record can be removed only when all previous versions of that key
            // have been deleted by the compaction job.
            // Enabling this option will delete, during startup, all tombstone records whose previous
            // versions were removed from the data file.
            options.setCleanUpTombstonesDuringOpen(true);
    
            // HaloDB does native memory allocation for the in-memory index.
            // Enabling this option will release all allocated memory back to the kernel when the db is closed.
            // This option is not necessary if the JVM is shutdown when the db is closed, as in that case
            // allocated memory is released automatically by the kernel.
            // If the in-memory index is used without the memory pool, this option,
            // depending on the number of records in the database,
            // could be slow, as we need to call _free_ for each record.
            options.setCleanUpInMemoryIndexOnClose(false);
            
            // ** settings for memory pool **
            options.setUseMemoryPool(true);
    
            // Hash table implementation in HaloDB is similar to that of ConcurrentHashMap in Java 7.
            // Hash table is divided into segments and each segment manages its own native memory.
            // The number of segments is twice the number of cores in the machine.
            // A segment's memory is further divided into chunks whose size can be configured here. 
            options.setMemoryPoolChunkSize(2 * 1024 * 1024);
    
            // using a memory pool requires us to declare the size of keys in advance.
            // Any write request with key length greater than the declared value will fail, but it
            // is still possible to store keys smaller than this declared size. 
            options.setFixedKeySize(8);
    
            // Represents a database instance and provides all methods for operating on the database.
            HaloDB db = null;
    
            // The directory will be created if it doesn't exist and all database files will be stored in this directory
            String directory = "directory";
    
            // Open the database. Directory will be created if it doesn't exist.
            // If we are opening an existing database HaloDB needs to scan all the
            // index files to create the in-memory index, which, depending on the db size, might take a few minutes.
            db = HaloDB.open(directory, options);
    
            // Keys and values are byte arrays. Key size is restricted to 127 bytes (Byte.MAX_VALUE).
            byte[] key1 = Ints.toByteArray(200);
            byte[] value1 = "Value for key 1".getBytes();
    
            byte[] key2 = Ints.toByteArray(300);
            byte[] value2 = "Value for key 2".getBytes();
    
            // add the key-value pair to the database.
            db.put(key1, value1);
            db.put(key2, value2);
    
            // read the value from the database.
            value1 = db.get(key1);
            value2 = db.get(key2);
    
            // delete a key from the database.
            db.delete(key1);
    
            // Open an iterator and iterate through all the key-value records.
            HaloDBIterator iterator = db.newIterator();
            while (iterator.hasNext()) {
                Record record = iterator.next();
                System.out.println(Ints.fromByteArray(record.getKey()));
                System.out.println(new String(record.getValue()));
            }
    
            // get stats and print it.
            HaloDBStats stats = db.stats();
            System.out.println(stats.toString());
    
            // reset stats
            db.resetStats();
            
            // pause background compaction thread.
            // if a file is being compacted the thread
            // will block until the compaction is complete.
            db.pauseCompaction();
            
            // resume background compaction thread.
            db.resumeCompaction();
            
            // repeatedly calling pause/resume compaction methods will have no effect.

            // Close the database.
            db.close();

Binaries for HaloDB are hosted on Bintray.

<dependency>
  <groupId>com.oath.halodb</groupId>
  <artifactId>halodb</artifactId>
  <version>x.y.z</version>
</dependency>

<repository>
  <id>yahoo-bintray</id>
  <name>yahoo-bintray</name>
  <url>https://yahoo.bintray.com/maven</url>
</repository>

Read, Write and Space amplification.

Read amplification in HaloDB is always 1 (a read request needs at most one disk lookup), hence it is well suited for read-latency-critical workloads. HaloDB provides a configuration which can be tuned to control write amplification and space amplification, which trade off against each other. HaloDB has a background compaction thread which removes stale data from the DB, and the percentage of stale data at which a file is compacted can be controlled. Increasing this value will increase space amplification but will reduce write amplification. For example, if the value is set to 50% then write amplification will be approximately 2.
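
As a rough illustration of the rule of thumb above (write amplification of approximately 1/x for a compaction threshold x), the sketch below only does the arithmetic. `CompactionMath` and `estimateWriteAmplification` are hypothetical names and are not part of HaloDB.

```java
// Hypothetical helper, not part of HaloDB: applies the README's rule of thumb
// that write amplification is approximately 1/x for a compaction threshold x.
class CompactionMath {
    static double estimateWriteAmplification(double compactionThresholdPerFile) {
        if (compactionThresholdPerFile <= 0.0 || compactionThresholdPerFile > 1.0) {
            throw new IllegalArgumentException("threshold must be in (0, 1]");
        }
        return 1.0 / compactionThresholdPerFile;
    }

    public static void main(String[] args) {
        for (double threshold : new double[] {0.5, 0.7, 0.9}) {
            System.out.printf("threshold=%.1f -> write amplification ~ %.2f%n",
                    threshold, estimateWriteAmplification(threshold));
        }
    }
}
```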

Durability and Crash recovery.

Write Ahead Logs (WAL) are usually used by databases for crash recovery. Since in HaloDB the log itself is the database, crash recovery is simpler and faster.

HaloDB does not flush writes to disk immediately, but, for performance reasons, writes only to the OS page cache. The cache is synced to disk once a configurable size is reached. In the event of a power loss, the data not flushed to disk will be lost. This compromise between performance and durability is a necessary one.
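
The general pattern of writing to the page cache and syncing only after a configurable number of bytes can be sketched with plain `java.nio`. This is an illustration of the idea, not HaloDB's implementation; `ThresholdSyncWriter` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: appends go to the OS page cache, and force() is called
// once a configurable number of unflushed bytes has accumulated.
class ThresholdSyncWriter implements AutoCloseable {
    private final FileChannel channel;
    private final long flushThresholdBytes;
    private long unflushedBytes = 0;
    private int syncCount = 0;

    ThresholdSyncWriter(Path file, long flushThresholdBytes) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        this.flushThresholdBytes = flushThresholdBytes;
    }

    void append(byte[] data) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        while (buffer.hasRemaining()) {
            channel.write(buffer);       // lands in the OS page cache
        }
        unflushedBytes += data.length;
        if (unflushedBytes >= flushThresholdBytes) {
            channel.force(false);        // sync file content to the storage device
            unflushedBytes = 0;
            syncCount++;
        }
    }

    int syncCount() {
        return syncCount;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    // Appends ten 300-byte records with a 1024-byte threshold and returns the sync count.
    static int demo() {
        try {
            Path file = Files.createTempFile("threshold-sync", ".log");
            try (ThresholdSyncWriter writer = new ThresholdSyncWriter(file, 1024)) {
                for (int i = 0; i < 10; i++) {
                    writer.append(new byte[300]);
                }
                return writer.syncCount();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("syncs: " + demo());
    }
}
```

Anything appended after the last `force` call is exactly the data that would be at risk on power loss, which is why a larger threshold trades durability for throughput.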

In the event of a power loss and data corruption, HaloDB will scan and discard corrupted records. Since the write thread and the compaction thread write to at most two files at a time, only those files need to be repaired, and hence recovery times are very short.
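
A minimal sketch of a scan-and-discard pass, assuming a simplified record layout of `[4-byte CRC32 | 4-byte payload length | payload]`. The layout and the `RecoveryScan` class are assumptions made for illustration; HaloDB's actual file format and repair code differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only: finds how many leading bytes of a log hold intact records,
// so that everything after the first bad record can be discarded.
class RecoveryScan {
    static final int HEADER_SIZE = 8; // 4-byte checksum + 4-byte payload length (assumed layout)

    static void appendRecord(ByteBuffer log, byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        log.putInt((int) crc.getValue());
        log.putInt(payload.length);
        log.put(payload);
    }

    // Scans from the buffer's position to its limit and returns the offset
    // just past the last record whose checksum matches.
    static int validPrefixLength(ByteBuffer log) {
        ByteBuffer view = log.duplicate();
        int validEnd = view.position();
        while (view.remaining() >= HEADER_SIZE) {
            int storedCrc = view.getInt();
            int length = view.getInt();
            if (length < 0 || length > view.remaining()) {
                break; // truncated (torn) write
            }
            byte[] payload = new byte[length];
            view.get(payload);
            CRC32 crc = new CRC32();
            crc.update(payload);
            if ((int) crc.getValue() != storedCrc) {
                break; // corrupted record
            }
            validEnd = view.position();
        }
        return validEnd;
    }

    // Writes three records, corrupts the last one, and returns the intact prefix length.
    static int demo() {
        ByteBuffer log = ByteBuffer.allocate(256);
        appendRecord(log, "first".getBytes(StandardCharsets.UTF_8));
        appendRecord(log, "second".getBytes(StandardCharsets.UTF_8));
        appendRecord(log, "third".getBytes(StandardCharsets.UTF_8));
        log.put(log.position() - 1, (byte) 0x7F); // simulate a corrupted final byte
        log.flip();
        return validPrefixLength(log);
    }

    public static void main(String[] args) {
        System.out.println("intact bytes: " + demo());
    }
}
```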

In the event of a power loss HaloDB offers the following consistency guarantees:

  • Writes are atomic.
  • Inserts and updates are committed to disk in the same order they are received.
  • When inserts/updates and deletes are interleaved total ordering is not guaranteed, but partial ordering is guaranteed for inserts/updates and deletes.

In-memory index.

HaloDB stores all keys and their associated metadata in an index in memory. The size of this index, depending on the number and length of keys, can be quite big. Therefore, storing this in the Java heap is a non-starter for a performance-critical storage engine. HaloDB solves this problem by storing the index in native memory, outside the heap. There are two variants of the index: one with a memory pool and the other without it. Using the memory pool helps to reduce the memory footprint of the index and reduce fragmentation, but requires fixed-size keys. A billion 8-byte keys currently take around 44GB of memory with the memory pool and around 64GB without it.

The size of the keys when using a memory pool should be declared in advance, and although this imposes an upper limit on the size of the keys it is still possible to store keys smaller than this declared size.
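
Why a declared key size is an upper limit rather than an exact size can be seen in a toy slot pool: each slot reserves room for the largest key plus a length byte, so shorter keys simply leave part of the slot unused. The slot layout and the `FixedKeySlotPool` class are assumptions for illustration and do not reflect HaloDB's actual memory layout.

```java
import java.nio.ByteBuffer;

// Illustrative only: a chunk of off-heap memory divided into fixed-size slots.
// Each slot holds 1 length byte followed by up to fixedKeySize key bytes.
class FixedKeySlotPool {
    private final ByteBuffer chunk; // direct buffer, allocated outside the Java heap
    private final int fixedKeySize;
    private final int slotSize;
    private final int slots;
    private int usedSlots = 0;

    FixedKeySlotPool(int slots, int fixedKeySize) {
        if (fixedKeySize < 1 || fixedKeySize > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("fixedKeySize must be in [1, 127]");
        }
        this.slots = slots;
        this.fixedKeySize = fixedKeySize;
        this.slotSize = 1 + fixedKeySize;
        this.chunk = ByteBuffer.allocateDirect(slots * slotSize);
    }

    // Stores the key in the next free slot and returns the slot number.
    int put(byte[] key) {
        if (key.length > fixedKeySize) {
            throw new IllegalArgumentException("key longer than declared size " + fixedKeySize);
        }
        if (usedSlots == slots) {
            throw new IllegalStateException("chunk is full");
        }
        int slot = usedSlots++;
        int base = slot * slotSize;
        chunk.put(base, (byte) key.length);
        for (int i = 0; i < key.length; i++) {
            chunk.put(base + 1 + i, key[i]);
        }
        return slot;
    }

    // Reads back exactly the bytes that were stored, ignoring unused slot space.
    byte[] get(int slot) {
        int base = slot * slotSize;
        byte[] key = new byte[chunk.get(base)];
        for (int i = 0; i < key.length; i++) {
            key[i] = chunk.get(base + 1 + i);
        }
        return key;
    }

    public static void main(String[] args) {
        FixedKeySlotPool pool = new FixedKeySlotPool(4, 8);
        int slot = pool.put(new byte[] {1, 2, 3}); // shorter than 8 bytes is fine
        System.out.println("stored key length: " + pool.get(slot).length);
    }
}
```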

Without the memory pool, HaloDB needs to allocate native memory for every write request. Therefore, memory fragmentation could be an issue. Using jemalloc is highly recommended as it provides a significant reduction in the cache's memory footprint and fragmentation.

Delete operations.

A delete operation for a key adds a tombstone record to a tombstone file, which is distinct from the data files. This design has the advantage that the tombstone record, once written, need not be copied again during compaction. The drawback is that in case of a power loss HaloDB cannot guarantee total ordering when put and delete operations are interleaved (although partial ordering for both is guaranteed).
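
One way to reconcile puts and tombstones that live in separate files is to stamp every record with a monotonically increasing sequence number and, when rebuilding the index, let a tombstone remove a key only if the put it sees is older. The sketch below assumes that scheme; `TombstoneReplay` and its methods are hypothetical and this is not HaloDB's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: rebuilds a key set from puts and tombstones read from
// separate files, using sequence numbers to decide which operation wins.
class TombstoneReplay {
    // Sequence number of the newest put seen for each live key.
    private final Map<String, Long> liveKeys = new HashMap<>();

    void applyPut(String key, long sequenceNumber) {
        Long current = liveKeys.get(key);
        if (current == null || current < sequenceNumber) {
            liveKeys.put(key, sequenceNumber);
        }
    }

    // Intended to be called after all puts have been applied.
    void applyTombstone(String key, long sequenceNumber) {
        Long current = liveKeys.get(key);
        if (current != null && current < sequenceNumber) {
            liveKeys.remove(key); // the delete happened after the newest put
        }
    }

    boolean isLive(String key) {
        return liveKeys.containsKey(key);
    }

    public static void main(String[] args) {
        TombstoneReplay replay = new TombstoneReplay();
        replay.applyPut("a", 1);
        replay.applyPut("b", 2);
        replay.applyPut("b", 5);       // "b" was re-inserted after its delete
        replay.applyTombstone("a", 3); // newer than the put, so "a" is removed
        replay.applyTombstone("b", 4); // older than the newest put, so "b" stays
        System.out.println("a live: " + replay.isLive("a") + ", b live: " + replay.isLive("b"));
    }
}
```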

DB open time

Opening a db could take a few minutes, depending on the number of records and tombstones. If db open time is critical to your use case, keep the tombstone file size relatively small and increase the number of threads used to build the index. See the option settings in the example code above. As a best practice, set the tombstone file size to 64MB and set build index threads to the number of available processors divided by the number of dbs being opened simultaneously.
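
The thread-count recommendation above is simple integer arithmetic. A small sketch follows, where `BuildIndexThreads.recommended` is a hypothetical helper whose result could be passed to `options.setBuildIndexThreads(...)`; the floor of 1 is an added assumption, since the option must be a positive number.

```java
// Hypothetical helper, not part of HaloDB: available processors divided by the
// number of dbs being opened at the same time, never less than 1.
class BuildIndexThreads {
    static int recommended(int availableProcessors, int dbsOpenedSimultaneously) {
        if (availableProcessors < 1 || dbsOpenedSimultaneously < 1) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return Math.max(1, availableProcessors / dbsOpenedSimultaneously);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("threads per db when opening 2 dbs: " + recommended(cores, 2));
    }
}
```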

System requirements.

  • HaloDB requires Java 8 to run, but has not yet been tested with newer Java versions.
  • HaloDB has been tested on Linux running on x86 and on MacOS. It may run on other platforms, but this hasn't been verified yet.
  • For performance disable Transparent Huge Pages and swapping (vm.swappiness=0).
  • If a thread is interrupted, the JVM will close the file channels that thread was operating on. Therefore, don't interrupt threads while they are doing IO operations.
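
The interruption behaviour in the last item is standard `java.nio` semantics for interruptible channels, and can be reproduced without HaloDB. The sketch below sets the interrupt status and then reads from a `FileChannel`; the class name `InterruptClosesChannel` is made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Demonstrates that an interrupted thread doing channel IO gets a
// ClosedByInterruptException and that the channel is closed as a side effect.
class InterruptClosesChannel {
    // Returns true if the channel ended up closed after the interrupted read.
    static boolean channelClosedAfterInterruptedRead() {
        try {
            Path file = Files.createTempFile("interrupt-demo", ".dat");
            try {
                Files.write(file, new byte[] {1, 2, 3, 4});
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                    Thread.currentThread().interrupt(); // set the interrupt status
                    try {
                        channel.read(ByteBuffer.allocate(4));
                    } catch (ClosedByInterruptException expected) {
                        // The channel was closed because this thread was interrupted.
                    }
                    return !channel.isOpen();
                }
            } finally {
                Thread.interrupted(); // clear the flag so later code is unaffected
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("channel closed: " + channelClosedAfterInterruptedRead());
    }
}
```

Once closed this way, the channel cannot be reused by any other thread either, which is why interrupting a thread in the middle of database IO is harmful.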

Restrictions.

  • Size of keys is restricted to 127 bytes (Byte.MAX_VALUE).
  • HaloDB doesn't support range scans or ordered access.

Benchmarks.

Contributing

Contributions are most welcome. Please refer to the CONTRIBUTING guide.

Credits

HaloDB was written by Arjun Mannaly.

License

HaloDB is released under the Apache License, Version 2.0.

halodb's People

Contributors

amannaly, bellofreedom, erichetti, goelpulkit, gwirvin, retlawrose, rfecher, wangtao724

halodb's Issues

Seem VERY hard to port beyond JDK 8 - anybody got some ideas on how to do it?

I really like a lot of things about HaloDB and would have liked to contribute some time to porting it to more contemporary JDK releases. After looking a bit at the code, though, I am not sure it can be done, at least until additional support for manipulating memory larger than 2GB etc. is added to Java (implementing the features of "unsafe" that are no longer available, at least in any way short of using JNI). If anybody has some ideas, I may be interested in putting some effort into implementing them.

Allocation-free reads

Every read allocates a byte[] on the Java heap, even if all that read is going to do is deserialize that byte[] into something else.

It would be useful to be able to read the data directly without the intermediate byte[].

Perhaps with a signature similar to:

<A> A get(byte[] key, Function<DataInput, A> reader);

Access to a (native) ByteBuffer would also be useful, but this will become invalid if the file it points into is garbage collected. A different data structure could 'find' the memory again if it moved due to compaction and otherwise continue to use the old value. That might look like

ValueHandle getValueHandle(byte[] key);

interface ValueHandle {
  boolean updated(); // if the value was updated after the handle was created
  <A> A read(Function<DataInput, A> reader); // read whatever the current value is
}

The purpose of these would be to improve performance by decreasing allocations, and to allow for lazy-deserialization of larger data types. For example one might want to read only part of a value initially, and lazily load the remainder.

Enhancement: force data flush on write

Hi thanks for this awesome project. I would like to implement a volume server using HaloDB. Now as a file storage, data durability is a requirement. My question is how is it possible to enable some sort of write option to indicate that write must be written immediately to disk. Would setting options.setFlushDataSizeBytes(0) guarantee data durability?

Multi writer threads may result in low performance

In the HaloDBInternal class, the boolean put(byte[] key, byte[] value) method takes a lock, which may result in low performance when multiple threads are writing.

    boolean put(byte[] key, byte[] value) throws IOException, HaloDBException {
        if (key.length > Byte.MAX_VALUE) {
            throw new HaloDBException("key length cannot exceed " + Byte.MAX_VALUE);
        }

        //TODO: more fine-grained locking is possible. 
        writeLock.lock();
        try {
            Record record = new Record(key, value);
            record.setSequenceNumber(getNextSequenceNumber());
            record.setVersion(Versions.CURRENT_DATA_FILE_VERSION);
            RecordMetaDataForCache entry = writeRecordToFile(record);
            markPreviousVersionAsStale(key);

            //TODO: implement getAndSet and use the return value for
            //TODO: markPreviousVersionAsStale method.
            return inMemoryIndex.put(key, entry);
        } finally {
            writeLock.unlock();
        }
    }

HaloDBInternal delete not respecting open/close status

Performing a delete operation on an item in a db object that is already in the closed state should throw an exception. Currently this does not happen for the first occurrence of an attempted delete, since the Tombstone file has not yet been created.

A possible solution could be to always initialize a tombstone file whenever a new db file is created.

Data Compression

LZ4 is included as a dependency, but it doesn't look to be used. Is there a reason for this?

Would you be open to a PR that optionally enables LZ4 compression of keys and/or values?

Storage clustering support

So imagine I use HaloDB to build some important point within my infrastructure, which means I am interested in running at least 2 instances at a time.

Schema 1. I can load-balance reads, but push write requests to both instances.
Schema 2. I can create one instance as a replica over the network; I might mark reader and writer nodes...
Hm, both approaches suck and require a lot of work. What about using https://atomix.io/

Any ideas are welcomed.

Feature Request for sequence based store

Hi @amannaly,

As a true storage DB, would HaloDB support a sequence-based store? That is, to implement byte[] putValue(byte[] value), which returns a sequence number as the key. This would also make things even simpler and lookups possibly faster, as the index can be implemented using a simple off-heap ByteBuffer.allocateDirect array instead of Snazy/OHC. Operations on such a store are strictly PUT, GET, DELETE. As a bonus, iteration with skip and limit is also easy and fast.

Cheers

Iteration properties

I need regular database exports, which is an iteration over all records.
In my tests, when I create an iterator, records created after the iterator creation are not returned, I have no problem with that.
When I update a record during iteration (insert different value with the same key), sometimes it is not returned by the iterator. I guess this is because it is a different record, the old one is marked as deleted, and the new one behaves as the first case - records after iterator creation are not returned.
Also, it sometimes happens that when I update a record during iteration, the iterator returns both versions of the record with the same key. This I cannot reproduce in my tests, but happens regularly on a production DB with millions of records.
So if I wanted to achieve more consistent results from iteration (no missing updated records or duplicates), am I supposed to stop writing to DB while iterating? Or maybe also pause compaction?

Supporting keys > 127 bytes long

HaloDB encodes key length as a signed byte on disk, and in memory. It rejects keys longer than 127 bytes.

Although most of my string keys in different databases are between 10 and 80 bytes, I have some rare outliers as large as 780 bytes. Every other K/V store I work with (I have an API abstraction that wraps about 10) can support large keys; only one other breaks at 512 bytes.

There are some options for longer keys:

Easy: read the byte as an unsigned value, and then sizes from 0 to 255 are supported. However, this will make it even harder to support larger keys.

Harder: Allow for larger sizes, perhaps up to 2KB.

  • Index/Record/Tombstone files: Steal the top 3 bits from the version byte. Since the version byte is currently 0, new code versions would interpret existing data files the same ( top 3 bits of existing version byte | key size byte ). Old code would interpret any 'extended' keys as a version mismatch and thus still be safe. Therefore I think this can remain version 0. If versions were to get up to 32, a different format would be needed at that time.
  • SegmentWithMemoryPool: No change for now, it will not support key sizes larger than its configured fixedKeySize which would be 127 or less. It could be extended to support key overflow when fixedKeySize is set to 8 or larger. In this case, when the key length is larger than fixedKeySize, then the slot holds a pointer to extended key data, plus whatever prefix of the key fits in the remaining slot (fixedKeySize - 8). An alternative when fixedKeySize is large enough is to keep a portion of the key hash in this area as well, so that the pointer to the extended key data does not need to be followed for most lookups. Even just one byte of the hash that was not used for accessing the hash bucket would decrease the chance that the pointer is followed by a factor of 256 on a miss.
  • SegmentNonMemoryPool: Since all hash entries are individually allocated, (it appears to be closed addressing with a linked chain of entries), the allocated entry in memory can either use a variable length integer encoding for the key/value lengths, or a constant two bytes for the key.
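
The two on-disk options above can be sketched with a little bit manipulation. The first method reads the existing length byte as unsigned; the other methods pack an 11-bit key size into the top 3 bits of the version byte plus the size byte, leaving 5 bits (0 to 31) for the version. `KeyLengthEncoding` is a hypothetical class illustrating the proposal, not code from HaloDB.

```java
// Illustrative only: the "easy" and "harder" key-length encodings proposed above.
class KeyLengthEncoding {
    // Easy option: interpret the stored length byte as unsigned (0..255).
    static int readUnsignedLength(byte stored) {
        return Byte.toUnsignedInt(stored);
    }

    // Harder option: returns {versionByte, sizeByte}. The top 3 bits of the version
    // byte carry bits 8..10 of the key size; the low 5 bits carry the version.
    static byte[] encode(int version, int keySize) {
        if (version < 0 || version > 31) {
            throw new IllegalArgumentException("version must fit in 5 bits");
        }
        if (keySize < 0 || keySize > 2047) {
            throw new IllegalArgumentException("key size must fit in 11 bits");
        }
        byte versionByte = (byte) (((keySize >>> 8) << 5) | version);
        byte sizeByte = (byte) (keySize & 0xFF);
        return new byte[] {versionByte, sizeByte};
    }

    static int decodeKeySize(byte versionByte, byte sizeByte) {
        int highBits = (versionByte & 0xFF) >>> 5;
        return (highBits << 8) | (sizeByte & 0xFF);
    }

    static int decodeVersion(byte versionByte) {
        return versionByte & 0x1F;
    }

    public static void main(String[] args) {
        byte[] header = encode(0, 780);
        System.out.println("key size: " + decodeKeySize(header[0], header[1])
                + ", version: " + decodeVersion(header[0]));
    }
}
```

For key sizes up to 255 the version byte stays unchanged, which is what keeps existing version-0 files readable under this scheme.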

Inefficient file formats

It appears like the file formats have a lot of redundancy.

For example, every Record, Tombstone, and Index entry has an individual crc32, a version byte, plus a 4 byte record size.

Let's take IndexFileEntry for example:

/**
 * checksum         - 4 bytes. 
 * version          - 1 byte.
 * Key size         - 1 byte.
 * record size      - 4 bytes.
 * record offset    - 4 bytes.
 * sequence number  - 8 bytes
 */

A few things come to mind.

  1. The file could have a header with the version number, since it is identical for all entries. Since the file is only read sequentially, and truncated at the first corrupted item found, the header could contain the first sequenceNumber as well, and values afterward can be deltas relative to this value using a variable length encoding. RecordOffset is similar -- the values are monotonically increasing and could be delta encoded with a variable length integer.
  2. As for the checksum, it could be written for small 'blocks' rather than for each record. This also would accelerate recovery from a crash, as each block could be something like: (2 byte size, 8 byte xxHash checksum, size bytes of index entries). Validating the file would then only need to go a block at a time until it fails. As long as the block had at least 3 entries, it would save space. I suspect something like flushing a block every ~ 32 entries or 2k bytes (whatever is first) would work well -- ~9% as many bytes used for checksums, but small enough chunks so that it shouldn't significantly impact the chance that data fails to reach disk before a crash.
  3. Also, unless hardware accelerated, crc32 is much slower than XXHash and also more prone to collisions.
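
The delta encoding with variable-length integers suggested in point 1 can be sketched as follows. The sketch uses a common 7-bits-per-byte varint and starts the deltas from 0 rather than from a value stored in a file header; `DeltaVarint` is a hypothetical class, not part of HaloDB.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: stores a monotonically increasing sequence (such as record
// offsets) as varint-encoded deltas instead of fixed-width integers.
class DeltaVarint {
    // Writes 7 bits per byte, low bits first; the high bit marks "more bytes follow".
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static byte[] encode(long[] increasing) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long previous = 0;
        for (long value : increasing) {
            if (value < previous) {
                throw new IllegalArgumentException("values must be non-negative and non-decreasing");
            }
            writeVarint(out, value - previous);
            previous = value;
        }
        return out.toByteArray();
    }

    static long[] decode(byte[] encoded, int count) {
        long[] values = new long[count];
        int pos = 0;
        long previous = 0;
        for (int i = 0; i < count; i++) {
            long delta = 0;
            int shift = 0;
            byte b;
            do {
                b = encoded[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            values[i] = previous;
        }
        return values;
    }

    public static void main(String[] args) {
        long[] offsets = {0, 120, 260, 1000, 70000};
        byte[] encoded = encode(offsets);
        System.out.println(encoded.length + " bytes instead of " + offsets.length * 4);
    }
}
```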

Code Coverage maven profile

It's useful to assess what code is and is not covered when writing thorough unit tests.

Adding a maven profile to track and generate code coverage would be useful.

Manual Compaction

One of my use cases involves building a large K/V dataset in one location, then closing the data store, transferring and replicating the contents to other locations and initializing them elsewhere.

For this use case, I would like to have a method to do manual compaction.

Perhaps a method void compact(float threshold) that compacts all data files that have more than threshold ratio of overwritten bytes would be best. For example, calling compact(0.01) would result in all files that have over 1% of their space deleted to be compacted.
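
The file-selection step of such a `compact(threshold)` call might look like the sketch below, which picks every file whose ratio of stale bytes exceeds the threshold. The per-file statistics map and the `CompactionCandidates` class are assumptions for illustration; HaloDB does not expose this API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: chooses which data files a manual compaction would rewrite.
class CompactionCandidates {
    // fileStats maps fileId -> {staleBytes, totalBytes}. Returns file ids in ascending order.
    static List<Integer> select(Map<Integer, long[]> fileStats, double threshold) {
        List<Integer> candidates = new ArrayList<>();
        for (Map.Entry<Integer, long[]> entry : new TreeMap<>(fileStats).entrySet()) {
            long staleBytes = entry.getValue()[0];
            long totalBytes = entry.getValue()[1];
            if (totalBytes > 0 && (double) staleBytes / totalBytes > threshold) {
                candidates.add(entry.getKey());
            }
        }
        return candidates;
    }

    // Three files with 5%, 0% and 2% stale data, compacted at a 1% threshold.
    static List<Integer> demo() {
        Map<Integer, long[]> stats = new HashMap<>();
        stats.put(1, new long[] {5, 100});
        stats.put(2, new long[] {0, 100});
        stats.put(3, new long[] {2, 100});
        return select(stats, 0.01);
    }

    public static void main(String[] args) {
        System.out.println("files to compact: " + demo());
    }
}
```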

This could be used for my use case as well as several others. It also would be a useful tool for compaction performance measurement.

What can I store in HaloDB?

Imagine both my key and value might be anything from simple data types to complex object structures (coming from JSON, YAML files).

Value also could be for example avro schema, json schema, json, messagepack message, whatever.

Is this supported?

Also now imagine key is json:
{
group: ingestion
type: schema
key: datasource1Pipeline
}

{
group: ingestion
type: schema
key: datasource2Pipeline
}

and value is real schema content for both. Now imagine I am going to search for all keys from group ingestion by providing:
{
group: ingestion
} I expect all keys with group will be returned.

Is HaloDB designed to implement such query capabilities easily? (I didn't dig down more into the source code yet...)

Now I am going to search by key:

CI pipeline for Java 11

Documentation currently says Java8 only. You should implement some CI for Java11 meaning supporting 11 (and onwards) is zero effort in the future.

Got FileSystemException when repairing database

Every time I repair the database after an incorrect shutdown I get the following error:

Caused by: java.nio.file.FileSystemException: C:\database\1549869944.index.repair -> C:\database\1549869944.index: The process cannot access the file because it is being used by another process.

It seems like when calling method openDataFilesForReading() my '.index' file has been opened for reading and hadn't been closed after it. No other processes use this file.

So when we call method repairFile(DBDirectory dbDirectory) we get an exception in line: Files.move(repairFile.indexFile.getPath(), indexFile.getPath(), REPLACE_EXISTING, ATOMIC_MOVE);

Any suggestions on how to avoid this error?

Truncate Database

Apart from iterating through every record in the database, is there any way to truncate all data?

SequenceId seems unnecessary

I may be wrong, but I believe that SequenceId is unnecessary. It eats up 8 bytes per entry in memory, and >=16 bytes per entry on disk (one for each entry in tombstone/index/data files).

My reasoning:

  • One reason that SequenceId is required, is so that during initialization, tombstones (deletes) and index updates (put/replace) do not process out of order. Tombstones are processed last, but do not apply to keys that were updated 'after' them.
  • If Tombstones are sequenced in order inside the index file as they occur, then rebuilding the index in the order it was written would resolve the above, without sequenceId.
  • The other reason that SequenceId is required, is to support concurrent initialization by multiple threads. For simple 'puts', the fileId is enough to resolve which update should win. But if tombstones are interleaved with index loading as I suggest above, then concurrent threads from different files doing puts and deletes on the same key will have race conditions (if file 2 does a delete on key X, then file 1 does a put, it would not know that file 2 has already removed it).
  • I see two solutions to the above, assuming tombstones are sequenced in the index/data file in the order they happened relative to the 'puts':
    1. When a delete happens, leave the fileId in the in memory map with the key, marked as deleted (possible with closed addressing but not the current data structures).
    2. Initialize the database in order, from oldest file to newest file, and split the data by hash so that a thread per segment can do the update. This would be limited by how fast one thread could compute the hash of the keys in the tombstone/index file. One optimization would be if compaction was able to merge and split files so that files are on disjoint key hash ranges. For example, if compacting files 1, 2, 3, 4 all together yielded four files "4.a, 4.b, 4.c, 4.d", each representing a distinct hash range, perhaps split by the top two bits of the hash into 4 buckets. Then the initialization could read all four in parallel as their updates would be disjoint and apply to different Segments.

My biggest concern with such a change is that it is a major overhaul of the file format and in-memory layout, and would not easily share code with the older implementation. The work to make the code support the old and new formats simultaneously could easily be most of the effort.

Project status?

Hi all! I’m interested in the properties that HaloDB provides, but I’m curious to learn about the status of HaloDB. The repo here looks as though the most recent update is from a few years ago.

Is HaloDB no longer used in production by Yahoo? If not, it would be great to learn what technology was used to replace it, and why 😊 Is there a new/different fork that is the continuation of the project? Keen to learn more. Thanks!

Can we support versioning?

I would like to support key value versioning capability.

There could be introduced something like revision id (incremental number per key 0-infinity) and at the same time I am interested in tracking key (validFrom/createdOn - validTo/InvalidatedOn).

Enhancement: Iterator seek

Dear @amannaly,

I have another use case of streaming reads from an offset (think Kafka, DistributedLog, Pulsar). I tried forwarding using iterator.next(); as you can guess, it's slow. Is it difficult to implement 'HaloDBIterator.seek(offset)' to start the iterator from an offset? BTW your 'forEachRemaining' is cool mate. I understand HaloDB does not guarantee ordering on delete or update, but for this use case it will be strictly append only.

Cheers

java.nio.file.AccessDeniedException thrown on windows during creation of the DBDirectory

When running the Readme example I have the following issue on a windows system:

Exception in thread "main" com.oath.halodb.HaloDBException: Failed to open db halodb_directory
	at com.oath.halodb.HaloDB.open(HaloDB.java:29)
	at com.oath.halodb.HaloDB.open(HaloDB.java:35)
....
Caused by: java.nio.file.AccessDeniedException: D:\halodb_directory
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
	at sun.nio.fs.WindowsFileSystemProvider.newFileChannel(WindowsFileSystemProvider.java:115)
	at java.nio.channels.FileChannel.open(FileChannel.java:287)
	at java.nio.channels.FileChannel.open(FileChannel.java:335)
	at com.oath.halodb.DBDirectory.openReadOnlyChannel(DBDirectory.java:76)
	at com.oath.halodb.DBDirectory.open(DBDirectory.java:35)
	at com.oath.halodb.HaloDBInternal.open(HaloDBInternal.java:79)
	at com.oath.halodb.HaloDB.open(HaloDB.java:26)
	... 12 more

This is a known issue:
http://mail.openjdk.java.net/pipermail/nio-dev/2013-February/002123.html
https://issues.apache.org/jira/browse/HDFS-13586
permazen/permazen#7
cryptomator/fuse-nio-adapter#5

I think the fix made in this commit could be ported to HaloDB:
permazen/permazen@287c94c

ByteBuffer flip twice

I was attracted by the design of HaloDB recently, and I carefully read its implementation. I found that in the serialize method of InMemoryIndexMetaDataSerializer, the byteBuffer is flipped twice. I don't understand why it is flipped twice. Isn't there a problem with flipping twice?
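
For reference, the general effect of calling `flip()` twice on a `ByteBuffer` that was filled with relative puts can be shown in isolation. This sketch does not examine InMemoryIndexMetaDataSerializer itself; whether the double flip matters there depends on how that buffer is written and read (for example, absolute puts leave the position at 0). `DoubleFlip` is a name made up for this example.

```java
import java.nio.ByteBuffer;

// Shows how many bytes remain readable after flipping a buffer once or twice.
class DoubleFlip {
    static int remainingAfterFlips(int flips) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putInt(42);      // relative put: position = 4, limit = 16
        for (int i = 0; i < flips; i++) {
            buffer.flip();      // limit = current position, position = 0
        }
        return buffer.remaining();
    }

    public static void main(String[] args) {
        System.out.println("after one flip:  " + remainingAfterFlips(1));
        System.out.println("after two flips: " + remainingAfterFlips(2));
    }
}
```

The second flip sets the limit to the position left by the first flip, which is 0, so a buffer written with relative puts appears empty afterwards.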

Can this be used as a standalone db?

Hi, The description says that HaloDB can be embedded into an application. I was wondering if there is a way to set it up as a standalone DB and connect to it from a remote application. Is that possible?
