========================
Introduction
========================

Tny is a project that seeks to find and develop high-performance data
structures that have very low memory footprints.  The hope is that these
structures may form the basis of a high-performance, in-memory, column-oriented
database for analyzing genomic information.

When developing software for large data sets (billions of records, terabytes in size),
the way you store your data in memory is critical, and you want your data in memory
if you want to be able to analyse it quickly (e.g. in minutes, not days).

Any data structure that relies on a pointer for each data element quickly becomes
unworkable due to the overhead of those pointers.  On a 64-bit system, where each
pointer takes 8 bytes, one pointer per data element across a billion records uses
about 8GB of memory on pointers alone.
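
As a back-of-the-envelope check of that figure, the arithmetic can be sketched in Python.  The record count and pointer width are the values from the paragraph above; the variable names are only illustrative.

```python
# Pointer overhead for one pointer per record on a 64-bit system.
POINTER_BYTES = 8            # 64 bits = 8 bytes
RECORDS = 1_000_000_000      # one billion records

overhead_bytes = POINTER_BYTES * RECORDS
overhead_gb = overhead_bytes / 10**9     # decimal gigabytes
overhead_gib = overhead_bytes / 2**30    # binary gibibytes

print(f"{overhead_gb:.2f} GB ({overhead_gib:.2f} GiB) spent on pointers alone")
```

This counts only the pointers themselves, before any of the actual data is stored.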

Thus there is a need for compact data structures that still have fast access characteristics.

========================
The Challenge
========================

The challenge is to come up with the fastest data structure that meets the following requirements:
•	Use less memory than an array in all circumstances
•	Fast Seek is more important than Fast Access
•	Seek and Access must be better than O(N).

Where Seek and Access are defined as:

	Access (int index): Return the value at the specified index (like array[idx]).

	Seek (int value): Return all the indexes whose value matches the given value.


(The actual return type of Seek is a little different, but logically the same.  What Seek returns is a bitmap in which a bit set to 1 at position X means that the value was found at index X.  This allows us to combine the results of several Seeks using logical ANDs rather than intersections of index lists.)
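
To make the two operations and the bitmap return type concrete, here is a minimal sketch in Python.  It is not Tny's implementation: the function names, the use of a Python integer as the bitmap, and the sample columns are all assumptions made for illustration, and the Seek shown is a plain O(N) scan that does not meet the requirements above.

```python
def access(values, index):
    """Access: return the value stored at `index` (like array[idx])."""
    return values[index]


def seek(values, value):
    """Seek: return a bitmap (as an int) with bit X set when values[X] == value.

    This is a naive O(N) scan used only to illustrate the return type;
    it does not meet the challenge's better-than-O(N) requirement.
    """
    bitmap = 0
    for index, item in enumerate(values):
        if item == value:
            bitmap |= 1 << index
    return bitmap


def indexes(bitmap):
    """Expand a bitmap back into the list of set bit positions."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Two hypothetical columns of the same table, one entry per record.
chromosome = [1, 2, 1, 1, 2, 1]
genotype = [0, 1, 1, 0, 1, 1]

chrom_hits = seek(chromosome, 1)   # bits set for records 0, 2, 3, 5
geno_hits = seek(genotype, 1)      # bits set for records 1, 2, 4, 5

# Records matching both conditions: a single logical AND of the bitmaps.
both = chrom_hits & geno_hits
print(indexes(both))
```

Because every Seek result has one bit per record, combining conditions across columns is a bitwise operation on the bitmaps rather than a merge of index lists.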


For more information, please check my blog at:

http://siganakis.com

This project is released under the GPL.

Contact me at [email protected].



  
