Giter Site home page Giter Site logo

heaparray's Introduction

HeapArray

An array (priority queue, actually) of sqrt(N) min-max heaps of increasing (odd) size.

Why?

This is a C++ implementation of the data structure proposed by Andrei Alexandrescu in the D-Language forums.

I implemented it to see what its practical performance might be, to help move the discussion along, and just for fun.

What to call it?

I have called it HeapArray, but that name isn't very good. This thing is really more of a priority queue of heaps built directly on top of a contiguous array, but with some interesting properties with respect to the heap sizes.

Details

Building the HeapArray

If you build the structure directly from an (unsorted) array, it will actually perform a std::sort ( O(n*lg(n)) ) followed by a post-processing Floyd make heap ( O(n) ).

If you build the structure dynamically by inserting values, the cost is closer to polynomial[1] due to the effect of values having to "ripple" from one heap to the next until they find the right partition. Empirical data seems to bear this out. See chart here, which looks similar to the one shown below in the "Scenario" section.

[1]: Probably around O(m^(3/2)), as nicely explained by Timon Gehr here.

Search

Search can be performed in (theoretically) O(sqrt(n)) steps; empirical data supports this; see chart here and chart below.

HeapArray VS multiset: Search Times <script data-plotly="jcausey-astate:10" src="https://plot.ly/embed.js" async></script>

Insert

Insert may cause a value to "ripple" across $\sqrt{n}-1$ partitions, each of which incur a O(lg(n)) cost, so it is O(sqrt(n)*lg(sqrt(n))) in theory. Empirical evidence seems to support this.

Delete

Delete must "ripple" from the end of the array toward the location of the delete, so it is the mirror image of insert. It should also be (theoretically) O(sqrt(n)*lg(sqrt(n))). I have not profiled deletion yet.

Scenario

For a real use-case, consider trying to generate a large number of unique values. Obviously something like std::set would be great for this. In this scenario, I used std::multiset (so that I would have to manually cull duplicates) and std::vector (where searches would be linear) to see how the HeapArray performed. Problem size increased to just over 100000.

Chart 1: vector-VS-HeapArray-VS-multiset
Generate Unique Values - HeapArray VS vector and multiset <script data-plotly="jcausey-astate:22" src="https://plot.ly/embed.js" async></script>
Here, you can see the context -- that both `HeapArray` and `multiset` perform much better than `vector` at this task. This confirms exactly what you would expect, since `vector` requires one or more linear searches every time a new value is generated. This gives a good visual line for the O(n^2) complexity this results in.

On the other hand, the plots for the other two data structures are much more well behaved -- it is hard to see their shape with vector in the chart, so I created another chart to focus on just these two.

Chart 2: HeapArray-VS-multiset
Generate Unique Values -- HeapArray VS multiset <script data-plotly="jcausey-astate:24" src="https://plot.ly/embed.js" async></script>
Here, without `vector` to dominate the Y-axis, you can see that `HeapArray` is still performing about O(n^(3/2)) during this task; so the majority of the time is spent in the addition of new values, not in the searching for collisions. Also as expected, `multiset` is performing in a more logarathmic fashion.

Dependency

This data structure depends on a Min-Max heap to perform a lot of the fast search (and insert/remove) magic. I've included my implementation of a Min-Max heap in this repository (mmheap.h).

All other dependencies are standard C++ libraries.

Big Disclaimer

This data structure is still in the research stage -- you should obviously not be using it for anything (other than curiosity) yet.

License

 Released under the MIT License: http://opensource.org/licenses/MIT
 Copyright (c) 2015 Jason L Causey, Arkansas State University 
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:
 
 The above copyright notice and this permission notice shall be included in
 all copies or substantial portions of the Software.
 
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE.

heaparray's People

Contributors

jcausey-astate avatar jasoncausey avatar

Stargazers

 avatar Morwenn avatar kanshi avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.