An array (priority queue, actually) of sqrt(N) min-max heaps of increasing (odd) size.
This is a C++ implementation of the data structure proposed by Andrei Alexandrescu in the D-Language forums.
I implemented it to see what its practical performance might be, to help move the discussion along, and just for fun.
I have called it HeapArray
, but that name isn't very good. This thing is really more of a priority queue of heaps built directly on top of a contiguous array, but with some interesting properties with respect to the heap sizes.
If you build the structure directly from an (unsorted) array, it will actually perform a std::sort
( O(n*lg(n)) ) followed by a post-processing Floyd make heap ( O(n) ).
If you build the structure dynamically by inserting values, the cost is closer to polynomial[1] due to the effect of values having to "ripple" from one heap to the next until they find the right partition. Empirical data seems to bear this out. See chart here, which looks similar to the one shown below in the "Scenario" section.
[1]: Probably around O(m^(3/2)), as nicely explained by Timon Gehr here.
Search can be performed in (theoretically) O(sqrt(n)) steps; empirical data supports this; see chart here and chart below.
Insert may cause a value to "ripple" across
Delete must "ripple" from the end of the array toward the location of the delete, so it is the mirror image of insert. It should also be (theoretically) O(sqrt(n)*lg(sqrt(n))). I have not profiled deletion yet.
For a real use-case, consider trying to generate a large number of unique values. Obviously something like std::set
would be great for this. In this scenario, I used std::multiset
(so that I would have to manually cull duplicates) and std::vector (where searches would be linear) to see how the HeapArray performed. Problem size increased to just over 100000.
On the other hand, the plots for the other two data structures are much more well behaved -- it is hard to see their shape with vector
in the chart, so I created another chart to focus on just these two.
This data structure depends on a Min-Max heap to perform a lot of the fast search (and insert/remove) magic. I've included my implementation of a Min-Max heap in this repository (mmheap.h).
All other dependencies are standard C++ libraries.
This data structure is still in the research stage -- you should obviously not be using it for anything (other than curiosity) yet.
Released under the MIT License: http://opensource.org/licenses/MIT
Copyright (c) 2015 Jason L Causey, Arkansas State University
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.