Giter Site home page Giter Site logo

hilbertcurve's Introduction

image

Table of Contents

Introduction

This is a package to convert between one dimensional distance along a Hilbert curve, h, and n-dimensional points, (x_0, x_1, ... x_n-1). There are two important parameters,

  • n -- the number of dimensions (must be > 0)
  • p -- the number of iterations used in constructing the Hilbert curve (must be > 0)

We consider an n-dimensional hypercube of side length 2^p. This hypercube contains 2^{n p} unit hypercubes (2^p along each dimension). The number of unit hypercubes determine the possible discrete distances along the Hilbert curve (indexed from 0 to 2^{n p} - 1).

Quickstart

Install the package with pip,

pip install hilbertcurve

You can calculate points given distances along a hilbert curve,

>>> from hilbertcurve.hilbertcurve import HilbertCurve
>>> p=1; n=2
>>> hilbert_curve = HilbertCurve(p, n)
>>> distances = list(range(4))
>>> points = hilbert_curve.points_from_distances(distances)
>>> for point, dist in zip(points, distances):
>>>     print(f'point(h={dist}) = {point}')

point(h=0) = [0, 0]
point(h=1) = [0, 1]
point(h=2) = [1, 1]
point(h=3) = [1, 0]

You can also calculate distances along a hilbert curve given points,

>>> points = [[0,0], [0,1], [1,1], [1,0]]
>>> distances = hilbert_curve.distances_from_points(points)
>>> for point, dist in zip(points, distances):
>>>     print(f'distance(x={point}) = {dist}')

distance(x=[0, 0]) = 0
distance(x=[0, 1]) = 1
distance(x=[1, 1]) = 2
distance(x=[1, 0]) = 3

Inputs and Outputs

The HilbertCurve class has four main public methods.

  • point_from_distance(distance: int) -> Iterable[int]
  • points_from_distances(distances: Iterable[int], match_type: bool=False) -> Iterable[Iterable[int]]
  • distance_from_point(point: Iterable[int]) -> int
  • distances_from_points(points: Iterable[Iterable[int]], match_type: bool=False) -> Iterable[int]

Arguments that are type hinted with Iterable[int] have been tested with lists, tuples, and 1-d numpy arrays. Arguments that are type hinted with Iterable[Iterable[int]] have been tested with list of lists, tuples of tuples, and 2-d numpy arrays with shape (num_points, num_dimensions). The match_type key word argument forces the output iterable to match the type of the input iterable.

The HilbertCurve class also contains some useful metadata derived from the inputs p and n. For instance, you can construct a numpy array of random points on the hilbert curve and calculate their distances in the following way,

>>> from hilbertcurve.hilbertcurve import HilbertCurve
>>> p=1; n=2
>>> hilbert_curve = HilbertCurve(p, n)
>>> num_points = 10_000                                                                                              
>>> points = np.random.randint(                                                                                   
        low=0,                                                                                                    
        high=hilbert_curve.max_x + 1,                                                                                 
        size=(num_points, hilbert_curve.n)                                                                        
    )
>>> distances = hilbert_curve.distances_from_points(points)
>>> type(distances)

list

>>> distances = hilbert_curve.distances_from_points(points, match_type=True)
>>> type(distances)

numpy.ndarray

Multiprocessing

You can now take advantage of multiple processes to speed up calculations by using the n_procs keyword argument when creating an instance of HilbertCurve.

n_procs (int): number of processes to use
    0 = dont use multiprocessing
   -1 = use all available processes
    any other positive integer = number of processes to use

A value of 0 will completely avoid using the multiprocessing module while a value of 1 will use the multiprocessing module but with a single process. If you want to take advantage of every thread on your computer use the value -1 and if you want something in the middle use a value between 1 and the number of threads on your computer. A concrete example starting with the code block above is,

>>> from hilbertcurve.hilbertcurve import HilbertCurve
>>> p=1; n=2
>>> hilbert_curve = HilbertCurve(p, n, n_procs=-1)
>>> num_points = 100_000                                                                                              
>>> points = np.random.randint(                                                                                   
        low=0,                                                                                                    
        high=hilbert_curve.max_x + 1,                                                                                 
        size=(num_points, hilbert_curve.n)                                                                        
    )
>>> distances = hilbert_curve.distances_from_points(points)

The following methods are able to use multiple cores.

  • points_from_distances(distances: Iterable[int], match_type: bool=False) -> Iterable[Iterable[int]]
  • distances_from_points(points: Iterable[Iterable[int]], match_type: bool=False) -> Iterable[int]

(Absurdly) Large Integers

Due to the magic of arbitrarily large integers in Python, these calculations can be done with ... well ... arbitrarily large integers!

>>> p = 512; n = 10
>>> hilbert_curve = HilbertCurve(p, n)
>>> ii = 123456789101112131415161718192021222324252627282930
>>> point = hilbert_curve.points_from_distances([ii])[0]
>>> print(f'point = {point}')

point = [121075, 67332, 67326, 108879, 26637, 43346, 23848, 1551, 68130, 84004]

The calculations above represent the 512th iteration of the Hilbert curve in 10 dimensions. The maximum value along any coordinate axis is an integer with 155 digits and the maximum distance along the curve is an integer with 1542 digits. For comparison, an estimate of the number of atoms in the observable universe is 10^{82} (i.e. an integer with 83 digits).

Visuals

The figure above shows the first three iterations of the Hilbert curve in two (n=2) dimensions. The p=1 iteration is shown in red, p=2 in blue, and p=3 in black. For the p=3 iteration, distances, h, along the curve are labeled from 0 to 63 (i.e. from 0 to 2^{n p}-1). This package provides methods to translate between n-dimensional points and one dimensional distance. For example, between (x_0=4, x_1=6) and h=36. Note that the p=1 and p=2 iterations have been scaled and translated to the coordinate system of the p=3 iteration.

The figure above shows the first three iterations of the Hilbert curve in two (n=2) dimensions. The p=1 iteration is shown in red, p=2 in blue, and p=3 in black. For the p=3 iteration, distances, h, along the curve are labeled from 0 to 63 (i.e. from 0 to 2^{n p}-1). This package provides methods to translate between n-dimensional points and one dimensional distance. For example, between (x_0=4, x_1=6) and h=36. Note that the p=1 and p=2 iterations have been scaled and translated to the coordinate system of the p=3 iteration.

An animation of the same case in 3-D is available on YouTube. To watch the video, click the link below. Once the YouTube video loads, you can right click on it and turn "Loop" on to watch the curve rotate continuously.

3-D Hilbert Curve Animation https://www.youtube.com/watch?v=TfJEJidwkBQ

3-D Hilbert Curve Animation https://www.youtube.com/watch?v=TfJEJidwkBQ

Reference

This module is based on the C code provided in the 2004 article "Programming the Hilbert Curve" by John Skilling,

I was also helped by the discussion in the following stackoverflow post,

which points out a typo in the source code of the paper. The Skilling code provides two functions TransposetoAxes and AxestoTranspose. In this case, Transpose refers to a specific packing of the integer that represents distance along the Hilbert curve (see below for details) and Axes refer to the n-dimensional coordinates. Below is an excerpt from the documentation of Skilling's code,

//+++++++++++++++++++++++++++ PUBLIC-DOMAIN SOFTWARE ++++++++++++++++++++++++++
// Functions: TransposetoAxes  AxestoTranspose
// Purpose:   Transform in-place between Hilbert transpose and geometrical axes
// Example:   b=5 bits for each of n=3 coordinates.
//            15-bit Hilbert integer = A B C D E F G H I J K L M N O is stored
//            as its Transpose
//                   X[0] = A D G J M                X[2]|
//                   X[1] = B E H K N    <------->       | /X[1]
//                   X[2] = C F I L O               axes |/
//                          high  low                    0------ X[0]
//            Axes are stored conveniently as b-bit integers.
// Author:    John Skilling  20 Apr 2001 to 11 Oct 2003

Change Log

Version 2.0

Version 2.0 introduces some breaking changes.

API Changes

Previous versions transformed a single distance to a vector or a single vector to a distance.

  • coordinates_from_distance(self, h: int) -> List[int]
  • distance_from_coordinates(self, x_in: List[int]) -> int

In version 2.0 coordinates -> point(s) and we add methods to handle multiple distances or multiple points. The match_type kwarg forces the output type to match the input type and all functions can handle tuples, lists, and ndarrays.

  • point_from_distance(self, distance: int) -> Iterable[int]
  • points_from_distances(self, distances: Iterable[int], match_type: bool=False) -> Iterable[Iterable[int]]
  • distance_from_point(self, point: Iterable[int]) -> int
  • distances_from_points(self, points: Iterable[Iterable[int]], match_type: bool=False) -> Iterable[int]

Multiprocessing

The methods that handle multiple distances or multiple points can take advantage of multiple cores. You can control this behavior using the n_procs kwarg when you create an instance of HilbertCurve.

hilbertcurve's People

Contributors

brb1031 avatar galtay avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

hilbertcurve's Issues

Random sampling methods

Create methods to randomly sample distances along the curve and to randomly sample points in the n-d space.

Formulas in the Readme

Hi! I don't know any other way to notify this. This caused me slight confusion. The readme says that the hybercube has a side length of 2^p, and that the distances are indexed from 0 to 2^(N p) - 1. But really it seems that the side length of the cube is 2^p - 1, and the points are indexed from 0 to 2^(N p).

Some code verification

I am using this code, and in my point of view that is some strange in this case

p = 12; N = 3
hilbert_curve = HilbertCurve(p, N)

hilbert_curve.distance_from_coordinates(np.asarray([999,999,999]))
766935405
hilbert_curve.distance_from_coordinates(np.asarray([479, 724, 495]))
975605184

Does this result correct?
I wait that this point in 3d ([999,999,999]) was greater than ([479, 724, 495])

A question for coordinates

Hi,
I am new to this project and i want to use it,however,i met a question when i use function distance_from_coordinates(),the wrong code is '''if x[i] & Q', i use geo coordinates as input data,Q=1024 ,and it comes:TypeError: unsupported operand type(s) for &: 'float' and 'int'.
the input coordinates must be integer? is there have some method to deal with this?

distance_from_coordinates method modifies value of input variable

N               = 2
p               = 5
hilbert_curve   = hlb.HilbertCurve(p, N)

coords          = [3, 1]
distance        = hilbert_curve.distance_from_coordinates(coords)
x               = hilbert_curve.coordinates_from_distance(distance)
print("Coords   : ", coords)
print("Distance : ", distance)
print("x        : ", x)

Output:

Coords   :  [1, 2]
Distance :  6
x        :  [3, 1]

The encoded distance does not produce the correct coordinates while decoding. The tests written only check that

Is there a problem with the distance_from_coordinates function?

Documentation is outdated

The documentation at https://hilbertcurve.readthedocs.io/en/latest/readme.html still refers to coordinates_from_distance

I suspect this is because the branch master has been renamed to main - because going to edit the page via Read The Docs gives the error "Branch not found, redirected to default branch."

I think this is something the maintainer needs to change in the Read The Docs control panel.

Thanks for a great library - incredibly useful!

Version installed with "pip install hilbertcurve" is out of date

The package installed by pip aka the package here: https://pypi.org/project/hilbertcurve/ is severely out of date. I installed it on my system and encountered an error with int type. It has been fixed in this repository but the last time the PyPI project was updated was last year so using "pip install hilbertcurve" downloads a flawed version of the library. Sorry if this is the wrong place, but I didn't see a place to report a problem on pypi.org

rst docs on pypi

PyPi is not rendering the long_description in setup.py (which is equal to the README.rst)
the :math: directive seems to be the culprit. will figure it out later.

Include LICENSE in published source distribution

It is not possible to redistribute the published source tarballs from https://pypi.org/project/hilbertcurve/ without violating the license conditions

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

unless one obtains LICENSE separately from this repository or the .whl.

It would be helpful if you would include LICENSE in the source distributions too.
This can be done by adding either

  • a setup.cfg file with contents
    [metadata]
    license_file = LICESE
    
    (This requires a somewhat recent version of setuptools), or
  • a MANIFEST.in file with contents
    include LICENSE
    

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.