This project is forked from nmslib/nmslib.

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

Python bindings for NMSLIB

Installation

This project works with Python 2.7+ and 3.5+ on Linux, OSX, and Windows. To install:

pip install https://github.com/MarcYin/nmslib/archive/master.zip

Building on Windows requires Visual Studio 2015; see this project for more information.

Example Usage

Here is a simple example. More elaborate end-to-end examples are available as Python notebooks, which also cover computing gold-standard data for both sparse and dense spaces:

import nmslib
import numpy

# create a random matrix to index
data = numpy.random.randn(10000, 100).astype(numpy.float32)

# initialize a new index, using a HNSW index on Cosine Similarity
index = nmslib.init(method='hnsw', space='cosinesimil')
index.addDataPointBatch(data)
index.createIndex({'post': 2}, print_progress=True)

# query for the nearest neighbours of the first datapoint
ids, distances = index.knnQuery(data[0], k=10)

# get the nearest neighbours for all the datapoints
# using a pool of 4 threads to compute
neighbours = index.knnQueryBatch(data, k=10, num_threads=4)
# with index_only=True, only the neighbour indices are returned
neighbours = index.knnQueryBatch(data, k=10, num_threads=4, index_only=True)
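The gold-standard computation mentioned above can be approximated for small dense datasets with a brute-force search in NumPy. The following is a minimal sketch, not the code from the notebooks: the helper names `exact_cosine_knn` and `recall_at_k` are hypothetical, and it assumes that the `cosinesimil` space measures distance as 1 minus cosine similarity and that no row of the data is all zeros.

```python
import numpy as np


def exact_cosine_knn(data, queries, k):
    """Brute-force k nearest neighbours under cosine distance (hypothetical helper)."""
    # Normalise rows so that a dot product equals the cosine similarity.
    # Assumes no all-zero rows, which would cause a division by zero.
    data_n = data / np.linalg.norm(data, axis=1, keepdims=True)
    queries_n = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    # Assumed distance definition: 1 - cosine similarity.
    dist = 1.0 - np.dot(queries_n, data_n.T)
    # Sort every row by distance and keep the k closest points.
    ids = np.argsort(dist, axis=1)[:, :k]
    return ids, np.take_along_axis(dist, ids, axis=1)


def recall_at_k(approx_ids, exact_ids):
    """Mean fraction of the exact neighbours found by the approximate search."""
    hits = 0
    for approx_row, exact_row in zip(approx_ids, exact_ids):
        hits += len(set(map(int, approx_row)) & set(map(int, exact_row)))
    return hits / float(exact_ids.size)


rng = np.random.RandomState(0)
data = rng.randn(1000, 100).astype(np.float32)
queries = data[:50]
exact_ids, exact_dist = exact_cosine_knn(data, queries, k=10)
```

To score an index, the approximate ids can be collected from the batch query, for example `approx_ids = [ids for ids, _ in index.knnQueryBatch(queries, k=10)]` (assuming each result is an `(ids, distances)` pair), and passed to `recall_at_k(approx_ids, exact_ids)`. The brute-force pass builds a full query-by-data distance matrix, so it is only practical for modest dataset sizes.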

Basic tuning guidelines

The basic parameter tuning/selection guidelines are available here.

Logging

NMSLIB produces quite a few informational messages. By default, they are not shown in Python. To enable debug output, run the following commands before importing the library:

import logging
logging.basicConfig(level=logging.DEBUG)
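Setting the root level to DEBUG also turns on debug output from every other library that uses the `logging` module. A narrower alternative is sketched below, under the assumption that the bindings emit their messages through a logger named `nmslib`. That name is not stated in this README, and the helper `enable_nmslib_logging` is hypothetical.

```python
import logging


def enable_nmslib_logging(level=logging.DEBUG, logger_name="nmslib"):
    """Raise verbosity for one named logger only (logger name is an assumption)."""
    # Install a handler on the root logger if none exists yet. The root
    # level stays at WARNING, so other libraries remain quiet.
    logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s: %(message)s")
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    return logger


logger = enable_nmslib_logging(logging.INFO)
# import nmslib  # import the library after logging is configured
```

Records accepted by the named logger propagate to the root handler regardless of the root level, so only messages from that logger gain the extra verbosity.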

Installing with Extras

To enable extra methods, such as those provided by FALCONN and LSHKIT, you need to follow a couple of extra steps.

These methods require the development versions of the following libraries: Boost, the GNU Scientific Library, and Eigen3. To install them on Ubuntu:

sudo apt-get install libboost-all-dev libgsl0-dev libeigen3-dev

Next, clone the repository and build the C++ library with CMake:

cd similarity_search
cmake . -DWITH_EXTRAS=1
make
cd ..

Finally, build and install the Python extension:

cd python_bindings
pip install -r requirements.txt
python setup.py install

Additional documentation

More detailed documentation is also available (thanks to Ben Frederickson).
