Giter Site home page Giter Site logo

james4388 / vptree.js Goto Github PK

View Code? Open in Web Editor NEW

This project forked from fpirsch/vptree.js

0.0 2.0 0.0 1.03 MB

Javascript implementation of the Vantage-Point Tree nearest neighbor search algorithm

License: ISC License

JavaScript 93.80% CSS 4.27% HTML 1.94%

vptree.js's Introduction

vptree.js

A JavaScript implementation of the Vantage-Point Tree nearest neighbor search algorithm.

The VP tree is particularly useful in dividing data in a non-standard metric space into a BSP tree. Tree construction executes in O(n log(n)) time, and search is under certain circumstances and in the limit, O(log(n)) expected time. This makes it suitable when distance computations are expensive.

Simple Example

// A whole lot of strings
var stringList = [
	'culture',
	'democracy',
	'metaphor',
	'irony',
	'hypothesis',
	'science',
	'fastuous',
	'integrity',
	'synonym',
	'empathy'     // and on and on...
];

// building the tree
vptree = VPTreeFactory.build(stringList, levenshteinDistance);

nearest = vptree.search('democratic');	// [{"i":1,"d":3}]
index = nearest[0].i;			// index of nearest element is 1
distance = nearest[0].d;		// distance of nearest element is 3
alert( stringList[index] );		// alerts 'democracy'

API

The API exposes the VPTreeFactory object.

Building the tree

VPTreeFactory.build(S, distance[, bucketSize])

Builds a fresh VPTree instance.

  • S (array) the set of elements
  • distance a function that computes the distance between two elements of S
  • bucketSize (optional) to save space, tree leaves can be collapsed into buckets. This parameter gives the maximum number of leaves to collapse in each bucket.

VPTreeFactory.select(list, k, comp)

An implementation of the quick select algorithm like the nth_element function of the Standard C++ Library.

You will probably never use this function. However, it is used internally, and exposed as a bonus. Could be useful. Who knows.

  • list an array of objects or values
  • k the index of the nth_element to select, between 0 and list.length-1
  • comp comparator, a boolean function with two parameters a and b, and returning true if a < b and false if a ≥ b.

Searching the tree

vptree.search(element[, n[, dmax]])

Searches the n nearest neighbors of element in S.

  • element an object to search in S
  • n the number of closest elements to retrieve. Defaults to 1.
  • dmax maximum distance from element to search. Infinity by default.

This function returns the list of the n nearest elements found, ordered from the closest to the furthest. The list can contain less than n elements if dmax is limiting. Each item in the list is an object with 2 properties :

  • i the index of the element in S
  • d its distance to the query element

Tip: to search all elements within a certain radius, call vptree.search(element, Infinity, radius).

Precomputing the tree

Typical usage of this library involves large datasets or expensive distance computations. You will probably want to precompute the vp-tree structure, so that your final application does just the searching.

vptree.stringify()

Returns a stringified JavaScript object literal of the vp-tree structure. Like JSON.stringify but without nulls and quotes to save space. It is valid JavaScript, but not valid JSON, so JSON.parse() will complain.

The stringified object is not the whole VPTree instance : it does not contain the initial dataset, nor the distance function. Its only purpose is to be pasted in the code of your final app, where it will have to be turned back into a searchable VPTree instance with the load() function.

VPTreeFactory.load(S, distance, tree)

Reuses a precomputed stringified vp-tree, and returns a searchable VPTree instance.

  • S the array that was used to pre-build the vp-tree.
  • distance the distance function that was used to pre-build the vp-tree.
  • tree the vp-tree structure object. Must be a plain object.

About the distance function

The vp-tree algorithm needs a real metric. In particular, the squared euclidean distance won't do the job because it does not satisfy the triangle inequality : if you want to use the standard euclidean distance, don't forget the square root.

vptree.js's People

Contributors

fpirsch avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.