This is a fork of Manzil Zaheer's multithreaded CoverTree implementation, modified to store feature IDs alongside the features and to embed a Mongoose asynchronous HTTP server so that queries can be made over HTTP.
To use:
- Modify `config.json` so that the filepaths point to your list of filenames and your file of features, i.e.
```json
{"19" : [
    {"region" : "<Name of this dataset>",
     "data" : "<Filepath of data>",
     "filenames" : "<List of filenames that correspond to the data>"},
    {"region" : "<Another dataset>",
     "data" : "<Filepath of data>",
     "filenames" : "<List of filenames that correspond to the data>"}
  ]
}
```
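The config can also be generated programmatically. A minimal sketch using the standard library; the dataset name and file paths below are hypothetical placeholders, not values from this repository:

```python
import json

# Hypothetical example values; substitute your own dataset name and paths.
config = {
    "19": [
        {"region": "my_dataset",
         "data": "data/my_dataset.bin",
         "filenames": "data/my_dataset_filenames.txt"}
    ]
}

# Write the config in the shape the server expects to load.
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```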
Currently, the datafile format is `<int number of points><int number of dimensions><array of doubles of point data>`. The file `data/convert.py` converts a `.t7` file that contains an array of feature-vectors into this format. You could also modify the `read_point_file` function to read your own data format, or trivially modify `convert.py` to convert a `.npy` file to this format.
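Assuming the `<int>` fields are native 4-byte ints and the doubles are written in native byte order (check `read_point_file` for the exact layout on your platform), writing this format can be sketched with only the standard library:

```python
import struct

def write_point_file(path, points):
    """Write points (a list of equal-length float lists) in the
    <num points><num dims><doubles...> binary layout described above."""
    num_points = len(points)
    num_dims = len(points[0]) if points else 0
    with open(path, "wb") as f:
        # Two native-endian 4-byte ints, then every coordinate as a double.
        f.write(struct.pack("ii", num_points, num_dims))
        for p in points:
            f.write(struct.pack(f"{num_dims}d", *p))

# Hypothetical output path and toy data for illustration.
write_point_file("points.bin", [[1.0, 2.0], [3.0, 4.0]])
```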
- Run `make` (if not already compiled)
- Run `dist/cover_tree <path to config.json> <port number to use for server>`
The server responds to queries of the form:
`<url of server>:<port>/?filename=<filename of feature>&limit=<number results to return>&level=19&region=<dataset name>`
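A query URL can be composed with the standard library. The host, port, and parameter values below are hypothetical; substitute the address your server is listening on:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical query parameters; "filename" and "region" must match
# entries in your filenames list and config.json respectively.
params = {
    "filename": "img_00042.jpg",
    "limit": 10,
    "level": 19,
    "region": "my_dataset",
}
url = "http://localhost:8080/?" + urlencode(params)
print(url)
# response = urlopen(url).read()  # run this once the server is up
```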
Requirements:
- C++14 compiler
- OpenMP
- Optional: `torchfile` to use `convert.py`