kitware / smqtk

Python toolkit for pluggable algorithms and data structures for multimedia-based machine learning.

Home Page: http://smqtk.readthedocs.org/

License: Other

CMake 0.89% Python 31.02% HTML 1.01% CSS 0.03% JavaScript 1.70% Shell 0.46% TeX 0.95% C 1.65% C++ 9.84% Cuda 0.88% MATLAB 0.30% Makefile 0.09% Java 1.23% M4 0.78% Jupyter Notebook 1.12% Dockerfile 0.22% Stylus 0.01% Pug 0.07% Terra 47.73% Jinja 0.01%
machine-learning machine-learning-algorithms multimedia algorithm python framework plugin hacktoberfest

smqtk's Introduction

SMQTK - Deprecated

Deprecated

As of Jan 2021, SMQTK v0.14.0 has been deprecated. The various interfaces and implementations have been broken out into the following distinct packages which will continue to be supported instead.

  • SMQTK-Core provides underlying tools used by other libraries.

  • SMQTK-Dataprovider provides data structure abstractions.

  • SMQTK-Image-IO provides interfaces and implementations around image input/output.

  • SMQTK-Descriptors provides algorithms and data structures around computing descriptor vectors.

  • SMQTK-Classifier provides interfaces and implementations around classification.

  • SMQTK-Detection provides interfaces and support for black-box object detection.

  • SMQTK-Indexing provides interfaces and implementations around the k-nearest-neighbor algorithm.

  • SMQTK-Relevancy provides interfaces and implementations around providing search relevancy estimation.

  • SMQTK-IQR provides classes and utilities to perform the Interactive Query Refinement (IQR) process.

Intent

Social Multimedia Query ToolKit aims to provide a simple, easy-to-use API for:

  • Scalable data structure interfaces and implementations, with a focus on those relevant for machine learning.
  • Algorithm interfaces and implementations of machine learning algorithms with a focus on media-based functionality.
  • High-level applications and utilities for working with available algorithms and data structures for specific purposes.

Through these features, users and developers are able to access various machine learning algorithms and techniques to use over different types of data for different high level applications. Examples of high level applications may include being able to search a media corpus for similar content based on a query, or providing a content-based relevancy feedback interface for a web application.

Documentation

Documentation for SMQTK is maintained at Read the Docs, including build instructions and examples.

Alternatively, you can build the Sphinx documentation locally for the most up-to-date reference (see also: Building the Documentation):

# Navigate to the documentation root.
cd docs
# Install dependencies and build Sphinx docs.
pip install -r readthedocs-reqs.txt
make html
# Open in your favorite browser!
firefox _build/html/index.html

smqtk's People

Contributors

aashish24, ankitshah009, bardkw, benjamin-pikus-kw, bluemellophone, chetnieter, chrismattmann, danlamanna, dependabot[bot], erotemic, fishcorn, jbeezley, jeffbaumes, kfieldho, kw-moeller, kylefromkitware, manthey, msarahan, opadron, predicative, purg, waxlamp, z-harry-sun

smqtk's Issues

ImageSpace: Extraneous scrollbars under each image

Under each image, there is a div that holds the icons. It has scroll bars with nothing to scroll. This can be fixed by setting overflow:auto or deleting the overflow property set on .im-caption in imagespace.min.css on line 38.

This appears in Firefox 38.0 on Ubuntu 15.04.

Update IQR control structures

They are still using mechanisms from the old system: pure ID references, no new data representation use. This needs to be fixed before we can finish the IQR demo GUI again.

Add some tests for the revised implementations.

ImageSpace: Minor interface tweaks

I think these should be simple and can be handled with text and/or html, so I'm grouping them together:

  1. A button to show all images. It's not too bad to click in the search bar and then hit enter, unless you're coming from a page that already has the search field populated. Then you have to remember how to do it.
  2. A note on the Image Size Distribution and Serial Number Distribution graphs that says that they are only showing results from the current search.
  3. If there are no results from the current search for these two graphs, let's say that: "No results found for current search terms." As it is, it's just a bit confusing.

Refactor out use of global configuration components

Since this is not intended to be a single-application library, each application should have its own configuration. This would prevent unrelated applications from carrying more configuration than they need, and prevent applications from accidentally sharing duplicate configuration values, e.g. server ports, security keys, etc.

This should be easier to do since we now have the Configurable interface and mechanics for nested configuration of things.
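
A minimal sketch of what per-application nested configuration could look like. The Configurable class below is a hypothetical stand-in written for this example, not the library's actual interface, and MemoryIndex, SearchApp and their parameters are invented for illustration.

```python
class Configurable:
    """Hypothetical stand-in for a Configurable-style interface."""

    @classmethod
    def get_default_config(cls) -> dict:
        raise NotImplementedError

    @classmethod
    def from_config(cls, config: dict):
        # Start from the defaults and overlay whatever the caller provided.
        merged = cls.get_default_config()
        merged.update(config)
        return cls(**merged)


class MemoryIndex(Configurable):
    def __init__(self, max_size: int):
        self.max_size = max_size

    @classmethod
    def get_default_config(cls) -> dict:
        return {"max_size": 1000}


class SearchApp(Configurable):
    """An application that owns its configuration, including nested parts."""

    def __init__(self, port: int, index: dict):
        self.port = port
        # The nested section is handed to the component it belongs to.
        self.index = MemoryIndex.from_config(index)

    @classmethod
    def get_default_config(cls) -> dict:
        return {"port": 5000, "index": MemoryIndex.get_default_config()}


app = SearchApp.from_config({"port": 8080})
print(app.port, app.index.max_size)
```

Here each application carries only the settings it declares, so a second application could choose a different port without touching any shared global state.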

Integrate use of DescriptorElement abstraction layer

In-memory and file-based implementations already exist but are not currently used anywhere in the system. Using the abstraction will be advantageous for future scaling, i.e. being able to store features in a db, elastic cloud, etc.

Confidence Interval display is broken

The polygon drawn to represent the CI for a PR/ROC curve doesn't always overlap the underlying curve. This basically means the CI as currently displayed is meaningless. Either fix the current implementation/display of the Wilson score interval or find another CI method to use.
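
As a reference point when debugging the display, here is a small sketch of the standard Wilson score interval for a binomial proportion. It is not the project's implementation; the function name and the default z = 1.96 (roughly a 95% interval) are choices made for this example. For a proportion strictly between 0 and 1 the interval brackets the point estimate, so a polygon built from it should overlap the curve.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Return (low, high) Wilson score bounds for successes / trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p_hat + z2 / (2.0 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / trials + z2 / (4.0 * trials * trials)
    )
    # Clamp to [0, 1] to absorb floating-point round-off at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


low, high = wilson_interval(8, 10)
print(low, high)
```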

Add environment variable option for generic plugin method

Currently, the generic plugin function will only pick up available implementations within a single specific directory in the source tree. When SMQTK is installed somewhere, the source tree will most likely be unmodifiable, or should at least be assumed so. So, for this all to work in an actual "plugin" fashion, there should be a way to add other directories for the plugin methods to look at and pick implementations up from.
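
One possible shape for this, sketched under assumptions: the variable name SMQTK_PLUGIN_PATH and the function plugin_search_dirs are hypothetical, chosen only for illustration. The variable is treated like PATH, a list of directories joined by the platform's path separator, and is appended to the built-in directory.

```python
import os
from typing import List, Mapping, Optional


def plugin_search_dirs(
    builtin_dir: str,
    env_var: str = "SMQTK_PLUGIN_PATH",  # hypothetical variable name
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the built-in plugin directory followed by any extra
    directories listed in the environment variable."""
    environ = os.environ if environ is None else environ
    dirs = [builtin_dir]
    for entry in environ.get(env_var, "").split(os.pathsep):
        entry = entry.strip()
        # Skip empty entries and duplicates, preserving first-seen order.
        if entry and entry not in dirs:
            dirs.append(entry)
    return dirs


extra = os.pathsep.join(["/opt/plugins", "/home/user/plugins"])
print(plugin_search_dirs("/usr/lib/app/plugins", environ={"SMQTK_PLUGIN_PATH": extra}))
```

Keeping the built-in directory first means bundled implementations are found before user-supplied ones; reversing the order would let users override them instead.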

Tests for bin/ scripts

Might need/want a flag in CMake or otherwise to enable these since they may end up taking a bit of time to run, drawing from my experience with these kinds of tests on other projects.

Requirements files cleanup

There are numerous packages listed in requirements files that should probably be optional. Some are also only needed for web modules, which might also want separate requirements files. With many requirements files, might also want to put them into a sub-directory instead of making the source root messy.

ImageSpace: how do you show the full size image?

Previously clicking on the image would open it full size in a new browser window. Now, it doesn't seem possible to see the full size image at all.

I think this is important, especially if you're trying to see if it's the same person in two images - or get a better look at a tattoo, etc...

I would suggest another icon under the images (maybe the picture icon?) that when clicked opens it full size. This could be a browser window for now, but we should probably think about adding light box style functionality so that you could actually move through the images like you do through any web gallery.

Add a high level data representation interface

At minimum, this would let us require abstract serialize and unserialize methods instead of globally requiring pickle for serialization, which is known to be sub-optimal for certain things (like numpy arrays).
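
A rough sketch of such an interface, assuming a bytes-based contract. The class names DataRepresentation and VectorRepresentation are hypothetical, and the standard library array module stands in here for whatever compact format a numpy-backed implementation might choose.

```python
import abc
from array import array
from typing import Iterable


class DataRepresentation(abc.ABC):
    """Hypothetical interface: each type decides how it is serialized."""

    @abc.abstractmethod
    def serialize(self) -> bytes:
        """Return a byte representation of this object."""

    @classmethod
    @abc.abstractmethod
    def unserialize(cls, data: bytes) -> "DataRepresentation":
        """Rebuild an instance from bytes produced by serialize()."""


class VectorRepresentation(DataRepresentation):
    """A vector of floats stored as raw 8-byte doubles instead of a pickle."""

    def __init__(self, values: Iterable[float]):
        self.values = [float(v) for v in values]

    def serialize(self) -> bytes:
        return array("d", self.values).tobytes()

    @classmethod
    def unserialize(cls, data: bytes) -> "VectorRepresentation":
        buf = array("d")
        buf.frombytes(data)
        return cls(buf)


original = VectorRepresentation([0.5, 1.25, -3.0])
restored = VectorRepresentation.unserialize(original.serialize())
print(restored.values)
```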

New ``DescriptorIndex`` Data structure

NearestNeighborIndex, RelevancyIndex and CodeIndex algorithms and structure use some sort of "index" of descriptor element instances. I think there could be a generalized DescriptorElement index that abstractly stores instances. Elements should be query-able by their type/UUID pair (true unique identifier).
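
To make the idea concrete, here is a minimal in-memory sketch keyed by the type/UUID pair. The Descriptor tuple and the InMemoryDescriptorIndex class are hypothetical names used only for this illustration; a real implementation would sit behind an abstract interface so that other storage backends could be swapped in.

```python
from typing import Dict, Hashable, Iterator, NamedTuple, Tuple


class Descriptor(NamedTuple):
    """Hypothetical stand-in for a descriptor element."""
    type_str: str
    uuid: Hashable
    vector: Tuple[float, ...]


class InMemoryDescriptorIndex:
    """Stores descriptors keyed by their (type, UUID) pair."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, Hashable], Descriptor] = {}

    def add(self, descriptor: Descriptor) -> None:
        # Adding the same (type, UUID) pair again overwrites the old entry.
        self._table[(descriptor.type_str, descriptor.uuid)] = descriptor

    def has(self, type_str: str, uuid: Hashable) -> bool:
        return (type_str, uuid) in self._table

    def get(self, type_str: str, uuid: Hashable) -> Descriptor:
        return self._table[(type_str, uuid)]

    def __len__(self) -> int:
        return len(self._table)

    def __iter__(self) -> Iterator[Descriptor]:
        return iter(self._table.values())


index = InMemoryDescriptorIndex()
index.add(Descriptor("color-hist", "img-001", (0.1, 0.9)))
index.add(Descriptor("cnn", "img-001", (0.3, 0.3, 0.4)))
print(len(index), index.get("cnn", "img-001").vector)
```

The same UUID appears under two descriptor types above, which is why the pair, rather than the UUID alone, serves as the key.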

ImageSpace: missing ads db data??

This may just be me being confused, but when I look at images such as this one:
id:"/data/roxyimages/5070f670039e500755fe7887d6f3d4a8f24e4a79.jpg"
I don't see an ad ID or the text from the ad that it was retrieved from.

Am I missing something or is something amiss?

ImageSpace: Image Size link produces no results

When I click on a dot in the Image Size Distribution graph, I get no results. The dot shouldn't show up if there are no results associated with it.

To reproduce:

  1. Open ImageSpace, click in search bar (keep the *) and hit enter to bring up all images.
  2. The first image (id: /data/roxyimages/cmuImages/Illinois_2012_11_12_1352748331000_6_0.jpg) is a woman in a green Vancouver t-shirt lying on her back. Open the details and click Find Similar Images.
  3. When the results load, click Image Size Distribution in the left side bar.
  4. Click the dot at width = 149, height = 166. This should probably have 4 results, according to its size.
  5. See that it returns no results.

Parallel map function can hang during interruption or externally killed workers

When Ctrl-C'ing a parallel map in progress, a deadlock can occur.

It has also been seen that if the workers are doing web requests, they can lock up, possibly due to an infinite wait in the request. If the threads or processes are then killed externally, the function deadlocks and can't clean itself up properly.
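
One defensive pattern, sketched here with threads only and not taken from the project's code: workers run as daemon threads so a stuck worker cannot block interpreter exit, and the collecting side waits on the result queue with a timeout so it raises instead of waiting forever. The function name and parameters are hypothetical, and the timeout bounds each wait for the next result rather than each individual task.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List


def parallel_map(func: Callable[[Any], Any], items: Iterable[Any],
                 workers: int = 4, timeout: float = 5.0) -> List[Any]:
    """Apply func to every item using worker threads, preserving order."""
    items = list(items)
    tasks: "queue.Queue" = queue.Queue()
    results: "queue.Queue" = queue.Queue()
    for position, item in enumerate(items):
        tasks.put((position, item))

    def work() -> None:
        while True:
            try:
                position, item = tasks.get_nowait()
            except queue.Empty:
                return  # No work left, so the thread ends on its own.
            try:
                results.put((position, func(item), None))
            except Exception as exc:  # Report failures to the caller.
                results.put((position, None, exc))

    for _ in range(workers):
        threading.Thread(target=work, daemon=True).start()

    output: List[Any] = [None] * len(items)
    for _ in items:
        try:
            position, value, error = results.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no result arrived within %.1f seconds" % timeout)
        if error is not None:
            raise error
        output[position] = value
    return output


print(parallel_map(lambda x: x * x, range(5)))
```

A process-based version would need additional care (terminating children explicitly), which this sketch does not attempt.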

Add VP-Tree HashIndex

Or somehow generalize this into the CodeIndex representation structure. It's pretty much the case that all LSH-based nearest neighbor algorithms will/can use the same nearest-code-neighbor strategy once hashes are generated.
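
For illustration, a compact vantage-point tree over integer hash codes using Hamming distance might look like the sketch below. This is a generic textbook-style construction, not an existing SMQTK class; the names VPTree and hamming are chosen for this example, and vantage points are picked at random.

```python
import random
from typing import Iterable, List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


class VPTree:
    def __init__(self, codes: Iterable[int], rng: Optional[random.Random] = None):
        rng = random.Random(0) if rng is None else rng
        codes = list(codes)
        self.vantage: Optional[int] = None
        self.radius = 0
        self.inner: Optional["VPTree"] = None  # codes closer than radius
        self.outer: Optional["VPTree"] = None  # codes at radius or beyond
        if not codes:
            return
        self.vantage = codes.pop(rng.randrange(len(codes)))
        if not codes:
            return
        dists = [hamming(self.vantage, c) for c in codes]
        self.radius = sorted(dists)[len(dists) // 2]  # median distance
        inner = [c for c, d in zip(codes, dists) if d < self.radius]
        outer = [c for c, d in zip(codes, dists) if d >= self.radius]
        self.inner = VPTree(inner, rng) if inner else None
        self.outer = VPTree(outer, rng) if outer else None

    def nearest(self, query: int) -> Tuple[Optional[int], float]:
        """Return (closest code, distance); (None, inf) for an empty tree."""
        best: List = [None, float("inf")]
        self._search(query, best)
        return best[0], best[1]

    def _search(self, query: int, best: List) -> None:
        if self.vantage is None:
            return
        d = hamming(query, self.vantage)
        if d < best[1]:
            best[0], best[1] = self.vantage, d
        # Visit the side the query falls on first.
        if d < self.radius:
            first, second = self.inner, self.outer
        else:
            first, second = self.outer, self.inner
        if first is not None:
            first._search(query, best)
        # By the triangle inequality, the other side can only hold a better
        # match if the boundary lies within the current best distance.
        if second is not None and abs(d - self.radius) <= best[1]:
            second._search(query, best)


tree = VPTree([0b0000, 0b1111, 0b1010, 0b0111])
print(tree.nearest(0b1011))
```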

Similarity kNN server

A server application that, given a label identifying the index to use, an example descriptor as a JSON list/vector, and the number N of desired neighbors, returns a JSON object listing the [UUID, descriptor] pairs in order of increasing distance. Maybe include each pair's distance from the exemplar, too.
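
The request/response shape could be sketched as a plain function before any web framework is chosen. Everything named here is an assumption for illustration: the JSON field names ("label", "descriptor", "n", "neighbors"), the INDEXES table, and the brute-force Euclidean search standing in for a real nearest-neighbor index.

```python
import json
import math
from typing import Dict, List

# Hypothetical data: index label -> {UUID: descriptor vector}.
INDEXES: Dict[str, Dict[str, List[float]]] = {
    "demo": {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [3.0, 4.0]},
}


def euclidean(u: List[float], v: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def handle_knn_request(body: str, indexes=INDEXES) -> str:
    """Take a JSON request string and return a JSON response string."""
    request = json.loads(body)
    index = indexes[request["label"]]
    query = request["descriptor"]
    n = int(request["n"])
    # Brute-force scan; ties on distance fall back to UUID order.
    scored = sorted((euclidean(query, vec), uuid) for uuid, vec in index.items())
    neighbors = [
        {"uuid": uuid, "descriptor": index[uuid], "distance": dist}
        for dist, uuid in scored[:n]
    ]
    return json.dumps({"neighbors": neighbors})


print(handle_knn_request(json.dumps({"label": "demo", "descriptor": [0.9, 0.0], "n": 2})))
```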
