umass-rescue / combinedtechstack

Handle and process large amounts of media data with plug-and-play machine learning models

Home Page: https://umass-rescue.github.io/CombinedTechStack/

License: MIT License


combinedtechstack's Introduction

Combined Server, Client, and Microservices

This repository contains the combined code for the frontend, server, and both training+prediction microservices.

Setup

The setup of the new combined repository is very simple. First, delete any existing containers and volumes from any previous installations of the server or microservices.

To run this repository, follow these steps:

Installation Step 1: Install Docker

Install Docker Desktop and ensure that Docker is running on your computer.


Now, open the root folder of this repository in the command line.

Installation Step 2: Build Project Containers

Run the command docker-compose build

Running the Application

Start Application

Run the command docker-compose up

Stop Application

Run the command docker-compose down

API Access

If you would like to interact with the API, download the Postman collection for easy testing.

Run in Postman

combinedtechstack's People

Contributors: alolikagon, anishapai, barenla, christopherdoan, codyrichter, nikkolab


combinedtechstack's Issues

Enable GPU In Microservices

Currently, the Docker containers used for training and prediction microservices only have access to the CPU and memory.
Allowing the containers to interact with the system GPU can lead to large speed-ups in prediction and training tasks.

Testing UI for Predictions

Right now when a user wants to test their model, they must either connect via the docker shell or use the server + client to submit requests.

There should be a UI that allows for users to upload images to the model and then view the results.

Create separate home screens per user type

Right now, a single home screen is shown to all users, so depending on the user type it shows either too much or not enough information. Since there are three user types (admin, researcher, investigator), each should have its own version of the home screen.

Optional: Allow toggling between different views depending on permission hierarchy (admin > researcher > investigator)

The views should contain the following UI features:

Admin:

  • Add permissions to users via username and role selector dropdown
  • Remove permissions from users via username (auto-populating the current role)
  • List all users of each role type in a list/table

Researcher:

  • Create and view API keys for use with microservices

Investigator:

  • View number of pending jobs that the user has submitted
  • View charts on image statistics. (example: images uploaded per day, images matching model criteria, etc...)

Constant Image Sizing on Load

Currently, datasets can contain images of any size. On load, the images should be resized to a consistent size so that they work with all models.
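
A minimal sketch of what this resize-on-load step could look like, assuming Pillow is available; the 224x224 target is only an example and should match whatever input size the models expect:

from PIL import Image

TARGET_SIZE = (224, 224)  # example only; the real value should match the models' expected input

def load_and_resize(path):
    """Open an image from disk and resize it to a consistent size."""
    image = Image.open(path).convert("RGB")
    return image.resize(TARGET_SIZE)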

Streamline Investigator Workflow

Currently, Investigators must manually select models to run when importing data through the client. To maximize efficiency, features should be added to streamline this process.

Features:

  • Add Model Tagging

  • Be able to register a user-specific preset of models to run (i.e. the user can select a "fast" preset where relatively faster models are selected; conversely, be able to select a "slow" preset to run models over the weekend)

  • Be able to choose models to run on data based on model tag (e.g. select all models tagged with "huggingface")

Testing the Object Pipeline

Tests have not been written for the abstracted object pipeline yet.

Things to try breaking in tests (a rough starting point follows this list):

  • Uploading different files/model types via both the postman endpoint and the front-end.
  • Changing the model_type parameter in config.py to be something other than video, audio, image.
  • Uploading broken files and making sure the POST request fails.
  • Testing all object-related methods in db_connection.py
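
As a starting point, a hedged sketch of one such test, assuming a FastAPI-style app with a /predict upload endpoint; the app import path, the form field name, and the expected status code are assumptions and need to be checked against the actual server code:

from fastapi.testclient import TestClient

from server.main import app  # assumed import path

client = TestClient(app)

def test_predict_rejects_broken_file():
    # Upload a deliberately corrupted "image" and expect the request to fail.
    broken_file = {"file": ("broken.png", b"not-really-an-image", "image/png")}  # field name assumed
    response = client.post("/predict", files=broken_file)
    assert response.status_code >= 400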

Allow video files to be run through other models

We can modify the /predict endpoint so that when a video comes in, it is converted to audio and text as well, as per the user's request (this would involve thinking about the front-end UI first, and what is the most intuitive way to present options to the investigator before tackling the back-end). This issue deals with the front-end.

In the back-end, this would be done by adding a clause to the prediction pipeline that is something like:

if file type is video and audio models are selected:
  1. convert files to audio 
  2. run predictions on audio models.

Things to think about:

  • For step 2, the same create_new_prediction function can be called.
  • Would recommend abstracting this clause into a helper function to keep create_new_prediction readable (a rough sketch follows this list)
  • There is a model_type parameter in the create_new_prediction function. Is this still necessary?
  • The user could select both video and audio models, so video prediction should still be run in that case.
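
A rough sketch of such a helper. The names here are hypothetical: extract_audio, the structure of selected_models, and the exact signature of create_new_prediction are all assumptions rather than the actual API:

def run_audio_models_for_video(video_path, selected_models):
    """If audio models were selected for a video upload, convert the video
    to audio and submit predictions on the extracted audio track."""
    audio_models = [m for m in selected_models if m.get("type") == "audio"]  # assumed structure
    if not audio_models:
        return

    audio_path = extract_audio(video_path)  # hypothetical helper, e.g. wrapping ffmpeg
    for model in audio_models:
        # Reuse the existing prediction entry point; signature is assumed.
        create_new_prediction(audio_path, model=model, model_type="audio")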

Create Dataset Explorer View

When a researcher wants to train a model on a dataset, they must first know what datasets are available and the general details about each dataset.

Create a new view on the Training page which queries existing server endpoints to get information on the available datasets.

Some endpoints which may be useful for this task:
/training/list - Lists all datasets connected to the server
/training/detail - Gives general information on the number of processing/finished training jobs
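
For reference, a minimal sketch of querying these endpoints from Python. The server address, the auth header name, and the JSON response shapes are assumptions:

import requests

SERVER = "http://localhost:5000"      # assumed address
HEADERS = {"api_key": "<your key>"}   # assumed auth header name

# List all datasets connected to the server (response assumed to be JSON).
datasets = requests.get(f"{SERVER}/training/list", headers=HEADERS).json()
print(datasets)

# General information on processing/finished training jobs.
detail = requests.get(f"{SERVER}/training/detail", headers=HEADERS).json()
print(detail)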

Improve Hyperparameter tuning to support more hyperparameters

Currently, hyperparameter tuning can only tune (1) the optimizer and (2) the learning rate.

It should support optimizer-specific hyperparameters. Right now it crashes if a hyperparameter specific to one optimizer is used (e.g. SGD accepts momentum but Adam does not).
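
One possible fix, sketched with PyTorch-style optimizers: keep a per-optimizer whitelist of valid hyperparameters and filter the search space before constructing the optimizer. The use of PyTorch, the optimizer set, and the parameter names are examples, not the project's actual configuration:

import torch

# Hyperparameters each optimizer actually accepts (example subset).
VALID_HYPERPARAMS = {
    "sgd": {"lr", "momentum", "weight_decay"},
    "adam": {"lr", "betas", "weight_decay"},
}

OPTIMIZERS = {"sgd": torch.optim.SGD, "adam": torch.optim.Adam}

def build_optimizer(name, model_params, hyperparams):
    """Drop hyperparameters the chosen optimizer does not support instead of crashing."""
    allowed = VALID_HYPERPARAMS[name]
    filtered = {k: v for k, v in hyperparams.items() if k in allowed}
    return OPTIMIZERS[name](model_params, **filtered)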

Audio Speech Recognition

Use machine learning to perform audio speech recognition and transcription, and use the transcription to pull out valuable information (a rough NER sketch follows the list below).

  • Perform speech recognition on long audio files
  • From an audio transcription, perform Named Entity Recognition
  • Be able to discern all important named entities after the NER operation (ranking?)
  • Associate named entities with timestamps from the audio
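
A minimal sketch of the NER and ranking step, assuming the transcription already exists as text and that spaCy with the en_core_web_sm model is an acceptable choice; the speech-to-text step itself is left out and the library choice is an assumption:

from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def rank_named_entities(transcript: str):
    """Extract named entities from an audio transcript and rank them by frequency."""
    doc = nlp(transcript)
    counts = Counter((ent.text, ent.label_) for ent in doc.ents)
    return counts.most_common()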

Combine Text Pipeline with Object Pipeline

For prediction, there are currently two pipelines: the text pipeline and the object pipeline.

The /predict endpoints also take different parameters. We could combine these endpoints so that all uploaded files go through the same endpoint and are processed from there.

This would involve:

  • Thinking carefully about how to maintain all the text functionality, e.g. authors, audience, source
  • Considering whether this is even necessary; maybe it will be decided that they should stay separate. How will the front-end accommodate this?
  • Perhaps creating helper functions that process the uploaded files according to their type
  • Keeping the create_new_prediction function readable and concise

Create Statistics Page Endpoints

After tagging is complete, create endpoints for a new page on the server: statistics

This page will show charts per tag, such as how many images fit a certain criteria in a model result, how many images are processing, etc...

This page should also have the option to export images matching a certain criterion, similar to the search feature of the Review page.

Research best way to handle large amounts of uploaded files

All images are currently uploaded directly via HTTP. However, this is a huge bottleneck when the number of images being uploaded is large.

Different protocols should be investigated to determine if HTTP is in fact the optimal solution, and if there is a better way to send large quantities of images to the server.

Allow Training Result Sharing

When a user submits a training job, they are the only one able to view the results. Add the option to share training results with other registered users.

This will involve changing the user field in the training request to a list, adding some new endpoints, and then having corresponding UI elements on the frontend.

Force Directed Graph for User + Images

Create a React Component that displays a Force Directed Graph with user-image associations.

Many users may interact with specific images, and the relationships between which users use which portion of the images can be displayed well in a force directed graph. Creating this component and having an interactive FDG will enable more comprehensive review and insights on the image data.

Add Graphs to Home Page

The home page is extremely bland and boring right now. It can be enhanced with graphs, charts, tables, etc... that show different statistics about the user's requests and the status of the server.

This is an extremely open-ended issue, but it offers an opportunity to become familiar with the endpoints available on the server as well as to get started with React.

Handle Automatic Train/Test Split

Right now the server only supports a train/validation split with the entire dataset, and no test set is generated when the dataset is loaded.

Create a new environment variable in docker-compose.yml for train/test split, and then use this new test set to provide additional statistics on model training.
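
A hedged sketch of how the dataset loader could honor such a variable, using scikit-learn's train_test_split; the variable name TRAIN_TEST_SPLIT and the surrounding loader code are assumptions:

import os

from sklearn.model_selection import train_test_split

def split_dataset(samples, labels):
    """Split off a held-out test set based on an environment variable set in docker-compose.yml."""
    test_fraction = float(os.environ.get("TRAIN_TEST_SPLIT", "0.2"))  # assumed variable name
    x_train, x_test, y_train, y_test = train_test_split(
        samples, labels, test_size=test_fraction, random_state=42
    )
    return x_train, x_test, y_train, y_test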

Stress Testing Server

Using pytest, create a test called test_stress_import that does not run by default. The test should run only when an environment variable is set on the machine it is pushed/deployed to; it should benchmark the import function and ensure that the import feature completes in a reasonable amount of time.
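
A minimal sketch of gating such a test on an environment variable with pytest; the variable name STRESS_TEST, the run_bulk_import helper, and the time budget are all assumptions:

import os
import time

import pytest

@pytest.mark.skipif(os.environ.get("STRESS_TEST") != "1", reason="stress tests are opt-in")
def test_stress_import():
    """Benchmark the import feature and fail if it takes unreasonably long."""
    start = time.monotonic()
    run_bulk_import(num_images=10_000)  # hypothetical helper wrapping the import feature
    elapsed = time.monotonic() - start
    assert elapsed < 300, f"bulk import took {elapsed:.1f}s"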

Combine front-end Upload pages

Currently there are two different front-end views for Image and Video. These can also be combined, and the front-end Upload page can be re-designed to swap intuitively between different model types (audio, video, text, image). This issue deals with the back-end.

Possibly relevant: there are two different endpoints, /list/image and /list/video, in prediction.py. These can be combined into a single endpoint that takes a type parameter.

Take a look at the back end, if needed: https://github.com/UMass-Rescue/CombinedTechStack/wiki/Abstracted-Prediction-Pipeline
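
A rough sketch of the combined endpoint, assuming a FastAPI-style prediction.py; the router setup and the get_models_by_type helper are hypothetical and do not reflect the actual code:

from fastapi import APIRouter, HTTPException

router = APIRouter()

@router.get("/list/{model_type}")
def list_models(model_type: str):
    """Single listing endpoint that replaces /list/image and /list/video."""
    if model_type not in {"image", "video", "audio", "text"}:
        raise HTTPException(status_code=400, detail="Unknown model type")
    return get_models_by_type(model_type)  # hypothetical database helper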

Create Statistics Page

Once tagging is created, a new page should be created for Statistics. This will show charts and information on a certain tag using your JS chart framework of choice. Information from the statistics endpoints on the server can be displayed nicely, with bar graphs and pie charts summarizing results.

Autocomplete for Image Tags

Currently, the image tag autocomplete comes from a static list on the client-side.

There should be a module in the backend to manage and store this list, and the client should fetch it from the server (a rough sketch of such an endpoint follows).
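
A minimal sketch of the server side of this, assuming FastAPI and an in-memory list as a stand-in for wherever the tags end up being stored:

from fastapi import APIRouter

router = APIRouter()

# Stand-in storage; a real implementation would persist tags in the database.
IMAGE_TAGS = ["face", "vehicle", "indoor", "outdoor"]

@router.get("/tags")
def list_tags():
    """Return the autocomplete tag list so the client no longer needs a static copy."""
    return {"tags": sorted(IMAGE_TAGS)}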

Image Batching on Upload

Currently, models are fed data image-by-image. This is not efficient, and ideally they will be able to take a batch of images at one time to create predictions.

Determine how to enable image batching, and create a NEW endpoint that will be able to batch upload images at /model/batch_predict.

Do not worry about compatibility with existing microservices; they can be updated later to fit this new pattern, and the old endpoint still exists.
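
A hedged sketch of the new endpoint, assuming the prediction microservices use FastAPI and that a hypothetical run_batch_prediction helper performs the actual batched inference:

from typing import List

from fastapi import APIRouter, File, UploadFile

router = APIRouter()

@router.post("/model/batch_predict")
async def batch_predict(images: List[UploadFile] = File(...)):
    """Accept a batch of images in one request and run a single batched prediction."""
    contents = [await image.read() for image in images]
    results = run_batch_prediction(contents)  # hypothetical batched inference helper
    return {"results": results}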

Search Page Bug Fix

Video models do not register on the Search page, i.e. searching by model in the "advanced" search will not show any results if a video model is selected.
