
bf777 / deeplabcut


This project is a fork of deeplabcut/deeplabcut.


Markerless tracking of user-defined features with deep learning

Home Page: http://www.mousemotorlab.org/deeplabcut

License: GNU Lesser General Public License v3.0

Languages: Python 99.83%, Shell 0.17%


DeepLabCut

A toolbox for markerless tracking of body parts of animals in lab settings performing various tasks, like trail tracking, reaching in mice, and various Drosophila behaviors (see Mathis et al. for details). There is, however, nothing specific that makes the toolbox applicable only to these tasks or species (the toolbox has already been successfully applied to rats and zebrafish).

Please see www.mousemotorlab.org/deeplabcut for video demonstrations of automated tracking.

This work utilizes the feature detectors (ResNet + readout layers) of one of the state-of-the-art algorithms for human pose estimation by Insafutdinov et al., called DeeperCut, which inspired the name for our toolbox (see references below).

In our preprint we demonstrate that those feature detectors can be trained with few labeled images to achieve excellent tracking accuracy for various body parts in lab tasks. Please check it out:

"Markerless tracking of user-defined features with deep learning" by Alexander Mathis, Pranav Mamidanna, Taiga Abe, Kevin M. Cury, Venkatesh N. Murthy, Mackenzie W. Mathis* and Matthias Bethge*

Overview:

A typical use case is:

A user has videos of an animal (or animals) performing a behavior and wants to extract the position of various body parts from images/video frames. Ideally these parts are visible to a human annotator, yet potentially difficult to extract by standard image processing methods due to changes in background, etc.

To solve this problem, one can train feature detectors in an end-to-end fashion. In order to do so one should:

  • label points of interest (e.g. joints, snout, etc.) in distinct frames (containing different poses, individuals, etc.)
  • train a deep neural network while leaving out some labeled frames to check whether it generalizes well
  • once the network is trained, use it to analyze videos quickly

The general pipeline for first time use is:

Install --> Extract frames --> Label training data --> Train DeeperCut feature detectors --> Apply your trained network to unlabeled data --> Extract trajectories for analysis.

Installation and Requirements:

  • Hardware:

    • Computer: For reference, we use Ubuntu 16.04 LTS and run a Docker container that has TensorFlow, etc. installed (*available in a future release). One may be able to run the code on Windows or macOS (but we have not tried). You will need a strong GPU, such as the NVIDIA GeForce 1080 Ti.
  • Software:

    • You will need TensorFlow for Python 3 with GPU support (otherwise training and running is very slow). We used version 1.0 for the figures in the papers; later versions also work with the provided code (we tested TensorFlow 1.0 to 1.4).
    • Install Spyder (or an equivalent IDE) and/or Jupyter Notebook
    • Clone (or download) the code we provide
    • You will also need to install the following Python packages (in the terminal type):
    $ pip install scipy scikit-image matplotlib pyyaml easydict 
    $ pip install moviepy imageio tqdm tables
    $ git clone https://github.com/AlexEMG/DeepLabCut.git
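
To quickly verify that TensorFlow was installed with GPU support, you can run a minimal check in Python (a sketch, assuming a TensorFlow 1.x installation as used here):

 # Sanity check: does TensorFlow see the GPU? (assumes TensorFlow 1.x)
 import tensorflow as tf
 from tensorflow.python.client import device_lib

 print(tf.__version__)                    # ideally between 1.0 and 1.4 for this code base
 print(device_lib.list_local_devices())   # look for a device of type "GPU" in the output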
    

Test the Toolbox installation & code:

  • If you want to run the code on our demo video, a mouse reaching video from Mathis et al., 2017, you will NOT run code from sections (0), (1), or (2) below, as we have created labels for this video already (and e.g. (0) will extract different frames that are thus not labeled).

  • We recommend looking at the first notebooks, then proceeding to (3) Formatting the data below. Also note that this demo data set contains so few labeled frames that one should not train the network on it (other than for brief testing) and expect it to work properly - it is for demo purposes only.

Using the Toolbox code - Labeling and Training Instructions:

  • The following steps document using the code with either Python scripts or in Jupyter Notebooks:

(0) Configuration of your project: Open the "myconfig.py" file and set the global variables for your dataset. (Demo users: don't edit this file if you want to test on the supplied video.)
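
For orientation, a configuration typically sets variables along the following lines. This is only an illustrative sketch with hypothetical values; the authoritative variable names are the ones already defined in the supplied "myconfig.py".

 # Illustrative values only -- edit the variables that already exist in myconfig.py.
 Task = 'reaching'                                       # short name for your experiment
 bodyparts = ['hand', 'Finger1', 'Finger2', 'Joystick']  # body parts you plan to label
 Scorers = ['YourName']                                  # who labeled the data
 date = 'Jan30'                                          # used when naming the generated data set
 cropping = True                                         # crop frames to the region of interest?
 x1, x2, y1, y2 = 0, 640, 0, 480                         # crop coordinates in pixels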

(1) Selecting data to label: In the folder "Generating_a_Training_Set", the provided code allows you to select a subset of frames from your video(s) for labeling. Make sure the videos you want to use for the training set are in a sub-folder under "Generating_a_Training_Set", or change the video path accordingly in "myconfig.py".

  • IDE users:

    • Open "Step1_SelectRandomFrames_fromVideos.py" and crop videos if behavior of interest only happens in subset of frame (see Step1_SelectRandomFrames_fromVideos.py for detailed instructions; edit in Spyder or your favorite integrated development environment (IDE) an run the script).
  • Jupyter users: use the Step1_.._demo.ipynb file. In general, the supplied Jupyter Notebook is helpful for optimizing the video cropping step.

Generally speaking, one should create a training set that reflects the diversity of the behavior with respect to postures, animal identities, etc. of the data that will be analyzed. This code randomly selects frames from the videos in a temporally uniformly distributed way, which is fine when the postures vary accordingly. However, the behavior might be sparse (as in the case of reaching, where the reach and pull are very fast and the mouse does not move much between trials). In that case, one can extract several example videos of different pulls, and this code will then sample the behavior well. Take this into account when selecting frames to label: because you can label only a little data, make sure your selected frames capture the full breadth of the behavior. You may also want to hand-select extra frames of interest.
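
For intuition, the temporally uniform sampling described above amounts to something like the following sketch (the supplied Step1 script is the real implementation and additionally handles cropping and naming conventions; the video path below is hypothetical):

 # Sketch: pick frames at times drawn uniformly at random over the whole video.
 # Assumes moviepy and imageio from the install list; paths/counts are examples.
 import os
 import numpy as np
 import imageio
 from moviepy.editor import VideoFileClip

 video = 'Generating_a_Training_Set/reachingvideo1.avi'   # hypothetical path
 nframes_to_pick = 100
 out_dir = 'selected-frames'
 if not os.path.exists(out_dir):
     os.makedirs(out_dir)

 clip = VideoFileClip(video)
 times = np.sort(np.random.uniform(0, clip.duration, nframes_to_pick))
 for i, t in enumerate(times):
     frame = clip.get_frame(t)                            # numpy array (height x width x 3)
     imageio.imwrite(os.path.join(out_dir, 'img%06d.png' % i), frame)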

(2) Label the frames:

  • You should label a sufficient number of frames with the anatomical locations of your choice. For the behaviors we have tested so far, 100-200 frames gave good results (see preprint). Depending on your required accuracy, more training data might be necessary. Try to label the same spot consistently (e.g. on a large body part such as the wrist, pick one consistent point).

  • Labeling can be done in any program, but we recommend Fiji. In Fiji one can simply open the images, create a (virtual) stack* (in brief, in Fiji: File > Import > Image Sequence > check "virtual stack"), then use the "Multi-point Tool" to label frames. Scroll through the frames and click on as many points as you wish, in the same order on each frame. Then simply measure and save the resulting .csv file (Analyze > Measure, or simply Ctrl+M).

*To open a virtual stack see: https://imagej.nih.gov/ij/plugins/virtual-opener.html. The virtual stack is helpful when the images have different sizes: this way they are not rescaled, and the label information does not need to be rescaled either.
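
Once saved, the Fiji measurement table can be inspected with pandas. A minimal sketch, assuming the table contains X and Y columns and using a hypothetical file name:

 # Quick look at the Fiji "Measure" output; the exact columns depend on your
 # Fiji measurement settings (here we assume X and Y columns exist).
 import pandas as pd

 measurements = pd.read_csv('reaching-labels.csv')   # hypothetical file name
 print(measurements.columns)
 print(measurements[['X', 'Y']].head())              # one row per clicked point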

(3) Formatting the data I:

  • IDE users:

If you did not label some frames in a video because the body parts you were labeling disappeared, first use "FrameTrimmer.py". It compares the frames with the .csv file of labels and permanently discards frames that do not match any labels (i.e. frames that were not labeled). It is a good idea to create a backup of these frames and the .csv file first!

To split the labels by body part (which is required for Step 2 to work), use "Step2_1_Splitting_CSV.py". This generates one .csv file per body part in the same folder as the images. Make sure to discard your original .csv file after this is done.

The code "Step2_ConvertingLabels2DataFrame.py" creates a data structure in pandas (stored as .h5 and .csv) combining the various labels together with the (local) file path of the images. This data structure also keeps track of who labeled the data and allows to combine data from multiple labelers. Keep in mind that ".csv" files for each bodyparts listed in the myconfig.py file should exist in the folder alongside the individual images.

  • Jupyter users: use the Step2_.._demo.ipynb file
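
For orientation, the resulting pandas structure looks roughly like the sketch below; the body part names, file names, and coordinates are hypothetical placeholders, and the actual layout is produced by "Step2_ConvertingLabels2DataFrame.py".

 # Sketch of a pandas DataFrame combining scorer, body parts, and image paths.
 import numpy as np
 import pandas as pd

 scorer = 'YourName'
 bodyparts = ['hand', 'Finger1']                     # should match myconfig.py
 images = ['img000001.png', 'img000002.png']         # hypothetical image paths

 columns = pd.MultiIndex.from_product([[scorer], bodyparts, ['x', 'y']],
                                      names=['scorer', 'bodyparts', 'coords'])
 data = np.random.rand(len(images), len(columns))    # placeholder coordinates
 df = pd.DataFrame(data, index=images, columns=columns)

 df.to_hdf('CollectedData_%s.h5' % scorer, key='df_with_missing', mode='w')  # needs "tables"
 df.to_csv('CollectedData_%s.csv' % scorer)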

(4) Checking the formatted data:

After this step, you may check whether the data was loaded correctly and all the labels are properly placed (use "Step3_CheckLabels.py"; see the sketch below for the idea).

  • Jupyter users: use the Step3_.._demo.ipynb file
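
As a rough illustration of what this check does, you could overlay the stored labels on a single image yourself; a sketch, assuming the .h5 file from step (3) and hypothetical file names:

 # Overlay stored labels on the first image to eyeball correctness.
 import pandas as pd
 import matplotlib.pyplot as plt
 import matplotlib.image as mpimg

 df = pd.read_hdf('CollectedData_YourName.h5')       # hypothetical file from step (3)
 image_name = df.index[0]                            # index holds the image file paths
 img = mpimg.imread(image_name)

 plt.imshow(img)
 scorer = df.columns.get_level_values(0)[0]
 for bodypart in df.columns.get_level_values(1).unique():
     plt.scatter(df.loc[image_name, (scorer, bodypart, 'x')],
                 df.loc[image_name, (scorer, bodypart, 'y')], s=20, label=bodypart)
 plt.legend()
 plt.savefig('check_labels_frame0.png')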

(5) Formatting the data II: Next, split the labeled data into train and test sets for benchmarking ("Step4_GenerateTrainingFileFromLabelledData.py"). This step creates a ".mat" file, which is used by DeeperCut, as well as a ".yaml" file containing meta information about the DeeperCut parameters. Before this step, consider changing the parameters in 'pose_cfg.yaml'; this file also contains short descriptions of what the parameters mean. Generally speaking, pos_dist_thresh and global_scale will be most important. Then run the script. It creates a folder with the training data as well as a folder for training the corresponding model in DeeperCut (a rough sketch of the train/test split idea is shown below).

  • Jupyter users: use the Step4_.._demo.ipynb file

  • The output will be two folders for train and test data (with their respective yaml files)
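
For intuition, the train/test split in this step amounts to something like the following sketch (the supplied Step4 script writes the actual .mat and .yaml files that DeeperCut expects):

 # Toy sketch of splitting labeled frames into train and test sets.
 import numpy as np

 n_labeled_frames = 120                  # however many frames you labeled
 train_fraction = 0.95                   # cf. "trainset95" in the example folder name

 indices = np.random.permutation(n_labeled_frames)
 n_train = int(round(train_fraction * n_labeled_frames))
 train_indices, test_indices = indices[:n_train], indices[n_train:]
 print(len(train_indices), 'training frames,', len(test_indices), 'test frames')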

(6) Training the deep neural network:

The folder pose-tensorflow contains an earlier, minimal (yet sufficient for our purposes) variant of DeeperCut, which we tested with TensorFlow 1.0 to 1.4. Before training a model for the first time, you need to download the weights for the ResNet pretrained on ImageNet from tensorflow.org (~200 MB). To do that:

 $ cd pose-tensorflow/models/pretrained
 $ ./download.sh

Next, copy the two folders generated in step (5) Formatting the data II into the models folder of pose-tensorflow (i.e. pose-tensorflow/models/). We have already done this for the example project, which you will find there. Then (in a terminal) navigate to the "train" subfolder of your model's folder (in our case the example project below) and start training (good luck!):

 $ cd pose-tensorflow/models/reachingJan30-trainset95shuffle1/train
 $ TF_CUDNN_USE_AUTOTUNE=0 CUDA_VISIBLE_DEVICES=0 python3 ../../../train.py 

If your machine has multiple GPUs, you can select which GPU to run on by setting the environment variable CUDA_VISIBLE_DEVICES, e.g. CUDA_VISIBLE_DEVICES=0.

Tips: You can also stop training and restart from a snapshot (aka checkpoint): just change the init_weights entry, i.e. instead of "init_weights: ../../pretrained/resnet_v1_50.ckpt" put "init_weights: ./snapshot-insertthe#ofstepshere" (e.g. 10000). Train for several thousand iterations until the loss plateaus.
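
If you prefer to script this change instead of editing the file by hand, a minimal sketch using pyyaml (from the install list) could look as follows; note that re-dumping the file this way drops its comments, so hand-editing pose_cfg.yaml is usually simpler.

 # Point init_weights at a saved snapshot so training resumes from that checkpoint.
 import yaml

 cfg_path = 'pose_cfg.yaml'                  # the copy inside your model's train/ folder
 with open(cfg_path) as f:
     cfg = yaml.safe_load(f)

 cfg['init_weights'] = './snapshot-10000'    # use the step count of your own snapshot
 with open(cfg_path, 'w') as f:
     yaml.safe_dump(cfg, f, default_flow_style=False)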

(7) Evaluate your network:

In the folder "Evaluation-tools", you will find code to evaluate the performance of the trained network on the whole data set (train and test images).

 $ CUDA_VISIBLE_DEVICES=0 python3 Step1_EvaluateModelonDataset.py #to evaluate your model [needs TensorFlow]
 $ python3 Step2_AnalysisofResults.py  #to compute test & train errors for your trained model
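
For intuition, the reported errors are essentially mean Euclidean distances (in pixels) between the predicted and the human-labeled coordinates; a toy sketch with placeholder arrays (the supplied Step2_AnalysisofResults.py computes the real numbers for your model):

 # Toy example of the pixel-error metric on made-up coordinate arrays.
 import numpy as np

 # arrays of shape (n_frames, n_bodyparts, 2) holding x, y in pixels (placeholders)
 ground_truth = np.random.rand(50, 4, 2) * 640
 predictions = ground_truth + np.random.randn(50, 4, 2) * 3.0

 errors = np.linalg.norm(predictions - ground_truth, axis=2)   # per frame and body part
 print('mean error: %.2f pixels' % errors.mean())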

(8) Run the trained network on other videos and create labeled videos

After successfully training the network and finding a low generalization error, you can extract labeled points and poses from all videos and plot them on top of the frames. Of course, the extracted poses can also be used in many other ways.

  • To begin, first edit the myconfig_analysis.py file

  • To extract postures from a folder of videos, run "CUDA_VISIBLE_DEVICES=0 python3 AnalyzeVideos.py", then make labeled videos with "MakingLabeledVideo.py". Use "PlotEval.py" to output all of the predicted labels for a given video to .csv and to display all body parts as a single, averaged value (this is useful for communicating the efficacy of the label predictor).
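
The saved predictions can also be loaded directly with pandas for custom analyses; a sketch, assuming the analysis output is stored as a pandas .h5 in the same layout as the labeled data (file and body part names below are hypothetical):

 # Load saved predictions and plot one body part's trajectory.
 import pandas as pd
 import matplotlib.pyplot as plt

 df = pd.read_hdf('reachingvideo1_predictions.h5')    # hypothetical output file
 scorer = df.columns.get_level_values(0)[0]

 x = df[scorer]['hand']['x']                          # 'hand' is a hypothetical body part
 y = df[scorer]['hand']['y']
 plt.plot(x, y, '.-')
 plt.gca().invert_yaxis()                             # image coordinates: y grows downward
 plt.xlabel('x (pixels)')
 plt.ylabel('y (pixels)')
 plt.savefig('hand_trajectory.png')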

(9) Run the trained network on a streaming video

Requires a Raspberry Pi with a Picam. It is likely possible to use other webcams as well, but the current script has only been tested with Picam.

Coming soon!

Support:

If you are having issues, please let us know (Issue Tracker). For questions feel free to reach out to: Alexander Mathis [[email protected]] and/or Mackenzie Mathis [[email protected]]

Code contributors:

Alexander Mathis, Mackenzie Mathis, Taiga Abe, Jonas Rauber, and of course the DeeperCut authors for the feature detector code. The feature detector code is based on Eldar Insafutdinov's TensorFlow implementation of DeeperCut. Modifications in this fork were added by Brandon Forys. Please check out the following references for details:

References:

@inproceedings{insafutdinov2017cvpr,
    title = {ArtTrack: Articulated Multi-person Tracking in the Wild},
    booktitle = {CVPR'17},
    url = {http://arxiv.org/abs/1612.01465},
    author = {Eldar Insafutdinov and Mykhaylo Andriluka and Leonid Pishchulin and Siyu Tang and Evgeny Levinkov and Bjoern Andres and Bernt Schiele}
}

@inproceedings{insafutdinov2016eccv,
    title = {DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model},
    booktitle = {ECCV'16},
    url = {http://arxiv.org/abs/1605.03170},
    author = {Eldar Insafutdinov and Leonid Pishchulin and Bjoern Andres and Mykhaylo Andriluka and Bernt Schiele}
}

@misc{1804.03142,
    title = {Markerless tracking of user-defined features with deep learning},
    author = {Alexander Mathis and Pranav Mamidanna and Taiga Abe and Kevin M. Cury and Venkatesh N. Murthy and Mackenzie W. Mathis and Matthias Bethge},
    year = {2018},
    eprint = {arXiv:1804.03142}
}

License:

This project is licensed under the GNU Lesser General Public License v3.0.

