
Salmon Computer Vision Project

This repository contains several tools and utilities to assist in training automated salmon counting models. The two major categories are video-based enumeration and sonar-based enumeration.

License

The code is currently under the MIT License, but this may change in the future.

The data and annotations are under Creative Commons BY-NC-SA 4.0 (official license). Commercial use is not permitted, and any adaptations must be published under the same license.

Any Salmon Vision models published here are under the ResearchRAIL license, for research purposes only.

Video-based

The current enumeration strategy uses two computer vision models: a multi-object tracker (MOT) and an object detector. We use ByteTrack for MOT and YOLOv6 for object detection.

Dataset

  • Full dataset and models (Dropbox), ~100 GB each for MOT and object detection.

It includes individual frame images and labels in the formats required by ByteTrack and YOLOv6. These can easily be converted to other similar formats, either manually or with Datumaro. The pre-trained models are also included, along with a preliminary YOLOv8 model.
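
As a rough sketch, a Datumaro conversion might look like the following; the format names and paths are illustrative and depend on your dataset layout and Datumaro version:

datum convert -if coco -i path/to/bytetrack_labels -f yolo -o path/to/yolo_labels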

These annotations are in the "CVAT for Video 1.1" format and include tags that specify male/female, injuries, etc. They cover the Kitwanga River and Bear Creek River bounding box annotations, with no images. The conversion script is in the utils folder (utils/datum_create_dataset.py) and requires Datumaro to run. Refer to this documentation for more details.

Models

Trained on an Ubuntu 20.04 Lambda Scalar system with 4 A5000 GPUs.

Multi-Object Tracker (MOT)

The current framework uses ByteTrack to track individual salmon for counting.

The following steps are for Ubuntu 20.04:

Clone our version of the ByteTrack repo:

git clone https://github.com/Salmon-Computer-Vision/ByteTrack.git
cd ByteTrack

Follow either the docker install or host machine install in the ByteTrack documentation to install all the requirements to run ByteTrack.
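
For reference, the host machine install in the ByteTrack README is roughly the following (check their documentation for the current steps):

pip3 install -r requirements.txt
python3 setup.py develop
pip3 install cython
pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
pip3 install cython_bbox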

Download the bytetrack_salmon.tar.gz dataset from the Dataset section, or convert the dataset to the MOT sequence format and use the scripts in the ByteTrack repo to convert it to the COCO format.

Extract it and place the salmon folder in the datasets folder of ByteTrack if it is not already there:

tar xzvf bytetrack_salmon.tar.gz
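
For example, assuming the archive extracts a salmon folder into the current ByteTrack directory:

mkdir -p datasets
mv salmon datasets/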

Download the pretrained YOLOX nano model from the ByteTrack model zoo.

Place the pretrained model in the pretrained folder. The path should be pretrained/yolox_nano.pth.

Run the training either inside the docker container or on the host machine:

python3 tools/train.py -f exps/example/mot/yolox_nano_salmon.py -d 4 -b 256 --fp16 -o -c pretrained/yolox_nano.pth

If training was cancelled partway through, you can resume from the latest checkpoint with the following command:

python3 tools/train.py -f exps/example/mot/yolox_nano_salmon.py -d 4 -b 256 --fp16 -o --resume

Lower -b (batch size) if running on a GPU with less memory.
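
For example, a single-GPU run with a reduced batch size might look like this (the -d and -b values are illustrative):

python3 tools/train.py -f exps/example/mot/yolox_nano_salmon.py -d 1 -b 32 --fp16 -o -c pretrained/yolox_nano.pth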

Once finished, the final outputs will be in YOLOX_outputs/yolox_nano_salmon/, where best_ckpt.pth.tar is the checkpoint with the highest validation mAP score.

To run inference with the model on a video:

python3 tools/demo_track.py video -f exps/example/mot/yolox_nano_salmon.py -c YOLOX_outputs/yolox_nano_salmon/best_ckpt.pth.tar --path path/to/video.mp4 --fp16 --fuse --save_result

Other input options for demo_track.py include camera and images. Run the following to see them all:

python3 tools/demo_track.py -h

Object Detector

This section describes YOLOv6; however, the steps and format are similar for other versions.

Clone the YOLOv6 repo:

git clone https://github.com/meituan/YOLOv6.git

Install Python3 requirements:

cd YOLOv6
pip3 install -r requirements.txt

Download the yolov6_salmon.tar.gz dataset from the Dataset section, or convert the dataset to the YOLO format following the instructions in the YOLOv6 repo.

Extract the dataset:

tar xzvf yolov6_salmon.tar.gz

Download the combined_bear-kitwanga.yaml file from the Dataset section and place it in the data folder. This file describes the location of the dataset and the class labels. Edit the YAML to point to where you extracted the dataset.
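
As a rough sketch, the file follows the standard YOLOv6 data config layout; the paths and class names below are illustrative placeholders, not the actual contents of combined_bear-kitwanga.yaml:

train: path/to/yolov6_salmon/images/train  # training images
val: path/to/yolov6_salmon/images/val      # validation images
is_coco: False
nc: 1              # number of classes (placeholder)
names: ["salmon"]  # class names (placeholder)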

Run the training using multiple GPUs:

python -m torch.distributed.launch --nproc_per_node 4 tools/train.py --epoch 100 --batch 512 --conf configs/yolov6n_finetune.py --eval-interval 2 --data data/combined_bear-kitwanga.yaml --device 0,1,2,3

Lower the --batch size appropriately if running on GPUs with less memory.
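
For example, a single-GPU run with a smaller batch size might look like this (the batch size is illustrative):

python tools/train.py --epoch 100 --batch 64 --conf configs/yolov6n_finetune.py --eval-interval 2 --data data/combined_bear-kitwanga.yaml --device 0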

The final outputs will be in runs/train/exp<X>, where <X> is the number of the run.

To run inference with YOLOv6:

python3 tools/infer.py \
    --yaml data/combined_bear-kitwanga.yaml \
    --weights runs/train/exp${X}/weights/best_ckpt.pt \
    --source "$vid" \
    --save-txt \
    --device $device

Here, $vid is the path to the input video and $device is the GPU device number. If you only have one GPU, set $device to 0.

The resulting output will be in the runs/inference folder.

Check the YOLOv6 README for further inference commands, or run python3 tools/infer.py -h.

Sonar-based

Convert ARIS sonar files to videos with pyARIS using the Python 3 script ./extract_aris/aris_to_video.py.
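
The exact arguments depend on the script; a hypothetical invocation might look like the following (input and output paths are illustrative):

python3 extract_aris/aris_to_video.py path/to/input.aris path/to/output_dir/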

