This repository contains several tools and utilities to assist in training salmon-counting automation tools. The two major categories are video-based enumeration and sonar-based enumeration.
The code is currently under the MIT License, but this could change in the future.
The data and annotations are under Creative Commons BY-NC-SA 4.0 (official license): no commercial use is permitted, and any adaptations must be published under the same license.
Any Salmon Vision models published here are under the ResearchRAIL license, for research purposes only.
The current enumeration strategy uses two computer vision models: multi-object tracking (MOT) and object detection. We use ByteTrack for MOT and YOLOv6 for object detection.
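Conceptually, the detector proposes salmon bounding boxes in each frame, and the tracker links those boxes across frames so each fish keeps a single ID; counting then reduces to counting distinct track IDs. Below is a minimal sketch of that loop, assuming hypothetical `detector` and `tracker` callables rather than the actual YOLOv6/ByteTrack APIs:

```python
# Minimal two-stage counting sketch (hypothetical interfaces, not the
# real YOLOv6/ByteTrack APIs).
import cv2

def count_salmon(video_path, detector, tracker):
    """detector(frame) -> [(x1, y1, x2, y2, score), ...]
    tracker.update(dets) -> [(track_id, x1, y1, x2, y2), ...]"""
    seen_ids = set()
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        dets = detector(frame)                      # detection stage
        for track_id, *_ in tracker.update(dets):   # tracking stage
            seen_ids.add(track_id)                  # one ID per fish
    cap.release()
    return len(seen_ids)  # naive count: distinct track IDs in the video
```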
- Full dataset (Dropbox) of ~100 GB each for MOT and object detection.
It includes individual frame images and labels in the required formats for ByteTrack and YOLOv6. They can easily be converted to other similar formats either manually or with Datumaro (see the conversion sketch after this list).
- Labels only (GitHub repo).
These annotations are in the "CVAT for Video 1.1" format and include tags that specify male/female, injuries, etc. It includes the Kitwanga River and Bear Creek River bounding-box annotations with no images. The conversion script is in the `utils` folder (`utils/datum_create_dataset.py`) and requires Datumaro to run. Refer to this documentation for more details.
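As a rough illustration of the Datumaro route mentioned above, a format conversion might look like the sketch below. The format names (`mot_seq`, `coco`) and the `save_media` argument are assumptions that vary across Datumaro versions, so verify them against `datum convert --help`:

```python
# Hedged sketch: converting annotations between formats with Datumaro.
# Format names and keyword arguments are assumptions; check your
# installed Datumaro version.
import datumaro as dm

dataset = dm.Dataset.import_from('datasets/salmon', 'mot_seq')  # placeholder path
dataset.export('datasets/salmon_coco', 'coco', save_media=True)
```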
Trained on an Ubuntu 20.04 Lambda Scalar system with 4 A5000 GPUs.
The current framework uses ByteTrack to track individual salmon for counting.
The following steps are for Ubuntu 20.04:
Clone our version of the ByteTrack repo:
```bash
git clone https://github.com/Salmon-Computer-Vision/ByteTrack.git
cd ByteTrack
```
Follow either the Docker install or the host machine install in the ByteTrack documentation to install all the requirements for running ByteTrack.
Download the `bytetrack_salmon.tar.gz` dataset from the Dataset section, or convert the dataset to the MOT sequence format and use the script in the ByteTrack repo to convert it to the COCO format. Extract it and put the `salmon` folder in the `datasets` folder in `ByteTrack` if it is not already there:

```bash
tar xzvf bytetrack_salmon.tar.gz
```
Download the pretrained YOLOX nano model from their model zoo. Place it in the `pretrained` folder; the path should be `pretrained/yolox_nano.pth`.
Run the training either inside the Docker container or on the host machine:

```bash
python3 tools/train.py -f exps/example/mot/yolox_nano_salmon.py -d 4 -b 256 --fp16 -o -c pretrained/yolox_nano.pth
```
If you canceled the training in the middle, you can resume from a checkpoint with the following command:
```bash
python3 tools/train.py -f exps/example/mot/yolox_nano_salmon.py -d 4 -b 256 --fp16 -o --resume
```
Lower `-b` (batch size) if running on a GPU with less memory.
Once finished, the final outputs will be in `YOLOX_outputs/yolox_nano_salmon/`, where `best_ckpt.pth.tar` is the checkpoint with the highest validation mAP score.
To run inference with the trained model on a video (note that the checkpoint passed with `-c` must match the `yolox_nano_salmon` experiment, so use the trained `best_ckpt.pth.tar` rather than an incompatible pretrained checkpoint):

```bash
python3 tools/demo_track.py video -f exps/example/mot/yolox_nano_salmon.py -c YOLOX_outputs/yolox_nano_salmon/best_ckpt.pth.tar --path path/to/video.mp4 --fp16 --fuse --save_result
```
`demo_track.py` supports other input options, such as webcam and images. Run the following to list them all:

```bash
python3 tools/demo_track.py -h
```
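With `--save_result`, the demo also saves the tracking results alongside the rendered video. Assuming a MOTChallenge-style text file (`frame,id,x,y,w,h,score,...` per row, an assumption worth checking against the actual output), a rough post-hoc fish count can be read off the distinct track IDs; the file path below is a placeholder:

```python
# Rough count of distinct track IDs from an assumed MOTChallenge-style
# results file; verify the actual format --save_result produces.
import csv

def count_unique_tracks(results_txt):
    ids = set()
    with open(results_txt) as f:
        for row in csv.reader(f):
            if len(row) >= 2:
                ids.add(int(float(row[1])))  # column 1 holds the track ID
    return len(ids)

print(count_unique_tracks('YOLOX_outputs/yolox_nano_salmon/results.txt'))  # placeholder path
```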
This section describes YOLOv6; however, the steps and format are similar for the other versions.
Clone the YOLOv6 repo:
```bash
git clone https://github.com/meituan/YOLOv6.git
```
Install Python3 requirements:
```bash
cd YOLOv6
pip3 install -r requirements.txt
```
Download the `yolov6_salmon.tar.gz` dataset from the Dataset section, or convert the dataset to the YOLO format following the instructions in the YOLOv6 repo.
Extract the dataset:
```bash
tar xzvf yolov6_salmon.tar.gz
```
Download the `combined_bear-kitwanga.yaml` file from the Dataset section and place it in the `data` folder; it describes the location of the dataset and the class labels. Edit the YAML to point to where you extracted the dataset.
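For reference, a YOLOv6 data YAML generally follows the layout below; the paths and the single `salmon` class here are placeholders for illustration, so keep whatever `combined_bear-kitwanga.yaml` actually specifies:

```yaml
# Placeholder layout of a YOLOv6 data YAML; adjust paths to your extraction.
train: ../datasets/salmon/images/train
val: ../datasets/salmon/images/val
is_coco: False
nc: 1                # number of classes (placeholder)
names: ['salmon']    # class names (placeholder)
```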
Run the training using multiple GPUs:

```bash
python -m torch.distributed.launch --nproc_per_node 4 tools/train.py --epoch 100 --batch 512 --conf configs/yolov6n_finetune.py --eval-interval 2 --data data/combined_bear-kitwanga.yaml --device 0,1,2,3
```
Lower `--batch` appropriately if running on GPUs with less memory.
The final outputs will be in `runs/train/exp<X>`, where `<X>` is the number of the run.
To run inference with YOLOv6:

```bash
python3 tools/infer.py \
    --yaml data/combined_bear-kitwanga.yaml \
    --weights runs/train/exp${X}/weights/best_ckpt.pt \
    --source "$vid" \
    --save-txt \
    --device $device
```
`$device` describes the GPU device number. If you only have one GPU, set `device=0`.
The resulting output will be in the `runs/inference` folder.
Check the YOLOv6 README for further inference commands, or run `python3 tools/infer.py -h`.
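Since `--save-txt` writes detections as YOLO-format label files (assumed here to be `class cx cy w h [conf]` rows with normalized coordinates), a quick per-class tally could look like the following; the labels path is a placeholder:

```python
# Tally detections per class from YOLO-format label files; the row layout
# and output path are assumptions to verify against your inference output.
from pathlib import Path

def count_detections(label_dir):
    per_class = {}
    for txt in Path(label_dir).glob('*.txt'):
        for line in txt.read_text().splitlines():
            cls = int(line.split()[0])  # first column is the class index
            per_class[cls] = per_class.get(cls, 0) + 1
    return per_class

print(count_detections('runs/inference/exp/labels'))  # placeholder path
```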
Convert ARIS sonar files to videos with pyARIS using the Python 3 script `./extract_aris/aris_to_video.py`.
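For orientation, a heavily hedged sketch of the ARIS-to-video idea is below; the pyARIS names used (`DataImport`, `FrameRead`, `frame_data`, `FrameCount`) are assumptions based on common pyARIS usage, so defer to `extract_aris/aris_to_video.py` for the actual calls:

```python
# Hedged ARIS -> video sketch; pyARIS function and attribute names are
# assumptions; see extract_aris/aris_to_video.py for the real API.
import cv2
import numpy as np
import pyARIS

aris_data, first_frame = pyARIS.DataImport('sonar_clip.aris')  # placeholder path
h, w = first_frame.frame_data.shape
out = cv2.VideoWriter('sonar_clip.mp4',
                      cv2.VideoWriter_fourcc(*'mp4v'), 24.0, (w, h), False)
for i in range(aris_data.FrameCount):  # frame count attribute: assumption
    frame = pyARIS.FrameRead(aris_data, i)
    out.write(frame.frame_data.astype(np.uint8))  # raw beam data as grayscale
out.release()
```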