AE-NE: Autoencoder with background noise estimation for background reconstruction and foreground segmentation
Implementation of the AE-NE model described in the paper "Autoencoder-based background reconstruction and foreground segmentation with background noise estimation".
Demo videos: BMC008.mp4, turbulence.mp4, zoom_in_zoom_out.mp4, continuous_pan.mp4
The model needs PyTorch (>= 1.7.1) and Torchvision with CUDA support (see https://pytorch.org/).
The model also needs OpenCV (>= 4.1) (see https://opencv.org/).
To install other requirements:
pip install -r requirements.txt
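A quick way to check that the installation is functional is to print the library versions and the CUDA status from Python (a minimal sanity check, not part of the repository):

    import torch
    import torchvision
    import cv2

    print("PyTorch:", torch.__version__)           # expected >= 1.7.1
    print("Torchvision:", torchvision.__version__)
    print("OpenCV:", cv2.__version__)              # expected >= 4.1
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))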
The model has been tested on NVIDIA RTX 2080 Ti and NVIDIA RTX 3090 GPUs for image sizes below 1000x1000.
Higher image resolutions are also supported, but may require reducing the default batch size to avoid GPU memory overflow, and updating other hyperparameters such as the learning rate and the number of training steps accordingly.
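When working at higher resolutions, it can help to check how much memory the GPU offers and how much PyTorch is actually using during training; the snippet below only uses standard PyTorch calls and is not part of the repository. The exact names of the batch size and learning rate options can be listed with python main.py -h.

    import torch

    # Total memory of the first CUDA device, in GiB
    props = torch.cuda.get_device_properties(0)
    print(f"GPU 0: {props.name}, {props.total_memory / 1024**3:.1f} GiB")

    # Memory currently reserved / allocated by PyTorch (useful to monitor during training)
    print("reserved :", torch.cuda.memory_reserved(0) / 1024**3, "GiB")
    print("allocated:", torch.cuda.memory_allocated(0) / 1024**3, "GiB")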
The command to generate the backgrounds and foreground masks from a sequence of frames is:
python main.py --input_path your_input_path
where your_input_path is the path to the folder where the frame sequence is saved. Example: python main.py --input_path /workspace/Datasets/CDnet2014/dataset/baseline/highway
The resulting background images and foreground masks will be stored in two subdirectories, 'results' and 'backgrounds', of the current working directory.
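As an illustration, the generated images can be browsed with OpenCV. The snippet below is only a sketch: it assumes the outputs are standard image files in the 'backgrounds' and 'results' subdirectories, and the glob patterns may have to be adapted to the actual file names.

    import glob
    import cv2

    # Assumed output locations (see above); adapt the patterns to the actual file names
    backgrounds = sorted(glob.glob('backgrounds/*.*'))
    masks = sorted(glob.glob('results/*.*'))

    for bg_path, mask_path in zip(backgrounds, masks):
        background = cv2.imread(bg_path)                    # reconstructed background
        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)  # foreground mask
        cv2.imshow('background', background)
        cv2.imshow('foreground mask', mask)
        if cv2.waitKey(0) == 27:                            # press Esc to stop browsing
            break
    cv2.destroyAllWindows()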
To view options, type python main.py -h
The default training mode is fully unsupervised. A weakly supervised option is also implemented, in which the number of training iterations and the background complexity must be provided as inputs to the model.
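As an illustration, a weakly supervised run could look like the command below; the option names used here are only placeholders, the actual names and accepted values are listed by python main.py -h.

    python main.py --input_path /workspace/Datasets/CDnet2014/dataset/baseline/highway --training_iterations 20000 --background_complexity 1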
To evaluate the AE-NE model on the CDnet 2014 dataset:
- download the CDnet 2014 dataset from the following link:
http://jacarini.dinf.usherbrooke.ca/static/dataset/dataset2014.zip
and save it to a folder of your choice
- update the dataset path at the end of the Python file "test_CDnet.py"
- to perform a partial test, update the category list at the end of the Python file "test_CDnet.py"
- run the Python program test_CDnet.py (a sketch of the F-measure computation used for the evaluation is given after this list)
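For reference, the sketch below shows the kind of per-sequence F-measure computation used by the CDnet 2014 benchmark; it is a simplified illustration, not the code of test_CDnet.py. The paths and file patterns are placeholders, predicted masks are assumed to be binary (foreground close to 255), and shadow / unknown / out-of-ROI ground-truth labels are simply ignored here, whereas the official evaluation treats them more finely.

    import glob
    import cv2
    import numpy as np

    # Placeholder paths: ground-truth masks of one CDnet sequence and the predicted masks
    gt_files = sorted(glob.glob('/path/to/dataset2014/dataset/baseline/highway/groundtruth/*.png'))
    pred_files = sorted(glob.glob('results/*.png'))

    tp = fp = fn = 0
    for gt_path, pred_path in zip(gt_files, pred_files):
        gt = cv2.imread(gt_path, cv2.IMREAD_GRAYSCALE)
        pred = cv2.imread(pred_path, cv2.IMREAD_GRAYSCALE) > 127   # binarize the predicted mask

        valid = (gt == 0) | (gt == 255)   # keep only certain background / foreground pixels
        gt_fg = gt == 255                 # CDnet marks foreground pixels with the value 255

        tp += np.sum(pred & gt_fg & valid)
        fp += np.sum(pred & ~gt_fg & valid)
        fn += np.sum(~pred & gt_fg & valid)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    print(f"precision={precision:.3f}  recall={recall:.3f}  F-measure={f_measure:.3f}")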
Warning: different runs of the model with the same inputs may lead to small differences from the published evaluation results, due to the random initialization of the autoencoder and the random sampling of the images during training.
To evaluate the model on the non-video datasets ClevrTex, ShapeStacks and ObjectsRoom, see the AST repository.