
OM-CNN + 2C-LSTM for video saliency prediction

The model from "DeepVS: A Deep Learning Based Video Saliency Prediction Approach" (ECCV 2018).

Abstract

Over the past few years, deep neural networks (DNNs) have exhibited great success in predicting the saliency of images. However, there are few works that apply DNNs to predict the saliency of generic videos. In this paper, we propose a novel DNN-based video saliency prediction method. Specifically, we establish a large-scale eye-tracking database of videos (LEDOV), which provides sufficient data to train the DNN models for predicting video saliency. Through the statistical analysis of our LEDOV database, we find that human attention is normally attracted by objects, particularly moving objects or the moving parts of objects. Accordingly, we propose an object-to-motion convolutional neural network (OM-CNN) to learn spatio-temporal features for predicting the intra-frame saliency via exploring the information of both objectness and object motion. We further find from our database that there exists a temporal correlation of human attention with a smooth saliency transition across video frames. Therefore, we develop a two-layer convolutional long short-term memory (2C-LSTM) network in our DNN-based method, using the extracted features of OM-CNN as the input. Consequently, the inter-frame saliency maps of videos can be generated, which consider the transition of attention across video frames. Finally, the experimental results show that our method advances the state-of-the-art in video saliency prediction.

Publication

The extended version of our work was published at ECCV 2018; one can cite it with the following BibTeX entry:

@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Lai and Xu, Mai and Liu, Tie and Qiao, Minglang and Wang, Zulin},
title = {DeepVS: A Deep Learning Based Video Saliency Prediction Approach},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
} 

Models

The whole architecture consists of two parts, OM-CNN and 2C-LSTM, as shown below. The pre-trained model has already been uploaded to Google Drive and BaiduYun.
To run the demo, the model must be decompressed into the directory ./model/pretrain/ (a quick way to verify this is sketched below).
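If the checkpoint is missing or only partially decompressed, restoring it fails with a NotFoundError (see the Issues section below). Here is a minimal sketch, not part of the repository, for verifying the checkpoint before running the demo; it assumes only that the files live under ./model/pretrain/:

import tensorflow as tf  # tensorflow-gpu 1.x, as pinned in requirement.txt

MODEL_DIR = './model/pretrain/'

# latest_checkpoint returns None unless the 'checkpoint' index file and the
# matching .data-*/.index shards were all decompressed into MODEL_DIR.
ckpt = tf.train.latest_checkpoint(MODEL_DIR)
if ckpt is None:
    print('No usable checkpoint under %s; decompress the download first.' % MODEL_DIR)
else:
    print('Found checkpoint prefix: %s' % ckpt)
    # Inspect the stored variables without building the graph.
    reader = tf.train.NewCheckpointReader(ckpt)
    for name in sorted(reader.get_variable_to_shape_map())[:10]:
        print(name)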

OM-CNN

2C-LSTM
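To give a concrete picture of what a two-layer convolutional LSTM computes, here is a rough TF 1.x sketch that stacks two tf.contrib.rnn.ConvLSTMCell cells over a sequence of feature maps. The sizes are illustrative stand-ins for the OM-CNN features, and this is not the repository's actual implementation (ConvLSTMCell entered tf.contrib.rnn in releases after the pinned 1.0.0):

import tensorflow as tf  # TF 1.x; tf.contrib.rnn.ConvLSTMCell needs >= 1.3

# Illustrative sizes standing in for OM-CNN's spatio-temporal features:
# a batch of 16-step sequences of 28x28 maps with 64 channels.
BATCH, STEPS, H, W, C = 4, 16, 28, 28, 64
features = tf.placeholder(tf.float32, [BATCH, STEPS, H, W, C])

# Two stacked 2-D convolutional LSTM cells with 3x3 kernels.
cell1 = tf.contrib.rnn.ConvLSTMCell(
    conv_ndims=2, input_shape=[H, W, C], output_channels=32,
    kernel_shape=[3, 3], name='conv_lstm_1')
cell2 = tf.contrib.rnn.ConvLSTMCell(
    conv_ndims=2, input_shape=[H, W, 32], output_channels=32,
    kernel_shape=[3, 3], name='conv_lstm_2')
stack = tf.contrib.rnn.MultiRNNCell([cell1, cell2])

# Unroll over time; outputs has shape [BATCH, STEPS, H, W, 32].
outputs, _ = tf.nn.dynamic_rnn(stack, features, dtype=tf.float32)

# A 1x1 convolution per frame squashes the features to one saliency channel,
# yielding an inter-frame saliency map for every step of the sequence.
flat = tf.reshape(outputs, [-1, H, W, 32])
saliency = tf.layers.conv2d(flat, filters=1, kernel_size=1,
                            activation=tf.nn.sigmoid)
saliency = tf.reshape(saliency, [BATCH, STEPS, H, W, 1])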

Database

As introduced in our paper, our model is trained on our newly established eye-tracking database, LEDOV, which is also available on Dropbox and BaiduYun.

Usage

This model is implemented in tensorflow-gpu 1.0.0; the details of our computational environment are listed in 'requirement.txt'. Simply run 'TestDemo.py' to see the saliency prediction results on a test video.

Visual Results

Some visual results of our model compared with the ground truth are shown below. [Figure: visual results]

Ablation

[Figure: ablation study results]

To do

Our DeepVS 2.0 will be released soon.

Contact

If you have any questions, please contact [email protected] (or [email protected]), or use the public Issues section of this repository.

License

This code is distributed under the MIT license.

Supplementary material

Link


Issues

The output doesn't match the input.

I found that the output saliency map has 16 fewer frames than the input. If my input is 192 frames, then the output is 176 frames. I looked through the paper and didn't find any helpful information, so I want to ask: is that expected?

cannot write the results

When I run TestDemo.py, I cannot obtain any output:

New video: animal_alpaca01 with 0 frames and size of (0, 0)
Total time for this video 0.000004
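A frame count of zero like this usually means OpenCV failed to decode the video file at all (a codec or path problem), so there is nothing for the demo to write. A minimal probe, not part of the repository, that reproduces the summary line and isolates the decoding step (assumes OpenCV 3+; the path is hypothetical):

import cv2  # opencv-python, 3.x API

def probe_video(path):
    # Print a per-video summary in the same format as the demo's log line.
    # A frame count of 0 here means OpenCV could not decode the file.
    cap = cv2.VideoCapture(path)
    frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    cap.release()
    print('New video: %s with %d frames and size of (%d, %d)'
          % (path, frames, width, height))

probe_video('./video/animal_alpaca01.avi')  # hypothetical input path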

NotFoundError

NotFoundError (see above for traceback): ./model/pretrain/LSTMconv_prefinal_loss05_dp075_075MC100-200000.data-00000-of-00001
[[Node: save/RestoreV2_5 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_5/tensor_names, save/RestoreV2_5/shape_and_slices)]]
[[Node: save/RestoreV2_112/_39 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_890_save/RestoreV2_112", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]]
The error is as above, with tensorflow 1.0.0 and python 2.7.
How can I solve this?

Directions to train OM-CNN correctly

Congrats on the excellent work, and thank you for the well-documented public release of the dataset!

Could you please provide the training code? It would help in understanding and reproducing some of the useful steps mentioned in your paper, such as CB dropout.

hi

I cannot run TestDemo.py because it requires TensorFlow 1.0. I installed TensorFlow 2.9, ran TestDemo.py, and the error is:
AttributeError: module 'tensorflow' has no attribute 'contrib'
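tf.contrib was removed in TensorFlow 2.x, so this error is expected under 2.9; the demo needs the 1.x environment pinned in requirement.txt. A small guard, not part of the repository, that one could put at the top of the script to fail early with a clearer message:

import tensorflow as tf

# tf.contrib only exists in TensorFlow 1.x; fail early with a clear message
# instead of an AttributeError deep inside the graph-building code.
if not tf.__version__.startswith('1.'):
    raise ImportError('TestDemo.py needs TensorFlow 1.x (got %s); '
                      'see requirement.txt.' % tf.__version__)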

Training stage

Dear author,

Thank you very much for sharing your work; I am very interested in it. I would greatly appreciate it if you could share your training demo, because I am not sure about the training stages of OM-CNN and 2C-LSTM in the code. Thank you so much.

Best regards.

How did you fine-tune SalGAN to LEDOV?

Thank you very much for publishing this work and sharing the models and dataset. I am one of the authors of SalGAN [28], and I was wondering how you fine-tuned SalGAN to this dataset. We developed it in Lasagne quite a long time ago now, and it is not obvious how to fine-tune it.

Could you please provide some details about how you performed this domain adaptation?

Congrats for the nice work :)
