Discussions from kaiyangzhou's pytorch-vsumm-reinforce

How can I get change points using KTS?

I tried to get change points using the KTS code, but I couldn't get proper change points.

If anyone has obtained change points using KTS, could you please help me?

Bernoulli distribution

Hi @KaiyangZhou,

1/ What is the value of m? (m = Bernoulli(probs))
2/ What is the value of the probability Pt when the action At equals 1?
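
To make the two questions concrete, here is a minimal sketch in plain Python (not the repository's code) of what sampling from a Bernoulli distribution over per-frame probabilities means. The names `sample_actions` and `log_prob` and the example `probs` are illustrative assumptions. In PyTorch, `torch.distributions.Bernoulli(probs)` offers `sample()` and `log_prob()` with the same meaning.

```python
import math
import random

def sample_actions(probs, rng):
    # a_t = 1 with probability p_t, otherwise 0 (one independent draw per frame)
    return [1 if rng.random() < p else 0 for p in probs]

def log_prob(probs, actions):
    # Bernoulli log-likelihood per frame: log p_t if a_t == 1, else log(1 - p_t)
    return [math.log(p) if a == 1 else math.log(1.0 - p)
            for p, a in zip(probs, actions)]

rng = random.Random(0)                 # fixed seed so the run is repeatable
probs = [0.9, 0.2, 0.5, 0.7]           # hypothetical per-frame probabilities
actions = sample_actions(probs, rng)   # list of 0/1 picks, one per frame
log_probs = log_prob(probs, actions)
```

Under this reading, m is a distribution object rather than a single number, and when a_t = 1 the probability of that action is simply p_t.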

Frame-wise importance scores while downsampling a video

I have to use the TVSum50 dataset for a video summarization task. The original video has a frame rate of 30 fps, and each frame is assigned an importance score from 1 to 5. I have to downsample the video to 3 fps, but I don't understand how that will affect the importance scores. Can anyone help me here, please?
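
One way to think about this, sketched under assumptions: going from 30 fps to 3 fps keeps one frame out of every 10, so the score list can either be subsampled the same way or averaged over each 10-frame window. The function name `downsample_scores`, the `mode` argument, and the toy scores are hypothetical, and the sketch assumes the source rate is an integer multiple of the target rate.

```python
def downsample_scores(scores, src_fps=30, dst_fps=3, mode="pick"):
    step = src_fps // dst_fps  # 10 when going from 30 fps to 3 fps
    if mode == "pick":
        # keep the score of every step-th frame (frames 0, 10, 20, ...)
        return scores[::step]
    # otherwise average each window of `step` consecutive frames
    windows = [scores[i:i + step] for i in range(0, len(scores), step)]
    return [sum(w) / len(w) for w in windows]

scores = [1] * 10 + [5] * 10 + [3] * 5          # 25 hypothetical frame scores
picked = downsample_scores(scores)               # one score per kept frame
averaged = downsample_scores(scores, mode="mean")
```

Picking keeps the scores aligned with the frames that are actually retained, while averaging smooths out annotation noise inside each window. Which is preferable depends on how the downsampled frames are later compared with the annotations.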

How to extend my summarized video length?

My output summary videos are a bit too fast, so they skip many important parts of the input video. I want to extend the summary length so that it includes the important information. Thanks in advance.

Stopping criteria

The paper describes a stopping criterion: "For all our models, we stop training after K consecutive epochs with descending summarization F-score on the validation set. We set K = 5". However, I cannot find any trace of this strategy in your code. Moreover, I don't see a validation set in your code. Could you explain?
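
For reference, the criterion quoted from the paper could be sketched as follows. This is an illustration of the rule as worded, not code from the repository; `should_stop` and the `history` values are hypothetical.

```python
def should_stop(val_fscores, k=5):
    # True when each of the last k epochs scored lower than the epoch before it
    if len(val_fscores) < k + 1:
        return False  # not enough history to see k consecutive drops
    recent = val_fscores[-(k + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))

# hypothetical validation F-scores, one per epoch
history = [0.40, 0.42, 0.41, 0.40, 0.39, 0.38, 0.37]
stop_now = should_stop(history)            # five consecutive drops at the end
stop_earlier = should_stop(history[:-1])   # only four consecutive drops so far
```

A training loop would append the validation F-score after each epoch and break once `should_stop` returns True.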

Find features, change points, num_frames and positions for custom test video

Hi @KaiyangZhou,

I wanted to know how I can find the following features to generate a summary for a custom video:

Features (for finding seq and probs)
Change points (cps)
Number of frames (num_frames)
Number of frames per seg (nfps)
Positions

Please let me know!

Extracting image features for videos

The project is unstable; has anyone downloaded and run it properly?

I use Python 3 and PyTorch 1.4, so I changed the print statements to the print function and xrange to range, then ran the project directly. The results are unstable: on the test data, the score varies a lot from video to video, and this happens across different train/test combinations. Is this normal? Did I miss something, or is it a version problem or something else?

5-fold cross validation

I think the 5-fold cross validation (5FCV) may have a problem: it is not standard K-fold cross validation.

Does gt_score (human annotation) in the dataset make it supervised?

In the paper, the approach is described as fully unsupervised, but I don't understand the use of gt_score (ground truth score) in your dataset. As far as I have learnt, gt_score (human annotation) is used for supervised approaches.

How to test my own custom video?

I have trained and tested with your dataset h5 files, and now I just want to test my own video file, such as 'my_video.mp4'.
How can I transform it into an h5 file and then use your command "python main.py -d datasets/my_own_video.h5 -s datasets/summe_splits.json -m summe --gpu 0 --save-dir log/summe-split0 --split-id 0 --evaluate --resume path_to_your_model.pth.tar --verbose --save-results"?

Supervised version of the model

In the paper, you also describe and analyze a supervised version of the framework. Will the code for the supervised version also be made available?

Thank you.

Supervised Learning Extension

Hello,

I want to create both supervised and unsupervised models on my own. However, I couldn't see any option or find the related parts of the code to switch between them. Could you please point me to the relevant places in the code if possible? Or do you plan to release that part?

Thanks.

How to extract key_frame from the dataset?

When I try to run the "Visualize summary" step, I run into a problem: I can't extract key_frame from the dataset, so I want to know how to do this step.
@KaiyangZhou @yrwangxd

Question about reward function.

When will the branch if num_picks == 0: in reward.py be executed? It seems num_picks can never equal 0 because of this line: num_picks = len(pick_idxs) if pick_idxs.ndimension() > 0 else 1.

Pre-trained model

Thanks for your excellent work~

Could you please provide the pre-trained model, which can be helpful for other new datasets?

Frame downsampling for the dataset

Why do we downsample the videos in the dataset to 2 fps, i.e. choose every 15th frame from each video?

Evaluation metric

pytorch-vsumm-reinforce/main.py

Line 164 in fdd03be

eval_metric = 'avg' if args.metric == 'tvsum' else 'max'

Why does TVSum use avg while SumMe uses max?

Thank you very much.
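
To show what the two settings of eval_metric plausibly mean, here is a sketch that scores a machine summary against several user summaries and then either averages the F-scores or takes the best one. This is an assumption about what the flag controls, not a copy of the repository's evaluation code; `f_score`, `evaluate`, and the toy summaries are hypothetical.

```python
def f_score(machine, user):
    # machine and user are 0/1 lists of equal length (1 = frame in summary)
    overlap = sum(m and u for m, u in zip(machine, user))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(machine)
    recall = overlap / sum(user)
    return 2 * precision * recall / (precision + recall)

def evaluate(machine, user_summaries, eval_metric="avg"):
    scores = [f_score(machine, u) for u in user_summaries]
    if eval_metric == "avg":
        return sum(scores) / len(scores)   # mean agreement with all users
    return max(scores)                     # agreement with the closest user

machine = [1, 1, 0, 0]
users = [[1, 1, 0, 0], [0, 0, 1, 1]]       # two hypothetical annotators
avg_result = evaluate(machine, users, "avg")
max_result = evaluate(machine, users, "max")
```

With these toy inputs the two metrics differ a lot, which is why the choice has to match the protocol used by the works one compares against on each dataset.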

How can I find out where frames are saved?

When summarizing videos from the given datasets, I cannot see any folder where frames are saved, and it throws this error:
error: C:\ci\opencv_1512688052760\work\modules\imgproc\src\resize.cpp:3289: error: (-215) ssize.width > 0 && ssize.height > 0 in function cv::resize

The network is not able to pick the frames because the folder does not exist, and it throws this error. Can you please help?

0/1 Knapsack problem

Hi @KaiyangZhou
What is the relation between the 0/1 Knapsack problem and the summary video?
I saw that the 0/1 Knapsack problem is used in the evaluation step of the implementation. Why? I would appreciate a detailed explanation of this.
Thank you in advance.
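
As background for this question, a common formulation treats each video segment as an item whose value is its importance score and whose weight is its length in frames, and picks the most valuable set of segments that fits a length budget. The sketch below is a textbook 0/1 knapsack by dynamic programming under that assumption. The segment scores, lengths, and the 15% budget are illustrative values, not taken from the repository.

```python
def knapsack(values, weights, capacity):
    n = len(values)
    # best[i][c] = best total value using the first i items within capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]                 # skip item i-1
            if weights[i - 1] <= c:
                cand = best[i - 1][c - weights[i - 1]] + values[i - 1]
                if cand > best[i][c]:
                    best[i][c] = cand                   # take item i-1
    # walk back through the table to recover which items were taken
    picked, c = [], capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            picked.append(i - 1)
            c -= weights[i - 1]
    return sorted(picked)

seg_scores = [0.9, 0.4, 0.8, 0.3]    # hypothetical per-segment importance
seg_lengths = [40, 30, 50, 20]       # hypothetical segment lengths in frames
budget = 600 * 15 // 100             # assumed budget: 15% of a 600-frame video
selected = knapsack(seg_scores, seg_lengths, budget)
```

The selected segment indices are then expanded back into a frame-level 0/1 summary, so the summary length limit is respected while the total importance is maximised.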

The input of the LSTM

According to the PyTorch documentation, the input of an LSTM is (seq_len, batch, dim),
but in main.py the input of the LSTM is (batch, seq_len, dim).
I guess there is a mistake.

Unexpected error

I'm getting an unexpected error while training the model. Can somebody please help resolve this?
Traceback (most recent call last):
  File "main.py", line 205, in <module>
    main()
  File "main.py", line 121, in main
    probs = model(seq) # output shape (1, seq_len, 1)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/models.py", line 19, in forward
    h, _ = self.rnn(x)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py", line 175, in forward
    hx = torch.autograd.Variable(input.data.new(self.num_layers *
  File "/usr/local/lib/python2.7/dist-packages/torch/tensor.py", line 407, in data
    raise RuntimeError('cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?')
RuntimeError: cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?

Resume parameter doesn't work

When trying to use 'resume' functionality, you get an error:

Variable 'start_epoch' referenced before assignment

This comes from line 97 in main.py, where start_epoch is defined only if the --resume option is not used.
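
A possible shape for the fix, sketched under assumptions: give start_epoch a default before the resume branch so it is always defined. The function `init_training`, the `load_checkpoint` callable, and the "epoch" key are hypothetical; the repository's checkpoint may not store an epoch at all, in which case the default of 0 is kept.

```python
def init_training(resume_path=None, load_checkpoint=None):
    start_epoch = 0  # always defined, whether or not we resume
    if resume_path is not None:
        checkpoint = load_checkpoint(resume_path)
        # fall back to 0 when the checkpoint stores no epoch (assumption)
        start_epoch = checkpoint.get("epoch", 0)
    return start_epoch

# stand-in loaders used only to exercise the sketch
fresh = init_training()
resumed = init_training("model.pth.tar", lambda path: {"epoch": 12})
no_epoch = init_training("model.pth.tar", lambda path: {})
```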

The dataset cannot be downloaded, can you provide it?

Running wget http://www.eecs.qmul.ac.uk/~kz303/vsumm-reinforce/datasets.tar.gz returns 404 Not Found. Can you provide the dataset?

the following arguments are required

Hello, I cannot find the dataset

Hello, I cannot open your link to find the dataset. Can you send the dataset to my email?
My email: [email protected]

Unsupervised learning using my own data

I want to use my own dataset, but it has no features or labels, only some video clips. How do I extract features before constructing the h5 file, and what should I do with user_summary, gts_score and gtsummary when constructing the h5 file?

Not able to create an H5 file for another video dataset

I tried to use "https://github.com/SinDongHwan/pytorch-vsumm-reinforce/blob/master/utils/generate_dataset.py" to extract features from another dataset using Python 2.7 (as recommended). However, cv2 attributes such as cv2.cv.CV_CAP_PROP_FPS do not work, so the frames are not extracted. When I tried Python 3.6 instead, I could extract the frames, but the weave library is not supported and I could not find a replacement for it in 3.6 (we need it to find change points). Please advise if you have created an H5 file for your own dataset.

How can I use the model to summarize a custom video?

Given a video, not from one of the training datasets, how can I apply the model to it?

How can I train the DR-DSNsup model?

In the paper, you mention how to train the DSNsup. But there is so little information about how to train the DR-DSNsup, can you specify it?

Thank you.

Original videos

Does anyone have a link for the original videos?

Why do all the epochs stay close to 0.49, without changing or improving at all?

I trained the model on my custom dataset for 1000 epochs, but the value stays close to 0.49 in every epoch and does not vary at all. For example, the first epoch gives 0.49 and the 1000th epoch gives 0.4998; there is no change or improvement. Thanks in advance.

CUDNN_STATUS_BAD_PARAM when trying to generate the result.h5 file

I used a large feature matrix of shape (11585, 1000) as input. I want to generate the result.h5 file by getting scores from the model. I also changed the default input dimension parameter to 1000 to match my feature dimension. The error occurs at h, _ = self.rnn(x) in the model file, where CUDNN_STATUS_BAD_PARAM is raised. How can I overcome that?

Cannot find Dataset

Can someone please help me find the dataset. The given link isn't working.

Why is the output probability always around 0.5?

Hello, I really appreciate your work. I have run your code, but I found that the probs from the last fc layer are always close to each other. For example, they are always around 0.5. Why don't they approach 0 or 1? I look forward to your reply!

Does this method extract 1 frame out of every 7 frames to represent them?

As the title says. Can I control the ratio?

Saving of frames

Can anyone please tell me how the frames are saved as .jpg images?

Problem: the video can't be generated

When I run the command:
python summary2video.py -p log/summe-split0/result.h5 -d video_frames/ -i 0 --fps 30 --save-dir log --save-name summary.mp4
this occurred:
OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
Traceback (most recent call last):
  File "summary2video.py", line 43, in <module>
    frm2video(args.frm_dir, summary, vid_writer)
  File "summary2video.py", line 27, in frm2video
    frm = cv2.resize(frm, (args.width, args.height))
cv2.error: OpenCV(3.4.3) /io/opencv/modules/imgproc/src/resize.cpp:4044: error: (-215:Assertion failed) !ssize.empty() in function 'resize'
I generated the video frames with ffmpeg, and the frame file names follow the format the code requires. Why does this error occur? Can you help me?

The model does not converge on other datasets.

The model does not converge on other datasets. Do you have any advice?
My settings are epochs=60, backbone=resnet50, lr=0.00001.

How was user_summary (binary vectors) generated?

Hi, as the title says, there is a key called user_summary in the dataset eccv16_dataset_tvsum_google_pool5.h5.

I am wondering how to convert the 20 annotations originally provided in TVSum into those 20 binary vectors.

Thank you.

Summary not generating!

I have trained and tested the model but am not able to visualize the result. In summary2video.py, an error pops up: TypeError: 'KeysViewHDF5' object is not subscriptable.

I have a folder named Videoframes containing several jpg images named 000001.jpg and so on.
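
This kind of TypeError typically appears under Python 3 when the result of .keys() is indexed directly, because key views are not subscriptable. The sketch below reproduces the pattern with a plain dict standing in for the open h5 file (an assumption made so the example runs without h5py), and shows the usual workaround of wrapping the view in list(). The helper `key_at` and the dict contents are hypothetical.

```python
# a plain dict standing in for an open h5py.File (assumption for illustration)
results = {"video_1": [0, 1, 1], "video_2": [1, 0, 0]}

def key_at(container, idx):
    # materialise the key view into a list before indexing into it
    return list(container.keys())[idx]

try:
    results.keys()[0]              # indexing a key view fails in Python 3
    view_is_subscriptable = True
except TypeError:
    view_is_subscriptable = False

first_key = key_at(results, 0)
```

If the same pattern is what triggers the error in summary2video.py, replacing the direct index with list(...keys())[idx] should get past it.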

How to create the user summary for a custom video dataset and save it in an h5py file?

Please explain how to create the user summary and save it in the h5py file. I am able to create an h5py file for my own video dataset but have no idea how to fill the user_summary key. Please help.

How to develop DR-DSNsup (the supervised version of the RL model)?

How can I build the supervised version of DR-DSN? Can anyone provide a code implementation of it? Thanks in advance.

how to calculate xcorr?

I tried a lot of methods to calculate xcorr, but I can't get the correct result. Can you tell me?

Results worse than the original implementation

It looks like this implementation yields worse results than the paper and the Theano implementation. What could be the reason, and are there any ideas on how to fix this?

Thank You!

Mean instead of sum when computing the `expected_reward` by episode

Hi,
According to most PyTorch implementations of the REINFORCE algorithm, the policy gradient loss should sum the log_probs over the trajectory (sum over t=1...T) instead of computing the mean. In the paper, this is correctly summed in equations 8/9/10. The only mean is over the N episodes. I believe this is a mistake in the code only.

pytorch-vsumm-reinforce/main.py

Line 131 in fdd03be

expected_reward = log_probs.mean() * (reward - baselines[key])

Should be

expected_reward = log_probs.sum() * (reward - baselines[key])

The assumption is that the authors wanted to average instead of summing because videos have a different length.

Please, tell me if I am wrong. Thanks!
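
To quantify the difference being discussed, the sketch below computes both variants for one episode. The mean is the sum divided by the sequence length T, so the two terms differ by exactly a factor of T. The values of `log_probs` and `advantage` are hypothetical, and `episode_terms` is not a function from the repository.

```python
def episode_terms(log_probs, advantage):
    # advantage stands for (reward - baseline) of one sampled episode
    sum_term = sum(log_probs) * advantage
    mean_term = (sum(log_probs) / len(log_probs)) * advantage
    return sum_term, mean_term

log_probs = [-0.1, -0.7, -0.3, -0.9]   # hypothetical per-frame log-probabilities
advantage = 0.25                        # hypothetical reward minus baseline
sum_term, mean_term = episode_terms(log_probs, advantage)
ratio = sum_term / mean_term            # equals the sequence length T
```

For a single video this only rescales the gradient, but across videos of different lengths the mean version weights every video equally while the sum version gives longer videos proportionally larger updates.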

Video numbering for the TVSum dataset, and FPS for the dataset

Hi,

1): The raw TVSum dataset contains videos with random-looking names (not numbers), while you use video_x naming in your processed dataset (the google_tvsum dataset). Where can I find this mapping?
2): What fps did you use when converting the videos into the frames that you feed to GoogLeNet for the processed dataset? (I am asking because I want to generate the summary from raw frames with summary2video.)
It uses an fps of around 30, so I don't understand that part.

Can you shed some light on this?

How to extract image features for videos in this paper?

Can the author give us the code or a link for feature extraction?

Handling one or zero frame selected bug

When Bernoulli sampling returns zero or one frame, training crashes.

First, if Bernoulli sampling selects exactly one frame, this error occurs: TypeError: len() of a 0-d tensor. This happens because line 17 in rewards.py, pick_idxs = _actions.squeeze().nonzero().squeeze(), returns a tensor of dimension 0, so the call to len on line 18, num_picks = len(pick_idxs), throws the error.

Example to reproduce the error:

import torch
from torch.distributions import Bernoulli

m = Bernoulli(torch.tensor([0.0, 0.0, 1.0, 0.0, 0.0]))
actions = m.sample()

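# exactly one frame is picked, so the final squeeze() yields a 0-d tensor
pick_idxs = actions.squeeze().nonzero().squeeze()
print(len(pick_idxs))  # raises TypeError: len() of a 0-d tensor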

Second, when zero or one frame is selected, the compute_reward function should return a 0-dimensional (scalar) tensor; otherwise line 132 in main.py produces a size mismatch error. So reward = torch.tensor([0.]) on lines 22 and 31 should be replaced with reward = torch.tensor(0.).

Can somebody confirm these bugs? (So I can maybe commit the fix?)

pytorch version: 0.4.0
python version: 2.7.12

