
Comments (38)

Harryjun avatar Harryjun commented on July 17, 2024 2

@Swati640 @SinDongHwan You can send the author an email to ask for suggestions; if you find a solution, please tell me, thanks!
My thoughts:
First, different networks or parameters will produce different change points.
Second, the author first averages the scores within every shot (change_points[x, y]) and then keeps the highest ones. I think we can build the summary by taking the highest-scoring frames between each pair of change points; for example, with shots [0, 23] and [23, 50], we can pick key frames inside [0, 23] and select about 0.15 * 23 of its frames. That way every shot is considered.
I think it is worth trying; a rough sketch of the idea follows.
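Not code from the repo, just a minimal sketch of that per-shot idea, assuming 'scores' is a per-frame importance array and 'change_points' holds [start, end) pairs as in the [0, 23], [23, 50] example above (all names here are illustrative):

    import numpy as np

    def per_shot_summary(scores, change_points, ratio=0.15):
        """Pick the top `ratio` fraction of frames inside every shot.

        scores        : (n_frames,) importance score per frame
        change_points : list of [start, end) frame-index pairs, one per shot
        returns a 0/1 keyframe mask of shape (n_frames,)
        """
        mask = np.zeros(len(scores), dtype=np.int32)
        for start, end in change_points:
            shot = scores[start:end]
            n_pick = max(1, int(ratio * len(shot)))   # keep at least one frame per shot
            top = np.argsort(shot)[-n_pick:]          # indices of the highest scores in this shot
            mask[start + top] = 1
        return mask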

SinDongHwan avatar SinDongHwan commented on July 17, 2024 2

@harvestlamb
Hi~!!

To get a dataset for your own video, you should build the following data.

  1. 'features' : CNN features of every 15th frame
  2. 'picks' : indices of the sampled frames (every 15th frame)
  3. 'n_frames' : the number of frames in the video
  4. 'fps' : frames per second
  5. 'change_points' : shot or scene change points.
    • to get change_points, you should use KTS.
    • Depending on which CNN you use, you will get different change_points
      (see utils/generate_dataset.py).
  6. 'n_frame_per_seg' : the number of frames in each change-point segment.

If you want to train with supervised learning, I think you also need ground truth ('0/1' labels for every 15th frame),
and you have to decide on a labeling policy, because there is no single "right" ground truth for a summary. A sketch of the h5 layout above follows.
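A minimal h5py sketch of that layout; the values are dummies, the key names follow the list above, and the inclusive [start, end] segment convention is my assumption (check it against the SumMe/TVSum h5 files):

    import h5py
    import numpy as np

    # Hypothetical pre-computed arrays for a 4500-frame video sampled every 15th frame.
    features = np.random.rand(300, 1024).astype('float32')   # one row per sampled frame
    picks = np.arange(0, 4500, 15)                            # indices of the sampled frames
    change_points = np.array([[0, 119], [120, 449], [450, 4499]])
    n_frame_per_seg = change_points[:, 1] - change_points[:, 0] + 1   # assumes inclusive segments

    with h5py.File('my_dataset.h5', 'w') as f:
        g = f.create_group('video_1')
        g['features'] = features
        g['picks'] = picks
        g['n_frames'] = 4500
        g['fps'] = 30
        g['change_points'] = change_points
        g['n_frame_per_seg'] = n_frame_per_seg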

SinDongHwan avatar SinDongHwan commented on July 17, 2024 1

@Swati640 , @Harryjun
I think you should ask the paper's author or the dataset's creator for help with getting change points similar to the ones in the dataset.

harvestlamb avatar harvestlamb commented on July 17, 2024 1

@SinDongHwan
@Harryjun
Hi~ Thanks for your code. When I run it, I encounter a problem:

File "video_forward2.py", line 236, in <module>
    from utils.generate_dataset import Generate_Dataset
ImportError: No module named generate_dataset

I don't know how to create a proper dataset from my own video with your code. Could you tell me
some details?
Thank you again~

SinDongHwan avatar SinDongHwan commented on July 17, 2024 1

@harvestlamb
Hello, I don't know either.
I think you should ask the maker of the dataset how to generate 'gt_summary' and 'gt_score'.
Good luck~!!^^

SinDongHwan avatar SinDongHwan commented on July 17, 2024
  • Features : you have to extract features.

    • length : n_frames/15 (the 'picks' values are [0, 15, 30, 45, ..., len(frames)] in the SumMe and TVSum datasets)
    • convert the video to frames and take every 15th frame.
  • Change points : you have to use KTS (linked here).

    • I am still trying it...
  • Number of frames : the number of frames in the video.

  • Number of frames per seg : first you have to get the change points, and then you can compute nfps.

  • Positions : the sampling positions, i.e. every 15th frame in the SumMe and TVSum datasets.

If you've solved it, I want to know how to use KTS. A sketch of the feature-extraction step follows.
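A sketch of the "features of every 15th frame" step using torchvision's ResNet-152 (which gives 2048-dim vectors, as discussed further down); the function and variable names are mine, not the repo's:

    import cv2
    import torch
    import torchvision.models as models
    import torchvision.transforms as T

    # ResNet-152 with the classification head removed -> 2048-dim pooled features.
    model = models.resnet152(pretrained=True)   # on newer torchvision: models.resnet152(weights="DEFAULT")
    model.fc = torch.nn.Identity()
    model.eval()

    preprocess = T.Compose([
        T.ToPILImage(),
        T.Resize((224, 224)),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    def extract_features(video_path, stride=15):
        """Return features of every `stride`-th frame plus the picked frame indices."""
        cap = cv2.VideoCapture(video_path)
        feats, picks, idx = [], [], 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if idx % stride == 0:
                rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                with torch.no_grad():
                    feats.append(model(preprocess(rgb).unsqueeze(0)).squeeze(0))
                picks.append(idx)
            idx += 1
        cap.release()
        return torch.stack(feats), picks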

hungbie avatar hungbie commented on July 17, 2024

Hi,

For change-point detection, what should I input to KTS? A flattened image of dimension H×W, or features from some feature-extraction method so that each image becomes an N-dimensional vector? What is used in this paper to preprocess the image/frame?

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@hungbie
Hi, you should input the features of each frame.
You can see how KTS is used in "utils/generate_dataset.py" at this link; a rough sketch follows.
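Not the repo's exact code, just a sketch of the "features in, change points out" flow. It assumes the commonly shared KTS implementation whose entry point is cpd_auto(K, max_ncp, vmax) returning (change_points, scores); if your copy differs, adapt accordingly. All other names are illustrative:

    import numpy as np
    from cpd_auto import cpd_auto   # commonly shared KTS code (assumption, see above)

    def get_change_points(features, n_frames, stride=15):
        """features: (n_samples, dim) CNN features of every `stride`-th frame
        (use stride=1 if you extracted features for every frame)."""
        K = np.dot(features, features.T)              # linear kernel (Gram) matrix
        max_ncp = max(1, features.shape[0] // 10)     # loose upper bound; tune for your videos
        cps, _ = cpd_auto(K, max_ncp, 1)              # change points in sampled-frame units
        cps = np.concatenate(([0], cps, [features.shape[0]]))
        # Convert sample indices back to original frame indices as inclusive [start, end] segments.
        segments = [[int(cps[i] * stride),
                     int(min(cps[i + 1] * stride - 1, n_frames - 1))]
                    for i in range(len(cps) - 1)]
        return np.array(segments)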

hungbie avatar hungbie commented on July 17, 2024

@SinDongHwan
Thank you! I will take a look!

hungbie avatar hungbie commented on July 17, 2024

@hungbie
Hi, you should input the features of each frame.
You can see how KTS is used in "utils/generate_dataset.py" at this link.

I understand your approach. I have tried it and arrived at the same thing using features from GoogleNet or ResNet. However, I think that in the original KTS paper the author used SIFT + Fisher vectors to generate the descriptors. Have you tried that method?

SinDongHwan avatar SinDongHwan commented on July 17, 2024

I understand your approach. I have tried it and arrived at the same thing using features from GoogleNet or ResNet. However, I think that in the original KTS paper the author used SIFT + Fisher vectors to generate the descriptors. Have you tried that method?

Yes, I tried to use SIFT + Fisher vectors, but I gave up.
I know the SIFT + Fisher vector method is just a way to generate features,
and I think the author used it because CNN features were not in common use at the time.

hungbie avatar hungbie commented on July 17, 2024

OK, I will try, but since SIFT is patented anyway, it's good to look into other methods. Thank you!

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@hungbie
Okay!!
Good Luck^^

Harryjun avatar Harryjun commented on July 17, 2024

@hungbie
Hi, you should input the features of each frame.
You can see how KTS is used in "utils/generate_dataset.py" at this link.

Hi @SinDongHwan, I used your code "generate_dataset.py" and found that the feature size is 2048. What should I do?

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@Harryjun

I just sent you an email, but I will write the answer here too,
for people with the same question.

Hi, Harryjun~!! The training dataset was generated using GoogleNet, but my code extracts features using ResNet, so the feature size is 2048.
I tested this and got the following results: 1) I extracted features using GoogleNet and computed change points from them. 2) The GoogleNet features I generated were worse than the public datasets (TVSum, SumMe, etc.), so I tried ResNet and got better results than with my GoogleNet features. ResNet is deeper than GoogleNet, so generating features and change points is a bit slower, but if you use batches you can speed it up.
If you want 1024-dim features from ResNet, I think an input size of (112, 112) might give 1024-dim features, but that is just my guess.^^

Good Luck!!

Harryjun avatar Harryjun commented on July 17, 2024

@SinDongHwan
Hi, I find that building the h5 file is very slow (screenshot omitted):
it processes only about one frame per second.
Did you have this problem, and can you give me a suggestion?

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@Harryjun
Hi, I could tell you exactly only if I saw your setup.
My guess is that you are running out of main memory.
When memory runs out, swapping data in and out is slow, so building the h5 file becomes slow too.

Can you check your memory usage while the code is running?
If that is the cause, you can try two methods (just my ideas ^^).

1st method:

"Split the code: extract all features for getting the change points, and extract every 15th feature separately for the training dataset."

First, you extract all the features and then get the change points.
Second, you extract every 15th frame's features for the training dataset.

2nd method:

"Extract all the features once, and then select every 15th one."

First, you extract all the features and then get the change points.
Second, you select every 15th feature from the full set for the training dataset (a tiny sketch of this follows).
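Just to make the 2nd method concrete, a tiny sketch, assuming the per-frame features are already in memory as a NumPy array (names are illustrative):

    import numpy as np

    def split_features(all_features, stride=15):
        """all_features: (n_frames, dim) features of every frame, extracted once.
        Returns every `stride`-th feature (for the training h5) and the picked indices."""
        picks = np.arange(0, all_features.shape[0], stride)
        return all_features[picks], picks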

Do you use Hangouts? I'd like to look at your setup over TeamViewer.

Harryjun avatar Harryjun commented on July 17, 2024

@SinDongHwan
I made some changes in this repo: we cannot save that many frames, so we only save the features (training only uses the features), and at the end we build the summary by reading the frames again with OpenCV.
The reason is that saving all the frames is too expensive.
https://github.com/Harryjun/pytorch-vsumm-reinforce

First, make the datasets:
python video_forward2.py --makedatasets --dataset data_our/data_h5/data1.h5 --video-dir data_video/data1/ --frm-dir data_our/frames
Second, compute scores and generate the summary:
python3 video_forward2.py --makescore --model log/summe-split0/model_epoch1000.pth.tar --gpu 0 --dataset data_our/data_h5/data2.h5 --save-dir logs/videolog/ --summary --frm-dir data_our/frames

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@Harryjun
You're right, we can't save that many frames.
When I made my dataset, I tried to save all the frames,
but it was too slow, so I removed the line of code that saves the frames.

Harryjun avatar Harryjun commented on July 17, 2024

@SinDongHwan Hi, I find that the change points obtained from KTS are not the same as in the dataset the author provides. What is the reason?

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@Harryjun
Yes, they are not the same.
I haven't solved that yet. TT
But the results were not bad when using ResNet.

Harryjun avatar Harryjun commented on July 17, 2024

@SinDongHwan Hi, I want to ask you a question. Recently we have been working on video key-frame extraction, so I ran a test with DSN and found that it neglects some frames, which is not very good. I would like to ask how to extract key frames from long videos. Can you give me a suggestion?
Thank you very much.

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@Harryjun
How long are the videos?
I think you can get good results if you have proper change points.
I've read many papers about video summarization,
but I'm not a video summarization researcher, just a computer engineer, so I can't suggest a great idea.

I think you can get good results if you read many papers and think hard about how to improve.
Good Luck~!! You can do it!

Swati640 avatar Swati640 commented on July 17, 2024

@Harryjun @SinDongHwan My change points differ a lot from those in the actual H5 file. For example, in the actual H5 file for Video 1, consecutive change points are about 100 frames apart, but KTS in "utils/generate_dataset.py" gives me different results, so my network just selects the starting frames when generating the video summary. Can you please help me with how to adjust the change points?

Swati640 avatar Swati640 commented on July 17, 2024

@SinDongHwan @Harryjun If you have used GoogleNet for feature extraction, please let me know how you extracted the 1024-dimensional features.

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@Swati640 @Harryjun
I've tried extracting features using GoogleNet.
I just used code found through a Google search
("googlenet feature extract").

Tell me your email and I will send it to you,
but when I tried this, I got bad results.

Swati640 avatar Swati640 commented on July 17, 2024

I tried as well; I only managed to match the dimensions, and otherwise got very bad results for the change points. I would like to compare my code with yours, so that would be really helpful. My email id is
[email protected]. Thanks in advance :)

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@Swati640
I sent you an email.
There is no GoogleNet in the latest version of torchvision (at the time of writing).
So you have to add and edit the code while referring to my email.
Good Luck^^
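For readers with a newer setup: recent torchvision releases do ship GoogLeNet, and its global average pool is 1024-dimensional, so a minimal sketch of 1024-dim feature extraction could look like this (assuming such a torchvision version; this is not the code from the email):

    import torch
    import torchvision.models as models

    # GoogLeNet with the 1000-way classifier dropped; the pooled features are 1024-dim.
    model = models.googlenet(weights="DEFAULT")   # older torchvision: models.googlenet(pretrained=True)
    model.fc = torch.nn.Identity()
    model.eval()

    with torch.no_grad():
        x = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed frame batch
        feat = model(x)                   # shape: (1, 1024)
    print(feat.shape)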

harvestlamb avatar harvestlamb commented on July 17, 2024

@SinDongHwan Thank you very much
I tried your code, made my own dataset, and tried to train on it. It creates result.h5 (which only has the reward, not the F-score). Then I ran into this problem:

===> Evaluation
Traceback (most recent call last):
  File "video_summarization.py", line 224, in <module>
    main()
  File "video_summarization.py", line 129, in main
    evaluate(model, dataset, test_keys, use_gpu)
  File "video_summarization.py", line 167, in evaluate
    user_summary = dataset[key]['user_summary'][...]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/oliver/anaconda3/envs/PY2/lib/python2.7/site-packages/h5py/_hl/group.py", line 177, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'user_summary' doesn't exist)"

Regarding the evaluation failure, I think it may be because I did not label the ground truth ('user_summary', 'gts_score', and 'gtsummary') in my dataset. Should I assign '0/1' labels for these keys on every 15th frame of my videos? Could you give me some guidance on labeling them? (I have never labeled a dataset like this.)
Thank you again!
Best wishes to you~

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@harvestlamb
Hi.
I missed 'user_summary'.
It is needed for evaluation, but not for testing.
'user_summary' is annotation data from n people.

I have never labeled data either.
You have to convert your video to frames and then assign '0/1' to all frames or to every 15th frame.
You can refer to the SumMe or TVSum datasets.

harvestlamb avatar harvestlamb commented on July 17, 2024

@SinDongHwan Thank you very much. I analyzed the SumMe data following your guidance:
video_1 : <HDF5 dataset "user_summary": shape (15, 4494), type "<f4">
video_10 : <HDF5 dataset "user_summary": shape (15, 9721), type "<f4">
video_11 : <HDF5 dataset "user_summary": shape (15, 1612), type "<f4">
video_12 : <HDF5 dataset "user_summary": shape (15, 950), type "<f4">
video_13 : <HDF5 dataset "user_summary": shape (15, 3187), type "<f4">
video_14 : <HDF5 dataset "user_summary": shape (15, 4608), type "<f4">
video_15 : <HDF5 dataset "user_summary": shape (17, 6096), type "<f4">
video_16 : <HDF5 dataset "user_summary": shape (15, 3065), type "<f4">
video_17 : <HDF5 dataset "user_summary": shape (15, 6683), type "<f4">
video_18 : <HDF5 dataset "user_summary": shape (17, 2221), type "<f4">
video_19 : <HDF5 dataset "user_summary": shape (17, 1751), type "<f4">
video_2 : <HDF5 dataset "user_summary": shape (18, 4729), type "<f4">
video_20 : <HDF5 dataset "user_summary": shape (17, 3863), type "<f4">
video_21 : <HDF5 dataset "user_summary": shape (15, 9672), type "<f4">
video_22 : <HDF5 dataset "user_summary": shape (15, 5178), type "<f4">
video_23 : <HDF5 dataset "user_summary": shape (15, 4382), type "<f4">
video_24 : <HDF5 dataset "user_summary": shape (15, 2574), type "<f4">
video_25 : <HDF5 dataset "user_summary": shape (16, 3120), type "<f4">
video_3 : <HDF5 dataset "user_summary": shape (15, 3341), type "<f4">
video_4 : <HDF5 dataset "user_summary": shape (15, 3064), type "<f4">
video_5 : <HDF5 dataset "user_summary": shape (15, 5131), type "<f4">
video_6 : <HDF5 dataset "user_summary": shape (16, 5075), type "<f4">
video_7 : <HDF5 dataset "user_summary": shape (15, 9046), type "<f4">
video_8 : <HDF5 dataset "user_summary": shape (17, 1286), type "<f4">
video_9 : <HDF5 dataset "user_summary": shape (15, 4971), type "<f4">

So user_summary has shape (x, y). Obviously 'y' represents n_frames; does 'x' represent the number of people needed for labeling? To make labeling easier, could I reduce that dimension, or reuse similar labels across the 15 annotator rows?

SinDongHwan avatar SinDongHwan commented on July 17, 2024

@harvestlamb
Hi,
Taking video_1 as an example,
15 is the number of people and 4494 is the number of frames.
I think the values along the 4494-frame dimension are {0, 1}.
It is not easy to label the ground truth for every frame.
I have an idea:
first, pick ground-truth key frames among every 15th frame of your dataset;
second, if the 30th frame is a ground-truth key frame, label the frames near the 30th frame as "1";
frames that are not picked get the label "0".

This is just my idea; I have not tried it (a small sketch follows).
I think you should decide on a policy for how to label.
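A small sketch of that labeling idea, assuming you already have per-annotator keyframe picks; the window size and example picks are arbitrary, and all names are illustrative:

    import numpy as np

    def expand_keyframe_labels(keyframe_picks, n_frames, half_window=7):
        """Turn picked keyframe indices into a per-frame 0/1 vector by
        labeling a small window around each pick as '1' (everything else stays '0')."""
        labels = np.zeros(n_frames, dtype=np.int32)
        for k in keyframe_picks:
            lo, hi = max(0, k - half_window), min(n_frames, k + half_window + 1)
            labels[lo:hi] = 1
        return labels

    # 'user_summary' stacks one such vector per annotator: shape (n_users, n_frames).
    user_summary = np.stack([expand_keyframe_labels([30, 300, 1200], 4494),
                             expand_keyframe_labels([45, 315, 1185], 4494)])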

harvestlamb avatar harvestlamb commented on July 17, 2024

@SinDongHwan Hello, thank you for your prompt reply. I think your ideas are right, and I have already implemented them (but I did not label gt_summary and gt_score; I think supervised learning needs these two labels to train the model, is that right?). Finally, thank you again~

anaghazachariah avatar anaghazachariah commented on July 17, 2024

Hello, I implemented the project. You can refer to my repo: https://github.com/anaghazachariah/video_summary_generaton

huuuuyl avatar huuuuyl commented on July 17, 2024

Please let me know how you did 1024-dimensional feature extraction with ResNet-152.

SinDongHwan avatar SinDongHwan commented on July 17, 2024

Hi, @huuuuyl

The feature size of ResNet-152 is 2048.
So I think you should change the input feature size of the video summarization model from 1024 to 2048. (Alternatively, keep the model at 1024 and project the features, as in the sketch below.)
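If you would rather not touch the model, one option is to project the 2048-dim ResNet features down to 1024. This projection layer is my own workaround, not something from the repo:

    import torch
    import torch.nn as nn

    # A learnable linear projection from 2048-dim ResNet features to the 1024-dim
    # input the summarization model expects (train it together with the model).
    project = nn.Linear(2048, 1024)

    resnet_feats = torch.randn(320, 2048)   # hypothetical (n_steps, 2048) feature sequence
    feats_1024 = project(resnet_feats)      # (n_steps, 1024)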

Good Luck.

huuuuyl avatar huuuuyl commented on July 17, 2024

@SinDongHwan thanks for your kind advice.

mohammedshady avatar mohammedshady commented on July 17, 2024

@Swati640 I sent you an email. There is no GoogleNet in the latest version of torchvision (at the time of writing). So you have to add and edit the code while referring to my email. Good Luck^^

Hey man, I hope you are still here 😅
I'm having the same issue with change points, which made my F-score drop drastically when I build my own dataset from labels; the gtscore is also not the same as in the original dataset.

here is my email : [email protected]
