soCzech / TransNetV2
TransNet V2: Shot Boundary Detection Neural Network
Home Page: https://arxiv.org/abs/2008.04838
License: MIT License
Hi Tomas,
Awesome work you've done here; I really appreciate it.
I followed your instructions and set up a conda environment with python=3.6, tensorflow=2.1, pytorch=1.7.1, cudatoolkit=10.1.
The following error occurs:
ValueError: Importing a SavedModel with tf.saved_model.load requires a 'tags=' argument if there is more than one MetaGraph. Got 'tags=None', but there are 0 MetaGraphs in the SavedModel with tag sets []. Pass a 'tags=' argument to load this SavedModel.
when I tried to run
python transnetv2.py some_video.mp4
as well as
python convert_weights.py
I would love to hear from you on how to resolve this error.
Thank you again!
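For context, a minimal sanity check that can be run before the loader is called, assuming the weights live in a transnetv2-weights/ directory (the path below is an assumption): a missing or near-empty saved_model.pb is a typical cause of the "0 MetaGraphs" ValueError.

import os
import tensorflow as tf

model_dir = "/path/to/transnetv2-weights/"  # assumed path, point it at your checkout

# A valid SavedModel directory contains a non-trivial saved_model.pb and a variables/ subfolder;
# an empty or truncated saved_model.pb typically yields the "0 MetaGraphs" ValueError.
pb_path = os.path.join(model_dir, "saved_model.pb")
print("saved_model.pb size:", os.path.getsize(pb_path), "bytes")
print("variables/ present:", os.path.isdir(os.path.join(model_dir, "variables")))

model = tf.saved_model.load(model_dir)  # the same call the inference script makes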
Is there a link where I can manually download the weights? The download always times out when my server pulls it. I look forward to hearing from you soon. Thank you very much.
Hi Tomas,
Thank you so much for putting this code online. You did an excellent job on a model that is still considered SOTA two years after it was published!
I'm testing the model on a certain dataset and wanted to ask a few questions:
I'd really appreciate your help! Happy holidays!
Hi,
I've trained on the ClipShots dataset and got the weights-30.h5 file.
How can I convert this .h5 weights file to the SavedModel format?
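Not the author, but a rough sketch of how such a conversion usually looks in TF2, assuming the Keras model class from training/models.py can be instantiated with its gin defaults; the dummy input shape/dtype and the export path are assumptions.

import tensorflow as tf
from models import TransNetV2  # the Keras model defined in training/models.py (assumed import path)

model = TransNetV2()
# Build the variables with a dummy forward pass; the [batch, 100 frames, 27, 48, 3] shape is an assumption.
model(tf.zeros([1, 100, 27, 48, 3], tf.float32))
model.load_weights("weights-30.h5")

# Export a SavedModel directory that tf.saved_model.load (used by the inference script) can read.
tf.saved_model.save(model, "exported-transnetv2-weights/")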
Hi Tomáš,
Your work helps me a lot in understanding your great paper! Thank you so much!
I downloaded the ClipShots training and training-transitions datasets and processed them according to https://github.com/soCzech/TransNetV2/blob/master/training/consolidate_datasets.py and https://github.com/soCzech/TransNetV2/blob/master/training/create_dataset.py.
I downloaded the ClipShots test dataset and processed it accordingly.
I also downloaded the IACC.3 dataset and processed it with the "train" type.
I added the ClipShots training, training-transitions and IACC.3 files to "options.trn_files" in https://github.com/soCzech/TransNetV2/blob/master/configs/transnetv2.gin, and added the ClipShots test files to "options.tst_files". I also changed "options.n_epochs" to 50 as indicated in the paper.
However, I can only obtain an F1 of 0.74. Could you please give more training details and instructions on how to reproduce the 77.9 on the test set?
What do the file names in "options.tst_files" mean, and how are these files generated?
I also used the pretrained weights in https://github.com/soCzech/TransNetV2/tree/master/inference/transnetv2-weights to test on the ClipShots test set by setting "options.restore" and "options.test_only" to True in https://github.com/soCzech/TransNetV2/blob/master/configs/transnetv2.gin. I only get an F1 of 0.2545 and cannot reproduce 77.9.
I appreciate your great help so much!
Wentao
Hi, I tried to use TransNetV2. I followed these steps:
from transnetv2 import TransNetV2
# location of learned weights is automatically inferred
# add argument model_dir="/path/to/transnetv2-weights/" to TransNetV2() if it fails
model = TransNetV2()
video_frames, single_frame_predictions, all_frame_predictions = \
model.predict_video("video.mp4")
But it shows the error from the title: the .pb file cannot be parsed. Am I missing something? Could you please help with that? Thank you.
It is a great piece of work. May I ask a question: how can I extract key frames together with the caption of each frame? In other words, how can the network give the timestamps of the key frames, so that I can then find the captions?
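Not an official answer, just a sketch of one way to get key-frame timestamps from the model output, assuming the TransNetV2 class and its predictions_to_scenes helper from inference/transnetv2.py; the middle-frame choice and the caption lookup are assumptions.

import cv2  # used only to read the video's frame rate
from transnetv2 import TransNetV2

model = TransNetV2()
_, single_frame_pred, _ = model.predict_video("video.mp4")
scenes = model.predictions_to_scenes(single_frame_pred)  # (start_frame, end_frame) pairs

fps = cv2.VideoCapture("video.mp4").get(cv2.CAP_PROP_FPS)
for start, end in scenes:
    key_frame = (start + end) // 2      # e.g. pick the middle frame of each shot as the key frame
    print(f"shot {start}-{end}: key frame {key_frame} at {key_frame / fps:.2f}s")
    # the timestamp can then be used to look up the matching caption/subtitle entry (not shown)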
Traceback (most recent call last):
File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 193, in <module>
main()
File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 173, in main
model.predict_video(file)
File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 83, in predict_video
video_stream, err = ffmpeg.input(video_fn).output(
File "/home/tom/projects/Studium/Studienarbeit/cutting/env/lib/python3.9/site-packages/ffmpeg/_run.py", line 325, in run
raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)
I get an ffmpeg error when I run the following command:
python transnetv2.py /mnt/e/Studium/Studienarbeit/Videos/2021/reg/17/c597512d-b37c-11eb-ba8a-ecb6fe06b3b0/highlightsVideo/video/video.mp4 [--visualize]
That's my ffmpeg:
ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
Is there any problem with my ffmpeg build?
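In case it helps with debugging: ffmpeg-python hides the real ffmpeg message inside the exception. A small standalone sketch (not the author's code) that reproduces the same kind of call and prints the underlying stderr; the output arguments only mirror what the inference script appears to do and are assumptions here.

import ffmpeg

try:
    out, _ = (
        ffmpeg
        .input("video.mp4")
        .output("pipe:", format="rawvideo", pix_fmt="rgb24", s="48x27")  # assumed output settings
        .run(capture_stdout=True, capture_stderr=True)
    )
except ffmpeg.Error as e:
    # The generic "ffmpeg error (see stderr output for detail)" wraps the actual cause, printed here.
    print(e.stderr.decode("utf8", errors="replace"))
    raise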
I am impressed with this tool.
I want to ask for the command line to split an mp4 into shots, and to run shot detection on an mp4, using the tool.
If you can help.
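Not an official answer, but: shot detection itself is the python transnetv2.py input.mp4 call shown elsewhere in this thread. For splitting, one hedged sketch is to feed the detected (start, end) frame pairs to the ffmpeg CLI; the scene values, file names and frame rate below are placeholders.

import subprocess

# Suppose shot detection (e.g. via inference/transnetv2.py) produced these (start_frame, end_frame) pairs:
scenes = [(0, 120), (121, 300), (301, 452)]  # illustrative values
fps = 25.0  # assumption; use the real frame rate of your video

for i, (start, end) in enumerate(scenes):
    # re-encode each detected shot into its own file (frame-accurate but slow)
    subprocess.run(
        ["ffmpeg", "-y", "-i", "input.mp4",
         "-ss", f"{start / fps:.3f}", "-to", f"{(end + 1) / fps:.3f}",
         f"shot_{i:03d}.mp4"],
        check=True,
    )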
Great work!!!
I used it on my own videos, but in the visualization there are both green and blue lines; what do the different colors mean?
Thank you for the repo! I would like to compare TransNetV2 with TransNet on some videos; however, I have an issue while loading the weights. As described in inference/README.md, I just run:
python transnetv2.py test.mp4
I get the following error, which starts at transnetv2.py line 17:
saved_model.ParseFromString(file_content)
google.protobuf.message.DecodeError: Error parsing message
I've noticed the size of transnetv2-weights is only 8.4k; is that correct?
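Not an authoritative answer, but a directory of only a few kilobytes cannot hold the full SavedModel, which would explain the protobuf DecodeError. A quick hedged check (the relative path is an assumption):

import os

weights_dir = "transnetv2-weights/"  # assumed to sit next to transnetv2.py in inference/

total = 0
for root, _, files in os.walk(weights_dir):
    for name in files:
        path = os.path.join(root, name)
        size = os.path.getsize(path)
        total += size
        print(f"{path}: {size} bytes")
print("total:", total, "bytes")
# A total of only a few kB means the weights were not fully downloaded
# (e.g. an incomplete clone); re-fetch the directory and try again.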
How can I get the dataset?
Hello, in evaluate.py there is files = glob.glob(os.path.join(args.directory, "*.npy")). Do we need to transform the mp4 files into npy? How is the npy data generated? Looking forward to your reply, thanks!
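A guess rather than a confirmed answer: the *.npy files look like per-video prediction arrays, which could be produced from the inference model along these lines (the file naming and the choice of the single-frame head are assumptions):

import glob
import os

import numpy as np
from transnetv2 import TransNetV2

model = TransNetV2()
for video_fn in glob.glob("videos/*.mp4"):
    _, single_frame_pred, _ = model.predict_video(video_fn)
    # one prediction array per video, e.g. videos/foo.mp4 -> videos/foo.npy
    np.save(os.path.splitext(video_fn)[0] + ".npy", single_frame_pred)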
File "/home/y202202005/workspace/TransNetV2/inference/transnetv2.py", line 165, in main
model = TransNetV2(args.weights)
File "/home/y202202005/workspace/TransNetV2/inference/transnetv2.py", line 18, in init
self._model = tf.saved_model.load(model_dir)
File "/home/y202202005/.conda/envs/py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 603, in load
return load_internal(export_dir, tags, options)
File "/home/y202202005/.conda/envs/py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 649, in load_internal
root = load_v1_in_v2.load(export_dir, tags)
File "/home/y202202005/.conda/envs/py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 263, in load
return loader.load(tags=tags)
File "/home/y202202005/.conda/envs/py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 188, in load
meta_graph_def = self.get_meta_graph_def_from_tags(tags)
File "/home/y202202005/.conda/envs/py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 76, in get_meta_graph_def_from_tags
raise ValueError(
ValueError: Importing a SavedModel with tf.saved_model.load requires a 'tags=' argument if there is more than one MetaGraph. Got 'tags=None', but there are 0 MetaGraphs in the SavedModel with tag sets []. Pass a 'tags=' argument to load this SavedModel.
Hello, I'm trying to run the model but I get the following error:
2020-06-22 00:20:45.880699: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
[TransNetV2] Using weights from transnetv2-weights/.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 98, in parse_saved_model
saved_model.ParseFromString(file_content)
google.protobuf.message.DecodeError: Error parsing message
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "transnetv2.py", line 188, in <module>
main()
File "transnetv2.py", line 160, in main
model = TransNetV2(args.weights)
File "transnetv2.py", line 17, in __init__
self._model = tf.saved_model.load(model_dir)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/load.py", line 578, in load
return load_internal(export_dir, tags)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/load.py", line 588, in load_internal
loader_impl.parse_saved_model_with_debug_info(export_dir))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 56, in parse_saved_model_with_debug_info
saved_model = _parse_saved_model(export_dir)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 101, in parse_saved_model
raise IOError("Cannot parse file %s: %s." % (path_to_pb, str(e)))
OSError: Cannot parse file b'transnetv2-weights/saved_model.pb': Error parsing message.
Any idea what could be wrong? Thanks!
Hi, can you provide converted PyTorch weights? TensorFlow 2.1 is not available now.
The open-source model is saved in the pb (SavedModel) format, while the training model is saved in the h5 (HDF5) format. How is the h5 format converted to the pb format? Is it saved directly in pb format during training, or converted afterwards?
Hi,
First of all, thank you very much for this work.
I have a question that may be a bit silly, but I couldn't find an answer. I would like to run the code corresponding to create_train_dataset here; however, I can't understand how to fill in the mapping_fn parameter: could you tell me what /path/to/scenes/gt is, please? :) I really have no idea which file to put there...
Hi there,
how can you get the shot boundaries using the PyTorch version?
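Not the author; a rough sketch under several assumptions: that the PyTorch port exposes a TransNetV2 module in transnetv2_pytorch, that it accepts a [1, 100, 27, 48, 3] uint8 window and returns per-frame transition logits as its first output, and that weights are available as a state dict. Every name, shape and file below should be checked against the repo before use.

import numpy as np
import torch
from transnetv2_pytorch import TransNetV2  # assumed module/class name of the PyTorch port

model = TransNetV2()
model.load_state_dict(torch.load("transnetv2-pytorch-weights.pth", map_location="cpu"))  # assumed file
model.eval()

# frames: uint8 array [N, 27, 48, 3]; extract them with ffmpeg/OpenCV as in the TF inference script
frames = np.load("frames.npy")  # hypothetical pre-extracted frames

predictions = []
with torch.no_grad():
    for start in range(0, len(frames), 100):            # naive non-overlapping 100-frame windows
        window = frames[start:start + 100]
        if len(window) < 100:                            # pad the last window to 100 frames
            window = np.concatenate([window, np.repeat(window[-1:], 100 - len(window), axis=0)])
        batch = torch.from_numpy(window).unsqueeze(0)    # [1, 100, 27, 48, 3]
        logits = model(batch)[0]                         # assumed: first output holds per-frame logits
        predictions.append(torch.sigmoid(logits).flatten()[:100])

predictions = torch.cat(predictions)[:len(frames)].numpy()
boundary_frames = np.where(predictions > 0.5)[0]         # frames predicted as shot boundaries
print("boundary frame indices:", boundary_frames)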
Hi, I tried to run the code as follows.
from transnetv2 import TransNetV2
model = TransNetV2()
The first time, it showed a parse error. I re-downloaded the .pb model, and then it showed OpError: not an sstable (bad magic number). I'm not sure what happened.
I also tried TransNet (https://github.com/soCzech/TransNet). It works pretty well, and I like the visualization. Could you please save TransNet and its weights as a .pb model? I want to use it with OpenCV and C++. Do you have any suggestions on how I can use it with OpenCV dnn? Suppose I have a video, what should I do to prepare the input for the model? Thank you very much.
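On preparing the input (leaving the OpenCV dnn / C++ loading aside): a small Python sketch of how frames are laid out for this family of models; the 48x27 RGB uint8 frames grouped into 100-frame windows match the inference script, while the naive single-window batching here is a simplification.

import cv2
import numpy as np

def read_frames(video_path):
    """Read a video and resize every frame to the model's 48x27 RGB uint8 input."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.resize(frame, (48, 27))                     # (width, height) = (48, 27)
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))   # OpenCV reads BGR, the model expects RGB
    cap.release()
    return np.stack(frames).astype(np.uint8)                    # [N, 27, 48, 3]

frames = read_frames("video.mp4")
window = frames[:100][np.newaxis]                                # one 100-frame window: [1, 100, 27, 48, 3]
print(window.shape)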
Could you please give more detail about the 'only_gradual' set, and should we use it for training?
Is there a PyTorch implementation of this network structure? Is it based on PyTorch?
Hi,
Thank you so much for putting this model out. Excellent work!
I'm trying to train the model and have stumbled upon a problem that points to the models.py file.
When I try to run training.py with the gin file, this is the error message I get:
Traceback (most recent call last):
File "C:\videoseg\TransNetV2\training\training.py", line 10, in
import models
File "C:\videoseg\TransNetV2\training\models.py", line 168, in
@gin.configurable(blacklist=["name"])
TypeError: configurable() got an unexpected keyword argument 'blacklist'
Do you have a sense of what could be the problem?
Thanks!
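Not an official fix, but this TypeError usually means the installed gin-config is newer than the code expects: recent gin releases renamed the blacklist/whitelist arguments to denylist/allowlist. Two hedged options: pin an older gin-config version, or update the decorator in models.py along these lines (the placeholder function below is only there to show the decorator usage).

import gin

# old (fails on recent gin-config):  @gin.configurable(blacklist=["name"])
@gin.configurable(denylist=["name"])   # "name" stays a plain argument, everything else is gin-configurable
def build_model(name="model", filters=16):
    ...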
First, this work is great. I want to ask: how can I extract key frames from a video together with the captions of those frames?
Hi @soCzech
Thank you for creating this repo. Are you thinking of publishing this library to PyPI?
Wenbing
After downloading all the files and the trained model/weights manually, I ran transnetv2.py in the inference folder, but it raised an exception:
OSError: Cannot parse file b'/path/to/saved_model.pb': Error parsing message.
The version of TensorFlow is v2.1.0.
I also ran the docker command to build an image following the README.md in inference, but after building the Dockerfile successfully and running the command from the README.md to test a video, it still raised the same exception.
How can I solve this problem? Thanks.
Hi,
I have successfully trained models on the ClipShots, RAI and BBC datasets. However, when I train the model on the IACC.3 dataset, I keep getting the error below:
OP_REQUIRES failed at strided_slice_op.cc:108 : Invalid argument: slice index -1 of dimension 0 out of bounds
Finally, the program ended with the error below:
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __inference_parse_train_sample_529}} {{function_node __inference_parse_train_sample_529}} slice index -1 of dimension 0 out of bounds.
[[{{node StatefulPartitionedCall/cond_3/then/_28/cond_3/PartitionedCall/strided_slice_9}}]]
[[StatefulPartitionedCall]] [Op:IteratorGetNext]
As instructed in the README (the same as for the other datasets), after downloading the dataset and running 'consolidate_datasets.py', I created tfrecords with the 'train' version of 'create_dataset.py'. After that, the error arises during training.
Can you help? What did I do wrong?
Hi,
Thanks for your amazing work!
I'm trying to use your code for a small project I'm working on, trying to detect scene endings in sports.
I have a decent-sized dataset, yet I can't get past an 80% F1 score.
I was wondering, what model hyperparameters would you suggest I try playing with?
Thank you so much!
Hi!
First of all, thanks for the great work!
I have compared both implementations (on CPU) and Torch is much slower.
Unfortunately I have to use Torch, so do you have any idea how to make it faster?
Thanks
Thanks for this repo @soCzech, I've been using this for some personal projects and it's incredibly performant for all kinds of videos.
I was wondering if you at any point tried using images of different sizes? I've been wanting to try this for some edge cases that fail with the current settings, however it's not quite as trivial as changing the image size variable - so I was wondering if you might have any advice on how one would do this?
Where can I download the IACC.3 dataset?
Thanks for your great work,
Could you please explain how to use the training code with my own data?
I want to detect gradual changes only; should I label one frame or multiple frames?
Hi, thank you for the comprehensive repo.
Maybe I have missed something, but I have a small question about the released inference model. What are its training details? Is it trained on ClipShots, BBC, or RAI?
video_frames, single_frame_predictions, all_frame_predictions = model.predict_video("test.mp4")
I see that the function predictions_to_scenes() uses the single-frame predictions. The single-frame predictions have the same shape as all_frame_predictions. We have already predicted whether each frame is a shot boundary, so why do we also want the all-frames predictions? Are they the same?
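For context, a minimal sketch (not the repo's exact implementation) of how the single-frame predictions alone are turned into scenes by thresholding; the all-frame ("many hot") output is the auxiliary head from the paper that marks every frame inside a transition, so the two arrays share a shape but generally hold different values.

import numpy as np

def predictions_to_scenes_sketch(predictions, threshold=0.5):
    """Turn per-frame transition probabilities into (start, end) scene pairs."""
    is_transition = (predictions > threshold).astype(np.uint8)
    scenes, start = [], 0
    for i, t in enumerate(is_transition):
        if t:                                          # a predicted boundary frame closes the current scene
            scenes.append((start, i))
            start = i + 1
    scenes.append((start, len(predictions) - 1))       # last scene runs to the final frame
    return np.array(scenes)

print(predictions_to_scenes_sketch(np.array([0.01, 0.02, 0.9, 0.03, 0.01])))
# -> [[0 2] [3 4]]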
Thank you for your contribution!
I tried to run inference with PyTorch.
TensorFlow 2.1 is not available now.
So I installed TensorFlow 2.7 and ran the programs.
Then I got the following error:
tensorflow.python.framework.errors_impl.OpError: not an sstable (bad magic number)
Hi, I downloaded RAI from https://drive.google.com/file/d/1YColUfc3ZuCbiAAHHMYRQVBF2yBikj4N/view, but only found videos and scene annotations. I want to know where you found the shot boundary annotations. Thanks a lot!
Hi,
In the paper, you mention that synthetic data is used and boosts performance, but I didn't find the code to render transitions.
Could you please provide the code to render transitions?
Did you set a specific fps for extracting frames from each video? I found that you use the original fps of each video in the code.
How does the difference in fps between videos affect the results?
Hello, I'm running consolidate_datasets.py and got these errors; could you please tell me what causes them? Thank you so much.
File "/Users/z/Desktop/transnetv2/consolidate_datasets.py", line 215, in
clipshots_dataset(CLIPSHOTS_TRN_txt_files, CLIPSHOTS_TRN_mp4_files, CLIPSHOTS_TRN_target_dir)
File "/Users/z/Desktop/transnetv2/consolidate_datasets.py", line 208, in clipshots_dataset
visualize_scenes(video, scenes).save(save_to + ".png")
File "/Users/z/Desktop/transnetv2/visualization_utils.py", line 55, in visualize_scenes
draw_end_frame(end)
File "/Users/z/Desktop/transnetv2/visualization_utils.py", line 33, in draw_end_frame
draw.rectangle([(w * iw + iw - 1, h * ih), (w * iw + iw - 3, h * ih + ih - 1)], fill=(255, 0, 0))
File "/Users/z/opt/anaconda3/lib/python3.9/site-packages/PIL/ImageDraw.py", line 292, in rectangle
self.draw.draw_rectangle(xy, fill, 1)
ValueError: x1 must be greater than or equal to x0
Hey
In create_dataset.py, there is a function named scenes2zero_one_representation. It returns two values which, based on your paper, relate to the network's heads. In the implementation, they are called one_hot and many_hot.
I ran the function 100,000 times with different, randomly generated scenes sequences, and in all cases the returned values of the two items were the same! I'm wondering whether there is a point in having different names for these values, or maybe there is a subtle difference I wasn't able to spot.
BTW, here is the code I tested the function with:
import numpy as np
from create_dataset import scenes2zero_one_representation
# create some random sequences
sequences, max_len = [], []
for num_seq in range(4):
    cursor = 0
    sequences.append([])
    for i in range(np.random.randint(10, 15)):
        run_len = np.random.randint(1, 100)
        sequences[-1].append([cursor, cursor + run_len])
        cursor += run_len + 1
    max_len.append(cursor)
# get result of the function
results = [scenes2zero_one_representation(s, m) for s, m in zip(sequences, max_len)]
# check if one_hot and many_hot vectors are different
if all([all(result[0] == result[1]) for result in results]):
    print('All values are the same!')
else:
    print('There is some difference.')
Regards
Hi,
I'm attempting to run a test with "python transnetv2.py test.mp4" from the inference folder and getting this error.
[TransNetV2] Extracting frames from test.mp4
Traceback (most recent call last):
File "transnetv2.py", line 192, in <module>
main()
File "transnetv2.py", line 172, in main
model.predict_video(file)
File "transnetv2.py", line 86, in predict_video
video = np.frombuffer(video_stream, np.uint8).reshape([-1, 27, 48, 3])
TypeError: a bytes-like object is required, not 'tuple'
Not sure where to go from here, let me know if anyone has any ideas.
Cheers
I've tested inference using the pb file you provided. It worked well. Thanks for the repo :)
Then I tried to run the evaluation code, but I couldn't because the model format is different.
Can I get the pretrained weights (.h5) file for my research?