youtube-bb's People

Contributors

ajherman, mbuckler, sampsyo

youtube-bb's Issues

classification dataset size

Hello,
What is the size of the classification dataset, before and after decoding? I do not have much disk space, so this information would help me split the dataset into subsets.
Thanks for the work!

Says videos downloaded, but didn't actually download

Hi @mbuckler,
The download.py script runs without errors for me. It downloads the CSV, creates the directories for the videos, and prints Downloaded video: 193733 / 193733 on the command line. But no videos show up in the directory that the script created. Can you tell me what I am missing?
Thank you.

Downloading videos of a fixed resolution

When downloading, is it possible to fetch the videos at a predefined high resolution (say, 1080p)?

What do I need to do in order to achieve this?
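
One possible approach, assuming the download script shells out to the youtube-dl command line, is to pass a format selector with `-f`. The sketch below is not the repository's code: `build_download_cmd` and its parameters are hypothetical names, and only the `-f` and `-o` options and the `height` filter belong to youtube-dl itself.

```python
def build_download_cmd(yt_id, out_path, max_height=1080):
    """Build a youtube-dl command capped at a given video height.

    Hypothetical helper: the real download script may assemble its
    command differently.
    """
    # Prefer separate video and audio streams up to max_height; otherwise
    # fall back to the best single-file format within the same limit.
    fmt = "bestvideo[height<={h}]+bestaudio/best[height<={h}]".format(h=max_height)
    return [
        "youtube-dl",
        "-f", fmt,
        "-o", out_path,
        "https://www.youtube.com/watch?v=" + yt_id,
    ]
```

A video can only be fetched at 1080p if it was uploaded at that resolution, so the `<=` filter selects the best quality available up to the cap rather than forcing an exact size.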

Multithreading error

System: Ubuntu 14.04.5 LTS
Python: Python 2.7.6
Pip packages:
ffmpy==0.2.2
futures==3.0.5
imageio==2.1.1
moviepy==0.2.2.13
multiprocess==0.70.5
youtube-dl==2017.3.2

I get the following error when I run your script. Any help would be appreciated.

yt_bb_classification_train: Downloading annotations...
--2017-03-04 00:16:19--  https://research.google.com/youtube-bb/yt_bb_classification_train.csv.gz
Resolving research.google.com (research.google.com)... 172.217.5.110, 2607:f8b0:4005:808::200e
Connecting to research.google.com (research.google.com)|172.217.5.110|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/gzip]
Saving to: 'yt_bb_classification_train.csv.gz'

    [        <=>                                                                                                                         ] 28,582,014  16.7MB/s   in 1.6s   

2017-03-04 00:16:21 (16.7 MB/s) - 'yt_bb_classification_train.csv.gz' saved [28582014]

yt_bb_classification_train: Unzipping annotations...
yt_bb_classification_train: Parsing annotations into clip data...
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/process.py", line 208, in _queue_management_worker
    result_item = result_queue.get(block=True)
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 117, in get
    res = self._recv()
TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())
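
The TypeError above appears to come from the parent process failing to unpickle a `subprocess.CalledProcessError` raised in a worker, which hides the command that actually failed. A minimal sketch of one workaround, assuming the workers run external commands: catch the exception inside the worker and return plain data instead. `run_safely` is a hypothetical name, not a function from the repository.

```python
import subprocess
import sys


def run_safely(cmd):
    """Run cmd and report the outcome as plain, picklable data.

    Returns (succeeded, cmd, code) instead of raising, so nothing
    exception-shaped has to cross the process boundary.
    """
    try:
        subprocess.check_call(cmd)
        return (True, cmd, 0)
    except subprocess.CalledProcessError as err:
        # The command ran but exited with a non-zero status.
        return (False, cmd, err.returncode)
    except OSError as err:
        # The command could not be started (e.g. executable not found).
        return (False, cmd, err.errno)


if __name__ == "__main__":
    # A command that exits with status 3 stands in for a failed download.
    print(run_safely([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

Submitting `run_safely` to the executor in place of the raw call would let the parent log which command failed and with which exit status.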

Fix for the "video can't be downloaded" issue

For those who, like me, find that videos can't be downloaded (similar to #25 ):
The original form of the download URL no longer seems to work.
To resolve this issue, try changing this line of code:

'youtu.be/'+vid.yt_id ], \
to

'https://www.youtube.com/watch?v=' + vid.yt_id

then wait for a long, long time 🙂.

Downloader hangs

I've been trying to run the downloader and it hangs every time. Any ideas on what's going on?

`yt_bb_classification_train: Downloading annotations...
--2017-03-14 15:10:00-- https://research.google.com/youtube-bb/yt_bb_classification_train.csv.gz
Resolving research.google.com (research.google.com)... 216.58.194.174, 2607:f8b0:4005:804::200e
Connecting to research.google.com (research.google.com)|216.58.194.174|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/gzip]
Saving to: ‘yt_bb_classification_train.csv.gz’

[          <=>                                                                                                                       ] 28,582,014  12.9MB/s   in 2.1s   

2017-03-14 15:10:02 (12.9 MB/s) - ‘yt_bb_classification_train.csv.gz’ saved [28582014]

yt_bb_classification_train: Unzipping annotations...
yt_bb_classification_train: Parsing annotations into clip data...
Downloaded video: 253567 / 253569
`
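
The log does not show why the last two videos never finish. If the cause is a child process that stalls, one option is to give each external command a time limit. The sketch below assumes Python 3.5 or later; `run_with_timeout` is a hypothetical helper, not part of the downloader.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Return the exit code of cmd, or None if it exceeded the time limit."""
    try:
        # subprocess.run kills the child before raising TimeoutExpired.
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # A child that sleeps longer than the limit stands in for a stalled download.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 1))
```

A `None` result can then be logged and the clip skipped or retried, so the run as a whole completes.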

blocky effects in jpg images extracted by voc_convert

Hi Mark,
Thank you for the scripts. I have noticed that the JPG images extracted by voc_convert show visible blocks, as you can see in the attached images. The same thing happens even if I change the image extension to 'png'.
I think this creates unwanted edges in the images. Do you have an idea why this happens and how to solve it?
Many thanks,
Yiming
0c-Cwr5rI_A+20+0+13000
0v7h-88VbR8+0+0+188000

Getting this error while trying to download the videos...please help!

Hi, I am really new to this and I am getting the following error. Any help?

MBP:youtube-bb-master_jul19 Mac$ python3 download.py [videos] [30]

Traceback (most recent call last):
File "download.py", line 43, in <module>
parse_and_sched(sys.argv[1],int(sys.argv[2]))
ValueError: invalid literal for int() with base 10: '[30]'
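
The traceback shows `int()` receiving the literal string '[30]', so the square brackets were passed through as part of the argument. Brackets in a usage line are usually placeholders, which suggests the command should be `python3 download.py videos 30`. As a sketch of how a script could reject such input with a clearer message, here is an `argparse` version; the argument names `dl_dir` and `num_threads` are assumptions, not necessarily the script's own.

```python
import argparse


def parse_args(argv):
    """Parse the download directory and thread count from argv."""
    parser = argparse.ArgumentParser(
        description="Download videos (illustrative argument parsing only)."
    )
    parser.add_argument("dl_dir", help="directory to download videos into")
    # type=int makes argparse print a usage error for values like '[30]'.
    parser.add_argument("num_threads", type=int, help="number of worker threads")
    return parser.parse_args(argv)
```

Calling `parse_args(["videos", "30"])` yields `dl_dir="videos"` and `num_threads=30`, while `parse_args(["[videos]", "[30]"])` exits with a usage message instead of a traceback.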

upload extracted images

Hi,

Thanks for sharing. I am planning to work on the YouTube-BB dataset too. The problem is that there are tons of videos, and I only need the frames with bounding box annotations. Others have expressed similar needs on Stack Overflow.

So could you please upload only the images with annotations? I am sure it would help the community by making this dataset more accessible to researchers with limited storage or internet bandwidth.

connecting failed

Connecting to research.google.com (research.google.com)|2404:6800:4008:800::200e|:443... failed: Unknown error.
Connecting to research.google.com (research.google.com)|172.217.24.14|:443... failed: Unknown error.

Error while downloading videos using download.py

While downloading the YouTube-BB videos, I get the following error about a directory not being present. I am using Python 3.7 on Windows.

Complete Log:
C:\Users\spaul\Downloads\youtube-bb-master\youtube-bb-master>python download.py videos 1
Traceback (most recent call last):
File "download.py", line 43, in <module>
parse_and_sched(sys.argv[1],int(sys.argv[2]))
File "download.py", line 30, in parse_and_sched
check_call(['mkdir', '-p', dl_dir])
File "C:\python37\lib\subprocess.py", line 342, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\python37\lib\subprocess.py", line 323, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\python37\lib\subprocess.py", line 775, in __init__
restore_signals, start_new_session)
File "C:\python37\lib\subprocess.py", line 1178, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

I also tried providing the complete path:
C:\Users\spaul\Downloads\youtube-bb-master\youtube-bb-master>python download.py C:\Users\spaul\Downloads\youtube-bb-master\youtube-bb-master\videos\ 1

But I still get the same error. Can you please tell me how to solve this?
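
The traceback points at `check_call(['mkdir', '-p', dl_dir])`. That call needs an external `mkdir` executable, and Windows has none to launch this way, which would explain the FileNotFoundError regardless of the path given. A minimal sketch of a portable replacement using only the standard library follows; `ensure_dir` is a hypothetical helper name.

```python
import os


def ensure_dir(path):
    """Create path and any missing parents without an external mkdir."""
    # The isdir check keeps this safe to call on an existing directory
    # and works on both Python 2 and Python 3.
    if not os.path.isdir(path):
        os.makedirs(path)
    return path
```

Replacing the `check_call(['mkdir', '-p', dl_dir])` line with `ensure_dir(dl_dir)` would remove this particular dependency on Unix tools. Other external commands used by the scripts may still need to be installed separately.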

About the dataset size

Hello, thank you for sharing. I have some questions about the dataset.
I downloaded 74459 videos successfully, and they take up 1.3T of space...
The script began to download yt_bb_detection_validation.csv.gz, but my hard drive ran out of space.

Does this mean the download for yt_bb_detection_train.csv is complete?
Can I use voc_convert.py to process the 70,000 videos?
How much space will all the videos in the training and validation sets take?
And how much space will the dataset take after running voc_convert.py?
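
The figures above allow a rough back-of-the-envelope estimate. The sketch below assumes that 1.3T means 1.3 × 10^12 bytes and that the remaining videos have a similar average size; both are assumptions, and the function names are hypothetical.

```python
def average_video_bytes(total_bytes, num_videos):
    """Average size of one downloaded video, in bytes."""
    return total_bytes / float(num_videos)


def estimate_total_bytes(total_bytes, num_videos, target_videos):
    """Extrapolate the space needed for target_videos from a partial download."""
    return average_video_bytes(total_bytes, num_videos) * target_videos


if __name__ == "__main__":
    # Figures reported in the post: 74459 videos occupying about 1.3 TB.
    avg = average_video_bytes(1.3e12, 74459)
    print("average bytes per video: %.0f" % avg)
```

With the reported numbers, the average comes to roughly 17 MB per video. Multiplying by the total in the downloader's "Downloaded video: i / N" progress line gives an estimate for a full set. It does not cover the frames written by voc_convert.py.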

the running speed of voc_convert.py

Hi, your repo is good, and I have downloaded the YouTube data. Now I want to decode the videos into VOC training data, but the script runs very slowly, and it also has a failure case:

  1. A loop has to process all the data in train.csv, which takes a lot of time.
  2. After processing all the data in train.csv, we get the "present_annots" variable and start decoding frames, but it breaks with [error 26] Too many open files: '/dev/null'.

Is there a good way to avoid this? Is using more threads the only way to speed it up? I have used 64 threads, but it is still slow.
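
A "Too many open files: '/dev/null'" error often means that a handle to /dev/null is opened for each external command and never closed. Whether that is what happens in voc_convert.py is an assumption. A minimal sketch of the usual remedy is to open the handle in a `with` block so it is released after every call; `run_quietly` is a hypothetical helper.

```python
import os
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd with its output discarded, closing /dev/null afterwards."""
    # The with block guarantees the file descriptor is released even if
    # the command fails, so descriptors do not accumulate across calls.
    with open(os.devnull, "w") as devnull:
        return subprocess.call(cmd, stdout=devnull, stderr=devnull)


if __name__ == "__main__":
    # Repeated calls reuse no descriptors between iterations.
    for _ in range(3):
        run_quietly([sys.executable, "-c", "print('frame decoded')"])
```

On Python 3.3 and later, passing `subprocess.DEVNULL` for `stdout` and `stderr` achieves the same effect without opening a file by hand.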

wget error ?

I am now getting a new error message, even though I have installed wget on my system.

Safats-MBP:youtube_BB_jul19 SMT_Mac$ python3 download.py vid_dir 30

yt_bb_detection_validation: Downloading annotations...
Traceback (most recent call last):
File "download.py", line 41, in <module>
parse_and_sched(sys.argv[1],int(sys.argv[2]))
File "download.py", line 32, in parse_and_sched
annotations,clips,vids = youtube_bb.parse_annotations(d_set,dl_dir)
File "/Users/SMT_Mac/Desktop/youtube_BB_jul19/youtube_bb.py", line 182, in parse_annotations
check_call(['wget', web_host+d_set+'.csv.gz'])
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 579, in check_call
retcode = call(*popenargs, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 560, in call
with Popen(*popenargs, **kwargs) as p:
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 950, in __init__
restore_signals, start_new_session)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 1544, in _execute_child
raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'wget'
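
This FileNotFoundError means Python could not find a `wget` executable on the PATH seen by the script, which can happen even when wget is installed for a different shell or user. One way to sidestep the dependency, sketched here for Python 3, is to fetch the file with the standard library instead. `fetch` is a hypothetical helper, not part of youtube_bb.py.

```python
import shutil
import urllib.request


def fetch(url, dest_path):
    """Download url to dest_path using only the standard library."""
    # Stream the response to disk in chunks rather than reading it all
    # into memory at once.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

To check whether Python can see wget at all, `shutil.which("wget")` returns its full path, or `None` when it is not on the PATH.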
