nonsleepr / edu_10gen_dl
Generate list of course videos from education.10gen.com.
It seems that the default execution (python edu_10gen.py) downloads videos without subtitles. It would be great (for non-native English speakers) to be able to download them with subtitles.
I have seen a subtitleslang attribute in ydl_params.json. Is it somehow related to this?
Thanks!
Can it work with a different version of Python, say 2.6?
Hi,
I am using Git Bash on Windows. I was able to clone edu_10gen_dl, but I can't run the command 'sudo pip install -r requirements.txt'.
I am getting sh.exe: Sudo command not found.
I have Python 2.7 installed on Windows.
Could you please help me figure out what is missing?
If a problem occurs with the internet connection and a file has been temporarily downloaded, is there a way to resume downloading the file or does it have to be downloaded again from scratch?
dave@dave-netbook:~/edu_10gen_dl$ sudo python edx_dl.py ~/edu_10gen_dl/
Downloading to /home/dave/edu_10gen_dl/ directory
Found the following courses:
[01] M101P MongoDB for Developers
[02] M102 MongoDB for DBAs
Processing...
Course: {'url': '/courses/10gen/M101P/2013_June/courseware', 'name': u'M101P MongoDB for Developers'}
Chapters:
Getting chapters...
[01] Week 1 - Introduction
[01.01] Welcome to M101
[01.02] What is MongoDB?
[01.03] Mongo Relative to Relational
[01.04] Overview of Building an app with Mongo
[01.05] Quick Introduction to the Mongo Shell
[01.06] JSON introduced
[01.07] Installing MongoDB (mac)
[01.08] Installing MongoDB (windows)
"
"
"
[03.16] HW 3.1
[03.17] HW 3.2
[03.18] HW 3.3
Processing /home/dave/edu_10gen_dl/M101P_MongoDB_for_Developers/Week_1_-_Introduction/01.01.*...
Processing /home/dave/edu_10gen_dl/M101P_MongoDB_for_Developers/Week_1_-_Introduction/01.02.*...
Processing /home/dave/edu_10gen_dl/M101P_MongoDB_for_Developers/Week_1_-_Introduction/01.03.*...
"
"
"
Ultimately no files are downloaded.
When I run the script python edu_10gen.py, I get this error: Not all the nessesary libs are installed. Please see requirements.txt.
In the class init for EdXBrowser, there is the following call to FileDownloader (line 66)
self._fd = FileDownloader(config.YDL_PARAMS)
However, the class init for FileDownloader expects 2 arguments.
I just tried to grab my finished 10gen M101J course and got this error.
Traceback (most recent call last):
File "edx_dl.py", line 204, in <module>
edxb.download()
File "edx_dl.py", line 141, in download
sanitize_filename(course_name, replace_space_with_underscore),
TypeError: sanitize_filename() takes exactly 1 argument (2 given)
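One possible stopgap for the mismatch above, sketched under assumptions: pass only as many positional arguments as the installed sanitize_filename accepts. The helper name call_with_supported_args and the two stand-in functions are hypothetical and are not part of edx_dl.py or youtube_dl. The sketch is written for Python 3 (inspect.signature); on Python 2.7, inspect.getargspec would be the rough equivalent.

```python
import inspect


def call_with_supported_args(func, *args):
    """Call func, dropping trailing positional args it does not accept."""
    params = list(inspect.signature(func).parameters.values())
    # A *args parameter means any number of positionals is accepted.
    if any(p.kind is p.VAR_POSITIONAL for p in params):
        return func(*args)
    positional = [
        p for p in params
        if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]
    return func(*args[:len(positional)])


# Hypothetical stand-in for a sanitize_filename that takes one argument.
def sanitize_one_arg(name):
    return name.replace('/', '_')


# Hypothetical stand-in for a sanitize_filename that takes two arguments.
def sanitize_two_args(name, replace_space_with_underscore):
    name = name.replace('/', '_')
    return name.replace(' ', '_') if replace_space_with_underscore else name
```

Note that with a one-argument version the second argument is silently dropped, so spaces would stay in the directory names; pinning the youtube_dl version the script was written against may be the cleaner fix.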
Hi,
Firstly, thanks for the great script! It helps me a lot to get the videos offline so I can watch them when I want.
About the bug: it seems the script expects the course link to have the form:
courses/10gen//<year/month>/courseware
However, the current link is coming in as:
courses/10gen//<year/month>/syllabus
which makes the download fail silently. I just get an empty screen without any list of course chapters.
I used a simple inline text replace as a workaround, and the script works like a charm.
(Apologies if this is already raised as an issue and/or if you're working on it)
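The text-replace workaround described above could look roughly like this. It is only a sketch: the function name to_courseware_url is hypothetical, and where exactly the script builds the course URL is not shown here.

```python
def to_courseware_url(course_url):
    """Swap a trailing '/syllabus' segment for '/courseware'."""
    suffix = '/syllabus'
    if course_url.endswith(suffix):
        return course_url[:-len(suffix)] + '/courseware'
    # Anything else (including URLs already ending in /courseware) is unchanged.
    return course_url
```

Replacing only the trailing segment, rather than every occurrence of the word, avoids touching a course whose name happens to contain "syllabus".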
Hello, when trying the script I get the following error:
$ python edu_10gen.py
Can't sign in
I reviewed the code, and the script seems to be trying to download from https://www.edx.org/login ; this page returns a 404 in my browser.
If I print out the exception the error seems consistent.
Output:
$ python edu_10gen.py
HTTP Error 404: Not Found
Can't sign in
Code change:
--- a/edu_10gen.py
+++ b/edu_10gen.py
@@ -90,6 +90,7 @@ class TenGenBrowser(object):
             print login_state.get('value')
             return self._logged_in
         except mechanize.HTTPError, e:
+            print e
             sys.exit('Can\'t sign in')
 
     def list_courses(self):
The code needs to be edited to suit the edx website.
Line 86:
my_courses = dashboard_soup.findAll('article', 'my-course') #works for 10gen website
should be
my_courses = dashboard_soup.findAll('article', 'course honor') #works for edx website
since the article tags have different class names.
Alternatively an if statement can be used to check which website you are downloading from and then use the correct class for the article tag.
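The if-statement idea could be sketched like this. The two class names are the ones quoted above; the function name course_article_class, the base_url parameter, and the assumption that the site can be recognized from its base URL are all hypothetical.

```python
# CSS classes of the dashboard <article> tags, as reported above.
ARTICLE_CLASS_10GEN = 'my-course'
ARTICLE_CLASS_EDX = 'course honor'


def course_article_class(base_url):
    """Pick the dashboard <article> class for the site being scraped."""
    # Assumption: the target site can be told apart by its base URL.
    if 'edx.org' in base_url:
        return ARTICLE_CLASS_EDX
    return ARTICLE_CLASS_10GEN
```

Line 86 would then presumably become something like my_courses = dashboard_soup.findAll('article', course_article_class(base_url)), with base_url taken from wherever the script keeps the site address.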
$ python2.7 edu_10gen.py
[01] CS169.1x Software as a Service
[01] Overview
...
[07] Week 6
----------------------- Start downloading -----------------------
Processing CS169_1x_Software_as_a_Service/Overview/01.01.*...
Traceback (most recent call last):
File "edu_10gen.py", line 175, in <module>
tgb.download()
File "edu_10gen.py", line 145, in download
par_soup = BeautifulSoup(par.read())
File "/usr/lib/python2.7/site-packages/bs4/__init__.py", line 172, in __init__
self._feed()
File "/usr/lib/python2.7/site-packages/bs4/__init__.py", line 185, in _feed
self.builder.feed(self.markup)
File "/usr/lib/python2.7/site-packages/bs4/builder/_lxml.py", line 195, in feed
self.parser.close()
File "parser.pxi", line 1171, in lxml.etree._FeedParser.close (src/lxml/lxml.etree.c:84800)
File "parsertarget.pxi", line 126, in lxml.etree._TargetParserContext._handleParseResult (src/lxml/lxml.etree.c:93961)
File "lxml.etree.pyx", line 282, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:8216)
File "saxparser.pxi", line 273, in lxml.etree._handleSaxCData (src/lxml/lxml.etree.c:90291)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x81 in position 527: invalid start byte
Hi!
I'm trying your script with the latest version of youtube_dl, and it seems it doesn't work.
:edu_10gen_dl $ python edx_dl.py ../videos
Traceback (most recent call last):
File "edx_dl.py", line 16, in <module>
from youtube_dl.InfoExtractors import YoutubeIE
ImportError: cannot import name YoutubeIE
Cheers,
Hi,
Has anybody been able to get subtitles working with this script?
Whenever I try setting writesubtitles to True, I get:
[youtube] mXHUIghDkFw: Downloading video webpage
[youtube] mXHUIghDkFw: Downloading video info webpage
[youtube] mXHUIghDkFw: Extracting video information
WARNING: unable to download video subtitles for en: Unable to download webpage: HTTP Error 404: Not Found; please report this issue on https://yt-dl.org/bug . Be sure to call youtube-dl with the --verbose flag and include its complete output. Make sure you are using the latest version; type youtube-dl -U to update.
[download] Destination: lectures/M101P_-_MongoDB_for_Developers/Week_4_-_Performance/04.13.02 How_Large_is_your_Index_Answer.mp4
[download] 100% of 557.22KiB in 00:00
The weird thing is that if you use youtube-dl to download them directly, it works.
But when edu_10gen_dl imports the YoutubeDL object and calls .download(), it seems to fail.
I've raised an issue with youtube-dl:
However, I'm curious whether anybody has actually been able to get it working.
Cheers,
Victor