dimakiss / udemy_bot

An automation bot for free Udemy courses

License: GNU General Public License v3.0

Python 100.00%

udemy udemy-course bot chrome python pythonbot webscraping

udemy_bot's Issues

Cloudflare prevents complete execution

When I run the bot, it is unable to get past Cloudflare protection. I increased the sleep timeout so I could solve the captchas manually, but when using Selenium, Cloudflare on the Udemy site keeps serving me an endless loop of new captchas.

I tried using undetected_chromedriver as a replacement for chromedriver, but I get an error with it at line 207:

https://github.com/dimakiss/Udemy_bot/blob/main/Udemy_bot.py#L207
elif is_account_exist(sys.argv[1], sys.argv[2]):

selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 88 Current browser version is 87.0.4280.141 with binary path /Applications/Google Chrome.app/Contents/MacOS/Google Chrome

I've also tried manually specifying the Chrome driver version and binary for undetected_chromedriver:

import undetected_chromedriver as uc
uc.TARGET_VERSION = 87
uc.install(
    executable_path='/usr/local/bin/chromedriver',
)

but it still gives the error above. Chrome 88 isn't currently offered when I check for updates.
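
The error text itself says which two major versions disagree: the driver binary targets Chrome 88 while the installed browser is 87. As a minimal sketch (the helper name `parse_version_mismatch` is hypothetical, and the regular expressions only assume the message wording quoted above), the two numbers can be pulled out of the message to confirm which side needs to change:

```python
import re


def parse_version_mismatch(message):
    """Return (supported_major, installed_major) parsed from a ChromeDriver
    'session not created' message, or None if the wording is different."""
    supported = re.search(r"only supports Chrome version (\d+)", message)
    installed = re.search(r"Current browser version is (\d+)\.", message)
    if not (supported and installed):
        return None
    return int(supported.group(1)), int(installed.group(1))


# The message reported in this issue, shortened to the relevant part.
message = (
    "session not created: This version of ChromeDriver only supports "
    "Chrome version 88 Current browser version is 87.0.4280.141"
)

result = parse_version_mismatch(message)
if result is not None:
    supported_major, installed_major = result
    if supported_major != installed_major:
        print(
            f"ChromeDriver targets Chrome {supported_major}, "
            f"but Chrome {installed_major} is installed"
        )
```

If the numbers differ, either the browser or the driver binary has to be swapped for one with a matching major version.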

Selenium package missing as a requirement

The selenium package is missing in the requirements.txt file.

Credentials not correct but they are. Login fails

#10
Following up on the above: I resolved the bot error, but now I'm facing a login problem.
I receive the "There was a problem logging in. Check your email and password or create an account." error.

The email and password are correct. I copied and pasted them into Udemy and logged in successfully.

Debugging a bit, I see that in the function "is_account_exist" in udemy_bot.py the line
is_exist = temp_url == browser.current_url

reports that browser.current_url is not defined.

That is probably the culprit of the whole thing.
Any idea why it doesn't work?

EDIT: Further investigation (printing the elements found) gives this:

Checking if the email and password are correct
browser_email: <input name="email" required="" maxlength="64" minlength="7" placeholder="Email" data-purpose="email" type="email" id="email--1" class="form-control" value="">
browser_password: <input type="password" name="password" required="" placeholder="Password" class="textinput textInput form-control" maxlength="64" data-purpose="password" id="id_password" minlength="6">
current_url: https://www.udemy.com/join/login-popup/
browser_submit <input type="submit" name="submit" value="Log In" class="btn btn-primary " id="submit-id-submit" data-purpose="do-login">
is_exist True
There was a problem logging in. Check your email and password or create an account.

So the elements are all found, but temp_url and browser.current_url still remain equal and the login doesn't work.

EDIT:
Out of desperation, I tried again to fix something, but this damn bot is a nest of bugs and won't work in any way.
We've already seen that the bot now finds the HTML elements correctly.
So I imported these libraries:

from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys

Then, I tried to call the submit in 2 ways:

The original way in which the bot calls the submit
browser.find_element_by_id("submit-id-submit").click()

and this one
browser.find_element_by_id("submit-id-submit").send_keys(Keys.ENTER)

Then, to check whether browser.current_url has changed, I tried Selenium's WebDriverWait, which polls a condition until it holds or a timeout expires, instead of
sleep(2)
which is a workaround.

So I did:

from selenium.common.exceptions import NoSuchElementException, TimeoutException

try:
    # Print the submit button's HTML, then submit the form with ENTER
    print('browser_submit' + browser.find_element_by_id("submit-id-submit").get_attribute('outerHTML'))
    browser.find_element_by_id("submit-id-submit").send_keys(Keys.ENTER)
except NoSuchElementException:
    print("No element found submit")
print('Waiting max 30 seconds for url change')
wait = WebDriverWait(browser, 30)
try:
    # Poll until the URL differs from the login page URL
    wait.until(lambda driver: driver.current_url != temp_url)
except TimeoutException:
    print('url did not change in 30 seconds')

But even after 30 seconds nothing happened, and I ended up in the "TimeoutException" branch.

Now, why doesn't the URL change? Is that what's causing the problem?

I tried skipping the is_account_exist check, and the bot printed messages about scraping X potential courses and adding them to my account, but in reality nothing was added.

Please help!
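
For reference, the waiting step above boils down to polling a condition until it holds or a deadline passes. The following is a minimal stdlib-only sketch of that idea, not Selenium's own implementation; `wait_until` and `FakeBrowser` are hypothetical names, and `FakeBrowser` only stands in for a driver object with a `current_url` attribute:

```python
import time


def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Call condition() repeatedly; return True as soon as it is truthy,
    or False once `timeout` seconds have passed without success."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


class FakeBrowser:
    """Hypothetical stand-in for a Selenium driver (illustration only)."""

    def __init__(self, url):
        self.current_url = url


temp_url = "https://www.udemy.com/join/login-popup/"
browser = FakeBrowser(temp_url)

# The URL never changes here, so the wait runs into its timeout.
changed = wait_until(
    lambda: browser.current_url != temp_url,
    timeout=1.0,
    poll_interval=0.1,
)
print("url changed" if changed else "url did not change before the timeout")
```

A longer wait only helps if the page is slow. If the URL still has not changed after the timeout, the more likely explanation is that the form submission was rejected or intercepted, so the next thing to inspect would be the page content after the submit rather than the wait itself.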

Issue with Login with Google Option

Hey dev-team,

My Udemy account is tied to Google. Is there any way to log in with Google using this script?

Selenium package gives error installing with pip3

RUN pip3 install -r requirements.txt
---> Running in 88829269d1f6
Collecting bs4==0.0.1
Downloading bs4-0.0.1.tar.gz (1.1 kB)
Collecting requests==2.23.0
Downloading requests-2.23.0-py2.py3-none-any.whl (58 kB)
Collecting lxml>=4.6.2
Downloading lxml-4.6.2-cp39-cp39-manylinux1_x86_64.whl (5.4 MB)
ERROR: Could not find a version that satisfies the requirement selenium==1.25.9
ERROR: No matching distribution found for selenium==1.25.9

Error when validating mail (id="email--1") and error with selenium( DevToolsActivePort file doesn't exist)

As mentioned, I tried to run the bot in a Docker container, which is equivalent to running it in a Linux environment.
I'll spare you the hell I went through to make this bot work. I cried for long hours and pulled my hair out in every way. Believe me.
In the end I managed to get it running, but I got this:

Checking if the email and password are correct
Traceback (most recent call last):
  File "/etc/udemybot/Udemy_bot.py", line 203, in <module>
    elif is_account_exist(sys.argv[1], sys.argv[2]):
  File "/etc/udemybot/Udemy_bot.py", line 182, in is_account_exist
    browser = webdriver.Chrome(options=options)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/chrome/webdriver.py", line 76, in __init__
    RemoteWebDriver.__init__(
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)

The only workaround I found is to add this to the code:

print("Checking if the email and password are correct")
options = Options()
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--incognito")
options.add_argument("--headless")

Have you got a better solution?
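
One way to keep these flags out of the login logic is to collect them in a small helper. This is only a sketch under assumptions: `container_chrome_args` is a hypothetical name, and the comments describe the usual reasons these Chrome flags are needed in containers, which may not match every image:

```python
def container_chrome_args(headless=True, incognito=False):
    """Return Chrome command-line flags commonly needed inside a container."""
    args = [
        # Chrome's sandbox usually cannot start when running as root in Docker.
        "--no-sandbox",
        # Docker's default /dev/shm is small; this makes Chrome use /tmp instead.
        "--disable-dev-shm-usage",
    ]
    if headless:
        # Containers normally have no display server.
        args.append("--headless")
    if incognito:
        args.append("--incognito")
    return args


# With Selenium installed, the flags would be applied roughly like this:
#   from selenium import webdriver
#   from selenium.webdriver.chrome.options import Options
#   options = Options()
#   for arg in container_chrome_args():
#       options.add_argument(arg)
#   browser = webdriver.Chrome(options=options)
print(container_chrome_args())
```

Keeping the list in one place makes it easier to drop a flag (for example `--headless` when debugging with a visible browser) without touching the rest of the script.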

Then, after fixing this with a clunky workaround, I get this:

Checking if the email and password are correct
Traceback (most recent call last):
  File "/etc/udemybot/Udemy_bot.py", line 203, in <module>
    elif is_account_exist(sys.argv[1], sys.argv[2]):
  File "/etc/udemybot/Udemy_bot.py", line 186, in is_account_exist
    browser.find_element_by_id("email--1").send_keys(email)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 360, in find_element_by_id
    return self.find_element(by=By.ID, value=id_)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 976, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="email--1"]"}
  (Session info: headless chrome=88.0.4324.150)

What could it be? I suspect the HTML parser is missing something, but debugging inside the container is hard.
Do you have any idea what's going on?

EDIT: I printed the html with
print(browser.page_source)

And what you get is this:

Which looks like some bot protection measure from https://www.botstop.com/?utm_source=hcaptcha1 or something like that

Any idea?
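
If the page that comes back is a challenge page rather than the login form, the `email--1` lookup will fail no matter how the locator is written. A minimal sketch of a check that could run before the element lookup is shown below; the function name and the marker strings are assumptions for illustration, not a list confirmed against Udemy or any specific protection service:

```python
# Substrings that often appear in bot-challenge pages (assumed, not exhaustive).
CHALLENGE_MARKERS = (
    "captcha",
    "cf-browser-verification",
    "checking your browser",
)


def looks_like_bot_challenge(page_source):
    """Return True if the HTML contains any of the assumed challenge markers."""
    lowered = page_source.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


# Tiny hand-written samples standing in for browser.page_source.
challenge_html = '<html><body><div class="h-captcha"></div></body></html>'
login_html = '<html><body><input id="email--1" type="email"></body></html>'

print(looks_like_bot_challenge(challenge_html))
print(looks_like_bot_challenge(login_html))
```

With a check like this, the script could report "blocked by bot protection" instead of a confusing NoSuchElementException.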

selenium package missing in requirements.txt

The selenium package is missing in the requirements.txt file.

All categories enabled by default, README.md states otherwise

I believe courses from all categories are being added by default, e.g.

https://www.udemy.com/course/total-beginner-guitar-lessons/ is added

README.md contains
The current default categories are IT and Software and Development

The config area of udemy_bot.py contains

### CONFIG ###

categories_list = [
    'business',
    'design',
    'development',
    'finance-and-accounting',
    'health-and-fitness',
    'it-and-software',
    'lifestyle',
    'marketing',
    'music',
    'office-productivity',
    'personal-development',
    'photography',
    'photography-and-video',
    'teaching-and-academics'
]
#Personal preference for example
#categories_list=[
#    'development',
#    'it-and-software'
#]
rating_stars = 4.2
rating_people = 200

#### END OF CONFIG ###

Cheers - Scott.

Adding feature

Add an option to read a text file of previously collected URLs and make sure that URLs are not repeated.
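
A minimal sketch of how this could work, assuming one URL per line in a plain text file. All function names here are hypothetical and nothing below is taken from the bot's actual code:

```python
from pathlib import Path


def load_seen_urls(path):
    """Read previously stored URLs (one per line); a missing file means none."""
    file_path = Path(path)
    if not file_path.exists():
        return set()
    lines = file_path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def filter_new_urls(candidates, seen):
    """Return candidates not in `seen`, in order, without internal duplicates."""
    known = set(seen)
    new_urls = []
    for url in candidates:
        normalized = url.strip()
        if normalized and normalized not in known:
            known.add(normalized)
            new_urls.append(normalized)
    return new_urls


def append_urls(path, urls):
    """Append newly processed URLs so the next run skips them."""
    with open(path, "a", encoding="utf-8") as handle:
        for url in urls:
            handle.write(url + "\n")
```

A run would then call `load_seen_urls`, pass the scraped list through `filter_new_urls`, enroll only in the result, and finish with `append_urls`. Using a set for the lookup keeps the check fast even when the history file grows.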
