threegiantnoobs / chegg-scraper


Download Chegg homework-help questions to self-sufficient HTML files

License: The Unlicense

Python 36.27% HTML 63.73%

chegg chegg-answers chegg-downloader scraping

chegg-scraper's Introduction

NOTE

The original developers are no longer in a position to maintain this project. We would still like to keep it alive, so open-source contributions from the community are more than welcome.

Chegg-Scraper

Download Chegg homework-help questions to HTML files. These HTML files are self-sufficient: you don't need account access to load them.

Details

Each question is saved as an HTML document.

You will not need your Chegg account to open these files later.

USE-CASES

In Bots
You can share your Chegg subscription with your friends, e.g. by making a Discord bot
Saving Chegg Questions Locally

Setup:

Download latest release
Install requirements pip install -r requirements.txt
Save your cookie in file cookie.txt (preferably)
Using Browser Console
- Log in to Chegg in your browser and open the developer console (cmd-shift-c or ctrl-shift-i)
- Grab your cookies by typing
- Paste your cookie from the console into cookie.txt (without the surrounding ")
Or
Using Chrome Extension
- Log in to Chegg in your web browser
- Open a cookie extension (for example, EditThisCookie)
- Click Export and paste the result into cookie.txt
You may also need to change user-agent
- Open conf.json and edit user_agent
  - Find your browser user agent
    - Open What's My User Agent
      
      Or
    - Open Browser console and run
      
      console.log(navigator.userAgent)
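
The two inputs described in the setup steps can be read back with a few lines of Python. This is only a sketch under assumptions: cookie.txt is taken to hold the raw cookie string copied from the browser console (not an extension's JSON export), conf.json is taken to have a top-level user_agent key, and load_settings is a hypothetical helper, not a function from this project.

```python
import json
from pathlib import Path


def load_settings(cookie_path="cookie.txt", conf_path="conf.json"):
    """Hypothetical helper: read the cookie string and the user agent from disk."""
    # Strip whitespace and any surrounding double quotes left over from the console.
    cookie = Path(cookie_path).read_text(encoding="utf-8").strip().strip('"')
    conf = json.loads(Path(conf_path).read_text(encoding="utf-8"))
    # Assumption: the user agent lives under a top-level "user_agent" key.
    return {"cookie": cookie, "user_agent": conf.get("user_agent", "")}
```

Stripping the quotes in code means a cookie pasted with the surrounding " still loads correctly.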

Usage:

If you are new to Python, go here

Run the Downloader.py Script

$ python Downloader.py

Enter url of the homework-help:

Arguments

ALL ARGUMENTS ARE OPTIONAL
-u or --url      >   URL of Chegg
-c or --cookie   >   Path of cookie file (Default: cookie.txt)
-s or --save     >   File path where you want to save the output; put it inside " "
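
For readers unfamiliar with how such flags are wired up, here is a minimal argparse sketch that accepts the same three options. It is not the project's actual parser: the dest names (cookie_file, file_path) and the parse_args helper are assumptions made for illustration.

```python
import argparse


def parse_args(argv):
    """Parse the three optional flags listed above and return them as a dict."""
    parser = argparse.ArgumentParser(
        description="Download a Chegg homework-help question to an HTML file"
    )
    parser.add_argument("-u", "--url", help="URL of Chegg")
    # dest names below are hypothetical, chosen for this sketch only.
    parser.add_argument("-c", "--cookie", dest="cookie_file", default="cookie.txt",
                        help="Path of cookie file (default: cookie.txt)")
    parser.add_argument("-s", "--save", dest="file_path",
                        help="File path where the output should be saved")
    return vars(parser.parse_args(argv))


args = parse_args(["-u", "https://www.chegg.com/homework-help/questions-and-answers/math-problem-q4908624"])
print(args["cookie_file"])  # falls back to the default when -c is not given
```

Because every flag is optional, a script built this way can still prompt for the URL when -u is omitted.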

chegg-scraper's Issues

Parse errors

I have followed all of the instructions in the README and installed all of the requirements, but I keep getting this error: cheggscraper.Exceptions.FailedToParseAnswer. Here are the two Chegg questions I tried to download that produced errors:
https://www.chegg.com/homework-help/questions-and-answers/-following-code-example-type-css-rule-h3-color-blue-id-html-selector-class-compound-q28240766
https://www.chegg.com/homework-help/questions-and-answers/math-problem-q4908624

Getting an AttributeError: 'NoneType' object has no attribute 'group'

After creating the cookie.txt file and compiling I get the following error message:

Traceback (most recent call last):
  File "C:\Users\Julian\Desktop\chegg\Downloader.py", line 3, in <module>
    Downloader.main()
  File "C:\Users\xxxx\Desktop\chegg\cheggscraper\Downloader.py", line 40, in main
    print(Chegg.url_to_html(args['url'], file_name_format=args['file_format']))
  File "C:\Users\xxxxx\Desktop\chegg\cheggscraper\CheggScraper.py", line 521, in url_to_html
    headers, heading, question_div, answers__, question_uuid = self._parse(html_text=html_res_text,
  File "C:\Users\xxxxx\Desktop\chegg\cheggscraper\CheggScraper.py", line 437, in _parse
    re.search(r'C\.page\.homeworkhelp_question\((.*)?\);', html_text).group(1))
AttributeError: 'NoneType' object has no attribute 'group'

Please let me know if there is a solution for this or if this is now an outdated scraper, thank you.
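
This AttributeError means re.search returned None: the pattern was not found in the downloaded page, so there is nothing to call .group on. A minimal sketch of a guard that turns this into a clearer message is shown below. The pattern is the one from the traceback with its regex escapes written out; extract_question_data and QuestionDataNotFound are hypothetical names, not part of the project.

```python
import re

QUESTION_DATA_RE = re.compile(r'C\.page\.homeworkhelp_question\((.*)?\);')


class QuestionDataNotFound(Exception):
    """Hypothetical exception: the expected script data is missing from the page."""


def extract_question_data(html_text):
    match = QUESTION_DATA_RE.search(html_text)
    if match is None:
        # The page layout changed, or a captcha/login page came back instead.
        raise QuestionDataNotFound(
            "C.page.homeworkhelp_question(...) not found in the response"
        )
    return match.group(1)
```

The guard does not fix the underlying cause, but it makes the failure say what was missing instead of failing inside the regex call.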

Captcha Error

Chegg detects the script as a bot and asks me to solve a captcha after 2 links. Any solution? I have set the user agent correctly in the configuration file.

  File "Downloader.py", line 3, in <module>
    Downloader.main()
  File "C:\Users\nitin\Desktop\chegg-scraper\cheggscraper\Downloader.py", line 40, in main
    print(Chegg.url_to_html(args['url'], file_name_format=args['file_format']))
  File "C:\Users\nitin\Desktop\chegg-scraper\cheggscraper\CheggScraper.py", line 521, in url_to_html
    html_res_text = self._get_response_text(url=url)
  File "C:\Users\nitin\Desktop\chegg-scraper\cheggscraper\CheggScraper.py", line 311, in _get_response_text
    raise Exception(f'Expected status code {expected_status} but got {response.status_code}\n{error_note}')
Exception: Expected status code (200,) but got 403
Error in request
PS C:\Users\nitin\Desktop\chegg-scraper>

i get these error

raise Exception(f'Expected status code {expected_status} but got {response.status_code}\n{error_note}')
Exception: Expected status code (200,) but got 403
Error in request
PS C:\Users\nitin\Desktop\chegg-scraper>

Doesnt Work for any URL as of right now

Gives error everywhere, Please fix ;/

Chegg account keeps getting suspended

Have my own subscription, and I don't share with anyone. Launched the bot on the same browser, same IP as my chegg account and I still get that prompt a day or so after launching the bot. How to fix?

Error

Got this error while trying to scrape a URL
Traceback (most recent call last):
  File "/Users/sheheryartariq/Downloads/chegg-scraper-1.3/Downloader.py", line 3, in <module>
    Downloader.main()
  File "/Users/sheheryartariq/Downloads/chegg-scraper-1.3/cheggscraper/Downloader.py", line 40, in main
    print(Chegg.url_to_html(args['url'], file_name_format=args['file_format']))
  File "/Users/sheheryartariq/Downloads/chegg-scraper-1.3/cheggscraper/CheggScraper.py", line 521, in url_to_html
    headers, heading, question_div, answers__, question_uuid = self._parse(html_text=html_res_text,
  File "/Users/sheheryartariq/Downloads/chegg-scraper-1.3/cheggscraper/CheggScraper.py", line 437, in _parse
    re.search(r'C\.page\.homeworkhelp_question\((.*)?\);', html_text).group(1))
AttributeError: 'NoneType' object has no attribute 'group'

Help(CAN PAY FOR IT)

Ey, can you help someone like me who has no idea about using python?
I need to have chegg bot into my server! and I'm willing to pay for this help! THANKS

Bot Flag Error when deployed to heroku

The code works when running on my local computer. However, when deployed to Heroku or any other host (AWS, GCP, PythonAnywhere, etc.), I get a bot-flag error even with the cookie file.

Error downloading

Traceback (most recent call last):
  File "C:\Users\tarek\OneDrive\Escritorio\chegg-scraper-main\Downloader.py", line 3, in <module>
    Downloader.main()
  File "C:\Users\tarek\OneDrive\Escritorio\chegg-scraper-main\cheggscraper\Downloader.py", line 40, in main
    print(Chegg.url_to_html(args['url'], file_name_format=args['file_format']))
  File "C:\Users\tarek\OneDrive\Escritorio\chegg-scraper-main\cheggscraper\CheggScraper.py", line 533, in url_to_html
    headers, heading, question_div, answers__ = self._parse(
  File "C:\Users\tarek\OneDrive\Escritorio\chegg-scraper-main\cheggscraper\CheggScraper.py", line 448, in _parse
    heading = self._parse_heading(soup)
  File "C:\Users\tarek\OneDrive\Escritorio\chegg-scraper-main\cheggscraper\CheggScraper.py", line 345, in _parse_heading
    heading = json.loads(heading_data)['query']['qnaSlug']
KeyError: 'qnaSlug'
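
The KeyError says the JSON was parsed but has no qnaSlug entry under query for this page. One way to keep going, sketched here under assumptions, is to look the key up with dict.get and fall back to a placeholder heading. parse_heading and the "untitled" default are hypothetical, and the real _parse_heading may need to read the slug from somewhere else entirely.

```python
import json


def parse_heading(heading_data, default="untitled"):
    """Return query.qnaSlug from the JSON text, or a fallback when it is absent."""
    data = json.loads(heading_data)
    query = data.get("query") or {}
    # .get avoids the KeyError raised by ['qnaSlug'] when the key is missing.
    return query.get("qnaSlug") or default
```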

Cookie file not found

No matter where I put cookie.txt file, it keeps telling me:

Traceback (most recent call last):
  File "Downloader.py", line 1, in <module>
    from cheggscraper import Downloader
  File "/mnt/c/Users/Username/OneDrive/Desktop/solutions/cheggscraper/Downloader.py", line 48
    raise Exception(f'{args["cookie_file"]} does not exists')

Also tried Downloader.py in the root folder of the project. And using the -c argument. Nothing works
Any idea?
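
One possible cause, offered as an assumption rather than a diagnosis, is that a relative path such as cookie.txt is resolved against the directory the command is run from, not the project folder. The sketch below checks both places and reports every path it tried. resolve_cookie_file is a hypothetical helper, not project code.

```python
from pathlib import Path


def resolve_cookie_file(cookie_arg, script_dir):
    """Find the cookie file in the working directory or next to the script."""
    candidate = Path(cookie_arg).expanduser()
    if candidate.is_absolute():
        candidates = [candidate]
    else:
        # Relative paths: try the current directory first, then the script's folder.
        candidates = [Path.cwd() / candidate, Path(script_dir) / candidate]
    for path in candidates:
        if path.is_file():
            return path
    tried = ", ".join(str(p) for p in candidates)
    raise FileNotFoundError(f"{cookie_arg} does not exist; looked in: {tried}")
```

Passing an absolute path with -c sidesteps the question of which directory is current.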

UnicodeEncodeError: 'charmap' codec can't encode character

How can I fix this error?
UnicodeEncodeError: 'charmap' codec can't encode character.....
line 439 cheggscraper/CheggScraper.py
This error appears when I scrape some URLs.
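
The 'charmap' codec error typically shows up on Windows when text is written or printed using the system's legacy code page, which cannot represent every Unicode character. If line 439 is a file write (an assumption, since the line itself is not shown), passing encoding="utf-8" explicitly usually avoids it. save_html below is a hypothetical helper for illustration.

```python
from pathlib import Path


def save_html(html_text, file_path):
    """Write the page as UTF-8 regardless of the platform's default encoding."""
    path = Path(file_path)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(html_text)
    return path
```

If the error comes from print() rather than a file write, setting the environment variable PYTHONIOENCODING=utf-8 before running the script is another option.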

Error

Exception error

Likes and Dislikes

The script works very well, but could you add the Likes and Dislikes of the answers? That would be very good as it would help us to know if the answers are correct. Regards.

Bot Error Flag on home internet connection

I have a paid Chegg account, but every time I try to use the scraper on my laptop I get the following error: "cheggscraper.Exceptions.BotFlagError"

How to change the saved html directory?

Hi, after I enter the URL, the converted HTML is saved in the same directory as the chegg-scraper files. How do I save it to a different directory? Thanks

Bro do you face reCaptcha error too?

Hi guys, the code works well, but some URLs don't work unless you open them in the browser first and pass the reCaptcha. Have you faced the same issue, or is the problem on my side? Thanks

exception error

AttributeError: 'NoneType' object has no attribute 'group'

Hi there,

After giving the link: https://www.chegg.com/homework-help/questions-and-answers/us-government-recently-announced-started-implement-large-scale-fiscal-expansion-mitigate-n-q91031911?trackid=2e28bc4b1bcc&strackid=0708d012927e, the programme failed with the following error:

root@78:/codingProject/chegg-scraper# python3 Downloader.py -u https://www.chegg.com/homework-help/questions-and-answers/us-government-recently-announced-started-implement-large-scale-fiscal-expansion-mitigate-n-q91031911?trackid=2e28bc4b1bcc&strackid=0708d012927e
[16] 42052
root@78:/codingProject/chegg-scraper# Traceback (most recent call last):
  File "Downloader.py", line 3, in <module>
    Downloader.main()
  File "/codingProject/chegg-scraper/cheggscraper/Downloader.py", line 40, in main
    print(Chegg.url_to_html(args['url'], file_name_format=args['file_format']))
  File "/codingProject/chegg-scraper/cheggscraper/CheggScraper.py", line 521, in url_to_html
    headers, heading, question_div, answers__, question_uuid = self._parse(html_text=html_res_text,
  File "/codingProject/chegg-scraper/cheggscraper/CheggScraper.py", line 437, in _parse
    re.search(r'C\.page\.homeworkhelp_question\((.*)?\);', html_text).group(1))
AttributeError: 'NoneType' object has no attribute 'group'
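
A side note on the command above: the [16] 42052 line is the shell's background-job notice. An unquoted & in the URL ends the command and runs it in the background, so everything after the first & never reaches the script. Wrapping the URL in quotes avoids this. Whether it is related to the AttributeError is not clear from the report. Alternatively, the query string can be dropped before use, as in this sketch, which assumes the trackid and strackid parameters are not needed to load the page. strip_query is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


print(strip_query(
    "https://www.chegg.com/homework-help/questions-and-answers/"
    "math-problem-q4908624?trackid=abc&strackid=def"
))
```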

Some url can't work on it

https://www.chegg.com/homework-help/questions-and-answers/babies-birth-weights-normally-distributed-mean-120-ounces-standard-deviation-20-ounces-low-q21343037

https://www.chegg.com/homework-help/questions-and-answers/4-consider-gambler-s-ruin-problem-p-04-n-6-starting-state-3-determine-expected-amount-time-q40590496

https://www.chegg.com/homework-help/questions-and-answers/suppose-swing-ball-mass-m-vertical-circle-string-length-l--probably-know-experience-minimu-q2979048

The template has some problem

Can you update it

Not working code for non chapter type questions

AttributeError: 'NoneType' object has no attribute 'group'

Hi guys, I face the attribute error. I am wondering about how to fix that. Can you guys help me to have a look? Thank you

Cookie not working

Placing cookie.txt file in project folder not working

  File "C:\Users\user\Documents\GitHub\chegg-scraper-main\Downloader.py", line 3, in <module>
    Downloader.main()
  File "C:\Users\user\Documents\GitHub\chegg-scraper-main\cheggscraper\Downloader.py", line 34, in main
    raise Exception(f'{args["cookie_file"]} does not exists')
Exception: cookie.txt does not exists

Always ask for captcha after opening 5,10 link

Captcha error

run

chegg-scraper>python Downloader.py
Enter url of the homework-help: https://www.chegg.com/homework-help/questions-and-answers/bicycle-pedal-crank-subjected-1000-n-pedaling-force-determine-torque-nm-point-b--dimension-q30751129
Traceback (most recent call last):
  File "chegg-scraper\Downloader.py", line 28, in <module>
    print(Chegg.url_to_html(args['url'], file_path=args['file_path']))
  File "chegg-scraper\CheggScraper.py", line 261, in url_to_html
    final_html, heading = self.parse(html_text=html_res_text)
  File "chegg-scraper\CheggScraper.py", line 243, in parse
    answers = self._parse_answer(soup, html_text)
  File "chegg-scraper\CheggScraper.py", line 196, in _parse_answer
    _, question_data = self.parse_json(re.search(r'C\.page\.homeworkhelp_question\((.*)?\);', html_text).group(1))
AttributeError: 'NoneType' object has no attribute 'group'

Done

Done .

Need Help Regarding using this project with discord or telegram bot

Sir,
I am a student of IIT Varanasi, and I am a noob in python. I just started learning it.
I want help in making a telegram or discord bot that uses the scripts and returns the answer to the user.
It would be very kind for you to help me.
Thank You