
het-parekh / internshala-web-scraper-internshala.com

This is a Python web scraper for Internshala, a website for finding internships. Built with BeautifulSoup4, it collects information on all ongoing internships in any field, with several filters and an easy-to-use UI.

Home Page: https://internshala.com/

License: MIT License

Python 100.00%
web python3 webscraper internship internshala beautifulsoup webscapping

Web-Scraper-Internshala.com [Python 3.7+]

Last Updated on 19/09/2020

Third Party Libraries Required :

  1. requests [To fetch the URL content]
  2. BeautifulSoup4 [Library used for web scraping]
  3. xlwt [To export the data to an Excel file with multiple sheets]
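
As a rough illustration of how the first two libraries might fit together, the sketch below downloads a page with requests and pulls text out of it with BeautifulSoup4. The function names, the `title` class, and the sample markup are hypothetical; none of them are taken from main.py or from Internshala's real page structure.

```python
def fetch_html(url, timeout=10):
    """Download a page and return its HTML text (needs the requests package)."""
    import requests  # third-party, imported here so parsing works without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx responses
    return response.text


def parse_titles(html):
    """Return the text of every title element in the (assumed) listing markup."""
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    # The "title" class is an assumption for this sketch, not Internshala's markup.
    return [tag.get_text(strip=True) for tag in soup.find_all("h3", class_="title")]


# Made-up markup used only to exercise parse_titles without a network call.
SAMPLE_HTML = """
<div class="internship"><h3 class="title"> Web Development </h3></div>
<div class="internship"><h3 class="title">Backend Development</h3></div>
"""
```

In a real run, the string returned by `fetch_html` would be passed to a parser like `parse_titles`, with selectors adjusted to whatever the live page actually uses.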

Different filters are available such as :

  • Include work from home
  • Part-time
  • Internships for women
  • Internships with job offer
  • Starting from (or after)
  • Max Duration
  • Select multiple locations
  • Select multiple categories
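
One way to picture these filters is as a single settings object. The sketch below is an assumption about how they could be grouped, not the structure used in main.py; the class and attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class InternshipFilters:
    """Hypothetical container mirroring the filters listed above."""

    categories: List[str]
    locations: List[str]
    work_from_home: bool = False
    part_time: bool = False
    for_women: bool = False
    with_job_offer: bool = False
    start_date: date = field(default_factory=date.today)  # defaults to today
    max_duration: Optional[int] = None  # None means any duration

    def __post_init__(self):
        # The sample prompts mark categories and locations as required.
        if not self.categories:
            raise ValueError("at least one category is required")
        if not self.locations:
            raise ValueError("at least one location is required")
        if self.max_duration is not None and self.max_duration <= 0:
            raise ValueError("max_duration must be positive")


filters = InternshipFilters(
    categories=["Web Development"],
    locations=["Mumbai", "Delhi"],
    work_from_home=True,
    max_duration=3,
)
```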

Stores the following data for every internship available based on the selected filters :

  • Title
  • Company Name
  • Category
  • Location
  • Duration
  • Stipend
  • Last Date to apply
  • Number of applicants
  • Skills Required
  • Perks Provided
  • Number of openings
  • Link to that internship
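
The fields above map naturally onto one spreadsheet row per internship. The following is a minimal sketch of that mapping, assuming every value is kept as a string; the `Internship` class, the column headers, and both helper functions are hypothetical and may differ from what main.py does. Only the xlwt calls (`Workbook`, `add_sheet`, `write`, `save`) are the library's own.

```python
from dataclasses import dataclass, fields


@dataclass
class Internship:
    """Hypothetical record with one attribute per stored field."""

    title: str
    company: str
    category: str
    location: str
    duration: str
    stipend: str
    apply_by: str
    applicants: str
    skills: str
    perks: str
    openings: str
    link: str


HEADERS = [
    "Title", "Company Name", "Category", "Location", "Duration", "Stipend",
    "Last Date to Apply", "Applicants", "Skills Required", "Perks Provided",
    "Openings", "Link",
]


def to_rows(internships):
    """Return a header row followed by one row per internship."""
    rows = [HEADERS]
    for item in internships:
        rows.append([getattr(item, f.name) for f in fields(item)])
    return rows


def save_workbook(pages, filename):
    """Write each page of internships to its own sheet (needs the xlwt package)."""
    import xlwt  # third-party, imported here so to_rows works without it

    pages = list(pages)
    if not pages:
        raise ValueError("nothing to save: at least one page is required")
    workbook = xlwt.Workbook()
    for number, internships in enumerate(pages, start=1):
        sheet = workbook.add_sheet(f"Page {number}")
        for r, row in enumerate(to_rows(internships)):
            for c, value in enumerate(row):
                sheet.write(r, c, value)
    workbook.save(filename)  # xlwt produces the legacy .xls format
```

Calling `save_workbook([page_1, page_2], "Web_Dev.xls")` with two lists of `Internship` records would then write one sheet per scraped page.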

How to use it :

  1. Download or clone the repository
  2. Install the required libraries (requests, BeautifulSoup4, xlwt)
  3. Run main.py
  4. Provide appropriate input
  5. Obtain the Excel file in .xls format

Input Example :

Include Work From home?
Include Part-time?
Internships for women?
Internships with job offer?
(Represent your choice with 1-True or 0-False separated by commas such as 1,0,0,1)

1,0,0,0
Enter different categories separated by commas* (Required)
Web Development
Enter different locations separated by commas* (Required)
Mumbai,Delhi
Enter start date in format (yyyy-mm-dd) or leave empty for current date

Enter maximum duration or leave empty for any duration
3
--------------------------------------------------------------------
How many pages you would like to get? Max Pages (16)
2
Different pages on different sheets?(Default: Yes) | 1: No
#Leave empty if Yes 
--------------Scraping Page 1 -----------------
--------------Scraping Page 2 -----------------

1: Add New Sheet
2: Save and Open the file in Excel
3: Save file
4: Discard file and Exit
2
Enter the name of the file
Web_Dev
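
The answers in this session are plain strings, so they have to be converted before they can be used as filters. The helpers below sketch one way to do that under the rules the prompts state (1 for True and 0 for False, comma-separated lists, yyyy-mm-dd dates, empty input for defaults). The function names are hypothetical and this is not necessarily how main.py parses its input.

```python
from datetime import date, datetime


def parse_flags(answer, count=4):
    """Turn an answer such as '1,0,0,1' into a list of booleans."""
    parts = [p.strip() for p in answer.split(",")]
    if len(parts) != count or any(p not in ("0", "1") for p in parts):
        raise ValueError(f"expected {count} comma-separated 0/1 values")
    return [p == "1" for p in parts]


def parse_list(answer):
    """Split a required comma-separated answer, dropping empty items."""
    items = [p.strip() for p in answer.split(",") if p.strip()]
    if not items:
        raise ValueError("at least one value is required")
    return items


def parse_start_date(answer, today=None):
    """Parse yyyy-mm-dd; an empty answer falls back to the current date."""
    if not answer.strip():
        return today or date.today()
    return datetime.strptime(answer.strip(), "%Y-%m-%d").date()


def parse_max_duration(answer):
    """Return an int, or None when the answer is empty (any duration)."""
    return int(answer) if answer.strip() else None
```

With the inputs shown above, `parse_flags("1,0,0,0")` enables only the work-from-home option and `parse_list("Mumbai,Delhi")` yields two locations.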

