rafaqfg / web-scraping-project-python


In this project I created a Python script that uses data scraping techniques to extract HTML content from Trybe's blog and store it in a MongoDB database.

Dockerfile 1.66% Python 98.34%
data-scraper data-scraping data-scrapping mongodb python python3 docker docker-compose

web-scraping-project-python's Introduction

Web Scraping Project

Developed by

Description

  • In this project I created a Python script to scrape technology news from Trybe's blog.

Stack

Development: Python, Docker, pymongo, beautifulsoup4 and MongoDB.

How to run the application with Docker (you need to have docker-compose already installed on your machine)

Clone the repository

  git clone git@github.com:Rafaqfg/web-scraping-project-Python.git

Enter the project folder

  cd web-scraping-project-Python

Create and activate the virtual environment for the project

  python3 -m venv .venv && source .venv/bin/activate

Install the dependencies

  python3 -m pip install -r dev-requirements.txt

📌 Note: If the installation prints a red error message, repeat the previous step until the error is gone.

Start the Docker containers using the compose file (port 27017 must be available)

  docker-compose up -d
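
To confirm that port 27017 is free before starting the containers, a quick check along these lines may help. This is only a sketch and not part of the repository: the helper name `port_in_use` and the `127.0.0.1` host are assumptions made for illustration.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something already accepts TCP connections on host:port.

    Hypothetical helper, not taken from the project.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(27017):
        print("Port 27017 is already in use; stop the other service first.")
    else:
        print("Port 27017 looks free.")
```

Once the containers are up, the same check should report the port as in use, because MongoDB is then listening on it.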

Run the menu.py file

   python3 tech_news/menu.py

Enjoy scraping xD


📌 Note: The scraped website is in Portuguese, so you need to write your searches in Portuguese.
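
As an illustration of what a title search in Portuguese could look like, here is a minimal sketch of a case-insensitive filter. The `title` field name and both helper names are assumptions, not taken from the repository; the filter uses MongoDB's `$regex` and `$options` query operators, and the second helper applies the same match in plain Python so the idea can be tried without a database.

```python
import re


def build_title_filter(title: str) -> dict:
    """Build a MongoDB filter for titles containing `title`, ignoring case.

    The "title" field name is a hypothetical choice for this sketch.
    """
    # re.escape keeps characters such as "?" or "+" from being read as regex syntax
    return {"title": {"$regex": re.escape(title), "$options": "i"}}


def title_matches(title: str, stored_title: str) -> bool:
    """Apply the same case-insensitive substring match in plain Python."""
    return re.search(re.escape(title), stored_title, re.IGNORECASE) is not None


if __name__ == "__main__":
    print(build_title_filter("programação"))
    print(title_matches("PROGRAMAÇÃO", "Lógica de programação: o que é?"))
```

With pymongo, a filter like this would typically be passed to `collection.find(...)`. How the project actually builds its queries is not shown in this README.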

Steps of development

Description Finished
Create the fetch function ✔️
Create the function scrape_novidades ✔️
Create the scrape_next_page_link function ✔️
Create the scrape_noticia function ✔️
Create the get_tech_news function to get the news! ✔️
Create the function search_by_title ✔️
Create the function search_by_date ✔️
Create the function search_by_tag ✔️
Create the function search_by_category ✔️
Create the function top_5_news ✔️
Create the function top_5_categories ✔️
Create the menu function ✔️
Implement the menu features ✔️
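
The first five steps above suggest a pagination loop: fetch a listing page, collect the article links, scrape each article, then follow the next-page link until enough news items are gathered. The sketch below shows one way such a loop could be wired together. It is not the repository's code: the name `collect_news` and every signature are assumptions, and the scraping functions are passed in as parameters so the loop can be tried without network access or a database.

```python
from typing import Callable, List, Optional


def collect_news(
    amount: int,
    start_url: str,
    fetch: Callable[[str], Optional[str]],
    scrape_links: Callable[[str], List[str]],
    scrape_next_page: Callable[[str], Optional[str]],
    scrape_article: Callable[[str], dict],
) -> List[dict]:
    """Follow listing pages until `amount` articles are collected or pages run out.

    Hypothetical stand-in for a get_tech_news-style function.
    """
    news: List[dict] = []
    url: Optional[str] = start_url
    while url and len(news) < amount:
        listing_html = fetch(url)
        if listing_html is None:  # fetch failed: stop instead of looping forever
            break
        for link in scrape_links(listing_html):
            if len(news) == amount:
                break
            article_html = fetch(link)
            if article_html is not None:
                news.append(scrape_article(article_html))
        url = scrape_next_page(listing_html)
    return news


# Fake two-page site used only to exercise the loop; the "HTML" is just the page key.
PAGES = {
    "page1": {"links": ["a", "b"], "next": "page2"},
    "page2": {"links": ["c"], "next": None},
}

if __name__ == "__main__":
    result = collect_news(
        amount=3,
        start_url="page1",
        fetch=lambda url: url,
        scrape_links=lambda html: PAGES[html]["links"],
        scrape_next_page=lambda html: PAGES[html]["next"],
        scrape_article=lambda html: {"title": html},
    )
    print(result)
```

In the real project the four callables would presumably correspond to fetch, scrape_novidades, scrape_next_page_link and scrape_noticia, with the collected items then stored in MongoDB.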

Gif of the application

web-scraping-project-python's People

Contributors

rafaqfg, trybe-tech-ops
