
Web scraper that makes it easier to find real estate in Slovenia.

License: MIT License


real-estate-scraper

Web scraper that makes it easier to find real estate in Slovenia. After the scraper finishes, you get an email with updates, so you don't have to check the web pages all the time. The email contains all the information you need: title with location, short description, price and a link to the original listing.

Email example:

[Image: email example]
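
The email content described above (title with location, short description, price, link) could be assembled roughly as follows. This is a minimal sketch, not code from scraper.js: the listing object shape, the sample values and the `formatEmailBody` helper are assumptions made for illustration.

```javascript
// Hypothetical listing objects; the field names and values are made up
// for illustration and are not taken from the project.
const listings = [
  {
    title: 'Two-room apartment, Ljubljana',
    description: 'Renovated, third floor, balcony.',
    price: '150.000 EUR',
    link: 'https://example.com/listing/1',
  },
];

// Build a plain-text email body with one block per listing,
// separating listings with an empty line.
function formatEmailBody(items) {
  return items
    .map((item) =>
      [item.title, item.description, `Price: ${item.price}`, item.link].join('\n')
    )
    .join('\n\n');
}

console.log(formatEmailBody(listings));
```

Keeping the formatting in one small function makes it easy to change the layout later without touching the scraping code.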

Currently supports two major real estate websites: Bolha.com and Nepremicnine.net.

Download

To run the script you have to install Node.js (https://nodejs.org/en/) and do the following:

  • Clone repo
git clone https://github.com/mowlc/real-estate-scraper.git
  • Install additional libraries
npm install --save tinyreq
npm install --save cheerio 
npm install --save node-json-db 
npm install --save nodemailer

Run

First you need to get OAuth credentials. A very good tutorial on how to get them can be found here: https://stackoverflow.com/questions/24098461/nodemailer-gmail-what-exactly-is-a-refresh-token-and-how-do-i-get-one Once you have your clientID, clientSecret, refreshToken and accessToken, copy config.json.example and rename the copy to config.json. Then fill in the required data:

sender_email - Email address from which emails will be sent (must be a Gmail address)
clientID - Client ID for OAuth
clientSecret - Client secret for OAuth
refreshToken - Refresh token for OAuth
accessToken - Initial access token for OAuth; can be left empty because a new one is generated upon registration
interval - Interval in minutes at which the script executes, for example 15 (between 15 and 30 minutes is optimal)
receiver_email - List of email addresses to which the email is sent
url_bolha - List of URLs of selections on bolha.com
url_nepremicnine - List of URLs of selections on nepremicnine.net
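
Before starting the script, it can help to check that config.json contains every key from the list above. The following is a minimal sketch under assumptions: the `findMissingKeys` helper is hypothetical and not part of the project, and it treats every key except accessToken as required.

```javascript
// Keys taken from the configuration list in this README.
// accessToken is left out because it may be empty.
const REQUIRED_KEYS = [
  'sender_email',
  'clientID',
  'clientSecret',
  'refreshToken',
  'interval',
  'receiver_email',
  'url_bolha',
  'url_nepremicnine',
];

// Return the names of required keys that are absent or set to an empty string.
// Empty lists are accepted, e.g. when only one of the two sites is used.
function findMissingKeys(config) {
  return REQUIRED_KEYS.filter((key) => {
    const value = config[key];
    return value === undefined || value === null || value === '';
  });
}

// Example with an incomplete, made-up configuration object.
const partialConfig = { sender_email: 'someone@gmail.com', interval: 15 };
console.log(findMissingKeys(partialConfig));
```

In practice the object would come from the file itself, for example with `JSON.parse(fs.readFileSync('config.json', 'utf8'))`.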

You can get the desired URLs from the chosen site (bolha.com, nepremicnine.net) by configuring the search parameters on the site and then copying the resulting URL into the configuration file.

Run the script with the following command:

node scraper.js
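
Once started, the script repeats at the configured interval. How scraper.js schedules its runs is not described here, so the following is only a sketch of one possible approach, assuming the interval is given in minutes; `intervalToMs` and `startSchedule` are hypothetical helper names.

```javascript
// Convert the configured interval (assumed to be in minutes) to milliseconds.
function intervalToMs(minutes) {
  if (!Number.isFinite(minutes) || minutes <= 0) {
    throw new RangeError('interval must be a positive number of minutes');
  }
  return minutes * 60 * 1000;
}

// Run the task once right away, then again every `minutes` minutes.
// Returns the timer so the caller can stop it with clearInterval().
function startSchedule(task, minutes) {
  const delay = intervalToMs(minutes);
  task();
  return setInterval(task, delay);
}
```

For example, `startSchedule(() => console.log('scraping...'), 15)` would run the task immediately and then every 15 minutes until the process is stopped.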

Troubleshooting

Email not sent

The problem could be in your Gmail account settings, as Google blocks sign-in attempts from apps that do not use modern security standards. To fix that, go to Google's less secure apps settings and turn Access for less secure apps ON.
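
When an email is not sent, it may also be worth checking that the OAuth values from config.json reach the mail transport as intended. The sketch below maps the config keys to the options object that nodemailer's `createTransport()` accepts for Gmail with OAuth2. The `buildTransportOptions` helper is hypothetical, and the mapping is an assumption, since this README does not show how scraper.js builds its transport.

```javascript
// Map config.json values to nodemailer transport options for Gmail with OAuth2.
// The key mapping is an assumption and may differ from what scraper.js does.
function buildTransportOptions(config) {
  const auth = {
    type: 'OAuth2',
    user: config.sender_email,
    clientId: config.clientID,
    clientSecret: config.clientSecret,
    refreshToken: config.refreshToken,
  };
  // accessToken may be left empty in config.json; pass it only when present.
  if (config.accessToken) {
    auth.accessToken = config.accessToken;
  }
  return { service: 'gmail', auth };
}
```

The result could then be passed on as `nodemailer.createTransport(buildTransportOptions(config))`. Logging the options with the secrets masked is a quick way to spot an empty or misspelled key.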
