geo's Introduction

Description

This simple repository contains a script that parses data from http://www.geonames.org/. It uses their API, so you will need a user account to run it. The script parses the data returned by the API and stores it in JSON files. The idea is to get the cities of a country, or the areas assigned to a specific id.

Requirements

  1. Python 3.6
  2. BeautifulSoup module
  3. requests module

Examples

geo.py:

This script does the actual parsing. It takes arguments and parses the data:

python3.6 geo.py -h
usage: geo.py [-h] [--id ID] country username

Parse http://www.geonames.org data based on country or city in country

positional arguments:
  country     Country name for which parse cities. This is mandatory argument.
  username    Api username. This is mandatory argument.

optional arguments:
  -h, --help  show this help message and exit
  --id ID     Place id from geonames

python3.6 geo.py pl username
Parse country: pl
Store data in: /Users/username/geo/data/2018-03-02/country/pl/cities/data.json

This gets the cities for Poland and stores them in a JSON file. Part of the file output:

"cities":
[
    {
        "name": "Warsaw",
        "link": "http://www.geonames.org/maps/google_52.23_21.012.html",
        "tree": "http://geotree.geonames.org/756135/",
        "id": "756135"
    },
    {
        "name": "Łódź",
        "link": "http://www.geonames.org/maps/google_51.75_19.467.html",
        "tree": "http://geotree.geonames.org/3093133/",
        "id": "3093133"
    }
]

python3.6 geo.py pl --id=3094802 username
Parse pl area by id: 3094802
Store data in: /Users/username/geo/data/2018-03-02/country/pl/cities/Lesser Poland Voivodeship.json

This parses area #3094802 from Poland and stores it in a file. Part of the file output:

{
    "city": "Lesser Poland Voivodeship",
    "areas":
    [
        {
            "name": "Bielany",
            "id": 3103478
        },
        {
            "name": "Bronowice Wielkie",
            "id": 3102544
        }
    ]
}
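
The stored files can be read back with the standard json module. The following is a minimal sketch, assuming the files have the shape shown in the excerpts above: the keys "cities", "name", "id", "city" and "areas" come from those excerpts, while the helper names city_ids and area_names are hypothetical and not part of geo.py. The cities excerpt is shown without enclosing braces, so the sketch assumes the full file is a JSON object with a "cities" key.

```python
import json

# Sample data modeled on the excerpts above; the real files are written by geo.py.
CITIES_JSON = """
{"cities": [
    {"name": "Warsaw", "id": "756135"},
    {"name": "Łódź", "id": "3093133"}
]}
"""
AREAS_JSON = """
{"city": "Lesser Poland Voivodeship",
 "areas": [
    {"name": "Bielany", "id": 3103478},
    {"name": "Bronowice Wielkie", "id": 3102544}
 ]}
"""


def city_ids(cities_doc):
    """Map each city name to its id (hypothetical helper)."""
    return {city["name"]: city["id"] for city in cities_doc["cities"]}


def area_names(areas_doc):
    """List the area names of one parsed place (hypothetical helper)."""
    return [area["name"] for area in areas_doc["areas"]]


cities = json.loads(CITIES_JSON)
areas = json.loads(AREAS_JSON)
print(city_ids(cities))
print(areas["city"], area_names(areas))
```

Note that in the excerpts the ids are strings in the cities file and integers in the areas file, so code that compares them may need to convert one form to the other.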

job.py:

The second script just calls geo.py for an array of countries. It first parses the cities of each country and then, for each city, gets its areas. Think of a cron job that keeps the data up to date. At the moment, the job uses an array of GCC countries:

countries = ['ae', 'om', 'bh', 'qa', 'sa', 'kw']

This is hardcoded but can be changed to take the countries as an argument.
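
One possible way to do that is an optional argparse argument that falls back to the current list. This is only a sketch of the idea, not how job.py works today: the --countries option and the build_parser function are hypothetical names introduced for illustration.

```python
import argparse

# Current hardcoded list from job.py, kept as the default.
DEFAULT_COUNTRIES = ['ae', 'om', 'bh', 'qa', 'sa', 'kw']


def build_parser():
    """Build a parser with a hypothetical --countries option."""
    parser = argparse.ArgumentParser(
        description='Call script to parse data based on country')
    parser.add_argument('username', help='Api username. This is mandatory argument.')
    parser.add_argument(
        '--countries', nargs='+', default=DEFAULT_COUNTRIES, metavar='CC',
        help='Country codes to parse (default: GCC countries)')
    return parser


# Without the option, the hardcoded list is used.
print(build_parser().parse_args(['username']).countries)
# With the option, the given codes replace it.
print(build_parser().parse_args(['username', '--countries', 'pl', 'de']).countries)
```

Because nargs='+' consumes every value that follows the option, the username has to come before --countries on the command line in this sketch.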

python3.6 job.py -h
usage: job.py [-h] username

Call script to parse data from http://www.geonames.org data based on country
or city in country

positional arguments:
  username    Api username. This is mandatory argument.

optional arguments:
  -h, --help  show this help message and exit

python3.6 job.py username
Parse country: ae
Store data in: /Users/username/geo/data/2018-03-02/country/ae/cities/data.json
Parse ae area by id: 292223
Store data in: /Users/username/geo/data/2018-03-02/country/ae/cities/Dubai.json
...
Parse country: om
Store data in: /Users/username/geo/data/2018-03-02/country/om/cities/data.json
Parse om area by id: 287286
Store data in: /Users/username/geo/data/2018-03-02/country/om/cities/Muscat.json
...
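
The flow shown in this output (one call per country, then one call per city id) can be sketched as a list of geo.py command lines. This is an illustration under assumptions, not the actual job.py code: the function commands_for_country is hypothetical, and the city ids are passed in directly instead of being read from the cities file.

```python
import sys


def commands_for_country(country, username, city_ids):
    """Build the geo.py calls for one country (hypothetical helper).

    The first command parses the cities of the country; each following
    command parses the areas of one city, using the documented --id option.
    """
    commands = [[sys.executable, 'geo.py', country, username]]
    for city_id in city_ids:
        commands.append(
            [sys.executable, 'geo.py', country, '--id={}'.format(city_id), username])
    return commands


# Print the commands instead of running them; the ids are taken from the output above.
for country, ids in [('ae', ['292223']), ('om', ['287286'])]:
    for command in commands_for_country(country, 'username', ids):
        print(' '.join(command[1:]))
```

Each list could be handed to subprocess.run to execute the call; the sketch only prints the commands so that it runs without network access or an API account.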

Final words

This script was created in my free time. I needed some geographical data, and geonames is a good source of it. The script is free: you can do whatever you want with it, but there is no warranty. Also, no animals were hurt while working on it.
If you want to improve it, please feel free to create issues and pull requests.
