lukes / iso-3166-countries-with-regional-codes

ISO 3166-1 country lists merged with their UN Geoscheme regional codes in ready-to-use JSON, XML, CSV data sets

License: Other

Ruby 100.00%
countries csv data dataset iso iso3166 iso3166-1 iso3166-2 json region-codes xml

iso-3166-countries-with-regional-codes's Introduction

ISO-3166 Country and Dependent Territories Lists with UN Regional Codes

These lists are the result of merging data from two sources: the Wikipedia ISO 3166-1 article for alpha and numeric country codes, and the UN Statistics site for countries' regional and sub-regional codes. In addition to countries, the lists include dependent territories.

The International Organization for Standardization (ISO) site provides partial data (capitalised and sometimes stripped of non-Latin ornamentation), but sells the complete data set as a Microsoft Access 2003 database. Other sites give you the numeric and character codes, but there appeared to be no sites that included the associated UN-maintained regional codes in their data sets. I scraped data from the two websites above, all of which is already publicly available, to produce some ready-to-use complete data sets that will hopefully save some time for anyone with similar needs.

What's available?

The data is available in

  • JSON
  • XML
  • CSV

Three versions exist for each format:

  • all.format - Everything I can find, including regional and sub-regional codes
  • slim-2.format - English name, numeric country code and alpha-2 code (e.g., NZ)
  • slim-3.format - English name, numeric country code and alpha-3 code (e.g., NZL)

What does it look like?

Take a peek inside the all, slim-2 and slim-3 directories for the full lists of JSON, XML and CSV.

Using JSON as an example:

all.json

[
  {
    "name":"Nigeria",
    "alpha-2":"NG",
    "alpha-3":"NGA",
    "country-code":"566",
    "iso_3166-2":"ISO 3166-2:NG",
    "region":"Africa",
    "sub-region":"Sub-Saharan Africa",
    "intermediate-region":"Western Africa",
    "region-code":"002",
    "sub-region-code":"202",
    "intermediate-region-code":"011"
  },
  // ...
]

slim-2.json

[
  {
    "name":"New Zealand",
    "alpha-2":"NZ",
    "country-code":"554"
  },
  // ...
]

slim-3.json

[
  {
    "name":"New Zealand",
    "alpha-3":"NZL",
    "country-code":"554"
  },
  // ...
]
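
As a rough illustration of consuming these files, the sketch below parses a record shaped like the all.json sample above and indexes it by alpha-2 code. The inline SAMPLE string and the index_by helper are assumptions made for this example, not part of the repository; in practice you would read all/all.json from disk instead.

```python
import json

# Inline sample mirroring the all.json record shown above (an assumption for
# this sketch; normally you would open all/all.json instead).
SAMPLE = """
[
  {
    "name": "Nigeria",
    "alpha-2": "NG",
    "alpha-3": "NGA",
    "country-code": "566",
    "region": "Africa",
    "sub-region": "Sub-Saharan Africa",
    "region-code": "002",
    "sub-region-code": "202"
  }
]
"""


def index_by(records, key):
    """Return a dict mapping each record's value for `key` to the record."""
    return {record[key]: record for record in records}


countries = json.loads(SAMPLE)
by_alpha2 = index_by(countries, "alpha-2")

# Codes are stored as strings, so leading zeros such as "002" are preserved.
print(by_alpha2["NG"]["name"], by_alpha2["NG"]["region-code"])
```

Because the keys contain hyphens, they are accessed with subscript syntax (record["alpha-2"]) rather than as attributes.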

Caveats

  1. Please check the data independently for accuracy before using it in any system and for any purpose
  2. Although I've tried to ensure the data is as accurate as possible, the data is not authoritative, and so should not be considered accurate

scrubber.rb

scrubber.rb is a dirty Ruby script I used to generate these files. You can run it yourself if you wish to re-generate the files fresh from the sources.

To install the gems in the Gemfile:

bundle

To run:

bundle exec ruby scrubber.rb

Note: due to file encoding issues, the script should only be run using Ruby 1.9 or above.

Timestamp

  • UN Statistical data retrieved 8 December 2020
  • Wikipedia data retrieved 8 December 2020, from a document last revised 19 November 2020

Revisions

  • 8 December 2020 - tag 9.0
  • 19 March 2019 - tag 8.0
  • 25 July 2018 - tag 7.0
  • 10 April 2018 - tag 6.0
  • 26 August 2016 - tag 5.0
  • 28 August 2015 - tag 4.0
  • 20 April 2014 - tag 3.0
  • 13 June 2012 - tag 2.0
  • 18 May 2011 - tag 1.0

iso-3166-countries-with-regional-codes's People

Contributors

chocochino, dependabot[bot], jlewis91, lukes, michalskop, nickdickinsonwilde

iso-3166-countries-with-regional-codes's Issues

Key names to match ISO

Thank you for this handy reference for the ISO countries. I ran into conflict with the key names not matching ISO's key names. ISO uses Alpha-2, Alpha-3, Numeric in displays and discussions. In their CSV sample files, the headings are alpha_2_code, alpha_3_code, numeric_code.
It would be a breaking change to rename them, but I just wanted to note it. It came up when doing the variable names in our project.
(We're going with Alpha2, Alpha3, Numeric; translating from your files; and avoiding ISO's kebab-case.)

Somewhat related to the hyphen question #27

Hyphen in keys

Using a hyphen in the object key names doesn't play very well with JavaScript and other languages. Please consider using underscores or camelCase.
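
Until the key names change, a consumer can rename them after loading. The following is a minimal sketch, assuming records shaped like the slim-2.json sample; the to_snake_case helper is a hypothetical name introduced here, not something the repository provides.

```python
def to_snake_case(records):
    """Return copies of the records with hyphens in key names replaced by underscores."""
    return [
        {key.replace("-", "_"): value for key, value in record.items()}
        for record in records
    ]


# Record shaped like the slim-2.json sample from the README.
sample = [{"name": "New Zealand", "alpha-2": "NZ", "country-code": "554"}]
renamed = to_snake_case(sample)

# The renamed keys are valid identifiers in most languages.
print(renamed[0]["alpha_2"], renamed[0]["country_code"])
```

The original records are left untouched, so the renaming can be applied at the boundary of an application without affecting code that still expects the hyphenated keys.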

C++11 code generation

I implemented a code generator for C++11 in Python. If you're at all interested in merging this back into your repo I'm happy to tidy it up some more and take your guidance on it.

Thanks for the super-convenient JSON! 👍

-- Nigel

#!/usr/bin/env python

import json
import codecs

# Read the source data; "source" avoids shadowing the built-in input().
source = codecs.open("all/all.json", 'r', 'UTF-8')

h   = codecs.open("all/all.h",  'w', 'UTF-8')
cpp = codecs.open("all/all.cpp", 'w', 'UTF-8')

dom = json.load(source)
source.close()

h.write('''
#pragma once

#include <string>

    struct Country
    {
        std::string  name;
        std::string  alpha_2;
        std::string  alpha_3;
        std::string  country_code;
        std::string  region_code;
        std::string  region;
        std::string  sub_region;
    };

    const Country * findByName  (const std::string &);
    const Country * findByAlpha2(const std::string &);
    const Country * findByAlpha3(const std::string &);
    const Country * findByCode  (const std::string &);

''')

cpp.write('''
#include "all.h"

#include <map>

''')

cpp.write("const Country country[%d] = {\n" % len(dom))
for i in dom:
    cpp.write('  { "%s", "%s", "%s", "%s", "%s", "%s", "%s" },\n'%(i["name"], i["alpha-2"], i["alpha-3"], i["country-code"], i["region-code"], i["region"], i["sub-region"]))
cpp.write("};\n\n")

cpp.write("const std::map<std::string, const Country *> countryByName = {\n" )
for index, i in enumerate(dom):
    cpp.write('  { "%s", &country[%d] },\n'%(i["name"], index))
cpp.write("};\n\n")

cpp.write("const std::map<std::string, const Country *> countryByAlpha2 = {\n" )
for index, i in enumerate(dom):
    cpp.write('  { "%s", &country[%d] },\n'%(i["alpha-2"], index))
cpp.write("};\n\n")

cpp.write("const std::map<std::string, const Country *> countryByAlpha3 = {\n" )
for index, i in enumerate(dom):
    cpp.write('  { "%s", &country[%d] },\n'%(i["alpha-3"], index))
cpp.write("};\n\n")

cpp.write("const std::map<std::string, const Country *> countryByCode = {\n" )
for index, i in enumerate(dom):
    cpp.write('  { "%s", &country[%d] },\n'%(i["country-code"], index))
cpp.write("};\n\n")

cpp.write("""

const Country * findByName(const std::string & name)
{
    auto i = countryByName.find(name);
    return i==countryByName.end() ? NULL : i->second;
}

const Country * findByAlpha2(const std::string & name)
{
    auto i = countryByAlpha2.find(name);
    return i==countryByAlpha2.end() ? NULL : i->second;
}

const Country * findByAlpha3(const std::string & name)
{
    auto i = countryByAlpha3.find(name);
    return i==countryByAlpha3.end() ? NULL : i->second;
}

const Country * findByCode(const std::string & name)
{
    auto i = countryByCode.find(name);
    return i==countryByCode.end() ? NULL : i->second;
}

""")

# Close both output files so the generated code is flushed to disk.
h.close()
cpp.close()

Some country names are strange

Some country names are strange; they look like they have been truncated. I think the names should be more human-readable.
For example, Korea (Democratic People's Republic of) should be the full name Korea (Democratic People's Republic of Korea) or the common name North Korea.

$ cat all.json | jq ".[].name" | grep of
"Bolivia (Plurinational State of)"
"Congo, Democratic Republic of the"
"Iran (Islamic Republic of)"
"Isle of Man"
"Korea (Democratic People's Republic of)"
"Korea, Republic of"
"Micronesia (Federated States of)"
"Moldova, Republic of"
"Palestine, State of"
"Taiwan, Province of China"
"Tanzania, United Republic of"
"United Kingdom of Great Britain and Northern Ireland"
"United States of America"
"Venezuela (Bolivarian Republic of)"
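
One way a consumer could get friendlier names without changing the data set is an override table applied at display time. This is only a sketch: the DISPLAY_NAME_OVERRIDES table and the display_name helper are hypothetical, and the single override shown is the example given in this issue.

```python
# Hypothetical override table: keys are names as they appear in the data,
# values are the display names an application prefers.
DISPLAY_NAME_OVERRIDES = {
    "Korea (Democratic People's Republic of)": "North Korea",
}


def display_name(record, overrides=DISPLAY_NAME_OVERRIDES):
    """Return the preferred display name, falling back to the data's own name."""
    return overrides.get(record["name"], record["name"])


records = [
    {"name": "Korea (Democratic People's Republic of)", "alpha-2": "KP"},
    {"name": "Isle of Man", "alpha-2": "IM"},
]
names = [display_name(r) for r in records]
print(names)
```

Names without an override pass through unchanged, so the table only needs entries for the cases an application cares about.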

Missing License

There is no mention of a license in the code. It would be useful to include a general license such as MIT or GPL.

Add license

This dataset is very useful, and I thought you might like to regulate its use with a license. I suggest CC-BY-SA 4.0, since you build your work upon Wikipedia's dataset, which uses CC-BY-SA 3.0 as its license.

(Actually, I have just realized that your repo also contains a Ruby script to retrieve the data, but I assume you primarily use this repo to share your datasets and not your script.)

Subregion codes wrong

Thanks for the useful resource. There's a small bug with two of the countries: St Kitts and Nevis and Serbia - their value for sub-region-code is actually their country code - e.g. 688 for Serbia, when it should be 039 (Southern Europe). All the others appear fine.
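
A quick consistency check along these lines can catch this kind of slip. The sketch below flags any record whose sub-region-code equals its own country-code; the helper name and the two illustrative records are assumptions for the example, with the values 688 and 039 taken from this report.

```python
def suspicious_sub_region_codes(records):
    """Return names of records whose sub-region-code repeats their country-code."""
    return [
        record["name"]
        for record in records
        if record.get("sub-region-code")
        and record["sub-region-code"] == record["country-code"]
    ]


# Illustrative records: the first reproduces the reported bug, the second
# uses the value the issue says is correct.
records = [
    {"name": "Serbia (as reported)", "country-code": "688", "sub-region-code": "688"},
    {"name": "Serbia (expected)", "country-code": "688", "sub-region-code": "039"},
]
flagged = suspicious_sub_region_codes(records)
print(flagged)
```

Records with an empty or missing sub-region-code are skipped rather than flagged, since the check only targets the copy-over mistake described here.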

Viet Nam or Vietnam

I found that Vietnam is spelled as Viet Nam.

From what I can find, Vietnam is the Western/English spelling, and Việt Nam (not Viet Nam) is the Vietnamese spelling.

Happy to change this manually in all files if wanted.

Taiwan, Province of China rename to Taiwan

Hi, I appreciate the great work, but Taiwan is not currently a province of China: it has its own independent government, and people from China cannot enter Taiwan without a passport or visa.
This has been a controversial issue for a long time and remains unsettled, so I would appreciate it if you could change the name of this place to Taiwan, which is the name that most of the people living there think it should have.

Thank you.

Americas as the region name

Hello,
First of all, I want to thank you for this list. I am not a native English speaker, and I am wondering why you call the region 'Americas' instead of 'America'.

Taiwan is a country, not province of China.

According to Wikipedia, Taiwan is not a province of China.

There are lines of code representing "Taiwan" as "Taiwan, Province of China", which is unacceptable. I've created some pull requests that update those lines in the /all folder (csv, json, xml).

Latest update is missing

Hi!

First, thank you for this repo!

In your readme you say that you have updated the data to August 2016.
I can't find this update in this repo.

Please help.

Thanks!

availability as a npm package

Hello,

I'd like to know if you would be interested in a PR that adds a package.json file in order to publish to the npm registry.

This would allow people to use the JSON (and other formats, but mostly JSON) directly from their dependencies in any JavaScript project.

I'd be happy to help on that, let me know.

Question: Are country codes constant?

Sorry for asking in the issues, I really don't know where else to look for this information, and I can't find it on the web either.

Are country-code values expected to be constant? I know country names and 2-letter codes can change (e.g. FYROM MK soon), but will the numeric codes in this repo's CSVs remain constant no matter what? Can I safely use them as primary keys to refer to countries from other tables, and keep your CSV updated in my app without fear of future inconsistency (except when two countries join and a numeric code disappears)?

Thanks.

Group entries

Hello,

ISO 3166 defines countries in the usual sense and some of their subdivisions. However there is no link between a subdivision and the "main" entity. For instance:

  • Martinique (MQ) is usually understood as linked to France (FR)
  • Greenland (GL) is usually understood as linked to Denmark (DK)
  • etc.

Therefore I think it might be a good idea to add a column to store that information. I understand it might be a tricky issue (it's probably why the ISO doesn't do that) but I find this useful. We should probably pay attention to the chosen terms.

Any thoughts?

P.S.: thanks for this useful list.
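
To make the proposal concrete, here is a minimal sketch of how such a link could be layered on top of the existing records. The PARENT_ALPHA2 table, the with_parent helper and the "parent-alpha-2" key are all hypothetical names; the two mappings are the examples given above, and nothing like this exists in the data set today.

```python
# Hypothetical parent mapping taken from the two examples in this issue:
# dependent territory alpha-2 code -> alpha-2 code of the associated country.
PARENT_ALPHA2 = {
    "MQ": "FR",  # Martinique -> France
    "GL": "DK",  # Greenland -> Denmark
}


def with_parent(records, parents=PARENT_ALPHA2):
    """Return copies of the records with an added "parent-alpha-2" key (None if absent)."""
    return [
        {**record, "parent-alpha-2": parents.get(record["alpha-2"])}
        for record in records
    ]


records = [
    {"name": "Martinique", "alpha-2": "MQ"},
    {"name": "France", "alpha-2": "FR"},
]
enriched = with_parent(records)
print([(r["name"], r["parent-alpha-2"]) for r in enriched])
```

Keeping the mapping separate from the scraped data means the choice of terms and relationships stays with the consumer, which sidesteps some of the trickiness mentioned above.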

Cyprus - wrong region

I think that Cyprus is listed in the wrong region.

It's listed under Asia, but I think it should be Europe.
