
๐Ÿป An open-source dataset of breweries, cideries, brewpubs, and bottleshops.

Home Page: https://www.openbrewerydb.org

License: MIT License

Jupyter Notebook 65.74% TypeScript 34.26%
breweries csv dataset hacktoberfest json sql typescript

openbrewerydb's Introduction

๐Ÿป Open Brewery DB Dataset

This is the open-source dataset behind the Open Brewery DB API, a REST API built with Ruby on Rails.

🎯 Purpose

Provide an approval-based pipeline to update the dataset and API.

🗄 Data Formats

API

Access the dataset programmatically via the Open Brewery DB API. Use the following tools to get started without any code:

If you don't know how to use APIs, you can use Brewery DB without code through the databar.ai platform.

Run without code

A shared Postman collection contains all the API requests for fetching brewery information from the open-source dataset.

Run in Postman
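
For readers who do want to call the API from code, here is a minimal TypeScript sketch of building a request URL. The base URL `https://api.openbrewerydb.org/v1/breweries` and the `by_city` and `per_page` query parameters are assumptions drawn from the public API documentation, not from this README, and `buildBreweriesUrl` is a hypothetical helper. Check the documentation before relying on any of them.

```typescript
// Assumed base URL for the list endpoint; not stated in this README.
const BASE_URL = "https://api.openbrewerydb.org/v1/breweries";

// Hypothetical query shape; both parameter names are assumptions.
interface BreweryQuery {
  by_city?: string;  // assumed city filter
  per_page?: number; // assumed page-size parameter
}

// Hypothetical helper: builds a list-endpoint URL from simple filters.
function buildBreweriesUrl(query: BreweryQuery): string {
  const parts: string[] = [];
  if (query.by_city !== undefined) {
    parts.push("by_city=" + encodeURIComponent(query.by_city));
  }
  if (query.per_page !== undefined) {
    parts.push("per_page=" + String(query.per_page));
  }
  return parts.length > 0 ? BASE_URL + "?" + parts.join("&") : BASE_URL;
}

const requestUrl = buildBreweriesUrl({ by_city: "San Diego", per_page: 3 });
// requestUrl can then be passed to fetch() and the response read with .json().
```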

🚀 Getting Started

  1. git clone git@github.com:openbrewerydb/openbrewerydb.git
  2. cd openbrewerydb && npm install

๐Ÿค Contributing

For information on contributing to this project, please see the contributing guide and our code of conduct.

  1. Fork the repository
  2. Add or update breweries in the CSV (Excel, Google Sheets)
  3. Submit a Pull Request

Tips

First and foremost, don't worry about messing up! 🙂 Thank you so much for contributing! 🙌

  • CSVs are organized by data/[country]/[state_province]
  • Required fields/columns: name, brewery_type, city, state_province, postal_code, and country
  • When adding a brewery, do not include an id. This will be created after review.
  • Please either add to breweries.csv (preferred if adding breweries for a new country) or the individual state/province CSV file. Adding to both at the same time may introduce duplicates/errors.
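
Before opening a pull request, the tips above can be checked mechanically. The following is a minimal sketch, not the project's own validation (which the README says runs through `npm run validate` and JSON Schema). The `CsvRow` shape and the helper names `missingRequiredFields` and `hasUnexpectedId` are hypothetical.

```typescript
// Required columns, as listed in the tips above.
const REQUIRED_FIELDS = [
  "name",
  "brewery_type",
  "city",
  "state_province",
  "postal_code",
  "country",
];

// A parsed CSV row: column name -> raw cell text (hypothetical shape).
type CsvRow = { [column: string]: string | undefined };

// Returns the required columns that are absent or blank in the row.
function missingRequiredFields(row: CsvRow): string[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = row[field];
    return value === undefined || value.trim() === "";
  });
}

// Flags rows that include an id, which new entries should omit.
function hasUnexpectedId(row: CsvRow): boolean {
  const id = row["id"];
  return id !== undefined && id.trim() !== "";
}
```

A row with a blank `state_province` cell would be reported as `["state_province"]`, and a new row that carries an `id` would be flagged for removal of that value.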

โš™๏ธ Scripts

These are the npm scripts used to maintain this dataset.

  • npm run csv:combine - Combine CSVs from country/state-region folders into breweries.csv
  • npm run csv:split - Split breweries.csv into country/state-region/city CSVs
  • npm run contributors:add - Add contributor (interactive CLI)
  • npm run contributors:check - Check if there are any missing contributors
  • npm run contributors:generate - Generate contributors into README.md
  • npm run generate:ids - Generate unique OBDB IDs based on the brewery name and city and overwrite breweries.csv
  • npm run generate:json - Generate JSON from breweries.csv output to breweries.json
  • npm run generate:sql - Generate PostgreSQL SQL from breweries.csv output to breweries.sql
  • npm run validate - Validate CSVs based on JSON Schema
  • npm run workflow:maintain - Combine, generate, split (used when updating individual CSVs)

👾 Community

📫 Feedback

If you have any feedback, please email me.

Cheers! 🍻

Contributors ✨

Thanks goes to these wonderful people (emoji key):

  • Mike Putnam 🔣
  • Andrew A. Barber 🔣
  • Jason Allen 🔣
  • Juicob 🔣
  • Will Karnasiewicz 🔣
  • Dylan T. Vavra 🔣
  • Madison Martinez 🔣
  • Daniel Eremchuk 🔣
  • Alex Chong 🔣
  • Matt S 🔣
  • Samuel Rusher 🔣
  • Evan Caraway 🔣
  • Tyler K Kuromiya Parker 🔣
  • kendellmendoza 🔣
  • Johnnyk737 🔣
  • James Schuler 🔣
  • Creighton Leif 🔣
  • Vitaly Tomilov 💻
  • Kyle Scudder 🔣
  • Chris Mears 💬 💻 🔣 🚧 📆 🔧 ✅
  • donkeyslaps 🔣
  • Pranav Davar 🔧
  • Alexandre Hernandes Barrozo 🔣
  • Resten 🔣
  • Matt Higgins 🔣
  • Alex Justesen 🔣
  • Craig Kelly 🔣
  • Krzysztof Rewak 🔣
  • John Baumert 🔣
  • Charlie Cox 🔣
  • Miles Kane 🔣
  • Anthony Laflamme 💻
  • Georg Engelsmann 🔣
  • Clinton Williams 🔣
  • Brent Busby 🔣
  • kenster89 🔣
  • Adilet Sarsembayev 🔣
  • Pranav Davar 🔣
  • b-mc2 🔣
  • Nicole 🔣
  • Nicholas Hance 🔣
  • Joachim Nilsson 🔣

This project follows the all-contributors specification. Contributions of any kind welcome!

openbrewerydb's Issues

ArcGIS REST Service

Overview

This could be either an implementation of an ArcGIS REST Service (does this cost anything?) or just an endpoint that returns properly formatted data.

Per Paul Doherty on Twitter:

"Ideally an ArcGIS REST service, but anything with consistent headers and latitude/longitude (decimal degrees) would work (geojson, .csv). Obviously, with attribution and links to whatever can help your amazing project succeed!"

https://twitter.com/pjdohertygis/status/1374182569936232456
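
GeoJSON is one of the formats the request names. The following minimal sketch shows how a brewery record could be turned into a GeoJSON point feature, assuming the dataset's `longitude` and `latitude` columns hold decimal degrees as text. The `BreweryPoint` and `PointFeature` types and the `toFeature` helper are hypothetical.

```typescript
// Minimal brewery shape for this sketch; the real dataset has more columns.
interface BreweryPoint {
  name: string;
  longitude: string; // CSV cells arrive as text
  latitude: string;
}

interface PointFeature {
  type: "Feature";
  geometry: { type: "Point"; coordinates: [number, number] };
  properties: { name: string };
}

// Hypothetical helper: returns null when coordinates are missing or not numeric.
function toFeature(brewery: BreweryPoint): PointFeature | null {
  const lon = parseFloat(brewery.longitude);
  const lat = parseFloat(brewery.latitude);
  if (isNaN(lon) || isNaN(lat)) {
    return null;
  }
  return {
    type: "Feature",
    // GeoJSON positions are ordered [longitude, latitude].
    geometry: { type: "Point", coordinates: [lon, lat] },
    properties: { name: brewery.name },
  };
}
```

Returning `null` for rows without coordinates lets a caller skip them instead of plotting breweries at a default location.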

Add US Census data

Use the Geocoding Census data and store block and tract in the DB. This data will allow us to map anything about the American public to the data.

Website: https://geocoding.geo.census.gov/

Examples:

  • how many lawyers work near breweries in New York
  • the average income of neighborhoods around breweries

Add Breweries with missing/not enough data

Is your feature request related to a problem? Please describe.
There may be some breweries where there isn't enough available data (like planning or nano types for example) and currently, these aren't being included. However, that does not make them invalid or breweries that should not be known (my opinion). Additionally, it may be helpful to see a list of breweries that need information, so in the future, when more information about a particular brewery becomes available, it can be updated.

Describe the solution you'd like
It would be great if there was a section where breweries with missing or not enough data can be placed for future use.

Describe alternatives you've considered
At the moment, I've got my own helper repository to keep track of the ones that are missing: breweries_with_missing_data.json. It would be great to keep this information within the main repo though.

🇬🇧 UK Schema

Is your feature request related to a problem? Please describe.
It would be great to have UK Breweries on the DB too!

Directory Structure
In the UK we have all the individual countries then individual county areas, so what would be the best directory tree for this?
Would something like:

openbrewerydb/data/uk/scotland/west-dunbartonshire.csv
openbrewerydb/data/uk/scotland/argyll-bute.csv

or

openbrewerydb/data/uk/west-dunbartonshire/west-dunbartonshire.csv   
openbrewerydb/data/uk/argyll-bute/argyll-bute.csv

or

openbrewerydb/data/uk/scotland/west-dunbartonshire/west-dunbartonshire.csv   
openbrewerydb/data/uk/scotland/argyll-bute/argyll-bute.csv

be best?

CSV Structure
Additionally, what would be the best layout for the CSV? Maybe:
id,name,brewery_type,street,city,**county**,postal_code,website_url,phone,created_at,updated_at,country,longitude,latitude,tags

Info
Unfortunately, SIBA doesn't provide a great data source, but I'm hoping to start the pull from there, then try to automate some further parts.

Add Brewery Data Change Manager

Continuing the conversation from #12.

Ideally, this would be a UI that opens up the main dataset file, allows you to directly edit/add/delete entries, saves and submits this as a Pull Request to the dataset repository.

Questions we'd like answered

  • Where should this UI live?
  • Use 3rd party tool to manage VCS or build our own?

Fix how international phone numbers are handled

Describe the bug
Currently, international phone numbers do not seem to be handled properly. If they are too long, they are treated like a long scientific number which ends up looking like this: "phone":"3.53599E+11"

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://github.com/openbrewerydb/openbrewerydb/blob/master/data/ireland/laois.csv
  2. See the phone number on the first line

Expected behavior
We would expect phone numbers to be handled appropriately. This value should be +353599107299

OS is irrelevant as this is an issue with the data itself.

Context
For Ireland, the format is country code + region number/mobile prefix + 7-digit number. Example: +353 21 123 4567. The region number is similar to the US area code. It can be 3 digits, with a 0 prefix if the country code is not included (local dialing).

This is also not limited to Ireland. For example, Germany has phone numbers with a max of 11 digits (not including country code).
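
The `3.53599E+11` value looks like what happens when a long digit string is treated as a number and rendered in scientific notation. That cause is an assumption, not something the issue confirms. The minimal sketch below reproduces the effect and shows two hypothetical helpers, `looksLikeScientific` and `normalizePhone`, that keep phone values as strings.

```typescript
// Treating the digits as a number loses the original formatting.
const asNumber = Number("353599107299");
const rendered = asNumber.toExponential(5); // "3.53599e+11"

// Hypothetical check: flags cells that look like scientific notation.
function looksLikeScientific(cell: string): boolean {
  return /^\d+(\.\d+)?[eE]\+?\d+$/.test(cell.trim());
}

// Hypothetical normalizer: keeps a leading "+" and the digits, always as a string.
function normalizePhone(cell: string): string {
  const trimmed = cell.trim();
  const digits = trimmed.replace(/\D/g, "");
  return trimmed.charAt(0) === "+" ? "+" + digits : digits;
}
```

Once a value has been stored as `3.53599E+11`, the dropped digits cannot be recovered from the cell, so a check like `looksLikeScientific` can only flag rows for manual correction.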

CLI tool to create obdb_id

The obdb_id is essentially a kebab case combination of the brewery name and the city. This should handle most of the cases but let's see.

Notes

  • Might make use of the notebook
  • Perhaps there needs to be more additions to the ID for it to be unique?
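
A kebab-case combination of name and city can be sketched as follows. This is a minimal illustration with a hypothetical helper, `toObdbId`, and is not the CLI tool itself. The accent-stripping step is an added assumption about how non-ASCII Latin letters should be handled.

```typescript
// Hypothetical helper: kebab-case combination of brewery name and city.
function toObdbId(breweryName: string, city: string): string {
  return (breweryName + " " + city)
    .normalize("NFKD")               // split accented letters into base letter + mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")     // any other run of characters becomes one hyphen
    .replace(/^-+|-+$/g, "");        // trim leading and trailing hyphens
}

const exampleId = toObdbId("Example Brewing Co.", "San Diego");
// exampleId is "example-brewing-co-san-diego"
```

Two breweries with the same name in the same city would produce the same ID under this scheme, which is the uniqueness concern raised in the notes. Names written entirely in a non-Latin script reduce to an empty string here, so they would need separate handling.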

International translations

We want to handle international translations in the dataset because not everything is English.

From the Discord thread:

Resten - Today at 2:30 AM
this is my personal opinion.
How about separating English and foreign names by column?
Currently, the method added in Korean is 맥파이(Magpie), so it would be better to manage it separately for future use.

@chrisjm - Today at 8:20 PM
@resten Thank you for the suggestion! This is a great idea and I'm still mulling it over. Perhaps a better solution is to have another linking translations table. I also think the names (and any other field) in the DB is reflective of the native language depending on the country. So in the Korean case, the English names should go in the translation table.
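
One possible shape for the linking translations table mentioned in the thread is sketched below. The thread does not settle on a design, so the `Translation` fields and the `translated` helper are hypothetical, as is the `example-id` value.

```typescript
// Hypothetical row of a linking translations table.
interface Translation {
  breweryId: string; // id of the brewery the translation belongs to
  field: string;     // translated column, e.g. "name"
  locale: string;    // language tag, e.g. "en"
  value: string;     // translated text
}

// Returns the translated value, or the native-language fallback if none exists.
function translated(
  rows: Translation[],
  breweryId: string,
  field: string,
  locale: string,
  fallback: string
): string {
  for (const row of rows) {
    if (row.breweryId === breweryId && row.field === field && row.locale === locale) {
      return row.value;
    }
  }
  return fallback;
}

// Illustrative data: the native name stays in the main table,
// and the English name lives in the translations table.
const translations: Translation[] = [
  { breweryId: "example-id", field: "name", locale: "en", value: "Magpie" },
];
const englishName = translated(translations, "example-id", "name", "en", "맥파이");
```

Keeping translations in a separate table means any field can be translated without adding one column per language to the main CSV.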

Export: CSV

Export /data to /breweries.csv

Dependencies

  • I chose Papaparse because of the small footprint (249kB) compared to csvtojson (8.69MB).

Some Bad Characters in Data: ยฉ -> ๏ฟฝ or ยฉ -> รขย€

Describe the bug
There seems to be some bad characters in the dataset. For example, Anheuser-Busch Inc establishments are all across the US. I think the dataset meant to have the ยฉ copyright character, but bad decoding changed it to รข๏ฟฝ or ๏ฟฝ.
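
Rows affected by this kind of decoding problem can be located with a heuristic scan. The sketch below is an assumption about what to look for, not a fix used by the project: it flags the Unicode replacement character and the "â" or "Â" sequences that typically appear when UTF-8 bytes are read as a single-byte encoding. The helper names are hypothetical.

```typescript
// Heuristic patterns for text that was decoded with the wrong character set.
// \ufffd is the Unicode replacement character; \u00e2 is "â" and \u00c2 is "Â".
const SUSPECT = /\ufffd|\u00e2[\u0080-\u00bf\u20ac]|\u00c2[\u0080-\u00bf]/;

// Hypothetical helper: true when a cell contains a suspect sequence.
function looksMisdecoded(cell: string): boolean {
  return SUSPECT.test(cell);
}

// Returns the indexes of rows with at least one suspect cell.
function findSuspectRows(rows: string[][]): number[] {
  const hits: number[] = [];
  rows.forEach((cells, index) => {
    if (cells.some(looksMisdecoded)) {
      hits.push(index);
    }
  });
  return hits;
}
```

Legitimate accented text such as "é" on its own is not matched, but the patterns are heuristic, so flagged rows still need a manual look.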

To Reproduce
Steps to reproduce the behavior:

  1. you can see results using the api endpoint
  2. also visible in the dataset, row 12

Expected behavior
A clear and concise description of what you expected to happen.

Desktop (please complete the following information):

  • OS: macOS 10.15.7 (19H2)
  • Browser: Chrome Version 95.0.4638.69 (Official Build) (x86_64)

Smartphone (please complete the following information):

  • Device: iPhone 11 Pro Simulator
  • OS: iOS 14.3
  • Flutter application: Flutter version 2.5.1

BREAKING CHANGES: Update database schema

Tasks

  • Rename obdb_id to slug
  • Update id to use UUID

Notes

  • This should be done after the versioning is migrated (it might already have been)
  • The API will also need to be updated

Broken search by keyword and autocomplete queries

Describe the bug
The search by keyword and autocomplete queries return an empty array.

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://www.openbrewerydb.org/documentation
  2. Scroll down to Search Breweries
  3. Click submit button for the example queries

Expected behavior
A network response of an array of brewery info objects.

Desktop (please complete the following information):

  • OS: iOS
  • Browser chrome
  • Version 108.0.5359.124

Question: How to handle closed breweries?

Overview

I realized I'm not accounting for closed breweries. I think I'd like to keep them in the dataset for historical and analytical reasons, but I'm curious how best to do this.

Options

  1. Add a "closed" brewery type
  2. Add a new boolean field/column for "Active" or something similar

Other options?
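
The two options can be compared as record shapes. This is a minimal sketch for discussion only: the interface names, the `active` column name, and the filter helpers are all hypothetical.

```typescript
// Option 1: "closed" becomes another brewery type, replacing the original type.
interface BreweryWithClosedType {
  name: string;
  brewery_type: string; // e.g. "nano", "planning", or "closed"
}

// Option 2: the type is kept and a separate boolean records the status.
interface BreweryWithActiveFlag {
  name: string;
  brewery_type: string;
  active: boolean; // hypothetical column name
}

// Open breweries under option 1: everything whose type is not "closed".
function openUnderOption1(list: BreweryWithClosedType[]): BreweryWithClosedType[] {
  return list.filter((b) => b.brewery_type !== "closed");
}

// Open breweries under option 2: everything still marked active.
function openUnderOption2(list: BreweryWithActiveFlag[]): BreweryWithActiveFlag[] {
  return list.filter((b) => b.active);
}
```

Under option 1 a closed brewery no longer records what type it used to be, while option 2 keeps that information available for historical analysis at the cost of an extra column.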

Discussion: Scraping for brewery data

While we want the community to help to keep brewery data up to date, it might be easier / more efficient to set up some scraping scripts. There are several things we want to take into consideration when doing this:

  • What to scrape? (brewery guilds? brewery associations? google places? yelp?)
  • Ethical scraping (i.e., getting permission, not overloading the server, using APIs when available, etc.)
  • How to automate this process? (Consul? Airflow? other?)
