access-conf-data's Introduction

The program data has been marshalled in a Google Sheet, with one worksheet per conference (currently starting with 1996). The worksheets have been exported into the csv directory with names like 1996.csv. Until the data format is finalized, I consider the Google Sheet to be the authoritative source and the csv files to be working derivatives. To make changes, update the Google Sheet and download new csv files, until the decision is made to treat the csv files as the authoritative source.

List of conferences and sources: Access Conference History

The only non-flat field in the csv is the speakers column, which can contain multiple speakers (pipe-separated). Individual speakers are listed in the form name (institution). Not all speakers have institutions. Institution names have not been normalized, so the same institution appears in many different forms (University of Alberta, U of Alberta, University of Alberta Libraries, etc.).
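A minimal sketch of splitting one speakers cell into its parts, assuming the format described above. The `Speaker` struct and `parse_speakers` helper are hypothetical names, not part of the repository's scripts, and the sample names are made up.

```ruby
# Hypothetical helper: split a speakers cell into name/institution pairs.
Speaker = Struct.new(:name, :institution)

def parse_speakers(cell)
  cell.to_s.split('|').map(&:strip).reject(&:empty?).map do |entry|
    # A trailing "(...)" is taken as the institution; it is optional.
    if (m = entry.match(/\A(.*?)\s*\(([^()]*)\)\z/))
      Speaker.new(m[1].strip, m[2].strip)
    else
      Speaker.new(entry, nil)
    end
  end
end

speakers = parse_speakers('Jane Doe (University of Alberta) | John Smith')
speakers.each { |s| puts "#{s.name} -> #{s.institution.inspect}" }
```

Entries whose institution itself contains parentheses do not match the pattern and come back with a nil institution, so they would need manual review.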

The schedule times are preserved, and are generally in the form 9:00 - 10:00 (with many variations), always on the 12-hour clock.

Geocoding of institutions is a multi-step chain. First, the non-normalized institution names from the program are gathered into institutions.json by gather-institutions.rb.

institutions.json entries look like this:

"Emory University, Atlanta, GA": {
    "city": "Atlanta",
    "type": null,
    "ignore": false
  },

The city property is added manually. It will be used for geocoding by gather-places.rb, so it needs to be specific enough for a lookup. Big cities work on their own (like Toronto); smaller or ambiguous ones need more detail (Victoria, BC, Canada).
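Since the city is filled in by hand, it can help to list the entries that still need one. The sketch below assumes that `ignore: true` marks entries to skip and that a missing city is stored as null or an empty string; neither is confirmed by the scripts. `missing_city` is a hypothetical helper, and the "TBA" entry is invented sample data.

```ruby
require 'json'

# Sample data in the shape shown above. In practice this would come from
# JSON.parse(File.read('institutions.json')) (file location assumed).
institutions = JSON.parse(<<~JSON)
  {
    "Emory University, Atlanta, GA": { "city": "Atlanta", "type": null, "ignore": false },
    "U of Alberta": { "city": null, "type": null, "ignore": false },
    "TBA": { "city": null, "type": null, "ignore": true }
  }
JSON

# Hypothetical helper: names of non-ignored entries with no city yet.
def missing_city(institutions)
  institutions.reject { |_name, entry| entry['ignore'] }
              .select { |_name, entry| entry['city'].to_s.strip.empty? }
              .keys
end

puts missing_city(institutions)
```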

Sample code for geocoding places:

# Excerpt from the body of a loop over the keys of source_data
s = source_data[key]
if mappings.key?(s['city'])
  # Already geocoded on an earlier pass: reuse the cached address
  s['address'] = mappings[s['city']]
else
  result = Geocoder.search(s['city']).first
  unless result
    puts "Not found: #{s['city']}"
    next
  end
  place = {
    address: result.address,
    city: result.city.to_s,
    state: result.state,
    country: result.country,
    lat: result.coordinates[0],
    lon: result.coordinates[1]
  }
  # Cache the lookup so the same city string is not geocoded again
  mappings[s['city']] = result.address
  new_places << place[:city]
  master_place_data[result.address] = place
  puts "#{s['city']}: #{result.address}"
end

Sample parsing of program times:

require 'time' # Time.parse comes from the time standard library

# we assume times earlier than 8:00 are pm
eightoclock = Time.parse('8:00')
twelvehours = 12 * 60 * 60 # in seconds

...

# keep only digits and colons, then split into start and finish
times = row['time'].gsub(/[^0-9:]/, ' ').gsub(/\s+/, ' ').strip.split(' ')
start = Time.parse(times[0])
finish = Time.parse(times[1])

# convert pm times to 24 hour format
start += twelvehours if start < eightoclock
finish += twelvehours if finish < eightoclock

duration = (finish - start) / 60 # in minutes
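For experimenting with the heuristic outside the script, here is a self-contained variant. It is a sketch, not the repository's code: it works in integer minutes instead of `Time.parse`, and the names `PM_CUTOFF_HOUR`, `to_minutes`, and `session_minutes` are hypothetical. It keeps the same assumption that times before 8:00 are pm.

```ruby
# Assumption carried over from the snippet above: hours before 8 are pm.
PM_CUTOFF_HOUR = 8

# Convert a 12-hour "h:mm" string to minutes since midnight.
def to_minutes(clock)
  hours, minutes = clock.split(':').map(&:to_i)
  hours += 12 if hours < PM_CUTOFF_HOUR
  hours * 60 + minutes
end

# Duration in minutes of a time field such as "9:00 - 10:30".
# Returns nil when two h:mm times cannot be found.
def session_minutes(time_field)
  start, finish = time_field.to_s.scan(/\d{1,2}:\d{2}/).first(2)
  return nil unless start && finish

  to_minutes(finish) - to_minutes(start)
end

puts session_minutes('9:00 - 10:30')
puts session_minutes('11:30 - 1:00')
```

An evening session such as 7:00 - 9:00 comes out negative under this heuristic, because 9:00 is read as morning. A negative duration is therefore a useful flag for rows that need manual review.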

access-conf-data's People

Contributors

pbinkley, dependabot[bot], jamesrf, murny
