This project forked from fitnr/savi-750

Course guide and syllabus for Mining the Web (Fall 2018)

Home Page: https://fitnr.github.io/savi-750/

XSAVI 750 – Mining the Web: How to Scrape, Analyze & Map Open Data

Pratt Institute, Center for Continuing and Professional Studies Spatial Analysis and Visualization Initiative (SAVI)

Instructor: Neil Freeman

Location: ISC Building, Lower Level, Room 003

Continuing Education Units (C.E.U.s): 3.0


Table of Contents


Administrative Details

Course Overview

This course introduces the tools, techniques, and general approaches used to acquire, clean, analyze, and visualize open data, with particular emphasis on using web-based technologies and open-source tools at each step of the process.

Learning objectives

  • You will learn to formulate and articulate a meaningful research question with public open data, as well as meaningfully critique the work of others
  • You will learn how to acquire data through open data portals, application programming interfaces (APIs), and web scraping
  • You will learn how to clean data using open source tools in preparation for analysis
  • You will learn how to conduct exploratory data analysis using descriptive statistics
  • You will learn to visualize your analytical findings in meaningful and visually engaging graphics, as well as meaningfully critique the work of others
  • You will learn the basics of cartographic design as it relates to visualizing open data

Course Requirements

All students will need to bring their own laptop for exercises during class. Time will be set aside to help install, configure, and run the programs necessary for all assignments, projects, and exercises. Where possible, all programs will be free and open-source. All assigned work using services hosted online can be run using free accounts. Please update your system to the latest version of your preferred operating system before the first day of class so that you can install and use the tools in class.

You will be required to have a free account with CARTO.

In addition, please install the text editor of your choice. Some (free) suggestions:

Course Readings

The required readings for this course consist of book chapters, newspaper articles, and short blog posts. The intention is to help give you a foundation in the critical skills ahead of class lectures. All required readings are available online or will be made available through the class portal. Recommended readings are suggestions if you wish to study further the topics covered in class. The books listed in the Suggested Readings section below offer even more depth and an extended discussion of the material we cover in class.

Class Format

Class runs from 9:30am to 5:30pm. Each day will consist of 80- to 90-minute blocks broken up by 10-minute breaks and a half-hour break for lunch. Class will be a mix of lecture and exercise work, emphasizing the application of skills covered in the lecture portion of the class. You will have ample time in class to work on practical exercises based on the information presented in lectures.

Submitting Assignments

All assignments will be submitted to lms.pratt.edu. Assignments must be submitted by 9 pm on the Friday before class.

Assessment

Area Total Points
class participation 25
weekly critiques 25
weekly projects 25
final project 25
Total 100

Attendance

Regular, prompt attendance is required.

Participation

Your engagement makes class sessions richer and more fulfilling for everyone. Questions are encouraged, and active participation in class discussion and in-class exercises is very important.

Course Outline

Each session lists the topics that will be covered in class that day. Readings are to be completed before class in preparation for the lecture and exercises. Assignments are due before the start of the next class and build on the information presented in class.

Weekly critiques

Find an interesting or visually compelling map (interactive or static) or visualization online and write 2-3 paragraphs on the visualization, discussing the data source(s), the visual style, the creator's goals and audience, and how well the data was represented. Feel free to use the visualization resources listed below. Submit your analysis (include a link to the visualization) to this repository before each class. Come prepared to informally present the project to your classmates.


  • Introduction
  • Open data
  • Introduction to mapping and cartography
  • Introduction to CARTO
  • Introduction to HTML and CSS
  • Introduction to the Unix command line

Assignment

  1. Complete the CARTO “Online Mapping for Beginners” course.
  2. Identify a research question that you would like to explore in this class, with the intention of creating maps and visualizations that will help answer the question or clarify the topic.
    • Write a short summary of your topic. What questions would you like to answer? What audience would you like to reach? What data would you like to explore?
    • Create a basic CARTO map with one data layer that connects to your topic.
    • Embed the map in a basic HTML document with your write-up.
    • Write a paragraph describing the dataset: what data it contains, who created it, and how and why they did so. Use information from the dataset's metadata. If the metadata is incomplete, do additional research, and give your best guess where you can't find an answer.
  3. On the same page, include the link to an interesting map or visualization and add your weekly critique.

Readings

Suggested Reading

  • Manual Web scraping
  • Introduction to APIs
  • Data types and formats
  • Parsing data with csvkit
  • Introduction to the Census Factfinder

Assignments

  1. Complete the SQL and PostGIS in CARTO course. Update your map (or create a new one) using data joined from two sources.
  2. Create a second map, using new data scraped from the web or pulled via an API.
    • Embed the map in a new HTML document.
    • Include a paragraph discussing any challenges you encountered working with the data and/or creating your map in CARTO.
  3. Weekly critique
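
As a rough illustration of the "pulled via an API" step above, here is a minimal Python sketch that converts a JSON response into CSV text that could then be uploaded to a mapping tool. The sample payload, its field names, and the `json_to_csv` helper are all hypothetical; a real API will return a different structure, and the response would normally come from an HTTP request rather than a string in the script.

```python
import csv
import io
import json

# Hypothetical API response (made-up records for illustration only).
SAMPLE_RESPONSE = """
[
  {"name": "Station A", "lat": 40.69, "lon": -73.96, "count": 12},
  {"name": "Station B", "lat": 40.71, "lon": -73.99, "count": 7}
]
"""


def json_to_csv(text, fields):
    """Convert a JSON array of objects into CSV text with the given columns."""
    records = json.loads(text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # silently drop keys that are not in `fields`
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


csv_text = json_to_csv(SAMPLE_RESPONSE, ["name", "lat", "lon", "count"])
print(csv_text)
```

Keeping the column list explicit makes it easy to drop fields you don't need before uploading the file.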

Readings

  • Introduction to SQL and spatial SQL
  • Introduction to Python
  • Opening closed data with Tabula

Assignments

  1. Work through "The Basics" at Learn Python. You can skip "String Formatting"; if you're feeling good, jump ahead to "List Comprehensions".
  2. Update your interactive map to include data that you've joined, filtered or modified with an SQL query. Plan a 10-minute presentation explaining the topic your map addresses, the data sources you used, and your methodology.
  3. Weekly critique
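
The "joined, filtered or modified with an SQL query" step can be sketched as follows. This is only an illustration: it uses SQLite from the Python standard library as a stand-in for the PostgreSQL/PostGIS database behind CARTO, and the table names, column names, and values are made up.

```python
import sqlite3

# In-memory database with two hypothetical tables that share a tract_id key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tracts (tract_id TEXT PRIMARY KEY, borough TEXT);
CREATE TABLE population (tract_id TEXT, residents INTEGER);
INSERT INTO tracts VALUES ('001', 'Brooklyn'), ('002', 'Brooklyn'), ('003', 'Queens');
INSERT INTO population VALUES ('001', 4200), ('002', 900), ('003', 3100);
""")

# Join the two tables on the shared key, then keep only the larger tracts.
query = """
SELECT t.tract_id, t.borough, p.residents
FROM tracts AS t
JOIN population AS p ON p.tract_id = t.tract_id
WHERE p.residents > ?
ORDER BY p.residents DESC
"""
rows = conn.execute(query, (1000,)).fetchall()
conn.close()

for row in rows:
    print(row)
```

The same JOIN and WHERE clauses carry over to PostgreSQL, although the placeholder syntax and any spatial functions will differ.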

Readings

  • Class presentations
  • Python for scraping the web
  • Quantitative maps on the web
  • Review
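
For a sense of what scraping with Python can look like, here is a minimal sketch that pulls the rows out of an HTML table using only the standard library. The HTML snippet and its values are made up, and `TableParser` is a hypothetical helper; the tools actually used in class may differ.

```python
from html.parser import HTMLParser

# Made-up HTML standing in for a page you might scrape.
SAMPLE_HTML = """
<table>
  <tr><th>Neighborhood</th><th>Trees</th></tr>
  <tr><td>Clinton Hill</td><td>1520</td></tr>
  <tr><td>Fort Greene</td><td>1310</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row being read, or None
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None


parser = TableParser()
parser.feed(SAMPLE_HTML)
parser.close()
rows = parser.rows

for row in rows:
    print(row)
```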

Assignments

  1. Make any desired revisions to your map. Your final project should be embedded on an HTML page that includes an introduction and description of your topic, as well as a description of your process and methodology.

Readings


Resources

Answering questions

Working with data

Command line

APIs

Python

HTML & CSS

Javascript

(Some) Open Data Sources

Cartography

Visualizations and maps

Selected interactive maps

Sublime Text

Reference

Suggested Reading

  • Fry, Ben. Visualizing Data: Exploring and Explaining Data with the Processing Environment. O'Reilly Media, Inc., 2007.
  • Garrard, Chris. Geoprocessing with Python. Manning Publications Co., forthcoming.
  • Janert, Philipp K. Data Analysis with Open Source Tools. O'Reilly Media, Inc., 2010.
  • McCallum, Q. Ethan. Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work. O'Reilly Media, Inc., 2012.
  • Munzner, Tamara. Visualization Analysis and Design. AK Peters, 2014.
  • Murray, Scott. Interactive Data Visualization for the Web. O'Reilly Media, Inc., 2013.
  • Tufte, Edward R., and P. R. Graves-Morris. The Visual Display of Quantitative Information. Vol. 2. Cheshire, CT: Graphics Press, 1983.

Precedents

This course builds from material prepared by Richard Dunks under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
