This course will have three themes, described below.
Exercises from each theme count for 1/3 of the final grade. Each exercise is to be delivered as a single IPython notebook and will be graded from 0 to 5 on each of the following equally weighted criteria:
- 1. Completion: it contains what was requested
- 2. Clean code: the code is clean, commented and understandable
- 3. Expressivity: exploits the notebook's documentation capabilities to offer a sufficiently detailed description
- 4. Difficulty: of the problem/dataset/algorithm/approach chosen
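As a worked illustration of the weighting described above, here is a minimal sketch. It assumes that an exercise score is the plain average of the four criteria and that exercises within a theme are averaged before the themes are combined; the second point is an assumption, since the text does not say how exercises within a theme are combined. The function names are hypothetical.

```python
from statistics import mean


def exercise_score(completion, clean_code, expressivity, difficulty):
    """Average the four equally weighted criteria, each graded from 0 to 5."""
    scores = [completion, clean_code, expressivity, difficulty]
    if not all(0 <= s <= 5 for s in scores):
        raise ValueError("each criterion must be between 0 and 5")
    return mean(scores)


def final_grade(theme_exercises):
    """Combine three themes, each counting for 1/3 of the final grade.

    theme_exercises holds one list of exercise scores per theme.
    Assumption: the exercises inside a theme are simply averaged.
    """
    if len(theme_exercises) != 3:
        raise ValueError("expected exactly three themes")
    return mean(mean(scores) for scores in theme_exercises)


one_exercise = exercise_score(completion=5, clean_code=4, expressivity=3, difficulty=4)
course = final_grade([[4, 5], [3], [4.5, 3.5]])
print(one_exercise, round(course, 2))
```

The draft penalty and the Guane bonus are left out on purpose, because the text does not specify how they combine with these averages.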
31/Mar: Drafts for exercises Theme 1
17/Apr: Exercises Theme 1: 1A + 1B
3/May: Drafts for exercises Theme 2
23/May: Exercises Theme 2
2/Jun: Drafts for exercises Theme 3
17/Jun: Exercises Theme 3
17/Jun: Classes end
20/Jun: Grade submission closes
Drafts must include a description of the chosen use case (dataset, algorithm, etc.) for each exercise and a first approach to the solution, even if incomplete. Failure to deliver drafts will incur a 50% penalty in the theme grade.
1.1 Introduction to Jupyter Notebooks: Learn Python - Wakari Notebooks Gallery - Numpy Quickstart - Pandas Cookbook
1.2 Libraries (Matplotlib, Bokeh, Plotly): Matplotlib Gallery - Bokeh notebooks Gallery - Plotly notebooks Gallery
1.3 Interactivity and Data Streaming: Bokeh Widgets - Bokeh server examples - Linking and brushing
1.4 Big Data Visualization: Big Data plotting problems - Plotting NYC taxi data - Datashader
EXERCISE 1.A: Tell a story through data visualization in one notebook. The story must use:
- Pandas and Numpy for data loading and cleaning (see Pandas Cookbook - Chapter 7)
- Matplotlib and Plotly for creating several views of the same data (contours and scatters), sub-sampling the data, and showing data interpolations
- Bokeh for interactive multi-selection in subplots and interactive widgets
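The loading, cleaning and sub-sampling step of Exercise 1.A could start along these lines. This is only a sketch: the inline CSV, its column names and the `load_and_clean` helper are made up for illustration and stand in for whichever dataset you choose. Plotting is left out.

```python
import io

import pandas as pd

# Hypothetical raw data with two unusable temperature readings.
RAW_CSV = """timestamp,temperature,station
2016-03-01 00:00,12.5,A
2016-03-01 01:00,error,A
2016-03-01 02:00,13.1,B
2016-03-01 03:00,,B
2016-03-01 04:00,14.0,A
"""


def load_and_clean(source):
    """Read the CSV, coerce readings to numbers and drop unusable rows."""
    df = pd.read_csv(source, parse_dates=["timestamp"])
    # Non-numeric entries such as "error" become NaN instead of raising.
    df["temperature"] = pd.to_numeric(df["temperature"], errors="coerce")
    return df.dropna(subset=["temperature"]).reset_index(drop=True)


clean = load_and_clean(io.StringIO(RAW_CSV))

# Reproducible random sub-sample, useful before plotting large datasets.
subsample = clean.sample(n=2, random_state=0)
print(clean)
print(subsample)
```

Passing a file path or URL instead of the `StringIO` object is enough to point the same helper at real data.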
EXERCISE 1.B: Use datashader to build a meaningful visualization of at least 1 million data points
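The idea that makes Exercise 1.B feasible is to aggregate points into a fixed-size grid before drawing anything, so the cost of rendering depends on the image size and not on the number of points. The sketch below illustrates that idea with plain NumPy; it is a stand-in for the concept, not datashader's own API, and the synthetic data and the `rasterize_counts` helper are assumptions made for illustration.

```python
import numpy as np


def rasterize_counts(x, y, width=300, height=200):
    """Bin points into a height x width grid of counts."""
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=(width, height))
    # histogram2d indexes as [x, y]; transpose so rows correspond to y.
    return counts.T, x_edges, y_edges


# Synthetic data: one million correlated points (illustrative only).
rng = np.random.default_rng(0)
n_points = 1_000_000
x = rng.normal(size=n_points)
y = 0.5 * x + rng.normal(scale=0.5, size=n_points)

counts, x_edges, y_edges = rasterize_counts(x, y)

# Log scaling keeps sparse regions visible next to very dense ones.
image = np.log1p(counts)
print(image.shape, int(counts.sum()))
```

The resulting `image` array can be shown with any image plotting call. Datashader performs the same kind of aggregation and colour mapping for you and adds further tooling around it.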
EXERCISE 1.C: Select time series data and build a data streaming example, using Bokeh or Plotly
SOME DATA SOURCES: https://data.cityofboston.gov/ - https://data.nasa.gov/ - https://data.cityofchicago.org/ - http://transtats.bts.gov/ - http://catalog.data.gov/ - http://www.kaggle.com. Example story: Boston data
2.1 Symbolic Computing: [SymPy Website](http://www.sympy.org/) - SymPy Tutorial - Lecture Notes on SymPy
2.2 Solving ODEs: ODEs in SymPy - Introduction to Differential Equations
2.3 Numerical methods: SciPy Cookbook - SciPy Tutorial
2.4 Animating mathematical models: Matplotlib animations
EXERCISE 2.A: Choose a problem solvable with ODEs:
- Define, solve and visualize a mathematical model
- If you use a numerical method, show first that the symbolic solver fails.
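One way to organise the "try the symbolic solver first" step of Exercise 2.A is sketched below. The exponential decay equation and the `try_symbolic` helper are illustrative assumptions, not part of the exercise statement. Note that `dsolve` does not always signal failure with an exception; it may also return an implicit or unevaluated result, which you would need to inspect yourself.

```python
import sympy as sp

t = sp.symbols("t")
k = sp.symbols("k", positive=True)
y = sp.Function("y")


def try_symbolic(ode, func):
    """Return a symbolic solution, or None if SymPy cannot classify the ODE."""
    try:
        return sp.dsolve(ode, func)
    except NotImplementedError:
        return None


# Illustrative model: exponential decay, dy/dt = -k * y.
decay = sp.Eq(y(t).diff(t), -k * y(t))
solution = try_symbolic(decay, y(t))

if solution is None:
    print("No closed form found; fall back to a numerical method.")
else:
    # checkodesol substitutes the solution back into the equation.
    print(solution, sp.checkodesol(decay, solution))
```

When the helper returns `None`, that outcome is the evidence the exercise asks for before switching to a numerical integrator such as those in `scipy.integrate`.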
EXERCISE 2.B: Use matplotlib animations to:
- Create two animations illustrating key aspects of your model.
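A minimal sketch of a Matplotlib animation for Exercise 2.B follows. The damped oscillation used as the model, and its parameter values, are assumptions chosen only to have something to draw; replace them with the solution of your own model.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Hypothetical model output: a damped oscillation sampled on a time grid.
t = np.linspace(0.0, 10.0, 400)
position = np.exp(-0.3 * t) * np.cos(2.0 * np.pi * 0.5 * t)

fig, ax = plt.subplots()
ax.set_xlim(t[0], t[-1])
ax.set_ylim(-1.1, 1.1)
ax.set_xlabel("time")
ax.set_ylabel("position")
(line,) = ax.plot([], [])


def update(frame):
    """Draw the trajectory up to and including the given frame index."""
    line.set_data(t[: frame + 1], position[: frame + 1])
    return (line,)


# Keep a reference to the animation, otherwise it is garbage collected.
anim = FuncAnimation(fig, update, frames=len(t), interval=25, blit=True)
```

Inside a notebook, `anim.to_jshtml()` wrapped in `IPython.display.HTML` is one way to embed the result in the output cell.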
EXERCISE 2.C: Disseminate your work:
- Create a presentation from your 2.A notebook
- Use Git and nbviewer for publishing and sharing notebooks online.
3.1 Vectorizing functions: NumPy ufuncs
3.2 IPython parallelization: IPython parallel
3.3 Python as fast as C: Numba - Cython
3.4 Theano (fast symbolic computing): Theano
EXERCISE 3.A: Vectorization and just-in-time compilation (in ProblemSet 03 A)
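To make the vectorization half of Exercise 3.A concrete, the sketch below writes the same computation twice: once as an explicit Python loop and once with NumPy ufuncs operating on whole arrays. The Gaussian density is an arbitrary example, not the function from the problem set.

```python
import math

import numpy as np


def gaussian_loop(xs, mu=0.0, sigma=1.0):
    """Evaluate the normal density one element at a time."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    out = []
    for x in xs:
        z = (x - mu) / sigma
        out.append(math.exp(-0.5 * z * z) / norm)
    return out


def gaussian_vectorized(xs, mu=0.0, sigma=1.0):
    """Same formula, expressed with ufuncs that act on the whole array."""
    xs = np.asarray(xs, dtype=float)
    z = (xs - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))


samples = np.linspace(-3.0, 3.0, 10_001)
slow = gaussian_loop(samples)
fast = gaussian_vectorized(samples)
print(np.allclose(slow, fast))
```

Timing both versions, for instance with `%timeit` in a notebook, shows the gap that a just-in-time compiler such as Numba tries to close for code that cannot easily be vectorized.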
EXERCISE 3.B: Monte Carlo with IPython Parallel (in ProblemSet 03 B)
- Run your code on Guane at www.sc3.uis.edu.co (each exercise adds 5% to the total course grade, up to a maximum of 15%)
- Only valid for the following exercises: 1A, 2A, 3A, 3B
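For Exercise 3.B, a Monte Carlo computation parallelizes well when it is split into independent tasks, each with its own seed. The sketch below shows only that decomposition, using an estimate of pi as a stand-in problem and the built-in `map` in place of a parallel one; the function names and the `map_fn` parameter are hypothetical.

```python
def count_hits(task):
    """Count random points in the unit square that fall inside the quarter circle."""
    # Imported here so the function still works when shipped to a remote engine.
    import random

    seed, n_samples = task
    rng = random.Random(seed)  # independent, reproducible stream per task
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_tasks=8, samples_per_task=50_000, map_fn=map):
    """Estimate pi by combining the hit counts of independent tasks."""
    tasks = [(seed, samples_per_task) for seed in range(n_tasks)]
    total_hits = sum(map_fn(count_hits, tasks))
    return 4.0 * total_hits / (n_tasks * samples_per_task)


print(estimate_pi())
```

With a running IPython Parallel cluster, passing a view's `map_sync` method as `map_fn` should distribute the same tasks across engines without changing `count_hits`.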