Giter Site home page Giter Site logo

sql-intro-why-select-join-groupby-pandas-lp's Introduction

title layout weight hidden
SQL/Pandas
post
10
true

Lesson Plan Template:

Course: DS
Mod: Mod 2
Topic: SQL and Pandas
Amount of time: 2 hours
Author: Robert Jackson

Ported from the ds-lesson-starters repository here.


Lesson Summary:

Topic:

Using SQL and SQL syntax with pandas

Learn.co material:

https://docs.google.com/presentation/d/1u6PaCJTzxipFO63MoHlj_kB_ywBqFe1PPmRSx5ZXyy0/edit?usp=sharing

Prerequisite knowledge/ Prework:

  • Pandas dataframes

Learning goals for this lesson:

Construct SQL queries

  • SELECT data from table
  • INSERT data into tables
  • Create tables

Use JOIN to merge tables along logical columns

  • Create JOIN statements along INNER and OUTER joins
  • Create ONE-to-MANY and MANY-to-MANY joins

Grouping Data with SQL

  • Write queries with aggregate functions
  • Use GROUP BY to sort data

Query data from pandas dataframes using SQL

  • Import and utilize the 'sqlite3' library
  • Utilize SQL syntax in a Python envrionment to duplicate standard SQL queries

Misconceptions:

SQL and Python have to be used seperately.

Materials

  • (Learn.co Link)
  • Slides
  • Jupyter notebooks

terms schema, dimensions, facts, primary key, foreign key,

  • In general, databases store sets of data that can be queried for use in other applications. A database management system supports the development, administration and use of database platforms
  • The most basic RDBMS functions are related to create, read, update and delete operations, collectively known as CRUD.
  • An realtionsal database management system (RDBMS) is a type of DBMS with a row-based table structure that connects related data elements and includes functions that maintain the security, accuracy, integrity and consistency of the data.

Lesson Outline:

Step: Introduction to SQL
Time: 2 hours

Goal/Scenario:
You are a data analyst for a large company that is doing research on economic properties of various countries. The data you need to access is in a SQL database. YOu need to be able to query for the data in a database!

Learning Goals in sequence:
What are the objectives, or steps students will need to do in order to achieve the goal or solve the problem?

SQLite¶ SQLite is a popular open source SQL database. It can store an entire database in a single file. It is 'lite' because it is not server based. Does not have many features of server-based RDBMS like users and permissions. Great to get up and running quick, not good for complex projects.

Step: Activation
Time: 5 min

Before we can do any sort of modeling of the data, we need to retrieve it from a database. While there are many formats for databases, most moden examples utilize some flavor of SQL. As we continue through this lesson, consider:

  • What does it mean when we say SQL is 'declarative'?
  • Does anyone know what SQL is? How about what a relational database system is?
  • What is a 'query' language?

Step: Learning Goal 1: Be able to construct SQL queries
Time: 15 min

Demonstrate:

  • Create an example SQL query.
  • Discuss the structure of a SQL query's was
  • Discuss what a relational database system is.

Application:

  • Students constuct queries to retrieve specified information
  • Edit existing tables by deleting, changing, and adding information to tables
  • Studetns create SQL tables and populate them

Informal assessment:
Kahoot Poll - Multiple choice

  • Convert English queries into SQL queries

Step: Learning Goal 2: Use JOIN to merge tables along logical columns
Time: 20 min

Demonstrate:

  • Use JOIN to merge the results of two or more tables
  • Explain the struct of a JOIN clause
  • Explain the difference between ONE-to-MANY and MANY-to-MANY

Application:

  • Students construct queries that join tables along logical columns
  • Students use Inner and Outer jois to join same tables
  • Create MANY-to-MANY joins

Informal assessment:
Coding Exercise - Students attempt to JOIN several tables into single format

Step: Learning Goal 3: Grouping Data
Time: 20 min

Demonstrate:

  • Create example queries with aggreagate functions (MIN, MAX, etc.)
  • Show example of grouping data (using GROUP BY clause) in queries

Application:

  • Students construct queries that join tables along logical columns
  • Students use Inner and Outer jois to join same tables
  • Create MANY-to-MANY joins

Informal assessment:
Coding Exercise - Students attempt to JOIN several tables into single format

  • Use the test.db data from SQL

Step: Learning Goal 4: Utilize SQL syntax with Python
Time: 30 min

Demonstrate:

  • Overview the sqlit3 library
  • Demonstrate the creation of a table using the 'sqlite3' library
  • Overview the pandasql libary
  • Create SQL queries in Python
  • Demonstarte SQL syntax in action using the pandasql

Application:

  • Students connect to tables in sql using importeed packages
  • Students create tables in pandasql
  • Discuss the consequences of using SQL in a python environment

Informal assessment:
Kahoot poll - choose proper syntax for queries in python:

  • Multiple choice poll conducted anonomously

Step: Assessment:
Time: 30 min

In pairs of two, students will import data from SQLite using the sqlite3 library for Python. The data they'll be pulling will be the world.db data set, containin tables on the info of various countries. Students will use this data, in conjunction with pandasql, to query the data and craete visualizations in matplotlib on any aspect of the data the wish.

Step: Reflection:
Time: [ ] min

sql-intro-why-select-join-groupby-pandas-lp's People

Contributors

mathymitchell avatar

Watchers

 avatar  avatar

Forkers

jirvingphd

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.