
This project forked from learn-co-curriculum/dsc-dealing-with-categorical-variables-lab



Dealing with Categorical Variables - Lab

Introduction

In this lab, you'll explore the Ames Housing dataset for categorical variables, and you'll transform your data so you'll be able to use categorical data as predictors!

Objectives

You will be able to:

  • Determine whether variables are categorical or continuous
  • Use one-hot encoding to create dummy variables
  • Describe why dummy variables are necessary

Importing the Ames Housing dataset

Let's start by importing the Ames Housing dataset from ames.csv into a pandas DataFrame using pandas' read_csv().

# Import your data

Now look at the first five rows of ames:

# Inspect the first few rows
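
The loading and inspection steps above could look roughly like the sketch below. It assumes the DataFrame is named ames, as the text suggests. So that the snippet runs on its own, it reads a small in-memory stand-in with made-up rows rather than the real file; in the lab you would pass "ames.csv" to read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for ames.csv: six made-up rows, not real Ames records.
# In the lab, replace `sample_csv` with the path "ames.csv".
sample_csv = io.StringIO(
    "LotArea,MSZoning,OverallCond,KitchenQual,YrSold,SalePrice\n"
    "8450,RL,5,Gd,2008,208500\n"
    "9600,RL,8,TA,2007,181500\n"
    "11250,RM,5,Gd,2008,223500\n"
    "9550,FV,6,Ex,2006,140000\n"
    "14260,RL,5,TA,2009,250000\n"
    "7200,RM,7,Fa,2010,129000\n"
)

# Load the data into a DataFrame
ames = pd.read_csv(sample_csv)

# head() returns the first five rows by default
print(ames.head())
```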

Variable Descriptions

Look in data_description.txt for a full description of all variables.

A preview of some of the columns:

LotArea: Size of the lot in square feet

MSZoning: Identifies the general zoning classification of the sale.

   A	 Agriculture
   C	 Commercial
   FV	Floating Village Residential
   I	 Industrial
   RH	Residential High Density
   RL	Residential Low Density
   RP	Residential Low Density Park 
   RM	Residential Medium Density

OverallCond: Rates the overall condition of the house

   10	Very Excellent
   9	 Excellent
   8	 Very Good
   7	 Good
   6	 Above Average	
   5	 Average
   4	 Below Average	
   3	 Fair
   2	 Poor
   1	 Very Poor

KitchenQual: Kitchen quality

   Ex	Excellent
   Gd	Good
   TA	Typical/Average
   Fa	Fair
   Po	Poor

YrSold: Year Sold (YYYY)

SalePrice: Sale price of the house in dollars

Let's inspect all features using .describe() and .info().

# Use .describe()
# Use .info()
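
One possible way to use this output to separate categorical from continuous columns is sketched below. The data is synthetic (randomly generated, not the real Ames values), and the cutoff of 10 unique values is an arbitrary assumption for illustration. The point is that columns such as OverallCond and YrSold are stored as numbers yet behave like categories, so the dtype alone is not enough.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Ames data: random values, for illustration only.
rng = np.random.default_rng(0)
n = 200
ames = pd.DataFrame({
    "LotArea": rng.integers(5000, 15000, size=n),
    "MSZoning": rng.choice(["RL", "RM", "FV"], size=n),
    "OverallCond": rng.integers(1, 10, size=n),
    "KitchenQual": rng.choice(["Ex", "Gd", "TA", "Fa"], size=n),
    "YrSold": rng.integers(2006, 2011, size=n),
    "SalePrice": rng.integers(80000, 400000, size=n),
})

# Summary statistics for numeric columns, then dtypes and non-null counts
print(ames.describe())
ames.info()

# Non-numeric columns are categorical candidates by dtype
non_numeric = ames.select_dtypes(exclude="number").columns.tolist()

# Numeric columns with few distinct values may also be categorical.
# The threshold of 10 is an assumption, not a rule.
numeric = ames.select_dtypes(include="number")
low_cardinality = [col for col in numeric.columns if numeric[col].nunique() <= 10]

print("Non-numeric:", non_numeric)
print("Numeric but low-cardinality:", low_cardinality)
```

A count-based check like this is only a first pass; data_description.txt remains the authority on what each variable means.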

Plot Categorical Variables

Now, pick 6 categorical variables and plot them against SalePrice with a bar graph for each variable. All 6 bar graphs should be on the same figure.

import matplotlib.pyplot as plt
%matplotlib inline

# Create bar plots
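
A minimal sketch of one way to put six bar graphs on a single figure follows. The helper plot_mean_price is hypothetical, and plotting the mean SalePrice per category is one reasonable reading of "plot against SalePrice", not the only one. The demo data is randomly generated, and the column names BldgType and SaleType are assumptions, since they are not among the columns previewed above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_mean_price(df, cat_cols, target="SalePrice"):
    """Draw one bar chart of mean `target` per category for each of six columns."""
    fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(15, 8))
    for ax, col in zip(axes.flatten(), cat_cols):
        # Average the target within each category and draw it on this subplot
        df.groupby(col)[target].mean().plot(kind="bar", ax=ax)
        ax.set_title(col)
        ax.set_ylabel(f"Mean {target}")
    fig.tight_layout()
    return fig, axes


# Synthetic demo data: random values, for illustration only.
rng = np.random.default_rng(1)
n = 200
demo = pd.DataFrame({
    "MSZoning": rng.choice(["RL", "RM", "FV"], size=n),
    "OverallCond": rng.integers(1, 10, size=n),
    "KitchenQual": rng.choice(["Ex", "Gd", "TA", "Fa"], size=n),
    "YrSold": rng.integers(2006, 2011, size=n),
    "BldgType": rng.choice(["1Fam", "Duplex", "Twnhs"], size=n),  # assumed column
    "SaleType": rng.choice(["WD", "New", "COD"], size=n),         # assumed column
    "SalePrice": rng.integers(80000, 400000, size=n),
})

cat_cols = ["MSZoning", "OverallCond", "KitchenQual", "YrSold", "BldgType", "SaleType"]
fig, axes = plot_mean_price(demo, cat_cols)
```

In a notebook with %matplotlib inline, the figure is displayed when the cell finishes running.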

Create dummy variables

Create dummy variables for the six categorical features you chose, remembering to drop the first level of each. Then drop the original categorical columns, concatenate the dummy columns with the continuous variables, and assign the result to a new variable, ames_preprocessed.

# Create dummy variables for your six categorical features
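
The encoding steps described above might be sketched as follows, shown with three categorical columns instead of six to keep it short. The rows are made up for illustration. Passing columns= to pd.get_dummies() makes it encode numeric-coded categories such as OverallCond too, which it would otherwise leave untouched.

```python
import pandas as pd

# Made-up rows for illustration; not real Ames records.
ames = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550],
    "MSZoning": ["RL", "RL", "RM", "FV"],
    "OverallCond": [5, 8, 5, 6],
    "KitchenQual": ["Gd", "TA", "Gd", "Ex"],
    "SalePrice": [208500, 181500, 223500, 140000],
})

categoricals = ["MSZoning", "OverallCond", "KitchenQual"]

# One dummy column per category level, minus the first level of each feature
dummies = pd.get_dummies(ames[categoricals], columns=categoricals, drop_first=True)

# Drop the original categorical columns, keeping the continuous variables
continuous = ames.drop(columns=categoricals)

# Join the continuous variables and the dummy columns side by side
ames_preprocessed = pd.concat([continuous, dummies], axis=1)
print(ames_preprocessed.columns.tolist())
```

Dropping the first level avoids perfectly collinear columns: with k levels, k - 1 dummies already identify every category, and the dropped level becomes the baseline.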

Summary

In this lab, you practiced your knowledge of categorical variables on the Ames Housing dataset! Specifically, you practiced distinguishing continuous and categorical data. You then created dummy variables using one-hot encoding.
