I cleaned this data using SQL statements. The data was first retrieved and explored, then updated through the following steps:
- Converting the date stored as a string to the Date datatype.
- Filling in missing values in the Address column.
- Formatting ParcelID, Acreage, and LandValue.
- Breaking out PropertyAddress and OwnerAddress into individual columns.
- Changing Y and N to Yes and No in the SoldAsVacant field.
- Removing duplicates and redundant columns.
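A few of these steps can be sketched as follows. This is a minimal illustration, not the project's actual script: it runs SQLite through Python's `sqlite3` module so that it is self-contained, whereas the project's SQL may use a different dialect. The table name `housing`, the new columns `PropertyStreet` and `PropertyCity`, the sample rows, and the choice of columns that define a duplicate are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Reduced, hypothetical version of the table with only the columns used below.
cur.execute("""
    CREATE TABLE housing (
        UniqueID        INTEGER PRIMARY KEY,
        ParcelID        TEXT,
        PropertyAddress TEXT,
        SaleDate        TEXT,
        SalePrice       INTEGER,
        LegalReference  TEXT,
        SoldAsVacant    TEXT
    )
""")

# Made-up sample rows, not taken from the real dataset.
rows = [
    (1, "P-001", "100 MAIN ST, NASHVILLE", "2013-04-09", 240000, "REF-1", "N"),
    (2, "P-001", None, "2014-06-10", 250000, "REF-2", "Y"),
    (3, "P-002", "200 OAK AVE, GOODLETTSVILLE", "2015-01-05", 90000, "REF-3", "No"),
    (4, "P-002", "200 OAK AVE, GOODLETTSVILLE", "2015-01-05", 90000, "REF-3", "No"),
]
cur.executemany("INSERT INTO housing VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Fill a missing PropertyAddress from another row that shares the same ParcelID.
cur.execute("""
    UPDATE housing
    SET PropertyAddress = (
        SELECT MIN(b.PropertyAddress)
        FROM housing AS b
        WHERE b.ParcelID = housing.ParcelID
          AND b.PropertyAddress IS NOT NULL
    )
    WHERE PropertyAddress IS NULL
""")

# Break PropertyAddress into street and city at the first comma.
cur.execute("ALTER TABLE housing ADD COLUMN PropertyStreet TEXT")
cur.execute("ALTER TABLE housing ADD COLUMN PropertyCity TEXT")
cur.execute("""
    UPDATE housing
    SET PropertyStreet = TRIM(SUBSTR(PropertyAddress, 1, INSTR(PropertyAddress, ',') - 1)),
        PropertyCity   = TRIM(SUBSTR(PropertyAddress, INSTR(PropertyAddress, ',') + 1))
    WHERE INSTR(PropertyAddress, ',') > 0
""")

# Normalise Y / N to Yes / No, leaving values that are already spelled out.
cur.execute("""
    UPDATE housing
    SET SoldAsVacant = CASE SoldAsVacant
        WHEN 'Y' THEN 'Yes'
        WHEN 'N' THEN 'No'
        ELSE SoldAsVacant
    END
""")

# Remove duplicates: keep the lowest UniqueID in each group of identical sales.
cur.execute("""
    DELETE FROM housing
    WHERE UniqueID NOT IN (
        SELECT MIN(UniqueID)
        FROM housing
        GROUP BY ParcelID, PropertyAddress, SaleDate, SalePrice, LegalReference
    )
""")
conn.commit()

cleaned = cur.execute("""
    SELECT UniqueID, PropertyStreet, PropertyCity, SoldAsVacant
    FROM housing
    ORDER BY UniqueID
""").fetchall()
for row in cleaned:
    print(row)
```

In this sketch, row 2 inherits its address from row 1 because both share a ParcelID, and row 4 is dropped as an exact repeat of row 3. Keeping the minimum UniqueID per group is one simple way to deduplicate; a window function such as `ROW_NUMBER()` is a common alternative.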
This is a guided project from AlextheAnalyst: https://github.com/AlexTheAnalyst/PortfolioProjects/blob/main/Nashville%20Housing%20Data%20for%20Data%20Cleaning.xlsx
- Dataset: Nashville_Housing_Data.xlsx, in .csv format
- Code: Nashville_Data_Cleaning.sql
- Docker
- Azure Data Studio
This dataset has 54403 records. It describes properties in Nashville, with details of each property and its owner.
Description of the variables:
- UniqueID: Primary key
- ParcelID: varchar
- LandUse: varchar, type of the property, e.g. Condo, Church, Apartment, Daycare
- PropertyAddress: varchar
- SaleDate: Date
- SalePrice: Integer
- LegalReference: varchar
- SoldAsVacant: Bool, Yes/No
- OwnerName: varchar
- OwnerAddress: varchar
- Acreage: Float
- TaxDistrict: varchar
- LandValue: Integer
- BuildingValue: Integer
- TotalValue: Integer
- YearBuilt: Integer
- Bedrooms: Integer
- FullBath: Integer
- HalfBath: Integer
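The variables above could be expressed as a table definition along these lines. This is a sketch under assumptions: the table name `NashvilleHousing` is hypothetical, the type names are written for SQLite (run here through Python's `sqlite3`), and SoldAsVacant is modelled as text restricted to Yes/No rather than as a native boolean.

```python
import sqlite3

# Column names follow the variable list; the table name and the exact
# type spellings are assumptions made for this illustration.
SCHEMA = """
CREATE TABLE NashvilleHousing (
    UniqueID        INTEGER PRIMARY KEY,
    ParcelID        VARCHAR,
    LandUse         VARCHAR,   -- e.g. Condo, Church, Apartment, Daycare
    PropertyAddress VARCHAR,
    SaleDate        DATE,
    SalePrice       INTEGER,
    LegalReference  VARCHAR,
    SoldAsVacant    VARCHAR CHECK (SoldAsVacant IN ('Yes', 'No')),
    OwnerName       VARCHAR,
    OwnerAddress    VARCHAR,
    Acreage         FLOAT,
    TaxDistrict     VARCHAR,
    LandValue       INTEGER,
    BuildingValue   INTEGER,
    TotalValue      INTEGER,
    YearBuilt       INTEGER,
    Bedrooms        INTEGER,
    FullBath        INTEGER,
    HalfBath        INTEGER
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
columns = [
    (name, col_type)
    for _, name, col_type, *_ in conn.execute("PRAGMA table_info(NashvilleHousing)")
]
for name, col_type in columns:
    print(f"{name}: {col_type}")
```

The `CHECK` constraint only makes sense once Y and N have been normalised to Yes and No; on the raw data it would reject rows that still use the short forms.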