Using SparkSQL on Google Colab, key metrics for home sales data was determined. Then Spark was utilized to create temporary views, partition the data, cache and uncache a temporary table. The following questions were answered using SparkSQL:
- What is the average price for a four-bedroom house sold for each year? Round answer to two decimalplaces.
- What is the average price of a home for each year it was built that has three bedrooms and three bathrooms? Round answer to two decimal places.
- What is the average price of a home for each year that has three bedrooms, three bathrooms, two floors, and is greater than or equal to 2,000 square feet? Round answer to two decimal places.
- What is the "view" rating for homes costing more than or equal to $350,000? Determine the run time for this query, and round answer to two decimal places.