Welcome to the US Census Data Engineering project, where I analyze the United States Census Bureau's 2017 Basic Monthly Current Population Survey (CPS) using Apache Spark (Python).
The analysis has four objectives:
- Count the respondents in each family income range.
- Identify the 10 largest respondent counts when grouped by geographic division/location and race.
- Count the respondents who have no telephone at home but can access one elsewhere and accept telephone interviews.
- Count the respondents who have access to a telephone but decline telephone interviews.

The project uses two input files:
- December 2017 United States Census Bureau’s Basic Monthly CPS Record (.dat)
- January 2017 Basic Monthly CPS Data Dictionary File (.txt)

Python extracts the required fields from the dataset according to the data dictionary file. Apache Spark then analyzes the extracted data to answer the questions listed in the project objectives.
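
The extraction step can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes each record in the `.dat` file is a fixed-width line and that the data dictionary gives a start and end column for every field. The names `FIELD_LAYOUT` and `parse_record`, the field names, the column positions, and the sample records are all hypothetical placeholders.

```python
from collections import Counter

# Hypothetical layout: field name -> (start column, end column), 1-indexed and
# inclusive, in the style of a fixed-width data dictionary. These positions are
# placeholders, not values taken from the CPS data dictionary.
FIELD_LAYOUT = {
    "family_income": (1, 2),
    "division": (3, 3),
    "race": (4, 5),
}


def parse_record(line, layout=FIELD_LAYOUT):
    """Slice one fixed-width record into a dict of integer field values."""
    record = {}
    for name, (start, end) in layout.items():
        # Convert the 1-indexed inclusive range to a Python slice.
        raw = line[start - 1:end].strip()
        # Blank or missing fields become None instead of raising an error.
        record[name] = int(raw) if raw else None
    return record


# Made-up sample records, for illustration only.
sample_lines = ["12301", "12302", " 5401"]
records = [parse_record(line) for line in sample_lines]

# Plain-Python stand-in for a grouped count over the parsed records.
income_counts = Counter(r["family_income"] for r in records)
```

In Spark, the same slicing could be done on the `value` column returned by `spark.read.text`, for example with `pyspark.sql.functions.substring`, followed by `groupBy(...).count()` for each objective.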
Feel free to explore the code, data, and insights in the repository. If you have any questions or suggestions, please don't hesitate to reach out.