The goal of this project was to provision an Apache Spark cluster on Azure HDInsight and use it to run analytics on stock data. After the cluster was provisioned, Microsoft Azure Storage Explorer was used to add a Jupyter Notebook to the cluster for running queries and analytics.
Steps
- Deploy an HDInsight Spark cluster
- Work with content stored in Azure Blob Storage and accessed by the Spark cluster as an HDFS volume
- Use a Jupyter Notebook to interactively explore a large dataset
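
As a rough illustration of the last two steps, the sketch below shows how a notebook might address a file in Azure Blob Storage through a `wasbs://` URI and run a simple aggregate over it with Spark SQL. The container name, storage account name, file path, and the `Symbol` and `Close` column names are all hypothetical placeholders rather than values from this project, and the query is only one plausible example of the kind of analytics involved.

```python
def wasb_uri(container, account, path, secure=True):
    """Build a WASB URI that an HDInsight Spark cluster can read like an HDFS path."""
    scheme = "wasbs" if secure else "wasb"
    clean_path = path.lstrip("/")  # avoid a double slash after the host name
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{clean_path}"


def average_close_by_symbol(spark, uri):
    """Load a CSV of stock prices and return the mean closing price per symbol.

    `spark` is an existing SparkSession (the PySpark notebook kernel usually
    provides one). The column names Symbol and Close are assumptions about
    the dataset, not taken from the project.
    """
    df = spark.read.csv(uri, header=True, inferSchema=True)
    df.createOrReplaceTempView("stocks")
    return spark.sql(
        "SELECT Symbol, AVG(Close) AS avg_close "
        "FROM stocks GROUP BY Symbol ORDER BY Symbol"
    )


# Hypothetical container, storage account, and file path.
uri = wasb_uri("stockdata", "examplestorage", "/data/prices.csv")
print(uri)
```

Inside a notebook attached to the cluster, calling `average_close_by_symbol(spark, uri).show()` would then display the result, assuming the file exists and has the assumed columns.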