For this workshop you need access to an Azure Synapse instance. To create one, you can either deploy the Microsoft Analytics end-to-end with Azure Synapse ARM template, which provisions the full set of analytics services including Synapse, or create a single Synapse workspace via the Azure portal.
Exercise 1: Data ingestion
In this exercise you connect to the different data sources from which WWI needs to collect data, and create a pipeline to move that data into a centralized data lake. No transformation of the data is required. The solution should support delta (incremental) loads for large tables.
Task 1: Create a raw zone in your ADLS Gen2 account
Task 2: Create a copy pipeline
Task 3: Run your pipeline and monitor it
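The delta-load requirement is usually met with a high-watermark pattern: the pipeline stores the latest modification timestamp it has copied, and each run only fetches rows changed after that watermark. A minimal pure-Python sketch of the idea (the row shape, column names, and dates are invented for illustration, not the workshop's actual source tables):

```python
from datetime import datetime

# Hypothetical source rows; in the real pipeline these would come from the
# source database queried by the copy activity.
SOURCE_ROWS = [
    {"id": 1, "modified": datetime(2023, 1, 1)},
    {"id": 2, "modified": datetime(2023, 1, 5)},
    {"id": 3, "modified": datetime(2023, 1, 9)},
]

def delta_load(rows, watermark):
    """Return rows changed since the last watermark, plus the new watermark."""
    changed = [r for r in rows if r["modified"] > watermark]
    new_watermark = max((r["modified"] for r in changed), default=watermark)
    return changed, new_watermark

# First run: only rows modified after the stored watermark are copied.
changed, wm = delta_load(SOURCE_ROWS, datetime(2023, 1, 2))
# A second run with the updated watermark copies nothing new.
changed2, _ = delta_load(SOURCE_ROWS, wm)
```

In Synapse pipelines the watermark would live in a control table (or pipeline variable) and the filter would be pushed into the copy activity's source query.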
Exercise 2: Data preparation and transformation
In this exercise you explore the different flavors of the Synapse runtime, do ad-hoc data wrangling, and apply Azure Synapse Lake database patterns.
Task 1: Data quality checks
Task 2: Create a Lake database
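Typical quality checks during wrangling flag missing values, out-of-range amounts, and duplicate keys before data moves downstream. A small sketch of such checks (the column names and rules are assumptions for illustration, not the workshop's actual dataset):

```python
# Hypothetical staged rows with deliberate quality problems.
rows = [
    {"order_id": "O1", "state": "WA", "amount": 120.0},
    {"order_id": "O2", "state": None, "amount": 75.5},   # missing state
    {"order_id": "O2", "state": "CA", "amount": -10.0},  # duplicate id, negative amount
]

def quality_report(rows):
    """Return (row_index, issue) pairs for every rule violation found."""
    seen, issues = set(), []
    for i, r in enumerate(rows):
        if r["state"] is None:
            issues.append((i, "missing state"))
        if r["amount"] is not None and r["amount"] < 0:
            issues.append((i, "negative amount"))
        if r["order_id"] in seen:
            issues.append((i, "duplicate order_id"))
        seen.add(r["order_id"])
    return issues

issues = quality_report(rows)
```

In the workshop itself the same rules would be expressed over a Spark DataFrame or serverless SQL query rather than Python lists.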
Exercise 3: Data warehouse
In this exercise you design the data warehouse schema and investigate what Azure Synapse offers in terms of indexing, partitioning, and pipeline design.
Task 1: Determine table category
Task 2: Create a dedicated SQL pool and star schema
Task 3: Populate the data warehouse tables with a Spark pool
Task 4: Populate the data warehouse tables with a Data flow
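Whichever tool populates the star schema, the core step is the same: replace each business key in the staged data with a surrogate key from the dimension table before inserting fact rows. A minimal sketch of that lookup (dimension, key, and column names are assumptions for illustration):

```python
dim_customer = {}  # business key -> surrogate key

def get_surrogate_key(business_key):
    """Assign a surrogate key the first time a business key is seen."""
    if business_key not in dim_customer:
        dim_customer[business_key] = len(dim_customer) + 1
    return dim_customer[business_key]

# Hypothetical staged order rows keyed by business key.
staged_orders = [
    {"customer": "CUST-7", "amount": 100.0},
    {"customer": "CUST-9", "amount": 40.0},
    {"customer": "CUST-7", "amount": 25.0},
]

# Fact rows reference the dimension by surrogate key, not business key.
fact_orders = [
    {"customer_sk": get_surrogate_key(o["customer"]), "amount": o["amount"]}
    for o in staged_orders
]
```

In a Spark notebook this lookup is a join against the dimension table; in a Data flow it is a lookup transformation followed by a sink into the dedicated SQL pool.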
In this challenge you need to create an ad-hoc report in Synapse that identifies the top 20 states generating the highest order amounts. The result of this task is shown below:
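The report reduces to a group-by-and-sort: total the order amount per state, order descending, and keep the top N. A sketch of that aggregation with invented sample data (in Synapse you would express it as a SQL GROUP BY with TOP 20, or the Spark equivalent):

```python
from collections import defaultdict

# Hypothetical order rows for illustration only.
orders = [
    {"state": "WA", "amount": 500.0},
    {"state": "CA", "amount": 300.0},
    {"state": "WA", "amount": 200.0},
    {"state": "TX", "amount": 450.0},
]

def top_states(orders, n=20):
    """Total order amount per state, sorted descending, top n."""
    totals = defaultdict(float)
    for o in orders:
        totals[o["state"]] += o["amount"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

result = top_states(orders)
```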