purviewdemo's Introduction

Microsoft Purview Demo Environment

This repository includes instructions on how to automate the deployment of a pre-populated Microsoft Purview demo environment.

Prerequisites

  • An active Azure subscription.
  • No Azure Policies preventing creation of Storage accounts or Event Hub namespaces. Purview deploys a managed Storage account and Event Hub when a Purview account is created. If a blocking policy exists and needs to remain in place, follow the Purview exception tag guide to create an exception for Purview accounts.

Usage

  1. Click Deploy to Azure.
  2. Select a Region.

    Note: If you plan to create a NEW Resource Group for the resources this template creates, select a Region BEFORE creating the Resource Group; otherwise the Resource Group will be created in the default location.

  3. Select the target Azure Subscription.
  4. Select an existing OR create a new Resource Group.

    Note: If you select an existing Resource Group, the Region is automatically set to that Resource Group's location.

  5. [OPTIONAL] Change the SQL Server Admin Login.
  6. [OPTIONAL] Change the SQL Server Admin Password.

    Note: You do not need to know the password. The post-deployment script automatically stores the secret in Key Vault, and Purview uses this secret to scan the Azure SQL Database.

Outcome

  • The template should take approximately 10 minutes to complete.
  • Once complete, all Azure resources will have been provisioned, RBAC assignments applied, and data plane operations executed. See below for more details.

Note: An additional 10 minutes post-deployment may be required for:

  • Azure Data Factory pipeline to finish running and push lineage to Microsoft Purview.
  • Microsoft Purview to finish scanning registered sources and populate the catalog.

The status of these jobs can be monitored within the respective service.
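
If you would rather script the wait than watch the portal, a generic polling loop is one option. The following is a minimal sketch and not part of this repository: `wait_until_done` and the `get_status` callable are hypothetical names, and the status strings are placeholders for whatever the Azure Data Factory or Microsoft Purview API you query actually returns.

```python
import time
from typing import Callable, Iterable


def wait_until_done(
    get_status: Callable[[], str],
    done_states: Iterable[str] = ("Succeeded",),          # placeholder state names
    failed_states: Iterable[str] = ("Failed", "Cancelled"),
    timeout_s: float = 600.0,    # roughly the 10 minutes mentioned above
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it reports a terminal state or the timeout elapses."""
    done, failed = set(done_states), set(failed_states)
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"job ended in state {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

Pass in any zero-argument function that returns the current status of a pipeline run or scan run; the `sleep` parameter exists only so the loop can be exercised without real delays.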

Validate Deployment

  1. Navigate to the Azure Portal, locate your Resource Group, and click Deployments. You should see that the deployment has Succeeded.

  2. Within your Resource Group, you should see the set of Azure resources listed under Deployed Resources.

  3. Navigate to your Microsoft Purview Account (e.g. pvdemo{RAND_STRING}-pv) and click Open Governance Portal > Data Map. You should see 3 collections and 2 sources.

  4. Within the Azure Data Lake Storage Gen2 source, click View Details. You should see a scan. Note: The scan may still be in progress and can take up to 10 minutes to complete.

  5. Within the Azure Data Lake Storage Gen2 source, click the New Scan icon, then click Test connection. The connection should be successful.

  6. Within the Azure SQL Database source, click View Details. You should see a scan. Note: The scan may still be in progress and can take up to 10 minutes to complete.

  7. Within the Azure SQL Database source, click the New Scan icon, select a Database name, set Credential to sql-cred, toggle Lineage extraction to Off, and click Test connection. The connection should be successful.

  8. Navigate to Data Map > Collections > Role assignments. You should see your user added to each role (Collection admin, Data Source admin, Data curator, Data reader). You should also see the Azure Data Factory Managed Identity added as a Data curator.

  9. Navigate to Management > Data Factory. You should see a Connected Azure Data Factory account.

  10. Navigate to Data Catalog > Manage Glossary and click Hierarchical view. You should see a pre-populated Glossary.

  11. Navigate to Management > Credentials. You should see a credential from Azure Key Vault.

  12. In the search bar, search for "copy", navigate to the Copy_a9c asset, and click Lineage. You should see lineage from the Azure Data Factory Copy Activity. Note: The pipeline within Azure Data Factory may still be running and can take up to 10 minutes to complete. To check the status of the pipeline, navigate to Azure Data Factory and check Monitoring.
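
The expected values from the steps above can also be written down as a small checklist. This is a sketch under assumptions: `find_problems` is a hypothetical helper, and it only compares numbers and role names that you supply (for example, read off the portal); it does not call any Azure or Purview API.

```python
from typing import Dict, List

# Expected values taken from the validation steps above.
EXPECTED_COUNTS = {"collections": 3, "sources": 2}
EXPECTED_USER_ROLES = {
    "Collection admin",
    "Data Source admin",
    "Data curator",
    "Data reader",
}


def find_problems(observed_counts: Dict[str, int], user_roles: List[str]) -> List[str]:
    """Compare observed Data Map counts and user roles against the expected state."""
    problems = []
    for name, expected in EXPECTED_COUNTS.items():
        actual = observed_counts.get(name, 0)
        if actual != expected:
            problems.append(f"expected {expected} {name}, found {actual}")
    # Report missing roles in a stable (sorted) order.
    for role in sorted(EXPECTED_USER_ROLES - set(user_roles)):
        problems.append(f"current user is missing the {role} role")
    return problems
```

An empty list means the observed state matches steps 3 and 8. Because scans and the pipeline can still be running shortly after deployment, a mismatch in the first few minutes is not necessarily a failure.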

Deployed Resources

  • Microsoft Purview Account
  • Azure Key Vault
  • Azure SQL Database
  • Azure Data Lake Storage Gen2 Account
  • Azure Data Factory
  • Azure Synapse Analytics Workspace

Role Assignments

#  Scope                  Principal              Role Definition
1  Azure Storage Account  Current User           Storage Blob Data Reader
2  Azure Storage Account  Azure Synapse MI       Storage Blob Data Contributor
3  Azure Storage Account  Microsoft Purview MI   Storage Blob Data Reader
4  Azure Storage Account  Azure Data Factory MI  Storage Blob Data Contributor
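
The template applies these assignments for you. For readers who want to see what each row amounts to, the sketch below renders the table as `az role assignment create` command lines. It is illustrative only and is not how the repository performs the assignments: `RoleAssignment`, `az_command`, the principal IDs, and the scope are all hypothetical placeholders you would supply yourself.

```python
import shlex
from typing import Dict, NamedTuple


class RoleAssignment(NamedTuple):
    scope: str
    principal: str
    role: str


# The four rows of the table above.
ROLE_ASSIGNMENTS = [
    RoleAssignment("Azure Storage Account", "Current User", "Storage Blob Data Reader"),
    RoleAssignment("Azure Storage Account", "Azure Synapse MI", "Storage Blob Data Contributor"),
    RoleAssignment("Azure Storage Account", "Microsoft Purview MI", "Storage Blob Data Reader"),
    RoleAssignment("Azure Storage Account", "Azure Data Factory MI", "Storage Blob Data Contributor"),
]


def az_command(assignment: RoleAssignment, principal_ids: Dict[str, str], scope_id: str) -> str:
    """Render one table row as a shell-quoted `az role assignment create` command.

    principal_ids maps the Principal column to an object ID (placeholder values);
    scope_id is the resource ID of the storage account (also a placeholder).
    """
    args = [
        "az", "role", "assignment", "create",
        "--assignee", principal_ids[assignment.principal],
        "--role", assignment.role,
        "--scope", scope_id,
    ]
    return shlex.join(args)
```

The function only builds strings, so nothing is changed in Azure until you choose to run a command yourself.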

Data Plane Operations

#   Service                       Action
1   Identity Provider             Get Access Token
2   Microsoft Purview             Create Azure Key Vault Connection
3   Microsoft Purview             Create Credential
4   Microsoft Purview             Update Root Collection Policy
5   Microsoft Purview             Create Collections
6   Microsoft Purview             Azure SQL DB: Register Source
7   Microsoft Purview             Azure SQL DB: Create Scan
8   Microsoft Purview             Azure SQL DB: Run Scan
9   Azure Data Lake Storage Gen2  Load Sample Data
10  Microsoft Purview             ADLS Gen2: Register Source
11  Microsoft Purview             ADLS Gen2: Create Scan
12  Microsoft Purview             ADLS Gen2: Run Scan
13  Azure Data Factory            Run Pipeline
14  Microsoft Purview             Populate Glossary
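
The order of the operations above matters: a scan cannot be created before its source is registered, for example. As a rough illustration of that sequencing, and not the repository's actual post-deployment script, the sketch below runs one handler per action in table order and stops at the first failure. `OPERATIONS`, `run_in_order`, and the handler functions are hypothetical; the real work behind each action is left to whatever you plug in.

```python
from typing import Callable, Dict, List, Tuple

# (service, action) pairs in the order listed in the table above.
OPERATIONS: List[Tuple[str, str]] = [
    ("Identity Provider", "Get Access Token"),
    ("Microsoft Purview", "Create Azure Key Vault Connection"),
    ("Microsoft Purview", "Create Credential"),
    ("Microsoft Purview", "Update Root Collection Policy"),
    ("Microsoft Purview", "Create Collections"),
    ("Microsoft Purview", "Azure SQL DB: Register Source"),
    ("Microsoft Purview", "Azure SQL DB: Create Scan"),
    ("Microsoft Purview", "Azure SQL DB: Run Scan"),
    ("Azure Data Lake Storage Gen2", "Load Sample Data"),
    ("Microsoft Purview", "ADLS Gen2: Register Source"),
    ("Microsoft Purview", "ADLS Gen2: Create Scan"),
    ("Microsoft Purview", "ADLS Gen2: Run Scan"),
    ("Azure Data Factory", "Run Pipeline"),
    ("Microsoft Purview", "Populate Glossary"),
]


def run_in_order(handlers: Dict[str, Callable[[], None]]) -> List[str]:
    """Call the handler for each action in table order.

    A missing handler (KeyError) or an exception raised by a handler aborts
    the remaining steps. Returns the actions that completed.
    """
    completed: List[str] = []
    for _service, action in OPERATIONS:
        handlers[action]()
        completed.append(action)
    return completed
```

Stopping at the first failure keeps later steps from running against resources that were never created.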

Contributors

tayganr, jwstarkie, isantillan1
