
Process Azure Blob events in Azure Functions triggered by EventGrid using managed Identity


Azure Blob Storage Processing with Event Grid Triggers & Python Azure Functions using Managed Identity

The developers at Mystique Unicorn process files as soon as they arrive. They currently use a system assigned managed identity and want to scope the permissions of the function execution environment to the bare minimum.

They heard about Azure's capabilities for user managed identities. Can you help them implement this event processing at Mystique Unicorn?


🎯 Solution

Azure offers user assigned managed identities that can be narrowly scoped to the necessary permissions. There are a few details to keep in mind when using them:

  • The mapping between user managed identities and resources is cached, with an update interval of around 24 hours. Excerpt from the docs:

    The back-end services for managed identities maintain a cache per resource URI for around 24 hours. If you update the access policy of a particular target resource and immediately retrieve a token for that resource, you may continue to get a cached token with outdated permissions until that token expires. There's currently no way to force a token refresh.

    This can cause errors right after a permission change. My testing didn't show any, but it is possible.

  • Permission scoping for the Cosmos DB data plane is slightly different from, say, Storage Accounts. Microsoft.DocumentDB/databaseAccounts/sqlRoleDefinitions is used to define the permissions, and Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments is used to assign the role to the user managed identity.

  • The Azure Function needs these environment variables for Cosmos DB permissions to work (thanks to a blog post for this tip):

    • SUBSCRIPTION_ID
    • RESOURCE_GROUP

With this knowledge, we can process Blob Storage events with Azure Functions and Event Grid triggers using a user managed identity. Blob Storage events such as blob creation and deletion can be used to trigger the function. A sample event from Event Grid is shown below:

{
  "id": "538fcf9f-3..-1024-801417067d3a",
  "data": {
    "api": "PutBlob",
    "clientRequestId": "c0c0f290-ec..0bc9ef3b",
    "requestId": "538fcf9f-3..01417000000",
    "eTag": "0x8DB4E3BA4F8E488",
    "contentType": "application/json",
    "contentLength": 40,
    "blobType": "BlockBlob",
    "url": "https://warehouse6p5crf002.blob.core.windows.net/store-events-blob-002/source/7031_2023-05-06_event.json",
    "sequencer": "0000000000000000000000.000005276ba",
    "storageDiagnostics": { "batchId": "2901e730-b..-80d271000000" }
  },
  "topic": null,
  "subject": "/blobServices/default/containers/store-events-blob-002/blobs/source/7031_2023-05-06_event.json",
  "event_type": null
}
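To make the event shape concrete, here is a minimal Python sketch that pulls the storage account, container and blob path out of `data.url`. The function name `parse_blob_event` is hypothetical, and the sample dictionary is a trimmed copy of the event above. It assumes the standard `https://<account>.blob.core.windows.net/<container>/<blob>` URL layout.

```python
from urllib.parse import unquote, urlparse


def parse_blob_event(event: dict) -> dict:
    """Extract account, container and blob path from a Blob Storage event."""
    url = urlparse(event["data"]["url"])
    # The storage account is the first label of the host name.
    account = url.netloc.split(".")[0]
    # The path looks like /<container>/<blob path>; split once after the container.
    container, _, blob_name = unquote(url.path).lstrip("/").partition("/")
    return {
        "account": account,
        "container": container,
        "blob_name": blob_name,
        "api": event["data"].get("api"),
        "content_length": event["data"].get("contentLength"),
    }


sample_event = {
    "data": {
        "api": "PutBlob",
        "contentLength": 40,
        "url": "https://warehouse6p5crf002.blob.core.windows.net/store-events-blob-002/source/7031_2023-05-06_event.json",
    },
    "subject": "/blobServices/default/containers/store-events-blob-002/blobs/source/7031_2023-05-06_event.json",
}

info = parse_blob_event(sample_event)
print(info["container"], info["blob_name"])
# prints: store-events-blob-002 source/7031_2023-05-06_event.json
```
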

We can use this event as a trigger, retrieve the blob referenced in data.url using the input binding, and persist the processed event to Blob Storage (using the output binding) and to Cosmos DB. Although we could use an output binding for Cosmos DB, we will use the Python SDK to demonstrate the use of managed identity.
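A minimal sketch of the SDK path might look like the following. It is not the repository's code: the helper names and the app setting names (`AZURE_CLIENT_ID`, `COSMOS_ENDPOINT`, `COSMOS_DB_NAME`, `COSMOS_CONTAINER_NAME`) are assumptions for illustration. It relies on `ManagedIdentityCredential` from `azure-identity` and `CosmosClient` from `azure-cosmos`, and it assumes the identity already has a data plane role assignment on the account.

```python
import os
from datetime import datetime, timezone


def build_item(blob_name: str, payload: dict) -> dict:
    """Shape a processed event as a Cosmos DB item."""
    return {
        **payload,
        # Cosmos DB requires an 'id'; '/' is not allowed in it, so replace it.
        "id": blob_name.replace("/", "_"),
        "source_blob": blob_name,
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }


def write_to_cosmos(item: dict) -> None:
    """Upsert an item using a user managed identity instead of account keys."""
    # Imported lazily so build_item can be used without the Azure SDK installed.
    from azure.cosmos import CosmosClient
    from azure.identity import ManagedIdentityCredential

    # Hypothetical app settings; the client ID selects the user managed identity.
    credential = ManagedIdentityCredential(client_id=os.environ["AZURE_CLIENT_ID"])
    client = CosmosClient(os.environ["COSMOS_ENDPOINT"], credential=credential)
    container = client.get_database_client(
        os.environ["COSMOS_DB_NAME"]
    ).get_container_client(os.environ["COSMOS_CONTAINER_NAME"])
    container.upsert_item(body=item)


item = build_item("source/7031_2023-05-06_event.json", {"message": "hello world"})
print(item["id"])
# prints: source_7031_2023-05-06_event.json
```

The item must also carry whatever field the container uses as its partition key. That field is not shown here because the container definition is not part of this text.
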

By leveraging the power of Bicep, all necessary resources can be easily provisioned and managed with minimal effort. Our solution uses Python for efficient event processing, allowing for quick and easy deployment of sophisticated event processing pipelines.

  1. 🧰 Prerequisites

    This demo, along with its instructions, scripts and Bicep template, is designed to be run in westeurope. With few or no modifications you can try it out in other regions as well (not covered here).

  2. โš™๏ธ Setting up the environment

    • Get the application code

      https://github.com/miztiik/azure-blob-trigger-function-user-identity
      cd azure-blob-trigger-function-user-identity
  3. 🚀 Prepare the environment

    Let's check that the Azure CLI is working:

      # You should have azure cli preinstalled
      az account show

    You should see an output like this,

     {
       "environmentName": "AzureCloud",
       "homeTenantId": "16b30820b6d3",
       "id": "1ac6fdbff37cd9e3",
       "isDefault": true,
       "managedByTenants": [],
       "name": "YOUR-SUBS-NAME",
       "state": "Enabled",
       "tenantId": "16b30820b6d3",
       "user": {
         "name": "miztiik@",
         "type": "user"
       }
     }
  4. 🚀 Deploying the application

    • Stack: Main Bicep. This will create the following resources:

      • General purpose Storage Account
        • This will be used by Azure functions to store the function code
      • Storage Account with blob container
        • This will be used to store the events
      • Event Grid Topic
        • This will be used to trigger the Azure Function.
        • Create a subscription to the topic, that filters for Microsoft.Storage.BlobCreated events specific to the blob container.
      • Managed Identity
        • This will be used by the Azure Function to access the Cosmos DB
      • Python Azure Function
        • Input, Trigger, Output Binding to the blob container for events
      • Cosmos DB
        • This will be used to store the processed events
      # make deploy
      sh deployment_scripts/deploy.sh

      After successfully deploying the stack, check the Resource Groups/Deployments section for the resources.
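The Event Grid subscription listed above filters for Microsoft.Storage.BlobCreated events from one container. As a way to reason about which uploads will trigger the function, the Python sketch below mimics an event type filter combined with a "subject begins with" filter. Whether the Bicep template uses exactly this filter combination is an assumption, and `matches_subscription` is a hypothetical helper that only imitates the matching locally.

```python
BLOB_CREATED = "Microsoft.Storage.BlobCreated"


def matches_subscription(event_type: str, subject: str, container: str) -> bool:
    """Mimic an event type filter plus a 'subject begins with' filter."""
    # Blob event subjects have the form
    # /blobServices/default/containers/<container>/blobs/<blob path>.
    prefix = f"/blobServices/default/containers/{container}/"
    return event_type == BLOB_CREATED and subject.startswith(prefix)


subject = (
    "/blobServices/default/containers/store-events-blob-002"
    "/blobs/source/7031_2023-05-06_event.json"
)

print(matches_subscription(BLOB_CREATED, subject, "store-events-blob-002"))  # True
print(matches_subscription("Microsoft.Storage.BlobDeleted", subject, "store-events-blob-002"))  # False
print(matches_subscription(BLOB_CREATED, subject, "store-events-blob"))  # False
```

The trailing slash in the prefix matters: without it, a filter for `store-events-blob` would also match `store-events-blob-002`.
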


  5. 🔬 Testing the solution

    • Upload file(s) to blob

      Get the storage account and container name from the output of the deployment. Upload a file to the container and check the logs of the function app to see the event processing in action.

      Sample bash script to upload files to blob container. You can also upload manually from the portal,

      # Set variables
      LOG_FILE="/var/log/miztiik-$(date +'%Y-%m-%d').json"
      COMPUTER_NAME=$(hostname)
      SLEEP_AT_WORK_SECS=0
      LOG_COUNT=2
      
      GREEN="\e[32m"
      CYAN="\e[36m"
      YELLOW="\e[33m"
      RESET="\e[0m"
      
      RESOURCE_GROUP="Miztiik_Enterprises_azure_blob_eventgrid_trigger_function_003"
      LOCATION="northeurope"
      SA_NAME="warehouseenx5vm003"
      CONTAINER_NAME="store-events-blob-003"
      
      
      for ((i=1; i<=LOG_COUNT; i++))
      do
        # Random prefix; not used in the file name below
        FILE_NAME_PREFIX=$(openssl rand -hex 4)
        # Name and write a small JSON test event in the current directory
        FILE_NAME="${RANDOM}_$(date +'%Y-%m-%d')_event.json"
        echo -n "{\"message\": \"hello world on $(date +'%Y-%m-%d')\" , \"timestamp\": \"$(date -u +"%Y-%m-%dT%H:%M:%SZ")\"}" > "${FILE_NAME}"
        # Upload under the source/ prefix using the signed-in Azure CLI identity
        UPLOAD_STATUS=$(az storage blob upload \
          --account-name "${SA_NAME}" \
          --container-name "${CONTAINER_NAME}" \
          --name "source/${FILE_NAME}" \
          --file "${FILE_NAME}" \
          --no-progress \
          --auth-mode login \
          --output json | tr -d '\r')

        sleep 2
        # echo -e ${GREEN}${UPLOAD_STATUS}${RESET}
        echo -e "\n  ${YELLOW}($i/$LOG_COUNT)${RESET} Blob: ${GREEN}${FILE_NAME}${RESET} uploaded to container: ${CYAN}${CONTAINER_NAME}${RESET} in storage account: ${CYAN}${SA_NAME}${RESET}"
      done

      You should see an output like this,

      (1/2) Blob: 758_2023-05-13_event.json uploaded to container: store-events-blob-003 in storage account: warehouseenx5vm003

      (2/2) Blob: 7893_2023-05-13_event.json uploaded to container: store-events-blob-003 in storage account: warehouseenx5vm003


  6. 📒 Conclusion

    Here we have demonstrated how to use Azure Functions to process blob files and persist the results in Cosmos DB.

  7. 🧹 CleanUp

If you want to destroy all the resources created by the stack, execute the command below to delete it. You can also delete the stack from the console.

# Delete from resource group
az group delete --name Miztiik_Enterprises_xxx --yes
# Follow any on-screen prompt

This is not an exhaustive list; please carry out any other steps that may be applicable to your needs.

📌 Who is using this

This repository aims to show new developers, Solution Architects and Ops Engineers in Azure how to use Bicep.

💡 Help/Suggestions or 🐛 Bugs

Thank you for your interest in contributing to our project. Whether it is a bug report, new feature, correction, or additional documentation or solutions, we greatly value feedback and contributions from our community.

👋 Buy me a coffee

ko-fi Buy me a coffee ☕.

📚 References

  1. Azure Event Grid trigger for Azure Functions
  2. Blob Storage events
  3. Azure Blob Storage Input Binding
  4. Azure Blob Storage Output Binding
  5. Azure Event Grid Filters
  6. Miztiik Blog - Blob Storage Event Processing with Python Azure Functions
  7. Miztiik Blog - Blob Storage Processing with Python Azure Functions with HTTP Triggers
  8. Azure Docs - Managed Identity
  9. Azure Docs - Managed Identity Caching
  10. Azure Docs - Azure Function and User Assigned Managed Identities

๐Ÿท๏ธ Metadata

miztiik-success-green

Level: 200
