Overview
What is Azure Blob Storage: Azure Blob Storage is Microsoft's cloud object storage service for storing and accessing unstructured data. It stores file types including PDFs, images, Parquet files, and video across multiple storage tiers.

How to integrate Azure Blob Storage with Datagrid
For project teams managing high volumes of project files, the Azure Blob Storage integration gives Datagrid a direct path into the containers where work already lands. The setup below covers configuration, authentication, and sync behavior for extraction and routing workflows.
Configure the integration
The Azure Blob Storage integration gives Datagrid access to blob containers in your storage account. Datagrid can list containers, retrieve blobs, read metadata and index tags, upload processed output, and run workflows based on new file arrivals.
Open your Datagrid workspace and add a new integration for Azure Blob Storage
Enter your Azure storage account name and the target container name
Authenticate using one of the methods below
Configure which containers and blob prefixes Datagrid should monitor
Test the connection to confirm Datagrid can list blobs in the target container
Save the integration configuration
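The connection test in the steps above can be approximated outside Datagrid. This is a minimal sketch, assuming the `azure-storage-blob` and `azure-identity` Python packages and a public-cloud endpoint; the function names are illustrative and are not part of Datagrid.

```python
from typing import Iterator


def account_url(account_name: str) -> str:
    """Return the public-cloud Blob endpoint for a storage account."""
    return f"https://{account_name}.blob.core.windows.net"


def list_blob_names(account_name: str, container: str, prefix: str = "") -> Iterator[str]:
    """Yield blob names under a prefix, authenticating with Entra ID."""
    # Imported lazily so account_url() works without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(account_url(account_name), credential=DefaultAzureCredential())
    container_client = service.get_container_client(container)
    # name_starts_with limits the listing to the monitored prefix.
    for blob in container_client.list_blobs(name_starts_with=prefix or None):
        yield blob.name
```

Calling `list(list_blob_names("mystorageaccount", "project-files", "incoming/"))` with placeholder names replaced by your own should return without an authorization error if the identity has list access to the container.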
Authenticate access
Datagrid supports three authentication methods: Microsoft Entra ID (OAuth 2.0), Shared Key authorization, and Shared Access Signature (SAS) tokens.
Microsoft Entra ID with a service principal or managed identity is the recommended approach. Assign the Storage Blob Data Contributor role to your Datagrid service principal at the storage account level. This grants read and write access to blob data without exposing account keys. Role assignments can take 10 to 15 minutes to propagate after creation.
If you use SAS tokens, generate a User Delegation SAS backed by Entra ID credentials rather than an account-key-based SAS. Set an appropriate expiry window and scope permissions to the specific containers Datagrid needs.
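One way to produce a User Delegation SAS scoped to a single container is sketched below, assuming the `azure-storage-blob` and `azure-identity` packages. The helper names, the read and list permissions, and the one-hour default are assumptions chosen for illustration, not Datagrid requirements.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

MAX_HOURS = 7 * 24  # a user delegation key is valid for at most 7 days


def sas_window(hours: int, now: Optional[datetime] = None) -> Tuple[datetime, datetime]:
    """Return (start, expiry) in UTC for a SAS valid for the given number of hours."""
    if not 0 < hours <= MAX_HOURS:
        raise ValueError("hours must be between 1 and 168")
    start = now or datetime.now(timezone.utc)
    return start, start + timedelta(hours=hours)


def make_container_sas(account_name: str, container: str, hours: int = 1) -> str:
    """Create a read/list User Delegation SAS for one container."""
    # Imported lazily so sas_window() works without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import (
        BlobServiceClient,
        ContainerSasPermissions,
        generate_container_sas,
    )

    start, expiry = sas_window(hours)
    service = BlobServiceClient(
        f"https://{account_name}.blob.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    # The delegation key is signed with Entra ID credentials, not the account key.
    key = service.get_user_delegation_key(key_start_time=start, key_expiry_time=expiry)
    return generate_container_sas(
        account_name=account_name,
        container_name=container,
        user_delegation_key=key,
        permission=ContainerSasPermissions(read=True, list=True),
        start=start,
        expiry=expiry,
    )
```

Add `write=True` to the permissions only if the token must also cover writing processed output back to the container.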
Define sync behavior
Once authentication is in place, define Datagrid's source objects and output destinations, then choose the trigger behavior.
Data objects synced: Blob files, container listings, blob properties, user-defined metadata, and blob index tags.
Sync direction: Bidirectional. Datagrid reads source files from Blob Storage and writes processed output (structured JSON, extracted records, transformed files) back to the same or different containers.
Sync triggers: Datagrid can poll containers on a configured schedule. Datagrid workflows can also use blob events via Azure Event Grid for event-driven processing when a file arrives.
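For the event-driven trigger, the container and blob name can be read from the `subject` field of a `Microsoft.Storage.BlobCreated` event. The sketch below assumes the Event Grid event schema (where the type field is `eventType`) and uses only the standard library; how Datagrid consumes these events internally is not described here, so treat the function names as hypothetical.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Subject format used by Blob Storage events in the Event Grid schema.
_SUBJECT = re.compile(
    r"^/blobServices/default/containers/(?P<container>[^/]+)/blobs/(?P<blob>.+)$"
)


def parse_blob_created(event: dict) -> Optional[Tuple[str, str]]:
    """Return (container, blob_name) for a BlobCreated event, else None."""
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return None
    match = _SUBJECT.match(event.get("subject", ""))
    if match is None:
        return None
    return match.group("container"), match.group("blob")


def blobs_from_batch(events: Iterable[dict]) -> List[Tuple[str, str]]:
    """Collect new blobs from a delivered batch, skipping other event types."""
    parsed = (parse_blob_created(event) for event in events)
    return [item for item in parsed if item is not None]
```

A polling trigger would instead compare successive container listings; the event-driven path avoids that comparison because each event already names the new blob.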
With those settings in place, Datagrid moves from file arrival to extraction and routing without manual handoffs.
Why use Azure Blob Storage with Datagrid
Azure Blob Storage is already where many project files accumulate. Here are some reasons to use it with Datagrid:
Automated file extraction at upload: Datagrid retrieves PDFs, images, and scanned documents from blob containers and extracts structured data such as tables, key-value pairs, and text for operators handling high-volume intake.
Event-driven agent triggers: New files landing in Blob Storage fire events through Azure Event Grid, and Datagrid workflows use those events to classify, extract, and route data as files arrive.
Format-agnostic processing: Azure Blob Storage accepts many file types, and Datagrid processes PDFs, CSVs, JSON, Parquet, TIFF images, and DOCX files from the same container through unified extraction pipelines.
Bidirectional data flow: Datagrid reads raw files from Blob Storage and writes structured output back, creating a closed loop between unstructured source data and clean, agent-processed records for analytics teams.
Tag-based intelligent routing: Blob index tags are natively indexed and searchable, giving Datagrid extra metadata for routing each file to the correct extraction pipeline or downstream system automatically.
Tiered storage cost control: Datagrid workflows process files in the Hot or Cool tiers and take lifecycle management policies into account, avoiding unnecessary retrieval from the Archive tier, where blobs must be rehydrated before they can be read.
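Tag-based routing depends on querying blob index tags. The sketch below builds a filter expression in the Find Blobs by Tags syntax and runs it with `BlobServiceClient.find_blobs_by_tags` from the `azure-storage-blob` package. The tag keys and function names are hypothetical, and tag queries may need permissions beyond the Storage Blob Data Contributor role, which is worth verifying against the Azure role definitions.

```python
from typing import Dict, Iterator


def build_tag_filter(container: str, tags: Dict[str, str]) -> str:
    """Build a filter such as: @container = 'c' AND "doctype" = 'invoice'."""
    clauses = [f"@container = '{container}'"]
    for key, value in sorted(tags.items()):
        # Quotes would break the expression, so reject them outright.
        if "'" in value or '"' in key:
            raise ValueError(f"unsupported quote character in tag {key!r}")
        clauses.append(f"\"{key}\" = '{value}'")
    return " AND ".join(clauses)


def find_tagged_blobs(account_name: str, container: str, tags: Dict[str, str]) -> Iterator[str]:
    """Yield names of blobs in the container whose index tags match all pairs."""
    # Imported lazily so build_tag_filter() works without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        f"https://{account_name}.blob.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    for blob in service.find_blobs_by_tags(build_tag_filter(container, tags)):
        yield blob.name
```

Sorting the tag keys keeps the generated expression stable, which makes it easier to log and compare routing rules.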
What you can build with Azure Blob Storage and Datagrid
Project teams use this integration to turn file drops into repeatable workflows for intake, review, and downstream system updates. The examples below show how Blob Storage becomes the execution point for Datagrid agents:
Automated insurance claims processing: Claims teams upload claim PDFs and supporting images to a blob container.
Built-world submittal cross-referencing: Project teams drop submittals, spec sheets, and drawings into designated blob containers.
Automated ETL from file drops to analytics: Operations leads receive daily CSV or JSON exports from field systems into Blob Storage.
Compliance archival with intelligent tagging: Datagrid scans incoming blobs, uses extracted content such as document type, project ID, or retention category to drive tagging and routing decisions, and feeds retention-oriented workflows.
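The tagging step in the compliance example could look roughly like the sketch below, which checks tags against the Blob Storage limits (at most 10 index tags per blob, keys of 1 to 128 characters, values of up to 256 characters) before writing them with `BlobClient.set_blob_tags`. The tag names and helper functions are assumptions for illustration, and the check covers counts and lengths only, not the allowed character set.

```python
from typing import Dict

MAX_TAGS = 10
MAX_KEY_LEN = 128
MAX_VALUE_LEN = 256


def validate_index_tags(tags: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of tags, or raise ValueError if a size limit is exceeded."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} index tags are allowed per blob")
    for key, value in tags.items():
        if not 1 <= len(key) <= MAX_KEY_LEN:
            raise ValueError(f"tag key length must be 1-{MAX_KEY_LEN}: {key!r}")
        if len(value) > MAX_VALUE_LEN:
            raise ValueError(f"tag value too long for key {key!r}")
    return dict(tags)


def apply_tags(account_name: str, container: str, blob_name: str, tags: Dict[str, str]) -> None:
    """Replace the blob's index tags with the validated set."""
    # Imported lazily so validate_index_tags() works without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        f"https://{account_name}.blob.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    blob_client = service.get_blob_client(container=container, blob=blob_name)
    # set_blob_tags overwrites any tags already on the blob.
    blob_client.set_blob_tags(validate_index_tags(tags))
```

Because the write replaces existing tags, a workflow that adds a retention category to an already tagged blob would need to read the current tags first and merge them.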
Resources and documentation
Azure Blob Storage introduction for core Blob Storage concepts and object hierarchy
Blob Service REST API reference covering account, container, and blob operations
Authorize data access for Entra ID, SAS, and Shared Key guidance
Azure Event Grid overview for blob event handling and event-driven workflows
Lifecycle management overview for automated tier transitions and expiration policies in Blob Storage
Frequently asked questions
What authentication method should I use to connect Azure Blob Storage with Datagrid?
Microsoft recommends authorizing with Microsoft Entra ID and managed identities, and disallowing Shared Key authorization on storage accounts. For the Datagrid integration, use a service principal with the Storage Blob Data Contributor RBAC role assigned at the storage account level. If your workflow requires SAS tokens, for example to grant time-limited access to specific containers, generate a User Delegation SAS backed by Entra ID credentials rather than an account-key-based SAS.
What file formats can Datagrid process from Azure Blob Storage?
Azure Blob Storage stores unstructured data without format restrictions. Datagrid processes common file types stored as blobs, including PDFs, CSVs, JSON, Parquet, TIFF images, DOCX, and PPTX files.
Why am I getting a 403 Forbidden error when Datagrid tries to access my Blob Storage?
Typical causes include expired SAS tokens, RBAC role assignments that have not finished propagating, and other authentication or access configuration issues.
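An expired SAS token is the quickest of these causes to rule out, because the expiry time is carried in the token's `se` parameter. The standard-library sketch below reads it; the function names are illustrative, and it expects the bare token (the query string), not a full blob URL.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs

# Timestamp formats accepted for the signed expiry (se) field, all in UTC.
_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%MZ", "%Y-%m-%d")


def sas_expiry(sas_token: str) -> datetime:
    """Return the expiry time encoded in a SAS token's se parameter."""
    params = parse_qs(sas_token.lstrip("?"))
    if "se" not in params:
        raise ValueError("SAS token has no se (signed expiry) parameter")
    raw = params["se"][0]
    for fmt in _FORMATS:
        try:
            return datetime.strptime(raw, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized expiry format: {raw!r}")


def is_expired(sas_token: str, now: Optional[datetime] = None) -> bool:
    """Return True if the token's expiry time has passed."""
    current = now or datetime.now(timezone.utc)
    return current >= sas_expiry(sas_token)
```

If the token is still valid, the next things to check are the role assignment on the service principal and whether it has finished propagating.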
Similar integrations
Amazon AWS S3: Common cross-cloud object store used alongside Blob Storage for migration, backup, and multi-cloud sync workflows.
Google Cloud Storage: Alternative cloud object store for cross-cloud ETL and sync pipelines involving Blob Storage and analytics destinations.
Azure Data Lake Storage: Complementary Microsoft data lake tier providing hierarchical namespaces and analytics-optimized access alongside Blob Storage.
Databricks: Often paired to run Spark transformations and feature engineering on data landed in Blob Storage for ML and analytics.
Snowflake: Popular analytics destination for ELT workflows that ingest transformed files exported from Blob Storage.
BigQuery: Google Cloud data warehouse used as a target for ETL pipelines extracting and loading data from Blob Storage.