Datagrid, a Procore Company
© 2026 Datagrid, a Procore company. All rights reserved.

Azure Data Lake Storage + Datagrid integration

Connect Azure Data Lake Storage Gen2 with Datagrid to execute file ingestion, transformation, and bidirectional data workflows from your enterprise data lake.

Overview

What is Azure Data Lake Storage: Azure Data Lake Storage (ADLS) Gen2 is Microsoft's petabyte-scale storage service for analytics workloads. It is built on Azure Blob Storage with a hierarchical namespace enabled, which gives it true directory semantics, POSIX-like ACLs, and native Hadoop compatibility through the ABFS driver. ADLS Gen2 stores structured, semi-structured, and unstructured data in its raw format for use by Azure analytics and lakehouse services.
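As a rough illustration of the ABFS addressing mentioned above, the sketch below builds an abfss:// URI of the form the ABFS driver uses for ADLS Gen2 paths. The account, filesystem, and path values are hypothetical placeholders, and the helper name abfss_uri is an assumption for this example, not part of any Azure SDK or of Datagrid.

```python
def abfss_uri(account: str, filesystem: str, path: str = "") -> str:
    """Build an ABFS (TLS) URI for a path in an ADLS Gen2 filesystem.

    Shape: abfss://<filesystem>@<account>.dfs.core.windows.net/<path>
    """
    # Normalize the path so the URI never contains doubled or trailing slashes.
    clean_path = path.strip("/")
    base = f"abfss://{filesystem}@{account}.dfs.core.windows.net"
    return f"{base}/{clean_path}" if clean_path else f"{base}/"


# Hypothetical storage account and container names, for illustration only.
print(abfss_uri("examplelake", "bronze", "/sensors/2026/"))
```

The filesystem (container) comes before the @ and the storage account after it, which is why the same directory path can exist independently in several containers of one account.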

How to integrate Azure Data Lake Storage with Datagrid

Use this integration to connect ADLS Gen2 filesystems and directories to Datagrid so that agents can run ingestion, transformation, and writeback workflows in your data lake. Start by authorizing the connection, then configure authentication, and finally review the sync behavior for the storage locations Datagrid will monitor.

Authorize the connection

Follow these steps to connect your storage account and grant Datagrid access to the filesystems and paths you want agents to process.

  1. Open Datagrid and go to Settings > Integrations > Add New.

  2. Search for Azure Data Lake Storage in the integration list.

  3. Select the integration and follow the Microsoft Entra ID authorization prompt to sign in.

  4. Grant Datagrid the requested permissions on your ADLS Gen2 storage account.

  5. Ensure the connection has the required permission level: Azure Datalake Storage Administrator.

  6. Select the specific filesystems (containers) and directory paths you want Datagrid agents to access.

  7. Confirm the connection and verify read/write access from the integration dashboard.

Configure authentication

Datagrid authenticates to ADLS Gen2 via Microsoft Entra ID (OAuth 2.0), the authentication method Microsoft recommends for production workloads. For the Datagrid integration, the required permission is Azure Datalake Storage Administrator.
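For readers unfamiliar with how an Entra ID OAuth 2.0 token for Azure Storage is requested, the sketch below assembles a generic client-credentials token request. This is not Datagrid's flow (Datagrid uses the interactive authorization prompt described above); it only shows the general shape of such a request. The tenant ID, client ID, and secret are hypothetical placeholders, and the function build_token_request is an illustrative helper that performs no network call.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request.

    The scope targets Azure Storage; the caller would POST the body as
    application/x-www-form-urlencoded to obtain a bearer token.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://storage.azure.com/.default",
    })
    return url, body


# Placeholder values only; never hard-code real secrets.
url, body = build_token_request("example-tenant", "example-client", "example-secret")
print(url)
```

The returned access token would then be sent as an Authorization: Bearer header on requests to the storage account's DFS endpoint.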

Define data sync behavior

After authentication is complete, review the connection scope and sync behavior Datagrid applies to the selected storage locations.

  • Direction: Bidirectional (read and write)

  • Data objects synced: Filesystems (containers), directories, files, and user-defined metadata

  • Supported file formats: Read/write: Parquet, CSV, JSON, Avro, ORC, and binary files; read-only: Excel

  • Trigger options: Scheduled sync, source change detection, and event-driven workflows where configured

  • Prerequisites: A Datagrid Pro or Enterprise subscription and a supported browser (Safari, Chrome, Edge, or Firefox)
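The supported-format rule above (read/write for most formats, read-only for Excel) can be sketched as a small lookup. The file extensions chosen for each format, and the decision to treat any unrecognized extension as a binary file, are assumptions made for this illustration rather than documented Datagrid behavior.

```python
from pathlib import PurePosixPath

# Assumed extension mapping for the formats listed above.
READ_WRITE_SUFFIXES = {".parquet", ".csv", ".json", ".avro", ".orc"}
READ_ONLY_SUFFIXES = {".xlsx", ".xls"}  # Excel is listed as read-only


def access_mode(path: str) -> str:
    """Classify a file path as 'read-only' or 'read-write' by extension."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in READ_ONLY_SUFFIXES:
        return "read-only"
    # Known analytics formats and (by assumption) all other binary files.
    return "read-write"


for name in ("bronze/readings.csv", "bronze/budget.XLSX", "bronze/scan.bin"):
    print(name, "->", access_mode(name))
```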

Why use Azure Data Lake Storage with Datagrid

This integration fits operators running mission-critical programs who need raw lake data turned into reliable downstream workflows without manual handoffs.

  • Automated file classification and routing: Datagrid agents detect new files as they land in ADLS Gen2 containers and classify them by type and schema.

  • Bidirectional data lake access: Datagrid agents read raw data from Bronze zones and write cleaned, enriched results back to Silver and Gold zones.

  • Cross-platform data movement: Datagrid connects ADLS Gen2 with 100+ other sources and downstream operational systems.

  • Schema-adaptive processing: When source applications change data formats or introduce new fields, Datagrid agents adapt to schema changes as they occur instead of breaking the pipeline.

  • Event-driven execution: Datagrid agents trigger on source change detection in ADLS Gen2.

  • Document extraction at scale: Datagrid agents extract structured data from unstructured files (PDFs, scanned forms, spreadsheets) stored in ADLS Gen2 and write the output to curated directories or downstream databases.
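To make the Bronze-to-Silver routing idea concrete, here is a minimal sketch that maps a raw file path in a Bronze zone to a target path in a Silver zone. It assumes the zones are top-level directories named bronze and silver and that cleaned output is written as Parquet; both are illustrative conventions, not Datagrid requirements, and silver_target is a hypothetical helper.

```python
from pathlib import PurePosixPath


def silver_target(bronze_path: str) -> str:
    """Map 'bronze/<dirs>/<file>.<ext>' to 'silver/<dirs>/<file>.parquet'."""
    parts = PurePosixPath(bronze_path.lstrip("/")).parts
    # Require the zone directory plus at least a file name beneath it.
    if len(parts) < 2 or parts[0] != "bronze":
        raise ValueError(f"not a file under the bronze zone: {bronze_path!r}")
    # Keep the directory layout, swap the zone, and normalize the format.
    return str(PurePosixPath("silver", *parts[1:]).with_suffix(".parquet"))


print(silver_target("bronze/sensors/site-a/readings.csv"))
```

Preserving the directory layout under the new zone keeps lineage easy to trace: each Silver file can be matched back to the raw Bronze file it came from.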

What you can build with Azure Data Lake Storage and Datagrid

ADLS Gen2 works well as a system of record for high-volume project files, operational data, and downstream reporting datasets. When connected to Datagrid, those stored inputs trigger repeatable workflows across the built world and other data-heavy operations.

Here are several workflows teams run with this integration.

  • Automated built-world document processing: Store project submittals, RFIs, and specification sheets in ADLS Gen2 as they arrive from field teams.

  • IoT data transformation pipelines: Stream sensor data and equipment logs from jobsites into ADLS Gen2 Bronze containers in CSV or JSON format.

  • Multi-source data lake consolidation: Pull financial records from ERP systems, project data from construction platforms, and vendor documents from email into a single ADLS Gen2 storage account.

  • Automated compliance file auditing: Maintain a repository of safety certifications, inspection reports, and regulatory filings in ADLS Gen2.

Resources and documentation

  • Azure Data Lake Storage Gen2 introduction: Microsoft's official overview of ADLS Gen2 architecture and features

  • ADLS Gen2 REST API reference: Filesystem and Path operations on the DFS endpoint

  • ADLS Gen2 access control model: RBAC roles, POSIX ACLs, and how they interact

  • ADLS Gen2 best practices: file sizing, directory structure, format selection, and performance guidance

  • Microsoft Entra ID authorization for Azure Storage: OAuth 2.0 setup and token handling

Frequently asked questions

What file formats can Datagrid process from ADLS Gen2?

Datagrid agents read and write common analytics formats stored in ADLS Gen2, including Parquet, CSV, JSON, Avro, ORC, and binary files, with read access also available for Excel. For analytics-heavy workloads, Microsoft recommends Parquet or ORC for columnar queries and Avro for write-heavy or event-bus workloads. Datagrid also extracts structured data from unstructured files like PDFs and scanned documents stored in ADLS Gen2.

Does the ADLS Gen2 integration support both read and write operations?

Yes. The Datagrid integration provides bidirectional access. Datagrid agents read files and directories from any accessible container and write transformed outputs back to ADLS Gen2. This bidirectional capability covers the full Medallion architecture workflow: read from Bronze (raw) zones, process through Silver (cleaned), and write to Gold (curated) zones.

What Azure permissions does my account need for the Datagrid connection?

For the Datagrid integration, the required permission is Azure Datalake Storage Administrator. Microsoft documents the full access control model including how RBAC roles interact with POSIX ACLs on HNS-enabled accounts.
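As background for the ACL side of that access control model, the sketch below parses an ACL string in the comma-separated [scope:]type:id:permissions form that ADLS Gen2 uses, for example user::rwx,group::r-x,other::---. The AclEntry type and parse_acl function are hypothetical helpers written for this illustration, and the object ID in the example is a placeholder.

```python
from typing import List, NamedTuple


class AclEntry(NamedTuple):
    is_default: bool   # True when the entry carries the 'default:' scope
    kind: str          # 'user', 'group', 'mask', or 'other'
    principal: str     # object ID, or '' for the owning user/group
    permissions: str   # e.g. 'rwx', 'r-x', '---'


def parse_acl(acl: str) -> List[AclEntry]:
    """Parse a comma-separated ACL string into structured entries."""
    entries = []
    for raw in acl.split(","):
        fields = raw.strip().split(":")
        is_default = fields[0] == "default"
        if is_default:
            fields = fields[1:]  # drop the scope prefix
        if len(fields) != 3:
            raise ValueError(f"malformed ACL entry: {raw!r}")
        kind, principal, permissions = fields
        entries.append(AclEntry(is_default, kind, principal, permissions))
    return entries


for entry in parse_acl("user::rwx,group::r-x,other::---,default:user:1234:r-x"):
    print(entry)
```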

Similar Integrations

  • Azure Blob Storage: Underlying object service that ADLS Gen2 extends; useful for comparing flat versus hierarchical namespaces and cross-endpoint access.

  • Databricks: Primary compute and lakehouse engine that reads and writes ADLS Gen2 for Delta Lake-powered analytics and machine learning pipelines.

  • Amazon AWS S3: Cross-cloud object store competitor often used in hybrid architectures and for comparing storage, lifecycle, and access patterns.

  • Google Cloud Storage: Alternative cloud object storage used in multi-cloud strategies and for integrating analytics across GCP and Azure data lakes.

  • MS Fabric: Microsoft's unified analytics platform built on OneLake, which relies on ADLS Gen2 capabilities for governance and analytics workflows.

  • BigQuery: Cloud data warehouse often compared as an analytics destination when syncing curated ADLS Gen2 datasets across platforms.
