Datagrid, a Procore Company

Connector

Azure Data Lake Storage + Datagrid Integration

Connect Azure Data Lake Storage with Datagrid to automate data workflows using agentic AI.

Set up the Azure Data Lake Storage integration in Datagrid

On this page

  • Overview
  • How to integrate Azure Data Lake Storage with Datagrid
  • Why use Azure Data Lake Storage with Datagrid
  • What you can build with Azure Data Lake Storage and Datagrid
  • Resources and documentation
  • Frequently asked questions
  • Similar integrations
  • Browse by category

Overview

Azure Data Lake Storage Gen2 is Microsoft's enterprise-grade cloud data lake, built on top of Azure Blob Storage with a hierarchical namespace enabled. It stores data in any format, including CSV, JSON, Parquet, Avro, and binary, and handles petabyte-scale workloads. ADLS Gen2 provides true directory structures with atomic rename and delete operations, POSIX-compliant access control lists, and multi-protocol access via Blob REST API, DFS REST API, HDFS, NFS 3.0, and SFTP.

What is Azure Data Lake Storage: Azure Data Lake Storage Gen2 is Microsoft's cloud data lake for structured, semi-structured, and unstructured data. It combines large-scale storage with hierarchical directories and data lake access patterns.

Datagrid connects to Azure Data Lake Storage Gen2 as both a source and a destination. Datagrid's AI agents read files from the lake, transform data into business-ready formats, and write results back on a schedule or when source files change. This article covers what the integration does inside Datagrid: setup, authentication, sync behavior, and workflow examples. It does not cover Azure storage account provisioning, which the linked Azure documentation addresses.

The integration covers file, directory, and container access inside ADLS Gen2. Datagrid's agents ingest raw files from Bronze directories, apply transformations, and route curated output to downstream systems or write enriched data back into Gold directories for analytics and reporting. Teams can also blend ADLS Gen2 data with 50+ other sources in a single workflow.

How to integrate Azure Data Lake Storage with Datagrid

This setup is for operators who need Datagrid to read from and write to an Azure data lake without manual handoffs. The steps below walk through adding the integration, authenticating access, configuring sync behavior, and reviewing the resulting configuration.

Add the integration

  1. Log in to Datagrid and go to Settings > Integrations > Add New

  2. Select Azure Data Lake Storage from the integration list

  3. Enter your Azure Storage account name and authenticate

  4. Select the container or filesystem you want Datagrid to access

  5. Configure read, write, or bidirectional access based on your workflow requirements

  6. Test the connection and save

Authenticate access

The integration authenticates using your Azure credentials. Microsoft Entra ID (OAuth 2.0) is the recommended authorization method for ADLS Gen2. The connecting identity requires an appropriate Azure RBAC role: Storage Blob Data Reader for read-only access or Storage Blob Data Contributor for read-write access, assigned at the storage account or container scope.

In Datagrid, your account must have the Azure Data Lake Storage Administrator permission to configure the integration.
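The role requirements above can be sketched as a simple lookup. This is illustrative only: the role names come from the section above, but the dictionary and function are hypothetical helpers, not part of Datagrid's or Azure's API.

```python
# Illustrative mapping from the access mode selected in Datagrid to the
# Azure RBAC role the connecting identity needs on the storage account
# or container. Role names are the built-in Azure roles named above.

ROLE_FOR_ACCESS = {
    "read": "Storage Blob Data Reader",
    "write": "Storage Blob Data Contributor",
    "bidirectional": "Storage Blob Data Contributor",  # read-write needs Contributor
}

def required_rbac_role(access_mode: str) -> str:
    """Return the Azure RBAC role required for a given integration access mode."""
    if access_mode not in ROLE_FOR_ACCESS:
        raise ValueError(f"unknown access mode: {access_mode!r}")
    return ROLE_FOR_ACCESS[access_mode]

print(required_rbac_role("bidirectional"))  # Storage Blob Data Contributor
```

Note that write access implies Contributor, which also grants read, so bidirectional workflows need only one role assignment.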

Configure data sync

The integration supports the following sync settings and data objects.

  • Sync direction — Bidirectional (read from and write to ADLS Gen2)

  • Supported formats — CSV, JSON, Parquet, Avro, ORC, XML, Excel, binary

  • Data objects — Files, directories, containers/filesystems

  • Trigger types — Scheduled, source change

  • Access protocols — DFS REST API (dfs.core.windows.net), Blob REST API (blob.core.windows.net)
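The two access protocols listed above resolve to per-account endpoint URLs. The sketch below derives them from a storage account name; "yourstorageacct" is a placeholder, and the helper is illustrative rather than a Datagrid or Azure SDK function.

```python
# Illustrative: derive the DFS and Blob REST endpoints for an Azure Storage
# account. ADLS Gen2 exposes both; the DFS endpoint serves directory-aware
# (hierarchical namespace) operations, the Blob endpoint serves object operations.

def adls_endpoints(account: str) -> dict[str, str]:
    """Return the DFS and Blob REST endpoint URLs for a storage account name."""
    return {
        "dfs": f"https://{account}.dfs.core.windows.net",
        "blob": f"https://{account}.blob.core.windows.net",
    }

endpoints = adls_endpoints("yourstorageacct")
print(endpoints["dfs"])   # https://yourstorageacct.dfs.core.windows.net
print(endpoints["blob"])  # https://yourstorageacct.blob.core.windows.net
```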

Review a sample configuration

The following sample shows how the integration settings can be structured based on the setup fields above.

{
  "integration": "Azure Data Lake Storage",
  "storage_account": "your-storage-account",
  "container_or_filesystem": "project-data",
  "access": "bidirectional",
  "triggers": ["scheduled", "source change"],
  "formats": ["CSV", "JSON", "Parquet", "Avro", "ORC", "XML", "Excel", "binary"],
  "protocols": ["dfs.core.windows.net", "blob.core.windows.net"]
}

Use these settings to define how Datagrid reads from and writes to your lake. For detailed setup requirements and permissions, refer to the Datagrid documentation.
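A configuration shaped like the sample above can be sanity-checked before saving. This is a hypothetical pre-save check, not a Datagrid API; only the field names mirror the sample JSON.

```python
import json

# Illustrative: verify that a configuration document like the sample above
# contains the fields a connection cannot work without. The required-field
# set is an assumption for this sketch.

REQUIRED_FIELDS = {"integration", "storage_account", "container_or_filesystem", "access"}

def missing_fields(config_json: str) -> set[str]:
    """Return any required fields absent from the JSON configuration."""
    config = json.loads(config_json)
    return REQUIRED_FIELDS - config.keys()

sample = (
    '{"integration": "Azure Data Lake Storage",'
    ' "storage_account": "your-storage-account",'
    ' "container_or_filesystem": "project-data",'
    ' "access": "bidirectional"}'
)
print(missing_fields(sample))  # set()
```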

Why use Azure Data Lake Storage with Datagrid

This integration fits operators running mission-critical programs who need answers and action, not admin. Datagrid executes file-based workflows directly inside the data lake so operators can keep source data, transformed output, and downstream routing in one operating model.

  • Bidirectional data lake access: Datagrid's AI agents read raw files from ADLS Gen2 and write transformed results back to the same repository without intermediate staging.

  • Format-agnostic processing: Agents handle CSV, JSON, Parquet, Avro, XML, and binary files stored in the lake, extracting and transforming data across different structures.

  • Event-driven automation: Datagrid can trigger workflows on source changes so pipelines run when new files land in ADLS Gen2.

  • Cross-source data blending: Combine ADLS Gen2 data with 50+ other sources in a single automated workflow.

  • Hierarchical namespace operations: Datagrid works with ADLS Gen2's true directory structure, including atomic rename and delete operations.

  • Autonomous pipeline execution: Agents handle extraction, cleaning, enrichment, and routing across Bronze, Silver, and Gold layers with minimal manual intervention between steps.

These capabilities matter most when operators need consistent execution across large file volumes, multiple source systems, and changing downstream requirements.
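The source-change trigger described above boils down to comparing directory listings over time. The sketch simulates that with in-memory sets; a production connector would list paths through the DFS REST API, and `detect_new_files` is a hypothetical helper.

```python
# Illustrative: detect new file arrivals by diffing two listings of an
# ADLS Gen2 directory. Paths are placeholders.

def detect_new_files(previous: set[str], current: set[str]) -> set[str]:
    """Return paths present in the current listing but not the previous one."""
    return current - previous

before = {"bronze/costs/2024-05.csv"}
after = before | {"bronze/costs/2024-06.csv"}
print(detect_new_files(before, after))  # {'bronze/costs/2024-06.csv'}
```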

What you can build with Azure Data Lake Storage and Datagrid

Datagrid executes a wide range of file-based workflows on top of Azure Data Lake Storage. The examples below show how operators and project teams can standardize recurring work across raw, curated, and downstream data flows.

  • Automated ETL across a medallion architecture: Datagrid's AI agents read raw project files from an ADLS Gen2 Bronze layer, apply validation and transformation logic, and write curated output into Silver and Gold directories. A construction firm could ingest daily schedule exports, spec sheets, and RFI logs into ADLS Gen2, then have agents cross-reference and produce consolidated project status datasets for dashboards without manual data wrangling.

  • Agentic document extraction from lake storage: Store unstructured project files such as submittals, drawings, contracts, and inspection reports in ADLS Gen2 and let Datagrid agents extract structured data from them. An insurance operations team could drop claim files into a designated container and have agents parse key fields, validate them against policy data from a connected system, and write structured claim records back to the lake for downstream analytics.

  • Cross-platform data consolidation for project teams: Pull data from ADLS Gen2 alongside other databases, document systems, and field management tools into a single automated workflow. A manufacturing team could consolidate BOMs stored as Parquet files in the data lake with ERP records and supplier specs, producing a unified dataset that agents keep current as source files change.

  • Triggered data routing and distribution: Configure Datagrid to detect new file arrivals in specific ADLS Gen2 directories and route processed data to the right destination. When a project team uploads updated cost data to a designated container, agents can transform the data, push summaries to team channels, write aggregated records to connected analytics systems, and archive processed files in a separate ADLS Gen2 directory from the same trigger event.

These examples show the practical range of the integration: ingest, transform, validate, route, and write back without forcing teams to move core lake data into a separate workflow layer.
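The medallion flow in the ETL example above amounts to relocating a file's path from one layer directory to the next. The sketch below shows that path logic only; layer and directory names are placeholders, and `promote` is a hypothetical helper, not a Datagrid function.

```python
from pathlib import PurePosixPath

# Illustrative: promote a lake path through the Bronze -> Silver -> Gold
# layers used in the automated-ETL example.

LAYERS = ["bronze", "silver", "gold"]

def promote(path: str) -> str:
    """Return the same lake path relocated to the next medallion layer."""
    parts = PurePosixPath(path).parts
    idx = LAYERS.index(parts[0])  # raises ValueError if the top layer is unknown
    if idx == len(LAYERS) - 1:
        raise ValueError("path is already in the gold layer")
    return str(PurePosixPath(LAYERS[idx + 1], *parts[1:]))

print(promote("bronze/projects/schedule.csv"))  # silver/projects/schedule.csv
```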

Resources and documentation

Use these resources when you need deeper product, API, or access-control detail.

  • Azure Data Lake Storage Gen2 introduction

  • ADLS Gen2 REST API reference: filesystem and path operations, authentication methods, current API version

  • Create a storage account for ADLS Gen2: quickstart guide for provisioning an HNS-enabled account

  • ADLS Gen2 access control model: ACL management across Azure Storage Explorer, Portal, .NET, Java, Python, and CLI

These references cover the Azure-side concepts that most often affect setup and permissions.

Frequently asked questions

What authentication method does the Azure Data Lake Storage integration use?

The integration uses Azure credentials, and Microsoft Entra ID (OAuth 2.0) is Microsoft's recommended authorization approach for ADLS Gen2. The connecting identity needs an Azure RBAC role such as Storage Blob Data Reader for read-only access or Storage Blob Data Contributor for read-write access, assigned at the storage account or container scope. In Datagrid, your account also requires the Azure Data Lake Storage Administrator permission.

Does the integration support both reading from and writing to ADLS Gen2?

Yes. Datagrid supports Azure Data Lake Storage as both a source and a destination. Datagrid's AI agents can ingest data from your lake and write processed or enriched data back.

What file formats can Datagrid process from Azure Data Lake Storage?

ADLS Gen2 stores data in any format, and Datagrid can process common types used in analytics workflows, including CSV, JSON, Parquet, Avro, ORC, XML, Excel, and binary files.

Does my Azure Storage account need the hierarchical namespace enabled?

ADLS Gen2 capabilities, including true directory structures, atomic directory operations, and POSIX ACLs, are activated by enabling the hierarchical namespace on your Azure Storage account. Without HNS enabled, the account operates as standard Azure Blob Storage and does not provide data lake functionality. Confirm HNS is enabled on your account before configuring the Datagrid integration.

Can Datagrid trigger workflows when new data lands in ADLS Gen2?

Yes. The Datagrid integration supports trigger-based execution, including source change detection.

Similar integrations

The following integrations are closely related to Azure Data Lake Storage workflows in Datagrid.

  • Azure Blob Storage: General-purpose Azure object storage and the foundational layer ADLS Gen2 is built on, without the hierarchical namespace.

  • Azure SQL Database: Managed relational database on Azure, commonly used alongside ADLS Gen2 for structured query workloads.

  • Azure PostgreSQL Database: Managed PostgreSQL on Azure, used for transactional and analytical workloads that complement lake storage.

These related integrations cover adjacent storage and database patterns commonly used with ADLS Gen2.

Browse by category

  • Cloud Storage

  • Database


Request a Demo

You've got more important things to do. Let Datagrid handle the rest.

Watch our quick demo to see how Datagrid transforms workflows. Discover the seamless integration of our AI assistants in real-time tasks.

Book a Demo · Learn More