Add an Azure Data Lake Storage Source Datastore

A source datastore is a storage location Qualytics connects to so it can profile, scan, and monitor data. Adding Azure Data Lake Storage as a source lets Qualytics read files directly from your container and run quality operations on the data they contain.

Before you start, review the Azure Data Lake Storage Permissions and the available Authentication methods.

This page covers two ways to add an Azure Data Lake Storage source datastore: using a new connection or reusing a saved one. Both flows share the same form fields. Use the tabs in Field reference below to pick the flow that matches your situation. If this is your first Azure Data Lake Storage datastore in Qualytics, use the New Connection tab.

For the generic step-by-step walkthrough of the Add Source Datastore modal (opening it, toggling the connection mode, testing, and finishing), see Add Source Datastore. The fields described below apply to the Azure Data Lake Storage portion of that flow.

Field reference

The Add Source Datastore form changes depending on whether you create a new connection or reuse a saved one. Pick the tab below that matches your flow.

When Add New Connection is toggled ON, the form shows five groups of fields: Connection Properties, Authentication, Secrets Management (optional Vault integration), Datastores Extraction, and Datastore Properties.

Connection Properties

These fields define where the Azure Data Lake Storage container lives.

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Connection Name | Yes | A label for the saved connection (e.g., acme_adls_lake). Appears in the Connection dropdown when you create future datastores. |
| 2 | URI | Yes | The container-level ABFS URI in the format abfss://<container>@<account>.dfs.core.windows.net. Do not include a file path or subfolder here. Use Root Path in Datastores Extraction below to scope to a subfolder. |
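
The URI rule above (container and account only, no path) can be checked before you paste a value into the form. The following is a minimal sketch, not Qualytics code: the function names are hypothetical, and the character rules in the pattern are simplified assumptions about Azure naming.

```python
import re

# Pattern for a container-level ABFS URI with no trailing path.
# Simplified assumption: lowercase letters, digits, and hyphens for the
# container; lowercase letters and digits for the storage account.
_ABFSS_URI = re.compile(
    r"^abfss://(?P<container>[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"
    r"@(?P<account>[a-z0-9]+)\.dfs\.core\.windows\.net/?$"
)


def parse_container_uri(uri: str) -> tuple[str, str]:
    """Return (container, account); raise ValueError if the URI carries a path."""
    match = _ABFSS_URI.match(uri.strip())
    if match is None:
        raise ValueError(f"not a container-level abfss URI: {uri!r}")
    return match["container"], match["account"]


def build_container_uri(container: str, account: str) -> str:
    """Assemble the URI in the format the URI field expects."""
    return f"abfss://{container}@{account}.dfs.core.windows.net"
```

A value such as abfss://raw@acmelake.dfs.core.windows.net/orders is rejected by this sketch because the subfolder belongs in Root Path, not in the URI.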

Authentication

Choose how Qualytics authenticates to Azure. Setting Type changes the credential fields shown below it.

For Shared Key authentication, supply the storage account name and one of its access keys. Qualytics uses these to sign requests directly against the storage account.

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Type | Yes | Set to Shared Key. |
| 2 | Account Name | Yes | The Azure storage account name that owns the container. |
| 3 | Access Key | Yes | One of the storage account's access keys. |
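
To illustrate what signing a request with an access key involves, here is a rough sketch of the general Azure Storage Shared Key scheme: an HMAC-SHA256 over a string-to-sign, keyed with the Base64-decoded access key. How Qualytics builds its requests internally is not documented here, so the function name, the fake key, and the stand-in string-to-sign are all assumptions for illustration.

```python
import base64
import hashlib
import hmac


def sign_with_shared_key(account_name: str, access_key_b64: str, string_to_sign: str) -> str:
    """Build a Shared Key Authorization header value for a prepared string-to-sign."""
    key = base64.b64decode(access_key_b64)  # access keys are Base64-encoded
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {account_name}:{signature}"


# Illustrative values only: a fake key and a placeholder string-to-sign.
# A real string-to-sign is assembled from the HTTP verb, headers, and resource.
fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
header = sign_with_shared_key("acmelake", fake_key, "GET\nplaceholder-canonical-request")
```

Because the access key signs every request, anyone holding it has full access to the storage account, which is one reason to prefer a scoped service principal where possible.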

For Service Principal authentication, Qualytics authenticates as an Azure AD (now Microsoft Entra ID) app registration. Grant the service principal the role required to read the container (see Permissions).

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Type | Yes | Set to Service Principal. |
| 2 | Client ID | Yes | The Application (Client) ID of the Azure AD app registration. |
| 3 | Client Secret | Yes | A client secret generated for the app registration. |
| 4 | Tenant ID | Yes | The Directory (Tenant) ID where the app registration lives. |
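
The three values above map onto the standard OAuth 2.0 client-credentials request against the Microsoft identity platform. The sketch below only builds that request so you can see where each field goes; it does not send it. Whether Qualytics issues exactly this request is an assumption, and build_token_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Prepare (but do not send) a client-credentials token request for Azure Storage."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,          # Client ID field
        "client_secret": client_secret,  # Client Secret field
        "scope": "https://storage.azure.com/.default",
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Sending the request with urllib.request.urlopen and reading access_token from the JSON response is a quick way to confirm the credentials are valid before entering them in the form.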

Secrets Management (optional)

Use this group only if you want Qualytics to pull credentials from HashiCorp Vault instead of typing them into the form. Toggle HashiCorp Vault ON to expose the fields below.

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Login URL | Yes | The Vault endpoint Qualytics uses to authenticate (e.g., https://vault.example.com/v1/auth/approle/login). |
| 2 | Credentials Payload | Yes | A JSON body containing the credentials Vault expects (e.g., {"role_id":"...","secret_id":"..."}). |
| 3 | Token JSONPath | Yes | The JSONPath that extracts the client token from Vault's response. Defaults to $.auth.client_token. |
| 4 | Secret URL | Yes | The Vault path where the secret is stored (e.g., https://vault.example.com/v1/secret/data/adls). |
| 5 | Token Header Name | Yes | The HTTP header name used to send the token. Defaults to X-Vault-Token. |
| 6 | Data JSONPath | Yes | The JSONPath that extracts the secret payload from Vault's response. Defaults to $.data. |
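
The six fields above describe a two-step exchange: log in to get a token, then fetch the secret with that token. The sketch below walks through the extraction steps against simulated responses rather than a live Vault server. The extract helper is hypothetical and handles only simple dotted paths, and the response contents are invented for illustration.

```python
from typing import Any


def extract(document: dict[str, Any], json_path: str) -> Any:
    """Resolve a simple dotted JSONPath such as $.auth.client_token.

    Only the `$.a.b.c` form is handled; filters, wildcards, and array
    indexes are out of scope for this sketch.
    """
    if json_path == "$":
        return document
    if not json_path.startswith("$."):
        raise ValueError(f"unsupported JSONPath: {json_path!r}")
    node: Any = document
    for key in json_path[2:].split("."):
        node = node[key]
    return node


# Simulated Vault responses (hypothetical values).
login_response = {"auth": {"client_token": "s.example-token"}}
secret_response = {"data": {"account_name": "acmelake", "secret_key": "example-access-key"}}

token = extract(login_response, "$.auth.client_token")  # Token JSONPath
headers = {"X-Vault-Token": token}                      # Token Header Name
secrets = extract(secret_response, "$.data")            # Data JSONPath
```

With Vault's KV version 2 secrets engine, the secret values sit one level deeper in the response, so a Data JSONPath such as $.data.data may be needed. Check an actual response from your Vault before relying on the default.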

Note

Once Vault is configured, reference any secret value in the Connection Properties or Authentication fields using ${key} (e.g., ${secret_key}). Qualytics resolves the secret at the moment the connection is opened, so updated keys take effect on the next connection.
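
As a rough illustration of the ${key} substitution described in the note, the sketch below replaces placeholders in field values with entries from a secrets mapping. The resolver, the field names, and the secret values are hypothetical; the actual resolution logic inside Qualytics may differ.

```python
import re

# Matches ${key} and captures the key name.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(value: str, secrets: dict[str, str]) -> str:
    """Replace every ${key} reference in value with the matching secret."""
    def lookup(match: re.Match) -> str:
        key = match.group(1)
        if key not in secrets:
            raise KeyError(f"no secret named {key!r}")
        return secrets[key]

    return _PLACEHOLDER.sub(lookup, value)


# Hypothetical form values and secret payload.
fields = {"Account Name": "${account_name}", "Access Key": "${secret_key}"}
secrets = {"account_name": "acmelake", "secret_key": "example-access-key"}
resolved = {name: resolve_placeholders(value, secrets) for name, value in fields.items()}
```

Values without a placeholder pass through unchanged, so literal and Vault-backed fields can be mixed in the same form.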

Datastores Extraction

Pick the subfolder inside the container that Qualytics should read from.

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Root Path | Yes | The subfolder inside the container where the data lives (e.g., /raw/orders/). Use / to read from the container root. |
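
To see how the container-level URI and the Root Path relate, the sketch below joins the two into a single location string. This is an assumption about how the values combine, written for illustration only; full_path is a hypothetical helper, not part of Qualytics.

```python
def full_path(container_uri: str, root_path: str) -> str:
    """Join a container-level URI and a Root Path into one location string."""
    base = container_uri.rstrip("/")
    # Normalize so that "", "/", "raw/orders", and "/raw/orders/" all behave sensibly.
    sub = root_path.strip().strip("/")
    return f"{base}/{sub}/" if sub else f"{base}/"


orders = full_path("abfss://raw@acmelake.dfs.core.windows.net", "/raw/orders/")
root = full_path("abfss://raw@acmelake.dfs.core.windows.net", "/")
```

Here orders points at the /raw/orders/ subfolder and root points at the container root, matching the two cases in the table.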

Datastore Properties

Common fields for every source datastore, visible below the Datastores Extraction section in the same form.

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Name | Yes | A label for the datastore (e.g., acme_lake_orders). Appears on the datastore cards in the workspace. |
| 2 | Teams | Yes | Select one or more teams to associate with this source datastore. |
| 3 | Initiate Sync | No | Automatically sync the datastore to detect containers and fields after creation. |
| 4 | Connection Info | No | Read-only banner that shows the IP address the Qualytics dataplane uses to reach your Azure storage account. Allowlist this IP in your Azure storage firewall so the dataplane can connect. |
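
If your storage firewall already has IP rules, a quick check like the one below can tell you whether the address shown in Connection Info is covered by them. This is a sketch using Python's standard ipaddress module; the IP values come from documentation-reserved ranges and the function name is hypothetical.

```python
import ipaddress


def is_allowlisted(dataplane_ip: str, allowed_rules: list[str]) -> bool:
    """Return True if the IP matches any allowed address or CIDR range."""
    ip = ipaddress.ip_address(dataplane_ip)
    # strict=False lets a rule such as "198.51.100.7/24" be read as its network.
    return any(ip in ipaddress.ip_network(rule, strict=False) for rule in allowed_rules)


# Example rules: one CIDR range and one single address (documentation ranges).
rules = ["198.51.100.0/24", "203.0.113.10"]
covered = is_allowlisted("203.0.113.10", rules)
not_covered = is_allowlisted("192.0.2.55", rules)
```

If the check returns False for the dataplane IP, add the address to the storage account's firewall rules before testing the connection.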

When Add New Connection is toggled OFF and you pick a saved Azure Data Lake Storage connection, the Connection Properties, Authentication, and Secrets Management sections are collapsed and read-only. Qualytics has already validated those credentials, so there is nothing for you to fill in there. You fill in only the Datastores Extraction and Datastore Properties sections below.

To change a saved connection's credentials, edit the connection itself from Settings → Connections. Edits there apply to every datastore that reuses the connection.

Datastores Extraction

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Root Path | Yes | The subfolder inside the container where the data lives. Each datastore can override the Root Path inherited from the saved connection. |

Datastore Properties

| REF. | FIELD | REQUIRED | DESCRIPTION |
|------|-------|----------|-------------|
| 1 | Name | Yes | A label for the datastore. |
| 2 | Teams | Yes | Select one or more teams to associate with this source datastore. |
| 3 | Initiate Sync | No | Automatically sync the datastore to detect containers and fields after creation. |
| 4 | Connection Info | No | Read-only banner that shows the IP address the Qualytics dataplane uses to reach your Azure storage account. Allowlist this IP in your Azure storage firewall so the dataplane can connect. |