Google Cloud Storage
Adding and configuring a Google Cloud Storage connection within Qualytics lets the platform connect to your file system and perform operations such as data discovery, visualization, reporting, syncing, profiling, scanning, anomaly surveillance, and more.
This documentation provides a step-by-step guide on how to add Google Cloud Storage as both a source and enrichment datastore in Qualytics. It covers the entire process, from initial connection setup to testing and finalizing the configuration.
By following these instructions, you can ensure your Google Cloud Storage environment is properly connected with Qualytics, so the platform can help you proactively manage your full data quality lifecycle.
Let's get started.
Google Cloud Storage Setup Guide
This guide will walk you through the steps to set up Google Cloud Storage, including how to retrieve the necessary URIs, access keys, and secret keys, which are essential for integrating this datastore into Qualytics.
Retrieve the Google Cloud Storage URI
To retrieve the Cloud Storage URI, follow the given steps:
- Go to the Cloud Storage Console.
- Navigate to the location of the object (file) that holds the source data.
- At the top of the Cloud Storage console, locate and note down the path to the object.
- Create the URI using the following format: `gs://bucket/file`
    - `bucket` is the name of the Cloud Storage bucket.
    - `file` is the name of the object (file) containing the data.
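As a minimal sketch of the URI format described above, the following Python helper assembles a `gs://` URI from a bucket name and an optional object path. The function name `build_gcs_uri` and the sample bucket and file names are hypothetical and are not part of Qualytics or the Google Cloud SDK.

```python
def build_gcs_uri(bucket: str, file: str = "") -> str:
    """Return a gs:// URI for a bucket and an optional object path."""
    # Remove surrounding whitespace and stray slashes from the bucket name.
    bucket = bucket.strip().strip("/")
    if not bucket:
        raise ValueError("bucket name must not be empty")
    # Drop a leading slash so the URI never contains "//" after the bucket.
    file = file.strip().lstrip("/")
    return f"gs://{bucket}/{file}" if file else f"gs://{bucket}"


# Hypothetical bucket and object names, for illustration only.
print(build_gcs_uri("my-bucket", "exports/orders.csv"))
```

Passing only the bucket name yields the bucket-level form of the URI.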
Retrieve the Access Key and Secret Key
You need these keys when integrating Google Cloud Storage with other applications or services, such as when adding it as a datastore in Qualytics. The keys allow you to reuse existing code to access Google Cloud Storage without needing to implement a different authentication mechanism.
To retrieve the access key and secret key in the Google Cloud Storage Console account, follow the given steps:
Step 1: Log in to the Google Cloud Console and navigate to the Google Cloud Storage settings. This opens the Settings page.

Step 2: Click on the Interoperability tab.

Step 3: Scroll down the Interoperability page and under Access keys for your user account, click the CREATE A KEY button to generate a new Access Key and Secret Key.

Step 4: Use the generated Access Key and Secret Key values when adding your Google Cloud Storage account to Qualytics.

For example, once you generate the keys, they might look like this:
- Access Key: `GOOG1234ABCDEFGH5678`
- Secret Key: `abcd1234efgh5678ijklmnopqrstuvwx`
Warning
Make sure to store these keys securely, as they provide access to your Google Cloud Storage resources.
Datastore Google Cloud Storage Privileges
The permissions required depend on whether you are using Google Cloud Storage as a source or enrichment datastore. Qualytics accesses GCS using HMAC keys (Access Key / Secret Key) or a Service Account Key.
Minimum Permissions (Source Datastore)
The service account or HMAC key must have the following permissions:
| Permission | Purpose |
|---|---|
| `storage.buckets.get` | Validate the bucket exists and retrieve its metadata |
| `storage.objects.get` | Read file contents for profiling and scanning |
| `storage.objects.list` | List files in the bucket to discover data assets |
Tip
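You can grant the object-level permissions by assigning the Storage Object Viewer (roles/storage.objectViewer) role to the service account on the target bucket. This predefined role does not include storage.buckets.get, so grant that permission separately (for example, through a custom role) if bucket validation fails.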
Additional Permissions for Enrichment Datastore
When using Google Cloud Storage as an enrichment datastore, the following additional permissions are required:
| Permission | Purpose |
|---|---|
| `storage.objects.create` | Write enrichment result files |
| `storage.objects.delete` | Remove temporary or outdated enrichment files |
Tip
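You can grant the object-level read and write permissions by assigning the Storage Object Admin (roles/storage.objectAdmin) role to the service account on the target bucket. This predefined role does not include storage.buckets.get, so grant that permission separately (for example, through a custom role) if bucket validation fails.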
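The permission lists above can be turned into a quick check. The sketch below compares a set of granted permissions against what a source or enrichment datastore needs, using only the permission names from the tables in this section. The function name `missing_permissions` is hypothetical, and how you obtain the granted set (for example, from an IAM permission test) is outside the scope of this sketch.

```python
# Permission names taken from the tables in this section.
SOURCE_PERMISSIONS = {
    "storage.buckets.get",
    "storage.objects.get",
    "storage.objects.list",
}
ENRICHMENT_EXTRA_PERMISSIONS = {
    "storage.objects.create",
    "storage.objects.delete",
}


def missing_permissions(granted, enrichment=False):
    """Return the required permissions that are absent from `granted`, sorted."""
    required = set(SOURCE_PERMISSIONS)
    if enrichment:
        # Enrichment datastores need write access on top of read access.
        required |= ENRICHMENT_EXTRA_PERMISSIONS
    return sorted(required - set(granted))


# Example: object-level read access only.
granted = {"storage.objects.get", "storage.objects.list"}
print(missing_permissions(granted))
print(missing_permissions(granted, enrichment=True))
```

An empty list means the granted set covers everything listed for that datastore type.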
Example IAM Policy
Replace <SERVICE_ACCOUNT_EMAIL> and <BUCKET_NAME> with your actual values.
Source Datastore (Read-Only)
{
"bindings": [
{
"role": "roles/storage.objectViewer",
"members": [
"serviceAccount:<SERVICE_ACCOUNT_EMAIL>"
]
}
]
}
Enrichment Datastore (Read-Write)
{
"bindings": [
{
"role": "roles/storage.objectAdmin",
"members": [
"serviceAccount:<SERVICE_ACCOUNT_EMAIL>"
]
}
]
}
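To illustrate how a binding like the ones above fits into an existing bucket policy, here is a minimal sketch that merges a role and member into a policy document held as a Python dictionary. It does not call any Google Cloud API; the function name `add_binding` and the sample service account email are hypothetical.

```python
import copy
import json


def add_binding(policy, role, member):
    """Return a copy of `policy` in which `member` holds `role`."""
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the same role.
        if binding.get("role") == role and "condition" not in binding:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    # No matching binding yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical service account, for illustration only.
policy = add_binding(
    {"bindings": []},
    "roles/storage.objectViewer",
    "serviceAccount:qualytics-reader@example-project.iam.gserviceaccount.com",
)
print(json.dumps(policy, indent=2))
```

Calling the function twice with the same arguments leaves the policy unchanged, which keeps repeated runs from duplicating members.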
Tip
If you need both storage.buckets.get and object-level permissions but want to avoid a broader role, you can create a custom role with only the specific permissions listed in the Minimum Permissions section.
Assigning via the gsutil CLI
# Source Datastore (Read-Only)
gsutil iam ch \
serviceAccount:<SERVICE_ACCOUNT_EMAIL>:roles/storage.objectViewer \
gs://<BUCKET_NAME>
# Enrichment Datastore (Read-Write)
gsutil iam ch \
serviceAccount:<SERVICE_ACCOUNT_EMAIL>:roles/storage.objectAdmin \
gs://<BUCKET_NAME>
Tip
You can also assign roles through the Google Cloud Console by navigating to the bucket, selecting Permissions, and clicking Grant Access.
GCS Roles Summary
| Role | Use Case | Permissions Included |
|---|---|---|
| `roles/storage.objectViewer` | Source Datastore | storage.objects.get, storage.objects.list |
| `roles/storage.objectAdmin` | Enrichment Datastore | storage.objects.get, storage.objects.list, storage.objects.create, storage.objects.delete |
Note
Neither predefined role includes storage.buckets.get. Grant it separately (for example, through a custom role) if bucket metadata validation is required.
Troubleshooting Common Errors
| Error | Likely Cause | Fix |
|---|---|---|
| `403 Forbidden` | The service account or HMAC key lacks the required permissions on the bucket | Assign the appropriate role (Storage Object Viewer or Storage Object Admin) to the service account on the target bucket |
| `404 Not Found: Bucket not found` | The bucket name in the URI is incorrect or the bucket does not exist | Verify the bucket name and ensure the URI follows the format gs://bucket-name |
| `Invalid credentials` | The Access Key / Secret Key pair is incorrect or the service account key file is malformed | Regenerate the HMAC keys from Cloud Storage > Settings > Interoperability or re-download the service account key |
| `The caller does not have storage.objects.list access` | The service account has object-level access but lacks bucket-level list permission | Assign the Storage Object Viewer role at the bucket level (not just object level) |
| `The caller does not have storage.objects.create access` | The enrichment service account lacks write permissions | Upgrade the role assignment from Storage Object Viewer to Storage Object Admin |
Detailed Troubleshooting Notes
Authentication Errors
The error Invalid credentials indicates that the HMAC keys or service account key are incorrect or malformed.
Common causes:
- Incorrect Access Key / Secret Key: the HMAC key pair was copied incorrectly or has been deleted.
- Malformed service account key: the JSON key file is corrupted, truncated, or belongs to a different project.
- Service account disabled: the service account has been disabled in the Google Cloud Console.
Note
HMAC keys are tied to a specific service account. If the service account is deleted or disabled, the HMAC keys will stop working even if they have not been explicitly revoked.
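A corrupted or truncated key file can often be caught before it is uploaded. The sketch below runs a basic structural check on the text of a service account key file; it only confirms that the JSON parses and that a few fields normally present in such files are non-empty. It does not verify the key with Google, and the function name `check_service_account_key` is hypothetical.

```python
import json

# Fields normally present in a downloaded service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw: str) -> list[str]:
    """Return a list of problems found in the key file text (empty if none)."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Truncated or corrupted files usually fail here.
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems


# A truncated file is reported as invalid JSON.
print(check_service_account_key('{"type": "service_account", "project_id": "demo"'))
```

An empty result only means the file looks well-formed; a disabled service account or a key from the wrong project still has to be checked in the Google Cloud Console.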
Permission Errors
The error 403 Forbidden or The caller does not have storage.objects.list access means the credentials are valid but lack the required IAM permissions.
Common causes:
- Missing IAM role: the service account does not have `Storage Object Viewer` (source) or `Storage Object Admin` (enrichment) assigned on the target bucket.
- Role assigned at wrong level: the role is assigned at the project level but a bucket-level policy overrides it.
- Uniform bucket-level access: if the bucket uses uniform bucket-level access (recommended), ensure IAM policies are set at the bucket level, not through ACLs.
- Source vs. enrichment mismatch: the service account has `Storage Object Viewer` but the operation requires write access (enrichment).
Connection Errors
The error 404 Not Found: Bucket not found indicates a configuration issue with the bucket name or URI.
Common causes:
- Bucket does not exist: the bucket name was misspelled or the bucket has been deleted.
- Wrong project: the service account belongs to a different Google Cloud project than the bucket.
- Invalid URI format: the URI must follow `gs://bucket-name`. Extra path segments or incorrect formatting will cause failures.
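A malformed URI can be detected locally before testing the connection. The following sketch splits a `gs://` URI into its bucket and path parts with Python's standard `urllib.parse` module and rejects URIs with the wrong scheme or no bucket. The function name `split_gcs_uri` is hypothetical, and the check does not confirm that the bucket exists.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split a gs:// URI into (bucket, path); raise ValueError if malformed."""
    parsed = urlparse(uri.strip())
    if parsed.scheme != "gs":
        raise ValueError(f"expected a gs:// URI, got: {uri!r}")
    if not parsed.netloc:
        raise ValueError("URI does not contain a bucket name")
    # The path is empty for a bucket-only URI such as gs://bucket-name.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, path = split_gcs_uri("gs://bucket-name")
print(bucket, repr(path))
```

A non-empty second element indicates extra path segments after the bucket name.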
Tip
Start by confirming credentials are valid (authentication errors), then verify IAM role assignments (permission errors), and finally check the bucket name and URI format (connection errors).
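The triage order in the tip above can be expressed as a small lookup. This sketch maps an error message to one of the three categories used in this section, matching on the error strings from the troubleshooting table. The function name `classify_error` is hypothetical, and real error messages may be worded differently.

```python
def classify_error(message: str) -> str:
    """Map an error message to a troubleshooting category from this section."""
    text = message.lower()
    # Check in the recommended order: credentials, then IAM, then bucket/URI.
    if "invalid credentials" in text:
        return "authentication"
    if "403" in text or "does not have storage." in text:
        return "permission"
    if "404" in text or "bucket not found" in text:
        return "connection"
    return "unknown"


print(classify_error("The caller does not have storage.objects.list access"))
```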
Add a Source Datastore
A source datastore is a storage location used to connect and access data from external sources. Google Cloud Storage is an example of a source datastore, specifically a type of Distributed File System (DFS) datastore that is designed to handle data stored in distributed file systems. Configuring a DFS datastore enables the Qualytics platform to access and perform operations on the data, thereby generating valuable insights.
Step 1: Log in to your Qualytics account and click on the Add Source Datastore button located at the top-right corner of the interface.

Step 2: A modal window - Add Datastore will appear, providing you with the options to connect a datastore.

| REF. | FIELDS | ACTIONS |
|---|---|---|
| 1. | Name (Required) | Specify the name of the datastore. The specified name will appear on the datastore cards. |
| 2. | Toggle Button | Toggle ON to create a new source datastore from scratch, or toggle OFF to reuse credentials from an existing connection |
| 3. | Connector (Required) | Select Google Cloud Storage from the dropdown list. |
Option I: Create a Datastore with a new Connection
If the toggle for Add New connection is turned on, then this will prompt you to add and configure the source datastore from scratch without using existing connection details.
Step 1: Select the Google Cloud Storage connector from the dropdown list and add connection details such as Secrets Management, URI, service account key, root path, and teams.

Secrets Management: This is an optional connection property that allows you to securely store and manage credentials by integrating with HashiCorp Vault and other secret management systems. Toggle it ON to enable Vault integration for managing secrets.
Note
After configuring HashiCorp Vault integration, you can use ${key} in any Connection property to reference a key from the configured Vault secret. Each time the Connection is initiated, the corresponding secret value will be retrieved dynamically.
| REF | FIELDS | ACTIONS |
|---|---|---|
| 1. | Login URL | Enter the URL used to authenticate with HashiCorp Vault. |
| 2. | Credentials Payload | Input a valid JSON containing credentials for Vault authentication. |
| 3. | Token JSONPath | Specify the JSONPath to retrieve the client authentication token from the response (e.g., $.auth.client_token). |
| 4. | Secret URL | Enter the URL where the secret is stored in Vault. |
| 5. | Token Header Name | Set the header name used for the authentication token (e.g., X-Vault-Token). |
| 6. | Data JSONPath | Specify the JSONPath to retrieve the secret data (e.g., $.data). |
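To make the JSONPath and `${key}` fields more concrete, here is a rough sketch of the two lookups they describe: pulling a value out of a JSON response with a simple dotted path, and substituting `${key}` references in a connection property. This is an illustration only, not how Qualytics implements Vault integration; the helper names and the sample response are hypothetical, and only plain dotted paths (no filters or wildcards) are handled.

```python
from string import Template


def resolve_simple_jsonpath(document, path):
    """Resolve a dotted JSONPath such as $.auth.client_token against a dict."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    for part in path[1:].split("."):
        if part:  # skip the empty piece produced by the leading dot
            current = current[part]
    return current


def fill_placeholders(value, secret):
    """Replace ${key} references with values from the secret data."""
    # string.Template only accepts keys made of letters, digits, and underscores.
    return Template(value).substitute(secret)


# Hypothetical Vault responses, for illustration only.
login_response = {"auth": {"client_token": "example-token"}}
secret_response = {"data": {"bucket": "my-bucket"}}

token = resolve_simple_jsonpath(login_response, "$.auth.client_token")
secret = resolve_simple_jsonpath(secret_response, "$.data")
print(token, fill_placeholders("gs://${bucket}", secret))
```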

Step 2: The configuration form will expand, requesting credential details before establishing the connection.

| REF. | FIELDS | ACTIONS |
|---|---|---|
| 1. | URI (Required) | Enter the Uniform Resource Identifier (URI) of the Google Cloud Storage. |
| 2. | Service Account Key (Required) | Upload a JSON file that contains the credentials required for accessing the Google Cloud Storage. |
| 3. | Root Path (Required) | Specify the root path where the data is stored. |
| 4. | Teams (Required) | Select one or more teams from the dropdown to associate with this source datastore. |
| 5. | Initiate Sync (Optional) | Tick the checkbox to automatically perform a sync operation on the configured source datastore to detect new, changed, or removed containers and fields. |
Step 3: After adding the source datastore details, click on the Test Connection button to check and verify its connection.

If the credentials and provided details are verified, a success message will be displayed indicating that the connection has been verified.
Option II: Use an Existing Connection
If the toggle for Add New connection is turned off, then this will prompt you to configure the source datastore using the existing connection details.
Step 1: Select a connection to reuse existing credentials.

Note
If you are using existing credentials, you can only edit the details such as Root Path, Teams, and Initiate Sync.
Step 2: Click on the Test Connection button to check and verify the source data connection. If connection details are verified, a success message will be displayed.

Note
Clicking on the Finish button will create the source datastore and bypass the enrichment datastore configuration step.
Tip
It is recommended to click on the Next button, which will take you to the enrichment datastore configuration page.
Add Enrichment Datastore
Once you have successfully tested and verified your source datastore connection, you have the option to add the enrichment datastore (recommended). This datastore is used to store the analyzed results, including any anomalies and additional metadata files. This setup provides full visibility into your data quality, helping you manage and improve it effectively.
Step 1: Whether you have added a source datastore by creating a new datastore connection or using an existing connection, click on the Next button to start adding the Enrichment Datastore.

Step 2: A modal window - Link Enrichment Datastore will appear, providing you with the options to configure an enrichment datastore.

| REF. | FIELDS | ACTIONS |
|---|---|---|
| 1. | Prefix | Add a prefix name to uniquely identify tables/files when Qualytics writes metadata from the source datastore to your enrichment datastore. |
| 2. | Caret Down Button | Click the caret down to select either Use Enrichment Datastore or Add Enrichment Datastore. |
| 3. | Enrichment Datastore | Select an enrichment datastore from the dropdown list. |
Option I: Create an Enrichment Datastore with a new Connection
If the toggle for Add New connection is turned on, then this will prompt you to add and configure the enrichment datastore from scratch without using an existing enrichment datastore and its connection details.
Step 1: Click on the caret button and select Add Enrichment Datastore.

A modal window - Link Enrichment Datastore will appear. Enter the following details to create an enrichment datastore with a new connection.

| REF. | FIELDS | ACTIONS |
|---|---|---|
| 1. | Prefix | Add a prefix name to uniquely identify tables/files when Qualytics writes metadata from the source datastore to your enrichment datastore. |
| 2. | Name | Give a name for the enrichment datastore. |
| 3. | Toggle Button for Add New Connection | Toggle ON to create a new enrichment datastore from scratch or toggle OFF to reuse credentials from an existing connection. |
| 4. | Connector | Select a datastore connector from the dropdown list. |
Step 2: Add connection details for your selected enrichment datastore connector.

| REF. | FIELDS | ACTIONS |
|---|---|---|
| 1. | URI (Required) | Enter the Uniform Resource Identifier (URI) for the Google Cloud Storage. |
| 2. | Service Account Key (Required) | Upload a JSON file that contains the credentials required for accessing the Google Cloud Storage. |
| 3. | Root Path (Required) | Specify the root path where the data is stored. |
| 4. | Teams (Required) | Select one or more teams from the dropdown to associate with this enrichment datastore. |
Step 3: Click on the Test Connection button to verify the selected enrichment datastore connection. If the connection is verified, a flash message will indicate that the connection with the datastore has been successfully verified.

Step 4: Click on the Finish button to complete the configuration process.

When the configuration process is finished, a modal will display a success message indicating that your datastore has been successfully added.

Step 5: Close the Success dialog and the page will automatically redirect you to the Source Datastore Details page where you can perform data operations on your configured source datastore.

Option II: Use an Existing Connection
If the toggle for Use an existing enrichment datastore is turned on, you will be prompted to configure the enrichment datastore using existing connection details.
Step 1: Click on the caret button and select Use Enrichment Datastore.

Step 2: A modal window - Link Enrichment Datastore will appear. Add a prefix name and select an existing enrichment datastore from the dropdown list.

| REF. | FIELDS | ACTIONS |
|---|---|---|
| 1. | Prefix | Add a prefix name to uniquely identify tables/files when Qualytics writes metadata from the source datastore to your enrichment datastore. |
| 2. | Enrichment Datastore | Select an enrichment datastore from the dropdown list. |
Step 3: After selecting an existing enrichment datastore connection, you will view the following details related to the selected enrichment:
- Teams: The team associated with managing the enrichment datastore, marked as either public or private. For example, a datastore marked as Public is accessible to all users.
- URI: The Uniform Resource Identifier (URI) points to the specific location of the source data and should be formatted accordingly (e.g., `gs://bucket/file` for Google Cloud Storage).
- Root Path: The root path where the data is stored. This path defines the base directory or folder from which all data operations will be performed.

Step 4: Click on the Finish button to complete the configuration process for the existing enrichment datastore.

When the configuration process is finished, a modal will display a success message indicating that your datastore has been successfully added.

Close the success message and you will be automatically redirected to the Source Datastore Details page where you can perform data operations on your configured source datastore.

API Payload Examples
This section provides detailed examples of API payloads to guide you through the process of creating and managing datastores using Qualytics API. Each example includes endpoint details, sample payloads, and instructions on how to replace placeholder values with actual data relevant to your setup.
Creating a Source Datastore
This section provides sample CLI commands for creating the Google Cloud Storage datastore. Replace the placeholder values with actual data relevant to your setup.
Endpoint: /api/datastores (POST)
# Step 1: Create a Connection
qualytics connections create \
--type gcs \
--name "your_connection_name" \
--uri "gs://<bucket_name>" \
--secret-key "${GCS_SERVICE_ACCOUNT_KEY}"
# Step 2: Create a Source Datastore
qualytics datastores create \
--name "your_datastore_name" \
--connection-name "your_connection_name" \
--database . \
--schema /
Creating an Enrichment Datastore
This section provides sample CLI commands for creating an enrichment datastore. Replace the placeholder values with actual data relevant to your setup.
Endpoint: /api/datastores (POST)
# Step 1: Create a Connection
qualytics connections create \
--type gcs \
--name "your_connection_name" \
--uri "gs://<bucket_name>" \
--secret-key "${GCS_SERVICE_ACCOUNT_KEY}"
# Step 2: Create an Enrichment Datastore
qualytics datastores create \
--name "your_datastore_name" \
--connection-name "your_connection_name" \
--database . \
--schema /your_enrichment_path \
--enrichment-only
Link an Enrichment Datastore to a Source Datastore
Use the provided endpoint to link an enrichment datastore to a source datastore:
Endpoint Details: /api/datastores/{datastore-id}/enrichment/{enrichment-id} (PATCH)
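As a rough sketch, the request for this endpoint could be prepared with Python's standard library as shown below. Only the path and the PATCH method come from this page. The base URL, the bearer-token `Authorization` header, and the function name `build_link_request` are assumptions made for illustration; check your deployment's API reference for the actual host and authentication scheme.

```python
import urllib.request


def build_link_request(base_url, datastore_id, enrichment_id, token):
    """Build (but do not send) the PATCH request that links the two datastores."""
    url = (
        f"{base_url.rstrip('/')}/api/datastores/"
        f"{datastore_id}/enrichment/{enrichment_id}"
    )
    return urllib.request.Request(
        url,
        method="PATCH",
        # Assumption: the API accepts a bearer token in the Authorization header.
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical host, IDs, and token, for illustration only.
request = build_link_request("https://qualytics.example.invalid", 12, 34, "example-token")
print(request.get_method(), request.full_url)
```

The request object can be passed to `urllib.request.urlopen` once the host and credentials are replaced with real values.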