Bulk datastore onboarding

Goal

Onboard ten, twenty, or fifty datastores in a single command, defined as YAML files in version control. This is the go-to approach for new customer rollouts where the source list is known up front, and the natural pattern for teams already using config-as-code for their other infrastructure.

Permissions

| Step | Endpoint | Role | Team permission |
|---|---|---|---|
| Create / update connections | `POST /api/connections`, `PUT /api/connections/{id}` | Manager | N/A |
| Create / update datastores | `POST /api/datastores`, `PUT /api/datastores/{id}` | Manager (create), Member + Editor (update) | Editor for updates |
| Read existing connections / datastores | `GET /api/connections`, `GET /api/datastores` | Member | N/A |

Use a Manager token for the import

config import may create both connections and datastores. The Manager role is required for create operations on either resource.

Prerequisites

  • The CLI is installed and authenticated.
  • A folder of YAML files describing the connections and datastores to create. Layout:

```
qualytics-config/
├── connections/
│   ├── warehouse-prod-db.yaml
│   ├── analytics-prod-db.yaml
│   └── data-lake-prod.yaml
└── datastores/
    ├── warehouse-prod/
    │   └── _datastore.yaml
    ├── analytics-prod/
    │   └── _datastore.yaml
    └── data-lake-prod/
        └── _datastore.yaml
```

Example connection file:

```yaml
# connections/warehouse-prod-db.yaml
name: warehouse-prod-db
type: postgresql
host: ${WAREHOUSE_HOST}
port: 5432
username: ${WAREHOUSE_USER}
password: ${WAREHOUSE_PASSWORD}
```

Example datastore file:

```yaml
# datastores/warehouse-prod/_datastore.yaml
name: warehouse-prod
connection: warehouse-prod-db
database: analytics
schema: public
tags:
  - production
  - warehouse
```
  • Environment variables for every ${...} placeholder in the connection files. Either export them in the shell or place them in a .env file in the working directory.

CLI workflow

```mermaid
graph LR
    Files[Config folder] --> Dry[config import --dry-run]
    Dry --> Review[Review summary]
    Review --> Apply[config import]
    Apply --> Connections[(Connections created)]
    Apply --> Datastores[(Datastores created)]
```

1. Set the secrets

```shell
export WAREHOUSE_HOST=warehouse.example.com
export WAREHOUSE_USER=qualytics_reader
export WAREHOUSE_PASSWORD='S3cur3p@ss'
# ... and so on for every datastore
```

For larger setups, keep the values in a local .env (git-ignored) and let the CLI pick them up automatically.
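A minimal `.env` for the layout above might look like this (values illustrative; one block of variables per connection file):

```
# .env (git-ignored)
WAREHOUSE_HOST=warehouse.example.com
WAREHOUSE_USER=qualytics_reader
WAREHOUSE_PASSWORD=S3cur3p@ss
# ... variables for the analytics-prod-db and data-lake-prod connections
```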

2. Preview the import

```shell
qualytics config import --input ./qualytics-config --dry-run
```

Sample output:

```
[DRY RUN] Would create connection: warehouse-prod-db
[DRY RUN] Would create connection: analytics-prod-db
[DRY RUN] Would create connection: data-lake-prod
[DRY RUN] Would create datastore: warehouse-prod
[DRY RUN] Would create datastore: analytics-prod
[DRY RUN] Would create datastore: data-lake-prod
                  Import Summary
┏━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓
┃ Resource      ┃ Created ┃ Updated ┃ Failed ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩
│ Connections   │ 3       │ 0       │ 0      │
│ Datastores    │ 3       │ 0       │ 0      │
└───────────────┴─────────┴─────────┴────────┘
```

3. Apply

```shell
qualytics config import --input ./qualytics-config
```

4. Run first sync on each datastore

```shell
qualytics operations sync --datastore-id 42,43,44
```

operations sync accepts a comma-separated list of datastore IDs.
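If the IDs live in a shell array (for example, collected from the import output), the comma-separated argument can be built like this (bash; the ID values are illustrative):

```shell
IDS=(42 43 44)
# Join the array with commas by setting IFS inside the command substitution.
DATASTORE_IDS=$(IFS=,; echo "${IDS[*]}")
echo "$DATASTORE_IDS"   # 42,43,44
# qualytics operations sync --datastore-id "$DATASTORE_IDS"
```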

Behind the scenes

| CLI step | Method | Path | Notes |
|---|---|---|---|
| Walk the input folder | (local) | — | Loads every `.yaml` under `connections/` and `datastores/`. |
| Resolve `${ENV_VAR}` placeholders | (local) | — | Reads from the process environment, then `.env`; missing variables abort the import. |
| List existing connections | GET | `/api/connections` | Used to decide create vs. update by name. |
| List existing datastores | GET | `/api/datastores` | Same matching by name. |
| Create new connection | POST | `/api/connections` | One request per new connection. |
| Update existing connection | PUT | `/api/connections/{id}` | One request per changed connection. |
| Create new datastore | POST | `/api/datastores` | One request per new datastore; connection ID resolved by name. |
| Update existing datastore | PUT | `/api/datastores/{id}` | One request per changed datastore. |

Order: connections first, then datastores. The CLI does not parallelize requests; failures abort and the summary reports what was already done.

Python equivalent

The CLI handles dependency ordering, name-to-ID resolution, secret expansion, and dry-run preview. Replicating all of that is non-trivial, but here is a minimal version that creates connections and datastores from in-code Python lists:

```python
import os

import httpx

BASE_URL = os.environ["QUALYTICS_URL"].rstrip("/")
TOKEN = os.environ["QUALYTICS_TOKEN"]

connections = [
    {"name": "warehouse-prod-db", "type": "postgresql",
     "host": os.environ["WAREHOUSE_HOST"], "port": 5432,
     "username": os.environ["WAREHOUSE_USER"], "password": os.environ["WAREHOUSE_PASSWORD"]},
    # ... more connections
]

datastores = [
    {"name": "warehouse-prod", "connection_name": "warehouse-prod-db",
     "database": "analytics", "schema": "public",
     "tags": ["production", "warehouse"]},
    # ... more datastores
]

with httpx.Client(headers={"Authorization": f"Bearer {TOKEN}"}, timeout=60.0) as client:
    # Connections first: datastores reference them by name.
    name_to_id = {}
    for conn in connections:
        r = client.post(f"{BASE_URL}/api/connections", json=conn)
        r.raise_for_status()
        name_to_id[conn["name"]] = r.json()["id"]
        print(f"connection: {conn['name']} → id={name_to_id[conn['name']]}")

    for ds in datastores:
        # Pop connection_name before building the body so it doesn't leak
        # into the request payload alongside connection_id.
        connection_id = name_to_id[ds.pop("connection_name")]
        body = {**ds, "connection_id": connection_id}
        r = client.post(f"{BASE_URL}/api/datastores", json=body)
        r.raise_for_status()
        print(f"datastore: {ds['name']} → id={r.json()['id']}")
```

For real use, prefer the CLI: it covers idempotency, error handling, and dry-run for free.
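If a script does need idempotent behavior, the CLI's create-vs-update decision can be approximated by matching desired resources against existing ones by name. A minimal sketch (`plan_upserts` is a hypothetical helper, not part of the CLI; creates map to POST, updates to PUT):

```python
def plan_upserts(desired: list[dict], existing: list[dict]) -> tuple[list[dict], list[tuple[int, dict]]]:
    """Split desired resources into creates and (id, resource) updates by name.

    desired: dicts with at least a "name" key.
    existing: dicts with "name" and "id", e.g. from GET /api/connections.
    """
    by_name = {r["name"]: r["id"] for r in existing}
    creates = [d for d in desired if d["name"] not in by_name]
    updates = [(by_name[d["name"]], d) for d in desired if d["name"] in by_name]
    return creates, updates
```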

Variations and advanced usage

Restrict which resource types get imported

```shell
# Connections only (skip datastores even if files exist)
qualytics config import --input ./qualytics-config --include connections

# Datastores only (assumes connections already exist on the target)
qualytics config import --input ./qualytics-config --include datastores
```

Combine with checks and computed containers

config import also imports containers/, computed_fields/, and checks/ if those folders are present. See Export and import full configuration for the complete folder schema.

CI-driven onboarding

Drop the YAML into a Git repo and run config import from a pipeline triggered on merge to main. See GitHub Actions pipelines.

Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| `Variable WAREHOUSE_HOST not resolved` | The env var is missing in the calling shell | `export WAREHOUSE_HOST=...` or add it to a local `.env`. |
| `Connection 'X' not found` when importing a datastore | The YAML references a connection that wasn't in the connections folder | Add the connection YAML, or import connections first with `--include connections`. |
| Some datastores already exist, others don't | Mixed state on the target | Re-run; config import is upsert-safe. The summary shows created vs. updated. |
| `403 Forbidden` on connections | The token's user is a Member, not a Manager | Re-authenticate with a token that has the Manager role. |
| Imports succeed but nothing happens afterwards | The first sync was never run | `qualytics operations sync --datastore-id <id1>,<id2>,...` |