# Bulk datastore onboarding

## Goal
Onboard ten, twenty, or fifty datastores in a single command, defined as YAML files in version control. This is the go-to approach for new customer rollouts where the source list is known up front, and the natural pattern for teams already using config-as-code for their other infrastructure.
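In practice, that single command is a config import pointed at a folder of YAML, using the same flags shown later in this guide:

```bash
qualytics config import --input ./qualytics-config
```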
## Permissions

| Step | Endpoint | Role | Team permission |
|---|---|---|---|
| Create / update connections | POST /api/connections, PUT /api/connections/{id} | Manager | N/A |
| Create / update datastores | POST /api/datastores, PUT /api/datastores/{id} | Manager (create), Member + Editor (update) | Editor for updates |
| Read existing connections / datastores | GET /api/connections, GET /api/datastores | Member | N/A |
**Use a Manager token for the import.** config import may create both connections and datastores, and the Manager role is required for create operations on either resource.
## Prerequisites
- The CLI is installed and authenticated.
- A folder of YAML files describing the connections and datastores to create. Layout:
```
qualytics-config/
├── connections/
│   ├── warehouse-prod-db.yaml
│   ├── analytics-prod-db.yaml
│   └── data-lake-prod.yaml
└── datastores/
    ├── warehouse-prod/
    │   └── _datastore.yaml
    ├── analytics-prod/
    │   └── _datastore.yaml
    └── data-lake-prod/
        └── _datastore.yaml
```

```yaml
# connections/warehouse-prod-db.yaml
name: warehouse-prod-db
type: postgresql
host: ${WAREHOUSE_HOST}
port: 5432
username: ${WAREHOUSE_USER}
password: ${WAREHOUSE_PASSWORD}
```

```yaml
# datastores/warehouse-prod/_datastore.yaml
name: warehouse-prod
connection: warehouse-prod-db
database: analytics
schema: public
tags:
  - production
  - warehouse
```
- Environment variables for every ${...} placeholder in the connection files. Either export them in the shell or place them in a .env file in the working directory.
## CLI workflow

```mermaid
graph LR
    Files[Config folder] --> Dry[config import --dry-run]
    Dry --> Review[Review summary]
    Review --> Apply[config import]
    Apply --> Connections[(Connections created)]
    Apply --> Datastores[(Datastores created)]
```
### 1. Set the secrets

```bash
export WAREHOUSE_HOST=warehouse.example.com
export WAREHOUSE_USER=qualytics_reader
export WAREHOUSE_PASSWORD='S3cur3p@ss'
# ... and so on for every connection
```
For larger setups, keep the values in a local .env (git-ignored) and let the CLI pick them up automatically.
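A minimal .env for the example above might look like this (same variable names as the exports; the values are placeholders):

```bash
# .env — keep this file out of version control
WAREHOUSE_HOST=warehouse.example.com
WAREHOUSE_USER=qualytics_reader
WAREHOUSE_PASSWORD=S3cur3p@ss
```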
### 2. Preview the import
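Run the import with the dry-run flag first; nothing is created, and the CLI prints what it would do. The flags are the same ones used elsewhere in this guide:

```bash
qualytics config import --input ./qualytics-config --dry-run
```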
Sample output:
```
[DRY RUN] Would create connection: warehouse-prod-db
[DRY RUN] Would create connection: analytics-prod-db
[DRY RUN] Would create connection: data-lake-prod
[DRY RUN] Would create datastore: warehouse-prod
[DRY RUN] Would create datastore: analytics-prod
[DRY RUN] Would create datastore: data-lake-prod

Import Summary
┏━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓
┃ Resource      ┃ Created ┃ Updated ┃ Failed ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩
│ Connections   │ 3       │ 0       │ 0      │
│ Datastores    │ 3       │ 0       │ 0      │
└───────────────┴─────────┴─────────┴────────┘
```
### 3. Apply
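Re-run the same command without --dry-run to create the connections and datastores:

```bash
qualytics config import --input ./qualytics-config
```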
### 4. Run first sync on each datastore
operations sync accepts a comma-separated list of datastore IDs.
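Using the syntax shown in the troubleshooting table below, a first sync across the three example datastores looks like this (substitute the real datastore IDs):

```bash
qualytics operations sync --datastore-id <id1>,<id2>,<id3>
```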
## Behind the scenes

| CLI step | Method | Path | Notes |
|---|---|---|---|
| Walk the input folder | (local) | — | Loads every .yaml under connections/ and datastores/. |
| Resolve ${ENV_VAR} placeholders | (local) | — | Reads from process env then .env; missing vars abort the import. |
| List existing connections | GET | /api/connections | Used to decide create vs. update by name. |
| List existing datastores | GET | /api/datastores | Same matching by name. |
| Create new connection | POST | /api/connections | Per new connection. |
| Update existing connection | PUT | /api/connections/{id} | Per changed connection. |
| Create new datastore | POST | /api/datastores | Per new datastore. Connection ID resolved by name. |
| Update existing datastore | PUT | /api/datastores/{id} | Per changed datastore. |
Order: connections first, then datastores. The CLI does not parallelize requests; a failure aborts the run, and the summary reports what was already done.
## Python equivalent

The CLI handles dependency ordering, name-to-ID resolution, secret expansion, and dry-run preview. Replicating all of it is non-trivial, but here is a minimal version that creates connections and datastores from plain Python lists:
```python
import os

import httpx

BASE_URL = os.environ["QUALYTICS_URL"].rstrip("/")
TOKEN = os.environ["QUALYTICS_TOKEN"]

connections = [
    {"name": "warehouse-prod-db", "type": "postgresql",
     "host": os.environ["WAREHOUSE_HOST"], "port": 5432,
     "username": os.environ["WAREHOUSE_USER"], "password": os.environ["WAREHOUSE_PASSWORD"]},
    # ... more connections
]

datastores = [
    {"name": "warehouse-prod", "connection_name": "warehouse-prod-db",
     "database": "analytics", "schema": "public",
     "tags": ["production", "warehouse"]},
    # ... more datastores
]

with httpx.Client(headers={"Authorization": f"Bearer {TOKEN}"}, timeout=60.0) as client:
    # Connections first: datastores reference them by ID.
    name_to_id = {}
    for conn in connections:
        r = client.post(f"{BASE_URL}/api/connections", json=conn)
        r.raise_for_status()
        name_to_id[conn["name"]] = r.json()["id"]
        print(f"connection: {conn['name']} → id={name_to_id[conn['name']]}")

    for ds in datastores:
        # Swap the connection name for the ID returned above so only API fields are sent.
        connection_name = ds.pop("connection_name")
        body = {**ds, "connection_id": name_to_id[connection_name]}
        r = client.post(f"{BASE_URL}/api/datastores", json=body)
        r.raise_for_status()
        print(f"datastore: {ds['name']} → id={r.json()['id']}")
```
For real use, prefer the CLI: it covers idempotency, error handling, and dry-run for free.
## Variations and advanced usage

### Restrict which resource types get imported

```bash
# Connections only (skip datastores even if files exist)
qualytics config import --input ./qualytics-config --include connections

# Datastores only (assumes connections already exist on the target)
qualytics config import --input ./qualytics-config --include datastores
```
### Combine with checks and computed containers
config import also imports containers/, computed_fields/, and checks/ if those folders are present. See Export and import full configuration for the complete folder schema.
### CI-driven onboarding

Drop the YAML into a Git repo and run config import from a pipeline triggered on merge to main. See GitHub Actions pipelines.
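As a sketch, the pipeline step itself only needs to run the same commands used above; checkout and secret injection are handled by the CI system and are covered in the GitHub Actions pipelines guide:

```bash
# Preview on pull requests, apply on merge to main
qualytics config import --input ./qualytics-config --dry-run
qualytics config import --input ./qualytics-config
```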
## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| Variable WAREHOUSE_HOST not resolved | The env var is missing in the calling shell | export WAREHOUSE_HOST=... or add it to a local .env. |
| Connection 'X' not found when importing a datastore | The YAML references a connection that wasn't in the connections folder | Add the connection YAML, or import connections first with --include connections. |
| Some datastores already exist, others don't | Mixed state on the target | Re-run; config import is upsert-safe. The summary shows created vs. updated. |
| 403 Forbidden on connections | The token's user is Member, not Manager | Re-authenticate with a token that has the Manager role. |
| Imports succeed but nothing happens after | You forgot the first sync | qualytics operations sync --datastore-id <id1>,<id2>,... |
## Related
- Config as Code command reference
- Export and import full configuration
- Onboard a single datastore: the manual equivalent for a single source.
- GitHub Actions pipelines: automate the whole flow on merge.