Scheduled metadata exports
Goal
On a recurring schedule, push Qualytics metadata (anomalies, checks, field profiles) to your enrichment datastore. Common reasons: long-term audit retention, building a historical anomaly trend dataset, hydrating a downstream BI dashboard, or satisfying a compliance requirement to snapshot quality state daily.
Permissions
| Step | Endpoint | Role | Team permission |
|---|---|---|---|
| Schedule the export (writes a local cron entry / scheduled task) | (no API call at schedule time) | Member | N/A |
| Each scheduled run: trigger export operation | POST /api/operations/run (export) | Member | Editor on the datastore's team |
| Each scheduled run: poll status | GET /api/operations/{id} | Member | Reporter |
Schedules run as the local user
The CLI installs a system cron entry (Linux/macOS) or PowerShell scheduled task (Windows). Each run executes as the OS user the schedule belongs to and authenticates with the token saved in ~/.qualytics/config.yaml. Treat the host as you would any system that holds a long-lived credential.
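As one small precaution for a host that holds this token, the sketch below checks whether the config file is readable by anyone other than its owner. It assumes a POSIX host; the helper name `is_private` is hypothetical and is not part of the Qualytics CLI.

```python
import stat
from pathlib import Path


def is_private(path: Path) -> bool:
    """Return True when the file grants no permissions to group or others (POSIX)."""
    mode = path.stat().st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Path taken from the text above; the check itself is only an illustration.
    config = Path.home() / ".qualytics" / "config.yaml"
    if config.exists() and not is_private(config):
        print(f"{config} is accessible to other users; consider chmod 600")
```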
Prerequisites
- The datastore has an enrichment datastore linked. See Link Enrichment Datastore.
- The host running the schedule has the CLI installed and authenticated.
- A valid crontab expression (* * * * *) for when to run.
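To catch a malformed expression before scheduling, a rough pre-check like the one below can help. It is a minimal sketch that covers only numbers, `*`, ranges, comma lists, and `/step`; named values such as `MON` and any extensions specific to your cron implementation are not handled, and how the CLI itself validates the expression is not specified here. The function names are hypothetical.

```python
# (name, min, max) for each field of a standard 5-field crontab expression.
FIELD_RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),  # 0 and 7 both mean Sunday in most cron implementations
]


def _check_part(part: str, lo: int, hi: int) -> bool:
    """Validate one comma-separated part: '*', 'N', or 'A-B', optionally with '/step'."""
    base, slash, step = part.partition("/")
    if slash and not (step.isdecimal() and int(step) > 0):
        return False
    if base == "*":
        return True
    start, dash, end = base.partition("-")
    if not start.isdecimal() or (dash and not end.isdecimal()):
        return False
    first = int(start)
    last = int(end) if dash else first
    return lo <= first <= last <= hi


def validate_crontab(expr: str) -> list[str]:
    """Return a list of problems; an empty list means the expression looks valid."""
    fields = expr.split()
    if len(fields) != 5:
        return [f"expected 5 fields, got {len(fields)}"]
    problems = []
    for value, (name, lo, hi) in zip(fields, FIELD_RANGES):
        if not all(_check_part(p, lo, hi) for p in value.split(",")):
            problems.append(f"invalid {name} field: {value!r}")
    return problems
```

For example, `validate_crontab("0 6 * * *")` returns an empty list, while an expression with four fields or a minute of 60 returns a message naming the problem.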
CLI workflow
graph LR
Cmd[qualytics schedule export-metadata] --> OS{Platform}
OS -->|Linux/macOS| Cron[crontab entry]
OS -->|Windows| PS[PowerShell scheduled task]
Cron --> Run[Each tick: run export op]
PS --> Run
Run --> Enrich[(Enrichment datastore)]
Daily anomaly + check snapshot at 6 AM
qualytics schedule export-metadata \
--crontab "0 6 * * *" \
--datastore 42 \
--options "anomalies,checks"
Hourly anomaly export for a single container
qualytics schedule export-metadata \
--crontab "0 * * * *" \
--datastore 42 \
--containers "100" \
--options "anomalies"
Everything (anomalies, checks, field profiles)
Behind the scenes
At schedule time, the CLI writes a system schedule entry but doesn't call the API. At each tick, the schedule runs qualytics operations export ..., which then makes the API calls.
| Trigger | Method | Path | Notes |
|---|---|---|---|
| Each scheduled run | POST | /api/operations/run (export) | Body: type: export, datastore_ids, asset_type (anomalies/checks/profiles). |
| Polling | GET | /api/operations/{id} | Until success/failure/aborted. |
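The request body described in the table above could be built like this when more than one asset type is exported. This is a sketch under one assumption: that each request carries a single `asset_type`, so exporting several types means one POST per type. The helper name `build_export_payloads` is hypothetical.

```python
# Asset type values as listed in the table above.
ASSET_TYPES = ("anomalies", "checks", "profiles")


def build_export_payloads(datastore_id: int, asset_types=ASSET_TYPES) -> list[dict]:
    """Build one export request body per asset type (assumed: one type per request)."""
    payloads = []
    for asset_type in asset_types:
        if asset_type not in ASSET_TYPES:
            raise ValueError(f"unknown asset type: {asset_type!r}")
        payloads.append({
            "type": "export",
            "datastore_ids": [datastore_id],
            "asset_type": asset_type,
        })
    return payloads
```

Each returned dictionary can be sent as the JSON body of POST /api/operations/run.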
| Platform / artifact | Location |
|---|---|
| Linux / macOS | crontab for the current user; helper file at ~/.qualytics/crontab_commands.txt |
| Logs | ~/.qualytics/schedule_<option>.txt |
| Errors (Linux) | ~/.qualytics/crontab_errors.txt |
| Windows | A PowerShell script at ~/.qualytics/<name>.ps1. Register it manually in Task Scheduler. |
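When a scheduled run misbehaves, the files listed above are the first place to look. The sketch below prints the last few lines of the Linux error file; the `tail_lines` helper is hypothetical, and the format of the file's contents is not assumed.

```python
from pathlib import Path


def tail_lines(path: Path, count: int = 20) -> list[str]:
    """Return the last `count` non-empty lines of a text file, or [] if it is missing."""
    if count <= 0 or not path.is_file():
        return []
    lines = [ln for ln in path.read_text(errors="replace").splitlines() if ln.strip()]
    return lines[-count:]


if __name__ == "__main__":
    # Error file location taken from the table above.
    errors_file = Path.home() / ".qualytics" / "crontab_errors.txt"
    for line in tail_lines(errors_file):
        print(line)
```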
Python equivalent
The schedule itself is OS-level (cron or Task Scheduler). For the per-tick action, the Python equivalent of operations export:
import os
import time

import httpx

# Connection settings come from the environment.
BASE_URL = os.environ["QUALYTICS_URL"].rstrip("/")
TOKEN = os.environ["QUALYTICS_TOKEN"]
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

with httpx.Client(headers=HEADERS, timeout=60.0) as client:
    # Trigger the export operation.
    r = client.post(f"{BASE_URL}/api/operations/run", json={
        "type": "export",
        "datastore_ids": [42],
        "asset_type": "anomalies",  # or "checks" or "profiles"
    })
    r.raise_for_status()
    op_id = r.json()["id"]

    # Poll until the operation reaches a terminal result.
    while True:
        status = client.get(f"{BASE_URL}/api/operations/{op_id}")
        status.raise_for_status()
        op = status.json()
        if op["result"] in ("success", "failure", "aborted"):
            print(op["result"])
            break
        time.sleep(10)
You'd then schedule this script with cron / systemd / Task Scheduler yourself.
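For an unattended job, it is usually worth bounding the polling loop so that a stuck operation cannot keep the script running forever. The following is one possible sketch, not the CLI's behavior: `wait_for_result` is a hypothetical helper, and the one-hour default timeout is an arbitrary assumption. It takes the fetch step as a callable so it can be exercised without a network.

```python
import time
from typing import Callable

# Terminal results as named in the polling loop above.
TERMINAL_RESULTS = {"success", "failure", "aborted"}


def wait_for_result(
    fetch: Callable[[], dict],
    timeout_s: float = 3600.0,   # assumed default; tune for your exports
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch() until it reports a terminal result, or raise TimeoutError."""
    deadline = clock() + timeout_s
    while True:
        result = fetch().get("result")
        if result in TERMINAL_RESULTS:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"operation still running after {timeout_s} seconds")
        sleep(interval_s)
```

In the script above, `fetch` could be `lambda: client.get(f"{BASE_URL}/api/operations/{op_id}").json()`. Exiting with a non-zero status when the result is not `success` lets the scheduler record the run as failed.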
Variations and advanced usage
Listing what's scheduled
The CLI doesn't surface the list of schedules directly; inspect the OS instead: run crontab -l (Linux/macOS) or open Task Scheduler (Windows).
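On Linux/macOS, a short script can pull the relevant entries out of the current user's crontab. This sketch assumes that the installed entries contain the string `qualytics` in their command, which matches the per-tick command described earlier but is not guaranteed; both function names are hypothetical.

```python
import subprocess


def qualytics_cron_lines(crontab_text: str) -> list[str]:
    """Return active (non-comment) crontab lines that mention 'qualytics'."""
    matches = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "qualytics" in stripped:
            matches.append(stripped)
    return matches


def read_user_crontab() -> str:
    """Return the current user's crontab, or '' when none is installed or cron is absent."""
    try:
        proc = subprocess.run(["crontab", "-l"], capture_output=True, text=True)
    except FileNotFoundError:
        return ""
    return proc.stdout if proc.returncode == 0 else ""


if __name__ == "__main__":
    for entry in qualytics_cron_lines(read_user_crontab()):
        print(entry)
```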
Removing a schedule
Run crontab -e and delete the entry (Linux/macOS), or remove the task in Task Scheduler (Windows). The CLI doesn't currently provide a schedule remove command.
Targeting many containers
--containers accepts a comma-separated list:
qualytics schedule export-metadata \
--crontab "0 6 * * *" \
--datastore 42 \
--containers "100,101,102" \
--options "anomalies,checks"
Combining with operations on a CI runner
If you don't want a long-lived host running cron, schedule the same job from your CI provider (GitHub Actions, GitLab CI). Run qualytics operations export directly without the schedule wrapper. See GitHub Actions pipelines.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Invalid crontab expression | Wrong number of fields, or out-of-range values | Standard 5-field crontab: min hour dom mon dow. Validate with crontab.guru. |
| Schedule installed but never runs | The system cron daemon is not running, or the user has no crontab access | systemctl status cron (Linux). Confirm the crontab line with crontab -l. |
| "No enrichment datastore" error at run time | The source datastore isn't linked to an enrichment datastore | Link one in the web app, or via API; see Link Enrichment Datastore. |
| Token expires after schedule was set | The token in ~/.qualytics/config.yaml is now invalid | Use a service token (non-expiring), or rotate and re-init: qualytics auth init --url ... --token ... |
| Logs show success but no new records in enrichment | The schedule fired but produced an empty result | Confirm there are anomalies/checks/profiles to export in the time window with qualytics anomalies list ... |
Related
- Automation & CI/CD command reference
- Operations command reference: operations export is the per-tick action.
- GitHub Actions pipelines: an alternative if you don't want a host with cron.
- Link Enrichment Datastore