Targeted scans by container or tag
Goal
Run a scan only against the tables (or files) that matter, rather than the entire datastore. This is the right approach when a few tables changed and you want quick feedback, when you tag tables by criticality and only want to scan the high-tier ones, or when you're investigating a single table during incident response.
Permissions
| Step | Endpoint | Role | Team permission |
|---|---|---|---|
| Run scan | POST /api/operations/run | Member | Editor on the datastore's team |
| Look up containers (for --container-names) | GET /api/containers | Member | Reporter |
| Look up tags (for --container-tags) | GET /api/global-tags | Member | N/A |
Prerequisites
- The CLI is installed and authenticated.
- The target datastore exists and has been synced at least once.
- For --container-tags: the relevant tags have been created and applied to containers (see Tags).
CLI workflow
graph LR
Decide{Filter by} --> Names[--container-names]
Decide --> Tags[--container-tags]
Names --> Scan[operations scan]
Tags --> Scan
Scan --> Anom[Anomalies for filtered set only]
Filter by name
Filter by tag
Combine with other scan flags
The container filter applies on top of every other scan option:
qualytics operations scan \
  --datastore-id 42 \
  --container-tags "critical" \
  --incremental \
  --auto-resolve-passed-anomalies
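When the scan is launched from a script rather than typed by hand, the filter choice can be made in code. The following is a minimal sketch, not part of the CLI: `build_scan_command` is a hypothetical helper, it assumes multiple names or tags are passed as one comma-separated value (modeled on the `--tags "pii,production"` form used elsewhere on this page), and it takes one filter at a time as a simplifying assumption.

```python
import shlex


def build_scan_command(datastore_id, names=None, tags=None, extra_flags=()):
    """Return the argv list for a targeted scan (hypothetical helper).

    Exactly one of `names` or `tags` must be given. Joining values with
    commas is an assumption about how the CLI parses these flags.
    """
    if bool(names) == bool(tags):
        raise ValueError("pass exactly one of names or tags")
    argv = ["qualytics", "operations", "scan", "--datastore-id", str(datastore_id)]
    if names:
        argv += ["--container-names", ",".join(names)]
    else:
        argv += ["--container-tags", ",".join(tags)]
    # Other scan flags (for example --incremental) are appended unchanged.
    argv += list(extra_flags)
    return argv


cmd = build_scan_command(42, tags=["critical"], extra_flags=["--incremental"])
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; building a list rather than a string avoids shell-quoting problems with container names that contain spaces.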
Behind the scenes
| CLI step | Method | Path | Notes |
|---|---|---|---|
| (Optional) Resolve container names to IDs | GET | /api/containers | The CLI does this before sending the scan request. |
| Trigger scan | POST | /api/operations/run | Body includes type: scan, datastore_ids, and container_ids or container_tags. |
| Poll for completion | GET | /api/operations/{operation_id} | Standard polling loop. |
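The request body described in the table can be sketched as a small builder. This is an illustration under assumptions, not the CLI's actual code: `build_scan_payload` is a hypothetical name, `container_tags` is assumed to be a list of tag names, and the sketch sends either IDs or tags but never both, since the table lists them as alternatives.

```python
def build_scan_payload(datastore_id, container_ids=None, container_tags=None):
    """Build the JSON body for POST /api/operations/run (hypothetical helper)."""
    if container_ids and container_tags:
        raise ValueError("use container_ids or container_tags, not both")
    payload = {"type": "scan", "datastore_ids": [datastore_id]}
    if container_ids:
        payload["container_ids"] = list(container_ids)
    elif container_tags:
        # Assumption: tags are sent as a list of tag names.
        payload["container_tags"] = list(container_tags)
    else:
        # Refuse to build an unfiltered body by accident.
        raise ValueError("a targeted scan needs container_ids or container_tags")
    return payload


print(build_scan_payload(42, container_tags=["critical"]))
```

Failing fast on an empty filter keeps a script that resolved zero containers from sending a request whose scope is unclear.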
Python equivalent
import os
import time

import httpx

BASE_URL = os.environ["QUALYTICS_URL"].rstrip("/")
TOKEN = os.environ["QUALYTICS_TOKEN"]
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
DATASTORE_ID = 42
CONTAINER_NAMES = ["orders", "order_items", "customers"]

with httpx.Client(headers=HEADERS, timeout=60.0) as client:
    # Resolve names to IDs
    r = client.get(
        f"{BASE_URL}/api/containers",
        params={"datastore_id": DATASTORE_ID, "name": ",".join(CONTAINER_NAMES)},
    )
    r.raise_for_status()
    container_ids = [c["id"] for c in r.json() if c["name"] in CONTAINER_NAMES]

    # Trigger targeted scan
    r = client.post(
        f"{BASE_URL}/api/operations/run",
        json={
            "type": "scan",
            "datastore_ids": [DATASTORE_ID],
            "container_ids": container_ids,
        },
    )
    r.raise_for_status()
    op_id = r.json()["id"]

    # Poll until the operation reports a terminal result
    while True:
        r = client.get(f"{BASE_URL}/api/operations/{op_id}")
        r.raise_for_status()  # surface HTTP errors instead of parsing an error body
        op = r.json()
        # "result" may be missing or null while the scan is still running
        if op.get("result") in ("success", "failure", "aborted"):
            print(f"scan: {op['result']}")
            break
        time.sleep(5)
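The polling loop above runs until the operation finishes, however long that takes. For unattended jobs, a bounded variant may be safer. The sketch below is one possible shape: `wait_for_operation` is a hypothetical helper, the 30-minute default timeout is an arbitrary assumption, and `fetch` stands for any callable that returns the operation as a dict (for example, a function wrapping the GET request shown above).

```python
import time

# Terminal values taken from the example above.
TERMINAL_RESULTS = {"success", "failure", "aborted"}


def wait_for_operation(fetch, interval=5.0, timeout=1800.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch()` until it reports a terminal result or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        op = fetch()
        result = op.get("result")  # may be absent or None while running
        if result in TERMINAL_RESULTS:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"operation still running after {timeout} seconds")
        sleep(interval)


# Demo with canned responses instead of real HTTP calls.
responses = iter([{"result": None}, {}, {"result": "success"}])
print(wait_for_operation(lambda: next(responses), sleep=lambda seconds: None))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; production code can leave both at their defaults.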
Variations and advanced usage
Tag-driven workflows
Tagging containers gives you a single primitive that drives scans, profile runs, exports, and anomaly filters:
# Apply a tag at datastore creation
qualytics datastores create --name ds --connection-name c --database d --schema s \
  --tags "pii,production"
# Scan only PII containers nightly
qualytics operations scan --datastore-id 42 --container-tags pii
# Triage only PII anomalies
qualytics anomalies list --datastore-id 42 --tag pii --status Active
Profile only changed tables
The same --container-names and --container-tags flags work on profile runs.
Export only tagged containers
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Container 'X' not found | The datastore was renamed or the table was dropped between sync and scan | Re-sync first: qualytics operations sync --datastore-id 42. |
| --container-tags matches nothing | The tag exists globally but isn't applied to any container in this datastore | Confirm with qualytics containers list --datastore-id 42 --tag <name>. |
| Scan completes but no anomalies appear | The filtered containers have no active checks | qualytics checks list --datastore-id 42 --containers "<id1>,<id2>". |
| Mix of names and IDs failing | The CLI accepts names or IDs, not both at once | Pick one. Use --container-names for portability. |
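When calling the API directly, the "not found" case in the first row can be caught before the scan is triggered. The sketch below is a hypothetical helper, not CLI code: it assumes each container record is a dict with `name` and `id` keys, as in the Python example on this page, and it reports every missing name at once rather than silently dropping it.

```python
def resolve_container_ids(containers, wanted_names):
    """Map container names to IDs, failing loudly on any name that is missing."""
    by_name = {c["name"]: c["id"] for c in containers}
    missing = [name for name in wanted_names if name not in by_name]
    if missing:
        raise LookupError("containers not found: " + ", ".join(missing))
    # Preserve the order in which the names were requested.
    return [by_name[name] for name in wanted_names]


# Demo with a canned listing instead of a real GET /api/containers response.
listing = [{"id": 7, "name": "orders"}, {"id": 9, "name": "customers"}]
print(resolve_container_ids(listing, ["customers", "orders"]))
```

A `LookupError` at this point is a cue to re-sync the datastore and retry, as the table suggests.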
Related
- Daily sync, profile, and scan: the full pipeline this is a subset of.
- Incremental scans for large tables: scan only the new rows, not the whole table.
- Tags command reference: create and manage tags.
- Operations command reference: every flag for scan.