Expected Values Check Examples
Three real-world scenarios that show how the Expected Values check is typically used in production: enforcing a status enum on a transactional table, enforcing a country code on a customer table with a filter, and validating an Array[String] field element-wise.
The Sample Data tables follow the platform's source-records view: one row per source record, with only the offending cell highlighted (similar to how the application marks an anomalous field).
The situation: Your orders table holds one row per order. The o_orderstatus column must be one of three values: O (open), F (filled), or P (pending). Any other value is invalid and should fire a Record Anomaly so the responsible team can fix it.
Check configuration
| Field | Value |
|---|---|
| Rule | Expected Values |
| Fields | o_orderstatus |
| List | O, F, P |
| Filter | (none) |
| Coverage | 100% |
| Custom Anomaly Description | Off |
| Status | Active |
| Owner | (check creator) |
| Anomaly Assignee | (orders-domain owner) |
| Tags | enum, orders |
| Additional Metadata | jira: DATA-1101 |
| Description | Order status must be O, F, or P |
Payload
{
"description": "Order status must be O, F, or P",
"rule": "expectedValues",
"fields": ["o_orderstatus"],
"container_id": 145,
"coverage": 1,
"filter": null,
"properties": {"list": ["O", "F", "P"]},
"tags": ["enum", "orders"],
"additional_metadata": {"jira": "DATA-1101"},
"anomaly_message_field": null,
"template_id": null,
"status": "Active",
"owner_id": 7,
"default_anomaly_assignee_id": 18
}
Sample Data
| o_orderkey | o_orderstatus | o_totalprice |
|---|---|---|
| 1 | F | 173665.47 |
| 2 | O | 46929.18 |
| 3 | P | 193846.25 |
| 4 | X | 32151.78 |
What gets flagged
Row 4's o_orderstatus = X is not in the list of expected values. The platform raises one Record Anomaly tied to the o_orderstatus field on that row. The other columns (o_orderkey, o_totalprice) are not part of the violation and are not highlighted.
Record Anomaly
The field o_orderstatus has value 'X', which is not in the list of expected values
Flowchart
graph TD
A["No filter, evaluate all rows"] --> B{"Is o_orderstatus NULL?"}
B -->|Yes| D["Row passes"]
B -->|No| C{"Is o_orderstatus<br/>in (O, F, P)?"}
C -->|Yes| D
C -->|No| E["Emit Record Anomaly on o_orderstatus"]
Equivalent SQL
The situation: Your customers table is global, but the compliance team only requires the country code vocabulary to be enforced on North-American customers (region = 'NA'). The accepted codes for North America are US, CA, and MX. Customers in other regions should be ignored by this check. They are validated by region-specific checks elsewhere.
Check configuration
| Field | Value |
|---|---|
| Rule | Expected Values |
| Fields | country_code |
| List | US, CA, MX |
| Filter | region = 'NA' |
| Coverage | 100% |
| Custom Anomaly Description | Off |
| Status | Active |
| Owner | (check creator) |
| Anomaly Assignee | (compliance lead) |
| Tags | compliance, geo |
| Additional Metadata | jira: DATA-1212 |
| Description | North-American country codes must be US, CA, or MX |
Payload
{
"description": "North-American country codes must be US, CA, or MX",
"rule": "expectedValues",
"fields": ["country_code"],
"container_id": 212,
"coverage": 1,
"filter": "region = 'NA'",
"properties": {"list": ["US", "CA", "MX"]},
"tags": ["compliance", "geo"],
"additional_metadata": {"jira": "DATA-1212"},
"anomaly_message_field": null,
"template_id": null,
"status": "Active",
"owner_id": 7,
"default_anomaly_assignee_id": 24
}
Sample Data (filtered to region = 'NA')
| customer_id | region | country_code |
|---|---|---|
| 9001 | NA | US |
| 9002 | NA | CA |
| 9003 | NA | BR |
| 9004 | NA | MX |
Note
The customer with region = 'EU' is not shown here because the filter region = 'NA' excludes it before evaluation. Cross-region values are intentionally outside the scope of this check.
What gets flagged
Row 3's country_code = BR is not in the North-American vocabulary. One Record Anomaly is emitted on the country_code field of that row. The region column is part of the filter (not the rule) and is not highlighted.
Record Anomaly
The field country_code has value 'BR', which is not in the list of expected values [filter: region = 'NA']
Flowchart
graph TD
A["Apply filter: region = 'NA'"] --> B{"Is country_code NULL?"}
B -->|Yes| D["Row passes"]
B -->|No| C{"Is country_code<br/>in (US, CA, MX)?"}
C -->|Yes| D
C -->|No| E["Emit Record Anomaly on country_code"]
Equivalent SQL
The situation: Your articles table has a tags column of type Array[String]. Every tag must come from the editorial vocabulary news, sports, tech, finance. A single out-of-vocabulary element in an array invalidates the row, even if the other elements are valid.
The platform auto-enables element-wise evaluation for Array[String] fields, so you do not need to set properties.is_element_context manually.
Check configuration
| Field | Value |
|---|---|
| Rule | Expected Values |
| Fields | tags |
| List | news, sports, tech, finance |
| Filter | (none) |
| Coverage | 100% |
| Custom Anomaly Description | Off |
| Status | Active |
| Owner | (check creator) |
| Anomaly Assignee | (content-platform owner) |
| Tags | vocabulary, editorial |
| Additional Metadata | jira: DATA-1330 |
| Description | Article tags must come from the editorial vocabulary |
Payload
{
"description": "Article tags must come from the editorial vocabulary",
"rule": "expectedValues",
"fields": ["tags"],
"container_id": 318,
"coverage": 1,
"filter": null,
"properties": {
"list": ["news", "sports", "tech", "finance"],
"is_element_context": true
},
"tags": ["vocabulary", "editorial"],
"additional_metadata": {"jira": "DATA-1330"},
"anomaly_message_field": null,
"template_id": null,
"status": "Active",
"owner_id": 7,
"default_anomaly_assignee_id": 31
}
Sample Data
| article_id | title | tags |
|---|---|---|
| a-001 | Markets close higher | ["finance", "news"] |
| a-002 | Game recap | ["sports"] |
| a-003 | New gadget review | ["tech", "gossip"] |
| a-004 | Daily roundup | [] |
What gets flagged
Row 3's tags array contains gossip, which is not in the editorial vocabulary. The whole tags cell is marked because the violation is at the field level (the array as a whole failed array_forall). Rows with an empty or NULL array pass.
Record Anomaly
The field tags has value '["tech", "gossip"]', which is not in the list of expected values
Flowchart
graph TD
A["No filter, evaluate all rows"] --> B{"Is tags NULL or empty?"}
B -->|Yes| D["Row passes"]
B -->|No| C{"Is every element of tags<br/>in (news, sports, tech, finance)?"}
C -->|Yes| D
C -->|No| E["Emit Record Anomaly on tags"]
Equivalent SQL