Skip to content

Expected Values Check Examples

Three real-world scenarios that show how the Expected Values check is typically used in production: enforcing a status enum on a transactional table, enforcing a country code on a customer table with a filter, and validating an Array[String] field element-wise.

The Sample Data tables follow the platform's source-records view: one row per source record, with only the offending cell highlighted (similar to how the application marks an anomalous field).

The situation: Your orders table holds one row per order. The o_orderstatus column must be one of three values: O (open), F (filled), or P (pending). Any other value is invalid and should fire a Record Anomaly so the responsible team can fix it.

Check configuration

Field Value
Rule Expected Values
Fields o_orderstatus
List O, F, P
Filter (none)
Coverage 100%
Custom Anomaly Description Off
Status Active
Owner (check creator)
Anomaly Assignee (orders-domain owner)
Tags enum, orders
Additional Metadata jira: DATA-1101
Description Order status must be O, F, or P

Payload

{
    "description": "Order status must be O, F, or P",
    "rule": "expectedValues",
    "fields": ["o_orderstatus"],
    "container_id": 145,
    "coverage": 1,
    "filter": null,
    "properties": {"list": ["O", "F", "P"]},
    "tags": ["enum", "orders"],
    "additional_metadata": {"jira": "DATA-1101"},
    "anomaly_message_field": null,
    "template_id": null,
    "status": "Active",
    "owner_id": 7,
    "default_anomaly_assignee_id": 18
}

Sample Data

o_orderkey o_orderstatus o_totalprice
1 F 173665.47
2 O 46929.18
3 P 193846.25
4 X 32151.78

What gets flagged

Row 4's o_orderstatus = X is not in the list of expected values. The platform raises one Record Anomaly tied to the o_orderstatus field on that row. The other columns (o_orderkey, o_totalprice) are not part of the violation and are not highlighted.

Record Anomaly

The field o_orderstatus has value 'X', which is not in the list of expected values

Flowchart

graph TD
    A["No filter, evaluate all rows"] --> B{"Is o_orderstatus NULL?"}
    B -->|Yes| D["Row passes"]
    B -->|No| C{"Is o_orderstatus<br/>in (O, F, P)?"}
    C -->|Yes| D
    C -->|No| E["Emit Record Anomaly on o_orderstatus"]

Equivalent SQL

-- Rows the Expected Values check would flag on the orders table.
SELECT o_orderkey, o_orderstatus, o_totalprice
FROM orders
WHERE o_orderstatus IS NOT NULL
  AND o_orderstatus NOT IN ('O', 'F', 'P');

The situation: Your customers table is global, but the compliance team only requires the country code vocabulary to be enforced on North-American customers (region = 'NA'). The accepted codes for North America are US, CA, and MX. Customers in other regions should be ignored by this check. They are validated by region-specific checks elsewhere.

Check configuration

Field Value
Rule Expected Values
Fields country_code
List US, CA, MX
Filter region = 'NA'
Coverage 100%
Custom Anomaly Description Off
Status Active
Owner (check creator)
Anomaly Assignee (compliance lead)
Tags compliance, geo
Additional Metadata jira: DATA-1212
Description North-American country codes must be US, CA, or MX

Payload

{
    "description": "North-American country codes must be US, CA, or MX",
    "rule": "expectedValues",
    "fields": ["country_code"],
    "container_id": 212,
    "coverage": 1,
    "filter": "region = 'NA'",
    "properties": {"list": ["US", "CA", "MX"]},
    "tags": ["compliance", "geo"],
    "additional_metadata": {"jira": "DATA-1212"},
    "anomaly_message_field": null,
    "template_id": null,
    "status": "Active",
    "owner_id": 7,
    "default_anomaly_assignee_id": 24
}

Sample Data (filtered to region = 'NA')

customer_id region country_code
9001 NA US
9002 NA CA
9003 NA BR
9004 NA MX

Note

The customer with region = 'EU' is not shown here because the filter region = 'NA' excludes it before evaluation. Cross-region values are intentionally outside the scope of this check.

What gets flagged

Row 3's country_code = BR is not in the North-American vocabulary. One Record Anomaly is emitted on the country_code field of that row. The region column is part of the filter (not the rule) and is not highlighted.

Record Anomaly

The field country_code has value 'BR', which is not in the list of expected values [filter: region = 'NA']

Flowchart

graph TD
    A["Apply filter: region = 'NA'"] --> B{"Is country_code NULL?"}
    B -->|Yes| D["Row passes"]
    B -->|No| C{"Is country_code<br/>in (US, CA, MX)?"}
    C -->|Yes| D
    C -->|No| E["Emit Record Anomaly on country_code"]

Equivalent SQL

-- Rows the Expected Values check would flag on the customers table.
SELECT customer_id, region, country_code
FROM customers
WHERE region = 'NA'
  AND country_code IS NOT NULL
  AND country_code NOT IN ('US', 'CA', 'MX');

The situation: Your articles table has a tags column of type Array[String]. Every tag must come from the editorial vocabulary news, sports, tech, finance. A single out-of-vocabulary element in an array invalidates the row, even if the other elements are valid.

The platform auto-enables element-wise evaluation for Array[String] fields, so you do not need to set properties.is_element_context manually.

Check configuration

Field Value
Rule Expected Values
Fields tags
List news, sports, tech, finance
Filter (none)
Coverage 100%
Custom Anomaly Description Off
Status Active
Owner (check creator)
Anomaly Assignee (content-platform owner)
Tags vocabulary, editorial
Additional Metadata jira: DATA-1330
Description Article tags must come from the editorial vocabulary

Payload

{
    "description": "Article tags must come from the editorial vocabulary",
    "rule": "expectedValues",
    "fields": ["tags"],
    "container_id": 318,
    "coverage": 1,
    "filter": null,
    "properties": {
        "list": ["news", "sports", "tech", "finance"],
        "is_element_context": true
    },
    "tags": ["vocabulary", "editorial"],
    "additional_metadata": {"jira": "DATA-1330"},
    "anomaly_message_field": null,
    "template_id": null,
    "status": "Active",
    "owner_id": 7,
    "default_anomaly_assignee_id": 31
}

Sample Data

article_id title tags
a-001 Markets close higher ["finance", "news"]
a-002 Game recap ["sports"]
a-003 New gadget review ["tech", "gossip"]
a-004 Daily roundup []

What gets flagged

Row 3's tags array contains gossip, which is not in the editorial vocabulary. The whole tags cell is marked because the violation is at the field level (the array as a whole failed array_forall). Rows with an empty or NULL array pass.

Record Anomaly

The field tags has value '["tech", "gossip"]', which is not in the list of expected values

Flowchart

graph TD
    A["No filter, evaluate all rows"] --> B{"Is tags NULL or empty?"}
    B -->|Yes| D["Row passes"]
    B -->|No| C{"Is every element of tags<br/>in (news, sports, tech, finance)?"}
    C -->|Yes| D
    C -->|No| E["Emit Record Anomaly on tags"]

Equivalent SQL

-- Rows the Expected Values check would flag on the articles table.
SELECT article_id, title, tags
FROM articles
WHERE tags IS NOT NULL
  AND size(tags) > 0
  AND NOT array_forall(tags, x -> x IN ('news', 'sports', 'tech', 'finance'));