Unique Check FAQ
Common questions about how the Unique check evaluates uniqueness, how it handles NULLs and filters, and how anomalies are reported.
Behavior
What happens when I select multiple fields?
The selected fields are treated as a composite key: the combination of their values must be unique across rows. Individual columns can repeat values; only the combined tuple has to be unique. For example, a Unique check on (order_id, line_number) allows order_id to repeat and line_number to repeat, but each (order_id, line_number) pair must appear only once.
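As an illustration only, the composite-key rule can be sketched in a few lines of Python. The helper name duplicate_rows and the sample rows are hypothetical and are not part of the platform; the sketch simply groups rows by the tuple of selected fields.

```python
from collections import Counter

def duplicate_rows(rows, fields):
    """Return every row whose combined values for `fields` occur more than once."""
    def key(row):
        return tuple(row[f] for f in fields)
    counts = Counter(key(row) for row in rows)
    return [row for row in rows if counts[key(row)] > 1]

# Hypothetical sample data, not taken from the platform.
order_lines = [
    {"order_id": 1, "line_number": 1},
    {"order_id": 1, "line_number": 2},  # order_id repeats: allowed
    {"order_id": 2, "line_number": 1},  # line_number repeats: allowed
    {"order_id": 2, "line_number": 1},  # the pair repeats: violation
]

composite_violations = duplicate_rows(order_lines, ["order_id", "line_number"])
single_field_violations = duplicate_rows(order_lines, ["order_id"])
```

With the composite key only the last two rows are returned; checking order_id alone would return all four rows, because every order_id value repeats.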
How are NULLs treated?
The check treats NULL as a real value when grouping rows. Multiple rows where the selected field is NULL (or where every field of the composite key is NULL) are counted as duplicates and flagged as a violation. This is different from a SQL UNIQUE constraint, where NULLs are treated as distinct from each other. If you need SQL-style NULL semantics, combine the Unique check with a WHERE field IS NOT NULL filter.
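The difference between the two NULL semantics can be seen in a small Python sketch, where None stands in for NULL. The sample values are hypothetical, and the second half only emulates the effect of an IS NOT NULL filter; it is not the platform's implementation.

```python
from collections import Counter

# Hypothetical column values; None stands in for NULL.
emails = ["a@example.com", None, "b@example.com", None]

# Check-style grouping: None is grouped like any other value,
# so the two NULL rows form a duplicate group.
counts = Counter(emails)
flagged = [i for i, value in enumerate(emails) if counts[value] > 1]

# SQL-style semantics, emulated by dropping NULL rows before grouping.
non_null = [(i, value) for i, value in enumerate(emails) if value is not None]
counts_non_null = Counter(value for _, value in non_null)
flagged_sql_style = [i for i, value in non_null if counts_non_null[value] > 1]
```

Here flagged contains the positions of both NULL rows, while flagged_sql_style is empty because the NULL rows never reach the grouping step.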
Does the filter clause run before or after the uniqueness check?
Before. The platform applies the filter first and then evaluates uniqueness only on the rows that pass the filter. This lets you scope a Unique check to a subset of records (for example, only status = 'active' rows) without flagging duplicates that exist outside the scope.
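The filter-then-check order can be sketched as follows. The table, the field names, and the helper duplicated_values are hypothetical; the point is only that rows outside the filter never take part in the grouping.

```python
from collections import Counter

# Hypothetical sample data.
accounts = [
    {"email": "a@example.com", "status": "active"},
    {"email": "a@example.com", "status": "closed"},
    {"email": "b@example.com", "status": "closed"},
    {"email": "b@example.com", "status": "closed"},
    {"email": "c@example.com", "status": "active"},
]

def duplicated_values(rows, field):
    """Return the values of `field` that occur on more than one row."""
    counts = Counter(row[field] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)

# Step 1: apply the filter (status = 'active').
in_scope = [row for row in accounts if row["status"] == "active"]

# Step 2: evaluate uniqueness only on the rows that passed the filter.
scoped = duplicated_values(in_scope, "email")
unfiltered = duplicated_values(accounts, "email")
```

Without the filter both a@example.com and b@example.com would be reported; with it, no duplicates remain in scope.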
Anomaly Reporting
Why does the violation count include both copies of a duplicate, not just the extra one?
The platform reports every row that participates in a duplicate group as part of the violation. If (Customer_A, 123 Main St) appears on rows 1 and 3, both rows are flagged, not just row 3. This makes the resulting anomaly count match the rows you would see by running a SELECT ... WHERE (a, b) IN (duplicates) query against the source data.
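A short Python sketch of the counting rule, using the example above. The third row (Customer_B) is made up for contrast, and the variable names are illustrative only.

```python
from collections import defaultdict

rows = [
    ("Customer_A", "123 Main St"),  # row 1
    ("Customer_B", "9 Oak Ave"),    # row 2 (hypothetical filler row)
    ("Customer_A", "123 Main St"),  # row 3
]

# Group row numbers by their key tuple.
groups = defaultdict(list)
for row_number, key in enumerate(rows, start=1):
    groups[key].append(row_number)

# Every member of a duplicate group is flagged.
flagged_rows = sorted(
    n for members in groups.values() if len(members) > 1 for n in members
)

# For contrast: counting only the "extra" copies would miss row 1.
extra_copies_only = sorted(n for members in groups.values() for n in members[1:])
```

The violation count corresponds to flagged_rows (rows 1 and 3), not to extra_copies_only (row 3 alone).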
What does the Shape Anomaly message look like?
Single-field: For the field 'order_id', X.XXX% of N records (K) contain duplicate values
Composite key (multi-field): For the field 'order_id, line_number', X.XXX% of N records (K) contain duplicate values
The template wording uses "field" (singular) even when multiple fields are checked; the joined field list inside the quotes is how you tell single-field from composite-key checks at a glance.
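To make the placeholders concrete, here is a minimal Python sketch that fills in the template. The function name shape_message is hypothetical, and the three-decimal percentage and unformatted integer counts are assumptions inferred from the X.XXX, N, and K placeholders; the platform's exact rounding and number formatting may differ.

```python
def shape_message(fields, total_records, duplicate_records):
    """Fill in the Shape Anomaly template (formatting details are assumed)."""
    field_list = ", ".join(fields)
    percent = 100 * duplicate_records / total_records
    return (
        f"For the field '{field_list}', {percent:.3f}% of "
        f"{total_records} records ({duplicate_records}) contain duplicate values"
    )

single = shape_message(["order_id"], 200, 4)
composite = shape_message(["order_id", "line_number"], 1000, 5)
```

The composite call produces "For the field 'order_id, line_number', 0.500% of 1000 records (5) contain duplicate values".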
Why doesn't Unique produce a Record Anomaly?
Unique is a shape-only rule type: the violation is a property of the dataset as a whole (a duplicate group), not of an individual record's value. Other rule types (such as Min Value or Matches Pattern) can produce Record Anomalies because each row's value can be evaluated independently against the rule.
Configuration
Does Unique have any rule-specific properties?
For scalar fields, no: the rule is fully defined by the fields it applies to, the coverage, and the optional filter clause, and properties is null. For Array fields, there is one optional property: setting properties to {"is_element_context": true} makes the check evaluate uniqueness of elements within each array instead of treating the array as a single value. See the API page for the payload shape.
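The two Array modes can be contrasted with a rough Python sketch. Only the property name is_element_context comes from this page; the function flagged_rows, the sample data, and the way each mode is evaluated are assumptions made for illustration and may not match the platform's behavior in every edge case.

```python
from collections import Counter

# Hypothetical Array column.
tags = [
    ["red", "blue"],  # row 0
    ["red", "red"],   # row 1: repeated element inside the array
    ["red", "blue"],  # row 2: same array as row 0
]

def flagged_rows(arrays, properties=None):
    """Return row positions that violate uniqueness under the given properties."""
    element_context = bool(properties and properties.get("is_element_context"))
    if element_context:
        # Assumed element mode: each array must not repeat an element.
        return [i for i, arr in enumerate(arrays) if len(set(arr)) != len(arr)]
    # Assumed default mode: each array is compared as one value across rows.
    counts = Counter(tuple(arr) for arr in arrays)
    return [i for i, arr in enumerate(arrays) if counts[tuple(arr)] > 1]

whole_array = flagged_rows(tags)
per_element = flagged_rows(tags, {"is_element_context": True})
```

Under these assumptions the default mode flags rows 0 and 2 (identical arrays), while element mode flags only row 1.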
Can I lower the coverage on a Unique check?
Yes. Coverage at 100% (1.0) means every row's tuple must be unique; lowering it to, say, 99.5% (0.995) means the check tolerates up to 0.5% of records appearing in duplicate groups before flagging an anomaly. This is useful when a small, known number of duplicates is acceptable.
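The arithmetic can be sketched in Python as follows. The function passes_coverage is hypothetical, and treating exactly 0.5% as still tolerated is an assumption based on the "up to" wording above. Fractions are used so the boundary case is not affected by floating-point rounding.

```python
from fractions import Fraction

def passes_coverage(total_records, duplicate_records, coverage="1.0"):
    """True if the share of non-duplicate records meets the coverage threshold.

    `coverage` is passed as a decimal string so it converts to an exact fraction.
    """
    required = Fraction(coverage)
    unique_share = Fraction(total_records - duplicate_records, total_records)
    return unique_share >= required

at_limit = passes_coverage(1000, 5, "0.995")    # 0.5% duplicates: tolerated
over_limit = passes_coverage(1000, 6, "0.995")  # 0.6% duplicates: anomaly
strict = passes_coverage(1000, 2, "1.0")        # any duplicates fail at 100%
```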
Can I combine Unique with a Not Null check on the same column?
Yes, and it is a common pattern when you want true primary-key semantics. Add a Not Null check alongside the Unique check on the same field; together they enforce the equivalent of a SQL PRIMARY KEY (UNIQUE + NOT NULL).
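A small Python sketch shows why both checks are needed. The function names are hypothetical and None stands in for NULL: a column with a single NULL passes the uniqueness test on its own, so only the combination rejects it.

```python
from collections import Counter

def unique_ok(values):
    """Every value, including None, appears exactly once."""
    return all(n == 1 for n in Counter(values).values())

def not_null_ok(values):
    """No value is None."""
    return all(value is not None for value in values)

def primary_key_ok(values):
    """Equivalent of UNIQUE + NOT NULL."""
    return unique_ok(values) and not_null_ok(values)

ids = [1, 2, None]  # hypothetical column with one NULL

results = (unique_ok(ids), not_null_ok(ids), primary_key_ok(ids))
```

For this column the results are True, False, and False: the single NULL is unique, but it still disqualifies the column as a primary key.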
Does Custom Anomaly Description (the anomaly_message_field payload field) work for Unique?
No. The Custom Anomaly Description toggle (and the corresponding anomaly_message_field payload field) only affects Record Anomaly messages. Because Unique emits only Shape Anomalies, the field is silently ignored at evaluation time, and the resulting anomaly message uses the fixed Shape Anomaly template described above.
Related
- Introduction: formal definition, modes overview, field scope, and general/anomaly properties.
- How It Works: full semantics, NULL handling, filter behavior, and edge cases.
- API: payload example and field notes for creating a Unique check programmatically.
- Examples: three production scenarios with sample data and resulting anomalies.