Test queries are written to return a set of failing records, ones not matching the expectation or assertion declared by that test: duplicate records, null values, etc.
Most often, this is the count of rows returned by the test query: the default value of
count(*). But it can also be a custom calculation, whether an aggregate calculation or simply the name of a column to be selected from the test query.
Most tests do not use the
fail_calc config, preferring to return a count of failing rows. For the tests that do, the most common place to set the
fail_calc config is right within a generic test block, alongside its query definition. All the same,
fail_calc can be set in all the same places as other configs.
For instance, you can configure a
unique test to return
sum(n_records) instead of
count(*) as the failure calculation: that is, the number of rows in the model containing a duplicated column value, rather than the number of distinct column values that are duplicated.
- Specific test
- One-off test
- Generic test block
- Project level
Configure a specific instance of a generic (schema) test:
version: 2models:- name: my_modelcolumns:- name: my_columnstests:- unique:fail_calc: "sum(n_records)"