Tests

Related reference docs

Getting started

Tests are assertions you make about your models and other resources in your dbt project (e.g. sources, seeds and snapshots). When you run dbt test, dbt will tell you if each test in your project passes or fails.

Changelog

Like almost everything in dbt, tests are SQL queries. In particular, they are select statements that seek to grab "failing" records, ones that disprove your assertion. If you assert that a column is unique in a model, the test query selects for duplicates; if you assert that a column is never null, the test seeks after nulls. If the test returns zero failing rows, it passes, and your assertion has been validated.

There are two ways of defining tests in dbt:

  • A bespoke test (sometimes called "data test") is testing in its simplest form: If you can write a SQL query that returns failing rows, you can save that query in a .sql file within your test directory. It's now a test, and it will be executed by the dbt test command.
  • A generic test (sometimes called "schema test") is a parametrized query that accepts arguments. The test query is defined in a special test block (like a macro). Once defined, you can reference the generic test by name throughout your .yml files—define it on models, columns, sources, snapshots, and seeds. dbt ships with four generic tests built in, and we think you should use them!

Defining tests is a great way to confirm that your code is working correctly, and helps prevent regressions when your code changes. Because you can use them over and over again, making similar assertions with minor variations, generic tests tend to be much more common—they should make up the bulk of your dbt testing suite. That said, both ways of defining tests have their time and place.

Creating your first tests

If you're new to dbt, we recommend that you check out our Getting Started Tutorial to build your first dbt project with models and tests.

Bespoke tests

The simplest way to define a test is by writing the exact SQL that will return failing records. We call these bespoke, one-off, or "data" tests.

These tests are defined in .sql files, typically in your tests directory (as defined by your test-paths config). You can use Jinja (including ref and source) in the test definition, just like you can when creating models. Each .sql file contains one select statement, and it defines one test:

tests/assert_total_payment_amount_is_positive.sql
-- Refunds have a negative amount, so the total amount should always be >= 0.
-- Therefore return records where this isn't true to make the test fail
select
order_id,
sum(amount) as total_amount
from {{ ref('fct_payments' )}}
group by 1
having not(total_amount >= 0)

The name of this test is the name of the file: assert_total_payment_amount_is_positive. Simple enough.

Bespoke tests are easy to write—so easy that you may find yourself writing the same basic structure over and over, only changing the name of a column or model. In that case, we recommend...

Generic tests

Certain tests are generic: they can be reused over and over again. A generic test is defined in a test block, which contains a parametrized query and accepts arguments. It might look like:

{% test not_null(model, column_name) %}
select *
from {{ model }}
where {{ column_name }} is null
{% endtest %}

You'll notice that there are two arguments, model and column_name, which are then templated into the query. This is what makes the test "generic": it can be defined on as many columns as you like, across as many models as you like, and dbt will pass the values of model and column_name accordingly. Once that generic test has been defined, it can be added as a property on any existing model (or source, seed, or snapshot). These properties are added in .yml files in the same directory as your resource.

info

If this is your first time working with adding properties to a resource, check out the docs on declaring properties.

Out of the box, dbt ships with four generic tests already defined: unique, not_null, accepted_values and relationships. Here's a full example using those tests on an orders model:

version: 2
models:
- name: orders
columns:
- name: order_id
tests:
- unique
- not_null
- name: status
tests:
- accepted_values:
values: ['placed', 'shipped', 'completed', 'returned']
- name: customer_id
tests:
- relationships:
to: ref('customers')
field: id

In plain English, these tests translate to:

  • unique: the order_id column in the orders model should be unique
  • not_null: the order_id column in the orders model should not contain null values
  • accepted_values: the status column in the orders should be one of 'placed', 'shipped', 'completed', or 'returned'
  • relationships: each customer_id in the orders model exists as an id in the customers table (also known as referential integrity)

Behind the scenes, dbt constructs a select query for each test, using the parametrized query from the generic test block. These queries return the rows where your assertion is not true; if the test returns zero rows, your assertion passes.

You can find more information about these tests, and additional configurations (including severity and tags) in the reference section.

You can also install generic tests from a package, or write your own, to use (and reuse) across your dbt project. Check out the guide for more information.

info

There are generic tests defined in some open source packages, such as dbt-utils and dbt-expectations — skip ahead to the docs on packages to learn more!

Example

To add a generic (or "schema") test to your project:

  1. Add a .yml file to your models directory, e.g. models/schema.yml, with the following content (you may need to adjust the name: values for an existing model)
models/schema.yml
version: 2
models:
- name: orders
columns:
- name: order_id
tests:
- unique
- not_null
  1. Run the dbt test command:
$ dbt test
Found 3 models, 2 tests, 0 snapshots, 0 analyses, 130 macros, 0 operations, 0 seed files, 0 sources
17:31:05 | Concurrency: 1 threads (target='learn')
17:31:05 |
17:31:05 | 1 of 2 START test not_null_order_order_id..................... [RUN]
17:31:06 | 1 of 2 PASS not_null_order_order_id........................... [PASS in 0.99s]
17:31:06 | 2 of 2 START test unique_order_order_id....................... [RUN]
17:31:07 | 2 of 2 PASS unique_order_order_id............................. [PASS in 0.79s]
17:31:07 |
17:31:07 | Finished running 2 tests in 7.17s.
Completed successfully
Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2
  1. Check out the SQL dbt is running by either:
    • dbt Cloud: checking the Details tab.
    • dbt CLI: checking the target/compiled directory

Unique test

select *
from (
select
order_id
from analytics.orders
where order_id is not null
group by order_id
having count(*) > 1
) validation_errors

Not null test

select *
from analytics.orders
where order_id is null

Storing test failures

Changelog

Normally, a test query will calculate failures as part of its execution. If you set the optional --store-failures flag or store_failures config, dbt will first save the results of a test query to a table in the database, and then query that table to calculate the number of failures.

This workflow allows you to query and examine failing records much more quickly in development:

Store test failures in the database for faster development-time debugging.

Store test failures in the database for faster development-time debugging.

Note that, if you elect to store test failures:

  • Test result tables are created in a schema suffixed or named dbt_test__audit, by default. It is possible to change this value by setting a schema config. (For more details on schema naming, see using custom schemas.)
  • A test's results will always replace previous failures for the same test.

FAQs

 How do I test one model at a time?
 One of my tests failed, how can I debug it?
 What tests should I add to my project?
 When should I run my tests?
 Can I store my data tests in a directory other than the `test` directory in my project?
 How do I run tests on sources only?
 Can I set test failure thresholds?
 Can I test the uniqueness of two columns?