DBT — Generic Custom Tests for Partitioned Tables

Anks Sarraf
3 min readFeb 14, 2023

--

Pre-requisites:

This post assumes that one is aware of the basics of DBT and Google BigQuery.

Why this post:

I recently was working on creating a partitioned table. As a part of the unit testing, I wanted to add some generic tests on the Model.

Turns out, unit testing a partitioned table with generic custom tests throws a challenge that a normal table doesn’t.

I’d like to share the problem I faced and present the solution that helped.

How to reproduce the issue?

The Partitioned Table Definition:

{# my_partitioned_table.sql #}

{{config(
materialized = 'table',
partition_by={'field': 'date_snapshot', 'data_type': 'date'},
cluster_by='policy_id',
incremental_strategy = 'insert_overwrite',
require_partition_filter = true
)}}

{% set insurance_types=['flood', 'auto', 'earthquake', 'life', 'theft'] %}
{% set all_dates=['2023-01-0'] %}

{% for i in range(0, 10) %}
{% for one_date in all_dates %}
SELECT
CAST('{{ one_date }}{{ range(1, 6) | random }}' AS DATE) AS date_snapshot,
GENERATE_UUID() AS policy_id,
'{{ insurance_types | random }}' AS insurance_type
{% endfor %}
{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}

To DBT-run this table:

$ dbt run --full-refresh my_partitioned_table

Creates a Table in BigQuery:

Add tests in the .yml file

Assume we want to check if only one of the values theft , auto , flood , earthquake , life is a legitimate value for the field insurance_type

# partitioned_table.yml
version: 2

models:
- name: my_partitioned_table
description: >
Dummy Table to demonstrate how Generic tests can be used in DBT for Partitioned tables
columns:
- name: insurance_type
description: Type of Insurance
tests:
- accepted_values:
values:
- 'flood'
- 'auto'
- 'earthquake'
- 'life'
- 'theft'

Test the table using Generic DBT Tests

$ dbt test --models my_partitioned_table    
05:11:17 Running with dbt=1.2.0
:
:
:
05:11:48 1 of 1 START test accepted_values_my_partitioned_table_insurance_type__flood__auto__earthquake__life__theft [RUN]
05:11:50 1 of 1 ERROR accepted_values_my_partitioned_table_insurance_type__flood__auto__earthquake__life__theft [ERROR in 1.74s]
:
:
:
05:11:53 Completed with 1 error and 0 warnings:
05:11:53
05:11:53 Database Error in test accepted_values_my_partitioned_table_insurance_type__flood__auto__earthquake__life__theft (dbt/models/partitioned_tables/partitioned_table.yml)
05:11:53 Cannot query over table '<some_path>.my_partitioned_table' without a filter over column(s) 'date_snapshot' that can be used for partition elimination
:
:
:
05:11:53 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Isolating Error Message:

Cannot query over table '<some_path>.my_partitioned_table' without a filter over column(s) 'date_snapshot' that can be used for partition elimination

Reason for Error:

All the generic tests inherently convert into SQLs and then they are executed to test.

In this case, the Test SQL didn’t haveWHERE Clause

Solution?

Include the where clause in the .ymlconfig

# Updated partitioned_table.yml
version: 2

models:
- name: my_partitioned_table
description: >
Dummy Table to demonstrate how Generic tests can be used in DBT for Partitioned tables
columns:
- name: insurance_type
description: Type of Insurance
tests:
- accepted_values:
values:
- 'flood'
- 'auto'
- 'earthquake'
- 'life'
- 'theft'
# Added the below line
where: date_snapshot is not null

Run Tests Again:

$ dbt test --models my_partitioned_table
05:12:24 Running with dbt=1.2.0
:
:
:
05:12:56 1 of 1 START test accepted_values_my_partitioned_table_insurance_type__flood__auto__earthquake__life__theft [RUN]
05:12:57 1 of 1 PASS accepted_values_my_partitioned_table_insurance_type__flood__auto__earthquake__life__theft [PASS in 1.26s]
05:12:57
:
:
:
05:12:59 Finished running 1 test, 3 hooks in 0 hours 0 minutes and 8.47 seconds (8.47s).
05:13:00
05:13:00 Completed successfully
05:13:00
05:13:00 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Happy Testing!!!

References:

https://docs.getdbt.com/reference/resource-properties/tests#out-of-the-box-tests

--

--

Anks Sarraf
Anks Sarraf

No responses yet