
The SQL Unit Testing Landscape: 2023 | by Chad Isenberg | May, 2023



To illustrate some of these ideas, here’s a scenario.

Let’s say I’m working on a particularly complicated query that takes 30 minutes and $10 to execute against the production dataset. During the course of development, I have to execute the query 5 times as I make small adjustments, each time spending another $1 and 2 minutes while my audit query completes (noting that this step could be much more expensive if I manually audit results). As an average American data engineer making $120K / year, my time is approximately $57 / hour (ignoring taxes, benefits, etc.). My development cost $171 in labor and $55 in compute, for a total of $226.

Implementing unit tests to cover the same query might take me an hour or two, since it’s really complicated. But now my first 4 runs of the query are essentially free; since we’ll be processing a handful of records, execution will take a handful of seconds, and I’m only on the hook for the final run for validation: $11 and 30 minutes of time ($28.50), for a total of $39.50. Combined with the unit test overhead, we’re looking at $153.50, a savings of ~33%.
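
Spelling the arithmetic out (and assuming the baseline's hands-on time rounds to roughly three hours once the runs, audits, and in-between adjustments are added up, which is how the $171 labor figure falls out):

$$\text{without tests: } 5 \times (\$10 + \$1) + 3\,\text{h} \times \$57/\text{h} = \$55 + \$171 = \$226$$

$$\text{with tests: } (\$10 + \$1) + 0.5\,\text{h} \times \$57/\text{h} + 2\,\text{h} \times \$57/\text{h} = \$39.50 + \$114 = \$153.50$$

That works out to roughly a one-third reduction.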

There are plenty of counterarguments. What if the query only cost pennies and took seconds to run in the first place? What if I only had to make a trivial change and completed the work in one pass? What if I were working on a scaled-back development environment that cut my costs by half or more?

Your specific use-cases may dictate how much value you can get out of unit testing, but I would argue that no matter your circumstances, you need to think about your testing strategy and the tradeoffs you’re making. Testing incurs costs but is crucial to developing high-quality products; this is no different in the data world than it is in software.

To answer the headline: there isn’t one. Hopefully I’ve sold you on the idea that unit testing can give us the coverage we need with a minimal amount of data. But of course, there’s a catch. In the SQL world, what is a “unit”? For a language like Python or Java, unit testing is frequently done at the method or class level; as the name implies, these are the most basic “units” of functionality that need to be tested.

In SQL, is a unit the entire query? Is it a single CTE? A group of CTEs that convey some meaning together? The latter adds complexity to unit testing solutions since SQL isn’t readily composable like general-purpose languages; you can’t just “call” a CTE like you would a method. Good data modeling can help reduce this burden by making your queries more modular, but good test design is still challenging, and there are scarce best practices available.
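
As a toy illustration (my own, not taken from any of the tools below), consider a model built from a couple of CTEs reading a hypothetical raw_orders table; each CTE is a plausible “unit,” but there’s no way to execute one in isolation without restructuring the query:

with orders_cleaned as (
    -- candidate unit #1: normalize the raw feed
    select order_id, customer_id, cast(order_date as date) as order_date
    from raw_orders
    where order_id is not null
),

customer_order_counts as (
    -- candidate unit #2: aggregate per customer
    select customer_id, count(*) as order_count
    from orders_cleaned
    group by customer_id
)

-- or is the whole statement the unit?
select customer_id, order_count
from customer_order_counts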

There are a number of pre-built options for SQL unit testing, but there’s no standard at this point. There’s an excellent dbt thread that details the benefits and potential approaches, but unfortunately, this isn’t something that has made the roadmap just yet. Let’s review some of the more popular options.

tSQLt is an open source project maintained by Redgate, a big name in the SQL Server development space. Like a lot of SQL Server solutions, tSQLt is implemented as a series of stored procedures added as database objects into your target database.

From the tutorial, here’s a test of a stored procedure, with mocks handled as variables:

EXEC tSQLt.NewTestClass 'testFinancialApp';
GO

CREATE PROCEDURE testFinancialApp.[test that ConvertCurrency converts using given conversion rate]
AS
BEGIN
    DECLARE @actual MONEY;
    DECLARE @rate DECIMAL(10,4); SET @rate = 1.2;
    DECLARE @amount MONEY; SET @amount = 2.00;

    SELECT @actual = FinancialApp.ConvertCurrency(@rate, @amount);

    DECLARE @expected MONEY; SET @expected = 2.4; --(rate * amount)
    EXEC tSQLt.AssertEquals @expected, @actual;
END;
GO

tSQLt also offers functionality for mocking tables and views, as well as asserting equalities, inequalities, and existence of various database objects. This tool is particularly well-suited to application database development since, well, that’s one of SQL Server’s primary use-cases. You certainly can apply this to analytics (and there are plenty of data warehouses implemented in SQL Server), but the focus is clearly on a different kind of development.
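
As a rough sketch of the table-mocking workflow (the table, view, and test names below are hypothetical, not from the tutorial), a test can fake a source table, insert a handful of rows, and compare the result set against an expectation:

CREATE PROCEDURE testFinancialApp.[test that ActiveCustomers excludes closed accounts]
AS
BEGIN
    -- Replace the real table with an empty, constraint-free copy for this test only
    EXEC tSQLt.FakeTable 'dbo.Accounts';

    INSERT INTO dbo.Accounts (AccountId, Status) VALUES (1, 'OPEN');
    INSERT INTO dbo.Accounts (AccountId, Status) VALUES (2, 'CLOSED');

    -- dbo.ActiveCustomers is assumed to be a view over dbo.Accounts
    SELECT AccountId INTO #actual FROM dbo.ActiveCustomers;

    CREATE TABLE #expected (AccountId INT);
    INSERT INTO #expected (AccountId) VALUES (1);

    EXEC tSQLt.AssertEqualsTable '#expected', '#actual';
END;
GO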

dbt-unit-testing is a dbt package maintained by Equal Experts, a global technology consultancy. The approach is to create custom tests that utilize the package’s macros, and then to run them. Here, mocks can be defined in query- or CSV-style, and some clever work with Jinja macros substitutes the mocks for the actual table references. From their docs:

{{ config(tags=['unit-test']) }}

{% call dbt_unit_testing.test('customers', 'should sum order values to calculate customer_lifetime_value') %}

  {% call dbt_unit_testing.mock_ref('stg_customers') %}
    select 1 as customer_id, '' as first_name, '' as last_name
  {% endcall %}

  {% call dbt_unit_testing.mock_ref('stg_orders') %}
    select 1001 as order_id, 1 as customer_id, null as order_date
    UNION ALL
    select 1002 as order_id, 1 as customer_id, null as order_date
  {% endcall %}

  {% call dbt_unit_testing.mock_ref('stg_payments') %}
    select 1001 as order_id, 10 as amount
    UNION ALL
    select 1002 as order_id, 10 as amount
  {% endcall %}

  {% call dbt_unit_testing.expect() %}
    select 1 as customer_id, 20 as customer_lifetime_value
  {% endcall %}
{% endcall %}

Once you’ve established your tests, it’s as simple as running dbt test, and you get formatted output (also from the docs):

MODEL: customers
TEST: should sum order values to calculate customer_lifetime_value
Rows mismatch:
| diff | count | customer_id | customer_lifetime_value |
| ---- | ----- | ----------- | ----------------------- |
| + | 1 | 1 | 20 |
| - | 1 | 1 | 30 |

In addition to manually creating mocks, this package supports functionality to infer columns by directly querying your warehouse. In my personal experience, I had trouble getting this working, but this functionality could be very useful for wide tables, as you’re otherwise on the hook for enumerating every single column.

dbt-datamocktool is another dbt package that takes a slightly different approach. Mocks and expectations are created as seed files, and then tests are defined as inputs to the package’s existing test in schema.yml:

models:
  - name: stg_customers
    tests:
      - dbt_datamocktool.unit_test:
          input_mapping:
            source('jaffle_shop', 'raw_customers'): ref('dmt__raw_customers_1')
          expected_output: ref('dmt__expected_stg_customers_1')
          depends_on:
            - ref('raw_customers')
    columns: ...

  - name: stg_orders
    tests:
      - dbt_datamocktool.unit_test:
          input_mapping:
            ref('raw_orders'): ref('dmt__raw_orders_1')
          expected_output: ref('dmt__expected_stg_orders_1')
    columns: ...

In addition to providing the mock data, you’ll map the models’ sources and refs to the appropriate mocks. This approach is more hands-on, but it’s also more lightweight in terms of what it demands from your project. Unlike dbt-unit-testing, you don’t have to override the ref() and source() macros with package-specific versions. The majority of the functionality is coming from dbt’s native components: tests and seeds.
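
For example, the mock referenced above as ref('dmt__raw_customers_1') and its expected output are just seed CSVs checked into your project; the contents below are illustrative (loosely mirroring jaffle_shop’s columns), not taken from the package docs:

seeds/dmt__raw_customers_1.csv (mock input):

id,first_name,last_name
1,Michael,P.
2,Shawn,M.

seeds/dmt__expected_stg_customers_1.csv (expected output):

customer_id,first_name,last_name
1,Michael,P.
2,Shawn,M.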

The other exciting feature is support for incremental models for adapters that support the MERGE operation. This is especially important in the dbt landscape since the incremental materialization fundamentally runs different queries depending on the “mode” of the dbt invocation; you need to be able to test both “versions” of the query for full coverage.
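
To see why, consider that a typical incremental model branches on dbt’s is_incremental() macro, so a full-refresh build and an incremental build compile to genuinely different SQL. A minimal sketch (the model and column names here are made up):

{{ config(materialized='incremental', unique_key='event_id') }}

select event_id, user_id, event_ts
from {{ ref('stg_events') }}

{% if is_incremental() %}
  -- this filter only exists in the "incremental" version of the compiled query
  where event_ts > (select max(event_ts) from {{ this }})
{% endif %}

A unit test that only exercises the full-refresh branch tells you nothing about the filtered branch, which is exactly why being able to test both versions matters.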

A newcomer to the SQL modeling / templating space, SQLMesh brings a ton of goodies, not the least of which is native unit test support. Tests are defined in YAML files and then executed on-demand or upon rebuilding of a SQLMesh plan.

Mocking is straightforward, with mocks and expectations defined inside the same file. A killer feature I haven’t seen elsewhere is the ability to test individual CTEs:

test_example_full_model:
  model: sqlmesh_example.example_full_model
  inputs:
    sqlmesh_example.example_incremental_model:
      rows:
        - id: 1
          item_id: 1
          ds: '2020-01-01'
        - id: 2
          item_id: 1
          ds: '2020-01-02'
        - id: 3
          item_id: 2
          ds: '2020-01-03'
  outputs:
    ctes:
      filtered_orders_cte:
        rows:
          - id: 1
            item_id: 1
          - id: 2
            item_id: 1
    query:
      rows:
        - item_id: 1
          num_orders: 2

As we saw with dbt-unit-testing, test output is visualized nicely:

$ sqlmesh test
F
======================================================================
FAIL: test_example_full_model (/Users/izeigerman/github/tmp/tests/test_suite.yaml:1)
----------------------------------------------------------------------
AssertionError: Data differs
- {'item_id': 1, 'num_orders': 3}
? ^

+ {'item_id': 1, 'num_orders': 2}
? ^

----------------------------------------------------------------------
Ran 1 test in 0.008s

FAILED (failures=1)

For me personally, this is the most compelling SQL unit testing option I’ve seen; however, it’s also complicated by the fact that it’s a SQL framework that isn’t dbt. This is an early-stage tool facing a de facto industry standard, and between feature gaps and adoption inertia (i.e., it has to be better by enough to get teams to switch), it’s uncertain what its future holds.

That being said, read the docs and get an idea of why, as impressive as the unit testing framework is, that’s not even SQLMesh’s best feature.

No. At the end of the day, the existing solutions are all pretty close in functionality and suffer from similar drawbacks. All of the tools do more or less the same thing: run your queries with mock inputs and compare the query output to an expectation. Some are more sophisticated than others in how they construct the test scenarios and generate output, but the differences are relatively small.

There are three hurdles to adoption that I can see:

  1. The industry hasn’t coalesced around a standard. dbt has been remarkably successful even though it has significant and serious gaps (column-level lineage, unit testing, scalability); adoption has driven adoption. No unit testing solution has hit critical mass.
  2. The culture isn’t there yet. SQL developers have been producing successful solutions for decades without access to unit testing. While the rise of the analytics engineer has brought more and more software engineering best practices to SQL development, it hasn’t brought all of them, and unit testing is one of the gaps.
  3. Data mocking is still an unsolved problem. In all of the documented examples, we’re looking at a handful of columns with simple relations between a handful of tables. Real queries can be orders of magnitude more complex. The developer feels the burden of many (sometimes dozens of) join conditions and the referential integrity needed to make the tests pass. To some extent, this can be seen as a feature: i.e., the developer has to reason about every join and every record. But practically speaking, it’s a slog, especially when real data that will just work is tantalizingly within reach.

I’m hopeful that the increasing number of conversations about the importance of testing, quality, and standards in the industry is pushing us toward more mature practices and better outcomes. When we can develop queries as safely and quickly as we can develop other software, we’re going to unlock a lot of value.


