First, we need to add two new dbt packages, dbt-expectations and dbt-utils, which let us make assertions on the schema of our sources and on their accepted values.
# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  - package: calogica/dbt_expectations
    version: 0.8.5
Testing the data sources
Let’s start by defining a contract test for our first source. We pull data from raw_height, a table that contains height information from the users of the gym app.
We agree with our data producers that we will receive the height measurement, the units for the measurements, and the user ID. We agree on the data types and that only ‘cm’ and ‘inches’ are supported as units. With all this, we can define our first contract in the dbt source YAML file.
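A minimal sketch of what such a source file might look like is shown here. The file path, the source name gym_app, the column names (user_id, height, units), and the column types are assumptions for illustration; type names in particular depend on the warehouse and should be adjusted to match it.

```yaml
# models/staging/gym_app/sources.yml (hypothetical path)
version: 2

sources:
  - name: gym_app                 # assumed source name
    tables:
      - name: raw_height
        columns:
          - name: user_id         # assumed column name
            tests:                # newer dbt versions also accept `data_tests`
              - not_null:
                  config:
                    tags: ["contract-test-source"]
              - dbt_expectations.expect_column_values_to_be_of_type:
                  column_type: integer    # assumed type
                  config:
                    tags: ["contract-test-source"]
          - name: height          # assumed column name
            tests:
              - dbt_expectations.expect_column_values_to_be_of_type:
                  column_type: numeric    # assumed type
                  config:
                    tags: ["contract-test-source"]
              - dbt_utils.accepted_range:
                  min_value: 0            # height must not be less than 0
                  config:
                    tags: ["contract-test-source"]
          - name: units           # assumed column name
            tests:
              - dbt_expectations.expect_column_values_to_be_of_type:
                  column_type: text       # assumed type
                  config:
                    tags: ["contract-test-source"]
              - accepted_values:
                  values: ["cm", "inches"]
                  config:
                    tags: ["contract-test-source"]
```

Tagging each test individually is verbose, but it makes the tag explicit on every assertion that belongs to the contract.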
The building blocks
Looking at the previous test, we can see several macros from dbt-expectations and dbt-utils in use, alongside dbt’s built-in tests:
dbt_expectations.expect_column_values_to_be_of_type: This assertion lets us define the expected column data type.
accepted_values: This assertion lets us define a list of the accepted values for a specific column.
dbt_utils.accepted_range: This assertion lets us define a numerical range for a given column. In the example, we expected the column’s value not to be less than 0.
not_null: Finally, built-in assertions like not_null let us define column constraints.
Using these building blocks, we added several tests to define the contract expectations described above. Notice also how we have tagged the tests as “contract-test-source”. This tag lets us run all contract tests in isolation, both locally and, as we’ll see later, in the CI/CD pipeline:
dbt test --select tag:contract-test-source
We have seen how quickly we can create contract tests for the sources of our dbt app, but what about the public interfaces of our data pipeline or data product?
As data producers, we want to make sure we are producing data according to the expectations of our data consumers, so we can fulfill the contract we have with them and make our data pipeline or data product trustworthy and reliable.
A simple way to ensure that we are meeting our obligations to our data consumers is to add contract testing for our public interfaces.
dbt recently released a new feature for SQL models, model contracts, that lets us define the contract for a dbt model. While building your model, dbt will verify that the model’s transformation produces a dataset matching its contract, or it will fail to build.
Let’s see it in action. Our mart, body_mass_indexes, produces a BMI metric from the weight and height measurement data we get from our sources. The contract with our provider establishes the following:
Data types for each column.
User IDs cannot be null.
User IDs are always greater than 0.
Let’s define the contract of the body_mass_indexes model using dbt model contracts:
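A minimal sketch of such a model specification follows. The file path, every column name other than user_id, and the data types are assumptions for illustration. With an enforced contract, dbt expects every column the model returns to be declared with a name and a data_type.

```yaml
# models/marts/body_mass_indexes.yml (hypothetical path)
version: 2

models:
  - name: body_mass_indexes
    config:
      contract:
        enforced: true            # dbt checks the contract whenever the model is built
    columns:
      - name: user_id
        data_type: int            # assumed type
        constraints:
          - type: not_null
          - type: check
            expression: "user_id > 0"
      - name: created_date        # assumed column name
        data_type: date
      - name: weight              # assumed column name
        data_type: numeric
      - name: height              # assumed column name
        data_type: numeric
      - name: bmi                 # assumed column name
        data_type: numeric
```

Whether a constraint such as check is actually enforced depends on the data platform, so it is worth confirming support for your adapter.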
The building blocks
Looking at the previous model specification file, we can see several pieces of metadata that let us define the contract.
contract.enforced: This configuration tells dbt that we want to enforce the contract every time the model is run.
data_type: This assertion lets us define the column type we expect to produce once the model runs.
constraints: Finally, the constraints block gives us the chance to define useful constraints, such as requiring that a column cannot be null, setting primary keys, and adding custom expressions. In the example above, we defined a constraint to tell dbt that the user_id must always be greater than 0. You can see all the available constraints in the dbt documentation.
A difference between the contract tests we defined for our sources and the ones defined for our marts or output ports is when the contracts are verified and enforced.
Model contracts are enforced when the model is being generated by dbt run, whereas contracts based on dbt tests are enforced when the dbt tests run.
If one of the model contracts isn’t satisfied, you will see an error when you execute ‘dbt run’, with specific details on the failure. You can see an example in the following dbt run console output.
1 of 4 START sql table model dbt_testing_example.stg_gym_app__height .......... [RUN]
2 of 4 START sql table model dbt_testing_example.stg_gym_app__weight .......... [RUN]
2 of 4 OK created sql table model dbt_testing_example.stg_gym_app__weight ..... [SELECT 4 in 0.88s]
1 of 4 OK created sql table model dbt_testing_example.stg_gym_app__height ..... [SELECT 4 in 0.92s]
3 of 4 START sql table model dbt_testing_example.int_weight_measurements_with_latest_height [RUN]
3 of 4 OK created sql table model dbt_testing_example.int_weight_measurements_with_latest_height [SELECT 4 in 0.96s]
4 of 4 START sql table model dbt_testing_example.body_mass_indexes ............ [RUN]
4 of 4 ERROR creating sql table model dbt_testing_example.body_mass_indexes ... [ERROR in 0.77s]

Finished running 4 table models in 0 hours 0 minutes and 6.28 seconds (6.28s).

Completed with 1 error and 0 warnings:

Database Error in model body_mass_indexes (models/marts/body_mass_indexes.sql)
new row for relation "body_mass_indexes__dbt_tmp" violates check constraint "body_mass_indexes__dbt_tmp_user_id_check1"
DETAIL: Failing row contains (1, 2009-07-01, 82.5, null, null).
compiled Code at target/run/dbt_testing_example/models/marts/body_mass_indexes.sql
So far, we have built a suite of powerful contract tests, but how and when do we run them?
We can run contract tests in two types of pipelines:
CI/CD pipelines
Data pipelines
For example, you can execute the source contract tests on a schedule in a CI/CD pipeline targeting the data sources available in lower environments like test or staging. You can set the pipeline to fail every time the contract isn’t met.
These failures provide valuable information about contract-breaking changes introduced by other teams before those changes reach production.
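As an illustration, a scheduled pipeline of this kind could be sketched as follows. The choice of GitHub Actions, the file path, the cron cadence, the dbt-postgres adapter, the staging target name, and the DBT_PASSWORD secret are all assumptions. The sketch also assumes a profiles.yml that defines the staging target is available to the job.

```yaml
# .github/workflows/source-contract-tests.yml (hypothetical file)
name: source-contract-tests

on:
  schedule:
    - cron: "0 6 * * *"           # once a day at 06:00 UTC (assumed cadence)
  workflow_dispatch: {}           # also allow manual runs

jobs:
  contract-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dbt         # adapter is an assumption
        run: pip install dbt-postgres
      - name: Install dbt packages
        run: dbt deps
      - name: Run source contract tests against staging
        # A failing test makes dbt exit with a non-zero code, which fails the job
        run: dbt test --select tag:contract-test-source --target staging
        env:
          DBT_PASSWORD: ${{ secrets.DBT_PASSWORD }}   # hypothetical secret
```

Because the job selects tests only by tag, new source contract tests are picked up automatically as long as they carry the contract-test-source tag.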