Updating dbt models
1. Updating dbt models
Welcome back! Our next topic is updating models in dbt, whether we created them ourselves or inherited them from colleagues.
2. Why update?
An advantage of working with dbt is that you can change a project without writing new queries or models from scratch each time. Let's review some reasons why you'd want to update models.

Iterative work, where the requirements for your project have changed or have not been fully implemented yet.

Fixing bugs in the queries that define the models. SQL is a programming language in its own right, so queries can contain bugs like any other code.

Migrating data. A change to the source or destination location can also require updates to your project.
3. Update workflow
Now that we've covered why you'd need to update your models, let's look at a typical workflow for updating a dbt project.

The first step is checking out the dbt project from your source control system, such as git. For example, you might run git clone dbt_project and then open the dbt_project folder. Git isn't required for dbt, but using it makes it easy to track changes to the project.

Once you have the current project source, find the relevant model file and update the query. This could mean editing the query directly, adding a subquery, or otherwise modifying the contents of the .sql file.

After updating the models, you'll need to apply the changes to the project. This is usually done by executing dbt run. Occasionally, larger changes need a full refresh of the model, which you can request by adding -f (short for --full-refresh) to the dbt run command. If you see an error after your update, or the results don't look as expected, try the full refresh option. Depending on your data and models, it can take longer than a regular run.

Finally, once the updates have been made and verified, check the changes into source control so the next update starts from the current state.
4. YAML files
In addition to updating the .sql files for dbt models directly, you can also make changes in YAML (.yml) files. These updates typically go in one of two kinds of file: the dbt_project.yml file or a model_properties.yml file.
5. dbt_project.yml
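As a rough sketch of what this file can look like, here is a minimal dbt_project.yml. The project name my_project, the profile name, and the choice of view as the default materialization are assumptions made for illustration, not values from this course's project.

```yaml
# dbt_project.yml: one per project, at the project root
name: 'my_project'        # hypothetical project name
version: '1.0.0'
config-version: 2
profile: 'my_project'     # hypothetical profile defined in profiles.yml

model-paths: ["models"]   # directory where the model .sql files live

models:
  my_project:             # must match the project name above
    +materialized: view   # assumed default: build every model as a view
```

In a setup like this, changing the +materialized value to table and running dbt run again should rebuild the models as tables instead of views.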
The dbt_project.yml file contains settings that relate to the project as a whole. These include the project name, version, and directory locations. The materialization settings for models can also live here, though settings in this file are applied globally. Materialization determines whether models are created as tables, views, or other object types in the data warehouse. Note that there is exactly one dbt_project.yml file per project.
6. model_properties.yml
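A minimal sketch of a model properties file might look like the following. The file name, the model name customer_orders, and the column shown are hypothetical and only illustrate the general shape.

```yaml
# models/model_properties.yml (hypothetical file name)
version: 2

models:
  - name: customer_orders   # hypothetical model, defined in models/customer_orders.sql
    description: "One row per customer order."
    columns:
      - name: order_id      # hypothetical column
        description: "Unique identifier for the order."
```

The name entry is what ties the properties to a model, so it needs to match the name of the model's .sql file.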
The model_properties.yml file holds settings for individual models. These include the description, documentation details, and much more; refer to the dbt documentation for the full list. One interesting note is that the file can be named anything, as long as it lives somewhere under the models/ directory and ends in a .yml extension. You can have as many of these .yml files as you need.
7. Let's practice!
We've learned about updating our project and models. Let's practice these details in the coming exercises.