Get startedGet started for free

Data manipulation language (DML) statements

1. Data manipulation language (DML) statements

While most data in BigQuery will be loaded by batch or streaming ingestion, BigQuery also provides us with tools to manipulate data to modify results or underlying tables.

2. Overview of data manipulation in BigQuery

Data manipulation language or DML allows us to modify your data in BigQuery using syntax similar to other SQL databases. For example, statements such as INSERT, UPDATE, and DELETE allow us to add new records to our table, modify values, and remove data from our tables. The MERGE statement allows us to combine the previous statements into one statement, and the CREATE TABLE statement allows us to create a new table with the results from a query.

3. Considerations and performance

When using DML statements, there are some steps we can take to run them efficiently. First, group our DML statements rather than running them on individual rows. We must also use a WHERE condition when running an update statement, which can filter the rows on which the operation will run. Finally, you may consider using partitions on those datasets or clustering data to improve performance.

4. INSERT

BigQuery's INSERT statement lets us add new data entries to our tables. This empowers us to add new records to our data as needed. We are adding three new customers to our "customers" dataset in this query.

5. UPDATE

The UPDATE statement lets us change data within a table based on a condition. Here, we are updating an email address in the first query. We can also use subqueries and joins to update data from other tables, like in this query, and we can update more than one column in an UPDATE statement.

6. DELETE

The DELETE statement removes rows based on specific conditions. It is permanent and cannot be reversed. In this query, we are removing the record for the customer with "customer-id" number 3. We can also use joins to delete data based on conditions in other tables.

7. MERGE

BigQuery's MERGE statement combines the power of INSERT, UPDATE, and DELETE in a single operation. Here is an example of a merge statement. First, we set a target table to make our changes to, in this case, "customers", and a source table, in this case, "new customers." We then establish a matching condition, "customer-id" which works like a join condition. Then, we create a matching condition, which updates the email if and only if there is a mismatch. If there is no match, then the MERGE statement will insert the new value into the target table. MERGE allows us to add more complex logic to the data manipulation statement.

8. CREATE TABLE

Finally, the CREATE TABLE AS statement allows us to create a new table with the results from a query. This query shows us how to create a new table named "active-customers" with the results of a query that returns customers who have been active within the last 30 days.

9. Let's practice!

Let's test some of your data manipulation skills in a few exercises.