Techno Blender
Digitally Yours.

How to use variables in BigQuery using SQL — Part 1 | by Romain Granger | Jul, 2022

0 77


A step towards flexibility and reusability using parameters and variables using a WITH clause

Black screen by David Schultz on Unsplash

What are variables and why can they be useful?

Variables are also referred to as parameters. They can be used in SQL for setting or declaring values.

Variables store one or more values you provide them and can then be used anywhere in your query.

Usually set at the beginning of your code, they can be useful when:

  • You don’t want to change the same value in several places because your query has many lines of code
  • You want to make it easy to change this value without affecting the code logic
  • You want to reduce the length and complexity of your queries

Variables can be used in other languages such as Python, through your command line, or using the BigQuery API. But in this article, we’ll focus on how to use it in a SQL query.

We have two ways of using variables in BigQuery:

  • Using a WITH clause
  • Using BigQuery procedural language

In this article called “Part 1”, we will only cover the WITH clause.

If you want to know more about procedural language (which is like a scripting language), you can refer to the BigQuery documentation.

As a related topic (but not entirely relevant in this article), you might also be interested in using store procedures, which can also leverage procedural language.

We start with a base data table, which includes a date, a country, a product name, and the revenue linked with each order.

Our base data table. (Image by Author)

Experiment with a single value (Manual)

In the first example, we want to get all products with revenues equal to or greater than 250.

In the “main query” section, we use a comma, after calling our table (base table and variable) in the FROM clause, to be able to use the value in our filtering clause without any join.

But, I hear you, and you are correct: we could have avoided using a WITH clause and just use the value in our WHEREclause.

But, as you can see, this is a convenient place to put this value at the top of our query. Let’s look at a little more complex example, with more than one variable and querying two tables instead of one.

Experiment with multiple values (Manual)

We have now multiple variables, that we write in an array (we directly UNNESTto get all values as individual rows). We also have two tables, base_table and base_table_2 that reuse the same filtering variable.

In this case, it would be faster to add, change or remove values within ourWITHclause which holds our variables.

Another tip, if you want to combine multiple values, can be to use a format conbining ARRAY and STRUCT types.

The clause that holds our variable, is now having two fields, related to the price and the name of a product. It can be used in the filter the same way as in our previous example.

Now that we’ve seen how to use manual values, let’s see how to make them dynamic, based on a value computed from another table, or based on data that can change when new data is ingested.

Experiment with a single value (Dynamic)

Starting from our base table, we want to find all rows with a revenue per product that is three times greater than the average product revenue.

Our WITH clause will return 144.2, which is dynamically computed using the average product revenue on the our base table multiplied by 3. Note that you can use any data set you want and that this value is likely to change when performing this query with new data.

Experiment with a list (Manual)

In the same spirit, you can also use a list of items. Let’s say we want to filter specific item names in our base table.

Manual list created with an ARRAY

You could also write the same query using UNION ALL in the WITH clause. But I find the arrays faster to write.

Manual list created with a UNION ALL

Experiment with a list (Dynamic)

We want all products that contain the word “Google”. We can get all these values using a LIKE syntax in our variable statement.

Experiment with dates (Manual)

For dates, a manual addition would require converting the data type to DATE types (or TIMESTAMP, or DATETIME to be used in the query).

Experiment with dates (Dynamic)

We want to use the dates from another data set (base_table_2 in our case) to get a dynamic range to filter in our main SQL statement.

A few concluding words

This article demonstrates the mechanics of using variables in a WITH clause using several examples. We will discuss the BigQuery procedural language (also called scripting syntax) in “Part 2”.

As we’ve seen, a WITH clause can be very useful, but it’s not always ideal: It adds more lines of code, you may need to modify data types (like in the manual date example), and BigQuery gives a simpler syntax for declaring and setting variables (with a DECLARE and SET clause).

I hope this would help, and let me know if you would like to see more examples!


A step towards flexibility and reusability using parameters and variables using a WITH clause

Black screen by David Schultz on Unsplash

What are variables and why can they be useful?

Variables are also referred to as parameters. They can be used in SQL for setting or declaring values.

Variables store one or more values you provide them and can then be used anywhere in your query.

Usually set at the beginning of your code, they can be useful when:

  • You don’t want to change the same value in several places because your query has many lines of code
  • You want to make it easy to change this value without affecting the code logic
  • You want to reduce the length and complexity of your queries

Variables can be used in other languages such as Python, through your command line, or using the BigQuery API. But in this article, we’ll focus on how to use it in a SQL query.

We have two ways of using variables in BigQuery:

  • Using a WITH clause
  • Using BigQuery procedural language

In this article called “Part 1”, we will only cover the WITH clause.

If you want to know more about procedural language (which is like a scripting language), you can refer to the BigQuery documentation.

As a related topic (but not entirely relevant in this article), you might also be interested in using store procedures, which can also leverage procedural language.

We start with a base data table, which includes a date, a country, a product name, and the revenue linked with each order.

Our base data table. (Image by Author)

Experiment with a single value (Manual)

In the first example, we want to get all products with revenues equal to or greater than 250.

In the “main query” section, we use a comma, after calling our table (base table and variable) in the FROM clause, to be able to use the value in our filtering clause without any join.

But, I hear you, and you are correct: we could have avoided using a WITH clause and just use the value in our WHEREclause.

But, as you can see, this is a convenient place to put this value at the top of our query. Let’s look at a little more complex example, with more than one variable and querying two tables instead of one.

Experiment with multiple values (Manual)

We have now multiple variables, that we write in an array (we directly UNNESTto get all values as individual rows). We also have two tables, base_table and base_table_2 that reuse the same filtering variable.

In this case, it would be faster to add, change or remove values within ourWITHclause which holds our variables.

Another tip, if you want to combine multiple values, can be to use a format conbining ARRAY and STRUCT types.

The clause that holds our variable, is now having two fields, related to the price and the name of a product. It can be used in the filter the same way as in our previous example.

Now that we’ve seen how to use manual values, let’s see how to make them dynamic, based on a value computed from another table, or based on data that can change when new data is ingested.

Experiment with a single value (Dynamic)

Starting from our base table, we want to find all rows with a revenue per product that is three times greater than the average product revenue.

Our WITH clause will return 144.2, which is dynamically computed using the average product revenue on the our base table multiplied by 3. Note that you can use any data set you want and that this value is likely to change when performing this query with new data.

Experiment with a list (Manual)

In the same spirit, you can also use a list of items. Let’s say we want to filter specific item names in our base table.

Manual list created with an ARRAY

You could also write the same query using UNION ALL in the WITH clause. But I find the arrays faster to write.

Manual list created with a UNION ALL

Experiment with a list (Dynamic)

We want all products that contain the word “Google”. We can get all these values using a LIKE syntax in our variable statement.

Experiment with dates (Manual)

For dates, a manual addition would require converting the data type to DATE types (or TIMESTAMP, or DATETIME to be used in the query).

Experiment with dates (Dynamic)

We want to use the dates from another data set (base_table_2 in our case) to get a dynamic range to filter in our main SQL statement.

A few concluding words

This article demonstrates the mechanics of using variables in a WITH clause using several examples. We will discuss the BigQuery procedural language (also called scripting syntax) in “Part 2”.

As we’ve seen, a WITH clause can be very useful, but it’s not always ideal: It adds more lines of code, you may need to modify data types (like in the manual date example), and BigQuery gives a simpler syntax for declaring and setting variables (with a DECLARE and SET clause).

I hope this would help, and let me know if you would like to see more examples!

FOLLOW US ON GOOGLE NEWS

Read original article here

Denial of responsibility! Techno Blender is an automatic aggregator of the all world’s media. In each content, the hyperlink to the primary source is specified. All trademarks belong to their rightful owners, all materials to their authors. If you are the owner of the content and do not want us to publish your materials, please contact us by email – [email protected]. The content will be deleted within 24 hours.
Leave a comment