4 BigQuery SQL Shortcuts That Can Simplify Your Queries | by Vicky Yu | Jun, 2022
Check if your database has them too
The most common advice when you start learning SQL is to practice the basic statements because they’ll work across multiple databases. The downside is you may never discover database-specific syntax that can simplify your queries because there’s never any need to read the documentation. This was the case for me until I started using Google BigQuery and read the SQL syntax documentation and discovered a few amazing shortcuts you should know about.
1. EXCEPT
Use EXCEPT in a SELECT * to select all fields from a table except the columns you don’t want. In the example below, I created a CTE named orders and selected all of the columns except for item_id and item_name.
Prior to this, I would’ve written the SELECT below by listing every column except for item_id and item_name. This doesn’t seem like a shortcut with 6 columns but imagine if you have a 20 column table and you just want to exclude 2 fields. Instead of typing 18 column names in a SELECT statement, you can simply use an EXCEPT to exclude the 2 columns you don’t want.
2. PIVOT
I lost track of how many times I queried data from a table to put into Excel to get subtotals by different time periods using a pivot table. Now I can do this easily with a PIVOT statement and bypass Excel.
To get sales pivoted by quarter using the sample data above you just need to specify the quarters in your PIVOT statement.
Alternatively, to compare year over year total sales by quarter you can drop the product column and specify the years in the PIVOT statement.
3. ROLLUP
In the past, I used an Excel pivot table or ran multiple queries to get the total and subtotals but now I can use ROLLUP with GROUP BY instead. Using the product sample shown above we can get total sales and subtotals by year using ROLLUP ( year, quarter ) after the GROUP BY. Total sales for 2020 and 2021 are 355 denoted by the null values in the year and quarter columns. 2020 sales are 199 and 2021 is 156 denoted by the null values in the quarter column.
4. QUALIFY
QUALIFY allows you to apply it like a WHERE condition on a column created in your SELECT statement because it’s evaluated after the GROUP BY, HAVING, and WINDOW statements.
You can use QUALIFY to get the product with the highest sales by quarter calculated with a rank window function by simply using QUALIFY = 1.
The traditional way without using QUALIFY is to have the SQL statement as a subquery and then apply a WHERE rank = 1 like I have below. Seems a lot simpler with QUALIFY right?
Final Thoughts
Knowing SQL basics is great but it doesn’t hurt to review the documentation to see if there’s database-specific SQL syntax that can simplify your queries because they may be available in other databases. For example, Snowflake also has PIVOT, ROLLUP, and QUALIFY. I’ve highlighted a few SQL shortcuts but I’m sure I’ve only scratched the surface. How many more can you find?
Note: All queries above were run on BigQuery sandbox that’s free to anyone with a Google account.
Check if your database has them too
The most common advice when you start learning SQL is to practice the basic statements because they’ll work across multiple databases. The downside is you may never discover database-specific syntax that can simplify your queries because there’s never any need to read the documentation. This was the case for me until I started using Google BigQuery and read the SQL syntax documentation and discovered a few amazing shortcuts you should know about.
1. EXCEPT
Use EXCEPT in a SELECT * to select all fields from a table except the columns you don’t want. In the example below, I created a CTE named orders and selected all of the columns except for item_id and item_name.
Prior to this, I would’ve written the SELECT below by listing every column except for item_id and item_name. This doesn’t seem like a shortcut with 6 columns but imagine if you have a 20 column table and you just want to exclude 2 fields. Instead of typing 18 column names in a SELECT statement, you can simply use an EXCEPT to exclude the 2 columns you don’t want.
2. PIVOT
I lost track of how many times I queried data from a table to put into Excel to get subtotals by different time periods using a pivot table. Now I can do this easily with a PIVOT statement and bypass Excel.
To get sales pivoted by quarter using the sample data above you just need to specify the quarters in your PIVOT statement.
Alternatively, to compare year over year total sales by quarter you can drop the product column and specify the years in the PIVOT statement.
3. ROLLUP
In the past, I used an Excel pivot table or ran multiple queries to get the total and subtotals but now I can use ROLLUP with GROUP BY instead. Using the product sample shown above we can get total sales and subtotals by year using ROLLUP ( year, quarter ) after the GROUP BY. Total sales for 2020 and 2021 are 355 denoted by the null values in the year and quarter columns. 2020 sales are 199 and 2021 is 156 denoted by the null values in the quarter column.
4. QUALIFY
QUALIFY allows you to apply it like a WHERE condition on a column created in your SELECT statement because it’s evaluated after the GROUP BY, HAVING, and WINDOW statements.
You can use QUALIFY to get the product with the highest sales by quarter calculated with a rank window function by simply using QUALIFY = 1.
The traditional way without using QUALIFY is to have the SQL statement as a subquery and then apply a WHERE rank = 1 like I have below. Seems a lot simpler with QUALIFY right?
Final Thoughts
Knowing SQL basics is great but it doesn’t hurt to review the documentation to see if there’s database-specific SQL syntax that can simplify your queries because they may be available in other databases. For example, Snowflake also has PIVOT, ROLLUP, and QUALIFY. I’ve highlighted a few SQL shortcuts but I’m sure I’ve only scratched the surface. How many more can you find?
Note: All queries above were run on BigQuery sandbox that’s free to anyone with a Google account.