Techno Blender
Digitally Yours.

How To Handle Date and Time Difference in PostgreSQL | by Marie Lefevre | Jun, 2022

0 56


3 tricks to make your life easier when there is no DATE_DIFF() function available

Photo by Jon Tyson on Unsplash

Prospect A signed up for your company’s service offer on January 20th, 2022 at 8 AM. She cancelled her subscription on February 3rd, 2022 at 4 PM. How long has she been a client of your company?

To answer this question, you will need to compute the difference between two points in time. Regardless of the most relevant time scale in this specific situation (should the difference be expressed in weeks? in days? in hours?), there must exist a function to help you find easily how much time has passed between two events — in this case, between the prospect’s signup and the cancellation of her subscription.

In my previous work experience I used to get an answer to this type of questions by using the Google BigQuery console that was directly connected to our database. In Google BigQuery queries are written in standard SQL. Conveniently enough, there is an easy-to-implement formula to calculate the difference between two points in time: DATE_DIFF().

However, in PostgreSQL there is no direct equivalent to DATE_DIFF(). Therefore we must find the right combination of existing functions to get what we really need as an output. On that topic, PostgreSQL’s documentation is extensive, but I must admit that I became somehow confused with all the formulas listed.

Extract from PostgreSQL documentation

What seemed to be easy turned out to be slightly more complex than I expected, so I thought I could share with you the 3 following tips to correctly handle date differences in PostgreSQL.

Let’s go into them in details by using the example mentioned in the introduction. In this example let’s assume that we define two input variables: start_date and end_date. What we want to get as an output is the difference between these two date times:

Example by author

The first tip I used to come across is from SQLines:

Extract from SQLines

They suggest to create a user-defined function or, if you only have the rights or the skills to write some declarative SQL (SELECT... ), the suggested solution is to use a succession of DATE_PART() functions. In our example, we would write the following to get the difference in days and hours:

This would lead to the following output:

Output of tip #1 — Code and screenshot by author

The calculation is correct, and the output is easily interpretable. However I see two disadvantages with this method.

First, the piece of code required is not compact. For a better code readability, one must ensure that the code is correctly indented so the reader understands which “DATEPART” corresponds to which part of the calculation. In our example we only went until the hour scale, but if we wanted to compute the difference in seconds, we would add two more DATE_PART functions.

Second, calling several times the function DATE_PART() leads to possible loss performance when executing the piece of code. When implementing various options for the same outputs, you must make sure to compare the performance between two queries when applied to the same dataset.

In PostgreSQL the function AGE() will give you a detailed answer to your question: how much time did pass between these 2 events? The output is in the form of a text line displaying the difference between the two points in time.

The output will look like this:

Output of tip #2 — Code and screenshot by author

Depending in your data inputs and the use case you are tackling, you may be satisfied with this method. In particular if you answer the end user’s need with the output given by the function AGE(), there may be no need to over-complicate the output.

If you are sure that all your values are included within a given date and time scale, keep it simple. Just stick to the AGE() function and combine it with EXTRACT(). This works correctly for the difference expressed in days below. But look closely at the output of the difference in hours:

Output of inappropriate use for tip #2 — Code and screenshot by author

As you could notice, the problem with the second query is that we try to return the difference between two days in hours. Here, the AGE() function is not appropriate: it returns the difference in hours…regardless of the day!

In this type of situation— and in any case if you have a doubt — you should rather use the combination of the EXTRACT() function and the EPOCH parameter. Let’s take our example one last time to demonstrate that:

The output will look like this:

Output of tip #3 — Code and screenshot by author

It works! With two advantages compared to the code written with option #1:

  1. The code is more compact
  2. The difference in days is expressed at a more detailed level (by default, with two decimals)

Keep it simple where applicable

Make it complex where relevant

The idea of this article is to gather useful tips to handle date and time differences in PostgreSQL. There is no need for you to apply all these functions at once — just choose the most relevant for your own use case.

Do you use other functions to easily handle dates and times in PostgreSQL? I would love to hear them and to share our best practices in the comments section!


3 tricks to make your life easier when there is no DATE_DIFF() function available

Photo by Jon Tyson on Unsplash

Prospect A signed up for your company’s service offer on January 20th, 2022 at 8 AM. She cancelled her subscription on February 3rd, 2022 at 4 PM. How long has she been a client of your company?

To answer this question, you will need to compute the difference between two points in time. Regardless of the most relevant time scale in this specific situation (should the difference be expressed in weeks? in days? in hours?), there must exist a function to help you find easily how much time has passed between two events — in this case, between the prospect’s signup and the cancellation of her subscription.

In my previous work experience I used to get an answer to this type of questions by using the Google BigQuery console that was directly connected to our database. In Google BigQuery queries are written in standard SQL. Conveniently enough, there is an easy-to-implement formula to calculate the difference between two points in time: DATE_DIFF().

However, in PostgreSQL there is no direct equivalent to DATE_DIFF(). Therefore we must find the right combination of existing functions to get what we really need as an output. On that topic, PostgreSQL’s documentation is extensive, but I must admit that I became somehow confused with all the formulas listed.

Extract from PostgreSQL documentation

What seemed to be easy turned out to be slightly more complex than I expected, so I thought I could share with you the 3 following tips to correctly handle date differences in PostgreSQL.

Let’s go into them in details by using the example mentioned in the introduction. In this example let’s assume that we define two input variables: start_date and end_date. What we want to get as an output is the difference between these two date times:

Example by author

The first tip I used to come across is from SQLines:

Extract from SQLines

They suggest to create a user-defined function or, if you only have the rights or the skills to write some declarative SQL (SELECT... ), the suggested solution is to use a succession of DATE_PART() functions. In our example, we would write the following to get the difference in days and hours:

This would lead to the following output:

Output of tip #1 — Code and screenshot by author

The calculation is correct, and the output is easily interpretable. However I see two disadvantages with this method.

First, the piece of code required is not compact. For a better code readability, one must ensure that the code is correctly indented so the reader understands which “DATEPART” corresponds to which part of the calculation. In our example we only went until the hour scale, but if we wanted to compute the difference in seconds, we would add two more DATE_PART functions.

Second, calling several times the function DATE_PART() leads to possible loss performance when executing the piece of code. When implementing various options for the same outputs, you must make sure to compare the performance between two queries when applied to the same dataset.

In PostgreSQL the function AGE() will give you a detailed answer to your question: how much time did pass between these 2 events? The output is in the form of a text line displaying the difference between the two points in time.

The output will look like this:

Output of tip #2 — Code and screenshot by author

Depending in your data inputs and the use case you are tackling, you may be satisfied with this method. In particular if you answer the end user’s need with the output given by the function AGE(), there may be no need to over-complicate the output.

If you are sure that all your values are included within a given date and time scale, keep it simple. Just stick to the AGE() function and combine it with EXTRACT(). This works correctly for the difference expressed in days below. But look closely at the output of the difference in hours:

Output of inappropriate use for tip #2 — Code and screenshot by author

As you could notice, the problem with the second query is that we try to return the difference between two days in hours. Here, the AGE() function is not appropriate: it returns the difference in hours…regardless of the day!

In this type of situation— and in any case if you have a doubt — you should rather use the combination of the EXTRACT() function and the EPOCH parameter. Let’s take our example one last time to demonstrate that:

The output will look like this:

Output of tip #3 — Code and screenshot by author

It works! With two advantages compared to the code written with option #1:

  1. The code is more compact
  2. The difference in days is expressed at a more detailed level (by default, with two decimals)

Keep it simple where applicable

Make it complex where relevant

The idea of this article is to gather useful tips to handle date and time differences in PostgreSQL. There is no need for you to apply all these functions at once — just choose the most relevant for your own use case.

Do you use other functions to easily handle dates and times in PostgreSQL? I would love to hear them and to share our best practices in the comments section!

FOLLOW US ON GOOGLE NEWS

Read original article here

Denial of responsibility! Techno Blender is an automatic aggregator of the all world’s media. In each content, the hyperlink to the primary source is specified. All trademarks belong to their rightful owners, all materials to their authors. If you are the owner of the content and do not want us to publish your materials, please contact us by email – [email protected]. The content will be deleted within 24 hours.
Leave a comment