
How to run inference with a PyTorch time series Transformer

Using a PyTorch transformer for time series forecasting at inference time where you do not know the target values

Photo by Crystal Kwok on Unsplash

In this post, I will show how to run inference with a PyTorch transformer for time series forecasting. Specifically, we will use the PyTorch time series transformer I described in my previous post How to make a Transformer for time series forecasting with PyTorch.

The post is structured as follows: First, I’ll briefly describe what inputs the PyTorch time series Transformer requires. Then, I’ll show you how to run inference with the model when you do not know the decoder input values. Finally, I’ll point out a few disadvantages of the approach.

How to make a PyTorch Transformer for time series forecasting

What inputs does the time series Transformer model require?

In case you haven’t read my post “How to make a PyTorch Transformer for time series forecasting”, let’s first briefly go over what inputs the time series Transformer requires. For a more elaborate walk-through, see the above-mentioned post. Please note that the terms trg and tgt are sometimes used interchangeably in this post and the other post.

The transformer model requires the following inputs:

src, which is used by the encoder. The shape of src must be [batch size, n, number of input features] or [n, batch size, number of input features] (depending on the value of the batch_first constructor argument), where n is the number of data points in your input sequence. If, for instance, you’re forecasting hourly electricity prices and you want to base your forecasts on the past week’s data, then n=168.

tgt is another input required by the transformer. It’s used by the decoder. tgt consists of the last value of the input sequence in src and all but the last value of the target sequence. In other words, it will have the shape [batch size, m, number of predicted variables] or [m, batch size, number of predicted variables], where m is the forecasting horizon. Continuing the example of electricity price forecasting, if you want to forecast electricity prices 48 hours ahead, then m=48.

In addition, the encoder and decoder require so-called masks. The reader is referred to the above-mentioned post for an introduction to masking.

Another important thing to note before we proceed is that the specific time series Transformer implementation we’re using in this blog post always outputs a matrix of shape [batch size, m, number of predicted variables] or [m, batch size, number of predicted variables], i.e. the length of the model’s output sequences is determined by the length of the input sequences given to the decoder in the tgt tensor.

So if tgt has shape [72, batch size, 1], the length of the sequences in tgt is 72, and thus the model will also output sequences of length 72.
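To make the splitting rule concrete, here is a minimal, framework-free sketch of how one window of a series is divided into src, tgt and tgt_y. The function name and the toy series are my own illustrations, not code from the repo; only the slicing logic follows the description above.

```python
def split_window(window, n, m):
    """Split a window of n + m observations into model inputs and targets.

    src   : the n observations the encoder sees.
    tgt   : the last value of src followed by all but the last target value.
    tgt_y : the m values the model should predict.
    """
    assert len(window) == n + m
    src = window[:n]
    tgt_y = window[n:]
    tgt = [src[-1]] + tgt_y[:-1]
    return src, tgt, tgt_y

series = [10, 11, 12, 13, 14, 15, 16, 17]  # toy series with n=5, m=3
src, tgt, tgt_y = split_window(series, n=5, m=3)
print(src)    # [10, 11, 12, 13, 14]
print(tgt)    # [14, 15, 16]
print(tgt_y)  # [15, 16, 17]
```

Note how tgt and tgt_y have the same length m and are shifted by one step relative to each other; this is what lets the decoder predict each target value from the value that precedes it.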

How to use a time series Transformer for inference

Okay, with the preliminaries in place, let’s now consider why a blog post about how to run inference with a Transformer for time series forecasting even exists:

During training, it is straightforward to produce the tgt because we know the values of the target sequence. However, during inference (for instance in a production environment), we of course do not know the values of the target sequence when making the forecasts; otherwise we wouldn’t need to make the forecasts in the first place. So we need to find a way to produce a reasonable tgt that can be used as input to the model during inference.

Now that we know what inputs the time series Transformer requires and why we need to somehow generate the tgt , let’s take a look at how to actually do it. In what follows, keep in mind that the overall purpose is to produce a tgt tensor which, once it is produced, can be used as input to the model to make a forecast.

To illustrate with a simple example, suppose that at inference time t, we want to forecast the next 3 values of a sequence based on its 5 most recent observations.

Here is what the src would look like:

src = [x_{t-4}, x_{t-3}, x_{t-2}, x_{t-1}, x_t]

where x denotes the series we’re dealing with, e.g. electricity prices.

The objective is to predict tgt_y which would be:

tgt_y = [x_{t+1}, x_{t+2}, x_{t+3}]

So our tgt, which the model needs as input in order to make its forecast for tgt_y, should be:

tgt = [x_t, x_{t+1}, x_{t+2}]

We know the value of x_t, but not x_{t+1} or x_{t+2}, so we need to estimate these somehow. In this post, we do this by first forecasting x_{t+1}, adding this forecast to tgt such that tgt = [x_t, x_{t+1}], then using this tgt to forecast x_{t+2}, adding that forecast to tgt such that tgt = [x_t, x_{t+1}, x_{t+2}], and finally using this full tgt to produce the final forecast.
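The iterative procedure just described can be sketched in a few lines. The actual implementation lives in the gist and repo linked below; here is a simplified, framework-free version in which `model` is a stand-in callable that maps (src, tgt) to one prediction per element of tgt, the role the PyTorch Transformer (plus masks) plays in the real code.

```python
def greedy_inference(model, src, m):
    """Forecast m steps ahead by iteratively growing tgt, as described above."""
    tgt = [src[-1]]                # start tgt with the last known observation
    for _ in range(m - 1):
        out = model(src, tgt)      # predict one step ahead of each tgt value
        tgt.append(out[-1])        # append the newest prediction to tgt
    return model(src, tgt)         # final call: tgt has length m, so we get m predictions

# Toy stand-in model: assumes the series keeps growing by the average
# step observed in src (a naive linear extrapolation, purely for illustration).
def toy_model(src, tgt):
    step = (src[-1] - src[0]) / (len(src) - 1)
    return [value + step for value in tgt]

src = [1.0, 2.0, 3.0, 4.0, 5.0]
print(greedy_inference(toy_model, src, m=3))  # [6.0, 7.0, 8.0]
```

Each loop iteration appends one estimated value to tgt, so after m-1 iterations tgt has the required length m and the final call produces the full forecast.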

The function below is the code you need to run inference with a time series Transformer model in PyTorch. It produces a forecast according to the approach described above. You pass in a Transformer model and src along with some other arguments described in the docstring. The function then iteratively generates the tgt and produces the final forecast from a tgt consisting of the last known observation at time t and estimated values for the remaining m-1 data points.

[Code: inference function, embedded as a gist in the original Medium post: https://medium.com/media/8292b6b5bafe7d8b9c7d1cffd5cbd730/href]

The function is designed to be used inside your validation or test loop. Instead of calling the model to produce the predictions, you call the inference function. Here is a simplified example of how to use it:

[Code: usage example, embedded as a gist in the original Medium post: https://medium.com/media/ee296a360c493043851c3332c37ec0a3/href]

Please note that you cannot use this script as is. It is merely an example to show the overall idea; it is not meant to be something you can copy-paste and expect to work. For instance, you need to instantiate the model and the data loaders before the script will run. In the GitHub repo for this blog post, see the file sandbox.py for examples of how to do this. If you have never trained, validated and tested a PyTorch neural network before, I suggest you look at some of PyTorch’s beginner-level tutorials.
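To show the shape of such a loop without any PyTorch dependencies, here is a hypothetical sketch of a test loop that calls an inference routine instead of the model directly and accumulates mean absolute error. Everything here is a stand-in: `naive_infer` simply repeats a one-step "prediction" m times, and the batches are toy (src, tgt_y) pairs, not the repo's actual data loaders or API.

```python
# Stand-in inference routine: repeat the model's one-step prediction m times.
def naive_infer(model, src, m):
    return [model(src)] * m

# Stand-in "model": a naive baseline that predicts the last observed value.
last_value_model = lambda src: src[-1]

# Toy (src, tgt_y) batches in place of a real DataLoader.
batches = [
    ([1, 2, 3, 4, 5], [6, 7, 8]),
    ([2, 2, 2, 2, 2], [2, 2, 2]),
]

abs_errors = []
for src, tgt_y in batches:
    preds = naive_infer(last_value_model, src, m=len(tgt_y))
    abs_errors += [abs(p - y) for p, y in zip(preds, tgt_y)]

mae = sum(abs_errors) / len(abs_errors)
print(mae)  # 1.0 for these toy batches
```

In a real validation or test loop, the stand-ins would be replaced by the trained Transformer, the inference function from the gist, and batches drawn from a DataLoader.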

Downsides of the shown approach for running inference with a time series Transformer

Given that the inference function relies on a loop to iteratively produce the tgt, it can be slow when m is large, because m determines the number of iterations in the loop. This is the main drawback of the approach described above. I have not been imaginative enough to come up with a more efficient approach, but I’d love to hear from you in the comments section if you have any ideas. You’re also welcome to contribute directly to the repo.

Given that the inference function calls the model m-1 times per batch, you may want to be wary of anything that increases the computation time of a single model call, such as a model with many parameters or a large n. In addition, the more batches you have, the more times the inference function will be called, and the longer the total training or test script will take to run.
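Putting these two observations together gives a rough cost model: the extra forward passes grow linearly in both the number of batches and the horizon m. A trivial sketch (my own back-of-the-envelope helper, not code from the repo):

```python
def extra_model_calls(num_batches, m):
    """Extra forward passes spent generating tgt: m - 1 per batch."""
    return num_batches * (m - 1)

# E.g. 100 batches with a 48-hour horizon means 4,700 extra forward
# passes per epoch of validation or testing.
print(extra_model_calls(num_batches=100, m=48))  # 4700
```

So doubling either the horizon or the number of batches roughly doubles the inference overhead, which is worth keeping in mind when sizing experiments.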

The code for running inference with the time series Transformer as well as the PyTorch Transformer implementation can be found in this repo:

GitHub – KasperGroesLudvigsen/influenza_transformer: PyTorch implementation of Transformer model used in "Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case"

That’s it! I hope you enjoyed this post 🤞

Please leave a comment letting me know what you think.

Follow for more posts related to time series forecasting, green software engineering and the environmental impact of data science🍀

And feel free to connect with me on LinkedIn.


How to run inference with a PyTorch time series Transformer was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

