
Model Validation Techniques for Time Series | by Michael Keith | Jun, 2022

Data splitting, cross validation, model optimization, and dynamic predictions to validate forecasting models

Photo by Oladimeji Ajegbile on Unsplash

How do you know if your time series model is any good? How can you be sure whether changes to your model will make it better or worse? In Part 1, we looked at how failing to validate a model correctly can mislead an audience about its accuracy. In that post, I seemingly predicted, among a few other incredible phenomena, COVID-19’s impact on the airline industry while training only on data from before 2016, which, to be clear, should not be possible. That post showed what not to do. This post shows which practices should be followed to soundly validate and optimize time series models.

We first need to do some preparation. We will work with the sunspots dataset, available on Kaggle with a Public Domain license. We employ the following libraries:
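The import list is short; a minimal sketch, assuming pandas to load the Kaggle CSV, matplotlib for the plots, and scalecast’s Forecaster object for everything else:

import pandas as pd                          # load the Kaggle CSV
import matplotlib.pyplot as plt              # display the plots scalecast draws
from scalecast.Forecaster import Forecaster  # the forecasting object used throughout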

If you have an older version of scalecast, cross validation will not be available. It is a good idea to upgrade the package:

pip install scalecast --upgrade

We will be using a 10% test split, leaving 2,939 observations to train and optimize models and 326 observations to test each model we apply. For inputs, we will use the series’ first 120 lags (constituting 10 years of data), a few seasonal lags, several irregular Fourier cycles, and a yearly trend. I talked about feature selection for this same dataset in a previous post. Also in that post, I found that the gradient boosted tree regressor from scikit-learn is a solid estimator of this data, so for brevity’s sake, we will limit the analysis to just that model class.
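The exact preparation code is in the linked notebook; a rough sketch is below. The column names, the number of seasonal lags, the cycle length, and the add_cycle() call for the Fourier terms are assumptions made for illustration, not the notebook’s actual code.

# sketch only -- column names, seasonal-lag count, and cycle length are assumptions
df = pd.read_csv("Sunspots.csv", parse_dates=["Date"])
f = Forecaster(
    y=df["Monthly Mean Total Sunspot Number"],
    current_dates=df["Date"],
)
f.set_test_length(326)             # 10% of the 3,265 observations held out for testing
f.generate_future_dates(120)       # forecast 120 periods (10 years) into the future
f.add_ar_terms(120)                # the series' first 120 lags
f.add_AR_terms((3, 12))            # a few seasonal lags (12, 24, 36) -- count assumed
f.add_cycle(132)                   # an irregular ~11-year Fourier cycle -- length assumed
f.add_seasonal_regressors("year")  # yearly trend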

Find the full notebook here.

To reiterate the lesson from Part 1: if you don’t use some kind of dynamic, multi-step forecasting procedure to test your model, you can seriously mislead people about its effectiveness. Without a dynamic evaluation technique, you are left with the following options.

Either:

  • No lags can be used as model inputs unless they are all further in the past, relative to the last known value, than the forecast horizon you are trying to evaluate. In simpler terms, if you are forecasting 10 periods, you can only use the series’ 11th and greater lags as model inputs. Other inputs can be used without any problems (such as is the case with the Facebook Prophet model). A generic sketch illustrating this constraint appears after these two options.

Or:

  • Model performance must be reported as an average of errors over a string of one-step forecasts, and nothing can be claimed about its performance more than one period out.
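Here is that first constraint in generic pandas terms, using a synthetic series rather than the article’s pipeline: with a 10-period horizon, only the 11th lag and beyond enter the feature matrix, so every feature value is already known at forecast time.

import pandas as pd

y = pd.Series(range(100), name="y")  # stand-in series, not the sunspots data
horizon = 10
# only lags deeper than the horizon are usable without dynamic evaluation
X = pd.concat(
    {f"lag_{k}": y.shift(k) for k in range(horizon + 1, horizon + 13)},
    axis=1,
).dropna()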

To drive home this point (and beat a dead horse), let’s call the same model twice using scalecast, once with a dynamic test-set evaluation procedure and once where we reveal the actual values of all autoregressive terms to the model over the entire test set:

f.set_estimator("gbt")
f.manual_forecast(
    call_me="gbt_default_non-dynamic",
    dynamic_testing=False,
)
f.manual_forecast(
    call_me="gbt_default_dynamic",
)  # default is dynamic testing
f.plot_test_set(
    models=["gbt_default_non-dynamic", "gbt_default_dynamic"],
    include_train=False,
)
plt.show()
Image by author

These are the same models trained on the same inputs, but one looks like it performed significantly better than the other. Looks can be deceiving: the model represented by the orange line only did better because it is essentially a string of one-step forecasts, whereas the red line is truly out-of-sample for all 326 steps.

So far, we have only used the GBT model’s default parameters. The goal of hyperparameter tuning is to see whether a more carefully optimized model can produce even better results.

Hyperparameter tuning involves training the model on one slice of data and validating it on out-of-sample data (while keeping the test set separate), repeating this several times with a different hyperparameter combination each time and selecting the combination that returned the best performance. There are three ways we can go about this process:

  1. Train/Validation/Test Split
  2. Time Series Cross Validation
  3. Rolling Time Series Cross Validation

Create a Grid

Before employing any of these strategies, we should specify a hyperparameter grid for this model class:

# a simple grid
grid = {
    "max_depth": [2, 3, 5, None],
    "max_features": ["sqrt", "auto"],
    "subsample": [0.8, 0.9, 1],
}
f.ingest_grid(grid)

This grid uses a total of 24 hyperparameter combinations to choose the best model for the task. All three optimization strategies will return a different combination. We then take each of these optimized models to the test set to further evaluate their performance and choose which one we actually believe is best.
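As a quick sanity check on that count (4 max_depth values x 2 max_features values x 3 subsample values), the combinations can be enumerated directly:

from itertools import product

n_combos = len(list(product(*grid.values())))  # 4 * 2 * 3
print(n_combos)  # 24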

Train/Validation/Test Split

The first optimization strategy is to perform a third split, a validation split, on our data. In this example, we hold out 10% of the original data as the test set, use another 10% as a validation set for hyperparameter optimization, and train the models on the remaining 80%.
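In scalecast, that validation slice would be declared up front, before any tuning runs. A sketch of what that might look like, assuming the validation set is sized the same as the test set and that RMSE is the tuning metric (both assumptions):

f.set_validation_length(326)     # ~10% of observations reserved for tuning -- size assumed
f.set_validation_metric("rmse")  # metric used to rank hyperparameter combinations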

Image by author

This is the simplest and least computationally expensive of the methods we will overview and simply looks like this in code:

f.tune(dynamic_tuning=True)  # we specified a training/validation/test size earlier
f.auto_forecast(call_me="gbt_tuned")  # applies the optimal parameters obtained from tuning

It is important to note that, before evaluating on the test set, we retrain the model on the 80% training data combined with the 10% validation data, so that it has seen the most recent observations before it is tested out of sample again. If we don’t retrain the model this way, it will appear to underperform on the test set.

Time Series Cross Validation

The next strategy is more involved but can lead to better results: cross validation. On a cross-sectional dataset (not time series), the normal process is to split the data into k equally sized subsets (where k can be any integer greater than 1) and train the model k times, each time on k-1 of the subsets, validating on the subset that was held out. The idea is that you see the performance of each parameter combination on several out-of-sample trials, making you more certain of a model’s viability. Every observation is used for training (k-1 times) and for validation (exactly once). The errors from each out-of-sample trial are then averaged into a final error metric, which is used to determine the best hyperparameter values.
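For intuition, this is the procedure scikit-learn automates on cross-sectional data; a self-contained sketch with synthetic data (not the sunspots pipeline) might look like:

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=300, n_features=10, random_state=0)  # placeholder data
param_grid = {
    "max_depth": [2, 3, 5],
    "subsample": [0.8, 1.0],
}
search = GridSearchCV(
    estimator=GradientBoostingRegressor(),
    param_grid=param_grid,
    cv=5,                                   # k = 5 equally sized folds
    scoring="neg_root_mean_squared_error",  # errors averaged across the 5 folds
)
search.fit(X, y)
print(search.best_params_)                  # combination with the best average score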

Cross validation seems complicated the first time you see it, but more complexity can lead to better results, and several libraries make coding this process easy. Below is a diagram where k = 5.

Image by author

Time series puts a twist on this idea. Because the past profoundly impacts the present, the sequence of the data should usually be maintained. We can still use cross validation, but not every observation gets used for validation: the earliest slice of data only ever serves as training data, because nothing comes before it sequentially that could be used to train a model to validate it on.
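scikit-learn’s TimeSeriesSplit implements this same expanding-window idea; a small sketch that just prints the fold sizes:

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

y_demo = np.arange(25)              # stand-in for an ordered series
tscv = TimeSeriesSplit(n_splits=5)  # k = 5 expanding-window folds
for i, (train_idx, val_idx) in enumerate(tscv.split(y_demo)):
    # the training window grows each fold; validation always comes after it in time
    print(f"fold {i}: train={len(train_idx)}, validation={len(val_idx)}")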

Image by author

Running time series cross validation to tune the hyperparameters looks like this in code:

f.cross_validate(k=5, dynamic_tuning=True)
f.auto_forecast(call_me="gbt_cv")

Again, after finding the optimal hyperparameters from this process, we retrain the model on the entire training set before dynamically testing it again out-of-sample on the test set, which it has never seen before.

Rolling Time Series Cross Validation

You may notice that a main difference between time series cross validation and regular cross validation is that the time series version uses a differently sized training set in each fold. Therefore, folds with more data may return skewed error metrics compared to folds with less. To correct for this, we can use a rolling time series cross validation technique:

Image by author

Now every training set and every validation set are the same size, giving more balance to how each fold is weighted when the final error is averaged. Of course, this comes with a drawback: there is less data to train with in each fold, so models and hyperparameters that might benefit from more data can be overlooked. All this is to say that no one of these strategies is always better than the others; it comes down to user preference and knowledge of the data being used. In code, rolling cross validation looks like this:

f.cross_validate(k=5, rolling=True, dynamic_tuning=True)
f.auto_forecast(call_me="gbt_rolling_cv")
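For comparison, the rolling behavior can be mimicked outside scalecast by capping the training window, for example with TimeSeriesSplit’s max_train_size argument (an illustrative sketch on synthetic data, not scalecast’s internals):

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

y_demo = np.arange(25)
# max_train_size caps the window so every fold trains on equally sized data
rolling = TimeSeriesSplit(n_splits=5, max_train_size=5)
for i, (train_idx, val_idx) in enumerate(rolling.split(y_demo)):
    print(f"fold {i}: train={len(train_idx)}, validation={len(val_idx)}")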

We have now tuned the model using three strategies, and each returned a different set of optimal hyperparameters:

  • Train/validation/test split:
    – max_depth: 5
    – max_features: ‘sqrt’
    – subsample: 1
  • Cross validation:
    – max_depth: 3
    – max_features: ‘sqrt’
    – subsample: 1
  • Rolling cross validation:
    – max_depth: 2
    – max_features: ‘auto’
    – subsample: 1

We can plot all three of these models together and display the in-sample and out-of-sample error metrics in a table.
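In scalecast, that comparison might look something like the sketch below; the exact columns returned by export() vary by version, so treat the table handling as an assumption.

f.plot_test_set(
    models=["gbt_tuned", "gbt_cv", "gbt_rolling_cv"],
    include_train=False,
)
plt.show()

# summary table with in-sample and test-set error metrics
results = f.export("model_summaries", determine_best_by="TestSetRMSE")
print(results)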

Image by author
Image by author

We can ignore the first displayed model, as it was not dynamically tested. The best-performing model, therefore, was the one trained with default hyperparameters: it performs best out-of-sample and shows less evidence of overfitting. However, so that all of our validation work is not in vain (and because this is just an example), let’s carry forward the model optimized with rolling cross validation and call it our best model.

best_model = 'gbt_rolling_cv'

One more objective validation we can run on this model is a backtest. Backtesting answers the question of how a particular model, with a given set of parameters, would have performed over some number of recent forecast horizons. In our case, let’s say we want to implement this model to predict 10 years (120 periods) into the future. Let’s test it over the last 15 horizons of that size where we know the actual values, using only data that came before each horizon, and then look at its average performance across those 15 iterations.

f.backtest(best_model, n_iter=15, fcst_length=120)
f.export_backtest_metrics(best_model)
Image by author

This tells us that, on average, we could have seen an RMSE of 21, an MAE of 16, and an R2 of 74% over a full 120-period forecast window. I would say that is pretty good! You may be wondering why this is so much better than the test-set metrics suggested, and that is because the test set was almost three times as large as the 120-period horizon we have now evaluated. It makes sense that the model becomes less effective the further out it is expected to forecast. For this reason, in many applications, it can be a good idea to set the test-set size equal to the forecast horizon.
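If we were starting over with that principle in mind, the only change would come at setup time; a one-line sketch:

f.set_test_length(120)  # size the held-out test set to match the 120-period horizon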

The last, and arguably most important, validation step we will employ on this model is the “eye test.” Does the forecast look reasonable into the future horizon? Is it about what we would expect? We can plot the future results and decide:

f.plot(models=best_model, ci=True)
plt.show()
Image by author

If we think this model looks believable, then with all the validation and optimization performed so far, we can feel comfortable implementing it and knowing what to expect from it. I would say we have successfully found and validated an effective time series model!

We now know not only how not to validate a time series model, but what techniques can be employed to successfully optimize a model that can really work. We overviewed dynamic testing, tuning on a validation slice of data, cross validation, rolling cross validation, backtesting, and the eye test. Those are a lot of techniques!

If you found the discussion across the two blog posts interesting, consider checking out scalecast and giving it a star on GitHub. You can also follow me on Medium and sign up for email notifications. Until next time!

