More about forecasting in cienciadedatos.net
- Time series forecasting with machine learning
- ARIMA and SARIMAX models
- Time series forecasting with gradient boosting
- Exponential smoothing models
- ARAR forecasting models
- Visualizing time series data
- Electricity demand forecasting
- Global forecasting models I
- Global forecasting models II
- Global forecasting models III
- Clustering time series
- Forecasting with deep learning models
- Probabilistic forecasting I
- Probabilistic forecasting II
- Probabilistic forecasting III
- Data drift detection in time series forecasting models
- Forecasting categorical time series
- Interpretability and explainability in forecasting models
Introduction¶
The ARAR algorithm is a specialized time-series forecasting method designed to handle data with long-memory components or complex trends. Unlike traditional models, which rely on fixed transformations such as differencing to make the data stationary, ARAR employs a dynamic two-step process: transforming the data through "memory shortening" and then modeling the remaining signal. This approach allows ARAR to be highly automated, making it a powerful alternative for scenarios where manual model selection is impractical.
The ARAR algorithm is defined by its unique workflow, which separates the data cleaning process from the prediction process.
Phase 1: The Memory Shortening Process
Before any forecasting occurs, ARAR applies a series of adaptive filters to the raw data.
- Goal: To decorrelate the data and remove long-term dependencies (trends and seasonality) that confuse standard models.
- Method: It uses "memory-shortening" filters. Rather than just looking at the immediately preceding time step, these filters can subtract a fraction of a value from a longer lag (e.g., a week or a year ago).
- Result: Transformed data that is "short-memory," making it easier to predict using simple autoregressive rules.
Phase 2: Fitting the Model
Once the data has been transformed, the algorithm fits a standard model to the residuals.
- Model Type: It utilizes an AR (Autoregressive) model.
- Simplicity: Because Phase 1 does the heavy lifting of removing complex patterns, Phase 2 only needs to look at past values (an "all-pole" model) rather than modeling random shocks (Moving Averages).
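The following minimal sketch illustrates the two phases on a synthetic series. It is purely conceptual, not the skforecast implementation, and the filter values phi_hat and tau_hat are hypothetical rather than fitted:

# Conceptual sketch of the two ARAR phases (illustrative only)
# ==============================================================================
import numpy as np

rng = np.random.default_rng(42)
y = np.cumsum(rng.normal(size=200)) + 10  # synthetic trending series

# Phase 1: memory shortening, subtracting a fraction of a longer lag
phi_hat, tau_hat = 0.9, 12                # hypothetical filter values
s = y[tau_hat:] - phi_hat * y[:-tau_hat]  # S_t = Y_t - phi * Y_(t-tau)

# Phase 2: fit a simple autoregression to the mean-corrected shortened series
x = s - s.mean()
phi_ar = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])  # AR(1) by least squares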
✏️ Note
This document is part of the series of articles on time series forecasting with statistical models listed at the beginning of this page. For a more detailed explanation of ARIMA and SARIMAX models, see the ARIMA and SARIMAX models article in that series.
ARAR vs. ARIMA¶
While both methods aim to predict future values based on history, they approach the problem from fundamentally different angles.
| Feature | ARIMA (Auto-Regressive Integrated Moving Average) | ARAR (Memory Shortening + AR) |
|---|---|---|
| Transformation | Fixed Differencing. It removes trends by computing $y_t - y_{t-1}$ (the "Integration" step). | Adaptive Filters. It uses memory-shortening filters to subtract fractions of values from various time lags. |
| Model Form | ARMA. It fits a combination of Autoregressive (AR) and Moving Average (MA) terms. | AR Only. It fits a pure Autoregressive model to the transformed data. |
| Automation | Configurable. Requires selecting orders ($p, d, q$), though auto-selection methods (AIC, BIC) can assist. | Fully Automated. Designed to run with minimal human intervention, making it ideal for bulk processing. |
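The contrast in the transformation step is easy to see in code. Below is a toy pandas example, where phi_hat and tau_hat are placeholder values rather than fitted ones:

# Fixed differencing (ARIMA) vs. adaptive memory-shortening filter (ARAR)
# ==============================================================================
import numpy as np
import pandas as pd

y = pd.Series(np.cumsum(np.random.default_rng(0).normal(size=100)))

arima_transform = y.diff(1)                      # y_t - y_(t-1)
phi_hat, tau_hat = 0.9, 7                        # placeholder filter values
arar_transform = y - phi_hat * y.shift(tau_hat)  # y_t - phi * y_(t-tau)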
ARAR model theory¶
The ARAR model applies a memory-shortening transformation to a given time series $\{Y_{t}, t = 1, 2, ..., n\}$ if its underlying process is "long-memory", and then fits an autoregressive model to the transformed series.
Memory Shortening¶
The model follows five steps to classify $\{Y_t\}$ and take one of the following three actions:
- L: declare $\{Y_t\}$ a long-memory series and transform it to $\tilde{Y}_{t} = Y_{t} - \hat{\phi}Y_{t - \hat{\tau}}$
- M: declare $\{Y_t\}$ a moderately long-memory series and transform it to $\tilde{Y}_{t} = Y_{t} - \hat{\phi}_{1}Y_{t-1} - \hat{\phi}_{2}Y_{t-2}$
- S: declare $\{Y_t\}$ a short-memory series.
If $\{Y_t\}$ is declared L or M, the transformed series is classified again and, if necessary, transformed further. The process continues until the transformed series is classified as short memory, up to a maximum of three transformations; in practice, it is very rare for a series to require more than two. The classification rules are:
- For each $\tau = 1, 2, ..., 15$, find the value $\hat{\phi}(\tau)$ of $\phi$ that minimizes $ERR(\phi, \tau) = \frac{\sum_{t=\tau +1 }^{n} [Y_{t} - \phi Y_{t-\tau}]^2 }{\sum_{t=\tau +1 }^{n} Y_{t}^{2}}$, then define $Err(\tau) = ERR(\hat{\phi}(\tau), \tau)$ and choose the lag $\hat{\tau}$ to be the value of $\tau$ that minimizes $Err(\tau)$.
- If $Err(\hat{\tau}) \leq 8/n$, $\{Y_t\}$ is declared a long-memory series (action L).
- If $\hat{\phi}(\hat{\tau}) \geq 0.93$ and $\hat{\tau} > 2$, $\{Y_t\}$ is declared a long-memory series (action L).
- If $\hat{\phi}(\hat{\tau}) \geq 0.93$ and $\hat{\tau} = 1$ or $2$, $\{Y_t\}$ is declared a moderately long-memory series (action M); the coefficients $\hat{\phi}_1$ and $\hat{\phi}_2$ are obtained by minimizing $\sum_{t=3}^{n} [Y_t - \phi_1 Y_{t-1} - \phi_2 Y_{t-2}]^2$.
- If $\hat{\phi}(\hat{\tau}) < 0.93$, $\{Y_t\}$ is declared a short-memory series (action S).
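These rules translate almost directly into code. The sketch below follows the steps above; the function name classify_memory is hypothetical and this is not the skforecast implementation:

# Sketch of the memory-shortening classification (illustrative only)
# ==============================================================================
import numpy as np

def classify_memory(y, max_tau=15):
    # Returns (action, tau_hat, phi_hat) following the five steps above
    n = len(y)
    phis, errs = [], []
    for tau in range(1, max_tau + 1):
        # Least-squares value of phi minimizing ERR(phi, tau)
        phi = np.dot(y[tau:], y[:-tau]) / np.dot(y[:-tau], y[:-tau])
        errs.append(np.sum((y[tau:] - phi * y[:-tau]) ** 2) / np.sum(y[tau:] ** 2))
        phis.append(phi)
    tau_hat = int(np.argmin(errs)) + 1
    phi_hat = phis[tau_hat - 1]
    if errs[tau_hat - 1] <= 8 / n or (phi_hat >= 0.93 and tau_hat > 2):
        return 'L', tau_hat, phi_hat  # long memory
    if phi_hat >= 0.93:               # here tau_hat is 1 or 2
        return 'M', tau_hat, phi_hat  # moderately long memory (fit AR(2) filter)
    return 'S', tau_hat, phi_hat      # short memory

action, tau_hat, phi_hat = classify_memory(np.cumsum(np.random.default_rng(1).normal(size=150)))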
Subset Autoregressive Model¶
The following describes how the ARAR algorithm fits an autoregressive process to the mean-corrected series $X_{t} = S_{t} - \bar{S}$, $t = k+1, ..., n$, where $\{S_{t}, t = k + 1, ..., n\}$ is the memory-shortened version of $\{Y_t\}$ derived from the steps described above and $\bar{S}$ is the sample mean of $S_{k+1}, ..., S_{n}$.
The fitted model has the following form:
$X_{t} = \phi_{1}X_{t-1} + \phi_{l_1}X_{t-l_{1}} + \phi_{l_2}X_{t-l_{2}} + \phi_{l_3}X_{t-l_{3}} + Z_{t}$
where $Z_{t} \sim WN(0, \sigma^{2})$. The coefficients $\phi_{j}$ and the white noise variance $\sigma^2$ can be derived from the Yule-Walker equations for given lags $l_1, l_2$, and $l_3$:
\begin{equation} \begin{bmatrix} 1 & \hat{\rho}(l_1 - 1) & \hat{\rho}(l_2 - 1) & \hat{\rho}(l_3 - 1)\\ \hat{\rho}(l_1 - 1) & 1 & \hat{\rho}(l_2 - l_1) & \hat{\rho}(l_3 - l_1)\\ \hat{\rho}(l_2 - 1) & \hat{\rho}(l_2 - l_1) & 1 & \hat{\rho}(l_3 - l_2)\\ \hat{\rho}(l_3 - 1) & \hat{\rho}(l_3 - l_1) & \hat{\rho}(l_3 - l_2) & 1 \end{bmatrix}\begin{bmatrix} \phi_{1} \\ \phi_{l_1} \\ \phi_{l_2}\\ \phi_{l_3} \end{bmatrix} = \begin{bmatrix} \hat{\rho}(1) \\ \hat{\rho}(l_1) \\ \hat{\rho}(l_2)\\ \hat{\rho}(l_3) \end{bmatrix} \end{equation}and $\sigma^2 = \hat{\gamma}(0)\left[1 - \phi_1\hat{\rho}(1) - \phi_{l_1}\hat{\rho}(l_1) - \phi_{l_2}\hat{\rho}(l_2) - \phi_{l_3}\hat{\rho}(l_3)\right]$, where $\hat{\gamma}(j)$ and $\hat{\rho}(j)$, $j = 0, 1, 2, ...,$ are the sample autocovariances and autocorrelations of the series $X_{t}$.
The algorithm computes the coefficients $\phi_{j}$ for each set of lags such that $1 < l_1 < l_2 < l_3 \leq m$, where $m$ is chosen to be 13 or 26, and selects the model for which the Yule-Walker estimate of $\sigma^2$ is minimal.
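In code, the lag search and the Yule-Walker solve can be sketched as follows. The helper names sample_acf and fit_subset_ar are hypothetical; this is a direct transcription of the equations above, not the skforecast implementation:

# Sketch of the subset AR fit via Yule-Walker (illustrative only)
# ==============================================================================
import numpy as np
from itertools import combinations

def sample_acf(x, max_lag):
    # Sample autocorrelations rho(0), ..., rho(max_lag)
    x = x - x.mean()
    c = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])
    return c / c[0]

def fit_subset_ar(x, m=13):
    # Choose (l1, l2, l3) minimizing the Yule-Walker estimate of sigma^2
    rho = sample_acf(x, max_lag=m)
    gamma0 = np.var(x)  # sample autocovariance at lag 0
    best = None
    for l1, l2, l3 in combinations(range(2, m + 1), 3):
        lags = (1, l1, l2, l3)
        R = np.array([[rho[abs(a - b)] for b in lags] for a in lags])
        r = np.array([rho[l] for l in lags])
        phi = np.linalg.solve(R, r)        # Yule-Walker equations
        sigma2 = gamma0 * (1.0 - phi @ r)
        if best is None or sigma2 < best[0]:
            best = (sigma2, lags, phi)
    return best  # (sigma2, selected lags, coefficients)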
Forecasting¶
The memory-shortening filter found in the first step has coefficients $\Psi_0, \Psi_1, ..., \Psi_k$ ($k \geq 0$), where $\Psi_0 = 1$. The transformed series can then be expressed as \begin{equation} S_t = \Psi(B)Y_t = Y_t + \Psi_1 Y_{t-1} + ...+ \Psi_k Y_{t-k}, \end{equation} where $\Psi(B) = 1 + \Psi_1B + ...+ \Psi_k B^k$ is a polynomial in the back-shift operator.
If the subset autoregression found in the second step has coefficients $\phi_1, \phi_{l_1}, \phi_{l_2}$, and $\phi_{l_3}$, then the subset AR model for $X_t = S_t - \bar{S}$ is
\begin{equation} \phi(B)X_t = Z_t, \end{equation}where $Z_t$ is a white-noise series with zero mean and constant variance and $\phi(B) = 1 - \phi_1B - \phi_{l_1}B^{l_1} - \phi_{l_2}B^{l_2} - \phi_{l_3}B^{l_3}$. From equations (1) and (2) one obtains
\begin{equation} \xi(B)Y_t = \phi(1)\bar{S} + Z_t, \end{equation}where $\xi (B) = \Psi(B)\phi(B)$.
Assuming that the fitted model in equation (3) is appropriate and that $Z_t$ is uncorrelated with $Y_j$ for all $j < t$, the minimum mean squared error linear predictors $P_n Y_{n+h}$ of $Y_{n+h}$ in terms of $\{1, Y_1, ..., Y_n\}$, for $n > k + l_3$, are obtained from the recursions
\begin{equation} P_n Y_{n+h} = - \sum_{j = 1}^{k + l_3} \xi_{j} P_{n}Y_{n+h-j} + \phi(1)\bar{S}, \quad h \geq 1, \end{equation}with the initial conditions $P_n Y_{n+h} = Y_{n+h}$ for $h \leq 0$.
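Combining both filters, the recursion in equation (4) can be sketched as below. The helper arar_forecast is hypothetical, and psi, phi, and lags are assumed to come from the two previous steps (the series must contain at least $k + l_3$ observations):

# Sketch of the forecast recursion in equation (4) (illustrative only)
# ==============================================================================
import numpy as np

def arar_forecast(y, psi, phi, lags, s_bar, h):
    # psi:  memory-shortening filter [1, Psi_1, ..., Psi_k]
    # phi:  subset AR coefficients (phi_1, phi_l1, phi_l2, phi_l3)
    # lags: the corresponding lags (1, l1, l2, l3)
    phi_poly = np.zeros(max(lags) + 1)
    phi_poly[0] = 1.0
    for coef, lag in zip(phi, lags):
        phi_poly[lag] = -coef              # phi(B) = 1 - sum_l phi_l B^l
    xi = np.convolve(psi, phi_poly)        # xi(B) = Psi(B) * phi(B)
    k_l3 = len(xi) - 1                     # k + l3 past values are needed
    const = phi_poly.sum() * s_bar         # phi(1) * S_bar
    hist = list(y)
    for _ in range(h):
        past = hist[-1:-k_l3 - 1:-1]       # [Y_(t-1), ..., Y_(t-k-l3)]
        hist.append(const - np.dot(xi[1:], past))
    return np.array(hist[-h:])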
Reference: Brockwell, Peter J., and Richard A. Davis. Introduction to Time Series and Forecasting. Springer, 2016.
✏️ Note
The Python implementation of the ARAR algorithm in skforecast is based on the Julia package Durbyn.jl, developed by Resul Akay.
Libraries and data¶
# Libraries
# ==============================================================================
import pandas as pd
import matplotlib.pyplot as plt
from skforecast.stats import Arar
from skforecast.recursive import ForecasterStats
from skforecast.model_selection import TimeSeriesFold, backtesting_stats
from skforecast.datasets import fetch_dataset
from skforecast.plot import set_dark_theme, plot_prediction_intervals
# Data download
# ==============================================================================
data = fetch_dataset(name='fuel_consumption', raw=True)
data = data[['Fecha', 'Gasolinas']]
data = data.rename(columns={'Fecha':'date', 'Gasolinas':'litters'})
data['date'] = pd.to_datetime(data['date'], format='%Y-%m-%d')
data = data.set_index('date')
data = data.loc[:'1990-01-01 00:00:00']
data = data.asfreq('MS')
data = data['litters'].rename('y')
display(data.head(4))
fuel_consumption
----------------
Description: Monthly fuel consumption in Spain from 1969-01-01 to 2022-08-01.
Source: Obtained from Corporación de Reservas Estratégicas de Productos Petrolíferos and Corporación de Derecho Público tutelada por el Ministerio para la Transición Ecológica y el Reto Demográfico. https://www.cores.es/es/estadisticas
URL: https://raw.githubusercontent.com/skforecast/skforecast-datasets/main/data/consumos-combustibles-mensual.csv
Shape: 644 rows x 6 columns
date
1969-01-01    166875.2129
1969-02-01    155466.8105
1969-03-01    184983.6699
1969-04-01    202319.8164
Freq: MS, Name: y, dtype: float64
# Train-test dates
# ==============================================================================
end_train = '1983-01-01 23:59:59'
print(
f"Train dates : {data.index.min()} --- {data.loc[:end_train].index.max()} "
f"(n={len(data.loc[:end_train])})"
)
print(
f"Test dates : {data.loc[end_train:].index.min()} --- {data.loc[:].index.max()} "
f"(n={len(data.loc[end_train:])})"
)
data_train = data.loc[:end_train]
data_test = data.loc[end_train:]
# Plot
# ==============================================================================
set_dark_theme()
fig, ax = plt.subplots(figsize=(7, 3))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
ax.set_title('Monthly fuel consumption in Spain')
ax.legend();
Train dates : 1969-01-01 00:00:00 --- 1983-01-01 00:00:00 (n=169)
Test dates  : 1983-02-01 00:00:00 --- 1990-01-01 00:00:00 (n=84)
ARAR¶
Skforecast provides the class Arar to facilitate the creation of ARAR models in Python, allowing users to easily fit and forecast time series data using this approach.
Two arguments can be specified when creating an instance of the Arar class:
- max_ar_depth: Maximum lag considered when fitting the subset autoregressive model. If not specified, the default is data-dependent: 26 for series with more than 40 observations, 13 for series with 13-40 observations, or max(4, ⌈n/3⌉) for shorter series.
- max_lag: Maximum lag used when computing autocovariances. If not specified, the default is data-dependent: 40 for series with more than 40 observations, 13 for series with 13-40 observations, or max(4, ⌈n/2⌉) for shorter series.
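For example, both arguments can be set explicitly when instantiating the class (the values below are illustrative, not recommended defaults):

# ARAR model with explicit arguments (illustrative values)
# ==============================================================================
model = Arar(max_ar_depth=13, max_lag=26)
model.fit(y=data_train)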
# ARAR model
# ==============================================================================
model = Arar()
model.fit(y=data_train)
Arar(lags=(1, 2, 12, 13))
Parameters
| Parameter | Value |
|---|---|
| max_ar_depth | 26 |
| max_lag | 40 |
| safe | True |
Once the model is fitted, future observations can be forecasted using the predict and predict_interval methods.
# Prediction
# ==============================================================================
model.predict(steps=10)
array([409292.69510231, 474934.65681833, 533164.63956981, 471306.38987833,
521540.20231166, 624704.82153381, 621950.46146758, 534146.97012364,
518085.06631994, 468517.6939802 ])
# Prediction interval
# ==============================================================================
model.predict_interval(steps=10, level=[95])
| step | mean | lower_95 | upper_95 |
|---|---|---|---|
| 1 | 409292.695102 | 373011.640924 | 445573.749280 |
| 2 | 474934.656818 | 436834.085425 | 513035.228212 |
| 3 | 533164.639570 | 490061.171827 | 576268.107312 |
| 4 | 471306.389878 | 426635.407785 | 515977.371972 |
| 5 | 521540.202312 | 475047.704232 | 568032.700391 |
| 6 | 624704.821534 | 577263.161234 | 672146.481834 |
| 7 | 621950.461468 | 573688.254150 | 670212.668785 |
| 8 | 534146.970124 | 485362.925306 | 582931.014941 |
| 9 | 518085.066320 | 468896.704356 | 567273.428283 |
| 10 | 468517.693980 | 419051.025062 | 517984.362899 |
ForecasterStats¶
The previous section introduced the construction of ARAR models. To integrate these models with the rest of skforecast's functionality, the next step is to encapsulate the skforecast Arar model within a ForecasterStats object. This wrapper standardizes the model's interface so that skforecast's full set of capabilities (backtesting, prediction intervals, plotting, etc.) can be used seamlessly.
Train¶
The train-prediction process follows an API similar to that of scikit-learn. More details in the ForecasterStats user guide.
# Create and fit ForecasterStats
# ==============================================================================
forecaster = ForecasterStats(estimator=Arar())
forecaster.fit(y=data_train)
forecaster
ForecasterStats
General Information
- Estimators:
- skforecast.Arar: Arar(lags=(1, 2, 12, 13))
- Window size: 1
- Series name: y
- Exogenous included: False
- Creation date: 2026-02-02 13:49:39
- Last fit date: 2026-02-02 13:49:39
- Skforecast version: 0.20.0
- Python version: 3.13.11
- Forecaster id: None
Exogenous Variables
- None
Data Transformations
- Transformer for y: None
- Transformer for exog: None
Training Information
- Training range: [Timestamp('1969-01-01 00:00:00'), Timestamp('1983-01-01 00:00:00')]
- Training index type: DatetimeIndex
- Training index frequency: MS
Estimator Parameters
- skforecast.Arar: {'max_ar_depth': 26, 'max_lag': 40, 'safe': True}
Fit Kwargs
- None
Prediction¶
Once the model is fitted, it can be used to forecast future observations. It is important to note that these types of models require predictions to follow immediately after the training data; therefore, the forecast starts right after the last observed value.
# Predict
# ==============================================================================
steps = len(data_test)
predictions = forecaster.predict(steps=steps)
predictions.head(3)
1983-02-01    409292.695102
1983-03-01    474934.656818
1983-04-01    533164.639570
Freq: MS, Name: pred, dtype: float64
Prediction intervals¶
The method predict_interval enables the calculation of prediction intervals for the forecasted values. Users can specify the confidence level of the estimated interval using either the alpha or interval argument.
# Predict intervals
# ==============================================================================
predictions = forecaster.predict_interval(steps=steps, alpha=0.05)
predictions.head(3)
| date | pred | lower_bound | upper_bound |
|---|---|---|---|
| 1983-02-01 | 409292.695102 | 373011.640924 | 445573.749280 |
| 1983-03-01 | 474934.656818 | 436834.085425 | 513035.228212 |
| 1983-04-01 | 533164.639570 | 490061.171827 | 576268.107312 |
# Plot predictions
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
plot_prediction_intervals(
predictions = predictions,
y_true = data_test,
target_variable = "y",
title = "Prediction intervals in test data",
kwargs_fill_between = {'color': 'white', 'alpha': 0.3, 'zorder': 1},
ax = ax
)
Feature importances¶
The method get_feature_importances returns the estimated coefficients of the fitted subset autoregressive model. These coefficients indicate the weight assigned to each selected lag in the model's equation.
# Feature importances
# ==============================================================================
model.get_feature_importances()
|   | feature | importance |
|---|---|---|
| 0 | lag_1 | 0.320650 |
| 1 | lag_2 | 0.452729 |
| 2 | lag_12 | -0.335571 |
| 3 | lag_13 | 0.275434 |
Backtesting¶
ARAR and other statistical models, once integrated in a ForecasterStats object, can be evaluated using any of the backtesting strategies implemented in skforecast.
✏️ Note
Why do statistical models require refitting during backtesting?
Unlike machine learning models, statistical models like ARAR maintain an internal state that depends on the sequence of observations. They can only generate predictions starting from the last observed time step — they cannot "jump" to an arbitrary point in the future without knowing all previous values.
During backtesting, when the validation window moves forward, the model must be refitted to incorporate the new observations and update its internal state. This is why refit=True is typically required.
Performance optimization: Because refitting is mandatory, skforecast's Numba-optimized backend becomes essential. It enables hundreds of refits during backtesting in a fraction of the time required by non-optimized libraries.
# Create forecaster
# ==============================================================================
forecaster = ForecasterStats(estimator=Arar())
# Backtesting
# ==============================================================================
cv = TimeSeriesFold(
steps = 12,
initial_train_size = len(data_train),
refit = True,
)
metrics, predictions = backtesting_stats(
forecaster = forecaster,
y = data,
cv = cv,
metric = ['mean_absolute_error', 'mean_absolute_percentage_error'],
suppress_warnings = True,
verbose = True,
)
Information of folds
--------------------
Number of observations used for initial training: 169
Number of observations used for backtesting: 84
Number of folds: 7
Number skipped folds: 0
Number of steps per fold: 12
Number of steps to exclude between last observed data (last window) and predictions (gap): 0
Fold: 0
Training: 1969-01-01 00:00:00 -- 1983-01-01 00:00:00 (n=169)
Validation: 1983-02-01 00:00:00 -- 1984-01-01 00:00:00 (n=12)
Fold: 1
Training: 1970-01-01 00:00:00 -- 1984-01-01 00:00:00 (n=169)
Validation: 1984-02-01 00:00:00 -- 1985-01-01 00:00:00 (n=12)
Fold: 2
Training: 1971-01-01 00:00:00 -- 1985-01-01 00:00:00 (n=169)
Validation: 1985-02-01 00:00:00 -- 1986-01-01 00:00:00 (n=12)
Fold: 3
Training: 1972-01-01 00:00:00 -- 1986-01-01 00:00:00 (n=169)
Validation: 1986-02-01 00:00:00 -- 1987-01-01 00:00:00 (n=12)
Fold: 4
Training: 1973-01-01 00:00:00 -- 1987-01-01 00:00:00 (n=169)
Validation: 1987-02-01 00:00:00 -- 1988-01-01 00:00:00 (n=12)
Fold: 5
Training: 1974-01-01 00:00:00 -- 1988-01-01 00:00:00 (n=169)
Validation: 1988-02-01 00:00:00 -- 1989-01-01 00:00:00 (n=12)
Fold: 6
Training: 1975-01-01 00:00:00 -- 1989-01-01 00:00:00 (n=169)
Validation: 1989-02-01 00:00:00 -- 1990-01-01 00:00:00 (n=12)
# Backtest predictions
# ==============================================================================
predictions.head(4)
| date | fold | pred |
|---|---|---|
| 1983-02-01 | 0 | 409292.695102 |
| 1983-03-01 | 0 | 474934.656818 |
| 1983-04-01 | 0 | 533164.639570 |
| 1983-05-01 | 0 | 471306.389878 |
# Backtest metrics
# ==============================================================================
metrics
|   | mean_absolute_error | mean_absolute_percentage_error |
|---|---|---|
| 0 | 24422.818488 | 0.045112 |
# Plot backtest predictions
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
data.loc[end_train:].plot(ax=ax, label='test')
predictions['pred'].plot(ax=ax)
ax.set_title('Backtest predictions (folds of 12 months)')
ax.legend();
Session information¶
import session_info
session_info.show(html=False)
-----
matplotlib          3.10.8
pandas              2.3.3
session_info        v1.0.1
skforecast          0.20.0
-----
IPython             9.8.0
jupyter_client      7.4.9
jupyter_core        5.9.1
notebook            6.5.7
-----
Python 3.13.11 | packaged by Anaconda, Inc. | (main, Dec 10 2025, 21:28:48) [GCC 14.3.0]
Linux-6.14.0-37-generic-x86_64-with-glibc2.39
-----
Session information updated at 2026-02-02 13:49
Citation¶
How to cite this document
If you use this document or any part of it, please acknowledge the source, thank you!
ARAR forecasting models in Python by Joaquín Amat Rodrigo, Javier Escobar Ortiz and Resul Akay, available under Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) at https://cienciadedatos.net/documentos/py73-arar-forecasting-models-python.html
How to cite skforecast
If you use skforecast in a publication, we would appreciate it if you cite the published software.
Zenodo:
Amat Rodrigo, Joaquin, & Escobar Ortiz, Javier. (2024). skforecast (v0.20.0). Zenodo. https://doi.org/10.5281/zenodo.8382787
APA:
Amat Rodrigo, J., & Escobar Ortiz, J. (2024). skforecast (Version 0.20.0) [Computer software]. https://doi.org/10.5281/zenodo.8382787
BibTeX:
@software{skforecast,
  author  = {Amat Rodrigo, Joaquin and Escobar Ortiz, Javier},
  title   = {skforecast},
  version = {0.20.0},
  month   = {01},
  year    = {2026},
  license = {BSD-3-Clause},
  url     = {https://skforecast.org/},
  doi     = {10.5281/zenodo.8382788}
}
Did you like the article? Your support is important
Your contribution will help me to continue generating free educational content. Many thanks! 😊
This work by Joaquín Amat Rodrigo, Javier Escobar Ortiz and Resul Akay is licensed under a Attribution-NonCommercial-ShareAlike 4.0 International.
Allowed:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial: You may not use the material for commercial purposes.
- ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
