VARMAX models

This notebook is a brief introduction to VARMAX models in Statsmodels. The VARMAX model is generically specified as: $$ y_t = \nu + A_1 y_{t-1} + \dots + A_p y_{t-p} + B x_t + \epsilon_t + M_1 \epsilon_{t-1} + \dots + M_q \epsilon_{t-q} $$

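where $y_t$ is a $\text{k\_endog} \times 1$ vector of endogenous variables, $x_t$ is a vector of exogenous regressors, $\epsilon_t$ is the innovation vector, and $\nu$, $A_i$, $B$, and $M_j$ are the intercept and coefficient matrices.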
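
To make the recursion concrete, here is a minimal NumPy sketch that simulates data from the equation above. It is an illustration only, not how Statsmodels simulates internally: the helper name `simulate_varmax`, the coefficient values, the Gaussian innovations, and the choice to treat pre-sample values as zero are all assumptions made for this example.

```python
import numpy as np

def simulate_varmax(nu, A, M, B, x, sigma, seed=0):
    """Simulate y_t = nu + sum_i A_i y_{t-i} + B x_t + eps_t + sum_j M_j eps_{t-j}.

    Hypothetical helper: pre-sample values of y and eps are treated as zero.
    """
    rng = np.random.default_rng(seed)
    nobs, k_endog = x.shape[0], nu.shape[0]
    # Assumption of this sketch: Gaussian innovations with covariance `sigma`
    eps = rng.multivariate_normal(np.zeros(k_endog), sigma, size=nobs)
    y = np.zeros((nobs, k_endog))
    for t in range(nobs):
        y[t] = nu + B @ x[t] + eps[t]
        # Autoregressive terms A_1 y_{t-1} + ... + A_p y_{t-p}
        for i, A_i in enumerate(A, start=1):
            if t - i >= 0:
                y[t] += A_i @ y[t - i]
        # Moving average terms M_1 eps_{t-1} + ... + M_q eps_{t-q}
        for j, M_j in enumerate(M, start=1):
            if t - j >= 0:
                y[t] += M_j @ eps[t - j]
    return y

# Illustrative values for a VARMAX(1,1) with k_endog = 2 and one exogenous series
nu = np.array([0.01, 0.02])
A = [np.array([[0.5, 0.1], [0.0, 0.3]])]
M = [np.array([[0.2, 0.0], [0.1, 0.2]])]
B = np.array([[0.5], [1.0]])
x = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
sigma = 1e-2 * np.array([[1.0, 0.3], [0.3, 1.0]])

y = simulate_varmax(nu, A, M, B, x, sigma)
print(y.shape)
```

Setting `M = []` reduces the sketch to a VARX(p) process, and setting `A = []` reduces it to a VMA(q) process with an exogenous term.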

In [1]:
%matplotlib inline
In [2]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
In [3]:
dta = sm.datasets.webuse('lutkepohl2', 'https://www.stata-press.com/data/r12/')
dta.index = dta.qtr
endog = dta.loc['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]

Model specification

The VARMAX class in Statsmodels allows estimation of VAR, VMA, and VARMA models (through the order argument), optionally with a constant term (via the trend argument). Exogenous regressors may also be included (as usual in Statsmodels, by the exog argument), and in this way a time trend may be added. Finally, the class allows measurement error (via the measurement_error argument) and allows specifying either a diagonal or unstructured innovation covariance matrix (via the error_cov_type argument).

Example 1: VAR

Below is a simple VARX(2) model in two endogenous variables and an exogenous series, but no constant term. Notice that we needed to allow for more iterations than the default (which is maxiter=50) in order for the likelihood estimation to converge. This is not unusual in VAR models, which have to estimate a large number of parameters, often from a relatively small number of observations: this model, for example, estimates 13 parameters (8 autoregressive coefficients, 2 exogenous coefficients, and 3 error covariance terms) from 75 observations of 2 endogenous variables.

In [4]:
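# Use consumption growth as the exogenous regressor
exog = endog['dln_consump']
# trend='n' omits the constant term (spelled 'nc' in older Statsmodels releases)
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(2, 0), trend='n', exog=exog)
# Raise maxiter above the default so the optimizer has room to converge
res = mod.fit(maxiter=1000, disp=False)
print(res.summary())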
/home/travis/build/statsmodels/statsmodels/statsmodels/tsa/base/tsa_model.py:165: ValueWarning: No frequency information was provided, so inferred frequency QS-OCT will be used.
  % freq, ValueWarning)
                             Statespace Model Results                             
==================================================================================
Dep. Variable:     ['dln_inv', 'dln_inc']   No. Observations:                   75
Model:                            VARX(2)   Log Likelihood                 361.038
Date:                    Fri, 19 Jul 2019   AIC                           -696.077
Time:                            16:50:58   BIC                           -665.949
Sample:                        04-01-1960   HQIC                          -684.047
                             - 10-01-1978                                         
Covariance Type:                      opg                                         
===================================================================================
Ljung-Box (Q):                61.24, 39.25   Jarque-Bera (JB):          11.14, 2.41
Prob(Q):                        0.02, 0.50   Prob(JB):                   0.00, 0.30
Heteroskedasticity (H):         0.45, 0.40   Skew:                      0.16, -0.38
Prob(H) (two-sided):            0.05, 0.03   Kurtosis:                   4.86, 3.44
                            Results for equation dln_inv                            
====================================================================================
                       coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------------
L1.dln_inv          -0.2388      0.093     -2.564      0.010      -0.421      -0.056
L1.dln_inc           0.2861      0.450      0.636      0.525      -0.595       1.167
L2.dln_inv          -0.1665      0.155     -1.072      0.284      -0.471       0.138
L2.dln_inc           0.0628      0.421      0.149      0.881      -0.762       0.888
beta.dln_consump     0.9750      0.638      1.528      0.127      -0.276       2.226
                            Results for equation dln_inc                            
====================================================================================
                       coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------------
L1.dln_inv           0.0633      0.036      1.773      0.076      -0.007       0.133
L1.dln_inc           0.0811      0.107      0.758      0.448      -0.129       0.291
L2.dln_inv           0.0104      0.033      0.315      0.753      -0.054       0.075
L2.dln_inc           0.0350      0.134      0.261      0.794      -0.228       0.298
beta.dln_consump     0.7731      0.112      6.879      0.000       0.553       0.993
                                  Error covariance matrix                                   
============================================================================================
                               coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------
sqrt.var.dln_inv             0.0434      0.004     12.284      0.000       0.036       0.050
sqrt.cov.dln_inv.dln_inc   5.58e-05      0.002      0.028      0.978      -0.004       0.004
sqrt.var.dln_inc             0.0109      0.001     11.222      0.000       0.009       0.013
============================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
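
The summary above lists 13 estimated parameters. As a rough check, that count can be reproduced by simple arithmetic. The helper below, `count_varmax_params`, is hypothetical and written only for this illustration; it is not part of Statsmodels, and it covers only a constant trend, not time trends.

```python
def count_varmax_params(k_endog, p, q, k_exog=0, constant=True,
                        error_cov_type='unstructured', measurement_error=False):
    """Count the free parameters of a VARMAX specification (illustrative helper)."""
    n = k_endog * int(constant)          # intercepts, one per equation
    n += k_endog ** 2 * (p + q)          # AR and MA coefficient matrices
    n += k_endog * k_exog                # exogenous regression coefficients
    if error_cov_type == 'diagonal':
        n += k_endog                     # variances only
    else:
        n += k_endog * (k_endog + 1) // 2  # lower triangle of the covariance factor
    n += k_endog * int(measurement_error)  # measurement error variances
    return n

# VARX(2): two endogenous variables, one exogenous series, no constant
print(count_varmax_params(2, p=2, q=0, k_exog=1, constant=False))  # prints 13
```

The count grows with the square of `k_endog`, which is why even modest VAR models can strain a sample of 75 observations.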

From the estimated VAR model, we can plot the impulse response functions of the endogenous variables.

In [5]:
ax = res.impulse_responses(10, orthogonalized=True).plot(figsize=(13,3))
ax.set(xlabel='t', title='Responses to a shock to `dln_inv`');
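
To see what the orthogonalized responses represent, here is a sketch that recomputes them by hand from the point estimates printed in the summary above. It rests on one assumption: that the `sqrt.var` and `sqrt.cov` entries are the elements of the lower-triangular Cholesky factor of the innovation covariance matrix. The function `orthogonalized_irf` is a hypothetical helper, not a Statsmodels function, and the exogenous term is left out because it does not affect the response to an innovation in a linear model.

```python
import numpy as np

# Point estimates copied from the VARX(2) summary (rows: dln_inv, dln_inc equations)
A1 = np.array([[-0.2388, 0.2861],
               [0.0633, 0.0811]])
A2 = np.array([[-0.1665, 0.0628],
               [0.0104, 0.0350]])
# Assumed lower-triangular Cholesky factor built from sqrt.var / sqrt.cov
chol = np.array([[0.0434, 0.0],
                 [5.58e-05, 0.0109]])

def orthogonalized_irf(ar_coefs, chol, steps, impulse=0):
    """Responses of all variables to a one-unit orthogonalized shock."""
    k = chol.shape[0]
    psi = [np.eye(k)]  # Psi_0 = I
    for h in range(1, steps + 1):
        # Psi_h = A_1 Psi_{h-1} + ... + A_p Psi_{h-p}
        psi_h = np.zeros((k, k))
        for i, A_i in enumerate(ar_coefs, start=1):
            if h - i >= 0:
                psi_h += A_i @ psi[h - i]
        psi.append(psi_h)
    # Orthogonalize by post-multiplying with the chosen column of the factor
    return np.array([p @ chol[:, impulse] for p in psi])

irf = orthogonalized_irf([A1, A2], chol, steps=10)
print(irf.shape)
```

Each row of `irf` holds the responses of `dln_inv` and `dln_inc` at one horizon, starting with the impact period.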

Example 2: VMA

A vector moving average model can also be formulated. Below we show a VMA(2) on the same data, but with uncorrelated innovations (a diagonal error covariance matrix). In this example we leave out the exogenous regressor but now include the constant term.

In [6]:
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(0,2), error_cov_type='diagonal')
res = mod.fit(maxiter=1000, disp=False)
print(res.summary())
/home/travis/build/statsmodels/statsmodels/statsmodels/tsa/base/tsa_model.py:165: ValueWarning: No frequency information was provided, so inferred frequency QS-OCT will be used.
  % freq, ValueWarning)
                             Statespace Model Results                             
==================================================================================
Dep. Variable:     ['dln_inv', 'dln_inc']   No. Observations:                   75
Model:                             VMA(2)   Log Likelihood                 353.887
                              + intercept   AIC                           -683.774
Date:                    Fri, 19 Jul 2019   BIC                           -655.964
Time:                            16:51:02   HQIC                          -672.670
Sample:                        04-01-1960                                         
                             - 10-01-1978                                         
Covariance Type:                      opg                                         
===================================================================================
Ljung-Box (Q):                68.61, 39.33   Jarque-Bera (JB):         12.77, 13.96
Prob(Q):                        0.00, 0.50   Prob(JB):                   0.00, 0.00
Heteroskedasticity (H):         0.44, 0.81   Skew:                      0.06, -0.49
Prob(H) (two-sided):            0.04, 0.59   Kurtosis:                   5.02, 4.87
                           Results for equation dln_inv                          
=================================================================================
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
intercept         0.0182      0.005      3.809      0.000       0.009       0.028
L1.e(dln_inv)    -0.2576      0.106     -2.437      0.015      -0.465      -0.050
L1.e(dln_inc)     0.5044      0.629      0.802      0.422      -0.728       1.737
L2.e(dln_inv)     0.0286      0.149      0.192      0.848      -0.264       0.321
L2.e(dln_inc)     0.1951      0.475      0.410      0.682      -0.737       1.127
                           Results for equation dln_inc                          
=================================================================================
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
intercept         0.0207      0.002     13.065      0.000       0.018       0.024
L1.e(dln_inv)     0.0477      0.042      1.145      0.252      -0.034       0.129
L1.e(dln_inc)    -0.0709      0.141     -0.503      0.615      -0.347       0.205
L2.e(dln_inv)     0.0181      0.043      0.424      0.672      -0.065       0.102
L2.e(dln_inc)     0.1199      0.154      0.780      0.435      -0.181       0.421
                             Error covariance matrix                              
==================================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
----------------------------------------------------------------------------------
sigma2.dln_inv     0.0020      0.000      7.345      0.000       0.001       0.003
sigma2.dln_inc     0.0001   2.32e-05      5.840      0.000    9.01e-05       0.000
==================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
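
One defining property of a VMA(q) process is that its autocovariances are exactly zero beyond lag q. The sketch below checks this for the VMA(2) point estimates printed above, using the standard formula $\Gamma(h) = \sum_j M_{j+h} \Sigma M_j'$ with $M_0 = I$. The helper `vma_autocov` is hypothetical and written only for this illustration.

```python
import numpy as np

# Point estimates copied from the VMA(2) summary (rows: dln_inv, dln_inc equations)
M1 = np.array([[-0.2576, 0.5044],
               [0.0477, -0.0709]])
M2 = np.array([[0.0286, 0.1951],
               [0.0181, 0.1199]])
# Diagonal innovation covariance from sigma2.dln_inv and sigma2.dln_inc
sigma = np.diag([0.0020, 0.0001])

def vma_autocov(ma_coefs, sigma, lag):
    """Theoretical autocovariance Cov(y_t, y_{t-lag}) of a VMA(q) process."""
    k = sigma.shape[0]
    theta = [np.eye(k)] + list(ma_coefs)  # M_0 = I, M_1, ..., M_q
    gamma = np.zeros((k, k))
    # The sum is empty, and gamma stays zero, whenever lag > q
    for j in range(len(theta) - lag):
        gamma += theta[j + lag] @ sigma @ theta[j].T
    return gamma

for lag in range(4):
    print(lag, vma_autocov([M1, M2], sigma, lag).round(6).tolist())
```

The lag-3 matrix is all zeros, which reflects the finite memory of the moving average specification.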

Caution: VARMA(p,q) specifications

Although the model allows estimating VARMA(p,q) specifications, these models are not identified without additional restrictions on the representation matrices, which are not built-in. For this reason, it is recommended that the user proceed with caution (and indeed a warning is issued when these models are specified). Nonetheless, they may in some circumstances provide useful information.

In [7]:
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(1,1))
res = mod.fit(maxiter=1000, disp=False)
print(res.summary())
/home/travis/build/statsmodels/statsmodels/statsmodels/tsa/statespace/varmax.py:159: EstimationWarning: Estimation of VARMA(p,q) models is not generically robust, due especially to identification issues.
  EstimationWarning)
/home/travis/build/statsmodels/statsmodels/statsmodels/tsa/base/tsa_model.py:165: ValueWarning: No frequency information was provided, so inferred frequency QS-OCT will be used.
  % freq, ValueWarning)
                             Statespace Model Results                             
==================================================================================
Dep. Variable:     ['dln_inv', 'dln_inc']   No. Observations:                   75
Model:                         VARMA(1,1)   Log Likelihood                 354.283
                              + intercept   AIC                           -682.567
Date:                    Fri, 19 Jul 2019   BIC                           -652.440
Time:                            16:51:04   HQIC                          -670.537
Sample:                        04-01-1960                                         
                             - 10-01-1978                                         
Covariance Type:                      opg                                         
===================================================================================
Ljung-Box (Q):                68.77, 39.05   Jarque-Bera (JB):         10.77, 14.12
Prob(Q):                        0.00, 0.51   Prob(JB):                   0.00, 0.00
Heteroskedasticity (H):         0.43, 0.91   Skew:                      0.00, -0.46
Prob(H) (two-sided):            0.04, 0.81   Kurtosis:                   4.86, 4.92
                           Results for equation dln_inv                          
=================================================================================
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
intercept         0.0110      0.068      0.161      0.872      -0.122       0.144
L1.dln_inv       -0.0097      0.718     -0.013      0.989      -1.417       1.397
L1.dln_inc        0.3620      2.847      0.127      0.899      -5.218       5.942
L1.e(dln_inv)    -0.2502      0.729     -0.343      0.732      -1.680       1.180
L1.e(dln_inc)     0.1262      3.089      0.041      0.967      -5.929       6.181
                           Results for equation dln_inc                          
=================================================================================
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
intercept         0.0166      0.029      0.580      0.562      -0.039       0.072
L1.dln_inv       -0.0332      0.286     -0.116      0.908      -0.593       0.527
L1.dln_inc        0.2311      1.160      0.199      0.842      -2.042       2.505
L1.e(dln_inv)     0.0884      0.292      0.303      0.762      -0.484       0.661
L1.e(dln_inc)    -0.2352      1.192     -0.197      0.844      -2.572       2.102
                                  Error covariance matrix                                   
============================================================================================
                               coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------
sqrt.var.dln_inv             0.0449      0.003     14.535      0.000       0.039       0.051
sqrt.cov.dln_inv.dln_inc     0.0017      0.003      0.645      0.519      -0.003       0.007
sqrt.var.dln_inc             0.0116      0.001     11.667      0.000       0.010       0.013
============================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
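
The identification problem can be illustrated with a small textbook-style example that is independent of the fitted model above. For a VARMA(1,1) process $y_t = A y_{t-1} + \epsilon_t + M \epsilon_{t-1}$, two different pairs $(A, M)$ can imply exactly the same moving average representation, so the data cannot distinguish between them. The matrices below are chosen purely for illustration, and `ma_infinity_weights` and `upper` are hypothetical helpers.

```python
import numpy as np

def ma_infinity_weights(A, M, steps):
    """Psi weights of y_t = A y_{t-1} + e_t + M e_{t-1}, for horizons 0..steps."""
    k = A.shape[0]
    psi = [np.eye(k), A + M]        # Psi_0 = I, Psi_1 = A + M
    for _ in range(2, steps + 1):
        psi.append(A @ psi[-1])     # Psi_h = A Psi_{h-1} for h >= 2
    return np.array(psi[: steps + 1])

def upper(value):
    """2x2 matrix with a single nonzero entry in the upper-right corner."""
    return np.array([[0.0, value], [0.0, 0.0]])

# Two different parameterizations: (A, M) = (0.5, 0.0) versus (0.2, 0.3)
psi_a = ma_infinity_weights(upper(0.5), upper(0.0), steps=5)
psi_b = ma_infinity_weights(upper(0.2), upper(0.3), steps=5)

print(np.allclose(psi_a, psi_b))  # prints True
```

Because only the sum of the two upper-right entries matters here, the likelihood is flat along that direction. This is consistent with the very large standard errors reported in the VARMA(1,1) summary above.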