{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dates in timeseries models" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import statsmodels.api as sm\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting started" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [], "source": [ "data = sm.datasets.sunspots.load()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Right now an annual date series must be datetimes at the end of the year." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from datetime import datetime\n", "dates = sm.tsa.datetools.dates_from_range('1700', length=len(data.endog))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using Pandas\n", "\n", "Make a pandas TimeSeries or DataFrame" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "endog = pd.Series(data.endog, index=dates)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instantiate the model" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/travis/build/statsmodels/statsmodels/statsmodels/tsa/ar_model.py:691: FutureWarning: \n", "statsmodels.tsa.AR has been deprecated in favor of statsmodels.tsa.AutoReg and\n", "statsmodels.tsa.SARIMAX.\n", "\n", "AutoReg adds the ability to specify exogenous variables, include time trends,\n", "and add seasonal dummies. The AutoReg API differs from AR since the model is\n", "treated as immutable, and so the entire specification including the lag\n", "length must be specified when creating the model. This change is too\n", "substantial to incorporate into the existing AR api. The function\n", "ar_select_order performs lag length selection for AutoReg models.\n", "\n", "AutoReg only estimates parameters using conditional MLE (OLS). Use SARIMAX to\n", "estimate ARX and related models using full MLE via the Kalman Filter.\n", "\n", "To silence this warning and continue using AR until it is removed, use:\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore', 'statsmodels.tsa.ar_model.AR', FutureWarning)\n", "\n", " warnings.warn(AR_DEPRECATION_WARN, FutureWarning)\n" ] } ], "source": [ "ar_model = sm.tsa.AR(endog, freq='A')\n", "pandas_ar_res = ar_model.fit(maxlag=9, method='mle', disp=-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Out-of-sample prediction" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2005-12-31 20.003293\n", "2006-12-31 24.703980\n", "2007-12-31 20.026125\n", "2008-12-31 23.473639\n", "2009-12-31 30.858582\n", "2010-12-31 61.335455\n", "2011-12-31 87.024687\n", "2012-12-31 91.321234\n", "2013-12-31 79.921599\n", "2014-12-31 60.799502\n", "2015-12-31 40.374871\n", "Freq: A-DEC, dtype: float64\n" ] } ], "source": [ "pred = pandas_ar_res.predict(start='2005', end='2015')\n", "print(pred)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using explicit dates" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[20.00329308 24.70398036 20.02612492 23.47363929 30.85858227 61.33545516\n", " 87.02468683 91.3212336 79.92159926 60.79950228 40.37487059]\n" ] } ], "source": [ "ar_model = sm.tsa.AR(data.endog, dates=dates, freq='A')\n", "ar_res = ar_model.fit(maxlag=9, method='mle', disp=-1)\n", "pred = ar_res.predict(start='2005', end='2015')\n", "print(pred)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This just returns a regular array, but since the model has date information attached, you can get the prediction dates in a roundabout way." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DatetimeIndex(['2005-12-31', '2006-12-31', '2007-12-31', '2008-12-31',\n", " '2009-12-31', '2010-12-31', '2011-12-31', '2012-12-31',\n", " '2013-12-31', '2014-12-31', '2015-12-31'],\n", " dtype='datetime64[ns]', freq='A-DEC')\n" ] } ], "source": [ "print(ar_res.data.predict_dates)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: This attribute only exists if predict has been called. It holds the dates associated with the last call to predict." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 0 }