M-Estimators for Robust Linear Modeling
=========================================


.. _robust_models_1_notebook:

`Link to Notebook GitHub <https://github.com/statsmodels/statsmodels/blob/master/examples/notebooks/robust_models_1.ipynb>`_

.. raw:: html

   
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="kn">from</span> <span class="nn">__future__</span> <span class="k">import</span> <span class="n">print_function</span>
   <span class="kn">from</span> <span class="nn">statsmodels.compat</span> <span class="k">import</span> <span class="n">lmap</span>
   <span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
   <span class="kn">from</span> <span class="nn">scipy</span> <span class="k">import</span> <span class="n">stats</span>
   <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
   
   <span class="kn">import</span> <span class="nn">statsmodels.api</span> <span class="k">as</span> <span class="nn">sm</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>An M-estimator minimizes the function </li>
   </ul>
   $$Q(e_i, \rho) = \sum_i~\rho \left (\frac{e_i}{s}\right )$$<p>where $\rho$ is a symmetric function of the residuals</p>
   <ul>
   <li>The effect of $\rho$ is to reduce the influence of outliers</li>
   <li>$s$ is an estimate of scale. </li>
   <li>The robust estimates $\hat{\beta}$ are computed by the iteratively re-weighted least squares algorithm</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>We have several choices available for the weighting functions to be used</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">norms</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">robust</span><span class="o">.</span><span class="n">norms</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="k">def</span> <span class="nf">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">weights_func</span><span class="p">,</span> <span class="n">xlabels</span><span class="p">,</span> <span class="n">xticks</span><span class="p">):</span>
       <span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span><span class="mi">8</span><span class="p">))</span>
       <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">111</span><span class="p">)</span>
       <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">weights_func</span><span class="p">(</span><span class="n">support</span><span class="p">))</span>
       <span class="n">ax</span><span class="o">.</span><span class="n">set_xticks</span><span class="p">(</span><span class="n">xticks</span><span class="p">)</span>
       <span class="n">ax</span><span class="o">.</span><span class="n">set_xticklabels</span><span class="p">(</span><span class="n">xlabels</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">16</span><span class="p">)</span>
       <span class="n">ax</span><span class="o">.</span><span class="n">set_ylim</span><span class="p">(</span><span class="o">-.</span><span class="mi">1</span><span class="p">,</span> <span class="mf">1.1</span><span class="p">)</span>
       <span class="k">return</span> <span class="n">ax</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Andrew's-Wave">Andrew's Wave<a class="anchor-link" href="#Andrew's-Wave">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">help</span><span class="p">(</span><span class="n">norms</span><span class="o">.</span><span class="n">AndrewWave</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">a</span> <span class="o">=</span> <span class="mf">1.339</span>
   <span class="n">support</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
   <span class="n">andrew</span> <span class="o">=</span> <span class="n">norms</span><span class="o">.</span><span class="n">AndrewWave</span><span class="p">(</span><span class="n">a</span><span class="o">=</span><span class="n">a</span><span class="p">)</span>
   <span class="n">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">andrew</span><span class="o">.</span><span class="n">weights</span><span class="p">,</span> <span class="p">[</span><span class="s">&#39;$-\pi*a$&#39;</span><span class="p">,</span> <span class="s">&#39;0&#39;</span><span class="p">,</span> <span class="s">&#39;$\pi*a$&#39;</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="o">*</span><span class="n">a</span><span class="p">]);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Help on function weights in module statsmodels.robust.norms:
   
   weights(self, z)
       Andrew&apos;s wave weighting function for the IRLS algorithm
       
       The psi function scaled by z
       
       Parameters
       ----------
       z : array-like
           1d array
       
       Returns
       -------
       weights : array
           weights(z) = sin(z/a)/(z/a)     for \|z\| &lt;= a*pi
       
           weights(z) = 0                  for \|z\| &gt; a*pi
   
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Hampel's-17A">Hampel's 17A<a class="anchor-link" href="#Hampel's-17A">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">help</span><span class="p">(</span><span class="n">norms</span><span class="o">.</span><span class="n">Hampel</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">c</span> <span class="o">=</span> <span class="mi">8</span>
   <span class="n">support</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
   <span class="n">hampel</span> <span class="o">=</span> <span class="n">norms</span><span class="o">.</span><span class="n">Hampel</span><span class="p">(</span><span class="n">a</span><span class="o">=</span><span class="mf">2.</span><span class="p">,</span> <span class="n">b</span><span class="o">=</span><span class="mf">4.</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
   <span class="n">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">hampel</span><span class="o">.</span><span class="n">weights</span><span class="p">,</span> <span class="p">[</span><span class="s">&#39;3*c&#39;</span><span class="p">,</span> <span class="s">&#39;0&#39;</span><span class="p">,</span> <span class="s">&#39;3*c&#39;</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">]);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Help on function weights in module statsmodels.robust.norms:
   
   weights(self, z)
       Hampel weighting function for the IRLS algorithm
       
       The psi function scaled by z
       
       Parameters
       ----------
       z : array-like
           1d array
       
       Returns
       -------
       weights : array
           weights(z) = 1                            for \|z\| &lt;= a
       
           weights(z) = a/\|z\|                        for a &lt; \|z\| &lt;= b
       
           weights(z) = a*(c - \|z\|)/(\|z\|*(c-b))      for b &lt; \|z\| &lt;= c
       
           weights(z) = 0                            for \|z\| &gt; c
   
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Huber's-t">Huber's t<a class="anchor-link" href="#Huber's-t">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">help</span><span class="p">(</span><span class="n">norms</span><span class="o">.</span><span class="n">HuberT</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">t</span> <span class="o">=</span> <span class="mf">1.345</span>
   <span class="n">support</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">t</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">t</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
   <span class="n">huber</span> <span class="o">=</span> <span class="n">norms</span><span class="o">.</span><span class="n">HuberT</span><span class="p">(</span><span class="n">t</span><span class="o">=</span><span class="n">t</span><span class="p">)</span>
   <span class="n">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">huber</span><span class="o">.</span><span class="n">weights</span><span class="p">,</span> <span class="p">[</span><span class="s">&#39;-3*t&#39;</span><span class="p">,</span> <span class="s">&#39;0&#39;</span><span class="p">,</span> <span class="s">&#39;3*t&#39;</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">t</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">t</span><span class="p">]);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Help on function weights in module statsmodels.robust.norms:
   
   weights(self, z)
       Huber&apos;s t weighting function for the IRLS algorithm
       
       The psi function scaled by z
       
       Parameters
       ----------
       z : array-like
           1d array
       
       Returns
       -------
       weights : array
           weights(z) = 1          for \|z\| &lt;= t
       
           weights(z) = t/\|z\|      for \|z\| &gt; t
   
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Least-Squares">Least Squares<a class="anchor-link" href="#Least-Squares">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">help</span><span class="p">(</span><span class="n">norms</span><span class="o">.</span><span class="n">LeastSquares</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">support</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
   <span class="n">lst_sq</span> <span class="o">=</span> <span class="n">norms</span><span class="o">.</span><span class="n">LeastSquares</span><span class="p">()</span>
   <span class="n">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">lst_sq</span><span class="o">.</span><span class="n">weights</span><span class="p">,</span> <span class="p">[</span><span class="s">&#39;-3&#39;</span><span class="p">,</span> <span class="s">&#39;0&#39;</span><span class="p">,</span> <span class="s">&#39;3&#39;</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">]);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Help on function weights in module statsmodels.robust.norms:
   
   weights(self, z)
       The least squares estimator weighting function for the IRLS algorithm.
       
       The psi function scaled by the input z
       
       Parameters
       ----------
       z : array-like
           1d array
       
       Returns
       -------
       weights : array
           weights(z) = np.ones(z.shape)
   
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Ramsay's-Ea">Ramsay's Ea<a class="anchor-link" href="#Ramsay's-Ea">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">help</span><span class="p">(</span><span class="n">norms</span><span class="o">.</span><span class="n">RamsayE</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">a</span> <span class="o">=</span> <span class="o">.</span><span class="mi">3</span>
   <span class="n">support</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
   <span class="n">ramsay</span> <span class="o">=</span> <span class="n">norms</span><span class="o">.</span><span class="n">RamsayE</span><span class="p">(</span><span class="n">a</span><span class="o">=</span><span class="n">a</span><span class="p">)</span>
   <span class="n">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">ramsay</span><span class="o">.</span><span class="n">weights</span><span class="p">,</span> <span class="p">[</span><span class="s">&#39;-3*a&#39;</span><span class="p">,</span> <span class="s">&#39;0&#39;</span><span class="p">,</span> <span class="s">&#39;3*a&#39;</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">a</span><span class="p">]);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Help on function weights in module statsmodels.robust.norms:
   
   weights(self, z)
       Ramsay&apos;s Ea weighting function for the IRLS algorithm
       
       The psi function scaled by z
       
       Parameters
       ----------
       z : array-like
           1d array
       
       Returns
       -------
       weights : array
           weights(z) = exp(-a*\|z\|)
   
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Trimmed-Mean">Trimmed Mean<a class="anchor-link" href="#Trimmed-Mean">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">help</span><span class="p">(</span><span class="n">norms</span><span class="o">.</span><span class="n">TrimmedMean</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">c</span> <span class="o">=</span> <span class="mi">2</span>
   <span class="n">support</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
   <span class="n">trimmed</span> <span class="o">=</span> <span class="n">norms</span><span class="o">.</span><span class="n">TrimmedMean</span><span class="p">(</span><span class="n">c</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
   <span class="n">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">trimmed</span><span class="o">.</span><span class="n">weights</span><span class="p">,</span> <span class="p">[</span><span class="s">&#39;-3*c&#39;</span><span class="p">,</span> <span class="s">&#39;0&#39;</span><span class="p">,</span> <span class="s">&#39;3*c&#39;</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">]);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Help on function weights in module statsmodels.robust.norms:
   
   weights(self, z)
       Least trimmed mean weighting function for the IRLS algorithm
       
       The psi function scaled by z
       
       Parameters
       ----------
       z : array-like
           1d array
       
       Returns
       -------
       weights : array
           weights(z) = 1             for \|z\| &lt;= c
       
           weights(z) = 0             for \|z\| &gt; c
   
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Tukey's-Biweight">Tukey's Biweight<a class="anchor-link" href="#Tukey's-Biweight">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">help</span><span class="p">(</span><span class="n">norms</span><span class="o">.</span><span class="n">TukeyBiweight</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">c</span> <span class="o">=</span> <span class="mf">4.685</span>
   <span class="n">support</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
   <span class="n">tukey</span> <span class="o">=</span> <span class="n">norms</span><span class="o">.</span><span class="n">TukeyBiweight</span><span class="p">(</span><span class="n">c</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
   <span class="n">plot_weights</span><span class="p">(</span><span class="n">support</span><span class="p">,</span> <span class="n">tukey</span><span class="o">.</span><span class="n">weights</span><span class="p">,</span> <span class="p">[</span><span class="s">&#39;-3*c&#39;</span><span class="p">,</span> <span class="s">&#39;0&#39;</span><span class="p">,</span> <span class="s">&#39;3*c&#39;</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">c</span><span class="p">]);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Help on function weights in module statsmodels.robust.norms:
   
   weights(self, z)
       Tukey&apos;s biweight weighting function for the IRLS algorithm
       
       The psi function scaled by z
       
       Parameters
       ----------
       z : array-like
           1d array
       
       Returns
       -------
       weights : array
           psi(z) = (1 - (z/c)**2)**2          for \|z\| &lt;= R
       
           psi(z) = 0                          for \|z\| &gt; R
   
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Scale-Estimators">Scale Estimators<a class="anchor-link" href="#Scale-Estimators">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>Robust estimates of the location</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">500</span><span class="p">])</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>The mean is not a robust estimator of location</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">x</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>The median, on the other hand, is a robust estimator with a breakdown point of 50%</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">np</span><span class="o">.</span><span class="n">median</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>Analagously for the scale</li>
   <li>The standard deviation is not robust</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">x</span><span class="o">.</span><span class="n">std</span><span class="p">()</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Median Absolute Deviation</p>
   $$ median_i |X_i - median_j(X_j)|) $$
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <p>Standardized Median Absolute Deviation is a consistent estimator for $\hat{\sigma}$</p>
   $$\hat{\sigma}=K \cdot MAD$$<p>where $K$ depends on the distribution. For the normal distribution for example,</p>
   $$K = \Phi^{-1}(.75)$$
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">ppf</span><span class="p">(</span><span class="o">.</span><span class="mi">75</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">sm</span><span class="o">.</span><span class="n">robust</span><span class="o">.</span><span class="n">scale</span><span class="o">.</span><span class="n">stand_mad</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>[  1   2   3   4 500]
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mf">5.</span><span class="p">])</span><span class="o">.</span><span class="n">std</span><span class="p">()</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stderr output_text">
   <pre>/Users/tom.augspurger/Envs/py3/lib/python3.4/site-packages/statsmodels-0.6.1-py3.4-macosx-10.10-x86_64.egg/statsmodels/robust/scale.py:49: FutureWarning: stand_mad is deprecated and will be removed in 0.7.0. Use mad instead.
     &quot;instead.&quot;, FutureWarning)
   </pre>
   </div>
   </div>
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>The default for Robust Linear Models is MAD</li>
   <li>another popular choice is Huber's proposal 2</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">12345</span><span class="p">)</span>
   <span class="n">fat_tails</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">t</span><span class="p">(</span><span class="mi">6</span><span class="p">)</span><span class="o">.</span><span class="n">rvs</span><span class="p">(</span><span class="mi">40</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">kde</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">nonparametric</span><span class="o">.</span><span class="n">KDE</span><span class="p">(</span><span class="n">fat_tails</span><span class="p">)</span>
   <span class="n">kde</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span><span class="mi">8</span><span class="p">))</span>
   <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">111</span><span class="p">)</span>
   <span class="n">ax</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">kde</span><span class="o">.</span><span class="n">support</span><span class="p">,</span> <span class="n">kde</span><span class="o">.</span><span class="n">density</span><span class="p">);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">fat_tails</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">fat_tails</span><span class="o">.</span><span class="n">std</span><span class="p">())</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">fat_tails</span><span class="p">))</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>(0.068823104481087499, 1.3471633229698652)
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">stats</span><span class="o">.</span><span class="n">t</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">fat_tails</span><span class="p">,</span> <span class="n">f0</span><span class="o">=</span><span class="mi">6</span><span class="p">))</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>(6, 0.039009187170278181, 1.0564230978488927)
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">huber</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">robust</span><span class="o">.</span><span class="n">scale</span><span class="o">.</span><span class="n">Huber</span><span class="p">()</span>
   <span class="n">loc</span><span class="p">,</span> <span class="n">scale</span> <span class="o">=</span> <span class="n">huber</span><span class="p">(</span><span class="n">fat_tails</span><span class="p">)</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">loc</span><span class="p">,</span> <span class="n">scale</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>0.04048984333271795 1.1557140047569665
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">sm</span><span class="o">.</span><span class="n">robust</span><span class="o">.</span><span class="n">stand_mad</span><span class="p">(</span><span class="n">fat_tails</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stderr output_text">
   <pre>/Users/tom.augspurger/Envs/py3/lib/python3.4/site-packages/statsmodels-0.6.1-py3.4-macosx-10.10-x86_64.egg/statsmodels/robust/scale.py:49: FutureWarning: stand_mad is deprecated and will be removed in 0.7.0. Use mad instead.
     &quot;instead.&quot;, FutureWarning)
   </pre>
   </div>
   </div>
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">sm</span><span class="o">.</span><span class="n">robust</span><span class="o">.</span><span class="n">stand_mad</span><span class="p">(</span><span class="n">fat_tails</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="n">stats</span><span class="o">.</span><span class="n">t</span><span class="p">(</span><span class="mi">6</span><span class="p">)</span><span class="o">.</span><span class="n">ppf</span><span class="p">(</span><span class="o">.</span><span class="mi">75</span><span class="p">))</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stderr output_text">
   <pre>/Users/tom.augspurger/Envs/py3/lib/python3.4/site-packages/statsmodels-0.6.1-py3.4-macosx-10.10-x86_64.egg/statsmodels/robust/scale.py:49: FutureWarning: stand_mad is deprecated and will be removed in 0.7.0. Use mad instead.
     &quot;instead.&quot;, FutureWarning)
   </pre>
   </div>
   </div>
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">sm</span><span class="o">.</span><span class="n">robust</span><span class="o">.</span><span class="n">scale</span><span class="o">.</span><span class="n">mad</span><span class="p">(</span><span class="n">fat_tails</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Duncan's-Occupational-Prestige-data---M-estimation-for-outliers">Duncan's Occupational Prestige data - M-estimation for outliers<a class="anchor-link" href="#Duncan's-Occupational-Prestige-data---M-estimation-for-outliers">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="kn">from</span> <span class="nn">statsmodels.graphics.api</span> <span class="k">import</span> <span class="n">abline_plot</span>
   <span class="kn">from</span> <span class="nn">statsmodels.formula.api</span> <span class="k">import</span> <span class="n">ols</span><span class="p">,</span> <span class="n">rlm</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">prestige</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">get_rdataset</span><span class="p">(</span><span class="s">&quot;Duncan&quot;</span><span class="p">,</span> <span class="s">&quot;car&quot;</span><span class="p">,</span> <span class="n">cache</span><span class="o">=</span><span class="k">True</span><span class="p">)</span><span class="o">.</span><span class="n">data</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">prestige</span><span class="o">.</span><span class="n">head</span><span class="p">(</span><span class="mi">10</span><span class="p">))</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>            type  income  education  prestige
   accountant  prof      62         86        82
   pilot       prof      72         76        83
   architect   prof      75         92        90
   author      prof      55         90        76
   chemist     prof      64         86        90
   minister    prof      21         84        87
   professor   prof      64         93        93
   dentist     prof      80        100        90
   reporter      wc      67         87        52
   engineer    prof      72         86        88
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span><span class="mi">12</span><span class="p">))</span>
   <span class="n">ax1</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">211</span><span class="p">,</span> <span class="n">xlabel</span><span class="o">=</span><span class="s">&#39;Income&#39;</span><span class="p">,</span> <span class="n">ylabel</span><span class="o">=</span><span class="s">&#39;Prestige&#39;</span><span class="p">)</span>
   <span class="n">ax1</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">prestige</span><span class="o">.</span><span class="n">income</span><span class="p">,</span> <span class="n">prestige</span><span class="o">.</span><span class="n">prestige</span><span class="p">)</span>
   <span class="n">xy_outlier</span> <span class="o">=</span> <span class="n">prestige</span><span class="o">.</span><span class="n">ix</span><span class="p">[</span><span class="s">&#39;minister&#39;</span><span class="p">][[</span><span class="s">&#39;income&#39;</span><span class="p">,</span><span class="s">&#39;prestige&#39;</span><span class="p">]]</span>
   <span class="n">ax1</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span><span class="s">&#39;Minister&#39;</span><span class="p">,</span> <span class="n">xy_outlier</span><span class="p">,</span> <span class="n">xy_outlier</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">16</span><span class="p">)</span>
   <span class="n">ax2</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">212</span><span class="p">,</span> <span class="n">xlabel</span><span class="o">=</span><span class="s">&#39;Education&#39;</span><span class="p">,</span>
                              <span class="n">ylabel</span><span class="o">=</span><span class="s">&#39;Prestige&#39;</span><span class="p">)</span>
   <span class="n">ax2</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">prestige</span><span class="o">.</span><span class="n">education</span><span class="p">,</span> <span class="n">prestige</span><span class="o">.</span><span class="n">prestige</span><span class="p">);</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">ols_model</span> <span class="o">=</span> <span class="n">ols</span><span class="p">(</span><span class="s">&#39;prestige ~ income + education&#39;</span><span class="p">,</span> <span class="n">prestige</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">ols_model</span><span class="o">.</span><span class="n">summary</span><span class="p">())</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>                            OLS Regression Results                            
   ==============================================================================
   Dep. Variable:               prestige   R-squared:                       0.828
   Model:                            OLS   Adj. R-squared:                  0.820
   Method:                 Least Squares   F-statistic:                     101.2
   Date:                Mon, 20 Jul 2015   Prob (F-statistic):           8.65e-17
   Time:                        17:44:43   Log-Likelihood:                -178.98
   No. Observations:                  45   AIC:                             364.0
   Df Residuals:                      42   BIC:                             369.4
   Df Model:                           2                                         
   Covariance Type:            nonrobust                                         
   ==============================================================================
                    coef    std err          t      P&gt;|t|      [95.0% Conf. Int.]
   ------------------------------------------------------------------------------
   Intercept     -6.0647      4.272     -1.420      0.163       -14.686     2.556
   income         0.5987      0.120      5.003      0.000         0.357     0.840
   education      0.5458      0.098      5.555      0.000         0.348     0.744
   ==============================================================================
   Omnibus:                        1.279   Durbin-Watson:                   1.458
   Prob(Omnibus):                  0.528   Jarque-Bera (JB):                0.520
   Skew:                           0.155   Prob(JB):                        0.771
   Kurtosis:                       3.426   Cond. No.                         163.
   ==============================================================================
   
   Warnings:
   [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">infl</span> <span class="o">=</span> <span class="n">ols_model</span><span class="o">.</span><span class="n">get_influence</span><span class="p">()</span>
   <span class="n">student</span> <span class="o">=</span> <span class="n">infl</span><span class="o">.</span><span class="n">summary_frame</span><span class="p">()[</span><span class="s">&#39;student_resid&#39;</span><span class="p">]</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">student</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>accountant            0.303900
   pilot                 0.340920
   architect             0.072256
   author                0.000711
   chemist               0.826578
   minister              3.134519
   professor             0.768277
   dentist              -0.498082
   reporter             -2.397022
   engineer              0.306225
   undertaker           -0.187339
   lawyer               -0.303082
   physician             0.355687
   welfare.worker       -0.411406
   teacher               0.050510
   conductor            -1.704032
   contractor            2.043805
   factory.owner         1.602429
   store.manager         0.142425
   banker                0.508388
   bookkeeper           -0.902388
   mail.carrier         -1.433249
   insurance.agent      -1.930919
   store.clerk          -1.760491
   carpenter             1.068858
   electrician           0.731949
   RR.engineer           0.808922
   machinist             1.887047
   auto.repairman        0.522735
   plumber              -0.377954
   gas.stn.attendant    -0.666596
   coal.miner            1.018527
   streetcar.motorman   -1.104485
   taxi.driver           0.023322
   truck.driver         -0.129227
   machine.operator      0.499922
   barber                0.173805
   bartender            -0.902422
   shoe.shiner          -0.429357
   cook                  0.127207
   soda.clerk           -0.883095
   watchman             -0.513502
   janitor              -0.079890
   policeman             0.078847
   waiter               -0.475972
   Name: student_resid, dtype: float64
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">student</span><span class="o">.</span><span class="n">ix</span><span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">student</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">2</span><span class="p">])</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>minister      3.134519
   reporter     -2.397022
   contractor    2.043805
   Name: student_resid, dtype: float64
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">infl</span><span class="o">.</span><span class="n">summary_frame</span><span class="p">()</span><span class="o">.</span><span class="n">ix</span><span class="p">[</span><span class="s">&#39;minister&#39;</span><span class="p">])</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>dfb_Intercept      0.144937
   dfb_income        -1.220939
   dfb_education      1.263019
   cooks_d            0.566380
   dffits             1.433935
   dffits_internal    1.303510
   hat_diag           0.173058
   standard_resid     2.849416
   student_resid      3.134519
   Name: minister, dtype: float64
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">sidak</span> <span class="o">=</span> <span class="n">ols_model</span><span class="o">.</span><span class="n">outlier_test</span><span class="p">(</span><span class="s">&#39;sidak&#39;</span><span class="p">)</span>
   <span class="n">sidak</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="s">&#39;unadj_p&#39;</span><span class="p">,</span> <span class="n">inplace</span><span class="o">=</span><span class="k">True</span><span class="p">)</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">sidak</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>                    student_resid   unadj_p  sidak(p)
   minister                 3.134519  0.003177  0.133421
   reporter                -2.397022  0.021170  0.618213
   contractor               2.043805  0.047433  0.887721
   insurance.agent         -1.930919  0.060428  0.939485
   machinist                1.887047  0.066248  0.954247
   store.clerk             -1.760491  0.085783  0.982331
   conductor               -1.704032  0.095944  0.989315
   factory.owner            1.602429  0.116738  0.996250
   mail.carrier            -1.433249  0.159369  0.999595
   streetcar.motorman      -1.104485  0.275823  1.000000
   carpenter                1.068858  0.291386  1.000000
   coal.miner               1.018527  0.314400  1.000000
   bartender               -0.902422  0.372104  1.000000
   bookkeeper              -0.902388  0.372122  1.000000
   soda.clerk              -0.883095  0.382334  1.000000
   chemist                  0.826578  0.413261  1.000000
   RR.engineer              0.808922  0.423229  1.000000
   professor                0.768277  0.446725  1.000000
   electrician              0.731949  0.468363  1.000000
   gas.stn.attendant       -0.666596  0.508764  1.000000
   auto.repairman           0.522735  0.603972  1.000000
   watchman                -0.513502  0.610357  1.000000
   banker                   0.508388  0.613906  1.000000
   machine.operator         0.499922  0.619802  1.000000
   dentist                 -0.498082  0.621088  1.000000
   waiter                  -0.475972  0.636621  1.000000
   shoe.shiner             -0.429357  0.669912  1.000000
   welfare.worker          -0.411406  0.682918  1.000000
   plumber                 -0.377954  0.707414  1.000000
   physician                0.355687  0.723898  1.000000
   pilot                    0.340920  0.734905  1.000000
   engineer                 0.306225  0.760983  1.000000
   accountant               0.303900  0.762741  1.000000
   lawyer                  -0.303082  0.763360  1.000000
   undertaker              -0.187339  0.852319  1.000000
   barber                   0.173805  0.862874  1.000000
   store.manager            0.142425  0.887442  1.000000
   truck.driver            -0.129227  0.897810  1.000000
   cook                     0.127207  0.899399  1.000000
   janitor                 -0.079890  0.936713  1.000000
   policeman                0.078847  0.937538  1.000000
   architect                0.072256  0.942750  1.000000
   teacher                  0.050510  0.959961  1.000000
   taxi.driver              0.023322  0.981507  1.000000
   author                   0.000711  0.999436  1.000000
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">fdr</span> <span class="o">=</span> <span class="n">ols_model</span><span class="o">.</span><span class="n">outlier_test</span><span class="p">(</span><span class="s">&#39;fdr_bh&#39;</span><span class="p">)</span>
   <span class="n">fdr</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="s">&#39;unadj_p&#39;</span><span class="p">,</span> <span class="n">inplace</span><span class="o">=</span><span class="k">True</span><span class="p">)</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">fdr</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>                    student_resid   unadj_p  fdr_bh(p)
   minister                 3.134519  0.003177   0.142974
   reporter                -2.397022  0.021170   0.476332
   contractor               2.043805  0.047433   0.596233
   insurance.agent         -1.930919  0.060428   0.596233
   machinist                1.887047  0.066248   0.596233
   store.clerk             -1.760491  0.085783   0.616782
   conductor               -1.704032  0.095944   0.616782
   factory.owner            1.602429  0.116738   0.656653
   mail.carrier            -1.433249  0.159369   0.796844
   streetcar.motorman      -1.104485  0.275823   0.999436
   carpenter                1.068858  0.291386   0.999436
   coal.miner               1.018527  0.314400   0.999436
   bartender               -0.902422  0.372104   0.999436
   bookkeeper              -0.902388  0.372122   0.999436
   soda.clerk              -0.883095  0.382334   0.999436
   chemist                  0.826578  0.413261   0.999436
   RR.engineer              0.808922  0.423229   0.999436
   professor                0.768277  0.446725   0.999436
   electrician              0.731949  0.468363   0.999436
   gas.stn.attendant       -0.666596  0.508764   0.999436
   auto.repairman           0.522735  0.603972   0.999436
   watchman                -0.513502  0.610357   0.999436
   banker                   0.508388  0.613906   0.999436
   machine.operator         0.499922  0.619802   0.999436
   dentist                 -0.498082  0.621088   0.999436
   waiter                  -0.475972  0.636621   0.999436
   shoe.shiner             -0.429357  0.669912   0.999436
   welfare.worker          -0.411406  0.682918   0.999436
   plumber                 -0.377954  0.707414   0.999436
   physician                0.355687  0.723898   0.999436
   pilot                    0.340920  0.734905   0.999436
   engineer                 0.306225  0.760983   0.999436
   accountant               0.303900  0.762741   0.999436
   lawyer                  -0.303082  0.763360   0.999436
   undertaker              -0.187339  0.852319   0.999436
   barber                   0.173805  0.862874   0.999436
   store.manager            0.142425  0.887442   0.999436
   truck.driver            -0.129227  0.897810   0.999436
   cook                     0.127207  0.899399   0.999436
   janitor                 -0.079890  0.936713   0.999436
   policeman                0.078847  0.937538   0.999436
   architect                0.072256  0.942750   0.999436
   teacher                  0.050510  0.959961   0.999436
   taxi.driver              0.023322  0.981507   0.999436
   author                   0.000711  0.999436   0.999436
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">rlm_model</span> <span class="o">=</span> <span class="n">rlm</span><span class="p">(</span><span class="s">&#39;prestige ~ income + education&#39;</span><span class="p">,</span> <span class="n">prestige</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">rlm_model</span><span class="o">.</span><span class="n">summary</span><span class="p">())</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>                    Robust linear Model Regression Results                    
   ==============================================================================
   Dep. Variable:               prestige   No. Observations:                   45
   Model:                            RLM   Df Residuals:                       42
   Method:                          IRLS   Df Model:                            2
   Norm:                          HuberT                                         
   Scale Est.:                       mad                                         
   Cov Type:                          H1                                         
   Date:                Mon, 20 Jul 2015                                         
   Time:                        17:44:43                                         
   No. Iterations:                    18                                         
   ==============================================================================
                    coef    std err          z      P&gt;|z|      [95.0% Conf. Int.]
   ------------------------------------------------------------------------------
   Intercept     -7.1107      3.879     -1.833      0.067       -14.713     0.492
   income         0.7015      0.109      6.456      0.000         0.489     0.914
   education      0.4854      0.089      5.441      0.000         0.311     0.660
   ==============================================================================
   
   If the model instance has been used for another fit with different fit
   parameters, then the fit options might not be the correct ones anymore .
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">rlm_model</span><span class="o">.</span><span class="n">weights</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>accountant            1.000000
   pilot                 1.000000
   architect             1.000000
   author                1.000000
   chemist               1.000000
   minister              0.344596
   professor             1.000000
   dentist               1.000000
   reporter              0.441669
   engineer              1.000000
   undertaker            1.000000
   lawyer                1.000000
   physician             1.000000
   welfare.worker        1.000000
   teacher               1.000000
   conductor             0.538445
   contractor            0.552262
   factory.owner         0.706169
   store.manager         1.000000
   banker                1.000000
   bookkeeper            1.000000
   mail.carrier          0.690764
   insurance.agent       0.533499
   store.clerk           0.618656
   carpenter             0.935848
   electrician           1.000000
   RR.engineer           1.000000
   machinist             0.570360
   auto.repairman        1.000000
   plumber               1.000000
   gas.stn.attendant     1.000000
   coal.miner            0.963821
   streetcar.motorman    0.832870
   taxi.driver           1.000000
   truck.driver          1.000000
   machine.operator      1.000000
   barber                1.000000
   bartender             1.000000
   shoe.shiner           1.000000
   cook                  1.000000
   soda.clerk            1.000000
   watchman              1.000000
   janitor               1.000000
   policeman             1.000000
   waiter                1.000000
   dtype: float64
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Hertzprung-Russell-data-for-Star-Cluster-CYG-0B1---Leverage-Points">Hertzprung Russell data for Star Cluster CYG 0B1 - Leverage Points<a class="anchor-link" href="#Hertzprung-Russell-data-for-Star-Cluster-CYG-0B1---Leverage-Points">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>Data is on the luminosity and temperature of 47 stars in the direction of Cygnus.</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">dta</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">get_rdataset</span><span class="p">(</span><span class="s">&quot;starsCYG&quot;</span><span class="p">,</span> <span class="s">&quot;robustbase&quot;</span><span class="p">,</span> <span class="n">cache</span><span class="o">=</span><span class="k">True</span><span class="p">)</span><span class="o">.</span><span class="n">data</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="kn">from</span> <span class="nn">matplotlib.patches</span> <span class="k">import</span> <span class="n">Ellipse</span>
   <span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span><span class="mi">8</span><span class="p">))</span>
   <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="o">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">111</span><span class="p">,</span> <span class="n">xlabel</span><span class="o">=</span><span class="s">&#39;log(Temp)&#39;</span><span class="p">,</span> <span class="n">ylabel</span><span class="o">=</span><span class="s">&#39;log(Light)&#39;</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s">&#39;Hertzsprung-Russell Diagram of Star Cluster CYG OB1&#39;</span><span class="p">)</span>
   <span class="n">ax</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="o">*</span><span class="n">dta</span><span class="o">.</span><span class="n">values</span><span class="o">.</span><span class="n">T</span><span class="p">)</span>
   <span class="c"># highlight outliers</span>
   <span class="n">e</span> <span class="o">=</span> <span class="n">Ellipse</span><span class="p">((</span><span class="mf">3.5</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span> <span class="o">.</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=.</span><span class="mi">25</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">&#39;r&#39;</span><span class="p">)</span>
   <span class="n">ax</span><span class="o">.</span><span class="n">add_patch</span><span class="p">(</span><span class="n">e</span><span class="p">);</span>
   <span class="n">ax</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span><span class="s">&#39;Red giants&#39;</span><span class="p">,</span> <span class="n">xy</span><span class="o">=</span><span class="p">(</span><span class="mf">3.6</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span> <span class="n">xytext</span><span class="o">=</span><span class="p">(</span><span class="mf">3.8</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span>
               <span class="n">arrowprops</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span><span class="n">facecolor</span><span class="o">=</span><span class="s">&#39;black&#39;</span><span class="p">,</span> <span class="n">shrink</span><span class="o">=</span><span class="mf">0.05</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">2</span><span class="p">),</span>
               <span class="n">horizontalalignment</span><span class="o">=</span><span class="s">&#39;left&#39;</span><span class="p">,</span> <span class="n">verticalalignment</span><span class="o">=</span><span class="s">&#39;bottom&#39;</span><span class="p">,</span>
               <span class="n">clip_on</span><span class="o">=</span><span class="k">True</span><span class="p">,</span> <span class="c"># clip to the axes bounding box</span>
               <span class="n">fontsize</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span>
        <span class="p">)</span>
   <span class="c"># annotate these with their index</span>
   <span class="k">for</span> <span class="n">i</span><span class="p">,</span><span class="n">row</span> <span class="ow">in</span> <span class="n">dta</span><span class="o">.</span><span class="n">ix</span><span class="p">[</span><span class="n">dta</span><span class="p">[</span><span class="s">&#39;log.Te&#39;</span><span class="p">]</span> <span class="o">&lt;</span> <span class="mf">3.8</span><span class="p">]</span><span class="o">.</span><span class="n">iterrows</span><span class="p">():</span>
       <span class="n">ax</span><span class="o">.</span><span class="n">annotate</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">row</span><span class="p">,</span> <span class="n">row</span> <span class="o">+</span> <span class="o">.</span><span class="mi">01</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">14</span><span class="p">)</span>
   <span class="n">xlim</span><span class="p">,</span> <span class="n">ylim</span> <span class="o">=</span> <span class="n">ax</span><span class="o">.</span><span class="n">get_xlim</span><span class="p">(),</span> <span class="n">ax</span><span class="o">.</span><span class="n">get_ylim</span><span class="p">()</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="k">import</span> <span class="n">Image</span>
   <span class="n">Image</span><span class="p">(</span><span class="n">filename</span><span class="o">=</span><span class="s">&#39;star_diagram.png&#39;</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">y</span> <span class="o">=</span> <span class="n">dta</span><span class="p">[</span><span class="s">&#39;log.light&#39;</span><span class="p">]</span>
   <span class="n">X</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">add_constant</span><span class="p">(</span><span class="n">dta</span><span class="p">[</span><span class="s">&#39;log.Te&#39;</span><span class="p">],</span> <span class="n">prepend</span><span class="o">=</span><span class="k">True</span><span class="p">)</span>
   <span class="n">ols_model</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">OLS</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">X</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="n">abline_plot</span><span class="p">(</span><span class="n">model_results</span><span class="o">=</span><span class="n">ols_model</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">rlm_mod</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">RLM</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">X</span><span class="p">,</span> <span class="n">sm</span><span class="o">.</span><span class="n">robust</span><span class="o">.</span><span class="n">norms</span><span class="o">.</span><span class="n">TrimmedMean</span><span class="p">(</span><span class="o">.</span><span class="mi">5</span><span class="p">))</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="n">abline_plot</span><span class="p">(</span><span class="n">model_results</span><span class="o">=</span><span class="n">rlm_mod</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">&#39;red&#39;</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>Why? Because M-estimators are not robust to leverage points.</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">infl</span> <span class="o">=</span> <span class="n">ols_model</span><span class="o">.</span><span class="n">get_influence</span><span class="p">()</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">h_bar</span> <span class="o">=</span> <span class="mi">2</span><span class="o">*</span><span class="p">(</span><span class="n">ols_model</span><span class="o">.</span><span class="n">df_model</span> <span class="o">+</span> <span class="mi">1</span> <span class="p">)</span><span class="o">/</span><span class="n">ols_model</span><span class="o">.</span><span class="n">nobs</span>
   <span class="n">hat_diag</span> <span class="o">=</span> <span class="n">infl</span><span class="o">.</span><span class="n">summary_frame</span><span class="p">()[</span><span class="s">&#39;hat_diag&#39;</span><span class="p">]</span>
   <span class="n">hat_diag</span><span class="o">.</span><span class="n">ix</span><span class="p">[</span><span class="n">hat_diag</span> <span class="o">&gt;</span> <span class="n">h_bar</span><span class="p">]</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">sidak2</span> <span class="o">=</span> <span class="n">ols_model</span><span class="o">.</span><span class="n">outlier_test</span><span class="p">(</span><span class="s">&#39;sidak&#39;</span><span class="p">)</span>
   <span class="n">sidak2</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="s">&#39;unadj_p&#39;</span><span class="p">,</span> <span class="n">inplace</span><span class="o">=</span><span class="k">True</span><span class="p">)</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">sidak2</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>    student_resid   unadj_p  sidak(p)
   16      -2.049393  0.046415  0.892872
   13      -2.035329  0.047868  0.900286
   33       1.905847  0.063216  0.953543
   18      -1.575505  0.122304  0.997826
   1        1.522185  0.135118  0.998911
   3        1.522185  0.135118  0.998911
   21      -1.450418  0.154034  0.999615
   17      -1.426675  0.160731  0.999735
   29       1.388520  0.171969  0.999859
   14      -1.374733  0.176175  0.999889
   35       1.346543  0.185023  0.999933
   34      -1.272159  0.209999  0.999985
   28      -1.186946  0.241618  0.999998
   20      -1.150621  0.256103  0.999999
   44       1.134779  0.262612  0.999999
   39       1.091886  0.280826  1.000000
   19       1.064878  0.292740  1.000000
   6       -1.026873  0.310093  1.000000
   30      -1.009096  0.318446  1.000000
   22      -0.979768  0.332557  1.000000
   8        0.961218  0.341695  1.000000
   5        0.913802  0.365801  1.000000
   11       0.871997  0.387943  1.000000
   12       0.856685  0.396261  1.000000
   46      -0.833923  0.408829  1.000000
   10       0.743920  0.460879  1.000000
   42       0.727179  0.470968  1.000000
   15      -0.689258  0.494280  1.000000
   43       0.688272  0.494895  1.000000
   7        0.655712  0.515424  1.000000
   40      -0.646396  0.521381  1.000000
   26      -0.640978  0.524862  1.000000
   25      -0.545561  0.588123  1.000000
   37       0.472819  0.638680  1.000000
   32       0.472819  0.638680  1.000000
   38       0.462187  0.646225  1.000000
   0        0.430686  0.668799  1.000000
   31       0.341726  0.734184  1.000000
   36       0.318911  0.751303  1.000000
   4        0.307890  0.759619  1.000000
   9        0.235114  0.815211  1.000000
   41       0.187732  0.851950  1.000000
   2       -0.182093  0.856346  1.000000
   23      -0.156014  0.876736  1.000000
   27      -0.147406  0.883485  1.000000
   24       0.065195  0.948314  1.000000
   45       0.045675  0.963776  1.000000
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">fdr2</span> <span class="o">=</span> <span class="n">ols_model</span><span class="o">.</span><span class="n">outlier_test</span><span class="p">(</span><span class="s">&#39;fdr_bh&#39;</span><span class="p">)</span>
   <span class="n">fdr2</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="s">&#39;unadj_p&#39;</span><span class="p">,</span> <span class="n">inplace</span><span class="o">=</span><span class="k">True</span><span class="p">)</span>
   <span class="nb">print</span><span class="p">(</span><span class="n">fdr2</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>    student_resid   unadj_p  fdr_bh(p)
   16      -2.049393  0.046415   0.764747
   13      -2.035329  0.047868   0.764747
   33       1.905847  0.063216   0.764747
   18      -1.575505  0.122304   0.764747
   1        1.522185  0.135118   0.764747
   3        1.522185  0.135118   0.764747
   21      -1.450418  0.154034   0.764747
   17      -1.426675  0.160731   0.764747
   29       1.388520  0.171969   0.764747
   14      -1.374733  0.176175   0.764747
   35       1.346543  0.185023   0.764747
   34      -1.272159  0.209999   0.764747
   28      -1.186946  0.241618   0.764747
   20      -1.150621  0.256103   0.764747
   44       1.134779  0.262612   0.764747
   39       1.091886  0.280826   0.764747
   19       1.064878  0.292740   0.764747
   6       -1.026873  0.310093   0.764747
   30      -1.009096  0.318446   0.764747
   22      -0.979768  0.332557   0.764747
   8        0.961218  0.341695   0.764747
   5        0.913802  0.365801   0.768599
   11       0.871997  0.387943   0.768599
   12       0.856685  0.396261   0.768599
   46      -0.833923  0.408829   0.768599
   10       0.743920  0.460879   0.770890
   42       0.727179  0.470968   0.770890
   15      -0.689258  0.494280   0.770890
   43       0.688272  0.494895   0.770890
   7        0.655712  0.515424   0.770890
   40      -0.646396  0.521381   0.770890
   26      -0.640978  0.524862   0.770890
   25      -0.545561  0.588123   0.837630
   37       0.472819  0.638680   0.843682
   32       0.472819  0.638680   0.843682
   38       0.462187  0.646225   0.843682
   0        0.430686  0.668799   0.849556
   31       0.341726  0.734184   0.892552
   36       0.318911  0.751303   0.892552
   4        0.307890  0.759619   0.892552
   9        0.235114  0.815211   0.922751
   41       0.187732  0.851950   0.922751
   2       -0.182093  0.856346   0.922751
   23      -0.156014  0.876736   0.922751
   27      -0.147406  0.883485   0.922751
   24       0.065195  0.948314   0.963776
   45       0.045675  0.963776   0.963776
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>Let's delete that line</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="k">del</span> <span class="n">ax</span><span class="o">.</span><span class="n">lines</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">weights</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">ones</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">X</span><span class="p">))</span>
   <span class="n">weights</span><span class="p">[</span><span class="n">X</span><span class="p">[</span><span class="n">X</span><span class="p">[</span><span class="s">&#39;log.Te&#39;</span><span class="p">]</span> <span class="o">&lt;</span> <span class="mf">3.8</span><span class="p">]</span><span class="o">.</span><span class="n">index</span><span class="o">.</span><span class="n">values</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
   <span class="n">wls_model</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">WLS</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">X</span><span class="p">,</span> <span class="n">weights</span><span class="o">=</span><span class="n">weights</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span>
   <span class="n">abline_plot</span><span class="p">(</span><span class="n">model_results</span><span class="o">=</span><span class="n">wls_model</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">&#39;green&#39;</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <ul>
   <li>MM estimators are good for this type of problem, unfortunately, we don't yet have these yet. </li>
   <li>It's being worked on, but it gives a good excuse to look at the R cell magics in the notebook.</li>
   </ul>
   
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">yy</span> <span class="o">=</span> <span class="n">y</span><span class="o">.</span><span class="n">values</span><span class="p">[:,</span><span class="k">None</span><span class="p">]</span>
   <span class="n">xx</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="s">&#39;log.Te&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">values</span><span class="p">[:,</span><span class="k">None</span><span class="p">]</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="o">%</span><span class="k">load_ext</span> rmagic
   
   <span class="o">%</span><span class="k">R</span> library(robustbase)
   <span class="o">%</span><span class="k">Rpush</span> yy xx
   <span class="o">%</span><span class="k">R</span> mod &lt;- lmrob(yy ~ xx);
   <span class="o">%</span><span class="k">R</span> params &lt;- mod$coefficients;
   <span class="o">%</span><span class="k">Rpull</span> params
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Error in library(robustbase) : there is no package called ‘robustbase’
   Error in withVisible({ : could not find function &quot;lmrob&quot;
   Error in withVisible({ : object &apos;mod&apos; not found
   </pre>
   </div>
   </div>
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stderr output_text">
   <pre>/Users/tom.augspurger/Envs/py3/lib/python3.4/site-packages/IPython/extensions/rmagic.py:693: UserWarning: The rmagic extension in IPython is deprecated in favour of rpy2.ipython. If available, that will be loaded instead.
   http://rpy.sourceforge.net/
     warnings.warn(&quot;The rmagic extension in IPython is deprecated in favour of &quot;
   </pre>
   </div>
   </div>
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="o">%</span><span class="k">R</span> print(mod)
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   <div class="output_subarea output_stream output_stdout output_text">
   <pre>Error in eval(expr, envir, enclos) : object &apos;params&apos; not found
   Error in print(mod) : object &apos;mod&apos; not found
   </pre>
   </div>
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="nb">print</span><span class="p">(</span><span class="n">params</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">abline_plot</span><span class="p">(</span><span class="n">intercept</span><span class="o">=</span><span class="n">params</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">slope</span><span class="o">=</span><span class="n">params</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">&#39;green&#39;</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h3 id="Exercise:-Breakdown-points-of-M-estimator">Exercise: Breakdown points of M-estimator<a class="anchor-link" href="#Exercise:-Breakdown-points-of-M-estimator">&#182;</a></h3>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">12345</span><span class="p">)</span>
   <span class="n">nobs</span> <span class="o">=</span> <span class="mi">200</span>
   <span class="n">beta_true</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="o">-</span><span class="mi">4</span><span class="p">])</span>
   <span class="n">X</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span><span class="mi">20</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="n">nobs</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">beta_true</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">))</span>
   <span class="c"># stack a constant in front</span>
   <span class="n">X</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">add_constant</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">prepend</span><span class="o">=</span><span class="k">True</span><span class="p">)</span> <span class="c"># np.c_[np.ones(nobs), X]</span>
   <span class="n">mc_iter</span> <span class="o">=</span> <span class="mi">500</span>
   <span class="n">contaminate</span> <span class="o">=</span> <span class="o">.</span><span class="mi">25</span> <span class="c"># percentage of response variables to contaminate</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">all_betas</span> <span class="o">=</span> <span class="p">[]</span>
   <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">mc_iter</span><span class="p">):</span>
       <span class="n">y</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">beta_true</span><span class="p">)</span> <span class="o">+</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
       <span class="n">random_idx</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">nobs</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">contaminate</span> <span class="o">*</span> <span class="n">nobs</span><span class="p">))</span>
       <span class="n">y</span><span class="p">[</span><span class="n">random_idx</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">750</span><span class="p">,</span> <span class="mi">750</span><span class="p">)</span>
       <span class="n">beta_hat</span> <span class="o">=</span> <span class="n">sm</span><span class="o">.</span><span class="n">RLM</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">X</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">()</span><span class="o">.</span><span class="n">params</span>
       <span class="n">all_betas</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">beta_hat</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">all_betas</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">asarray</span><span class="p">(</span><span class="n">all_betas</span><span class="p">)</span>
   <span class="n">se_loss</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">x</span> <span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="nb">ord</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span>
   <span class="n">se_beta</span> <span class="o">=</span> <span class="n">lmap</span><span class="p">(</span><span class="n">se_loss</span><span class="p">,</span> <span class="n">all_betas</span> <span class="o">-</span> <span class="n">beta_true</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing text_cell rendered">
   <div class="prompt input_prompt">
   </div>
   <div class="inner_cell">
   <div class="text_cell_render border-box-sizing rendered_html">
   <h4 id="Squared-error-loss">Squared error loss<a class="anchor-link" href="#Squared-error-loss">&#182;</a></h4>
   </div>
   </div>
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">se_beta</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">all_betas</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">beta_true</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>
   <div class="cell border-box-sizing code_cell rendered">
   <div class="input">
   <div class="prompt input_prompt">In&nbsp;[&nbsp;]:</div>
   <div class="inner_cell">
       <div class="input_area">
   <div class=" highlight hl-ipython3"><pre><span class="n">se_loss</span><span class="p">(</span><span class="n">all_betas</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">-</span> <span class="n">beta_true</span><span class="p">)</span>
   </pre></div>
   
   </div>
   </div>
   </div>
   
   <div class="output_wrapper">
   <div class="output">
   
   
   <div class="output_area"><div class="prompt"></div>
   
   </div>
   
   </div>
   </div>
   
   </div>

   <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS_HTML"type="text/javascript"></script>
   <script type="text/javascript">
   init_mathjax = function() {
       if (window.MathJax) {
           // MathJax loaded
           MathJax.Hub.Config({
               tex2jax: {
               // I'm not sure about the \( and \[ below. It messes with the
               // prompt, and I think it's an issue with the template. -SS
                   inlineMath: [ ['$','$'], ["\\(","\\)"] ],
                   displayMath: [ ['$$','$$'], ["\\[","\\]"] ]
               },
               displayAlign: 'left', // Change this to 'center' to center equations.
               "HTML-CSS": {
                   styles: {'.MathJax_Display': {"margin": 0}}
               }
           });
           MathJax.Hub.Queue(["Typeset",MathJax.Hub]);
       }
   }
   init_mathjax();

   // since we have to load this in a ..raw:: directive we will add the css
   // after the fact
   function loadcssfile(filename){
       var fileref=document.createElement("link")
       fileref.setAttribute("rel", "stylesheet")
       fileref.setAttribute("type", "text/css")
       fileref.setAttribute("href", filename)

       document.getElementsByTagName("head")[0].appendChild(fileref)
   }
   // loadcssfile({{pathto("_static/nbviewer.pygments.css", 1) }})
   // loadcssfile({{pathto("_static/nbviewer.min.css", 1) }})
   loadcssfile("../../../_static/nbviewer.pygments.css")
   loadcssfile("../../../_static/ipython.min.css")
   </script>