statsmodels.nonparametric.kernel_density.KDEMultivariateConditional

class statsmodels.nonparametric.kernel_density.KDEMultivariateConditional(endog, exog, dep_type, indep_type, bw, defaults=None)

Conditional multivariate kernel density estimator.

Calculates P(Y_1, Y_2, ..., Y_n | X_1, X_2, ..., X_m) = P(X_1, X_2, ..., X_m, Y_1, Y_2, ..., Y_n) / P(X_1, X_2, ..., X_m). The conditional density is by definition the ratio of the joint density to the marginal density of the conditioning variables; see [R354a09f7866f-1].
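The ratio definition above can be illustrated with a small NumPy-only sketch. This is not the library's implementation: the helper names (`kernel_sum`, `conditional_density`), the Gaussian product kernel, and the fixed bandwidth of 0.4 are assumptions chosen for illustration. The final check relies on the fact that a conditional density integrates to one over the dependent variable.

```python
import numpy as np


def kernel_sum(train, point, bw):
    """Sum over observations of a product of Gaussian kernels (one per variable).

    Hypothetical helper for illustration; not part of statsmodels.
    """
    u = (train - point) / bw                       # shape (nobs, k_vars)
    k = np.exp(-0.5 * u ** 2) / (np.sqrt(2.0 * np.pi) * bw)
    return k.prod(axis=1).sum()


def conditional_density(y_train, x_train, y0, x0, bw_y, bw_x):
    """Estimate f(y0 | x0) as joint kernel sum / marginal kernel sum."""
    joint = kernel_sum(np.column_stack((y_train, x_train)),
                       np.concatenate((y0, x0)),
                       np.concatenate((bw_y, bw_x)))
    marginal = kernel_sum(x_train, x0, bw_x)
    return joint / marginal                        # the 1/nobs factors cancel


rng = np.random.default_rng(0)
nobs = 300
x_train = rng.normal(2.0, 1.0, size=(nobs, 1))
y_train = 0.5 * x_train + rng.normal(0.0, 1.0, size=(nobs, 1))

bw_y = np.array([0.4])   # assumed fixed bandwidths, for illustration only
bw_x = np.array([0.4])
x0 = np.array([2.0])

# Evaluate f(y | x = 2) on a grid and integrate numerically over y.
grid = np.linspace(-10.0, 10.0, 2001)
dy = grid[1] - grid[0]
dens = np.array([conditional_density(y_train, x_train, np.array([y]), x0,
                                     bw_y, bw_x) for y in grid])
total = dens.sum() * dy
print(total)
```

The printed value should be very close to 1, since integrating the joint kernel estimate over y leaves exactly the marginal estimate in the numerator.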

Parameters
endog : list of ndarrays or 2-D ndarray

The training data for the dependent variables, used to determine the bandwidth(s). If a 2-D array, should be of shape (num_observations, num_variables). If a list, each list element is a separate observation.

exog : list of ndarrays or 2-D ndarray

The training data for the independent variable; same shape as endog.

dep_type : str

The type of the dependent variables:

  • c : Continuous

  • u : Unordered (Discrete)

  • o : Ordered (Discrete)

The string should contain a type specifier for each variable, so for example dep_type='ccuo'.

indep_type : str

The type of the independent variables; specified like dep_type.

bw : array_like or str, optional

If an array, it is a fixed user-specified bandwidth. If a string, should be one of:

  • normal_reference: normal reference rule of thumb (default)

  • cv_ml: cross validation maximum likelihood

  • cv_ls: cross validation least squares
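The two ways of supplying `bw` described above can be sketched as follows. The synthetic data, the seed, and the fixed bandwidth values are assumptions made for illustration; the example also assumes the bandwidth array is ordered with the dependent variables first and the independent variables after them, which matches the two-element `bw` shown in the Examples section.

```python
import numpy as np
from statsmodels.nonparametric.kernel_density import KDEMultivariateConditional

rng = np.random.default_rng(0)
nobs = 200
x = rng.normal(size=(nobs, 1))                      # one continuous regressor
y = 0.5 * x + rng.normal(scale=0.5, size=(nobs, 1))  # one continuous response

# String option: bandwidths are computed from the data by the rule of thumb.
kde_rot = KDEMultivariateConditional(endog=y, exog=x, dep_type='c',
                                     indep_type='c', bw='normal_reference')

# Array option: user-specified bandwidths, one per variable (values assumed).
fixed_bw = np.array([0.3, 0.4])
kde_fixed = KDEMultivariateConditional(endog=y, exog=x, dep_type='c',
                                       indep_type='c', bw=fixed_bw)

print(kde_rot.bw)     # data-driven, one entry per variable
print(kde_fixed.bw)   # the array that was passed in
```

The cross-validation options (`'cv_ml'`, `'cv_ls'`) are passed the same way but involve a numerical optimization, so they can be noticeably slower on larger samples.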

defaults : Instance of class EstimatorSettings

The default values for the efficient bandwidth estimation.

See also

KDEMultivariate

References

R354a09f7866f-1

https://en.wikipedia.org/wiki/Conditional_probability_distribution

Examples

>>> import numpy as np
>>> import statsmodels.api as sm
>>> nobs = 300
>>> c1 = np.random.normal(size=(nobs,1))
>>> c2 = np.random.normal(2,1,size=(nobs,1))
>>> dens_c = sm.nonparametric.KDEMultivariateConditional(endog=[c1],
...     exog=[c2], dep_type='c', indep_type='c', bw='normal_reference')
>>> dens_c.bw   # show computed bandwidth (values vary with the random draw)
array([ 0.41223484,  0.40976931])

Attributes
bw : array_like

The bandwidth parameters.

Methods

cdf([endog_predict, exog_predict])

Cumulative distribution function for the conditional density.

imse(bw)

The integrated mean square error for the conditional KDE.

loo_likelihood(bw[, func])

Returns the leave-one-out conditional likelihood of the data.

pdf([endog_predict, exog_predict])

Evaluate the probability density function.
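A minimal usage sketch for `pdf` and `cdf`, using the keyword names from the method signatures listed above. The synthetic data, the seed, and the evaluation points are assumptions made for illustration; each row of `endog_predict` is paired with the corresponding row of `exog_predict`.

```python
import numpy as np
from statsmodels.nonparametric.kernel_density import KDEMultivariateConditional

rng = np.random.default_rng(42)
nobs = 250
x = rng.normal(loc=2.0, scale=1.0, size=(nobs, 1))
y = 0.8 * x + rng.normal(scale=0.5, size=(nobs, 1))

kde = KDEMultivariateConditional(endog=y, exog=x, dep_type='c',
                                 indep_type='c', bw='normal_reference')

# Three increasing y values, all conditioned on x = 2.0.
y_new = np.array([[0.6], [1.6], [2.6]])
x_new = np.full((3, 1), 2.0)

pdf_vals = kde.pdf(endog_predict=y_new, exog_predict=x_new)  # f(y | x)
cdf_vals = kde.cdf(endog_predict=y_new, exog_predict=x_new)  # F(y | x)

print(pdf_vals)
print(cdf_vals)
```

Because the conditioning value is held fixed while y increases, the conditional CDF values should increase along the three points and stay within [0, 1].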