Python generalized linear model poisson

class sklearn.linear_model.PoissonRegressor(*, alpha=1.0, fit_intercept=True, max_iter=100, tol=0.0001, warm_start=False, verbose=0)

Generalized Linear Model with a Poisson distribution.

This regressor uses the ‘log’ link function.

Read more in the User Guide.

New in version 0.23.

Parameters:

alpha : float, default=1

Constant that multiplies the penalty term and thus determines the regularization strength. alpha = 0 is equivalent to unpenalized GLMs. In this case, the design matrix X must have full column rank (no collinearities). Values must be in the range [0.0, inf).

fit_intercept : bool, default=True

Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).

max_iter : int, default=100

The maximal number of iterations for the solver. Values must be in the range [1, inf).

tol : float, default=1e-4

Stopping criterion. For the lbfgs solver, the iteration will stop when max{|g_j|, j = 1, ..., d} <= tol where g_j is the j-th component of the gradient (derivative) of the objective function. Values must be in the range (0.0, inf).

warm_start : bool, default=False

If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.

verbose : int, default=0

For the lbfgs solver set verbose to any positive number for verbosity. Values must be in the range [0, inf).
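The interaction between alpha and warm_start can be sketched as follows. This is a minimal illustration on synthetic data (the data, seed, and alpha grid are assumptions, not part of the reference): the model is refit along a decreasing sequence of alpha values, each fit starting from the previous solution, and the coefficient norm is recorded to show the shrinkage effect of the penalty.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

# Synthetic count data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([0.5, -0.3, 0.2])
y = rng.poisson(np.exp(1.0 + X @ true_coef))

# warm_start=True makes each fit start from the previous coef_ and intercept_.
model = PoissonRegressor(warm_start=True, max_iter=300)

coef_norms = []
for alpha in [10.0, 1.0, 0.1, 0.01]:
    model.set_params(alpha=alpha).fit(X, y)
    coef_norms.append(float(np.linalg.norm(model.coef_)))
    print(f"alpha={alpha:<5} ||coef_||={coef_norms[-1]:.4f} n_iter_={model.n_iter_}")
```

On data like this, the coefficient norm should grow as alpha decreases, since a weaker penalty shrinks the coefficients less.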

Attributes:

coef_ : array of shape (n_features,)

Estimated coefficients for the linear predictor (X @ coef_ + intercept_) in the GLM.

intercept_ : float

Intercept (a.k.a. bias) added to linear predictor.

n_features_in_ : int

Number of features seen during fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

n_iter_ : int

Actual number of iterations used in the solver.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.PoissonRegressor()
>>> X = [[1, 2], [2, 3], [3, 4], [4, 3]]
>>> y = [12, 17, 22, 21]
>>> clf.fit(X, y)
PoissonRegressor()
>>> clf.score(X, y)
0.990...
>>> clf.coef_
array([0.121..., 0.158...])
>>> clf.intercept_
2.088...
>>> clf.predict([[1, 1], [3, 4]])
array([10.676..., 21.875...])

Methods

fit(X, y[, sample_weight])

Fit a Generalized Linear Model.

get_params([deep])

Get parameters for this estimator.

predict(X)

Predict using GLM with feature matrix X.

score(X, y[, sample_weight])

Compute D^2, the percentage of deviance explained.

set_params(**params)

Set the parameters of this estimator.

property family

DEPRECATED: Attribute family was deprecated in version 1.1 and will be removed in 1.3.

Ensure backward compatibility for the time of deprecation.

fit(X, y, sample_weight=None)

Fit a Generalized Linear Model.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training data.

y : array-like of shape (n_samples,)

Target values.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

self : object

Fitted model.
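One way to build intuition for sample_weight is to compare a weighted fit with a fit on data where each row is repeated according to its (integer) weight. The sketch below checks this numerically on synthetic data; the data, the tight tol, and the comparison tolerance are assumptions chosen for illustration, and the check is empirical rather than a documented guarantee.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

# Synthetic count data with integer weights in {1, 2, 3} (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = rng.poisson(np.exp(0.5 + X @ np.array([0.4, -0.2])))
weights = rng.integers(1, 4, size=50)

# A tight tolerance so both fits converge closely to their optimum.
params = dict(alpha=1.0, tol=1e-8, max_iter=1000)

# Fit 1: pass the weights directly.
weighted = PoissonRegressor(**params).fit(X, y, sample_weight=weights)

# Fit 2: repeat each row as many times as its weight.
X_rep = np.repeat(X, weights, axis=0)
y_rep = np.repeat(y, weights)
repeated = PoissonRegressor(**params).fit(X_rep, y_rep)

print(np.allclose(weighted.coef_, repeated.coef_, atol=1e-4))
print(np.isclose(weighted.intercept_, repeated.intercept_, atol=1e-4))
```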

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

predict(X)

Predict using GLM with feature matrix X.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Samples.

Returns:

y_pred : array of shape (n_samples,)

Returns predicted values.

score(X, y, sample_weight=None)

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2. R^2 uses squared error and D^2 uses the deviance of this GLM, see the User Guide.

D^2 is defined as \(D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}\), where \(D_{null}\) is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to \(y_{pred} = \bar{y}\). The mean \(\bar{y}\) is weighted by sample_weight. The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse).

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,)

True values of target.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

D^2 of self.predict(X) w.r.t. y.
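The definition of D^2 can be checked by hand with sklearn.metrics.mean_poisson_deviance. The sketch below, on synthetic data chosen for illustration, computes the model deviance and the null deviance (predicting the mean of y for every sample) and compares the resulting ratio with score.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.metrics import mean_poisson_deviance

# Synthetic count data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = rng.poisson(np.exp(0.5 + X @ np.array([0.4, -0.2])))

reg = PoissonRegressor().fit(X, y)
y_pred = reg.predict(X)

# Deviance of the fitted model and of the intercept-only (null) model.
dev_model = mean_poisson_deviance(y, y_pred)
dev_null = mean_poisson_deviance(y, np.full(len(y), y.mean()))

d2_manual = 1.0 - dev_model / dev_null
print(d2_manual, reg.score(X, y))
```

Note that the null model uses the mean of the y passed to score, not the mean of the training targets, so the comparison is made on the same data.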

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.
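As a sketch of the nested-parameter naming, the example below wraps the regressor in a Pipeline built with make_pipeline, which names each step after its lowercased class name. The choice of StandardScaler as a preprocessing step is an assumption for illustration.

```python
from sklearn.linear_model import PoissonRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step names are derived automatically: "standardscaler", "poissonregressor".
pipe = make_pipeline(StandardScaler(), PoissonRegressor())

# Nested parameters use the <component>__<parameter> form.
pipe.set_params(poissonregressor__alpha=0.1, standardscaler__with_mean=False)

# get_params(deep=True) exposes the same nested names.
params = pipe.get_params(deep=True)
print(params["poissonregressor__alpha"])
print(pipe.named_steps["poissonregressor"].alpha)
```

The same naming is what a grid search over a pipeline would use for its parameter grid keys.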

Is Poisson a generalized linear model?

A Poisson regression model is a Generalized Linear Model (GLM) used to model count data and contingency tables. The response Y (a count) is assumed to follow a Poisson distribution, and the logarithm of its expected value (mean) is modeled as a linear combination of the features with unknown parameters.
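Written out, the model described above takes the following form. The symbols \(\mu(x)\), \(\beta\) and \(\beta_0\) are notation introduced here for illustration; in the class documented above, \(\beta\) and \(\beta_0\) correspond to coef_ and intercept_.

\[
Y \mid x \sim \mathrm{Poisson}\big(\mu(x)\big), \qquad \log \mu(x) = x^\top \beta + \beta_0, \qquad P(Y = k \mid x) = \frac{\mu(x)^k \, e^{-\mu(x)}}{k!}, \quad k = 0, 1, 2, \dots
\]

A consequence of the Poisson assumption is that the conditional variance equals the conditional mean, \(\mathrm{Var}(Y \mid x) = \mu(x)\).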

How do I fit a GLM model in python?

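In libraries that expose a glm model class (this workflow matches, for example, the statsmodels formula API), we first describe the model using glm and then call its fit method to estimate the parameters. Detailed results of the fit can be inspected via the summary method, and predictions are computed with the predict method. With scikit-learn, the equivalent steps are to instantiate PoissonRegressor and call its fit and predict methods, as documented above.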

What is a generalized Poisson model?

Generalized Poisson Regression (GPR) is a method that can handle both overdispersion (variance larger than the mean) and underdispersion (variance smaller than the mean), which the standard Poisson model does not accommodate. Many articles propose estimating the regression parameters of the GPR model by Maximum Likelihood Estimation (MLE) alone.
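Before reaching for a model such as GPR, it can help to check whether the data are actually over- or underdispersed relative to a Poisson fit. The sketch below is one common diagnostic, not a GPR implementation: it fits an unpenalized PoissonRegressor to synthetic negative binomial counts (the data-generating process is an assumption for illustration) and computes the Pearson dispersion statistic, which is close to 1 when the Poisson variance assumption holds.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

# Synthetic overdispersed counts (illustrative assumption).
rng = np.random.default_rng(0)
n_samples = 2000
X = rng.normal(size=(n_samples, 1))
mu_true = np.exp(1.5 + 0.3 * X[:, 0])

# Negative binomial with mean mu_true and variance mu_true + mu_true**2 / r.
r = 2.0
y = rng.negative_binomial(r, r / (r + mu_true))

# Unpenalized Poisson fit (alpha=0) so the estimate is a plain MLE.
reg = PoissonRegressor(alpha=0.0, max_iter=300).fit(X, y)
mu_hat = reg.predict(X)

# Pearson dispersion: sum of squared Pearson residuals over residual dof.
n_params = X.shape[1] + 1  # coefficients plus intercept
pearson_dispersion = np.sum((y - mu_hat) ** 2 / mu_hat) / (n_samples - n_params)
print(f"Pearson dispersion: {pearson_dispersion:.2f}")
```

A value well above 1 suggests overdispersion, and a value well below 1 suggests underdispersion; either is a hint that a more flexible count model may be worth considering.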