Goodness of fit test for Poisson distribution in Python
So I think the Chi-square approach works OK for low mean Poisson data, since setting the bins at integer values is the logical choice. With higher means though, it becomes more tricky -- you will get different answers with different binning strategies. Hence my suggestion for the KS test in the comments -- you don't need to bin the data at all, just look at the CDF.

But Glen_b is right, in that the KS test without prespecifying the mean will have too high a Type II error rate (false negatives). An alternative is the Lilliefors test, which uses the same CDF approach as the KS test, but uses simulations to generate the null distribution for the KS statistic.

First though, let's look at the CDF of your data. I thought your histogram looked pretty consistent with Poisson data, and the CDF graph comports with that as well. Here I generate 10 simulations of 112 observations to show the typical variation with data that is actually Poisson (with the same mean as your data):
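A minimal sketch of that CDF comparison, without the plotting. The `data` array below is a simulated stand-in (an assumption, not the counts from the question), and the mean of 20 is arbitrary; only the sample size of 112 and the 10 simulations come from the text above.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real counts (NOT the data from the question)
data = rng.poisson(lam=20, size=112)
mu_hat = data.mean()

def ecdf_on_grid(x, grid):
    """Fraction of observations less than or equal to each grid value."""
    return np.searchsorted(np.sort(x), grid, side="right") / x.size

# Integer grid wide enough to cover the data and most simulated samples
grid = np.arange(0, data.max() + 10)
obs_cdf = ecdf_on_grid(data, grid)
fit_cdf = poisson.cdf(grid, mu_hat)

# Ten simulated Poisson samples with the same size and mean as the data
sim_cdfs = np.array(
    [ecdf_on_grid(rng.poisson(mu_hat, data.size), grid) for _ in range(10)]
)

# Largest vertical gap between each empirical CDF and the fitted CDF
max_gap_obs = np.abs(obs_cdf - fit_cdf).max()
max_gap_sim = np.abs(sim_cdfs - fit_cdf).max(axis=1)
print("observed gap:", round(float(max_gap_obs), 3))
print("simulated gaps:", np.round(max_gap_sim, 3))
```

If the observed gap sits inside the range of the simulated gaps, the data are behaving like the simulated Poisson samples. Plotting `obs_cdf` and the rows of `sim_cdfs` against `grid` as step lines gives the visual version of the same check.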
So you can see your data does not look all that out of line with a Poisson process. Here I coded up a Lilliefors version for Poisson (if you have the original timestamps, you could instead estimate an exponential distribution for the inter-arrival times and check with Lilliefors or statsmodels' simulated lookup tables).
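A minimal sketch of a Lilliefors-style test for Poisson counts, not necessarily the code originally posted. The function names `ks_stat_poisson` and `lilliefors_poisson` are made up for illustration, and the sample passed in at the end is simulated rather than the data from the question. The key point is that the mean is re-estimated inside every simulated sample, so the null distribution of the KS statistic accounts for fitting the mean.

```python
import numpy as np
from scipy.stats import poisson

def ks_stat_poisson(x):
    """Max gap between the ECDF and a Poisson CDF fitted with the sample mean."""
    x = np.asarray(x)
    mu = x.mean()
    # Both CDFs are step functions that jump only at integers, so the
    # largest gap is attained on the integer grid 0..max(x).
    grid = np.arange(0, x.max() + 1)
    ecdf = np.searchsorted(np.sort(x), grid, side="right") / x.size
    return np.abs(ecdf - poisson.cdf(grid, mu)).max()

def lilliefors_poisson(x, n_sim=2000, seed=0):
    """Simulated p-value for the KS statistic when the mean is estimated."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    stat = ks_stat_poisson(x)
    mu = x.mean()
    # Null distribution: draw Poisson samples of the same size and
    # refit the mean within each one before computing the statistic.
    sims = np.array(
        [ks_stat_poisson(rng.poisson(mu, x.size)) for _ in range(n_sim)]
    )
    # Add-one correction keeps the p-value strictly positive
    p_value = (1 + np.sum(sims >= stat)) / (n_sim + 1)
    return stat, p_value

# Hypothetical sample standing in for the real counts
sample = np.random.default_rng(1).poisson(lam=20, size=112)
stat, p_value = lilliefors_poisson(sample)
print(f"KS statistic = {stat:.3f}, simulated p-value = {p_value:.3f}")
```

A large p-value here only says the data are not detectably different from Poisson at this sample size; it is not proof that the process is Poisson.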
Your p-value may be slightly different due to the simulation run, but I don't think it is likely to be anywhere near the edge of the distribution. To check and make sure my
Caveat emptor, I do not know the power of this relative to the binning Chi-square approach. But here is how I would do the Chi-square approach (I don't believe the approach you used is correct). Here I bin according to Poisson quantiles, instead of based on the data. (So the expected number per bin is roughly the same; it cannot be exactly equal because the counts are discrete.)
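A minimal sketch of a Chi-square test with bins set at Poisson quantiles, under some assumptions: the helper name `chisq_poisson_quantile_bins`, the default of 5 bins, and the simulated sample are all illustrative choices, not the original code or data. One degree of freedom is removed for the estimated mean via `ddof=1`; because the mean is fitted from the raw counts rather than the binned ones, the resulting p-value is approximate.

```python
import numpy as np
from scipy.stats import poisson, chisquare

def chisq_poisson_quantile_bins(x, n_bins=5):
    """Chi-square GOF test for Poisson, with cut points at Poisson quantiles."""
    x = np.asarray(x)
    mu = x.mean()
    # Interior cut points at the 1/n_bins, 2/n_bins, ... quantiles of the
    # fitted Poisson; duplicates (possible for small means) are merged.
    probs = np.arange(1, n_bins) / n_bins
    cuts = np.unique(poisson.ppf(probs, mu))
    # Bin i covers the counts in (cuts[i-1], cuts[i]]; the first and last
    # bins are open-ended, so the bin probabilities sum to one.
    cum = np.concatenate(([0.0], poisson.cdf(cuts, mu), [1.0]))
    expected = x.size * np.diff(cum)
    bin_index = np.searchsorted(cuts, x, side="left")
    observed = np.bincount(bin_index, minlength=cuts.size + 1)
    # ddof=1 accounts for the one estimated parameter (the mean)
    stat, p_value = chisquare(observed, expected, ddof=1)
    return observed, expected, stat, p_value

# Hypothetical sample standing in for the real counts
sample = np.random.default_rng(2).poisson(lam=20, size=112)
observed, expected, stat, p_value = chisq_poisson_quantile_bins(sample)
print("observed:", observed)
print("expected:", np.round(expected, 1))
print(f"chi-square = {stat:.2f}, p-value = {p_value:.3f}")
```

Changing `n_bins` changes the cut points and therefore the p-value, which is the binning sensitivity discussed above.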
Like I said, different binning strategies will give different p-values. Here if you do

All in all, I think your example data is quite consistent with a Poisson distribution.

How do you find the goodness of fit in a Poisson distribution?
Testing the Goodness-of-Fit for a Poisson Distribution
Values must be integers that are greater than or equal to zero. For example, the number of sales per day in a store can follow the Poisson distribution. If these data follow the Poisson distribution, you can use this distribution to make predictions.
Which test is applicable for determining goodness of fit of a Poisson distribution?
The Chi-square goodness-of-fit test determines how well a theoretical distribution (such as normal, binomial, or Poisson) fits the empirical distribution. In the Chi-square goodness-of-fit test, the sample data are divided into intervals, and the observed count in each interval is compared with the count expected under the theoretical distribution.
How do you do a goodness of fit test in Python?
First, create a data frame with 8 intervals. Create two columns, one for the observed and one for the expected frequency. Use Pandas' apply method to calculate the observed frequency in each interval. We are now ready to perform the goodness-of-fit test.
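The data-frame recipe above can be sketched as follows for low-mean counts, where integer bins are the natural choice. Everything specific here is an assumption for illustration: the simulated `counts`, the mean of 3.5, and the choice of the eight intervals as 0 through 6 individually plus a pooled "7 or more" tail.

```python
import numpy as np
import pandas as pd
from scipy.stats import poisson, chisquare

# Hypothetical low-mean count data
counts = np.random.default_rng(1).poisson(lam=3.5, size=200)
mu_hat = counts.mean()

# Eight intervals: 0, 1, ..., 6 individually, and 7-or-more pooled
n_intervals = 8
last = n_intervals - 1
table = pd.DataFrame({"lower": np.arange(n_intervals)})

# Observed frequency per interval (the last interval is open-ended)
table["observed"] = table["lower"].apply(
    lambda k: int(np.sum(counts >= k)) if k == last else int(np.sum(counts == k))
)

# Expected frequency under a Poisson with the estimated mean;
# poisson.sf(k - 1, mu) is P(X >= k) for the pooled tail
table["expected"] = table["lower"].apply(
    lambda k: counts.size
    * (poisson.sf(k - 1, mu_hat) if k == last else poisson.pmf(k, mu_hat))
)

# ddof=1 removes one degree of freedom for the estimated mean
stat, p_value = chisquare(
    table["observed"].to_numpy(), table["expected"].to_numpy(), ddof=1
)
print(table)
print(f"chi-square = {stat:.2f}, p-value = {p_value:.3f}")
```

If any expected frequency comes out very small, pooling neighbouring intervals before running the test is the usual remedy.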
How do I know if my data fits a Poisson distribution?
A variable follows a Poisson distribution when the following conditions are true: Data are counts of events. All events are independent. The average rate of occurrence does not change during the period of interest.