Your data does not appear to be gamma-distributed, but assuming it is, you could fit it like this:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
gamma = stats.gamma
a, loc, scale = 3, 0, 2
size = 20000
y = gamma.rvs(a, loc, scale, size=size)
x = np.linspace(0, y.max(), 100)
# fit, holding loc fixed at 0
param = gamma.fit(y, floc=0)
pdf_fitted = gamma.pdf(x, *param)
plt.plot(x, pdf_fitted, color='r')
# plot the histogram (density=True replaces normed=True, which newer Matplotlib removed)
plt.hist(y, density=True, bins=30)
plt.show()
The area under the pdf (over the entire domain) equals 1. The area under the histogram equals 1 if you use density=True (normed=True in older versions of Matplotlib).
If x has length size (i.e. 20000), then pdf_fitted has the same shape as x. If we call plot and specify only the y-values, e.g. plt.plot(pdf_fitted), then the values are plotted over the x-range [0, size]. That is much too large an x-range. Since the histogram is going to use an x-range of [min(y), max(y)], we must choose x to span a similar range: x = np.linspace(0, y.max()), and call plot with both the x- and y-values specified, e.g. plt.plot(x, pdf_fitted).
As Warren Weckesser points out in the comments, for most applications you know the gamma distribution's domain begins at 0. If that is the case, use floc=0 to hold the loc parameter to 0. Without floc=0, gamma.fit will try to find the best-fit value for the loc parameter too, which given the vagaries of data will generally not be exactly zero.
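To see the effect of floc=0 for yourself, you could fit the same sample both ways and compare the results. This is only a sketch: the sample is synthetic, the seed is arbitrary, and the variable names (a_fixed, a_free, and so on) are made up for illustration.

```python
import scipy.stats as stats

# Synthetic sample with known parameters: shape=3, loc=0, scale=2 (illustrative values)
y = stats.gamma.rvs(3, loc=0, scale=2, size=20000, random_state=0)

# Fit with loc held at 0: only shape and scale are estimated
a_fixed, loc_fixed, scale_fixed = stats.gamma.fit(y, floc=0)

# Fit with loc free: shape, loc and scale are all estimated
a_free, loc_free, scale_free = stats.gamma.fit(y)

print("floc=0  :", a_fixed, loc_fixed, scale_fixed)
print("loc free:", a_free, loc_free, scale_free)
```

With floc=0 the returned loc is exactly the value you fixed, while the free fit reports whatever loc the optimizer settles on.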
I was surprised that I couldn't find this piece of code anywhere.
What I basically wanted was to fit some theoretical distribution to my graph. If you are lucky, you should see something like this:
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
# create some normal random noisy data
ser = 50*np.random.rand() * np.random.normal(10, 10, 100) + 20
# plot normalized histogram (density=True replaces normed=True, which newer Matplotlib removed)
plt.hist(ser, density=True)
# find minimum and maximum of xticks, so we know
# where we should compute theoretical distribution
xt = plt.xticks()[0]
xmin, xmax = min(xt), max(xt)
lnspc = np.linspace(xmin, xmax, len(ser))
# let's try the normal distribution first
m, s = stats.norm.fit(ser)  # get mean and standard deviation
pdf_g = stats.norm.pdf(lnspc, m, s)  # now get theoretical values in our interval
plt.plot(lnspc, pdf_g, label="Norm")  # plot it
# exactly the same as above; gamma.fit returns shape, loc, scale
ag, bg, cg = stats.gamma.fit(ser)
pdf_gamma = stats.gamma.pdf(lnspc, ag, bg, cg)
plt.plot(lnspc, pdf_gamma, label="Gamma")
# guess what :)
ab, bb, cb, db = stats.beta.fit(ser)
pdf_beta = stats.beta.pdf(lnspc, ab, bb, cb, db)
plt.plot(lnspc, pdf_beta, label="Beta")
plt.legend()  # show the labels set above
plt.show()
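Eyeballing the curves is one thing, but you may also want a number telling you which candidate fits better. One possible approach (not part of the code above, just a sketch) is to compare the total log-likelihood of the data under each fitted distribution. The data here is synthetic and drawn from a gamma distribution on purpose, so the expected winner is known; the names loglik_norm and loglik_gamma are made up for illustration.

```python
import numpy as np
from scipy import stats

# Illustrative data: drawn from a gamma distribution so the "right" answer is known
data = stats.gamma.rvs(2, scale=3, size=2000, random_state=0)

# Fit each candidate by maximum likelihood
norm_params = stats.norm.fit(data)            # (mean, std)
gamma_params = stats.gamma.fit(data, floc=0)  # (shape, loc, scale), loc held at 0

# Total log-likelihood of the data under each fitted distribution
loglik_norm = np.sum(stats.norm.logpdf(data, *norm_params))
loglik_gamma = np.sum(stats.gamma.logpdf(data, *gamma_params))

print("normal log-likelihood:", loglik_norm)
print("gamma  log-likelihood:", loglik_gamma)
```

A higher log-likelihood means a better fit to this sample. Both candidates have two free parameters here, so the comparison is fair as it stands; with differing parameter counts you would probably want a penalized criterion such as AIC.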
In statistics, the Gamma distribution is often used to model probabilities related to waiting times.
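The waiting-time connection can be illustrated with a small simulation: the total time until k independent events occur, where each individual wait is exponentially distributed, follows a Gamma distribution with shape k and scale equal to the mean wait. The sketch below assumes hypothetical values (k = 3 events, a mean wait of 2 time units) and an arbitrary seed.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)
k, mean_wait = 3, 2.0  # hypothetical: 3 events, each wait averaging 2 time units

# Total time to see k events = sum of k independent exponential waiting times
total_wait = rng.exponential(scale=mean_wait, size=(100_000, k)).sum(axis=1)

# Theoretical mean and variance of a Gamma distribution with shape k, scale mean_wait
theory_mean, theory_var = stats.gamma.stats(a=k, scale=mean_wait, moments="mv")

print("simulated mean, variance:", total_wait.mean(), total_wait.var())
print("gamma     mean, variance:", float(theory_mean), float(theory_var))
```

The simulated mean and variance should land close to the theoretical values of k * mean_wait and k * mean_wait**2.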
The following examples show how to use scipy.stats.gamma to plot one or more Gamma distributions in Python.
Example 1: Plot One Gamma Distribution
The following code shows how to plot a Gamma distribution with a shape parameter of 5 and a scale parameter of 3 in Python:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

#define x-axis values
x = np.linspace(0, 40, 100)

#calculate pdf of Gamma distribution for each x-value
y = stats.gamma.pdf(x, a=5, scale=3)

#create plot of Gamma distribution
plt.plot(x, y)

#display plot
plt.show()
The x-axis displays the potential values that a Gamma distributed random variable can take on and the y-axis shows the corresponding PDF values of the Gamma distribution with a shape parameter of 5 and scale parameter of 3.
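To make the PDF values on the y-axis less of a black box, they can be checked against the closed-form Gamma density, f(x) = x^(a-1) * exp(-x/scale) / (Gamma(a) * scale^a). The following is a minimal sketch using the same shape and scale as the plot; the names pdf_manual and pdf_scipy are only for illustration.

```python
import numpy as np
import scipy.stats as stats
from scipy.special import gamma as gamma_fn

a, scale = 5, 3
x = np.linspace(0, 40, 100)

#closed-form Gamma density: x^(a-1) * exp(-x/scale) / (Gamma(a) * scale^a)
pdf_manual = x**(a - 1) * np.exp(-x / scale) / (gamma_fn(a) * scale**a)

#same density as computed by SciPy
pdf_scipy = stats.gamma.pdf(x, a=a, scale=scale)

#compare the two arrays element by element
print(np.allclose(pdf_manual, pdf_scipy))
```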
Example 2: Plot Multiple Gamma Distributions
The following code shows how to plot multiple Gamma distributions with various shape and scale parameters:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

#define three Gamma distributions
x = np.linspace(0, 40, 100)
y1 = stats.gamma.pdf(x, a=5, scale=3)
y2 = stats.gamma.pdf(x, a=2, scale=5)
y3 = stats.gamma.pdf(x, a=4, scale=2)

#add lines for each distribution
plt.plot(x, y1, label='shape=5, scale=3')
plt.plot(x, y2, label='shape=2, scale=5')
plt.plot(x, y3, label='shape=4, scale=2')

#add legend
plt.legend()

#display plot
plt.show()
Notice that the shape of the Gamma distribution can vary quite a bit depending on the shape and scale parameters.
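One way to quantify how the curves differ is to compute the mean, variance, and mode of each distribution. For a Gamma distribution the mean is shape * scale, the variance is shape * scale^2, and (for shape >= 1) the mode is (shape - 1) * scale. The sketch below uses the same three parameter pairs as the plot; the summary list is just an illustrative container.

```python
import scipy.stats as stats

#same three (shape, scale) pairs as in the plot above
params = [(5, 3), (2, 5), (4, 2)]

summary = []
for a, scale in params:
    #mean and variance as reported by SciPy
    mean, var = stats.gamma.stats(a=a, scale=scale, moments="mv")
    #mode of a Gamma distribution, valid for shape >= 1
    mode = (a - 1) * scale
    summary.append((a, scale, float(mean), float(var), mode))
    print(f"shape={a}, scale={scale}: mean={float(mean):.1f}, "
          f"variance={float(var):.1f}, mode={mode}")
```

The mode is where each curve peaks, which is why the three lines reach their maximum at different x-values.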