# Maximum-likelihood Fitting with R

So you have gathered your data and now you want to use maximum-likelihood estimation, along with a chosen distribution, to fit a model. Here are two methods:

## Method 1

In R, you can define the log-likelihood and use R's optim() to maximise it with respect to the parameters. This gives you your maximum-likelihood estimates, which then let you fit a probability distribution to the data.

The process is perhaps best shown by way of an example.

Say we have some count data that we think may best be modelled using the Poisson distribution. The data looks like this:


```r
set.seed(100)

# let's generate some synthetic sample data:
x = rpois(n=10000, lambda=2)

# plotting
hist(x, col='lightblue')
```



Hint: We know the data has been generated from a Poisson distribution with $\lambda=2$!

The Poisson probability function has the form:

$P(X=x) = \frac{\lambda^{x}e^{-\lambda}}{x!}$

The likelihood function considering $n$ independent data points is therefore:

$L(\lambda;x) = \prod_{i=1}^{n}\frac{\lambda^{x_i}e^{-\lambda}}{x_i!} = \frac{\lambda^{x_1}e^{-\lambda}}{x_1!}\times\frac{\lambda^{x_2}e^{-\lambda}}{x_2!}\times\ldots\times\frac{\lambda^{x_n}e^{-\lambda}}{x_n!}$

$= \frac{\lambda^{\sum x_i}e^{-n\lambda}}{x_1!x_2!\dots x_n!}$

And the log-likelihood function:

$\log(L(\lambda;x))=\log(\lambda^{\sum x_i}e^{-n\lambda})-\log(x_1!x_2!\dots x_n!)$

Which, ignoring the terms not featuring the parameter $\lambda$,

$=\sum x_i\log(\lambda)-n\lambda$

Expressing this in R:


```r
# the log-likelihood is a function of lambda and the data
poissonLogLik = function(lambda, data){
  # total number of observations
  n = length(data)
  # equation (constant terms dropped)
  logLik = sum(data)*log(lambda) - n*lambda
  return(logLik)
}
```
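As a quick sanity check, the log-likelihood should peak near the true rate of 2. A small sketch, re-running the data-generation step from above:

```r
set.seed(100)
x = rpois(n=10000, lambda=2)

# the log-likelihood, as defined above
poissonLogLik = function(lambda, data){
  n = length(data)
  sum(data)*log(lambda) - n*lambda
}

# evaluate at lambda = 1, 2 and 3: the middle value should win
ll = sapply(c(1, 2, 3), poissonLogLik, data=x)
ll
```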



To find our estimate $\hat\lambda$ we need to maximise the log-likelihood function:

$\frac{\mathrm \partial}{\mathrm \partial \lambda} \big(\log(L(\hat\lambda;x))\big)=0$
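For the Poisson model this maximisation can actually be solved in closed form: the derivative is $\frac{\sum x_i}{\lambda} - n$, and setting it to zero gives $\hat\lambda = \frac{1}{n}\sum x_i = \bar{x}$, the sample mean. A quick check in R (re-running the data generation from above):

```r
set.seed(100)
x = rpois(n=10000, lambda=2)

# the Poisson MLE in closed form: the sample mean
lambda_hat = mean(x)
lambda_hat
```

The numerical optimisation below should land on essentially the same value.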

Doing it in R:


```r
# with a starting value of 1, maximise the function
mle = optim(par=1, fn=poissonLogLik, data=x, control=list(fnscale=-1))
```


```
> mle
$par
[1] 2.000195

$value
[1] -6136.363

$counts
function gradient
      30       NA

$convergence
[1] 0

$message
NULL
```

Note that by default, optim() finds a minimum. This is no good because we are only interested in maximising! Adding control=list(fnscale=-1) solves the problem. Check out optim()'s help file for more details.

Now that we have our estimate, $\hat\lambda$, how close are our observed and expected values?

Our observed values:

```r
# observed
table(x)
```

```
x
   0    1    2    3    4    5    6    7    8    9
1327 2781 2676 1790  880  367  126   38   14    1
```

Our expected values:

```r
# expected
expected = dpois(x=min(x):max(x), lambda=mle$par)*10000
```

```
> expected
 [1] 1353.088531 2706.441338 2706.705639 1804.646644  902.411439  360.999826
 [7]  120.345027   34.387651    8.597752    1.910798
```


Overlaying the expected values onto the original histogram (with a red line):

```r
# observed
hist(x, col='lightblue')

# expected
lines(min(x):max(x), expected, col='red', lwd=2)
```
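To put numbers on the visual fit, the observed and expected counts can be lined up side by side. A minimal sketch, recomputing the quantities from above (the name lambda_hat is just for illustration):

```r
set.seed(100)
x = rpois(n=10000, lambda=2)

# Poisson MLE: the sample mean
lambda_hat = mean(x)

# observed counts per value, and expected counts under the fitted model
observed = as.vector(table(x))
expected = dpois(min(x):max(x), lambda=lambda_hat) * length(x)

# side-by-side comparison, rounded to whole counts
round(rbind(observed, expected))
```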


The model appears to fit well!

## Method 2

The function fitdistr() in the MASS package allows us to fit a range of distributions to data using maximum likelihood. This means we don't have to use Method 1 each time! Woohoo! It's as easy as handing over your data, choosing your distribution and pressing RUN. You also get estimated standard errors, which is a bonus.

```r
# loading the MASS package
library(MASS)

# fitting a Poisson distribution using maximum-likelihood
fitdistr(x, densfun='Poisson')
```

```
     lambda
  2.00010000
 (0.01414249)
```


Very nice!
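The object returned by fitdistr() is a list, so the estimate and its standard error can also be pulled out programmatically rather than read off the printed output. A small sketch, assuming the same x as above (note that for the Poisson the reported standard error is $\sqrt{\hat\lambda/n}$):

```r
library(MASS)

set.seed(100)
x = rpois(n=10000, lambda=2)

fit = fitdistr(x, densfun='Poisson')

# the estimate and its standard error are stored in the fitted object
fit$estimate  # named vector with element 'lambda'
fit$sd        # standard error of the estimate
```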