Describing the MCMC Method
- MCMC methods are used to approximate the posterior distribution of a parameter by random sampling in a probabilistic space
- Roughly speaking, the posterior distribution can be thought of as a compromise between the prior distribution and the likelihood, weighted by how much data we observe
- We can determine a posterior distribution by doing one of the following:
- Using conjugate distributions (i.e. a conjugate prior)
- If we are lucky, the prior is conjugate to the likelihood, and we can find the posterior analytically by simply multiplying the two closed-form functions (prior and likelihood) by each other
- In practice, however, the prior and likelihood distributions rarely match up with a conjugate pair perfectly, so we usually aren't able to work with conjugate distributions
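As a concrete instance of a conjugate pair, consider a Beta prior with a binomial likelihood (the prior pseudo-counts and data below are illustrative):

```r
# Beta-Binomial conjugacy: a Beta(a0, b0) prior combined with k successes
# in n trials yields a Beta(a0 + k, b0 + n - k) posterior in closed form
a0 <- 2; b0 <- 2          # prior pseudo-counts (illustrative choice)
k  <- 7; n  <- 10         # observed successes and trials
a1 <- a0 + k              # posterior shape parameters
b1 <- b0 + n - k
a1 / (a1 + b1)            # posterior mean of the success probability: 9/14
```

No sampling is needed here because the posterior has a known closed form; MCMC is for the cases where no such form exists.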
- Using MCMC sampling
- If the posterior does not follow a closed-form function (because the likelihood or prior distributions are too complicated), then we cannot solve for it analytically and will need to use MCMC sampling
How Monte Carlo Simulations Relate to MCMC
- Monte Carlo simulations generally refer to taking samples from a population to simulate the population distribution
- In many cases, we can use Monte Carlo simulations to approximate the area under curves when integration doesn't have a closed-form solution
- Within the MCMC algorithm, random parameter values are simulated and either accepted or rejected as the new parameter value at each iteration
- In the scope of the MCMC algorithm, the process of generating new random parameter values for many iterations is the Monte Carlo simulation part
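The integration idea above can be sketched in a few lines of R (the target function and sample size are illustrative):

```r
# Approximate the integral of f(x) = x^2 over [0, 1] by random sampling;
# the closed-form answer is 1/3, so we can check the estimate against it
set.seed(123)
n <- 100000
x <- runif(n, min = 0, max = 1)   # uniform samples over the interval
mean(x^2)                          # average height * interval width (1), close to 1/3
```

The estimate converges toward the true area as `n` grows, which is exactly why repeated random sampling works when no closed-form integral exists.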
How Markov Chains Relate to MCMC
- A Markov Chain generally refers to a model that only uses the current state to predict the next state
- Specifically, the model contains an independent variable that represents the current observation, and a dependent variable that represents the next observation; any states before the current one are ignored
- This is referred to as the Markov Property, or memorylessness
- Within the MCMC algorithm, we determine how much better a randomly-generated parameter estimate is compared to the previous parameter estimate, and we add the randomly-generated parameter value to the chain if it is accepted
- In the scope of the MCMC algorithm, the Markov Chain refers to the following features:
- Only referring to the previous parameter estimate when evaluating whether to accept the randomly-generated parameter estimate
- Appending each accepted value to a chain of parameter estimates
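The memoryless behavior can be seen in a toy two-state chain (the states and transition probabilities here are illustrative, unrelated to parameter estimation):

```r
# A two-state Markov chain: the next state depends only on the current state
set.seed(7)
states <- c("sunny", "rainy")
trans  <- matrix(c(0.8, 0.2,    # P(next state | current = sunny)
                   0.4, 0.6),   # P(next state | current = rainy)
                 nrow = 2, byrow = TRUE, dimnames = list(states, states))
chain <- character(100)
chain[1] <- "sunny"
for (i in 2:100) {
  chain[i] <- sample(states, 1, prob = trans[chain[i - 1], ])  # memoryless step
}
table(chain)   # long-run proportions approach the chain's stationary distribution
```

An MCMC sampler works the same way: each new parameter estimate is generated and evaluated using only the estimate immediately before it.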
Defining the MCMC Algorithm
- Generate a random parameter value to consider, and continue to generate random parameter values for many iterations (this is the Monte Carlo part)
- Compute the posterior (joint) probability of the set of randomly-generated parameter values by combining the prior and likelihood distributions (this is the Bayes' theorem part)
- More specifically, the likelihood equals the joint density of observing the data given the parameters of interest
- And the prior equals the joint density of observing the parameters together
- For example, let's say we want to estimate the conditional mean of the linear model y = a*x + b (with normally-distributed errors of standard deviation sd), which is represented by the following posterior distribution:
- P(a, b, sd | y) ∝ P(y | a, b, sd) * P(a, b, sd)
- Here, y represents our response data, x represents our predictor variable, and a, b, and sd are the parameters we want to estimate
- We know that our likelihood equals the joint likelihood (or density) of observing the data (or response variable) given the parameters a, b, and sd
- Therefore, our likelihood function can be defined in R as the following:
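A minimal sketch of such a likelihood function, assuming the linear model y = a*x + b with normally-distributed errors, and working on the log scale so densities are summed rather than multiplied (which avoids numerical underflow); `x` and `y` are assumed to be the predictor and response vectors:

```r
# Log-likelihood of the data given the parameters; param = c(a, b, sd)
likelihood <- function(param) {
  a  <- param[1]
  b  <- param[2]
  sd <- param[3]
  pred <- a * x + b                                  # conditional mean of y
  sum(dnorm(y, mean = pred, sd = sd, log = TRUE))    # joint log-density of the data
}
```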
- Here, we're estimating the probability of observing our data (or response variable in this case) given our parameters
- We know that our prior equals the joint likelihood (or density) of observing the parameters a, b, and sd together
- Therefore, our prior function can be defined in R as the following:
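A minimal sketch of such a prior function, again on the log scale, assuming normal priors on the parameters a, b, and sd (the means and standard deviations below are illustrative choices, not fixed by the method); `param = c(a, b, sd)`:

```r
# Log-prior: the joint density of the parameters themselves
prior <- function(param) {
  a_prior  <- dnorm(param[1], mean = 0, sd = 10, log = TRUE)  # belief about a
  b_prior  <- dnorm(param[2], mean = 0, sd = 10, log = TRUE)  # belief about b
  sd_prior <- dnorm(param[3], mean = 5, sd = 2,  log = TRUE)  # belief about sd
  a_prior + b_prior + sd_prior                                # joint log-density
}
```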
- Where a, b, and sd are the parameter estimates we're adjusting across many iterations
- Where the mean given to each prior density is whatever value we believe that parameter to be centered around
- Where the standard deviation given to each prior density is whatever spread we believe that parameter to have around its mean
- Also, we decided parameters a, b, and sd were all normally-distributed, but we can choose any other distribution if we believe they take on a different form
- For example, Poisson, beta, and uniform distributions are popular choices
- Determine if the randomly-generated parameter estimate is better than the previous parameter estimate (this is the Metropolis-Hastings part, or some other method for evaluation)
- If the set of randomly-generated parameter estimates has a higher posterior probability than the previous one, it is always added to the chain of parameter estimates; if it is worse, it is still accepted with probability equal to the ratio of the two posterior probabilities, and otherwise the previous estimate is repeated (this is the Markov chain part)
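The steps above can be tied together in a self-contained Metropolis-Hastings sketch; the simulated data, priors, starting values, and proposal step sizes are all illustrative:

```r
# Minimal Metropolis-Hastings sampler for the linear model y = a*x + b
set.seed(42)
x <- 1:30
y <- 2 * x + 5 + rnorm(30, sd = 3)                 # simulated data (a = 2, b = 5)

log_posterior <- function(p) {                     # p = c(a, b, sd)
  if (p[3] <= 0) return(-Inf)                      # the error sd must be positive
  sum(dnorm(y, mean = p[1] * x + p[2], sd = p[3], log = TRUE)) +      # log-likelihood
    sum(dnorm(p, mean = c(0, 0, 5), sd = c(10, 10, 2), log = TRUE))   # log-prior
}

n_iter <- 5000
chain <- matrix(NA, nrow = n_iter + 1, ncol = 3)
chain[1, ] <- c(1, 1, 5)                           # starting values for a, b, sd
for (i in 1:n_iter) {
  proposal <- rnorm(3, mean = chain[i, ], sd = c(0.05, 0.5, 0.3))     # Monte Carlo part
  ratio <- exp(log_posterior(proposal) - log_posterior(chain[i, ]))   # MH part
  chain[i + 1, ] <- if (runif(1) < ratio) proposal else chain[i, ]    # Markov chain part
}
colMeans(chain[-(1:1000), ])                       # posterior means after burn-in
```

Working with log-posteriors means the acceptance ratio is computed as `exp(log difference)`, which is numerically safer than dividing two tiny joint densities.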