Defining Notation from Statistical Mechanics
- A state refers to a unique observation of some system
- A system refers to a collection of random variables, which includes the future state and current state
- The states $X_n$ and $X_{n+1}$ are essentially thought of as the independent and dependent random variables, respectively
Describing a Markov Process
- A Markov process is a stochastic process that has the following properties:
- The number of distinct possible outcomes (or states) is finite
- It supports the Markov property, otherwise known as memorylessness
- The memorylessness property refers to the outcome of a random variable depending only on the previously observed outcome of that random variable
- The probabilities associated with each transition between states are constant over time
- Since we assume independence of any previous observations, we can consider the conditional probabilities of each state to be fixed
- Specifically, the next state is only conditional on the present state of the system, since its past states are considered to be independent of the next state
- Said another way, the next observation is only conditional on the current observation of a random variable, since its past observations are considered to be independent of the next observation
- The Markov property doesn't usually hold exactly for real-world systems, so we wouldn't expect our predictions to be very accurate
- However, running the Markov chain for thousands of iterations typically provides us with a good long-run prediction
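To illustrate, here is a minimal Python sketch of this idea, using a made-up two-state transition matrix: each new state is sampled using only the current state (memorylessness), and running the chain for many iterations yields a stable long-run distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state chain; the next state is sampled using only
# the row of the *current* state -- no earlier history is consulted
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

state = 0
counts = np.zeros(2)
for _ in range(100_000):                  # thousands of iterations
    state = rng.choice(2, p=P[state])     # memorylessness in action
    counts[state] += 1

print(counts / counts.sum())  # long-run fractions of time in each state
```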
An Example of a Markov Process about Weather
- Let's assume that on a given day it is more likely to rain if it rained the day before than if it didn't
- Likewise, it is more likely to be sunny if it was sunny the day before than if it wasn't
- We may try to express the probability of it raining on Tuesday as a function of the weather on Monday without taking Sunday into account
- Even though every day might be dependent on everything that happened previously, we really only need to consider a small amount of data when predicting the weather, and this can still produce accurate predictions
- Specifically, the weather on Monday doesn't actually affect the weather on Tuesday
- Rather, both stem from quite complex underlying meteorological conditions, and those conditions don't tend to shift in unexpected directions
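To put hypothetical numbers on this example, here is a minimal sketch (the probabilities are made up) of computing Tuesday's distribution from Monday's weather alone:

```python
import numpy as np

# Hypothetical transition probabilities (rows = today, columns = tomorrow):
#                sunny  rainy
P = np.array([[0.9, 0.1],    # if sunny today, 90% chance of sun tomorrow
              [0.5, 0.5]])   # if rainy today, 50% chance of sun tomorrow

# Each row is a conditional distribution, so each row must sum to 1
assert np.allclose(P.sum(axis=1), 1.0)

# If Monday is rainy, the distribution for Tuesday is just the rainy row
monday = np.array([0.0, 1.0])   # [P(sunny), P(rainy)]
tuesday = monday @ P
print(tuesday)                  # [0.5 0.5]
```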
Basic Theory behind Markov Chains
- A Markov Chain is a probabilistic model of a stochastic process
- Every Markov Chain can be represented using a transition matrix, a state vector, and a steady-state vector
- The Markov part refers to the model's use of the Markov property
- Specifically, the transition matrix is usable only because of the Markov property
- The Chain part refers to running many iterations of the model by re-inputting the previous iteration's output over and over again
- Specifically, we input an initial state vector
- Then, we receive some output from the model (by multiplying the transition matrix and the state vector together)
- Then, we re-input the output over and over again until the state vector converges, which we call our steady-state vector
- Mathematically, we keep running the model until $x_{n+1} = x_n$, where $x_{n+1} = x_n P$ and $P$ is the transition matrix
- In other words, we keep running the model until the input equals the output (i.e. the state vector equals the steady-state vector)
- The goal of a Markov Chain is to keep iterating until the state vector converges to this steady-state vector, as sketched below
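Putting that loop into code, here is a minimal sketch (using a hypothetical transition matrix and the row-vector convention, where rows of $P$ sum to 1):

```python
import numpy as np

def steady_state(P, x0, tol=1e-10, max_iters=10_000):
    """Re-input the output over and over until the state vector stops changing."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        x_next = x @ P                    # one iteration: state vector times transition matrix
        if np.allclose(x_next, x, atol=tol):
            return x_next                 # converged: x = xP, our steady-state vector
        x = x_next
    return x                              # fallback if convergence wasn't reached

# Hypothetical weather chain from the example above
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(steady_state(P, [1.0, 0.0]))        # converges to about [0.8333, 0.1667]
```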
Defining Markov Chains
- A transition matrix represents the conditional probabilities of some observation given the previous observation
- This is based on a large amount of historical data
- Rows represent the from (or current) state
- Each row sums to a probability of 1
- Columns represent the to (or next) state
- The columns do not need to sum to 1, since the entries are conditional probabilities and not joint probabilities (see the check in the sketch after this list)
- The state vector represents the probabilities of some current observation that you want to better understand
- This is based on the probability that the system (or random variable) is in a state at a given time
- This is used as input
- The steady-state vector represents the converged output of repeatedly multiplying the transition matrix and the state vector
- This represents the output after very many iterations, since the state vector will eventually converge
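A short sketch, with the same hypothetical numbers as above, verifying the row and column behavior described in this list:

```python
import numpy as np

# Hypothetical transition matrix from the weather example
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

print(P.sum(axis=1))  # rows:    [1.  1. ] -- each conditional distribution sums to 1
print(P.sum(axis=0))  # columns: [1.4 0.6] -- column sums are unconstrained
```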
Mathematically Defining Markov Chains
- A Markov Chain is a discrete-valued, discrete-time stochastic process
- Mathematically, a discrete-valued, discrete-time stochastic process $\{X_n\}$ has the Markov property when: $P(X_{n+1} = x \mid X_n = x_n, X_{n-1} = x_{n-1}, \dots, X_0 = x_0) = P(X_{n+1} = x \mid X_n = x_n)$
- In other words, the probability distribution for the next state depends solely on the current state
- If the above property is true for a stochastic process, then we can say that $\{X_n\}$ is a Markov process
- The analogous condition for continuous time is the following: $P(X(t_{n+1}) = x \mid X(t_n) = x_n, \dots, X(t_0) = x_0) = P(X(t_{n+1}) = x \mid X(t_n) = x_n)$ for all $t_0 < t_1 < \dots < t_{n+1}$
Why use a Markov Chain?
- If we use a Markov Chain, we don't need to keep track of a bunch of dependent variables with respect to some independent variable
- Instead, we only need to keep track of the history of one independent variable (and the index, which is typically a date)
- Then, we can calculate the conditional probabilities for each state (or unique observation) given its previous state (or previous observation)
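For example, here is a minimal sketch (with a made-up history) that estimates these conditional probabilities by counting transitions in the history of a single variable:

```python
from collections import Counter, defaultdict

# Hypothetical history of one variable, indexed by date (oldest first)
history = ["sunny", "sunny", "rainy", "sunny", "rainy", "rainy", "sunny"]

# Count transitions: for each observation, look only at the one before it
transitions = defaultdict(Counter)
for prev, curr in zip(history, history[1:]):
    transitions[prev][curr] += 1

# Normalize the counts into conditional probabilities P(next | current)
probs = {
    prev: {curr: n / sum(nexts.values()) for curr, n in nexts.items()}
    for prev, nexts in transitions.items()
}
print(probs)
# {'sunny': {'sunny': 0.333..., 'rainy': 0.666...},
#  'rainy': {'sunny': 0.666..., 'rainy': 0.333...}}
```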