Data Science

A data generating process is the true, underlying phenomenon that is creating the data
A mathematical model is the (often imperfect) attempt to describe and emulate this phenomenon
A mathematical model is represented as a function with adjustable parameters
We can think of a mathematical model as a shower head that is controlled by a bunch of knobs, which are the model's parameters
For example, the angle of the shower head is one parameter that controls the location of droplets on the floor
There is another knob that controls the spread of the spray
This shower head can be thought of as our normal probability density function, where the location knob is $\mu$ and the spread knob is $\sigma$
The bathroom shower head is just one device for generating a pattern of random water droplets
There are others, such as different types of lawn sprinklers
Each type of sprinkler generates a different pattern of droplets, and each type of sprinkler has different control knobs
Different mathematical models of data are like different shower heads or sprinklers, each with their own control knobs
Each mathematical model generates a particular type of data, and each mathematical model has particular knobs – called parameters – that control the specific details of the pattern of data
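As a minimal sketch of the shower head analogy, the snippet below treats a normal distribution as the device and $\mu$ and $\sigma$ as its two knobs
The function name `spray`, the knob settings, and the droplet counts are hypothetical choices made only for illustration

```python
import random
import statistics

def spray(mu, sigma, n_droplets, seed=0):
    """Generate droplet positions from a normal 'shower head'.

    mu is the location knob and sigma is the spread knob.
    """
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n_droplets)]

# Same device, three different knob settings (illustrative values).
centered = spray(mu=0.0, sigma=1.0, n_droplets=10_000)
shifted = spray(mu=3.0, sigma=1.0, n_droplets=10_000)  # turn the location knob
wide = spray(mu=0.0, sigma=2.5, n_droplets=10_000)     # turn the spread knob

for name, drops in [("centered", centered), ("shifted", shifted), ("wide", wide)]:
    print(name, round(statistics.mean(drops), 2), round(statistics.stdev(drops), 2))
```

Turning the location knob moves the average droplet position, while turning the spread knob changes how far droplets scatter around it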

Previously, we estimated probabilities using frequentist and Bayesian methods
A probability is simply one kind of parameter of a mathematical model
Therefore, the frequentist and Bayesian methods used for estimating probabilities can also be used to estimate other parameters
In other words, we can use those same frequentist and Bayesian parameter estimation techniques, such as MLE and simulations, to estimate the parameters $\mu$ and $\sigma$ for a normal distribution
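A minimal sketch of MLE for a normal distribution follows, using the standard closed-form result: the estimate of $\mu$ is the sample mean, and the estimate of $\sigma$ is the square root of the average squared deviation (dividing by $n$, not $n - 1$)
The helper name `normal_mle`, the "true" parameter values, and the sample size are assumptions made for the example

```python
import math
import random

def normal_mle(data):
    """Closed-form maximum likelihood estimates of mu and sigma for normal data."""
    n = len(data)
    mu_hat = sum(data) / n
    # The MLE of sigma divides by n, so it is slightly biased in small samples.
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)
    return mu_hat, sigma_hat

# Simulate a data generating process with known knobs, then try to recover them.
rng = random.Random(42)
true_mu, true_sigma = 5.0, 2.0
sample = [rng.gauss(true_mu, true_sigma) for _ in range(5_000)]

mu_hat, sigma_hat = normal_mle(sample)
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```

With a large sample, both estimates should land close to the values used to generate the data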

In Bayesian statistics, we typically follow a general process when estimating parameters:
1. Start with a set of possible parameter values in a model, with initial credibilities of those parameter values
2. Gather data that makes some parameter values more or less credible
3. Re-allocate credibility to the parameter values that are more consistent with the data, and re-allocate credibility away from parameter values that are less consistent with the data
One attraction of Bayesian methods is that the posterior distribution inherently reveals the uncertainty of the estimated parameter value
When the posterior distribution is wide, the estimate is uncertain
When the posterior distribution is narrow, the estimate is more certain
In the Bayesian framework, uncertainty is inherently represented by the posterior distribution over the parameters
In the frequentist framework, there is no such representation, and so confidence intervals must be used to represent uncertainty
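The three steps above, and the link between posterior width and uncertainty, can be sketched with a grid approximation
This is an illustration under stated assumptions rather than a recommended workflow: $\sigma$ is treated as known, the prior over $\mu$ is uniform on a hypothetical grid, and the names `grid_posterior` and `posterior_sd` are made up for the example

```python
import math
import random

def grid_posterior(data, grid, sigma):
    """Posterior over candidate mu values, assuming a uniform prior and known sigma."""
    # Step 1: every grid value starts with equal credibility (uniform prior).
    # Steps 2 and 3: the log-likelihood of the data re-allocates that credibility.
    log_post = [sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data) for mu in grid]
    peak = max(log_post)  # subtract the peak before exponentiating to avoid underflow
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def posterior_sd(grid, probs):
    """Standard deviation of the posterior, used as a simple measure of its width."""
    mean = sum(m * p for m, p in zip(grid, probs))
    return math.sqrt(sum(p * (m - mean) ** 2 for m, p in zip(grid, probs)))

rng = random.Random(1)
sigma = 2.0
grid = [i / 100 for i in range(-500, 1501)]  # candidate mu values from -5 to 15
data = [rng.gauss(5.0, sigma) for _ in range(200)]

small = grid_posterior(data[:10], grid, sigma)  # posterior after 10 observations
large = grid_posterior(data, grid, sigma)       # posterior after 200 observations

print("posterior sd, n=10: ", round(posterior_sd(grid, small), 3))
print("posterior sd, n=200:", round(posterior_sd(grid, large), 3))
```

The posterior after 200 observations is narrower than the one after 10, which is how the posterior itself reports that the estimate has become more certain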

We should use Bayesian analysis if we're asking what parameter values and models are most credible given the data
We should use frequentist analysis if we're asking about error rates for imaginary data from hypothetical worlds
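To make "error rates for imaginary data from hypothetical worlds" concrete, the sketch below simulates many hypothetical datasets from one fixed parameter value and counts how often a 95% confidence interval for the mean contains it
It assumes $\sigma$ is known, so a z-interval applies, and the function name `ci_coverage` and all numeric settings are illustrative

```python
import math
import random

def ci_coverage(true_mu, sigma, n, n_worlds, seed=0):
    """Fraction of simulated datasets whose 95% z-interval contains true_mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)  # known-sigma interval half-width
    hits = 0
    for _ in range(n_worlds):
        # Each iteration is one hypothetical world with its own imaginary dataset.
        sample_mean = sum(rng.gauss(true_mu, sigma) for _ in range(n)) / n
        if abs(sample_mean - true_mu) <= half_width:
            hits += 1
    return hits / n_worlds

coverage = ci_coverage(true_mu=5.0, sigma=2.0, n=25, n_worlds=4_000)
print(f"coverage = {coverage:.3f}, error rate = {1 - coverage:.3f}")
```

The coverage should come out near 0.95, which is a statement about the procedure across repeated datasets rather than about the credibility of any single parameter value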

It is often, and incorrectly, said that parameters are treated as fixed by frequentists but random by Bayesians
However, frequentists and Bayesians both accept that a parameter may have been fixed from the start or may have been generated by a physically random mechanism
In either case, both suppose it has taken on some fixed value
The Bayesian uses formal probability models to express personal uncertainty about that value, whereas the frequentist uses confidence intervals to express uncertainty about that value
Randomness in our model expresses our personal uncertainty about the parameter's value
That randomness is not a property of the parameter itself, although we hope it accurately reflects properties of the mechanisms that produced the parameter

Philosophy behind Probability

Random Variables

Bayesian and Frequentist Estimation