Data Science

Linear Discriminant Analysis (LDA) is a supervised classification technique that can be solved using either SVD (i.e. a dimensionality reduction view) or Bayes' theorem (i.e. a probabilistic view)
If we assume LDA uses dimensionality reduction when predicting an observation's class, then LDA involves mapping the data from a high dimensional space to a lower dimensional space
The data is transformed to a lower dimensional space by finding the axes that maximize the separability between classes (in the lower dimensional space)
Said another way, LDA maps the data from a high dimensional space to a lower dimensional space by performing a linear transformation on the data in its original form (i.e. in the high dimensional space)
More specifically, the linear transformation is a change of basis (computed using SVD) onto the axes that best separate the classes
LDA uses linear decision boundaries to determine the class of an observation in the newly mapped space
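One common way to make "the axes that best separate the classes" precise is Fisher's criterion, sketched below. The notation is introduced here for illustration and is not from the notes above: $w$ is a candidate axis, $c$ indexes the $k$ classes, $n_c$ and $\mu_c$ are the size and mean of class $c$, and $\mu$ is the overall mean.

```latex
J(w) = \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w},
\qquad
S_B = \sum_{c=1}^{k} n_c (\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
S_W = \sum_{c=1}^{k} \sum_{i \,:\, y_i = c} (x_i - \mu_c)(x_i - \mu_c)^{\top}
```

An axis with a large $J(w)$ spreads the class means far apart (numerator) relative to the scatter of observations around their own class mean (denominator).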

Observations within each class are drawn from a multivariate Gaussian distribution
Each class has its own mean vector, but all classes are assumed to share a common variance/covariance matrix
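Under these two assumptions the class of an observation can be chosen by comparing one linear score per class. The following is a minimal NumPy sketch, not a reference implementation: the toy data, the function names `fit_lda` and `discriminant_scores`, and the use of the pooled covariance estimate are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two classes with different means and the same covariance.
shared_cov = np.array([[1.0, 0.3], [0.3, 1.0]])
X0 = rng.multivariate_normal([0.0, 0.0], shared_cov, size=200)
X1 = rng.multivariate_normal([3.0, 3.0], shared_cov, size=200)
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)


def fit_lda(X, y):
    """Estimate class priors, class means, and the pooled covariance matrix."""
    classes = np.unique(y)
    n, p = X.shape
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    pooled = np.zeros((p, p))
    for c, mu in zip(classes, means):
        centered = X[y == c] - mu
        pooled += centered.T @ centered
    pooled /= n - len(classes)
    return classes, priors, means, pooled


def discriminant_scores(X, priors, means, pooled):
    """Score for class c: x' S^-1 mu_c - 0.5 mu_c' S^-1 mu_c + log(prior_c)."""
    solved = np.linalg.solve(pooled, means.T)   # S^-1 mu_c for every class, shape (p, n_classes)
    linear = X @ solved                         # shape (n, n_classes)
    constant = -0.5 * np.sum(means.T * solved, axis=0) + np.log(priors)
    return linear + constant


classes, priors, means, pooled = fit_lda(X, y)
scores = discriminant_scores(X, priors, means, pooled)
predicted = classes[np.argmax(scores, axis=1)]   # pick the class with the largest score
accuracy = np.mean(predicted == y)
print(f"training accuracy: {accuracy:.3f}")
```

Because every class uses the same covariance matrix, the quadratic term in $x$ cancels when two scores are compared, which is why the resulting decision boundaries are linear.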

Perform a change of basis on the data that finds the axes that best separate the classes
Obtain coefficients for up to $k-1$ linear discriminants (LDA axes), where $k$ is the number of classes in the response
Use these coefficients to map the data onto the new vector space (LDA axes)
Determine the class for an observation by observing where the mapped observation lands with respect to the (linear) decision boundaries
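The steps above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the exact procedure of any particular library: the toy data is made up, the axes are found with an eigendecomposition of $S_W^{-1} S_B$ rather than SVD, the classes are assumed to have equal priors, and the final step assigns each observation to the nearest projected class mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data: 3 classes in 4 dimensions with identity covariance.
true_means = np.array([[0, 0, 0, 0], [4, 0, 0, 0], [0, 4, 0, 0]], dtype=float)
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(100, 4)) for m in true_means])
y = np.repeat(np.arange(3), 100)

classes = np.unique(y)
overall_mean = X.mean(axis=0)
class_means = np.array([X[y == c].mean(axis=0) for c in classes])

# Step 1: within-class (S_W) and between-class (S_B) scatter matrices.
p = X.shape[1]
S_W = np.zeros((p, p))
S_B = np.zeros((p, p))
for c, mu in zip(classes, class_means):
    centered = X[y == c] - mu
    S_W += centered.T @ centered
    diff = (mu - overall_mean).reshape(-1, 1)
    S_B += np.sum(y == c) * (diff @ diff.T)

# Step 2: the discriminant axes are the leading eigenvectors of S_W^-1 S_B.
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
order = np.argsort(eigvals.real)[::-1]
n_axes = len(classes) - 1                       # at most k - 1 axes
W = eigvecs.real[:, order[:n_axes]]             # coefficients, shape (p, k - 1)

# Rescale each axis so the pooled within-class variance along it equals 1.
pooled_cov = S_W / (len(X) - len(classes))
W = W / np.sqrt(np.sum(W * (pooled_cov @ W), axis=0))

# Step 3: map the observations and the class means onto the LDA axes.
Z = X @ W
Z_means = class_means @ W

# Step 4: assign each observation to the nearest projected class mean
# (this gives linear decision boundaries; equal priors are assumed).
distances = np.linalg.norm(Z[:, None, :] - Z_means[None, :, :], axis=2)
predicted = classes[np.argmin(distances, axis=1)]
accuracy = np.mean(predicted == y)
print(f"training accuracy: {accuracy:.3f}")
```

With 3 classes the 4-dimensional data is reduced to 2 discriminant axes, so `Z` can be plotted directly to inspect how well the classes separate.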

Check assumptions
- Gaussian distribution - Use log or root transformations to bring skewed predictors closer to a Gaussian distribution
- Same variance - Standardize the data so predictors are on a comparable scale, then check that the variance/covariance is similar across classes
Possibly remove outliers
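A small sketch of the transformations mentioned above, using a made-up right-skewed predictor. The helper names `skewness` and `standardize` are hypothetical and written out by hand so the example only needs NumPy.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical right-skewed predictor drawn from a log-normal distribution.
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=5000)


def skewness(values):
    """Sample skewness: the third standardized moment (about 0 for a Gaussian)."""
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.std(values) ** 3


def standardize(values):
    """Rescale to mean 0 and standard deviation 1."""
    return (values - values.mean()) / values.std()


log_transformed = np.log(skewed)      # only valid for strictly positive values
root_transformed = np.sqrt(skewed)    # only valid for non-negative values
standardized = standardize(log_transformed)

print(f"skewness raw:  {skewness(skewed):.2f}")
print(f"skewness sqrt: {skewness(root_transformed):.2f}")
print(f"skewness log:  {skewness(log_transformed):.2f}")
```

Standardizing only rescales each predictor. Whether the covariance is actually similar across classes still has to be checked separately, for example by comparing per-class covariance estimates.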

Logistic regression is generally the preferred method of classification when the response has exactly 2 classes
Linear discriminant analysis is often the preferred method of classification when the response has more than 2 classes
Logistic regression parameter estimates can become poor/unstable when the two classes are well-separated, whereas LDA does not suffer from this
Logistic regression parameter estimates can become poor/unstable if the sample size is small, whereas LDA is more stable (assuming normally distributed predictors)
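The instability with well-separated classes can be illustrated with a tiny made-up data set. In this sketch (an assumption-laden toy, with a hand-written gradient ascent and a slope-only model rather than any library's fitting routine), the unregularized logistic regression slope keeps growing the longer the optimizer runs, while the LDA slope is a fixed, finite number.

```python
import numpy as np

# Hypothetical 1-D data in which the two classes are perfectly separated at x = 0.
x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 0, 1, 1, 1])


def fit_logistic_slope(x, y, n_steps, learning_rate=0.5):
    """Unregularized logistic regression (slope only) via gradient ascent on the log-likelihood."""
    beta = 0.0
    for _ in range(n_steps):
        prob = 1.0 / (1.0 + np.exp(-beta * x))
        beta += learning_rate * np.sum((y - prob) * x)
    return beta


# The likelihood has no finite maximizer here, so the slope never settles.
step_counts = (10, 100, 1000, 10000)
slopes = [fit_logistic_slope(x, y, n) for n in step_counts]
for n, slope in zip(step_counts, slopes):
    print(f"logistic slope after {n:>5} steps: {slope:.2f}")

# LDA slope for one predictor: difference in class means over the pooled variance.
mean0, mean1 = x[y == 0].mean(), x[y == 1].mean()
within_ss = np.sum((x[y == 0] - mean0) ** 2) + np.sum((x[y == 1] - mean1) ** 2)
pooled_var = within_ss / (len(x) - 2)
lda_slope = (mean1 - mean0) / pooled_var
print(f"LDA slope: {lda_slope:.2f}")
```

Library implementations of logistic regression often add a penalty term by default, which hides this behaviour, so the effect is easiest to see with an unpenalized fit like the one above.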

Logistic regression and LDA both use MLE for parameter estimation (or Bayesian techniques)
Logistic regression involves directly modeling $Pr(Y=1|X=x)$ using the logistic function
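LDA involves indirectly modeling $Pr(Y=k|X=x)$: it models the distribution of $X$ within each class and then uses Bayes' theorem to obtain $Pr(Y=k|X=x)$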
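For reference, the two posterior probabilities can be written out as follows. The symbols are assumptions introduced here and do not appear in the notes above: $\beta_0, \beta_1$ are logistic regression coefficients (single predictor shown), $\pi_k$ is the prior probability of class $k$, $f_k$ is the Gaussian density of $X$ within class $k$ with mean $\mu_k$ and shared covariance $\Sigma$, and $p$ is the number of predictors. Note that $\pi$ inside $(2\pi)^{p/2}$ is the usual constant, not a prior.

```latex
% Logistic regression: the posterior is modeled directly
\Pr(Y = 1 \mid X = x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}

% LDA: the posterior is obtained from the class densities via Bayes' theorem
\Pr(Y = k \mid X = x) = \frac{\pi_k \, f_k(x)}{\sum_{j} \pi_j \, f_j(x)},
\qquad
f_k(x) = \frac{1}{(2\pi)^{p/2} \, |\Sigma|^{1/2}}
\exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma^{-1} (x - \mu_k) \right)
```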

PCA is an unsupervised learning method that involves performing linear transformations (dimensionality reduction) on the data to find the directions that capture the most variability, without using the class labels
LDA is a supervised learning method that involves performing linear transformations (dimensionality reduction) on the data to maximize the spread between classes relative to the spread within classes
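The difference shows up clearly on data where the direction of greatest variance is not the direction that separates the classes. The sketch below uses made-up two-class data and a hypothetical helper `separation`; the LDA axis is computed with the two-class formula $S_W^{-1}(\mu_1 - \mu_0)$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical toy data: large shared variance along the first coordinate,
# while the classes differ only along the second coordinate.
n = 300
class0 = np.column_stack([rng.normal(0, 5, n), rng.normal(-1, 0.5, n)])
class1 = np.column_stack([rng.normal(0, 5, n), rng.normal(1, 0.5, n)])
X = np.vstack([class0, class1])

# PCA: leading eigenvector of the overall covariance matrix (labels are ignored).
cov = np.cov(X - X.mean(axis=0), rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
pca_axis = eigvecs[:, np.argmax(eigvals)]

# LDA (two classes): direction S_W^-1 (mu_1 - mu_0), which uses the labels.
mu0, mu1 = class0.mean(axis=0), class1.mean(axis=0)
S_W = (class0 - mu0).T @ (class0 - mu0) + (class1 - mu1).T @ (class1 - mu1)
lda_axis = np.linalg.solve(S_W, mu1 - mu0)
lda_axis /= np.linalg.norm(lda_axis)


def separation(axis):
    """Gap between projected class means, in pooled within-class standard deviations."""
    z0, z1 = class0 @ axis, class1 @ axis
    pooled_sd = np.sqrt((z0.var(ddof=1) + z1.var(ddof=1)) / 2)
    return abs(z1.mean() - z0.mean()) / pooled_sd


print("PCA axis:", np.round(pca_axis, 2), "separation:", round(separation(pca_axis), 2))
print("LDA axis:", np.round(lda_axis, 2), "separation:", round(separation(lda_axis), 2))
```

Here the PCA axis follows the high-variance first coordinate, where the classes overlap almost completely, while the LDA axis follows the second coordinate, where they are well separated.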

Logistic Regression

Quadratic Discriminant Analysis