Linear Discriminant Analysis (LDA)
LDA (Linear Discriminant Analysis) has a clear goal: maximize the separability of the labeled classes while reducing the number of dimensions.
To better understand how LDA works, we can compare it with PCA:
PCA is unsupervised learning, while LDA is supervised: you need to provide data labels to LDA.
PCA looks for a new dimension that maximizes the spread of the data along that dimension. LDA looks for a new dimension that maximizes the distance between classes while minimizing the variance within each class.
For 2 classes, the distance is the difference between the means of the two classes.
For 3 or more classes, LDA finds the central point of all the data, then measures the distance between each class's mean and that central point, as in the sketch after this comparison.
LDA can lose more data variance than PCA, as its primary focus is class separation rather than preserving overall data variance.
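The same idea can be written down directly with NumPy. The sketch below uses a tiny made-up 3-class dataset (not the bank campaign data) and, assuming the classical Fisher formulation, builds the between-class scatter from each class mean's distance to the central point and the within-class scatter from each class's own spread; the discriminant directions then come from an eigendecomposition.

```python
import numpy as np

# Tiny synthetic 2-feature dataset with 3 classes (values are made up for illustration).
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],   # class 0
              [4.0, 4.5], [4.2, 4.1], [3.8, 4.4],   # class 1
              [7.0, 1.0], [7.3, 1.2], [6.7, 0.9]])  # class 2
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

overall_mean = X.mean(axis=0)   # the "central point" of all the data
S_B = np.zeros((2, 2))          # between-class scatter: class means vs. the central point
S_W = np.zeros((2, 2))          # within-class scatter: variance inside each class

for c in np.unique(y):
    Xc = X[y == c]
    mean_c = Xc.mean(axis=0)
    diff = (mean_c - overall_mean).reshape(-1, 1)
    S_B += len(Xc) * diff @ diff.T
    S_W += (Xc - mean_c).T @ (Xc - mean_c)

# LDA's directions are the leading eigenvectors of inv(S_W) @ S_B.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
order = np.argsort(eigvals.real)[::-1]
print("discriminant directions (columns):\n", eigvecs[:, order].real)
```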
The dimensionality of LDA's output is min(number of input features, number of classes - 1). Since our bank campaign data has 2 classes, it will be reduced to 1 dimension after applying LDA. Let's look at how LDA separates the 2 classes:
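As a concrete illustration, here is a minimal sketch that runs scikit-learn's LinearDiscriminantAnalysis next to PCA. The X and y below are a random stand-in for the bank campaign features and labels (which are not shown here), so the numbers are only illustrative; the point is the required labels and the 1-dimensional output.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# X, y stand in for the preprocessed bank campaign features and labels (not shown here);
# a small random 2-class dataset is generated so the sketch runs on its own.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 5)),
               rng.normal(1.5, 1.0, (100, 5))])
y = np.array([0] * 100 + [1] * 100)

# With 2 classes, LDA can produce at most min(5, 2 - 1) = 1 component.
lda = LinearDiscriminantAnalysis(n_components=1)
X_lda = lda.fit_transform(X, y)          # labels are required: LDA is supervised

pca = PCA(n_components=1)
X_pca = pca.fit_transform(X)             # no labels: PCA is unsupervised

print("LDA output shape:", X_lda.shape)  # (200, 1)
print("PCA output shape:", X_pca.shape)  # (200, 1)
```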
