Linear Discriminant Analysis (LDA)

What is Linear Discriminant Analysis?

Linear discriminant analysis (LDA), also known as normal discriminant analysis (NDA) or discriminant function analysis (DFA), is a powerful dimensionality reduction technique widely used in machine learning and statistics. LDA enhances classification accuracy by identifying the optimal linear combinations of features that separate different classes within a dataset. By reducing complexity while preserving critical class distinctions, LDA improves model performance in applications such as pattern recognition, face recognition, and text classification.

[Figure: two plots of circles and triangles; the plot on the right includes a diagonal line separating the two groups, illustrating LDA.]

The Basic Idea

After many years of running the show at the family restaurant, your pizzeria has become a local favorite. Like any restaurateur, you're always trying to find new customers. Though (almost) everyone likes pizza, you want to definitively answer the question: “Which type of customer is most likely to eat my pizzas?” With participation from both regulars and first-time pizza eaters, you begin asking customers to complete a simple survey about themselves. With hundreds of customer data features collected, you are in search of the right analysis to discover what type of person truly wants to buy your pizza. In this scenario, linear discriminant analysis may be the technique for you.

Linear discriminant analysis (LDA), also referred to as normal discriminant analysis (NDA) or discriminant function analysis (DFA), is a popular technique in machine learning and statistics used to reduce the number of dimensions in a dataset while maintaining the ability to distinguish between different classes.1 The main goal of LDA is to find the linear combinations of features that best separate two or more classes in the data. Unlike other dimensionality reduction methods such as principal component analysis (PCA), which focuses on maximizing overall variance,1 LDA aims to maximize how well classes can be separated by a linear projection.

LDA is a generative model, meaning it estimates how the data is distributed within each class. Using Bayes' theorem, it calculates the probability that a new data point belongs to each class and assigns it accordingly.2 In other words, LDA works with conditional probabilities: the probability that one event occurs given that another has already occurred. If we were to use an LDA algorithm at our pizzeria, the application of Bayes’ theorem would help us check whether our assumptions about likely customer types are accurate or not.
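
To make this concrete, here is a minimal sketch of LDA used as a Bayes-based classifier with scikit-learn. The customer features (weekly visits and average spend) and the class labels are made up purely for illustration; the calls shown are standard scikit-learn, not anything specific to this article.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical customer data: [weekly visits, average spend per visit].
X = np.array([[1, 12.0], [2, 15.0], [1, 10.0],   # occasional diners (class 0)
              [4, 22.0], [5, 25.0], [6, 21.0]])  # regulars (class 1)
y = np.array([0, 0, 0, 1, 1, 1])

lda = LinearDiscriminantAnalysis().fit(X, y)

new_customer = np.array([[3, 18.0]])
# Posterior probabilities P(class | features), obtained via Bayes' theorem from
# the fitted class-conditional Gaussians and the class priors.
print(lda.predict_proba(new_customer))
print(lda.predict(new_customer))
```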

In practice, linear discriminant analysis finds a linear combination of characteristics that separates two or more types of objects or events. As a dimensionality reduction method, LDA simplifies complex datasets by transforming data with multiple features (dimensions) into a lower-dimensional space. It does this while preserving the ability to distinguish between different classes, making classification more efficient and reducing computational complexity. Since LDA is adept at reducing dimensions, it can be applied to multi-class classification problems, in contrast to methods like logistic regression, which in its standard form handles only binary classification.2 Due to its versatile nature, it is common to use LDA as a means to improve the abilities of other classification algorithms, such as decision trees.

LDA vs. PCA

It can be a bit tricky to understand how LDA is distinct from a similar approach called principal component analysis (PCA), so let’s stick with the pizza example to make sense of these analyses.3 PCA pays no attention to class labels: it would compress the customer survey into the directions where responses vary the most overall, whether or not those directions tell us anything about who actually buys pizza. LDA, by contrast, is supervised. It uses the labels (say, repeat customers versus one-time visitors) and keeps the directions that best separate those groups, even if they account for less of the overall variance.
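
Here is a minimal sketch of that contrast using scikit-learn. The well-known Iris dataset simply stands in for our pizza survey, and the two-component setting is an arbitrary choice for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# PCA ignores the labels: it keeps the directions of maximum overall variance.
X_pca = PCA(n_components=2).fit_transform(X)

# LDA uses the labels: it keeps the directions that best separate the classes.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X_pca.shape, X_lda.shape)  # both (150, 2), but the axes mean different things
```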

Eigenvectors and Eigenvalues

The goal of LDA, especially when we interpret it as a technique for reducing dimensions, is to separate the data along straight lines (linear directions). In the math of linear algebra, this is accomplished using eigenvectors and eigenvalues.2 To understand this, let’s return to our pizza example. Once you’ve collected data on your customers, it's not as simple as picking pepperoni versus margherita: the data is all over the place, and you need a scatterplot to sort it out. Eigenvectors give us the directions of this scatterplot, indicating the directions along which the data is best separated, whereas eigenvalues tell us how important each of those directions is. The higher the eigenvalue, the more discriminative its corresponding eigenvector.

When we conduct an LDA, these eigenvector calculations are based on two scatter matrices computed from the collected data (a small numerical sketch follows the list below):2

  1. Within-class scatter matrix: Represents how spread out, or varied, the data points are within each class. LDA tries to minimize this scatter to ensure that points within the same class remain close together. 
  2. Between-class scatter matrix: Represents the variation between different class means. LDA tries to maximize this scatter to push class means farther apart, improving separation.
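
As a rough illustration, here is how the two scatter matrices can be computed with NumPy. The two-feature, two-class dataset is made up for the example; real survey data would have many more features and rows.

```python
import numpy as np

# Made-up data: two classes, two features each.
X0 = np.array([[1.0, 2.0], [1.5, 1.8], [0.8, 2.2]])   # class 0
X1 = np.array([[4.0, 4.5], [4.2, 4.8], [3.8, 4.2]])   # class 1

mean0, mean1 = X0.mean(axis=0), X1.mean(axis=0)
overall_mean = np.vstack([X0, X1]).mean(axis=0)

# Within-class scatter: spread of points around their own class mean (to minimize).
S_W = (X0 - mean0).T @ (X0 - mean0) + (X1 - mean1).T @ (X1 - mean1)

# Between-class scatter: spread of the class means around the overall mean (to maximize).
S_B = (len(X0) * np.outer(mean0 - overall_mean, mean0 - overall_mean)
       + len(X1) * np.outer(mean1 - overall_mean, mean1 - overall_mean))

print("Within-class scatter:\n", S_W)
print("Between-class scatter:\n", S_B)
```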

How do you prepare to conduct an LDA?

The raw survey data won’t give us the answers we are looking for. If we really want to figure out who is buying our pizza, we need to sort the data first. Here are some best practices to follow before conducting a linear discriminant analysis:2

  1. Preprocess the data so it is approximately normal and centered: LDA assumes that the data follows a normal distribution, and mean-centering the data helps compute the scatter matrices correctly.
  2. Pick the right number of dimensions for the lower-dimensional space: Choose the number of discriminants by keeping the most informative eigenvalues or by testing performance in lower dimensions. We will come back to this “lower” dimensional space when we actually do the LDA.
  3. Regularize the chosen model: Regularization helps avoid overfitting, which is when a statistical model fits its training data so closely that it fails to generalize to new data or make reliable predictions.
  4. Apply cross-validation to assess how well the model is working (see the sketch after this list): One way to assess classifiers is with a confusion matrix, which checks whether a classifier is getting confused about the classes and naming one incorrectly as another. We know not everyone likes Hawaiian pizza.
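
The sketch below strings these practices together with scikit-learn. The Wine dataset is just a stand-in for our survey data, and the specific choices (standardization for centering and scaling, shrinkage as the regularizer, five-fold cross-validation) are illustrative assumptions rather than fixed rules.

```python
from sklearn.datasets import load_wine
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score, cross_val_predict
from sklearn.metrics import confusion_matrix

X, y = load_wine(return_X_y=True)

# Center and scale the features, then fit a regularized LDA.
# shrinkage="auto" regularizes the covariance estimate (needs the lsqr or eigen solver).
model = make_pipeline(
    StandardScaler(),
    LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"),
)

# Cross-validation gives an estimate of accuracy on unseen data.
scores = cross_val_score(model, X, y, cv=5)
print("Cross-validated accuracy:", scores.mean())

# Confusion matrix: rows are true classes, columns are predicted classes.
y_pred = cross_val_predict(model, X, y, cv=5)
print(confusion_matrix(y, y_pred))
```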

How does LDA actually work?

When we apply an LDA, the data is projected onto a lower-dimensional space in a way that maximizes the separation between classes. This happens when LDA identifies a set of linear discriminants that maximize between-class variance relative to within-class variance.1 A simpler way to understand this is that LDA discovers the directions that best separate the various classes in the data.

There are three key steps of linear discriminant analysis from a computational perspective (a numerical sketch follows the list):2

  1. Find the between-class variance: How separate the classes are, also known as the distance between the class means.
  2. Find the within-class variance: The distance between class means and individual samples. 
  3. Project the data into a lower-dimensional space: Construct the lower-dimensional space so that between-class variance is maximized and within-class variance is minimized.
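
Here is a compact NumPy sketch of those three steps on synthetic two-class data (the data and the choice of a single discriminant direction are assumptions for illustration). A library implementation such as scikit-learn's LinearDiscriminantAnalysis performs the same computation behind the scenes.

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 1.0, size=(50, 2))   # class 0
X1 = rng.normal([3.0, 3.0], 1.0, size=(50, 2))   # class 1

m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
m = np.vstack([X0, X1]).mean(axis=0)

# Step 1: between-class variance (scatter of the class means around the overall mean).
S_B = len(X0) * np.outer(m0 - m, m0 - m) + len(X1) * np.outer(m1 - m, m1 - m)

# Step 2: within-class variance (scatter of samples around their own class mean).
S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)

# Step 3: the directions maximizing between-class relative to within-class variance
# are the top eigenvectors of inv(S_W) @ S_B; project onto the leading one.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
w = eigvecs[:, np.argmax(eigvals.real)].real     # leading discriminant direction

X_proj = np.vstack([X0, X1]) @ w                 # 1-D projection of all samples
print(X_proj.shape)                              # (100,)
```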

What contexts is LDA used in?

Linear discriminant analysis (LDA) is widely used in various fields due to its ability to simplify complex datasets while preserving class separability. Common applications include facial recognition, where LDA helps identify individuals by distinguishing facial features, and medical diagnostics, where it is used to classify disease states based on patient data. LDA is also employed in marketing to segment customers based on purchasing behavior and in finance for credit scoring, helping to predict whether individuals are likely to default on loans. Its versatility in classification tasks makes it an essential tool in many industries.

“To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.”


— Ronald A. Fisher, statistician and creator of linear discriminant analysis

Key Terms

Dimensionality Reduction: The process of reducing the number of features in a dataset while maintaining its essential information.1 LDA achieves this by finding linear combinations of features that best separate the data into different classes.

Principal Component Analysis (PCA): A statistical technique that can simplify complex datasets by reducing their dimensions while retaining critical information.3 By identifying the key patterns or “principal components” that explain the maximum variance in the data, PCA helps uncover underlying structures and streamline decision-making processes, making it especially valuable for behavioral insights.

Class Separability: The ability to distinguish between different classes in a dataset. LDA seeks to maximize this separability by projecting data onto a lower-dimensional space where the distance between class means is as large as possible.

Bayes’ Theorem: A theory that underpins the framework of linear discriminant analysis (LDA), enabling it to classify data points by calculating the likelihood of a given observation belonging to each class. By combining prior probabilities and the evidence from feature distributions, LDA uses Bayes' theorem to find the decision boundaries that best separate groups within the data.

Logistic Regression:  A statistical method that can predict the probability of a binary outcome, such as whether a user will take a specific action or not, based on input features. By modeling the relationship between these features and the likelihood of an outcome through a sigmoid function, it provides actionable insights for decision-making in behavioral science and predictive analytics.

Eigenvalues and Eigenvectors: Essential elements of LDA used to identify the directions (eigenvectors) that maximize the class separability.2 The corresponding eigenvalues determine the importance of each direction in the projection.

Discriminant Function: A mathematical formula that classifies data points into distinct groups by maximizing the separation between categories. In behavioral science, it helps identify patterns in complex data, enabling researchers to predict group membership, such as classifying user behaviors or decision-making styles, based on measurable features.

Fisher Linear Discriminant (FLD):  The foundation of linear discriminant analysis, focused on finding the optimal projection that maximizes the separation between classes of data. By choosing the axis that preserves the most meaningful differences while minimizing overlap, FLD enables LDA to classify behavioral data with precision and uncover unique decision-making patterns.

Multiple Discriminant Analysis: An extension of LDA, used when classifying data into more than two groups. By identifying multiple discriminant functions, it enables behavioral scientists to analyze complex decision-making patterns across multiple categories.

Deep Learning: A subset of machine learning that uses artificial neural networks to model complex patterns and relationships in large datasets. By mimicking the structure of the human brain, it enables behavioral scientists to analyze nuanced decision-making processes and predict outcomes with high accuracy.

Support Vector Machines (SVMs): A powerful classification technique that finds the optimal hyperplane to separate different classes in a dataset. SVMs focus on maximizing the margin between classes, regardless of whether the data is linearly separable. Unlike LDA, which assumes Gaussian (normal) distributions and equal variance across classes, SVMs are more flexible and can handle complex decision boundaries, making them useful for analyzing non-linear relationships in behavioral data.

Neural Networks: A class of machine learning models inspired by the brain’s structure, designed to recognize intricate patterns through layers of interconnected nodes. While LDA assumes linear decision boundaries, neural networks can model highly non-linear relationships—making them ideal for analyzing complex behavioral patterns and predicting outcomes in dynamic decision-making environments.

History

The foundations of linear discriminant analysis (LDA) date back to the 1930s, when the concept of discriminant functions began to emerge in statistics. At the time, British biologist and statistician Ronald Fisher was laying down the groundwork for LDA in his 1936 paper “The Use of Multiple Measurements in Taxonomic Problems.”4 While finding ways to sort taxonomies of plant species, Fisher introduced the Fisher linear discriminant (FLD) as a means to discover linear combinations of features that best separated different classes. 

While studying iris flowers, Fisher hoped to ascertain a simple and effective method that reduced dimensionality while preserving class separability. Fisher’s initial work involved separating only two classes of flowers—unlike LDA today, where the analysis can consider several classes at a time. This advancement in LDA began in the mid-1940s when the mathematician C.R. Rao began tackling multi-class problems. As Rao looked into discriminant analysis problems, he became curious if he could eliminate some variables of a problem while still maintaining the appropriate information to discriminate the classes.5 Upon the discovery that this was indeed possible, Rao introduced the multi-class version of LDA, called multiple discriminant analysis.6

During the 1960s and 1970s, linear discriminant analysis gained traction as statistical computing technology and methods advanced—including the richness of datasets and the diversity of fields it was applied to. Researchers further formalized Fisher's method, improving the mathematical framework for high-dimensional datasets. The 1980s saw an expansion of LDA’s application to a variety of fields, such as speech and pattern recognition, which featured large, complex datasets that could greatly benefit from efficient classification techniques. This period also marked the development of computational algorithms that allowed LDA to handle multi-class problems beyond Fisher's original binary classification approach.

By the 1990s, LDA became a cornerstone in machine learning and data analysis, especially in fields like bioinformatics, finance, and computer vision. Researchers refined LDA algorithms to improve robustness and performance with non-normal data distributions and higher-dimensional spaces. With the rise of big data and artificial intelligence in recent years, LDA plays a key part in dimensionality reduction, feature extraction, and classification tasks, evolving alongside more sophisticated machine learning models like deep learning. Today, LDA is popular in fields like healthcare for disease classification, finance for credit risk prediction, and marketing for customer segmentation. 

In the future, linear discriminant analysis will continue to complement more complex models by offering simplicity and interpretability. While it may not always outperform sophisticated methods like deep learning, LDA’s ability to reduce dimensionality and define decision boundaries will remain valuable, particularly in high-dimensional, sparse, or noisy data scenarios. As datasets grow larger and more complex as we dive further into the era of big data, LDA could be further refined, potentially to be used in combination with other methods such as support vector machines or neural networks to combine the strengths of simplicity and predictive power.

People

Ronald A. Fisher

A British biologist and statistician who is the founding figure behind linear discriminant analysis. Fisher is also known for introducing the Fisher linear discriminant (FLD) in 1936, a technique for finding linear combinations of features for class separation that laid the foundation for modern LDA.

C. R. Rao

An Indian statistician who made significant contributions to the statistical theory of multivariate analysis, such as advancements in LDA.5 Rao’s work on the generalization of Fisher's method expanded LDA's applicability to more complex datasets and higher dimensions.

Herman Wold

A Swedish economist and statistician, Wold advanced multivariate analysis techniques, influencing the development of LDA for practical use in fields like economics and social sciences. His work contributed to the refinement of algorithms for improving LDA's robustness in real-world applications.

Impacts

As a powerful tool for fields like machine learning and data analysis, linear discriminant analysis helps us classify data in simpler ways overall. There are a few key benefits to the method, including reducing dimensions, being more accurate with classification, and versatility in where LDA can be applied.

Fewer Dimensions in Machine Learning

LDA has had a significant impact on dimensionality reduction by enabling the transformation of high-dimensional data into a lower-dimensional space while maintaining class separability. This makes it easier to visualize and analyze complex datasets, improving the performance of machine learning models.7

With fewer dimensions, there is a clearer differentiation between the classes in a dataset. Machine learning models can then filter out what data we need vs. what we don’t in a more efficient way. When we have a dataset that feels impossible to digest, dimension reduction can go a long way, allowing us to look at our data without all the noise.

Better Classification Accuracy

Linear discriminant analysis can discover what features are the most discriminative in a given dataset, which results in more accurate classifications. As LDA can maximize how separate classes are, there is less risk that data is misclassified, leading to better predictive power.3

By creating clear decision boundaries between classes, LDA has enhanced the accuracy of classification tasks across various fields, such as healthcare, finance, and marketing. Its ability to maximize class separability makes it a reliable tool for improving prediction outcomes in these industries.

Using LDA Across Fields 

Linear discriminant analysis is known to be a versatile method with strengths in classification and dimension reduction. Some common applications may be in fields where classifying and reducing dimensions are necessary for data analysis and decision-making, including but not limited to facial recognition, medical diagnosis, biometrics, marketing and customer segmentation, and pattern recognition.

Let’s briefly compare how LDA could be used in two of these fields to appreciate its versatility. For instance, LDA can improve facial recognition algorithms by reducing the number of dimensions in facial images while keeping the information needed to distinguish from person to person. In another setting, LDA may be able to sort out if a patient is healthy or at risk for harm from a certain disease based on their medical history and features. 

Controversies

No analysis is without its flaws, and linear discriminant analysis is no exception. There are some key drawbacks to LDA, such as its assumption that the data follows a normal distribution, its sensitivity to outliers, and its restriction to linear decision boundaries.

Assuming the Normal Distribution is True

Linear discriminant analysis assumes that data from each class follows a Gaussian (normal) distribution, which may not hold true when performing statistical analysis on many real-world datasets. When the data significantly deviates from normality, LDA’s performance can degrade, leading to inaccurate classifications. The suitability of LDA depends on how the data is distributed in the first place; if the underlying distribution significantly deviates from normality, alternative methods—such as non-parametric approaches—may be more appropriate.

The world is complex, and LDA may be applied to many complex problems. Despite its wide range of applications, this becomes a challenge when the data follows non-normal distributions such as the gamma, binomial, or exponential. These non-normal distributions are common in health, education, and social science data, which underscores the importance of having other, non-LDA methods in our toolkit for such cases.8

Those Sensitive Outliers 

Linear discriminant analysis is highly sensitive to outliers, which can distort the scatter matrices and reduce its ability to separate classes effectively. In datasets with significant outliers, LDA might misclassify data points or fail to find optimal decision boundaries. Outliers can be tough in general, as they are stand-out pieces of data that drag us away from a clean, linear, normal distribution.

There is still hope for staying normal (in our distribution, that is). LDA may be less sensitive to outliers than some related tools that depend heavily on individual data points,7 and there are a few ways we can address outliers. One is to simply remove them if they resulted from problems with data collection or measurement, while another is to choose a more robust treatment that is less sensitive to extreme data points, such as Winsorization, which caps extreme values at a chosen percentile rather than discarding them.
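
As a quick illustration, here is a minimal Winsorization sketch with SciPy; the spending figures are made up, and the 10% limits are an arbitrary choice for the example.

```python
import numpy as np
from scipy.stats.mstats import winsorize

spend = np.array([12, 14, 15, 13, 16, 14, 11, 15, 13, 250])  # one extreme outlier

# Cap the lowest and highest 10% of values at the nearest remaining observations.
capped = winsorize(spend, limits=[0.1, 0.1])
print(capped)  # the 250 is pulled down to the largest non-outlying value
```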

Limits of Being Linear 

LDA can only create linear decision boundaries between classes, making it less effective when the relationship between features and classes is non-linear. In such cases, more complex methods like kernel-based approaches or deep learning may outperform LDA. Just as not all data is normal, not all data is linear. 

This becomes an issue when applying LDA to a non-linear dataset, as there may be inaccuracies in identifying patterns. The problem of LDA not being capable of dealing with non-linear relationships feeds into a deeper issue of not being able to address complex patterns and boundaries. With more nuanced, sporadic data, we may prefer support vector machines or neural networks.7

Case Studies

LDA for Breast Cancer Diagnosis

Getting a medical diagnosis right can be challenging, which may be especially true when the tools we use to identify disease are not easily available. Breast cancer is the most prevalent malignancy among women worldwide, yet mammography to screen for breast cancer is not consistently performed in all general hospitals.9 The time between discovering a disease such as breast cancer and its treatment is a critical period for lowering the mortality rate, something that can be sped up with an approach based on linear discriminant analysis.

Adebiyi and colleagues looked into how a machine learning model using LDA could improve diagnostic accuracy for breast cancer, working with a dataset from Wisconsin, United States. With a sample of over 500 instances of breast cancer cases, the authors found that LDA, when used for feature extraction in combination with classifiers such as random forests and support vector machines, yields highly accurate diagnostic results of about 96%. It was the use of LDA for feature extraction in particular that is credited with improving the breast cancer prediction models.

These types of findings are substantial, as a technique like LDA can help identify when something like breast cancer is malignant or not. If we can diagnose and treat cancers earlier on with proper identification, thousands of lives can be saved. It is notable that LDA is not only limited to diagnosing cancers but may also be useful in other diagnostic procedures—perhaps even with mental illnesses and their many nuances. In the future, it will be interesting to see how machine learning methods using LDA may speed up diagnostics and promote greater accuracy prior to harm occurring.

LDA for Facial Recognition 

No two faces are exactly the same, even if you have some friends who are twins. Nowadays, facial recognition can be used to sign into our phones, open gates to get onto public transit, or even purchase items at a vending machine. In a facial recognition system, linear discriminant analysis can be used to classify faces based on features extracted from images, such as the distance between the eyes, nose shape, and jawline. Here, high-dimensional pixel data can be put into a lower-dimensional space to maximize the separability between different faces.

In the late 1990s, Belhumeur and colleagues did just that: they introduced the Fisherfaces method, which combines LDA with eigenfaces, another technique that projects face images linearly, to improve facial recognition.10 The authors made use of LDA to create an algorithm that applies a pattern classification approach. In other words, each individual pixel in a picture of a face can be treated as a point in a high-dimensional space. In developing their system, the authors found that combining LDA with the existing eigenface technique could help accurately identify faces even when lighting conditions and facial expressions were highly variable.

In this case study, researchers projected the data into a reduced space to maximize the variance between individuals' faces while minimizing the variance within each individual’s face. This type of application of LDA in facial recognition may be used for security systems, law enforcement, and digital media, illustrating its power in handling high-dimensional image data and making accurate, real-time identifications. Whether it's catching the FBI’s most-wanted face or signing you into your newest tech, LDA can help us recognize faces with greater accuracy. 

Related TDL Content

Decision Tree Analysis 

Sometimes, an LDA may not always fit the problem. In this piece, TDL columnist Isaac Koenig-Workman breaks down a decision tree analysis as a means for effective decision-making. Read more about its varied applications, including an interesting case study on young people’s mental health. 

Sensitivity Analysis

LDA is limited in its ability to handle outliers, so at times, a more sensitive analysis may be suitable. In this article, TDL columnist Emilie Rose Jones explains what a sensitivity analysis is and how it explores “what-if” scenarios to identify the factors that have bigger impacts than others in a company’s decision-making.

Sources

  1. Linear discriminant analysis in machine learning. (2024, March 20). GeeksforGeeks. https://www.geeksforgeeks.org/ml-linear-discriminant-analysis/
  2. What is linear discriminant analysis? (n.d.). IBM - United States. https://www.ibm.com/think/topics/linear-discriminant-analysis
  3. Ultimate guide to linear discriminant analysis (LDA). (2023, December 6). Dataaspirant - A Data Science Portal For Beginners. https://dataaspirant.com/linear-discriminant-analysis/
  4. Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188. https://digital.library.adelaide.edu.au/server/api/core/bitstreams/1801cd68-028a-4380-a9c6-30ca9b0aa0d3/content
  5. Fujikoshi, Y. (2021). Contributions to multivariate analysis due to C. R. Rao and associated developments. Contributions to Statistics, 239-257. https://doi.org/10.1007/978-3-030-83670-2_11
  6. Mehta, A. (2022, April 28). Everything you need to know about linear discriminant analysis. Digital Vidya. https://www.digitalvidya.com/blog/linear-discriminant-analysis/#:~:text=Linear%20Discriminant%20Analysis%20was%20developed,apply%20to%20multi%2Dclass%20problems
  7. Ambika. (2023, September 13). Linear discriminant analysis (LDA) in machine learning: Example, concept and applications. Medium. https://medium.com/aimonks/linear-discriminant-analysis-lda-in-machine-learning-example-concept-and-applications-37f27e7c7e98
  8. Bono, R., Blanca, M. J., Arnau, J., & Gómez-Benito, J. (2017). Non-normal distributions commonly used in health, education, and social sciences: A systematic review. Frontiers in Psychology, 8. https://doi.org/10.3389/fpsyg.2017.01602
  9. Adebiyi, M. O., Arowolo, M. O., Mshelia, M. D., & Olugbara, O. O. (2022). A linear discriminant analysis and classification model for breast cancer diagnosis. Applied Sciences, 12(22), 11455. https://doi.org/10.3390/app122211455
  10. Belhumeur, P., Hespanha, J., & Kriegman, D. (1997). Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711-720. https://doi.org/10.1109/34.598228

About the Author

Isaac Koenig-Workman

Justice Interviewer @ Family Justice Services Division of B.C. Public Service

Isaac Koenig-Workman has several years of experience in roles to do with mental health support, group facilitation, and public speaking in a variety of government, nonprofit, and academic settings. He holds a Bachelor of Arts in Psychology from the University of British Columbia. Isaac has done a variety of research projects at the Attentional Neuroscience Lab and Centre for Gambling Research (CGR) with UBC's Psychology department, as well as contributions to the PolarUs App for bipolar disorder with UBC's Psychiatry department. In addition to writing for TDL he is currently a Justice Interviewer for the Family Justice Services Division of B.C. Public Service, where he determines client needs and provides options for legal action for families going through separation, divorce and other family law matters across the province.
