Supervised Learning
The Basic Idea
From the moment a child starts to talk, they are provided with labeled data to learn about the world around them and how to speak. Sounds a bit mathematical and boring? Well, not really. Take the alphabet, for example: children learn to recognize the sound of each letter by associating it with a picture, such as an apple (A) or a hat (H). When an adult shows them a picture of an apple, the child immediately knows the correct sound to make.
AI algorithms also use labeled data to learn how to recognize patterns and make predictions about future inputs. Supervised learning is a type of machine learning that uses datasets labeled by humans to train computer algorithms to predict outcomes and recognize patterns.1 The examples given to the algorithm are like pairs of questions and answers; the computer studies these pairs and learns to give the right answers when it’s asked similar questions it hasn’t encountered before. Ultimately, the goal of supervised learning is to make predictions from data.2
How do you know if the algorithm is learning from its data correctly? Well, the dataset is usually divided into two parts: a training set and a testing set. The training set is used to teach the model, and the testing set is used to evaluate its performance on data it has not seen before. This division is important because a model can learn the training set so closely, quirks and all, that it performs well on those examples but poorly on new data (this is called overfitting). Checking the model against the testing set is how you catch that.
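The training/testing split described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive recipe: the function name `train_test_split`, the 80/20 ratio, and the toy "odd"/"even" examples are all assumptions made for the sake of the example.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle labeled examples, then split them into training and testing sets."""
    shuffled = list(examples)              # copy so the original order is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled data: each example is an (input, label) pair.
examples = [(x, "even" if x % 2 == 0 else "odd") for x in range(10)]

train_set, test_set = train_test_split(examples)
print(len(train_set), "training examples,", len(test_set), "testing examples")
```

The model only ever learns from `train_set`; `test_set` is held back so that its score reflects performance on unseen data.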
Let’s look at a simple example. Imagine you want to teach a model to identify pictures of flowers. You start by providing the algorithm with a labeled dataset that contains lots of pictures of different kinds of flowers and the corresponding name of each species (e.g., rose, petunia, sunflower). The algorithm then tries to work out which characteristics belong to each flower based on the labels (e.g., thorns or the colour yellow). Once this is done, you can test the model by showing it flower pictures it has not seen before and asking it to guess the correct species. If the model gets too many answers wrong, you continue training it with more examples, adjusting its parameters to improve accuracy. When the model is ready, it can use what it has learned to make predictions about unknown data.
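To make the flower example concrete, here is a minimal sketch. It assumes each picture has already been reduced to two numbers (petal length and petal width), which is a simplification: a real image model would work from pixels. The measurements are invented for illustration, and the "model" is a simple nearest-neighbour rule that labels a new flower with the species of the most similar training example.

```python
import math

# Hypothetical labeled training data: (petal_length_cm, petal_width_cm) -> species.
# All measurements are made up for illustration.
training_data = [
    ((4.0, 3.5), "rose"),
    ((4.5, 3.8), "rose"),
    ((3.0, 2.5), "petunia"),
    ((2.8, 2.2), "petunia"),
    ((9.0, 2.0), "sunflower"),
    ((10.0, 2.3), "sunflower"),
]

def predict(features, labeled_examples):
    """1-nearest-neighbour: return the label of the closest labeled example."""
    nearest = min(labeled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]

# Held-out flowers the model has not seen, with their true species.
test_data = [
    ((4.2, 3.6), "rose"),
    ((9.5, 2.1), "sunflower"),
    ((2.9, 2.4), "petunia"),
]

correct = sum(predict(features, training_data) == species
              for features, species in test_data)
accuracy = correct / len(test_data)
print(f"Accuracy on unseen flowers: {accuracy:.0%}")
```

If the accuracy on the held-out flowers were low, the next step would be to add more labeled examples or adjust the model, then test again.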
The whole process can be loosely compared to the student-teacher dynamic we see in schools. In many subjects, students are required to learn information provided to them by a teacher and then apply this knowledge to unseen questions on a test. If they don’t pass the test, the teacher simply goes back over the information again, but this time adjusting the way the material is taught to improve retention and understanding.
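One of the most crucial ingredients in supervised learning is, as the name suggests, human supervision. It arrives mainly through the labels that people attach to the training data: during training, the algorithm’s predictions are compared against those labels, and the algorithm is corrected whenever they disagree. Just like a teacher tells a child when their answer is or isn’t correct, humans give algorithms feedback about the accuracy of their predictions. We’ll talk more about that later.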
In today’s hyper-connected, digital society, there is an increasing need for machines that can make quick and accurate predictions for us. Supervised learning tasks fall into two main types, depending on the kind of value being predicted:
- Regression: In regression tasks, the target variable is continuous, meaning it can take on any value within a range. The goal is to predict a numerical value. Examples include predicting house prices based on features like size, location, and number of bedrooms, or predicting stock prices based on historical data.
- Classification: In classification tasks, the target variable is categorical, meaning it falls into one of a limited number of classes or categories. The goal is to predict the class label of new instances based on their features. Examples include email spam detection (classifying emails as either spam or not spam), image recognition (classifying images into different categories such as cats, dogs, or cars), and sentiment analysis (classifying text as positive, negative, or neutral).
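The difference between the two task types shows up in what the model returns: a number for regression, a category for classification. The sketch below illustrates both with deliberately tiny, made-up datasets. The house prices, the example emails, and the helper names `fit_line`, `train_word_counts`, and `classify` are all assumptions for illustration; real systems use far more data and more capable models.

```python
from collections import Counter

# --- Regression: predict a continuous number (price) from house size ---
def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

sizes = [50, 80, 100, 120]      # square metres (made-up data)
prices = [150, 240, 300, 360]   # price in thousands (made-up data)
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept  # prediction for an unseen 90 m^2 house
print("Predicted price (thousands):", predicted_price)

# --- Classification: predict a category (spam or not spam) from email text ---
labeled_emails = [              # made-up labeled examples
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]

def train_word_counts(emails):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in emails:
        counts[label].update(text.split())
    return counts

def classify(text, counts):
    """Pick the label whose training emails shared the most words with the text."""
    scores = {label: sum(words[w] for w in text.split())
              for label, words in counts.items()}
    return max(scores, key=scores.get)

word_counts = train_word_counts(labeled_emails)
print(classify("free prize inside", word_counts))
print(classify("agenda for lunch", word_counts))
```

Note that the regression model can output any value along the fitted line, while the classifier can only ever answer with one of the labels it saw during training.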
About the Author
Dr. Lauren Braithwaite
Dr. Lauren Braithwaite is a Social and Behaviour Change Design and Partnerships consultant working in the international development sector. Lauren has worked with education programmes in Afghanistan, Australia, Mexico, and Rwanda, and from 2017–2019 she was Artistic Director of the Afghan Women’s Orchestra. Lauren earned her PhD in Education and MSc in Musicology from the University of Oxford, and her BA in Music from the University of Cambridge. When she’s not putting pen to paper, Lauren enjoys running marathons and spending time with her two dogs.