Machine Learning

What is Machine Learning?

Machine learning (ML) is a subset of artificial intelligence (AI) that uses statistical techniques to enable machines to learn from data and improve over time, loosely based on human learning. By analyzing data and using trial and error, ML algorithms can generate accurate predictions, content, recommendations, and decisions without being explicitly programmed. 

The Basic Idea

Even for those of us who stay up to date with the latest technology, it's easy to mix up the different types of artificial intelligence and what they do. Since AI is such a complex, multifaceted field, this article focuses on machine learning (ML).

Machine learning is a type of AI that allows a machine to learn from its environment, rather than having to be explicitly preprogrammed to generate predictions, content, recommendations, or decisions. Quickly evolving and incredibly powerful, ML technology is already reshaping our world and our beliefs about what computers are capable of.

Although we might hear people use the terms “machine learning” and “artificial intelligence” interchangeably, let’s get one thing straight: ML and AI aren’t the same thing. ML is a subset of AI. 

AI: Artificial intelligence refers to any system that aims to match or exceed human capabilities by generating outputs that require humanlike reasoning and decision-making.

ML: Machine learning is a field of AI using a set of statistical techniques that enable a machine to learn from the environment rather than having to be explicitly preprogrammed to generate predictions.

Therefore, ML is always AI, but AI doesn’t always need to be achieved through ML. 

For example, in a rule-based system, programmers give the AI explicit if-then rules (such as "if a dog is under two years old, classify it as a 'puppy'").1 This is not an example of machine learning, since the algorithm never has to learn the rule itself.
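To make this contrast concrete, here is a minimal, hypothetical sketch in Python (assuming scikit-learn is available): the first function hard-codes the programmer's rule, while the second lets a tiny decision tree learn a similar cutoff on its own from labeled examples.

  # A hand-written rule: the programmer supplies the knowledge directly.
  def classify_dog_by_rule(age_in_years):
      return "puppy" if age_in_years < 2 else "adult"

  # A learned rule: a small decision tree infers its own cutoff from examples.
  from sklearn.tree import DecisionTreeClassifier

  ages = [[0.5], [1.0], [1.5], [3.0], [6.0], [10.0]]               # inputs
  labels = ["puppy", "puppy", "puppy", "adult", "adult", "adult"]  # correct outputs

  model = DecisionTreeClassifier().fit(ages, labels)
  print(classify_dog_by_rule(1.2))   # "puppy" -- rule written by a human
  print(model.predict([[1.2]])[0])   # "puppy" -- rule learned from data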

Now, let’s dig into some of the different types of ML.

Supervised ML

Supervised ML uses human-labeled data of both inputs and correct outputs to train computer algorithms to make predictions. 

Let’s use an example: say you want to teach an algorithm to recognize dog breeds. First, you need to find a big dataset with a lot of pictures of dogs (generally not hard on the internet). These ‘inputs’ might include different breeds of dogs as puppies, old dogs, three-legged dogs, and maybe even dogs wearing funny hats. To train the algorithm, you’d need to feed it the correct name of the breed (the ‘output’) along with each photo. This is how the machine ‘learns.’ Then, you’d let the algorithm try to define what set of characteristics belongs to each dog based on the labeled outputs. 

You can then test the model by showing it a dog picture and asking it to guess what breed it is. If the model provides an incorrect answer, you can continue training it and adjusting its parameters with more examples to improve its accuracy and minimize errors.
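As a rough illustration of this train-then-test loop, here is a minimal sketch that uses scikit-learn's built-in handwritten-digit images as a stand-in for the dog photos; the dataset, the choice of logistic regression, and the 75/25 split are all assumptions made purely for the example.

  # Supervised learning in miniature: labeled inputs in, predictions out.
  from sklearn.datasets import load_digits
  from sklearn.model_selection import train_test_split
  from sklearn.linear_model import LogisticRegression
  from sklearn.metrics import accuracy_score

  X, y = load_digits(return_X_y=True)            # images (inputs) and labels (outputs)
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.25, random_state=0)

  model = LogisticRegression(max_iter=2000)      # one common classification algorithm
  model.fit(X_train, y_train)                    # 'training': learn from labeled examples

  predictions = model.predict(X_test)            # 'testing': guess labels for unseen inputs
  print("accuracy:", accuracy_score(y_test, predictions))

If the accuracy is too low, we would keep adjusting the model or adding labeled examples, exactly as described above.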

Within supervised learning, there are generally two types of variables: 

  • Categorical: If the outcome variable is categorical (such as yes/no, dog/not dog, or one of several breeds) and we want to predict the category from the features, the process is known as classification. Common classification algorithms include logistic regression, decision trees, random forests, and k-nearest neighbors (KNN).
  • Continuous: If the outcome variable is continuous (such as age, height, etc.) and we want to make predictions from the features, the process is known as regression. Common regression algorithms include linear regression, ridge regression, and lasso regression.2 (A minimal regression sketch follows this list.)
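For the continuous case, a regression sketch might look like the following; the age and weight numbers are invented purely for illustration, and scikit-learn is again assumed.

  # Regression in miniature: predicting a continuous outcome (a made-up
  # 'weight from age' relationship) rather than a category.
  import numpy as np
  from sklearn.linear_model import LinearRegression

  ages = np.array([[1], [2], [3], [5], [8], [10]])          # feature: age in years
  weights = np.array([8.0, 14.0, 20.0, 24.0, 27.0, 28.0])   # outcome: weight in kg (illustrative)

  reg = LinearRegression().fit(ages, weights)
  print(reg.predict([[4]]))   # estimated weight for a four-year-old dog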

Unsupervised ML

Unsupervised ML uses unlabeled data to discover patterns without any explicit guidance or instruction. 

Let’s go back to our example of someone searching for pictures of dogs. Unsupervised learning is often used in recommendation engines, which usually have a lot of data about us. The machine may start to learn that users with certain characteristics (maybe certain personality traits, activity levels, hair color, etc.) prefer looking at certain types of dogs, and then will begin recommending people images of the types of dogs they most enjoy (have you ever noticed that people tend to look a lot like their dogs?). Even though we didn’t explicitly give the program the instructions that ‘people who like to hike and have curly hair like seeing pictures of Labradoodles,’ the algorithm may have noticed a pattern like this, even if we don’t know or understand why certain variables are associated. 
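A heavily simplified sketch of this idea might cluster users by their characteristics without any labels and then let us inspect which group each user falls into; the three 'user traits' below are entirely made up, and scikit-learn's KMeans is assumed.

  # Unsupervised learning in miniature: the algorithm finds groups on its own.
  import numpy as np
  from sklearn.cluster import KMeans

  # Each row is one user: [weekly hikes, hours online, grooming-content clicks] (invented)
  users = np.array([
      [5, 2, 1],
      [6, 1, 0],
      [0, 9, 7],
      [1, 8, 6],
      [4, 3, 2],
  ])

  clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(users)
  print(clusters)   # e.g., [0 0 1 1 0] -- no one told the algorithm what the groups mean

A recommendation engine could then note which dog breeds each discovered group views most often and serve similar images to new users who land in that group.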

Reinforcement Learning

Reinforcement learning uses trial and error, with rewards and punishments, to help a machine learn which actions lead to the best outcomes.

A common example of reinforcement learning is teaching a machine to play chess. The more games the machine plays, the more opportunities it will have to receive a ‘punishment’ (loss) for bad moves and a ‘reward’ (win) for good moves, over time learning what the most strategic moves and responses are.3
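Chess is far too large to fit in a few lines, but the trial-and-error idea can be sketched with a toy 'game' of two possible moves. The win probabilities and the simple running-average update below are invented for illustration, not a description of how any chess engine works.

  # Reinforcement learning in miniature: rewards (+1) and punishments (-1)
  # gradually teach the agent which move is better.
  import random

  values = {"move_a": 0.0, "move_b": 0.0}   # the agent's current estimate of each move
  counts = {"move_a": 0, "move_b": 0}

  def play(move):
      # Hidden truth the agent must discover: move_a wins 80% of games, move_b only 30%.
      win_prob = 0.8 if move == "move_a" else 0.3
      return 1 if random.random() < win_prob else -1

  for _ in range(1000):
      explore = random.random() < 0.1                   # occasionally try something new
      move = random.choice(list(values)) if explore else max(values, key=values.get)
      reward = play(move)
      counts[move] += 1
      values[move] += (reward - values[move]) / counts[move]   # running average of outcomes

  print(values)   # move_a ends up with the higher value, so the agent prefers it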

Learning is experience. Everything else is just information.


– Albert Einstein


Key Terms

  • Algorithms: The statistical procedures applied to data in order to learn from it (i.e., to generate outputs such as predictions). Many algorithms have been developed, including linear and logistic regression, decision trees and random forests, neural networks, and so on.
  • Neural Networks: An interconnected, artificial model designed to loosely mimic how neurons in the human brain interact with each other. It connects artificial nodes with edges, which somewhat resemble the way synapses in the brain connect neurons. 
  • Convolutional Neural Network (CNN): A neural network architecture built from several convolutional layers stacked on top of one another. The first layers can recognize simple features, like edges, shapes, and textures. As the network gets deeper, it produces more "abstract" representations, eventually identifying concepts such as dog breeds.4 (A minimal sketch follows this list.)
  • Data Preprocessing: The process of transforming raw data into a comprehensible format before machine learning or other analysis can take place. This involves first cleaning up and organizing the data, then improving the data quality, and finally getting the data into the appropriate format for further analysis or use by ML. For example, feature extraction is typically one of the final steps in data preprocessing, where (usually image) files are turned into numerical features that the ML algorithms can recognize.4
  • Deep Learning: A subset of machine learning that involves neural networks with many layers (hence "deep") to model and understand complex patterns in data. It enables computers to perform tasks such as image and speech recognition, natural language processing, and autonomous driving by learning from large amounts of data.
  • Generative AI: A subset of ML that relies on neural networks to identify the patterns and structures within existing data in order to produce original content. Generative AI typically combines both supervised and unsupervised (deep) learning. ChatGPT, Google's Bard, and DALL-E are all examples of generative AI.
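To give a feel for the layered structure described in the CNN entry above, here is a minimal, hypothetical sketch in PyTorch; the layer sizes, image size, and the ten output classes are arbitrary choices for illustration rather than a recommended architecture.

  # A tiny CNN: early convolutional layers pick up simple features, deeper
  # layers build more abstract ones, and a final linear layer scores each class.
  import torch
  import torch.nn as nn

  model = nn.Sequential(
      nn.Conv2d(3, 16, kernel_size=3, padding=1),   # early layer: edges and textures
      nn.ReLU(),
      nn.MaxPool2d(2),
      nn.Conv2d(16, 32, kernel_size=3, padding=1),  # deeper layer: more abstract shapes
      nn.ReLU(),
      nn.MaxPool2d(2),
      nn.Flatten(),
      nn.Linear(32 * 16 * 16, 10),                  # final layer: e.g., ten dog breeds
  )

  fake_image = torch.randn(1, 3, 64, 64)            # one 64x64 RGB image of random noise
  print(model(fake_image).shape)                    # torch.Size([1, 10]) -- one score per class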

History

The first mathematical model of a neural network was developed by Warren McCulloch and Walter Pitts in 1943, who showed that simple networks of artificial neurons, modeled with electrical circuits, could carry out logical operations. This early model was incredibly important in shaping the future of machine learning, as it suggested that brain-like computation could, in principle, be performed by machines.

In 1949, Canadian psychologist Donald Hebb proposed that when we learn something new, the neurons in our brains connect and build up a neural network in response. The more often the new information or skill is repeated, the stronger the connections between the neurons become. This theory of the neural network provided the foundation for modern machine learning and the artificial neural network, which somewhat mimics the neurons and synapses in our brain with artificial nodes and edges.

Ten years later, a computer scientist named Arthur Samuel coined the term "machine learning." He did so after building an intelligent system that could play checkers, engineered to learn through an intricate scoring system: each time a move was made, the program would assess the probability of winning based on the position of the pieces. The more games the program played, the better it became at these predictions.5

These early ideas soon made their way into pattern recognition. In 1957, Frank Rosenblatt built the Perceptron, an early neural network inspired by Hebb's theory of strengthening connections. As the first system designed to learn to recognize objects, the Perceptron was a promising start on the road toward machine learning. However, it was only semi-functional: it could learn simple objects but struggled with the details of more complex ones, such as faces (something ML is very familiar with today, as we can see with features like FaceID).

Stemming from this invention, quite a few effective algorithms were discovered shortly after the Perceptron, rapidly transforming machines’ ability to recognize objects. At the same time, researchers began to layer neural networks on top of each other, which laid the groundwork for deep learning. 

In 1962, Frank Rosenblatt described an early form of backpropagation, the process of adjusting a network's connections to fit novel situations. At that time, the backpropagation-style models of Rosenblatt and researcher Stuart Dreyfus were relatively simple, using the chain rule to pass error from one step to the next. In 1967, Shun'ichi Amari's work marked the first instance of training a multilayer neural network, allowing a computer to learn internal representations of concepts for non-linear pattern classes. Then, in 1970, Seppo Linnainmaa published the 'reverse mode of automatic differentiation,' the first version of the modern backpropagation we use today, which works through arbitrarily deep compositions of differentiable functions.6

In 1997, the general public began to realize some of the ways in which AI systems could surpass humans. That year, the IBM supercomputer Deep Blue defeated reigning world chess champion Garry Kasparov in a full match, the first time on record that a machine had beaten a reigning champion under standard tournament conditions. The event kickstarted a cultural conversation about what it means when humans can create artificial intelligence that can learn on its own, evolve, and surpass us.

The victory for Deep Blue was just the beginning of a long string of achievements for AI in competition. In 2012, AlexNet, a deep convolutional neural network, won the ImageNet image recognition competition with roughly 85% top-5 accuracy, a huge breakthrough for image recognition and ML technology.4 Four years later, in 2016, Google DeepMind's AlphaGo defeated the professional player Lee Sedol at Go, a strategic board game that originated in China, further showcasing the power of deep reinforcement learning.

In the last few decades, machine learning has developed with astounding speed and has surpassed humans in a variety of tests, particularly in how quickly algorithms can process information. Perhaps most relevant, one of the biggest differences between humans and ML is the speed at which machines learn and evolve: while humans change over generations, ML systems can improve in days or weeks. The speed at which we're seeing change is both a reason to proceed with caution and a cause for excitement and hope.

The Human Element

As artificial intelligence advances, ML technology will continue to become more and more integrated into our daily lives. So far, one of the most impactful ways we’ve seen ML technology used is in healthcare and wellness. AI integration with diet, exercise, habit-changing, safer driving, and monitoring ongoing health conditions like diabetes is now commonplace. 

We are also now living in a world with more personal data than ever before. Our phones, smartwatches, health rings, and even alarm clocks and refrigerators can track a plethora of data on the ways we eat, sleep, and move. Sometimes this data is used to tailor our experiences for the better, but much of it is used by companies to shape the advertisements we see or is sold on to third parties. That doesn't mean personal data tracking is all bad, though. There is a wide variety of apps where you can track personal behaviors: everything from mood changes to water intake to nicotine use.

ML technology is key to unlocking the benefits of all this information. With so much unstructured data, and with such variation between individuals, it would take hundreds of scientists working overtime to quantify, model, and predict every behavior of interest for every individual. Instead, ML can compile thousands of data points for each person and learn what works for them, providing personalized recommendations. 

For example, two individuals may share the same goal: getting better quality sleep. If both use an app to help them reach their goals, over time the algorithm may pick up on patterns relevant to the goal of getting better sleep. Person A may be consuming too much caffeine in the afternoon, while person B struggles to get to sleep on days without exercise. Using all the information available, the algorithm can then provide personalized recommendations and adapt as each of the user’s unique behaviors change. 

Different behavioral science intervention strategies may affect people differently as well; while person A may adhere better to their sleep schedule when they've set concrete goals in their app and can watch a 'sleep streak' form, person B may be more likely to stick to their sleep schedule when certain behavioral barriers are in place, perhaps certain time restrictions on social media apps.

Since these types of interventions can be constantly readjusted and recalculated in real time, algorithms can use reinforcement learning to keep their suggestions personalized. This approach is known as a Just-in-Time Adaptive Intervention (JITAI), which provides "the right type/amount of support, at the right time, by adapting to an individual's changing internal and contextual state."
For example, the mobile app FOCUS is designed to support individuals with schizophrenia in domains such as medication adherence, mood regulation, sleep, social functioning, and coping with hallucinations. The app uses environmental and biometric information to tailor its messaging, providing suggestions or positive reinforcement. Experiments and applications using this type of ML to support outpatient care and lifestyle change have produced a number of incredibly promising results.7
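The JITAI framework does not prescribe a particular algorithm, but the adaptive idea can be sketched with a hypothetical choice between two intervention types for a single user, where the system keeps a running estimate of what has worked so far. This is an invented toy example, not how FOCUS or any specific app is implemented.

  # A toy adaptive intervention: usually send whichever support type has worked
  # best for this user so far, occasionally trying the alternative.
  import random

  def sleep_improved(intervention):
      # Stand-in for reality: this particular user responds better to goal streaks.
      return random.random() < (0.7 if intervention == "goal_streak" else 0.4)

  estimates = {"goal_streak": 0.0, "app_time_limit": 0.0}
  trials = {"goal_streak": 0, "app_time_limit": 0}

  for night in range(200):
      explore = random.random() < 0.1
      choice = random.choice(list(estimates)) if explore else max(estimates, key=estimates.get)
      outcome = 1.0 if sleep_improved(choice) else 0.0
      trials[choice] += 1
      estimates[choice] += (outcome - estimates[choice]) / trials[choice]

  print(estimates)   # the better-fitting intervention ends up with the higher estimate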

Controversies

As alluded to earlier, there are a number of privacy concerns when it comes to the use of machine learning algorithms, mainly because AI requires big data: personal data that has to come from real people. Many people are unaware of just how much of their information is being collected and what the possible ramifications might be. Everything from demographic and financial information to geographic location, behavioral data, and even health and biometric data can be monitored without our knowledge. Many companies use our data for marketing and service personalization, but this breach of privacy comes at a cost.

In addition to concerns about surveillance and personal privacy, having large amounts of our personal information tracked and stored puts us at risk for data breaches. If the companies who have access to our data are hacked, we’re exposed to the threat of financial fraud and identity fraud. Our data can also be misused in a number of ways, either being sold to third parties or used for direct manipulation, such as influencing our behavior as consumers or making us targets for misinformation spread. 

Another controversial aspect of this new technology comes from its very essence. Machine learning is the process of teaching machines how to learn like humans—and unfortunately, humans are far from perfect. We often fall victim to biases and prejudices, and since humans are the ones teaching the machines, we often program those biases into them. Machine learning algorithms have repeatedly come under scrutiny for being discriminatory, non-inclusive, or inaccurate towards racial or gender minorities.

Facial recognition software also has an accuracy problem, particularly when it comes to people in minority groups. In fact, error rates are up to 34% higher for darker-skinned women than for lighter-skinned men. This is especially troubling given the increasing use of surveillance and facial recognition technology by law enforcement, as we have already seen people of color disproportionately misidentified in mug shot matches.10

Examples of bias in machine learning are abundant. In 2014, Amazon sought to automate its hiring process, so the company built an AI system that used machine learning to review job applicants’ resumes. After a year of testing, Amazon had to throw the whole system out, as it had internalized patriarchal preferences and was discriminating against women. Resumes that included female names or associations with the word “women” in them were automatically penalized.8

In 2019, Facebook had its own machine learning scandal. At the time, their advertising platform, Facebook Ads, used intentional targeting based on gender, race, and religion. When it came to the job market, they found that their system was targeting women with traditionally feminine jobs, like secretary work or nursing. On the other hand, it targeted minority men in industries like janitorial work or taxi driving.8

Case Study

Microsoft’s “Tay”

In 2016, Microsoft attempted to use machine learning to build a chatbot called “Tay,” which sourced its data from Twitter (now X) to learn how to communicate in a conversational way. The bot was designed to have automated discussions with users and was programmed to learn how to communicate by mimicking the users’ own language patterns. However, in less than a day, the chatbot began tweeting horribly racist, sexist, and homophobic comments, even going as far as to deny the existence of the Holocaust and advocate for genocide. 

Although Microsoft quickly took down the awful messages, the company emphasized that most of the bot's tweets were benign and that the offensive posts were largely created when users explicitly 'baited' the program into spewing this type of offensive and dangerous rhetoric (sometimes even telling it to repeat their words directly). Still, the program was obviously not ready for the public, and this debacle was a clear case study of what can happen when our programs are released without the proper checks and safeguards in place.9

What Can We Do About Bias in ML?

Clearly, machine learning has a bias problem. The first thing we as humans need to do is acknowledge that bias exists and understand how it gets into our training models. Some of that bias comes from the data we've made available to the models; one benchmark dataset for facial recognition skews 70% male and 80% white.11 This is likely part of the reason that, as discussed, facial recognition technology performs worse for subjects who are female and Black. By expanding our ML datasets to include more diverse data, our algorithms will become more adept at serving the diverse populations they are meant to.

Another thing we can do to address the skew of our algorithms is continue to rework the models as we uncover systematic errors, a process often called “fairness through awareness.” We can manually remove problematic data or create code designed to counteract known biases (such as reordering the hierarchy of rules a system runs through). We can even provide our systems with specific data known to be in opposition to the bulk of information, a “consider-the-opposite” approach. 
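One simple, concrete form of this 'awareness' is to measure a model's error rate separately for each demographic group rather than only in aggregate. The sketch below uses invented predictions purely to show the bookkeeping; real audits rely on held-out test sets and many more metrics.

  # A minimal per-group error audit: a large gap between groups flags a bias to fix.
  from collections import defaultdict

  # (group, true_label, predicted_label) for a handful of imaginary test cases
  results = [
      ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 1),
      ("group_b", 1, 0), ("group_b", 0, 1), ("group_b", 1, 1), ("group_b", 0, 0),
  ]

  errors = defaultdict(lambda: [0, 0])              # group -> [wrong, total]
  for group, truth, prediction in results:
      errors[group][0] += int(truth != prediction)
      errors[group][1] += 1

  for group, (wrong, total) in errors.items():
      print(group, "error rate:", wrong / total)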

Overall, the most important thing we can do is continue to educate ourselves and each other, including those who work with artificial intelligence. If the people designing the systems are more representative of the people the systems are designed for, everyone will be better served. Research has shown that when we emphasize how our minds (and thus ML) use heuristics to simplify information, and provide people with explicit guidance on formal logic, hypothesis testing, and critical assessment of information, we see a reduction in errors and bias on a number of tasks.12 Although none of these solutions are perfect, if we keep working to understand how biases develop, perhaps our understanding of human prejudice will grow alongside the expansion of ML.

Related TDL Content

Machine Learning And Personalized Interventions: David Halpern

Interested in how machine learning can tie into behavioral science more concretely? In this episode of The Decision Corner podcast, The World Bank's Jakob Rusinek interviews David Halpern, CEO of the Behavioural Insights Team, about how machine learning, artificial intelligence, and personalization are shaping the future of behavioral science.

Hebbian Learning

Interested in understanding more about how we learn? If you want to learn more about neural networks, the history of machine learning, or simply how our neurons work, then this piece on Hebbian Learning is for you.

Combining AI and Behavioral Science Responsibly

This article explores the intersection between AI and behavior science, discussing how AI systems can be biased, and why it’s important to address this issue before leveraging AI to influence human behavior.

How to keep work meaningful in the age of AI

If you want to learn more about how AI might impact the workforce and affect our relationship with work, give this a read. This article dives into the psychological motivations that make work meaningful and offers suggestions for maintaining a sense of value and purpose in the world of AI.

Sources

  1. Krpan, Dario. (2023). Behavioural Science in An Age of New Technology. Lecture 1: PB434
  2. Morales, E. F., & Escalante, H. J. (2022). A brief introduction to supervised, unsupervised, and reinforcement learning. In Biosignal processing and classification using computational learning and intelligence (pp. 111-129). Academic Press.
  3. Google Cloud. Supervised vs. unsupervised learning: What's the difference? Retrieved from https://cloud.google.com/discover/supervised-vs-unsupervised-learning 
  4. Briggs, J., & Carnevali, L. (Year). Chapter 3. In Embedding Methods for Image Search. Pinecone. 
  5. Foote, K. D. (2019, March 13). A brief history of machine learning. DATAVERSITY. https://www.dataversity.net/a-brief-history-of-machine-learning/#
  6. Suryansh. (2023, June 5). The evolution of backpropagation: A revolutionary breakthrough in machine learning. Medium.
  7. Nahum-Shani, I., Smith, S. N., Spring, B. J., Collins, L. M., Witkiewitz, K., Tewari, A., & Murphy, S. A. (2018). Just-in-time adaptive interventions (JITAIs) in mobile health: Key components and design principles for ongoing health behavior support. Annals of Behavioral Medicine, 52(6), 446–462. https://doi.org/10.1007/s12160-016-9830-8
  8. Dilmegani, C. (2021, August 9). Bias in AI: What it is, types & examples of bias & tools to fix it. AIMultiple. https://research.aimultiple.com/ai-bias/.
  9. Victor, D. (2016, March 24). Microsoft created a Twitter bot to learn from users. It quickly became a racist jerk. The New York Times. Retrieved 2024, from https://www.nytimes.com/2016/03/25/technology/microsoft-created-a-twitter-bot-to-learn-from-users-it-quickly-became-a-racist-jerk.html
  10. Najibi, A. (2020, October 24). Racial discrimination in face recognition technology. Science in the News. https://sitn.hms.harvard.edu/flash/2020/racial-discrimination-in-face-recognition-technology/
  11. Nouri, S. (2021, February 3). Council post: The role of bias in artificial intelligence. Forbes. https://www.forbes.com/sites/forbestechcouncil/2021/02/04/the-role-of-bias-in-artificial-intelligence/?sh=1751699e579
  12. Raji, I., & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. Conference on Artificial Intelligence, Ethics, and Society.

About the Author


Annika Steele

Annika is currently pursuing her Masters at the London School of Economics in an interdisciplinary program combining behavioral science, behavioral economics, social psychology, and sustainability. Professionally, she’s applied her passion for data-driven insights in project management, consulting, and data analysis in big tech, Fortune 500 companies, and nonprofits. She excels in any role that allows her to engage directly with people and explore big ideas. With undergraduate degrees in Economics and Psychology, Annika has investigated the intersection of psychology and social systems through research on comprehensive sex education and is currently leading research on perfectionism in female ultramarathoners. Annika believes continuous research and understanding human behavior is the most powerful way to shape the world and is passionate about animal welfare and reproductive health.
