Reinforcement Theory
What is Reinforcement Theory?
Reinforcement theory is a key principle in behaviorism and is tied to B.F. Skinner’s operant conditioning. This theory states that a behavior can be encouraged or discouraged by the consequences that follow it. For example, if we want to increase a certain behavior, we can reinforce it by offering rewards.
The Basic Idea
Do you remember back in elementary school when you received stickers and smiley faces on your worksheets? Or maybe you were occasionally chosen for the esteemed position of “class monitor.” It always made you feel a warm glow like you were doing something right. On the other hand, the feeling of receiving a timeout or sitting in for recess was soul-crushing.
These various rewards and punishments are all examples of reinforcement theory at work. Though we can remember examples from elementary school, reinforcement theory still influences our daily lives.
Put simply, reinforcement theory suggests that a behavior can be strengthened when good events or positive consequences follow and reduced when undesirable events or negative consequences follow. The theory rests on the idea that human behavior (and animal behavior, more broadly) is influenced by what happens as a consequence. For instance, when action A results in a desirable outcome, one is more likely to do action A; when action B results in an unpleasant outcome, one is less likely to do action B. You’re more likely to study for your spelling test after getting your teacher’s praise; you’re less likely to pull your friend’s hair after getting a stern lecture.
Reinforcement theory and operant conditioning
Reinforcement theory stands alone but is also part of a larger framework—B.F. Skinner’s operant conditioning.
Reinforcement encourages a behavior, while punishment aims to reduce it. Reinforcement can be positive or negative: positive reinforcement adds a desirable stimulus (e.g., buying yourself a treat after successfully completing your first week without smoking), while negative reinforcement removes an undesirable stimulus (e.g., reducing chores when a child earns good grades). Punishment has two forms: positive punishment adds an unpleasant consequence (e.g., adding an extra mile to your run when you’ve missed too many days in a row), and negative punishment removes something desirable (e.g., taking away recess privileges when a child misbehaves in class).
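The four combinations above can be written as a small lookup. The following is only an illustrative sketch: the names `Change`, `Valence`, and `classify_consequence` are hypothetical labels chosen for this example, not established terminology.

```python
from enum import Enum


class Change(Enum):
    """Whether the consequence adds or removes a stimulus (hypothetical label)."""
    ADDED = "added"
    REMOVED = "removed"


class Valence(Enum):
    """Whether the individual finds the stimulus desirable (hypothetical label)."""
    DESIRABLE = "desirable"
    UNDESIRABLE = "undesirable"


def classify_consequence(change: Change, valence: Valence) -> str:
    """Map a consequence onto one of the four operant conditioning categories."""
    table = {
        (Change.ADDED, Valence.DESIRABLE): "positive reinforcement",
        (Change.REMOVED, Valence.UNDESIRABLE): "negative reinforcement",
        (Change.ADDED, Valence.UNDESIRABLE): "positive punishment",
        (Change.REMOVED, Valence.DESIRABLE): "negative punishment",
    }
    return table[(change, valence)]


# Taking away recess removes something desirable.
print(classify_consequence(Change.REMOVED, Valence.DESIRABLE))
```

In this framing, "positive" and "negative" describe whether something is added or removed, not whether the consequence is good or bad.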
Operant conditioning has been used to explain various human and animal behaviors, including learning processes, addiction, and language acquisition.7 This method primarily concerns voluntary behaviors as it involves learning through consequences—rewards or punishments—based on individual choices. These behaviors are typically controlled by the individual, like studying to achieve good grades or attempting to quit smoking.
"I think, as much as people moan at things like award ceremonies, it gives people role models. It provides real positive reinforcement that you can be who you are and still massively achieve."
– Jack Monroe
Key Terms
Behaviorism: Behaviorism is a school of thought focusing on understanding behavior through observable action. This approach emphasizes the role of environmental stimuli and reinforcement in shaping behavior. According to behaviorism, all behaviors are learned through conditioned interactions with the external environment.
Operant Conditioning: A concept introduced by B.F. Skinner that explains how behaviors are learned and shaped through reinforcement (which increases the likelihood of a behavior) or punishment (which decreases its likelihood).
Reinforcement: A process in operant conditioning that increases the likelihood of a behavior being repeated. A consequence the individual finds desirable or rewarding immediately follows the desirable behavior. There are two main types of reinforcement:
- Positive reinforcement: Adding something pleasant to encourage a behavior (e.g., giving a child their favorite food for doing a household chore)
- Negative reinforcement: Removing something unpleasant to promote a behavior (e.g., a manager stops sending reminder emails once an employee consistently meets deadlines)
Punishment: A process in operant conditioning that decreases the likelihood of a behavior being repeated. A consequence the individual finds unpleasant immediately follows the undesirable behavior. There are two main types of punishment:
- Positive punishment: Adding something unpleasant to discourage a behavior (e.g., giving someone a ticket after they are caught speeding in order to make them slow down)
- Negative punishment: Removing something desirable to discourage a behavior (e.g., taking away a toy when a child misbehaves)
Extinction rate: In operant conditioning, the extinction rate refers to the speed at which a previously reinforced behavior decreases and eventually stops once the reinforcement ceases. For example, if a child receives video game time for doing their daily chores and that reward stops, the child may gradually stop doing their chores. The rate at which this behavior fades is the extinction rate.
History
Earlier developments in conditioning focused on the association between stimuli and involuntary responses. You likely know Pavlov’s dogs, who started to salivate when they heard the sound of his assistant’s footsteps long before the food was in front of them. This became known as classical conditioning: stimulus A, which naturally produces a response (such as food producing salivation), is repeatedly paired with a different, neutral stimulus B, such as the sound of the assistant approaching or a bell. B becomes associated with A over time and, as a result, prompts the same response as A. Eventually, the dogs learn that approaching footsteps mean food is arriving, and they begin to salivate at the sound of the footsteps.
Classical conditioning was developed when psychology was primarily concerned with an individual’s internal needs and motivations. Maslow and Herzberg completed related work during this period.
The psychoanalytical approach was dissatisfying for behaviorists because no external, observable phenomena allowed its techniques to be verified and tested. In the early 1900s, Edward Thorndike concretized the Law of Effect, suggesting that individuals are more likely to perform actions that have satisfying rewards. This marked a significant outward shift in behaviorism; subsequent research began examining the external effects of an action and how they influence choices, as opposed to theorizing how past events influenced internal responses. More specifically, Thorndike proposed that if the link between an action and the satisfying effect is strengthened, the action will become more likely in the future.
B.F. Skinner further distinguished between the roles of stimuli and actions in shaping behavior, diverging even more from early studies in classical conditioning. In classical conditioning, as seen in the example above, stimulus B (footsteps) becomes a conditioned stimulus that elicits an involuntary response (salivation). However, Skinner focused on how behavior itself is shaped by its consequences rather than by preceding stimuli. He introduced the term "operant" to differentiate voluntary actions from mere responses to stimuli, emphasizing that behavior is conditioned by its outcomes.
This insight led to his groundbreaking framework: operant conditioning. Skinner proposed that complex behaviors can be systematically shaped by reinforcing successive approximations of a desired action—rewarding behaviors that bring an organism closer to the goal while discouraging undesired actions. His work, along with the broader behaviorist movement, marked a pivotal shift in psychology, steering it away from its psychoanalytic roots and toward a more empirical, scientific approach that continues to shape modern behavioral research.
The story of reinforcement is the result of trying to understand the interplay between an action and its consequences, specifically how the probabilistic strengthening of this link operates.
People
Ivan Pavlov
A Russian physiologist known for his early research on classical conditioning. Pavlov contributed to behaviorism, the systematic study of behaviors, and to conditioning. Classical conditioning is notably different from operant conditioning: classical conditioning deals with involuntary responses, whereas operant conditioning involves modifying voluntary behavior. Nevertheless, Pavlov was a major influence on all behaviorists, including practitioners of operant conditioning, like Skinner.
Edward Thorndike
An American psychologist and pioneer in the field of behaviorism. Thorndike developed a more empirically driven approach to assessing behavior. He formulated the Law of Effect, which stated that an action followed by a desirable effect strengthens the link between that action and the following effect, thereby making the action more likely to recur. While this may seem obvious to us now, Thorndike’s Law of Effect set the stage for empirical testing of reinforcement.
Burrhus Frederic Skinner
An American psychologist best known for his seminal work on behavior, B.F. Skinner is known as the father of operant conditioning. Arguing that classical conditioning was too simplistic to fully explain the complexity of human behavior, Skinner believed that people’s behavior was a result of how they have been conditioned by the consequences of their past behavior.
FAQs
What does reinforcement theory explain?
Reinforcement theory explains how human and animal behavior is shaped and maintained by the consequences that follow it. This theory posits that behaviors followed by positive outcomes (reinforcements) are more likely to be repeated, while those followed by negative outcomes (punishments) are less likely to occur. The theory emphasizes the role of rewards and punishments in learning and behavior modification.
What are the four components of reinforcement theory?
Within this framework, also known as operant conditioning, there are four types of reinforcement and punishment: positive reinforcement, negative reinforcement, positive punishment, and negative punishment.
Is Reinforcement Theory of Motivation different from Reinforcement Theory?
Yes, reinforcement theory of motivation is a specific application of reinforcement theory within the context of the workplace. While reinforcement theory broadly explains how behavior is shaped by consequences, the motivational aspect focuses on using rewards and punishments to influence employee performance and achieve organizational goals.
What is the 'Skinner Box'?
Animal lovers may want to skip to the next section. In the 1930s, Skinner developed the ‘Skinner Box’ (also known as an ‘operant conditioning chamber’ and a variation on Thorndike’s puzzle box), a device for studying operant conditioning. Using this chamber, the behavior of an animal (usually a rat or a pigeon) can be objectively observed and measured in a compressed time frame. The box contained mechanisms such as a lever or button that the animal could manipulate to receive a reward (like food pellets) or avoid punishment (like a mild electric shock).
Impacts
Reinforcement theory can be a powerful way to promote positive behavior and is thus important to any team or organization. It is often used to achieve a team’s objectives, such as enhancing productivity or improving communication.
Reinforcement can also amplify the effectiveness of other behavioral techniques. For instance, antecedents such as warnings, instructions, or informational cues aim to encourage certain behaviors but may have a limited impact on their own. However, when paired with reinforcing consequences, they become significantly more effective.
Consider a workplace safety initiative. Simply posting “Wear Your Safety Gear” signs (an antecedent) may not be enough to change employee behavior. But if supervisors actively recognize and reward employees who consistently wear protective equipment—such as through verbal praise, incentives, or team-based rewards—the reinforcement strengthens the desired behavior. Conversely, implementing mild penalties for non-compliance (e.g., requiring retraining) can further reinforce adherence.
Schedules of conditioning
When building his theory of operant conditioning, Skinner found that the effectiveness of reinforcement was significantly altered by the schedule on which it was delivered. This led Skinner to develop a key concept in behaviorism, which is now known as schedules of reinforcement. The theory boils down to a simple, practical conclusion: to ensure behavioral change, some reinforcement schedules may be better suited than others for a particular problem.
A reinforcement schedule can be continuous, meaning reinforcement occurs every time the target behavior happens (e.g., a vending machine dispensing a snack every time money is inserted). Another option is a fixed schedule, in which reinforcement arrives after a set period of time has elapsed or after the behavior has been performed a set number of times (e.g., employees receiving a paycheck every two weeks, regardless of performance). Finally, a variable schedule reinforces behavior at unpredictable points: the time or the number of occurrences required is not fixed, so from the individual’s perspective the reward seems to arrive at random (e.g., checking for social media notifications, where rewards such as likes and comments appear at irregular times).
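To make the differences between these schedules concrete, here is a minimal simulation sketch. The function names, the use of an exponential distribution for the variable waits, and the convention that a reward requires a response are all assumptions made for illustration; they are not taken from Skinner’s work.

```python
import random


def continuous(num_responses):
    """Continuous schedule: every response is reinforced."""
    return [True] * num_responses


def fixed_ratio(num_responses, n):
    """Fixed schedule based on count: every n-th response is reinforced."""
    return [(i + 1) % n == 0 for i in range(num_responses)]


def fixed_interval(response_times, interval):
    """Fixed schedule based on time: the first response made at least
    `interval` time units after the previous reward is reinforced."""
    rewards, last_reward = [], 0.0
    for t in response_times:
        earned = (t - last_reward) >= interval
        rewards.append(earned)
        if earned:
            last_reward = t
    return rewards


def variable_interval(response_times, mean_interval, rng):
    """Variable schedule: like fixed_interval, but the required wait is
    redrawn at random after each reward (exponential draw, an assumption)."""
    rewards, last_reward = [], 0.0
    wait = rng.expovariate(1.0 / mean_interval)
    for t in response_times:
        earned = (t - last_reward) >= wait
        rewards.append(earned)
        if earned:
            last_reward = t
            wait = rng.expovariate(1.0 / mean_interval)
    return rewards


times = list(range(1, 11))  # one response per time unit, for 10 units
print(sum(continuous(10)))              # rewards earned on a continuous schedule
print(sum(fixed_ratio(10, 5)))          # rewards earned when every 5th response pays
print(sum(fixed_interval(times, 3)))    # rewards earned with a 3-unit fixed wait
print(sum(variable_interval(times, 3, random.Random(0))))  # unpredictable waits
```

With the fixed schedules, the pattern of rewards is fully predictable from the rule; with the variable schedule, the same rule yields a different pattern for each random seed, which is what makes the timing hard to anticipate.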
Controversies
Skinner was averse to examinations of the mind, discussions of goals, and internal motivations.3 This perspective itself is a major point of disagreement in the psychology community, since it eliminates a whole angle of looking at behavior.
Some academics and studies have taken issue with the perceived efficacy of reinforcement theory. As early as 1994, it has been argued that behavioral therapists are increasingly adopting procedures supported by reinforcement theory that lack tangible empirical evidence of working in a clinical setting.4 They point out that there have even been instances in which such procedures have had a counterproductive effect, suggesting that these techniques “may actually reduce positive behaviors and increase resistance to change.”
For example, Dan Pink suggests that having incentive-driven policies is effective when the task at hand is clear cut with straightforward rules, but otherwise it “dulls thinking and blocks creativity.” In contrast, intrinsic motivation, feeling purposeful, and having autonomy may be better factors in increasing desirable behaviors. Strategies to encourage these behaviors could thus be more effective for complex tasks.5
Finally, reinforcement theory can inadvertently influence our judgment, such as when we make decisions based on past experiences and discard new or contradicting information in doing so.
Case Studies
Seat belt reminders in cars
While seat belts in cars have been mandatory in Canada since 1976, it was initially difficult to ensure that the mandate was being followed.6 After years of figuring out the best way to enforce the rule, the seat belt reminder sound found its way into most cars. When the driver and passengers have not buckled up and the car starts moving, the car beeps loudly and relentlessly until the seat belts are finally clicked. This annoying beeper is a classic example of negative reinforcement: after the target action is performed, the negative stimulus is removed. To avoid this annoyance in the future, we’re encouraged to put on the seat belt as early as possible next time we get in the car.
Examining the effect of positive reinforcement and punishment on cigarette use
When it comes to smoking, our experience with our first cigarette often dictates if we develop a dependence later on. In a 2018 study, researchers surveyed respondents on their feelings, reactions, and symptoms during the first few times they smoked. It was found that if our first cigarette was a positive experience, we tended to get hooked later on. This finding strongly suggests that reinforcement could be a key driver of habitual smoking, as we have come to associate it with positive feelings. On the other hand, they found that an unpleasant first experience, which acts as a positive punishment, did not significantly decrease the smoking frequency later in life. Accordingly, positive initiation experiences could predict cigarette use with some accuracy, whereas negative experiences could not.7
Reinforcement Learning and AI
With its roots in reinforcement theory, reinforcement learning is an approach to machine learning where an AI agent learns to make decisions by interacting with an environment. The agent takes actions to maximize cumulative rewards over time, receiving feedback in the form of rewards or penalties. Through trial and error, the agent improves its strategy to achieve optimal outcomes. Both reinforcement learning and reinforcement theory are based on the same fundamental principles of learning and maintaining behaviors in response to consequences.8
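The trial-and-error loop described above can be sketched with a very small example. This is an epsilon-greedy "multi-armed bandit" agent, one of the simplest reinforcement learning setups, chosen here purely for illustration; the function name `run_bandit` and all parameter values are assumptions, not part of any cited source.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent choosing among actions with unknown reward odds.

    reward_probs[a] is the chance that action `a` pays a reward of 1.
    The agent never sees these odds; it only sees the rewards it receives.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    estimates = [0.0] * n  # running average reward observed per action
    counts = [0] * n       # how often each action has been tried
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(n)
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(n), key=lambda a: estimates[a])

        reward = 1 if rng.random() < reward_probs[action] else 0

        # The consequence updates the estimate, making rewarded
        # actions more likely to be chosen again.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total_reward = run_bandit([0.2, 0.8])
print(estimates, counts, total_reward)
```

The parallel with reinforcement theory is the update step: actions followed by rewards have their estimated value raised and are selected more often, while unrewarded actions are gradually chosen less.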
Related TDL Content
Positive Reinforcement and Negative Reinforcement
Understanding the difference between positive and negative reinforcement is critical in using these behavioral catalysts correctly. To get a deeper dive into each reinforcement aspect of operant conditioning, check out these two guides that focus on positive and negative reinforcement.
Using Behavioral Insights to Stay Motivated at Work
Concepts from reinforcement theory often come into play in the workplace, and being aware of them can help us adopt helpful work habits. This article discusses how reinforcements like acknowledgement, appreciation, and knowing the impact of our work can be used to motivate ourselves and others.
Sources
- Skinner, B. F. (1937). Two Types of Conditioned Reflex: A Reply to Konorski and Miller. Journal of General Psychology, Vol. 16, No. 1, 272-279.
- A. (2016, February 1). Reinforcement Theory of Motivation – IResearchNet. Psychology. http://psychology.iresearchnet.com/industrial-organizational-psychology/leadership-and-management/reinforcement-theory-of-motivation/
- Banaji, M. R. (2011). Reinforcement Theory. The Harvard Gazette. Retrieved from https://news.harvard.edu/gazette/story/2011/10/reinforcement-theory/
- Viken, R. & McFall, R. (1994). Paradox Lost: Implications of Contemporary Reinforcement Theory for Behavior Therapy. Current Directions in Psychological Science. Retrieved from https://journals.sagepub.com/doi/10.1111/1467-8721.ep10770581
- Pink, D. (2009). The Puzzle of Motivation. TED Global. Retrieved from https://www.ted.com/talks/dan_pink_the_puzzle_of_motivation/transcript
- N.a. (2019). The Seat Belt Reminder – What’s that noise all about? News, IEE. Retrieved from https://www.iee-sensing.com/en/blog/details/2019/09/the-seat-belt-reminder-what-s-that-noise-all-about.html
- McLeod, S. (2024, February 2). Operant Conditioning: What it is, How it Works, and Examples. Simply Psychology. https://www.simplypsychology.org/operant-conditioning.html
- Mummert, T., Subramanian, D., Vu, L., & Pham, N. (2022, September 15). What is reinforcement learning? IBM Developer. https://developer.ibm.com/learningpaths/get-started-automated-ai-for-decision-making-api/what-is-automated-ai-for-decision-making/
About the Authors
Oorja Majgaonkar
Oorja is a former content creator with a passion for behavioral science. She previously created content for The Decision Lab, and her insights continue to be valuable to our readers.
Sarah Chudleigh
Sarah Chudleigh is passionate about the accessible distribution of academic research. She has had the opportunity to practice this as an organizer of TEDx conferences, editor-in-chief of her undergraduate academic journal, and lead editor at the LSE Social Policy Blog. Sarah gained a deep appreciation for interdisciplinary research during her liberal arts degree at Quest University Canada, where she specialized in political decision-making. Her current graduate research at the London School of Economics and Political Science examines the impact of national values on motivations to privately sponsor refugees, a continuation of her interest in political analysis, identity, and migration policy. On weekends, you can find Sarah gardening at her local urban farm.