Understanding Operant Conditioning: Examples and the Origins of the Learning Method

Humans can learn behaviors based on positive and negative consequences. Positive consequences lead to more frequent repetition of those behaviors, while negative consequences make those behaviors less likely to occur.

Operant behavior occurs in natural settings, not just in experiments. Reinforcement and punishment also happen in more structured environments like therapy sessions or classrooms.

Classical conditioning links stimuli to automatic responses.

Operant conditioning uses voluntary behaviors, maintained by consequences. Scientists motivated pigeons to increase behavior frequency by rewarding them with food in experimental chambers.

By using this process, children and employees learn to behave in ways that earn reinforcement and avoid punishment. Positive reinforcers, like praise or gifts, encourage desired behaviors, while punishments, like spanking, discourage undesirable behaviors.

Examples

Operant conditioning examples are all around us. For instance, a student completes tasks promptly to receive a reward, or an employee finishes assignments to earn promotions or praise. Additional examples include:

Receiving applause from the audience after performing in a community theater play. This is a positive reinforcer that will inspire you to give more performances and try out more roles.
Training your dog to fetch by praising him and patting his head when he performs correctly. This is another example of a positive reinforcer.
A professor tells his students that they can skip the comprehensive final exam if they maintain full attendance throughout the semester. By removing the final exam, which acts as an unpleasant stimulus, the professor negatively reinforces full attendance.
Failing to submit a project on time makes your boss angry and harms your reputation with your boss and colleagues. This is a positive punisher, which decreases the probability that projects will be submitted late in the future.
Parents punish a teenage girl by taking her phone away because she did not clean her room as told. This is a negative punishment: the parents take away the phone, a positive stimulus.
A child who is under threat of losing recess privileges for a certain behavior will avoid that behavior. This possibility of punishment results in a decline in disruptive behaviors.

Some examples demonstrate an increase in behavior due to the promise or possibility of receiving a reward. The process also aids in decreasing a specific behavior by applying a negative outcome or removing a desirable one.

The Origin

B.F. Skinner, a behaviorist, introduced operant conditioning, which is also known as Skinnerian conditioning. He believed that psychology should focus on observable behaviors rather than internal thoughts or motivations.

Behaviorism emerged as a significant influence in psychology during the early 20th century. It initially drew on the ideas put forth by John Watson and emphasized associative learning.

Skinner focused on understanding how the consequences of people’s actions influence their behavior. He believed that the response produced by classical conditioning is due to innate reflexes.

He named this behavior “respondent”.

Skinner extended Thorndike’s theory, arguing that a wide range of behaviors results from operant conditioning. His experiments showed that reinforced behaviors tend to be repeated and punished ones abandoned. Without reinforcement or punishment, this association fades away.

Skinner distinguished operant from respondent behaviors. Operant behaviors are determined by their consequences. The term “operant” refers to behaviors that operate on the environment to produce outcomes. His theory explained how we acquire different behaviors.

Skinner’s theory was influenced by Edward Thorndike’s “Law of Effect”.

Thorndike believed that outcomes are tied to actions: behavior followed by a satisfying outcome is likely to be repeated, while behavior followed by an unpleasant outcome is less likely to recur. Skinner built on Thorndike’s ideas by introducing reinforcement, which promotes repeated behaviors. To examine the effects of operant conditioning, Skinner conducted experiments in a box called the “Skinner box”.

The process is based on the idea that actions followed by reinforcement are strengthened and repeated. For instance, if your jokes make people laugh, you’ll likely tell similar jokes again. Similarly, if a certain gesture gets praise, you’re more likely to repeat it. The action is strengthened because it led to a desirable outcome.

Conversely, actions followed by undesirable consequences are weakened. For instance, if you tell a joke and receive no laughter, you are unlikely to repeat it. Likewise, if a specific gesture leads to punishment, it is less likely to be repeated.

The Components

Reinforcement and punishment

Reinforcement can boost positivity, self-esteem, and achievement, whereas punishment often fails to create lasting change. Reinforcement creates a positive atmosphere, helps internalize desired behavior, and can be tailored to individual needs. It motivates, supports lasting change, and promotes harmony.

Positive reinforcement

Positive reinforcement is when a favorable stimulus is given as a result of a specific behavior. For instance, a child gets candy for good behavior, or a teacher compliments a student for good grades. These rewards encourage the repetition of the desired behavior.

Skinner placed hungry rats in boxes fitted with a lever. A rat would accidentally hit the lever and receive food, and it soon learned to go directly to the lever. The desire for food made the rat repeat the action. Positive reinforcement thus strengthens behavior through rewarding consequences. For example, a student does homework on time to earn the teacher’s reward.

Negative reinforcement

Negative reinforcement strengthens a behavior by removing an unfavorable experience when that behavior occurs. For instance, if a monkey presses a lever, the experimenter stops giving electric shocks.

The monkey’s lever pressing is negatively reinforced because it removes the unwanted electric shocks. Rats in Skinner boxes also experienced shocks but accidentally discovered they could switch them off using a lever.

To escape the shock, the rats repeated the action. They were also taught to avoid the shock altogether: a light came on just before the electric shock, and the rats learned to press the lever as soon as the light came on, because doing so prevented the shock.

Some other reinforcements include:

Primary reinforcement

Primary reinforcement involves stimuli that satisfy a basic need, like the need for food. This type of reinforcement is naturally rewarding and doesn’t require learning; the response to it is automatic.

For example, when you smell delicious food, your mouth waters. This reaction to a natural stimulus is innate. Primary reinforcement is mostly the result of evolution, ensuring the survival of the species.

Salivation aids digestion. Primary reinforcers, which are biological in nature, elicit involuntary responses and vary in their impact on individuals based on genetics and experience. Some people naturally tolerate high temperatures better than others, either due to innate traits or repeated exposure.

Conditioned reinforcers

Not every reinforcer is inherently desired. Conditioning is sometimes necessary before a stimulus can effectively reinforce behavior. Such neutral stimuli become reinforcing when paired with primary reinforcers.

Conditioned reinforcers, like money, motivate behavior. Although not inherently desired, paper money becomes a reinforcer when paired with essential goods. This widely used reinforcement is present in our daily lives.

Extinction

In operant conditioning, extinction occurs when a rewarded behavior is no longer reinforced. The speed of extinction varies based on the reinforcement schedule. Undesirable behavior disappears when ignored or unrewarded. For example, if a teacher stops giving attention to a disruptive student, the behavior eventually stops.

Positive punishment and negative punishment

The opposite of reinforcement is punishment. Punishment is an aversive event that discourages or weakens the behavior it follows. It works by applying an unpleasant stimulus or by removing a rewarding one.

Positive punishment

Positive punishment occurs when an unfavorable outcome follows a certain behavior. For example, a parent spanks a child for using curse words.

Negative punishment

Negative punishment is when something good is taken away because of bad behavior. For instance, a child loses their allowance for misbehaving. Punishment is still used, but some researchers say it isn’t always effective. It may stop the behavior temporarily, but the behavior can come back later, sometimes worse than before.

Punishment can also cause unwanted side effects. When a teacher punishes a child, the child may become fearful and uncertain about how to avoid the undesired behavior. In some cases, punishment can intensify the child’s undesired behavior.

Skinner and other researchers propose reinforcing desired behavior and disregarding undesired behavior. Reinforcement informs individuals about the behavior that is desired, whereas punishment indicates the behavior that should be avoided.

However, punishment can cause many problems. These include:

A punished behavior is not forgotten. Rather, it is suppressed, and it returns in the future when the punishment is no longer present.
It causes fear, which can generalize to other situations. For example, a child might develop a fear of going to school if a teacher punishes them for undesired behavior.
It does not guide a person towards a desirable behavior. Reinforcement tells an individual what to do, while punishment only tells them what not to do.

Appetitive stimuli are rewarding and aversive stimuli are unwanted. They are used in different types of reinforcement and punishment. For example, giving a child candy as a reward for good behavior or studying is positive reinforcement with an appetitive stimulus.

Candy increases desired behavior, while negative punishment takes away TV privileges for misbehavior. Yelling at a child is positive punishment as it uses aversive stimulus to eliminate disliked behavior.
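The four combinations described above can be laid out as a small lookup table. The following Python sketch is only an illustration: the names `Change`, `Stimulus`, and `classify` are made up for this example, and the mapping simply restates the definitions given in this section.

```python
from enum import Enum


class Change(Enum):
    ADDED = "added"      # a stimulus is presented after the behavior
    REMOVED = "removed"  # a stimulus is taken away after the behavior


class Stimulus(Enum):
    APPETITIVE = "appetitive"  # rewarding, e.g. candy or TV privileges
    AVERSIVE = "aversive"      # unwanted, e.g. yelling or an electric shock


def classify(change, stimulus):
    """Return the operant conditioning term for one consequence."""
    table = {
        (Change.ADDED, Stimulus.APPETITIVE): "positive reinforcement",
        (Change.REMOVED, Stimulus.AVERSIVE): "negative reinforcement",
        (Change.ADDED, Stimulus.AVERSIVE): "positive punishment",
        (Change.REMOVED, Stimulus.APPETITIVE): "negative punishment",
    }
    return table[(change, stimulus)]


# Hypothetical examples taken from the text above.
print(classify(Change.ADDED, Stimulus.APPETITIVE))    # candy for studying
print(classify(Change.REMOVED, Stimulus.APPETITIVE))  # TV privileges taken away
print(classify(Change.ADDED, Stimulus.AVERSIVE))      # yelling at a child
```

In this layout, "positive" and "negative" only say whether a stimulus is added or removed, while "reinforcement" and "punishment" say whether the behavior becomes more or less likely.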

Schedules of Reinforcement

Reinforcement is complex and influenced by various factors. Skinner’s research showed that the frequency and timing of reinforcement affect how behaviors are acquired and modified. For instance, in a Skinner box, a rat’s learning is influenced by when and how often it is reinforced.

If rats don’t get food when they press the lever, they will eventually stop trying. This suggests that a behavior continues only as long as it is rewarded. Skinner found that how often a behavior is rewarded affects how well it is learned.

He identified different schedules of reinforcement, each with its own frequency and timing. These patterns had varying effects on extinction and learning speed. Skinner and Ferster devised different ways of delivering reinforcement and compared them on two measures.

Response Rate

The response rate is the rate at which the rat presses the lever.

Extinction Rate

The extinction rate is how quickly the rat stops pressing the lever once reinforcement ends. Skinner discovered that this rate varies by schedule: extinction is slowest with variable-ratio reinforcement and fastest with continuous reinforcement.

Continuous reinforcement

Continuous reinforcement is when every occurrence of a behavior is followed by reinforcement, leading to quick learning. If reinforcement stops, the behavior declines and eventually stops, which is known as extinction. Under this schedule, a human or animal is positively reinforced every time the specific behavior occurs. The response rate is slow, while extinction is fast.

Fixed-ratio schedules

This schedule rewards behavior after a set number of responses. For example, a child may receive a star after every fifth chore. However, the response rate slows down briefly once the reward is received. The behavior is reinforced only after it has occurred a specified number of times. The response rate is fast, and the extinction rate is medium.

Variable-ratio schedules

This schedule rewards behavior after an unpredictable number of responses, so the number required for a reward varies. It is the schedule behind slot machines and is hard to extinguish because the learner cannot predict which response will pay off. Examples are fishing and gambling, where reinforcement arrives unpredictably. The rate of response is fast, but extinction is slow.

Fixed-interval schedules

Rewards are given after a set amount of time, like getting paid hourly. Reinforcement is delivered for a correct response once the fixed interval has elapsed, such as a pellet for a lever press after a set period. The response rate increases as the reward nears but slows after it is received. Response and extinction rates are both moderate.

Variable-interval schedules

In this schedule, rewards are given after unpredictable amounts of time. For instance, a child gets an allowance at irregular intervals for good behavior, motivating them to keep behaving well. Likewise, paying someone at unpredictable times leads to a quick response rate and a slow extinction rate.
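The schedules above differ only in the rule that decides whether a given response earns a reward. The Python sketch below expresses each rule as a small class. It is a simplified illustration, not a model of Skinner’s apparatus: the class names, the `respond` method, the abstract time units, and the uniform random draws used for the variable schedules are all assumptions made for this example.

```python
import random


class FixedRatio:
    """Reward every n-th response (n=1 gives continuous reinforcement)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now=0.0):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reward after an unpredictable number of responses averaging mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.target = self._draw()

    def _draw(self):
        # Assumption: uniform draw on 1..(2 * mean_n - 1), whose mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, now=0.0):
        self.count += 1
        if self.count >= self.target:
            self.count = 0
            self.target = self._draw()
            return True
        return False


class FixedInterval:
    """Reward the first response made once `interval` time units have passed."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reward = 0.0

    def respond(self, now):
        if now - self.last_reward >= self.interval:
            self.last_reward = now
            return True
        return False


class VariableInterval:
    """Reward the first response after an unpredictable wait averaging mean_interval."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last_reward = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform wait between 0 and twice the mean interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now):
        if now - self.last_reward >= self.wait:
            self.last_reward = now
            self.wait = self._draw()
            return True
        return False


# A child who earns a star after every fifth chore (fixed ratio of 5).
stars = FixedRatio(5)
print([stars.respond() for _ in range(10)])
```

The ratio schedules count responses and ignore the clock, while the interval schedules watch the clock and reward at most one response per elapsed interval. That difference is why extra responding pays off under a ratio schedule but not under an interval schedule.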

Application

Behavior Modification

Operant conditioning underlies behavior modification, a set of therapies used to treat psychological issues in children and adults, such as bedwetting, phobias, and anxiety.

Behavior modification changes behavior by altering environmental events, ignoring undesired behavior and reinforcing desired behavior. It relies on the connection between responses and stimuli to shape behavior, using operant conditioning to understand motives and consequences.

Three types of responses are generally identified: neutral, reinforcing, and punishing. Neutral responses neither increase nor decrease the likelihood that a behavior is repeated, while reinforcing responses increase it and punishing responses decrease it. Behavior shaping and token economy are examples of behavior modification.

Token economy

Operant conditioning explains various behaviors, from learning to language acquisition. One practical application is the token economy, used in prisons, psychiatric hospitals, and schools.

Token economy is a method of behavior modification where desired behaviors are rewarded with tokens, such as buttons or stickers. These tokens can be exchanged for real rewards, like privileges or activities, creating a system of reinforcement.

For example, a primary school teacher may use stickers as tokens to reward young children.

Token economies help manage psychiatric patients, but patients may become overly reliant on the tokens and struggle to adjust to society afterwards. Staff in token economy programs must avoid favoritism or neglect, and they need proper training so that tokens are awarded fairly, including across shift changes in psychiatric facilities and prisons.

Token economies have three components: the desired behavior, the tokens earned, and the exchange of tokens for rewards. They are useful for changing behavior and sustaining motivation in different environments.
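The three components can be made concrete with a short Python sketch. Everything in it is hypothetical: the `TokenEconomy` class, the behaviors, the token values, and the reward prices are invented for illustration and do not come from any real program.

```python
class TokenEconomy:
    """Minimal ledger: desired behaviors earn tokens, tokens buy rewards."""

    def __init__(self, earn_rules, prices):
        self.earn_rules = earn_rules  # desired behavior -> tokens earned
        self.prices = prices          # reward -> token cost
        self.balances = {}            # person -> tokens currently held

    def record_behavior(self, person, behavior):
        """Award tokens for a desired behavior; other behaviors earn nothing."""
        tokens = self.earn_rules.get(behavior, 0)
        self.balances[person] = self.balances.get(person, 0) + tokens
        return tokens

    def exchange(self, person, reward):
        """Trade tokens for a reward; return False if the balance is too low."""
        price = self.prices[reward]
        if self.balances.get(person, 0) < price:
            return False
        self.balances[person] -= price
        return True


# Hypothetical classroom: stickers serve as tokens.
classroom = TokenEconomy(
    earn_rules={"finished homework": 2, "helped a classmate": 1},
    prices={"extra recess": 3},
)
classroom.record_behavior("Ana", "finished homework")
print(classroom.exchange("Ana", "extra recess"))  # not enough stickers yet
classroom.record_behavior("Ana", "helped a classmate")
print(classroom.exchange("Ana", "extra recess"))  # now the exchange succeeds
```

Keeping the earning rules and prices in one shared table is one way to support the fairness concern mentioned above, since every staff member applies the same values.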

Behavior Shaping

Skinner studied behavior shaping and its connection to complex behaviors. He observed that rewards and punishments play a role in bringing an organism closer to desired behavior. As the organism progresses, the contingencies for rewarding should change.

Skinner believed that most behaviors in humans and animals can be achieved through successive approximation and shaping. This process involves reinforcing each successive part of a behavior until the entire behavior is mastered.

In behavior shaping, successive approximations of the response are reinforced, which leads the subject towards the desired behavior.

Behavior shaping occurs, for example, when a child learns to swim. They are praised for entering the water, and the praise continues as they learn different arm strokes, kicks, and eventually how to propel themselves through the water using both kicking and specific strokes. This process shapes the entire behavior. Skinner often used behavior shaping in his operant conditioning work.

The behavior shaping process uses reinforcement to encourage similar behaviors to the target behavior, leading to the desired outcome. This method is effective for training animals and humans to perform complex tasks.

Parents can use shaping, a technique in operant conditioning, to teach their daughter how to clean her room. By giving rewards at each step of the cleaning process, parents can achieve their goal of teaching their daughter this skill.
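Shaping can be pictured as a reward criterion that moves forward as the learner progresses. The Python sketch below is a deliberately simplified illustration using the swimming example. The step list, the `shape` function, and the rule that the criterion advances one step past each rewarded behavior are assumptions made for this example, not a description of Skinner’s procedure.

```python
SWIM_STEPS = [
    "enter the water",
    "arm strokes",
    "kicks",
    "swim with strokes and kicks",  # the target behavior
]


def shape(observed, steps):
    """Reinforce successive approximations of the last behavior in `steps`.

    Returns a list of (behavior, reinforced) pairs. A behavior is reinforced
    only if it is at least as advanced as the current criterion, and each
    reinforcement raises the criterion so that earlier steps stop paying off.
    """
    criterion = 0  # index of the least advanced step that still earns praise
    log = []
    for behavior in observed:
        level = steps.index(behavior)
        reinforced = level >= criterion
        if reinforced:
            # Raise the bar, but never beyond the target behavior itself.
            criterion = min(level + 1, len(steps) - 1)
        log.append((behavior, reinforced))
    return log


session = [
    "enter the water",
    "enter the water",
    "arm strokes",
    "kicks",
    "arm strokes",
    "swim with strokes and kicks",
]
for behavior, reinforced in shape(session, SWIM_STEPS):
    print(behavior, "-> praised" if reinforced else "-> no praise")
```

Once a step has been rewarded, repeating it earns nothing, which reflects the idea that the contingencies for rewarding should change as the learner progresses.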

Advantages

Psychologists use operant conditioning to understand patient behavior. This theory explains the use of reinforcements in learning and how they affect conditioning outcomes. It also helps explain learning in real-life situations.

Parents use rewards and praise to shape their child’s behavior from a young age. They also use verbal discouragement or removing privileges to deter misbehavior. This process is a common way of learning and applies to different learning environments.

For students learning a new task, a variable-ratio schedule produces the highest rate of response.

At first, reinforcement is frequent to improve performance. For example, teachers praise students to motivate them. Later, praise is reserved for correct and exceptional answers. Ignoring bad behaviors can eliminate them. Teachers use encouragement, rewards, and praise to reinforce achievements, which is positive reinforcement in action.

Principles of operant conditioning also apply when adults give detention to, expel, or ground children for poor academic results. These punishments are likewise used to influence behavior.

Knowledge of success can also motivate future learning.

However, the type of reinforcement given needs to be varied to maintain behavior. This task is not easy, since a teacher may appear insincere if she thinks too much about how to behave. The uses are not limited to humans: the behavior of animals such as dogs is also shaped by using reinforcements to encourage obedience.

The process of conditioning helps explain why zoo animals display repetitive behaviors. Animals, like dogs, learn through rewards. When you reward them, you condition them to associate the action with something positive. Similarly, if you punish a dog by hitting it, you condition it to associate the action with something negative.

To integrate the process, follow these principles. If there are issues at either end of the spectrum, usage can be halted. Both ends must ensure positivity. Animal facilities train animals to move between enclosures, ensuring safe examinations by vets.

Criticism

While operant conditioning can explain many behaviors, it has drawn criticism. Critics argue that it neglects the role of cognitive and biological elements and is an incomplete explanation of learning.

Critics also argue that the approach relies on authority to direct behavior while ignoring personal discovery and curiosity, and that it can be used manipulatively. According to Skinner, the environment shapes behavior naturally, and people can use this knowledge for good or bad.

Scientists criticized Skinner’s theory for extrapolating animal studies, as he based his observations mostly on animal experimentation. Psychologists argue that this generalization is flawed because animals and humans differ cognitively and physically.

Operant conditioning is very effective but still has flaws. These issues are hard to trace outside your immediate vicinity.

Operant conditioning is a simple mechanism and is poorly suited to teaching complex concepts.

Therefore, you can run into problems if you are trying to communicate a complex issue to someone. The same problem arises if you try to teach concepts to an animal using operant conditioning. Another criticism is that it ignores cognitive processes.

The theory by Skinner overlooks species-specific behavior patterns and genetic predispositions, and is criticized for oversimplifying complex human behaviors. It neglects cognitive processes and individual differences in learning behavior.

This is why critics have labeled Skinner’s ideas deterministic. Operant conditioning holds only environmental factors responsible for an individual’s behavior and fails to consider the individual’s ability to act of their own free will.

The theory explains behavior changes but ignores cognitive and inherited factors in learning, as well as how humans and animals learn. For example, Kohler discovered in 1924 that primates solve problems quickly without trial and error. Bandura also suggested in 1977 that humans learn by observing, not just through personal experience.

Conclusion

While behaviorism has lost the dominance it held in the 20th century, operant conditioning remains an important tool in behavior modification and the learning process. Sometimes natural consequences lead to changes in behavior.

Learning, whether training a dog or children, requires effort and time. Determine the most effective reinforcements or punishments for your situation and establish a reinforcement schedule for positive outcomes. Ultimately, reinforcement is crucial for successful learning and desired behaviors.