Operant conditioning is a learning method that uses rewards and punishments for behavior. It is also known as instrumental conditioning. The process works by forming an association between a behavior and the consequence of that behavior.
Through this process, humans learn to behave in ways that lead to rewards. The consequences of a response, which can be positive or negative, determine the probability that the response will be repeated. Behavior that results in a positive consequence, or reward, will occur more frequently, while behavior that results in a negative consequence becomes less likely.
The word operant describes how an organism operates on its environment. Operant conditioning is not confined to experimental settings. Reinforcement and punishment occur in natural settings as well as in structured ones, such as a therapy session or a classroom.
Classical conditioning binds external stimuli to involuntary and reflexive responses.
Operant conditioning, by contrast, involves voluntary behaviors that are maintained by the consequences they produce. While studying operant conditioning, scientists placed pigeons in experimental chambers and gave them food as a reward at systematic intervals. By rewarding the pigeons for a certain behavior, the scientists were able to increase the frequency of that behavior.
Through this process, children, employees, and others learn to behave in ways that earn reinforcement and avoid punishment. Reinforcers such as praise or gifts encourage a behavior, while punishments such as spanking discourage it.
Examples of operant conditioning are all around us. They include a student completing his work quickly to get a reward, or an employee finishing work to earn a promotion or praise. More examples include:
- Receiving applause from the audience after performing in a community theater play. This is a positive reinforcer that will inspire you to give more performances and try out for more roles.
- Training your dog to fetch by praising him and patting his head when he performs correctly. This is another example of a positive reinforcer.
- A professor tells his students that they can skip the comprehensive final exam if they maintain full attendance throughout the semester. Removing the final exam, an unpleasant stimulus, negatively reinforces full attendance.
- Failing to submit a project on time makes your boss angry and harms your reputation with your boss and colleagues. This is a positive punisher, which decreases the probability that projects will be submitted late in the future.
- Parents punish a teenage girl by taking her phone away because she did not clean her room as she was told. This is negative punishment, since the parents take away a positive stimulus, the phone.
- A child who is under threat of losing recess privileges for a certain behavior will avoid that behavior. The possibility of punishment leads to a decline in disruptive behavior.
Some of these examples show an increase in behavior due to the promise or possibility of a reward. Others show the process decreasing a behavior, either by applying a negative outcome or by removing a desirable one.
Operant conditioning was first described by the behaviorist B.F. Skinner, which is why it is also called Skinnerian conditioning. As a behaviorist, Skinner held that psychology should be limited to the study of behaviors that are apparent and observable. He believed that it is not necessary to look at internal motivations and thoughts to explain behavior. Instead, he suggested looking at the observable causes of human behavior.
In psychology, behaviorism became a major force at the start of the 20th century. Early on, this school of thought was influenced by the ideas of John Watson. Early behaviorists focused on associative learning.
While others focused on classical conditioning, Skinner wanted to understand a different type of learning. He wanted to know how the consequences of people’s actions influence their behavior. According to Skinner, the responses involved in classical conditioning are innate reflexes that occur automatically.
He named this behavior “respondent”.
Skinner researched the process almost half a century after Thorndike’s publication. He extended the theory by arguing that all behaviors are in some way the result of operant conditioning. His experiments showed that behaviors followed by reinforcement tend to be repeated, while those followed by punishment do not. If the reinforcement and punishment are removed, the association eventually becomes extinct.
He distinguished between operant and respondent behaviors. Operant behaviors, according to Skinner, are those that are reinforced by their consequences. These consequences determine whether the behavior will be performed again. The word operant describes behavior that operates on the environment and produces consequences. Skinner’s theory helped explain how the range of behaviors we show every day is acquired.
Skinner’s theory was influenced by the psychologist Edward Thorndike, who had proposed what he called the “law of effect”.
According to Thorndike’s principle, actions that are followed by desirable outcomes are more likely to be repeated, while actions followed by undesirable outcomes are less likely to be repeated. Skinner introduced the concept of reinforcement into Thorndike’s ideas, specifying that reinforced behavior will most likely be repeated. To study operant behavior, Skinner conducted experiments in a chamber that came to be known as the “Skinner box”.
The process relies on a simple notion: actions followed by reinforcement are strengthened and are more likely to be repeated in the future. For example, if your jokes make everybody laugh, you will probably tell similar jokes again to get the same reaction. Likewise, if a certain gesture earns you praise, you are more likely to repeat it next time. The action is strengthened because it was followed by a desirable outcome.
Conversely, actions that result in undesirable consequences are weakened and become less likely to occur in the future. For example, if you tell the joke again and nobody laughs, you are less likely to repeat it. And if a certain gesture results in punishment, you are unlikely to repeat it.
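The idea that reinforced actions become more likely and punished actions less likely can be sketched as a simple update rule. The Python snippet below is only an illustration: the function name `update_probability`, the outcome labels, and the learning rate of 0.2 are assumptions made for this example, not something taken from Skinner’s or Thorndike’s work.

```python
def update_probability(p: float, outcome: str, rate: float = 0.2) -> float:
    """Return the new probability of repeating a behavior.

    p       -- current probability of the behavior (0.0 to 1.0)
    outcome -- "reinforced", "punished", or anything else for no consequence
    rate    -- hypothetical learning rate chosen for illustration
    """
    if outcome == "reinforced":
        # Move the probability part of the way toward 1.
        return p + rate * (1.0 - p)
    if outcome == "punished":
        # Move the probability part of the way toward 0.
        return p - rate * p
    # No consequence: the probability is left unchanged.
    return p


# Telling a joke: two laughs, then one silent room treated as a punisher.
p = 0.5
for outcome in ["reinforced", "reinforced", "punished"]:
    p = update_probability(p, outcome)
    print(outcome, round(p, 3))
```

The rule keeps the probability between 0 and 1, which is a modeling convenience rather than a claim about how learning actually works.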
Reinforcement and punishment
Reinforcement is used to maintain or intensify a particular type of desired behavior while punishment helps in reducing behavior that is disliked. In modifying behavior, reinforcement seems to be more effective as compared to punishment.
Positive reinforcement involves introducing a stimulus to a situation. It occurs when a favorable outcome occurs because of a particular behavior. For example, a child receiving candy after showing good behavior, or a compliment from the teacher after achieving good grades. These gifts or rewards increase the likelihood that the desired behavior will be repeated to get the reward again.
To demonstrate positive reinforcement, Skinner placed hungry rats in Skinner boxes. Each box had a lever, and as a rat moved about the box it would accidentally knock the lever. When it did so, a food pellet fell into a container. After some time, the rat learned to go straight to the lever. The desire for food made the rat repeat the action again and again. This shows that positive reinforcement strengthens a behavior by providing consequences that the individual finds rewarding. Another example of positive reinforcement is a student who completes his homework on time because the teacher rewards him for doing so.
Negative reinforcement occurs when an unfavorable experience is removed as a result of a particular behavior. It involves withdrawing a stimulus, that is, terminating an unpleasant state in response to the behavior. It strengthens the behavior because the behavior removes or stops the unpleasant experience. For example, when a monkey presses a certain lever, the experimenter stops giving the monkey electric shocks.
In this scenario, the lever-pressing behavior is negatively reinforced, since the monkey will press the lever again to remove the unwanted shocks. To show how negative reinforcement works, rats were placed in Skinner boxes and subjected to an electric current that caused them discomfort. As a rat moved about the box, it accidentally knocked the lever, and the current was switched off. The rats quickly learned how to turn off the current.
To escape this unpleasant consequence, the rats would repeat the action again and again. The rats were even taught to avoid the shock altogether: a light was switched on just before the electric current came on, and the rats learned to press the lever as soon as the light appeared, since doing so stopped the current from being switched on.
Some other reinforcements include:
Primary reinforcement involves stimuli that are innately desired, such as food. This type of reinforcement rests on a naturally occurring reaction to a particular stimulus. It does not need to be learned and is innate. The reaction occurs automatically when the stimulus is presented.
For example, when you smell delicious food, your mouth waters. This is a natural response that is innate. Primary reinforcement is mostly the result of evolution and helps ensure the survival of the species.
For example, salivating in anticipation of food helps the body optimize digestion. The reinforcers involved, i.e., primary reinforcers, are biologically important to an organism, and the responses they produce are involuntary. Although primary reinforcers occur naturally and are intrinsic, they affect each individual differently depending on experience and genetics. For example, some people tolerate high temperatures better than others. This tolerance may be inborn, or it may develop through repeated past exposure.
Not all reinforcers are innately desired. Sometimes subjects need to be conditioned before a stimulus will work to reinforce behavior. Such conditioned, or secondary, reinforcers strengthen behavior because they are associated with primary reinforcers, not because they are innately desired. A conditioned reinforcer is a neutral stimulus that, when paired with a primary reinforcer, acquires reinforcing properties similar to those of the primary reinforcer.
These stimuli then help in motivating an individual’s behavior. An example of a conditioned reinforcer is money. Paper money is used to acquire goods that are innately desired although paper money itself is not innately desired. The paper bills when paired with food, shelter, or water become reinforcers. This type of reinforcement is all around us and used by many people.
In operant conditioning, the fading of a previously reinforced behavior is called extinction. Extinction takes place after reinforcement stops, and the reinforcement schedule affects how quickly it happens. It occurs when a behavior goes unrewarded or is ignored, so that the behavior disappears with time. For example, if a teacher reacts to the behavior of a student who is acting out to get attention, the student may experience the reaction as a reward, which reinforces the behavior. If the teacher ignores the behavior instead, it disappears over time.
Positive punishment and negative punishment:
The opposite of reinforcement is punishment. Punishment weakens or discourages the behavior that it follows, since it is an aversive event. It works either by applying an unpleasant stimulus or by removing a rewarding one.
- Positive punishment
Positive punishment occurs when an unfavorable outcome follows a certain behavior. For example, a child is spanked by his parent for using curse words.
- Negative punishment
Negative punishment occurs when something favorable is removed because of an undesirable behavior. For example, a child’s weekly allowance is withheld because of his misbehavior. Although punishment is still used by many, some researchers have found that it is not always effective. Punishment may suppress a behavior for some time, but the undesired behavior will most likely come back in the long run, possibly with greater intensity.
Punishment can also have unwanted side effects. For example, a child who is punished by a teacher may become fearful and uncertain because they do not know what to do to avoid future punishment. In other circumstances, punishment may remove the fear attached to the undesired behavior, which increases the intensity of that behavior.
Skinner and other experimenters suggest reinforcing desired behavior and ignoring undesired behavior, since reinforcement tells an individual which behavior is desired, while punishment only tells an individual which behavior to avoid.
Punishment can cause several problems:
- A punished behavior is not forgotten. It is only suppressed, and it returns once the punishment is no longer present.
- It causes fear, which can generalize to other situations. For example, a child who is punished by a teacher for undesired behavior might become afraid of going to school.
- It does not guide the individual toward a desirable behavior. Reinforcement tells an individual what to do, while punishment only tells them what not to do.
A rewarding or pleasing stimulus is also called appetitive, while an unwanted or unrewarding stimulus is also called aversive. Appetitive stimuli are used in positive reinforcement and negative punishment, while aversive stimuli are used in negative reinforcement and positive punishment. An example of positive reinforcement is rewarding a child with candy to motivate them to behave nicely or to study.
The candy is an appetitive stimulus that helps increase the desired behavior. Negative punishment, on the other hand, would be revoking the child’s television privileges when he misbehaves. Here an appetitive stimulus is removed in order to reduce the misbehavior. A parent who keeps yelling at a child for misbehaving is using positive punishment, since an aversive stimulus, the yelling, is applied to eliminate a disliked behavior.
Furthermore, if the child still misbehaves, the frustrated parent may negotiate by offering to reduce the chores the child has to complete that week in exchange for the desired behavior. This is negative reinforcement, since the chores (an aversive stimulus) are removed to encourage the desired behavior.
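The four cases described above follow from two questions: is the stimulus appetitive or aversive, and is it added or removed? A minimal Python sketch of that lookup is shown below. The function name `classify_contingency` and the string labels are hypothetical and chosen only for this illustration.

```python
def classify_contingency(stimulus: str, change: str) -> str:
    """Name the operant procedure for a stimulus type and what happens to it.

    stimulus -- "appetitive" (pleasant) or "aversive" (unpleasant)
    change   -- "added" or "removed" after the behavior
    """
    table = {
        ("appetitive", "added"): "positive reinforcement",    # candy for studying
        ("aversive", "removed"): "negative reinforcement",    # fewer chores
        ("aversive", "added"): "positive punishment",         # yelling
        ("appetitive", "removed"): "negative punishment",     # no television
    }
    return table[(stimulus, change)]


print(classify_contingency("appetitive", "removed"))  # negative punishment
```

Note that "positive" and "negative" refer to adding or removing a stimulus, not to whether the outcome is good or bad for the learner.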
Schedules of Reinforcement
Reinforcement is not straightforward. It is affected by a number of factors that influence how quickly and how well an individual learns. Skinner found that the strength and speed of acquisition are greatly affected by how often and when behaviors are reinforced. In simple words, the frequency and timing of reinforcement influence how old behaviors are modified and how new ones are learned. Imagine a rat in a Skinner box.
If no food is delivered when the rat presses the lever, the rat will stop pressing it after several attempts. This shows that an individual continues to act in a certain way only as long as the behavior is rewarded. In the real world, however, behavior is generally not reinforced every time. Through his experiments, Skinner found that the frequency of reinforcement affects how quickly and successfully an individual learns a specific behavior.
He identified specific schedules of reinforcement, each with its own frequency and timing. These different reinforcement patterns had varying effects on learning speed and on extinction. Ferster and Skinner described several ways of delivering reinforcement and looked at their effects on two measures:
- Response rate: the rate at which the rat presses the lever.
- Extinction rate: the rate at which the rat gives up pressing the lever.
Skinner found that the type of reinforcement that produces the slowest extinction rate is variable-ratio reinforcement, and the type that produces the quickest extinction rate is continuous reinforcement.
Continuous reinforcement takes place when each and every performance of a given behavior is reinforced. In other words, a human or animal is positively reinforced any time the specific behavior occurs. Continuous reinforcement ensures quick learning. However, the behavior will decline and ultimately stop if reinforcement is stopped. This is referred to as extinction. The rate of response is generally slow while the rate of extinction is fast.
A fixed-ratio schedule rewards behavior only after a specified number of responses. For example, a child may receive a star after completing every 5th chore. Under this schedule, the response rate slows down right after the reward is received. Behavior is reinforced only after it has occurred a specified number of times. The rate of response is fast while the extinction rate is medium.
A variable-ratio schedule does not specify the number of responses required to get a reward. The number varies, so behavior is reinforced only after an unpredictable number of responses. A high rate of response is generally observed under this schedule. Because the reward is unpredictable, the behavior is also hard to extinguish. This kind of reinforcement is seen in slot machines, and other examples include gambling and fishing. The rate of response is fast while the rate of extinction is slow.
In a fixed-interval schedule, the reward is provided only after a specific amount of time has passed, provided at least one correct response has been made. Examples include getting paid by the hour, or a pellet being delivered at fixed times as long as the lever has been pressed at least once. The response rate increases when the reward is near and slows down once the reward has been received, similar to the fixed-ratio schedule. Both the rate of response and the rate of extinction are medium.
In a variable-interval schedule, the amount of time between rewards varies. Reinforcement is given after an unpredictable amount of time has passed, provided one correct response has been made, for example on average every 5 minutes. One example is a child receiving an allowance at various times as long as they exhibit good behavior. In anticipation of eventually receiving the allowance, the child will continue behaving well. Another example is a person who is paid at unpredictable times. The response rate is fast while the rate of extinction is slow.
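The difference between ratio and interval schedules can be made concrete with a small simulation. The Python sketch below is an illustration only: the function names, the choice of a uniform random draw for the variable schedules, and the convention that the clock starts at time 0 are all assumptions made for this example. Each function returns a list saying which responses were reinforced.

```python
import random
from typing import List


def fixed_ratio(n_responses: int, ratio: int) -> List[bool]:
    """Reinforce every `ratio`-th response (e.g. a star after every 5th chore)."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses: int, mean_ratio: int, rng: random.Random) -> List[bool]:
    """Reinforce after an unpredictable number of responses averaging `mean_ratio`."""
    outcomes = []
    count = 0
    needed = rng.randint(1, 2 * mean_ratio - 1)  # assumed distribution
    for _ in range(n_responses):
        count += 1
        if count >= needed:
            outcomes.append(True)
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
        else:
            outcomes.append(False)
    return outcomes


def fixed_interval(response_times: List[float], interval: float) -> List[bool]:
    """Reinforce the first response made once `interval` has passed since the last reward."""
    outcomes = []
    last_reward = 0.0  # assumption: the clock starts at time 0
    for t in response_times:
        if t - last_reward >= interval:
            outcomes.append(True)
            last_reward = t
        else:
            outcomes.append(False)
    return outcomes


def variable_interval(response_times: List[float], mean_interval: float,
                      rng: random.Random) -> List[bool]:
    """Like fixed_interval, but the required wait is redrawn after every reward."""
    outcomes = []
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_interval)  # assumed distribution
    for t in response_times:
        if t - last_reward >= wait:
            outcomes.append(True)
            last_reward = t
            wait = rng.uniform(0, 2 * mean_interval)
        else:
            outcomes.append(False)
    return outcomes


# Every 5th of 10 responses is reinforced: positions 5 and 10.
print(fixed_ratio(10, 5))
# Responses at these times (in minutes) with a 5-minute fixed interval.
print(fixed_interval([1, 2, 5, 6, 11], 5))
```

Continuous reinforcement is the special case `fixed_ratio(n, 1)`, where every response is reinforced. The sketch models only when reinforcement is delivered, not how the learner's response rate or extinction rate changes as a result.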
Operant conditioning is also linked with behavior modification. Behavior modification is an approach used to treat certain psychological problems in children and adults, including bedwetting, phobias, and anxiety. It involves therapies that are based on operant conditioning.
Its main principle is to change the environmental events that are related to a person’s behavior, for example by reinforcing desired behaviors and ignoring undesired ones. Behavior modification uses a framework based on the connection between stimuli and responses to modify or form behavior. Operant conditioning is its foundation: it helps in understanding a person’s behavior and in determining the motive for the behavior along with its consequences.
Three types of responses are generally identified: neutral, reinforcing, and punishing. Neutral responses are neither positive nor negative, reinforcing responses are deemed positive, and punishing responses are deemed negative. Behavior shaping and the token economy are examples of behavior modification.
Operant conditioning is used to explain a wide range of behaviors, from the learning process to language acquisition. One of its practical applications is the token economy, which is widely used in prisons, psychiatric hospitals, and schools.
Behavior modification can be implemented through a token economy. In a token economy, desired behaviors are reinforced with tokens such as buttons, stickers, digital badges, fake money, poker chips, or other objects, which can then be exchanged for rewards. The tokens act as secondary reinforcers, and the real rewards they are exchanged for act as primary reinforcers. The rewards can range from privileges to activities.
An example of a token economy is a primary school teacher using stickers as tokens to reward young children.
Token economies have been quite effective in managing psychiatric patients. However, such patients may have difficulty adjusting to society if they become too reliant on the tokens. The staff who implement a token economy program also hold a great deal of power. For the program to work properly, staff must not favor or ignore particular individuals. They therefore need proper training to award tokens fairly and consistently, including across shift changes in psychiatric hospitals and prisons.
Token economies generally have three components: the behavior that an individual needs to exhibit, the tokens earned for showing that behavior, and the exchange of tokens for rewards. A token can be paired with many different rewards. This makes the token economy a useful tool when preferences change or when environmental factors change a person’s motivation for a particular kind of reinforcement.
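The three components of a token economy (target behaviors, tokens earned, and the exchange of tokens) can be sketched as a small data structure. The Python class below is a hypothetical illustration: the class name `TokenEconomy`, its methods, and the example behaviors and rewards are all invented for this sketch and do not describe any particular program.

```python
from typing import Dict


class TokenEconomy:
    """A minimal token economy for one individual."""

    def __init__(self, token_values: Dict[str, int], reward_costs: Dict[str, int]):
        self.token_values = token_values  # target behavior -> tokens earned
        self.reward_costs = reward_costs  # reward -> tokens required
        self.balance = 0                  # tokens currently held

    def record_behavior(self, behavior: str) -> int:
        """Award tokens for a target behavior; other behaviors earn nothing."""
        earned = self.token_values.get(behavior, 0)
        self.balance += earned
        return earned

    def exchange(self, reward: str) -> bool:
        """Trade tokens for a reward if the balance covers its cost."""
        cost = self.reward_costs[reward]
        if self.balance < cost:
            return False
        self.balance -= cost
        return True


# A teacher gives 2 stickers per finished worksheet; extra recess costs 3.
economy = TokenEconomy({"worksheet finished": 2}, {"extra recess": 3})
economy.record_behavior("worksheet finished")
economy.record_behavior("worksheet finished")
print(economy.exchange("extra recess"), economy.balance)  # True 1
```

Because the mapping from tokens to rewards is kept separate from the mapping from behaviors to tokens, the available rewards can be changed when preferences change without altering which behaviors are reinforced.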
Skinner also studied another important notion known as behavior shaping, which works through successive approximations. According to Skinner, complex behaviors can be produced through the principles of operant conditioning if rewards and punishments are delivered in a way that brings an organism closer and closer to the desired behavior. Each time the organism moves a step closer to that behavior, the conditions required to obtain the reward should shift.
According to Skinner, most human and animal behaviors are the product of this kind of successive approximation. Shaping builds complex behaviors step by step. It starts by reinforcing the first part of the behavior. Once the learner has mastered the first part, reinforcement is given for the second part. Eventually, reinforcement is given only when the learner performs the entire behavior.
In behavior shaping, successive approximations of a response are reinforced, which leads the subject toward the desired behavior.
An example of behavior shaping is a child learning to swim. First, she is praised for getting in the water. The praise continues when she learns how to kick, when she learns specific arm strokes, and when she finally learns to propel herself through the water by kicking and stroking at the same time. The entire behavior is shaped through this process. Skinner often used a shaping approach in his work on operant conditioning.
The behavior shaping process involves reinforcing successive approximations of a target behavior. The subject first performs a behavior that only resembles the target behavior. Reinforcement is then used to change this behavior gradually until the target behavior is performed. Shaping is highly useful in training animals to perform complex tasks, and it is also useful in human learning.
For example, parents can use this technique to teach their daughter how to clean her room. Shaping lets them break the main goal into steps that she can master one at a time. By rewarding each step of cleaning, the parents can achieve their goal of teaching their daughter to clean her room. Through the association of behavior and consequences, operant conditioning helps in shaping behavior.
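Shaping by successive approximations can be sketched as a criterion that tightens each time it is met. The Python snippet below is a simplified illustration under strong assumptions: behavior is reduced to a single number, closeness to the target is measured as a plain difference, and the criterion is halved after every reinforced attempt. The function name `shape` and its parameters are hypothetical.

```python
from typing import List


def shape(attempts: List[float], target: float,
          start_tolerance: float, shrink: float = 0.5) -> List[bool]:
    """Reinforce attempts that are close enough to the target behavior.

    After each reinforced attempt the tolerance shrinks, so the next
    reward requires a closer approximation of the target.
    """
    tolerance = start_tolerance
    reinforced = []
    for attempt in attempts:
        if abs(attempt - target) <= tolerance:
            reinforced.append(True)
            tolerance *= shrink  # shift the criterion toward the target
        else:
            reinforced.append(False)
    return reinforced


# The second attempt repeats the first, but the criterion has already
# tightened, so it no longer earns a reward.
print(shape([6, 6, 9, 8.5], target=10, start_tolerance=4))
# [True, False, True, False]
```

Real shaping involves a trainer's judgment about what counts as a closer approximation, which a single numeric distance does not capture.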
Operant conditioning has helped psychologists understand the behavior of their patients. The theory helps explain how reinforcement can be used effectively in learning and how reinforcement schedules affect the outcome of conditioning. Its ability to explain learning in real-life situations is another advantage.
Parents use rewards to nurture their child’s behavior from an early age. Desired behaviors are reinforced with praise. When a child misbehaves, verbal discouragement or the removal of privileges is used to deter the child from misbehaving again. The process is a common way of learning, since the mind remembers whether an action led to a good or bad outcome. It also applies to various learning environments.
In conventional learning situations, it applies largely to classroom and student management rather than to learning content. It also helps in shaping skill performance. Giving learners feedback on their performance, such as encouragement, approval, and affirmation, also helps in shaping behavior.
For students learning a new task, a variable-ratio schedule produces the highest rate of response.
Initially, however, reinforcement should be given frequently, and then less often as performance improves. For example, a teacher first praises every attempt in order to encourage students to answer questions. Gradually the teacher praises only correct answers, and over time only exceptional answers. Teachers can extinguish behaviors such as dominating class discussions or tardiness by ignoring them instead of giving them attention. Teachers also give encouragement, rewards, and praise for students’ achievements. This is an example of positive reinforcement.
The principles of operant conditioning also apply when adults give detention to, expel, or ground children for poor academic performance. These punishments are used to influence behavior.
Knowledge of success also helps motivate future learning.
However, the type of reinforcement given needs to be varied to maintain the behavior. This is not an easy task, since a teacher may appear insincere if she thinks too much about how to behave. These uses are not limited to humans. The behavior of animals such as dogs is also shaped by using reinforcement to encourage obedience.
The process also helps explain why zoo animals display repetitive and stereotypical behaviors. Animals such as dogs learn with the help of rewards. When you reward an animal for doing something, you are conditioning it to associate the action with something positive. Likewise, if you punish your dog by hitting it, you are conditioning it to associate the action with something negative.
You can integrate the process into training by making use of these principles. If an approach at either end of the spectrum, reward or punishment, causes problems, it is possible to stop using it, and in either case the overall experience should remain positive for the animal. In animal facilities, zookeepers train animals to move between enclosures. This helps make sure that vets can conduct examinations safely.
While operant conditioning can explain many behaviors, it has also drawn criticism. Critics believe that it neglects the role of cognitive and biological factors and is therefore an incomplete explanation of learning.
In this account, behavior is shaped by authority figures, so it ignores the role of curiosity and the individual’s ability to make discoveries. Critics also accuse the process of encouraging manipulative and controlling behavior. Skinner’s answer was that the environment naturally controls behavior and that people can use this knowledge for good or for ill.
Since Skinner based his observations mostly on animal experiments, scientists have criticized the theory for extrapolating from animal studies to make predictions about human behavior. Psychologists consider this generalization flawed because animals and humans differ both cognitively and physically.
Operant conditioning is very effective, but it still has flaws, and these can be hard to see from within your own immediate surroundings.
One limitation is that operant conditioning is simple and is poorly suited to teaching complex concepts.
You can therefore run into problems if you use it to communicate a complex idea to someone. The same problem arises if you try to teach concepts to an animal using operant conditioning. Another criticism is that it ignores cognitive processes.
It assumes that learning occurs only through reinforcement. It also overlooks species-specific behavior patterns and genetic predispositions. Skinner’s theory has also been criticized for oversimplifying complex human behaviors. Because it holds that behavior is learned through reinforcement, it generally neglects the cognitive processes and individual differences that also influence human behavior.
This is why critics have labeled Skinner’s ideas deterministic. Operant conditioning treats environmental factors as solely responsible for an individual’s behavior and fails to consider the individual’s ability to act of their own free will.
Although it helps in understanding various behavior changes, it fails to account for cognitive and inherited factors in learning and gives an incomplete picture of how humans and animals learn. For example, Kohler (1924) found that primates often solve problems quickly rather than through trial-and-error learning. Furthermore, Bandura (1977) suggested that humans learn through observation as well as through personal experience.
While behaviorism has lost the dominance it held in the 20th century, operant conditioning remains an important tool in behavior modification and the learning process. Sometimes natural consequences are what lead to changes in behavior.
You can readily recognize operant conditioning in your own life, whether you are training a dog or raising your children. Every type of learning takes effort and time. Consider which reinforcements or punishments work best for your situation, and identify the reinforcement schedule that leads to good results. Reinforcement plays a vital role in the process, and it can be an effective tool for achieving desirable behaviors.