The theory of operant conditioning (B. F. Skinner). Components of operant learning
B. F. Skinner introduced the concept of operant conditioning into scientific psychology.

According to this theory, most forms of human behavior are voluntary, that is, operant; they become more or less likely depending on whether their consequences are favorable or unfavorable. On the basis of this idea, the following definition was formulated.

Operant (instrumental) learning is a type of learning in which the correct response or a change in behavior is reinforced and thereby made more likely.

This type of learning was experimentally studied and described by the American psychologists E. Thorndike and B. Skinner, who added to the learning scheme the requirement that the results of practice be reinforced.

The concept of operant learning is based on the “situation-reaction-reinforcement” scheme.

The psychologist and educator E. Thorndike made the problem situation the first link of the learning scheme: the way out of it is found through trial and error, which eventually lead to accidental success.

Edward Lee Thorndike (1874–1949) was an American psychologist and educator. He conducted research on animal behavior in "problem boxes," authored the theory of learning by trial and error with its description of the so-called "learning curve," and formulated a number of well-known laws of learning.

E. Thorndike conducted experiments with hungry cats placed in problem boxes. An animal put in such a box could get out and receive food only by operating a special device: pressing a spring, pulling a loop, and so on. The animals made many movements, rushed about in different directions, scratched the box, and so on, until one of the movements happened to succeed. With each new success, the cat produced more and more of the responses leading to the goal and fewer and fewer useless ones.

Fig. 12. Problem boxes, after E. Thorndike

"Trial, error, and accidental success" was the formula for all types of behavior, both animal and human. Thorndike suggested that this process is governed by three laws of behavior:

1) the law of readiness: for a skill to form, the organism must be in a state that drives it to activity (for example, hunger);

2) the law of exercise: the more often an action is performed, the more likely it is to be chosen again later;

3) the law of effect: an action that produces a positive effect (is "rewarded") is repeated more often.

Turning to the problems of school education and upbringing, E. Thorndike defined "the art of teaching as the art of creating and withholding stimuli in order to evoke or prevent certain responses." The stimuli may be words addressed to the child, a look, a phrase he will read, and so on; the responses may be the student's new thoughts, feelings, actions, or states. This point can be illustrated by the development of educational interests.



Through his own experience, the child develops a variety of interests. The teacher's task is to spot the "good" ones among them and, building on them, to develop the interests needed for learning. In directing the child's interests the right way, the teacher uses three approaches. The first is to connect the work being done with something important to the student that gives him satisfaction, for example, his position (status) among peers. The second is to use the mechanism of imitation: a teacher who is genuinely interested in his subject will interest the class he teaches. The third is to give the child information that sooner or later will arouse interest in the subject.

Another well-known behavioral scientist, B. Skinner, revealed the special role of reinforcing the correct response, which implies "designing" the way out of the situation and making the correct response obligatory (this became one of the foundations of programmed learning). According to the laws of operant learning, behavior is determined by the events that follow it. If the consequences are favorable, the likelihood that the behavior will be repeated in the future increases. If the consequences are unfavorable, or the behavior is not reinforced, the likelihood of the behavior decreases. Behavior that does not lead to the desired effect is not learned: you will soon stop smiling at a person who does not smile back. Crying, too, can be learned in a family with small children; it becomes a means of influencing adults.
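A minimal sketch of this rule, offered as a hedged illustration rather than Skinner's own formalism: the probability of emitting a response is nudged upward after a favorable consequence and downward after an unfavorable or absent one. The initial probability and the step size below are arbitrary values chosen only for the example.

import random

# Hedged illustration: a response's probability rises when its consequences are
# favorable (reinforced) and falls when they are not. All numbers are arbitrary.
def run_trials(n_trials=200, p_start=0.10, step=0.05, reinforced=True):
    p = p_start  # current probability that the operant response is emitted
    for _ in range(n_trials):
        if random.random() < p:  # the response happens to be emitted on this trial
            # a favorable consequence strengthens it; no reinforcement weakens it
            p = min(1.0, p + step) if reinforced else max(0.0, p - step)
    return p

print("with reinforcement:   ", round(run_trials(reinforced=True), 2))   # drifts toward 1.0
print("without reinforcement:", round(run_trials(reinforced=False), 2))  # drifts toward 0.0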

At the heart of this theory, as of Pavlov's, is the mechanism of establishing connections (associations). Operant learning is likewise based on conditioned-reflex mechanisms, but these are conditioned reflexes of a different type than the classical ones. Skinner called them operant, or instrumental, reflexes. Their peculiarity is that the activity is generated in the first place not by an external signal but by an internal need. This activity is at first chaotic and random, and in its course not only innate responses but any accidental actions that happen to be rewarded become associated with conditioned signals. In the classical conditioned reflex the animal, as it were, passively waits for what will be done to it; in the operant reflex the animal itself actively searches for the right action and, having found it, learns it.

The technique of developing "operant reactions" was used by Skinner's followers in the education of children, their upbringing, and in the treatment of neurotics. During World War II, Skinner worked on a project to use pigeons to control aircraft fire.

Having once attended an arithmetic lesson at the school where his daughter studied, B. Skinner was appalled at how little psychological knowledge was being used. To improve teaching he invented a series of teaching machines and developed the concept of programmed learning. He hoped, on the basis of the theory of operant reactions, to create a program for "manufacturing" people for a new society.

  • 6.1.1. Definition of operant conditioning
  • 6.1.2. Principles of Operant Conditioning
  • 6.1.3. Schedules of reinforcement
  • 6.1.4. Personal growth and development
  • 6.1.5. Psychopathology
  • 6.1.6. Advantages and disadvantages of learning theories

The psychological concepts of learning, teaching, and studying describe a wide range of phenomena associated with acquiring experience, knowledge, skills, and abilities in the course of the subject's active interaction with the objective and social world: in behavior, activity, and communication.

  • When learning is discussed, the researcher has in mind such aspects of the process as:
    • gradual change;
    • the role of exercise;
    • the specifics of learning compared to the innate characteristics of the individual.

Usually the terms "teaching" and "studying" designate the process of acquiring individual experience, while the term "learning" describes both the process itself and its result.
Thus, learning (training, teaching) is the process by which the subject acquires new ways of carrying out behavior and activity, and fixes and/or modifies them. The change in psychological structures that occurs as a result of this process makes further improvement of activity possible.
Classical concepts of learning are well known. One example is I. P. Pavlov's (1849–1936) teaching on the formation of conditioned reflexes. After one or more presentations of an indifferent stimulus (the conditioned stimulus) followed by an unconditioned stimulus (food) that evokes an unconditioned, innate response (salivation), the indifferent stimulus itself begins to evoke the response. In the process of establishing this temporary connection, the unconditioned stimulus serves as reinforcement, the conditioned stimulus acquires signal value, and the reflex helps the organism adapt to changing environmental conditions.
The regularities of learning were first established by experimental methods within behaviorism. These regularities, or "laws of learning," were formulated by E. Thorndike and were supplemented and modified by C. Hull, E. Tolman, and E. Guthrie.

  • They are:
    • Law of readiness: the stronger the need, the more successful the learning. The law is derived on the basis of establishing a connection between need and learning.
    • Law of effect: behavior that leads to a useful result reduces the need and therefore will be repeated.
    • Law of exercise: other things being equal, repeating a particular action makes the behavior easier to perform, speeds it up, and reduces the likelihood of errors. Later, Thorndike showed that exercise and repetition do not always strengthen a skill, although this factor is very important in motor learning and contributes to behavior modification.
    • Law of recency: material presented at the end of a series is remembered better. This law conflicts with the primacy effect, the tendency to remember better the material presented at the beginning of learning. The contradiction is resolved by the "edge effect" law: the U-shaped dependence of how well material is remembered on its position in the learning sequence reflects this effect and is called the "positional curve."
    • Law of correspondence: there is a proportional relationship between the probability of a response and the probability of reinforcement.
  • Let us now turn to theories of learning in personality psychology.
    Theories are based on two assumptions:
  1. All behavior is acquired in the process of learning.
  2. In order to maintain scientific rigor when testing hypotheses, the principle of objectivity of the data must be observed. The variables chosen for manipulation are external causes (such as a food reward), in contrast to the "internal" variables of the psychodynamic approach (instincts, defense mechanisms, the self-concept), which cannot be manipulated.

In the theories of learning (I.P. Pavlov), adaptation is considered as an analogue of human development. It can be carried out in different ways, for example, through classical Pavlovian conditioning.

  • In doing so, important phenomena were investigated:
    • Generalization: the conditioned response to the initially neutral stimulus extends to other stimuli similar to the conditioned one (fear that arose toward a particular dog then spreads to all dogs).
    • Differentiation: a specific response to similar stimuli that differ in the degree of reinforcement (for example, differentiated responses to a circle and an ellipse).
    • Extinction: the breakdown of the connection between the conditioned stimulus and the response when the stimulus is no longer accompanied by reinforcement.

A typical experiment involved strapping the dog to restrict its movement, then turning on the lights. 30 seconds after the lights were turned on, some food was placed in the dog's mouth, which caused salivation. The combination of turning on the light and food was repeated several times. After some time, the light, which initially acted as an indifferent stimulus, in itself began to cause a salivation reaction.
Conditioned defensive responses to initially neutral stimuli can be developed in a similar way. In early studies of defensive conditioning, a dog was put into a harness that held it in a stand, and electrodes were attached to its paw. Delivering an electric current (the unconditioned stimulus) to the paw caused paw withdrawal (the unconditioned reflex), the animal's reflex reaction. If a bell was sounded several times immediately before the shock, then gradually the sound alone became able to evoke the defensive paw-withdrawal reflex.
In I. P. Pavlov's terminology, the food (or electric shock) was the unconditioned stimulus, and the light (or sound) the conditioned one. Salivation (or paw withdrawal) at the appearance of food (or shock) was called an unconditioned reflex, while salivation at the switching on of the light (or paw withdrawal at the sound) was called a conditioned reflex. The reactions Pavlov studied were called reactive, or respondent, since they arose automatically after known stimuli (food, electric shock). The leading element in I. P. Pavlov's model is the stimulus; manipulating it gives rise to new forms of behavior.
Thus, classical conditioning is the process, discovered by I. P. Pavlov, by which an initially neutral stimulus comes to evoke a response owing to its associative connection with a stimulus that automatically produces the same or a similar response.
The theory developed by B. F. Skinner (1904–1990) is called the theory of operant conditioning. He said that a scientist, like any other organism, is the product of a unique history, and that the field he comes to prefer depends in part on his personal background.
Skinner's interest in the formation and modification of behavior arose after he became acquainted with I. P. Pavlov's Conditioned Reflexes and with an article, critical in its orientation, by Bertrand Russell. Russell's articles not only did not turn him away from Pavlovian ideas but, on the contrary, strengthened their influence.
Skinner's goal was to explain the mechanisms of learning in humans and animals (rats and pigeons) on the basis of a limited set of basic principles. The main idea was to manage and control the environment while obtaining orderly changes in behavior. He said: "Control the conditions (the environment), and order will be revealed to you."

In the middle of the 20th century, as a result of a revision of a number of fundamental ideas of orthodox behaviorism, neobehaviorism took shape (E. Tolman's cognitive behaviorism, C. Hull's hypothetical-deductive behaviorism, E. Guthrie, B. F. Skinner's operant behaviorism, and others). Serious criticism from the opponents of orthodox behaviorism was provoked by its obviously mechanistic understanding of behavior. Some neobehaviorists therefore attempted to introduce a number of new intervening variables (the cognitive map, the value matrix, goals, motivation, anticipation, behavior control, and so on) into the traditional "stimulus-response" scheme. This significantly changed the overall content of behaviorism.

While most supporters of neobehaviorism softened their positions by introducing concepts uncharacteristic of orthodox behaviorism, the well-known American psychologist B. F. Skinner and a number of other researchers took the position of "radical behaviorism." This approach, even stricter than orthodox behaviorism, rejected any interpretation tinged with mentalism. B. F. Skinner condemned such deviations from orthodox behaviorism as a return to unscientific psychology: in his opinion, only the observable and measurable aspects of the environment, of the organism's behavior, and of the consequences of that behavior can serve as material for scientific analysis.

Neobehaviorism had a significant impact on learning theory and educational practice in the mid-20th century worldwide. On the foundation of neobehaviorist ideas, a powerful scientific movement called "programmed learning" formed in educational psychology and learning theory. From the mid-1950s programmed learning spread widely around the world (England, Poland, the USSR, the USA, France, Czechoslovakia, and elsewhere). In the United States special research institutions were created to develop the new didactic technology; in the USSR a special scientific council on programmed learning was also organized.

Burrhus Frederic Skinner (1904–1990) was born in Susquehanna, Pennsylvania, and received his M.A. in 1930 and his Ph.D. in 1931 from Harvard. His youthful ambition to become a writer was not realized, and after a series of failed attempts to find his own path he went to Harvard to study psychology.

B. F. Skinner taught psychology at the University of Minnesota from 1936 to 1945. During this time he published one of his major works, The Behavior of Organisms. After three years as head of the psychology department at Indiana University, he returned to Harvard in 1948, where he lived and worked until his death in 1990.

The main provisions of the theory of "operant learning" by B. F. Skinner

An important starting point for understanding B. F. Skinner's theory is his classification of behavior. He distinguished "respondent behavior" and "operant behavior." Respondent behavior is evoked by a known stimulus; all unconditioned reactions are examples, since they arise in response to an unconditioned stimulus. Operant behavior is not evoked by a stimulus; it is simply emitted by the organism. Because operant behavior is not tied to known stimuli, it appears to occur spontaneously. Manifestations of operant behavior are diverse, and most of our everyday actions can be classed as operant behavior.

B. F. Skinner did not claim that operant behavior occurs independently of stimulation; rather, the stimulus that evokes operant behavior is simply unknown, and it is not necessary to know its cause. Respondent behavior depends entirely on the stimulus that precedes it; operant behavior, in contrast, is controlled by its consequences.

Along with the two types of behavior, according to B. F. Skinner, there are two types of conditioning: "respondent conditioning" and "operant conditioning." Respondent conditioning is identical to I. P. Pavlov's classical conditioning; B. F. Skinner also called it type S conditioning, thereby emphasizing the importance of the stimulus that evokes the required response. Operant conditioning B. F. Skinner designated by the letter R, emphasizing that here the emphasis falls on the response.

In type R conditioning, the strength of conditioning is judged by the rate of responding, whereas in type S conditioning it is usually judged by the magnitude of the conditioned response. It is easy to see that B. F. Skinner's type R conditioning closely resembles E. Thorndike's "instrumental conditioning," while his type S conditioning resembles I. P. Pavlov's "classical conditioning." In his own research B. F. Skinner concentrated on operant conditioning, or, in his terminology, type R conditioning.

B. F. Skinner singled out two main principles of operant conditioning (type R conditioning):

  • 1. Any response that is followed by a reinforcing stimulus tends to be repeated.
  • 2. A reinforcing stimulus is anything that increases the rate at which an operant response occurs.

A reinforcer can be anything that increases the likelihood that a response will be repeated. As is easy to see, the principles of operant conditioning can be applied to a great variety of situations. To change behavior, one must find something that will serve as a reinforcer for the organism, wait until the desired behavior appears, and then provide the reinforcement.

A fascinating book by a follower of B. F. Skinner, the American animal psychologist and trainer Karen Pryor, Don't Shoot the Dog!, describes many examples of applying the principles of operant learning to the training of marine animals. The dolphins taking part in her work not only learned to follow human commands but even successfully solved creative tasks.

Once the reinforcement has been provided, the frequency of the desired response increases. When the desired behavior appears again, it is reinforced again, and the rate of the response rises still further. A similar effect can be exerted on any behavior of the organism.

B. F. Skinner considered the socio-cultural environment as a set of reinforcement opportunities.

Due to differences in the socio-cultural environment, different patterns of behavior are reinforced. According to B. F. Skinner, what is called "personality" is nothing but consistent patterns of behavior that are the sum total of our reinforcement history.


Operant learning involves a system of rewards and punishments used to reinforce or stop a particular type of behavior.

Operant learning is a method of learning that occurs by rewarding and punishing a particular type of behavior. The essence of operant learning is to establish an associative relationship between behavior and the consequences of this behavior.

The idea of operant learning belongs to the behaviorist B. F. Skinner, which is why this learning method is often called the Skinner method. Skinner believed that behavior could not be explained in terms of internal thoughts and motivation; instead, he suggested attending to the external causes that influence human behavior.

Skinner used the term "operant" to describe any behavior that, under the influence of external factors, results in certain consequences. In other words, Skinner's theory explains how we acquire various daily habits and behaviors.

Examples of operant learning

In fact, examples of operant learning are all around us: a student who does his homework to get a reward from his parents, or employees who work hard on a project for a raise or a promotion.
These examples show that the prospect of a reward promotes task completion, but operant learning can also be used to wean a person off something through punishment or deprivation. For example, children can be weaned off talking in class if, whenever they do, they lose the opportunity to play at the long recess.

Components of operant learning

Reinforcement is any action that influences the development of a particular behavior. There are two types of reinforcers:
Positive reinforcers are pleasant events or rewards that follow a desired behavior, such as praise or a treat.
Negative reinforcers are unpleasant events or conditions that are stopped or reduced after the desired behavior.
Both types of reinforcement are used to strengthen a particular behavior.

Punishment is an unpleasant action that is taken in order to stop an undesirable pattern of behavior.

There are two types of punishments:

  1. Positive punishment involves presenting an unpleasant stimulus in order to weaken the behavior it follows.
  2. Negative punishment involves stopping a desired activity or taking away a desired object whenever the behavior to be eliminated occurs.

Both types of punishment are aimed at weakening an undesirable pattern of behavior.
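Taken together, reinforcement and punishment form a simple two-by-two scheme: a pleasant or an unpleasant stimulus is either presented or removed after the response. A hedged sketch of that classification follows; the function name and the examples in the comments are invented for the illustration.

# Illustration only: classify a consequence by the two features described above.
def classify_consequence(stimulus_is_pleasant: bool, stimulus_is_presented: bool) -> str:
    if stimulus_is_presented:
        # something is added after the response
        return "positive reinforcement" if stimulus_is_pleasant else "positive punishment"
    # something is removed or withheld after the response
    return "negative punishment" if stimulus_is_pleasant else "negative reinforcement"

print(classify_consequence(True, True))    # praise after homework -> positive reinforcement
print(classify_consequence(False, True))   # scolding after misbehavior -> positive punishment
print(classify_consequence(True, False))   # no TV after misbehavior -> negative punishment
print(classify_consequence(False, False))  # nagging stops after chores -> negative reinforcement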

B. Skinner (1904-1990) is a representative of neobehaviorism.

The main provisions of the theory of "operant behaviorism":

1. The subject of the study is the behavior of the organism in its motor component.

2. Behavior is what the organism does and what can be observed; therefore consciousness and its phenomena (will, creativity, intellect, emotions, personality) cannot be the subject of study, since they cannot be observed objectively.

3. A person is not free, since he never controls his own behavior, which is determined by the external environment.

4. Personality is understood as a set of "situation-reaction" behavioral patterns, which depend on previous experience and genetic history.

5. Behavior can be divided into three kinds: unconditioned-reflex and conditioned-reflex behavior, which are simple responses to a stimulus, and operant behavior, which arises spontaneously and is shaped through conditioning; this type of behavior plays the decisive role in the organism's adaptation to external conditions.

6. The main characteristic of operant behavior is its dependence on past experience, that is, on the last stimulus, called reinforcement. Behavior is strengthened or weakened depending on the reinforcement, which can be negative or positive.

7. The process of positive or negative reinforcement for an action is called conditioning.

8. Reinforcement can serve as the basis for a child's entire system of education, so-called programmed instruction, in which all the material is divided into small portions and the student receives positive reinforcement for successfully completing and mastering each portion, and negative reinforcement for failure.

9. The system of upbringing and of managing people is built on the same basis: socialization proceeds through positive reinforcement of the norms, values, and rules of behavior that society needs, while antisocial behavior should receive negative reinforcement from society.

Schedules of reinforcement.

The essence of operant learning is that reinforced behavior tends to be repeated, while unreinforced or punished behavior tends not to be repeated or suppressed. Hence, the concept of reinforcement plays a key role in Skinner's theory.

The rate at which operant behavior is acquired and maintained depends on the schedule of reinforcement applied. A schedule of reinforcement is a rule establishing the probability with which reinforcement will occur. The simplest rule is to present reinforcement every time the subject gives the desired response. This is called continuous reinforcement and is commonly used at the start of operant learning, when the organism is learning to produce the correct response. In most everyday situations, however, continuous reinforcement of the desired response is either unfeasible or uneconomical, since behavior is not reinforced in the same regular way. In most cases a person's social behavior is reinforced only occasionally. A child cries many times before gaining its mother's attention; a scientist is wrong many times before arriving at the correct solution to a difficult problem. In both examples, unreinforced responses occur until one of them is reinforced.

Skinner carefully studied how intermittent, or partial, reinforcement schedules affect operant behavior. Although many schedules of reinforcement are possible, they can all be classified along two basic dimensions: 1) reinforcement can occur only after a fixed or a variable time interval has elapsed since the previous reinforcement (interval schedules of reinforcement); 2) reinforcement can occur only after a fixed or a variable number of responses has been made (ratio schedules of reinforcement). These two dimensions define four main schedules of reinforcement.

1. Fixed-ratio (FR) schedule. In this schedule the organism is reinforced after a predetermined, "fixed," number of appropriate responses. This schedule is ubiquitous in everyday life and plays a significant role in the control of behavior. In many industries employees are paid partly or even entirely according to the number of units they produce or sell; this system is known as piece-rate pay. The FR schedule usually produces an extremely high operant rate, since the more often the organism responds, the more reinforcement it receives.

2. Fixed-interval (FI) schedule. In a fixed-interval schedule the organism is reinforced after a fixed, "constant," time interval has elapsed since the previous reinforcement. At the individual level, the FI schedule operates when work is paid by the hour, the week, or the month. Similarly, giving a child a weekly allowance of pocket money is a form of FI reinforcement. Universities generally run on an FI time schedule: examinations are set at regular intervals and academic progress reports are issued at fixed times. Curiously, the FI schedule produces a low rate of responding immediately after reinforcement is received, a phenomenon called the post-reinforcement pause. It is typical of students who find it hard to study in the middle of the semester (assuming they did well on the exam), since the next exam is still a long way off; they literally take a break from studying.

3. Variable-ratio (VR) schedule. In this schedule the organism is reinforced after some number of responses that varies around a predetermined average. Perhaps the most dramatic illustration of human behavior under the control of a VR schedule is compulsive gambling. Consider a person playing a slot machine, into which one must drop a coin and then pull a handle in the hope of a prize. The machines are programmed so that reinforcement (money) is distributed according to the number of attempts the person pays to make. The winnings, however, are unpredictable, irregular, and rarely exceed what the player has put in, which explains why casino owners receive far more reinforcement than their regular customers do. Furthermore, behavior acquired under a VR schedule extinguishes very slowly, since the organism never knows exactly when the next reinforcement will come. The player therefore keeps dropping coins into the slot despite trivial winnings (or even losses), fully confident that next time he will "hit the jackpot." Such persistence is typical of behavior produced by a VR schedule.

4. Variable-interval (VI) schedule. In this schedule the organism receives reinforcement after an indefinite time interval has passed. As in the FI schedule, reinforcement depends on time, but the time between reinforcements in a VI schedule varies around some average value rather than being fixed. As a rule, the rate of responding under a VI schedule is a direct function of the interval length applied: short intervals generate a high rate, long intervals a low one. Under VI reinforcement the organism also tends to settle into a constant rate of responding, and in the absence of reinforcement the responses extinguish slowly, since the organism cannot accurately predict when the next reinforcement will arrive.

In everyday life the VI schedule is not encountered often, although several variants of it can be observed. A parent, for example, may praise a child's behavior rather unpredictably, counting on the child to continue behaving appropriately during the unreinforced intervals. Likewise, professors who give "surprise" quizzes, whose frequency varies from one every three days to one every three weeks and averages one every two weeks, are using a VI schedule. Under such conditions students can be expected to maintain a relatively high level of diligence, since they never know when the next quiz will come.

As a rule, the VI schedule generates a higher rate of responding and greater resistance to extinction than the FI schedule.
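As a rough, hedged sketch of how these four rules differ mechanically (the ratios, the intervals, and the one-response-per-time-step assumption are invented for the example and are not taken from Skinner's experiments):

import random

# Hedged sketch: four schedule rules deciding, response by response, whether
# reinforcement is delivered. All parameter values are arbitrary illustration choices.

def fixed_ratio(n):               # FR: reinforce every n-th response
    count = 0
    def rule(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return rule

def variable_ratio(mean_n):       # VR: reinforce after a varying number of responses (mean = mean_n)
    count, target = 0, random.randint(1, 2 * mean_n - 1)
    def rule(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return rule

def fixed_interval(interval):     # FI: reinforce the first response after `interval` time units
    last = 0
    def rule(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False
    return rule

def variable_interval(mean_interval):  # VI: like FI, but the interval varies around a mean
    last, wait = 0, random.uniform(0, 2 * mean_interval)
    def rule(t):
        nonlocal last, wait
        if t - last >= wait:
            last, wait = t, random.uniform(0, 2 * mean_interval)
            return True
        return False
    return rule

# Assume one response per time step for 1000 steps and count the reinforcements delivered.
schedules = {"FR-10": fixed_ratio(10), "VR-10": variable_ratio(10),
             "FI-10": fixed_interval(10), "VI-10": variable_interval(10)}
for name, rule in schedules.items():
    print(name, sum(rule(t) for t in range(1, 1001)))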

Conditioned reinforcement.

Learning theorists recognize two types of reinforcement: primary and secondary. A primary reinforcer is any event or object that in itself possesses reinforcing properties; it does not require prior association with other reinforcers in order to satisfy a biological need. Primary reinforcers for humans are food, water, physical comfort, and sex; their value for the organism does not depend on learning. A secondary, or conditioned, reinforcer, by contrast, is any event or object that acquires the capacity to reinforce through close association with a primary reinforcer in the organism's past experience. Examples of common secondary reinforcers in humans are money, attention, affection, and good grades.

A slight change in the standard operant-learning procedure shows how a neutral stimulus can acquire reinforcing power over behavior. Once a rat had learned to press the lever in the "Skinner box," a sound signal was introduced immediately after each response, followed by a pellet of food. In this case the sound acts as a discriminative stimulus: the animal learns to respond only in the presence of the sound, since the sound signals a food reward. Once this specific operant response is established, extinction begins: when the rat presses the lever, neither food nor sound appears, and after a while the rat stops pressing. The sound is then presented each time the animal presses the lever, but no food pellet follows. Despite the absence of the original reinforcing stimulus, the animal learns that pressing the lever produces the sound, so it keeps responding vigorously, which slows extinction. In other words, the steady rate of lever pressing reflects the fact that the sound now acts as a conditioned reinforcer. The exact rate of responding depends on the strength of the sound as a conditioned reinforcer, that is, on how many times it was paired with the primary reinforcer, food, during learning. Skinner argued that virtually any neutral stimulus can become reinforcing if it is associated with other stimuli that already have reinforcing properties. Thus the phenomenon of conditioned reinforcement greatly broadens the scope of possible operant learning, especially where human social behavior is concerned. Put another way, if everything we learned had to be tied to primary reinforcement, the opportunities for learning would be very limited and human activity would not be nearly so diverse.
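The paragraph above ties the sound's reinforcing strength to the number of its pairings with food. A hedged toy sketch of that relationship follows; the gain constant and the press counts are invented for the illustration and are not experimental values.

# Illustration only: a tone gains reinforcing strength through pairings with food,
# and a stronger conditioned reinforcer sustains lever pressing longer during extinction.
def conditioned_strength(pairings, gain=0.2):
    strength = 0.0
    for _ in range(pairings):
        strength += gain * (1.0 - strength)  # each extra pairing adds a little less
    return strength

def presses_before_quitting(tone_strength, baseline=20):
    return int(baseline * (1 + 4 * tone_strength))  # more strength -> slower extinction

for n in (0, 5, 20, 80):
    s = conditioned_strength(n)
    print(f"{n:3d} pairings -> tone strength {s:.2f} -> about {presses_before_quitting(s)} presses")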

A characteristic of a conditioned reinforcer is that it generalizes when it is paired with more than one primary reinforcer. Money is an especially good case in point. Obviously, money cannot satisfy any of our primary drives; yet, thanks to the system of cultural exchange, it is a powerful and pervasive means of obtaining many pleasures. Money lets us have fashionable clothes, flashy cars, medical care, and education. Other kinds of generalized conditioned reinforcers are flattery, praise, affection, and the submission of others. These so-called social reinforcers (which involve the behavior of other people) are often very complex and subtle, but they are essential to our behavior in a variety of situations. Attention is a simple case: everyone knows that a child can get attention by pretending to be sick or by misbehaving. Children often make nuisances of themselves, ask ridiculous questions, interrupt adults' conversations, show off, tease younger sisters or brothers, and wet the bed, all to attract attention. The attention of a significant other (parents, a teacher, a loved one) is an especially effective generalized conditioned stimulus that can sustain pronounced attention-seeking behavior.

An even stronger generalized conditioned stimulus is social approval. Many people, for example, spend a great deal of time preening in front of a mirror in the hope of winning an approving look from a spouse or lover. Both women's and men's fashion is driven by approval and lasts only as long as social approval does. High school students compete for a place on the varsity track-and-field team or take part in extracurricular activities (drama, debate, the school yearbook) in order to win the approval of parents, peers, and neighbors. Good grades in college are also a positive reinforcer, because they have previously brought praise and approval from parents. As a powerful conditioned reinforcer, good grades in turn encourage studying and academic achievement.

Skinner believed that conditioned reinforcers are very important in controlling human behavior (Skinner, 1971). He also noted that each person goes through a unique learning history, so it is unlikely that all people are driven by the same reinforcers. For some, success as an entrepreneur is a very strong reinforcer; for others, expressions of tenderness matter; still others find a reinforcing stimulus in sports, academic, or musical pursuits. The possible variations in behavior sustained by conditioned reinforcers are endless. Understanding conditioned reinforcers in humans is therefore far more difficult than understanding why a food-deprived rat presses a lever when only a sound serves as reinforcement.

Controlling behavior through aversive stimuli.

From Skinner's point of view, human behavior is largely controlled by aversive (unpleasant or painful) stimuli. The two most typical methods of aversive control are punishment and negative reinforcement. These terms are often used interchangeably to describe the conceptual properties and behavioral effects of aversive control. Skinner proposed the following distinction: "You can distinguish between punishment, in which an aversive event is made contingent upon a response, and negative reinforcement, in which the reinforcement is the removal of an aversive stimulus, conditioned or unconditioned" (Evans, 1968, p. 33).

Punishment. The term punishment refers to any aversive stimulus or event that follows, or is contingent upon, the occurrence of some operant response. Instead of strengthening the response it follows, punishment reduces, at least temporarily, the likelihood that the response will recur. Its intended purpose is to induce people not to behave in a given way. Skinner (1983) observed that it is the most common method of behavior control in modern life.

According to Skinner, punishment can be delivered in two different ways, which he calls positive punishment and negative punishment (Table 7-1). Positive punishment occurs whenever a behavior leads to an aversive outcome: children who misbehave are spanked or scolded; students who use crib sheets in an exam are expelled from the university or school; adults caught stealing are fined or jailed. Negative punishment occurs whenever a behavior is followed by the removal of a (potential) positive reinforcer; for example, children are forbidden to watch television because of bad behavior. A widely used form of negative punishment is the time-out technique, in which a person is immediately removed from a situation in which reinforcing stimuli are available. For instance, a disobedient fourth grader who disrupts the class may be sent out of the classroom.

<Physical isolation is one form of punishment used to prevent displays of undesirable behavior.>

Negative reinforcement. Unlike punishment, negative reinforcement is the process by which the organism terminates or avoids an aversive stimulus. Any behavior that prevents an aversive state of affairs is thereby negatively reinforced and becomes more likely to be repeated (see Table 7-1). Escape behavior is one such case: a person who escapes the scorching sun by going indoors is likely to go indoors again when the sun becomes scorching. Note that escaping an aversive stimulus is not the same as avoiding it, since in avoidance the aversive stimulus is not yet physically present. Another way of coping with unpleasant conditions, then, is to learn to avoid them, that is, to behave so that they do not occur at all. This strategy is known as avoidance learning. For example, if the learning process allows a child to avoid homework, negative reinforcement is being used to increase interest in learning. Avoidance behavior is also evident when drug addicts work out clever schemes for maintaining their habit without bringing on its aversive consequence, imprisonment.

Table 7-1. Positive and negative reinforcement and punishment

  • A pleasant stimulus is presented after the response: positive reinforcement (the response is strengthened).
  • An aversive stimulus is removed after the response: negative reinforcement (the response is strengthened).
  • An aversive stimulus is presented after the response: positive punishment (the response is weakened).
  • A pleasant stimulus is removed after the response: negative punishment (the response is weakened).

Both reinforcement and punishment can thus be carried out in two ways, depending on whether the response is followed by the presentation or the removal of a pleasant or an unpleasant stimulus. Note that reinforcement strengthens the response, while punishment weakens it.

Skinner (1971, 1983) fought against all forms of behavior control based on aversive stimuli and emphasized that punishment is an ineffective means of controlling behavior. The reason is that, because of its threatening nature, the tactic of punishing unwanted behavior can produce negative emotional and social side effects: anxiety, fear, antisocial actions, and loss of self-esteem and confidence are only some of them. The threat inherent in aversive control can also push people toward forms of behavior even more objectionable than those for which they were originally punished. Consider a parent who punishes a child for mediocre schoolwork; later, in the parent's absence, the child may behave even worse, skipping classes, roaming the streets, damaging school property. Whatever the outcome, the punishment clearly failed to produce the desired behavior. Since punishment only temporarily suppresses unwanted or inappropriate behavior, Skinner's main objection was that behavior followed by punishment is likely to reappear wherever there is no one to punish it. A child punished several times for sexual play will not necessarily give it up; a person imprisoned for violent assault will not necessarily become less prone to violence. The punished behavior may reappear once the likelihood of being punished has disappeared (Skinner, 1971, p. 62). Examples are easy to find in real life: a child who is spanked for swearing at home is free to swear elsewhere, and a driver fined for speeding can pay the fine and go on speeding whenever no radar patrol is nearby.

In place of aversive control of behavior, Skinner (1978) recommended positive reinforcement as the most effective means of eliminating undesirable behavior. He argued that because positive reinforcers do not produce the negative side effects associated with aversive stimuli, they are better suited to shaping human behavior. For example, convicted criminals are held in intolerable conditions in many penal institutions (witness the numerous prison riots in the United States in recent years), and most attempts to rehabilitate criminals clearly fail, as the high rate of recidivism, or repeat offenses, confirms. Applying Skinner's approach, one could arrange the prison environment so that behavior resembling that of law-abiding citizens is positively reinforced (for example, learning social skills, values, and attitudes). Such a reform would require behavioral experts with knowledge of the principles of learning, personality, and psychopathology. From Skinner's point of view, it could be carried out successfully using existing resources and psychologists trained in behavioral methods.

Skinner showed the power of positive reinforcement, and this influenced behavioral strategies used in parenting, education, business, and industry. In all these areas, there is a tendency to increasingly reward desirable behavior rather than punish undesirable behavior.

Generalization and differentiation of stimuli.

A logical extension of the reinforcement principle is that behavior reinforced in one situation is very likely to be repeated when the organism encounters other, similar situations. If this were not so, our behavioral repertoire would be so limited and chaotic that we might wake up in the morning and ponder at length how to respond appropriately to each new situation. In Skinner's theory, the tendency of reinforced behavior to spread to many similar situations is called stimulus generalization. This phenomenon is easy to observe in everyday life. A child praised for good manners at home will generalize this behavior to appropriate situations outside the home; such a child does not have to be taught how to behave decently in each new situation. Stimulus generalization can also result from unpleasant experiences. A young woman raped by a stranger may generalize her shame and hostility to all members of the opposite sex, because they remind her of the physical and emotional trauma the stranger inflicted. Likewise, a single frightening or aversive experience caused by a person belonging to a particular ethnic group (white, black, Hispanic, Asian) may be enough for an individual to form a stereotype and thereby avoid future social contact with all members of that group.

Although the ability to generalize responses is an important aspect of many of our everyday social interactions, adaptive behavior clearly also requires the ability to draw distinctions between situations. Stimulus discrimination, the counterpart of generalization, is the process of learning to respond appropriately in different environmental situations. Examples abound: a motorist stays alive in rush hour by discriminating between red and green traffic lights; a child learns to distinguish a pet dog from a vicious one; a teenager learns to distinguish behavior that peers approve of from behavior that irritates and alienates them; a diabetic quickly learns to distinguish foods high in sugar from foods low in sugar. Indeed, virtually all intelligent human behavior depends on the ability to discriminate.

The ability to discriminate is acquired through the reinforcement of responses in the presence of some stimuli and their non-reinforcement in the presence of others. Discriminative stimuli thus enable us to anticipate the likely outcomes of producing a particular operant response in different social situations. Accordingly, individual variation in discriminative ability depends on each person's unique past history of reinforcement. Skinner suggested that healthy personal development results from the interplay of generalizing and discriminating abilities, by means of which we regulate our behavior so as to maximize positive reinforcement and minimize punishment.

Successive approximation: how to make the mountain come to Mohammed.

Skinner's early experiments in operant learning focused on responses usually emitted at a moderate or high frequency (for example, a pigeon pecking a key, a rat pressing a lever). It soon became clear, however, that the standard operant-learning procedure was poorly suited to the large number of complex operant responses that may occur spontaneously with a probability close to zero. In the sphere of human behavior, for instance, it is doubtful that a generic operant-learning strategy could successfully teach psychiatric inpatients appropriate interpersonal communication skills. To make this task easier, Skinner (1953) devised a procedure by which psychologists could efficiently and quickly condition almost any behavior in a person's repertoire. This procedure, called the method of successive approximation, or behavior shaping, consists of reinforcing the behavior that comes closest to the desired operant behavior. It proceeds step by step: one response is reinforced and then replaced by another that is closer to the desired result.
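A minimal sketch of the shaping logic, offered as a hedged illustration rather than Skinner's own procedure: on each trial, reinforce only responses that approximate the target better than the organism's current typical behavior, so the typical behavior drifts step by step toward the target. The target value, the response noise, and the learning step are invented for the example.

import random

# Hedged sketch of shaping by successive approximation; all numbers are arbitrary.
def shape(target=10.0, trials=1500, step=0.3):
    behavior = 0.0  # the organism's current typical response
    for _ in range(trials):
        response = behavior + random.gauss(0.0, 1.0)  # responses vary from trial to trial
        # reinforce only responses that approximate the target better than current behavior
        if abs(target - response) < abs(target - behavior):
            behavior += step * (response - behavior)  # reinforced responses become more typical
    return behavior

print("typical response after shaping:", round(shape(), 2))  # ends close to the target of 10.0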

Skinner believed that the shaping of behavior underlies the development of spoken language. For him, language is the result of the reinforcement of the child's utterances, initially through verbal interaction with parents and siblings. Thus, starting from the fairly simple babbling of infancy, children's verbal behavior gradually develops until it comes to resemble adult language. In Verbal Behavior, Skinner gives a more detailed account of how the "laws of language," like any other behavior, are acquired through the same operant principles (Skinner, 1957). As might be expected, other researchers have questioned Skinner's claim that language is simply the product of verbal utterances selectively reinforced during the first years of life. Noam Chomsky (Chomsky, 1972), one of Skinner's harshest critics, argues that the high rate at which verbal skills are acquired in early childhood cannot be explained in terms of operant learning. In Chomsky's view, the properties the brain possesses at birth are the reason the child acquires language: there is an innate capacity for learning the complex rules of conversational communication.

This completes our brief overview of Skinner's learning-behavioral approach. As we have seen, Skinner did not think it necessary to treat a person's internal forces or motivational states as causes of behavior; he focused instead on the relationship between environmental events and overt behavior. He further held that personality is nothing more than particular forms of behavior acquired through operant learning. Whether or not these views add up to a comprehensive theory of personality, Skinner has had a profound effect on our understanding of human learning. The philosophy underlying Skinner's view of the human being clearly sets him apart from most of the personologists we have met.
