
Awareness of Control as a Primary Reinforcer

by Gabrielle A. Harris — South African Association for Marine Biological Research (uShaka Sea World); Dr. Frantisek Susta — Prague Zoo; Trainingisdialogue.com

Originally published in Soundings Volume 41, Number 3 — Third Quarter 2016

Editor’s note: The authors wish to engage the training community to think about training from a different perspective, with the hope of generating and guiding discussions regarding the readers’ own training situations. This paper is designed to present new concepts developed by the authors to represent what they have researched and witnessed over the years.

When our training does not go according to plan, we problem solve by looking at how we have used the reinforcers and schedules of reinforcement that were consciously chosen. When working with operant conditioning, we need to be cognisant of classical conditioning. When using primary reinforcement there is always the risk that an unintended association can be reinforced. For example, if the animal is stressed due to his hunger drive, he may associate us with that stress. Our primary reinforcers satisfy biological needs, and therefore are powerful tools to reinforce behaviors that we are conditioning.

One primary reinforcer that does not receive much attention is an animal’s control over its own environment. This control is sometimes taken away and then re-achieved inadvertently, resulting in a significant behavior change. This happens if an animal has lost perceived control over a situation and become stressed, with the potential that the animal’s subsequent learning during this stressful state becomes classically conditioned. This paper will investigate control as a primary reinforcer. By control, we refer to an animal’s sense of being in control of any situation. In his 1978 book Man's Search for Meaning, Viktor E. Frankl said “Between stimulus and response there is a space. In that space is our power to choose our response”.

In this paper the authors will elaborate on the importance, as animal trainers, of ensuring that we always allow the animals the ability to choose. A number of case studies will be dissected to show how behavioral challenges have been created and solved by providing the animal with control. Where an animal does not feel in control, it feels the need to re-establish that control. If we simply provide the animal with its sense of control, we easily resolve a potential problem. In summary, we will outline how stress is generated when an animal loses perceived control over its environment and how, by maintaining an animal’s perceived ability to choose, we are able to prevent this from occurring.

The methods used to effect solutions in the case studies will not be uncommon to you, but the reasoning behind the solutions might be different. Perhaps the perspective they bring will highlight how we can remain ethical and effective trainers.

In the IMATA Glossary, primary reinforcement is described as “an unconditioned reinforcer - anything of intrinsic value to an organism” (Klappenback, Davis, & Todd, 2005). That is, an animal does not need to learn that a reinforcer is satisfying. A common primary reinforcer is food. The authors believe there is a foundational reinforcer that is more important than food: an organism’s sense of being in control of any given situation. At our most primal level, control is effectively our ability to ensure our own survival, a basic biological need which, in behavioral training terminology, would be considered a primary reinforcer.

If control helps to ensure survivability then perhaps it can be assumed that it also provides a certain level of comfort and security. If we are able to teach an animal something just beyond its comfort zone, it has the potential to further succeed in its learning. However, if we stray too far away from what the animal knows, the lesson can turn sour and result in distress. In distress an animal becomes reactive. A reactive organism is striving to get back to its comfort zone; it is not learning anything new but is instead looking for ways to restore its status quo.

Our job as animal trainers is to steadily increase the size of an animal’s comfort zone through effective training and enrichment. The comfort zone can actually be decreased by a lack of required and appropriate stimulation (Figure 1).

Figure 1. Comfort and Learning Zones.

To illustrate the concepts we are outlining in this paper, we present two case studies, with discussions of the problem, the diagnosis, and the solution.

Case Study 1, Problem:

Frodo is a dolphin over 40 years old. She is a wise, dominant matriarch who has raised many calves. Historically she has always had few trainers working with her at any one time because of our focus on the new calves. She does a gentle water work show with trainers. It was surprising when, after a dorsal tow, she turned and rammed the trainer in the belly.

Case Study 2, Problem:

Khethiwe is a 7-year-old female dolphin. She is a high-energy dolphin. When we were integrating her into an established group of four females, she appeared anxious. In sessions, she was anticipating behaviors and exhibiting higher than normal energy levels in established behaviors. She would drift off if asked to station with the other females.

Consider these case studies as we travel through the theory that follows.

THE THEORY

To outline stress that occurs during learning, we review the learning diagram (Figure 2) we presented at IMATA in our paper about motivation a couple of years ago (Harris, Susta, Parker, & Bodenstaff, 2013).

Figure 2. Modified Yerkes Dodson Curve.

This diagram is based on the Yerkes Dodson curve (Yerkes & Dodson, 1908), which was affirmed by a stress hormone study conducted on human subjects. The curve relates to experiential learning. During learning, stress hormones are stimulated. When plotted, they mirror the performance of the humans during a task. The curve is depicted running from baseline through ‘eustress’, where ‘optimal’ learning occurs, and finally to ‘distress’. If learning is successful in eustress, stress hormones automatically go back to normal. In distress, an organism does not deal with stress hormones effectively. This has ill effects on the animal’s constitution and can result in illness or even death. We have added our own quadrant to the diagram, which we believe occurs in the ‘optimal learning’ area, where performance is at its peak. We depicted this area of learning as the point at which flight, fight, or freeze begins to occur. We created it because we feel that powerful learning takes place when an organism learns in this phase. We termed this addition the ‘Autonomic stress’ phase.

Important to note in this diagram is that a responsive choice on the part of the organism only occurs up to where this additional quadrant is depicted. High or ‘optimal’ stress causes an organism to be reactive.

When one feels as though there is no choice, it is tantamount to the feeling of being out of control. As outlined, the authors’ theory is that the feeling of being in control is a primary reinforcer. If reaction results in regaining control, the behavior of reacting has physiologically been reinforced for a number of reasons.

  1. Control has been achieved once more.
  2. The behavior has been reinforced by a primary reinforcer (control).
  3. Success causes stress hormones to reduce back into the green zone which then regenerates a feeling of well-being.

To illustrate, here is an example. A dolphin is on a hand station with a new trainer. All of a sudden a large green gym ball it has never seen before is dropped into the water next to it. The dolphin reacts by quickly leaving the trainer. The dolphin is then hesitant to station with the new trainer. Using our theory, the behavior would look like the following for the animal:

  • The dolphin lost perceived control when the ball landed nearby.
  • The dolphin swam away from the scary object thereby re-establishing perceived control.
  • The dolphin felt safer away from the new object and trainer.

The status quo is a feeling of safety and well-being. This is the state that an animal or person seeks to ensure when they are feeling too stressed. When we train animals, regaining control has a further complication in the shape of classical conditioning. Pavlov’s dogs did not choose to salivate when they heard the bell. In the case of the dolphin, the unconditioned response was swimming away from the ball and the trainer. Now the ball becomes paired with the new trainer and this association has resulted in the trainer becoming a conditioned punisher. A physiological reaction occurred. An animal regaining a sense of control is a comforting physiological sensation. In the case of the dolphin, the reaction or behavior was swimming away from the ball and the trainer. This behavior is paired with a reduction in stress hormones as well as a feeling of being in control. The association is that swimming away from the trainer is a way of regaining control. Remember that control is a primary reinforcer. By default, it is now possible that the very presence of the trainer is associated with a loss of control.

To look at it another way, this behavior of fleeing is paired with a reduction in stress hormones and a feeling of being in control. The feeling of well-being reinforces the behavior of fleeing. The lesson takes place at the top end of the learning performance curve, where having a sense of control is the motivation. In normal classical conditioning, a few pairings of the unconditioned reinforcer (e.g., food) and the novel stimulus (e.g., bell) result in the conditioned stimulus (bell) becoming a reinforcer. In the case of control, repetition may not be required to create the secondary reinforcer.

Case Study 1, Diagnosis:

Because Frodo was facing more trainers than her norm, her experience of clarity was probably being blurred, particularly with body manipulation cues, which are fairly subjective from trainer to trainer. She had begun to resist a rostrum manipulation - a behavior where she should respond by moving her body around if we gently move our hand around her rostrum. A silent contest had begun. More often than not, Frodo’s resistance to move resulted in the trainer, after trying to exert more pressure, giving up. Frodo won! The hand pushing was classically paired with the re-establishment of control when the trainer let go. This behavior eventually generalised into hard hand stationing, for which good trainers delivered a Least Reinforcing Scenario (LRS). Trainers who gave up were in fact negatively reinforcing Frodo because she was establishing control once more. This generalised into her pushing our hands during general hand stationing and finally into ramming people when we swam with her.

Case Study 2, Diagnosis:

Part of the integration training of Khethiwe into the group of four females was stationing the group in a line-up. Khethiwe did not appear comfortable in the new social situation. Any apparent anxiety she felt in the scenario was reduced by her leaving the line-up, thereby regaining her sense of control. Feeling better as a result of her reaction of moving away reinforced her fleeing the situation.

THE THEORY OF FLIGHT, FIGHT, OR FREEZE:

Here follows another helpful diagram (Figure 3) that can assist us in recognizing what is going on when animals experience a lack of control and how we can go about rectifying these situations.

Figure 3. Scales of Experiences and Reactions.

In Figure 3 the lines are continuums, so an animal can be anywhere on the sliding scales. The vertical line runs from free movement, where an animal feels free to express itself by moving, down to stasis, where an animal may have lost its faith in being able to move. The horizontal line runs from responsive, where behavior is choice-based, to reactive, where behavior is not chosen.

In the first quadrant on the top left, the experience would be free movement and responsive. The organism is free to choose and easily moving at will.

On the top right an organism would have free movement but stressors have moved them to a point of being reactive. Here an organism has lost its confidence or sense of control or faith in the situation or trainer. It still has confidence in its ability to move, and thus the result is flight.

In the bottom left quadrant stasis and responsiveness are the experiences. Here, an organism has lost its perceived control or choice to move, but remains responsive with a need to defend its position. For example, they know there is no food to receive elsewhere, and yet don’t know what we want from them. Thus, fight is the result.

In the final bottom right quadrant, an organism has lost confidence in everything – itself and the situation. This is where freeze occurs. We define freeze as a behavior where an animal seems to shut down and avoid any movement.

Fight, flight, or freeze have been typified as acute reactions. If we look at Figure 2’s modified Yerkes Dodson curve, and consider our personal experiences, we may note that we often experience chronic stress, perhaps as a result of a difficult time in our lives. We are able to cope with this and remain cognisant of our actions. Then a minor acute stressor occurs, perhaps someone driving too close to us in traffic, and we are suddenly shocked to find that we have overreacted to a situation that is not that serious. These situations also occur in animal training. We do not notice that our training is perhaps a little confusing, or that the animal is in a pattern of not being reinforced appropriately. Unwittingly we push the animal too far before a behavior is cemented, and the animal swims away from us or tries to aggress towards us or a conspecific. The reaction exhibited by ourselves and the animal is easily understood in the diagram. The stress hormones have been ‘on the edge’ and have been nudged into the autonomic stress zone by a minor incident. This has caused a reaction.

With fight and flight in mind, when we refer to our two case studies, the following are the concerns. In the case of Frodo, we had inadvertently reinforced Fight. In the case of Khethiwe, we needed to deal with her Flight.

How can we prevent these situations from occurring? The information might not be new to you; however, it is worth looking at again in the context of the diagram.

  1. Set them up to succeed – when an animal succeeds, it feels like it is in control.
  2. Success builds confidence – the more often an animal succeeds, the greater its faith in the trainer, the situation, and itself becomes. This faith is its sense of being in control. The classically conditioned associations are powerful in our training relationships.
  3. Go back a step – when the animal is failing, go back a step to enable a reduction in the stress hormones and to re-establish the classically conditioned stimuli alluded to in Point 2.
  4. Switch gears – don’t nag an animal to succeed. If you are unclear how to make a situation positive, move on and reconsider out of session.
  5. Stay present – Being caught up in the goal rather than being in relationship with the animal will mean we are not noting the subtle cues that could be telling us that the animal is feeling inappropriately stressed.

When we presented our paper on motivation we outlined four questions to which an animal must always have the answers. This will ensure that they are always experiencing a sense of being in control. Here are those questions.

  1. Why am I doing this? – Animals must be clear what the payoff is to participate in the learning session. For example, why must the dolphin stay for the introduction of the green gym ball? For food, stimulation, and attention. Remember, the animal must know the answer.
  2. How do I show I am ready for this session? – Animals must have the power to signal they are ready to start the session. For example, they are aware that the green ball is coming. They are still on station and ready for the introduction.
  3. How can I stop the session? – Animals must have a safe way of ending the training. For example, if the green ball is approaching, and the animal begins to shy away from it, the trainer should respond by removing the green ball.
  4. Where is my safe place? – The animals must have a retreat that they can choose at any point during the training. With the dolphin and the green ball we would aim for the safe place to be with a trainer. There could be another option. For example, an adjoining pool where the trainer respects their retreat, or another trainer.

Body language and the way an animal is responding will begin to tell us if it is feeling stressed in the session. Recognising these cues is a great way to avoid distress. Different signs will mean the animal is heading either towards fight or flight. Different species will have different natural history signs that signal they are becoming stressed. With all species, different individual animals may have a tendency to be either fight or flight animals. When we note the signs, it is important that we backtrack and do what needs doing to prevent the animal losing its sense of being in control.

At any point in time an animal may respond in flight, fight, or freeze, depending on the situation and its state of mind at that time. In general, however, an animal’s character and history will predispose it to go in one or the other direction. Table 1 shows the general signs that would predispose an animal to go into fight (aggress) or flight (drift off).

Table 1. Characteristics of a Fight or Flight Individual.

‘Fight-risk’ individuals | ‘Flight-risk’ individuals
Older | Younger
Thoughtful | Impulsive
Don’t move much out of session | Energetic out of session
Slow to respond to environmental stimuli | Very responsive to external stimuli
Dominates in social groups | Submits in social groups
Easily bored | Easily distracted

When uninhibited by thoughts or concerns, a state we term neutral, we as trainers are flexible, calm, present, and attentive to what is before us. When we work from this space, we generally succeed, so in essence, accomplished trainers have been reinforced to be neutral. Most of us do have our own tendencies that inhibit us from being perfect trainers. When our training sessions are not going according to plan, or we are experiencing personal stress, we are predisposed to these tendencies, which can unwittingly inspire fight or flight in our charges. Fight-inspiring trainers may be overly strict, very predictable, and not show much enthusiasm. They may even appear over-confident and are generally very goal-motivated. A flight-inspiring trainer might be more laissez-faire, more animated, and have unclear boundaries. These trainers appear unpredictable and train with anthropomorphic emotion. Table 2 outlines some of these generalities.

Table 2. Signs of a Fight, Neutral, or Flight Inspiring Trainer.


Fight-inspiring trainers | Neutral trainer | Flight-inspiring trainers
Insist on a response | Work in accordance with what is happening in the moment | Switch gears often
Respond predictably | Respond appropriately | Respond unpredictably
Less animated | Animated to reinforce appropriately | More animated
Reinforce predictably | Reinforce appropriately | Reinforce more for high energy behaviors
Slow training methods | Effective training methods | Unplanned training methods
Has high standards | Has standards that work for the situation and animal | Has unpredictable standards
Trains with logic | Trains with logic and emotion | Trains with emotion
Over-confident | Confident | Unconfident
Concerned about end point | Objectively concerned | Overly concerned about animal
Conservative | Realistic | Conservative

We can recognise if animals are heading towards fight or flight, and we can work to avoid them getting there. Signs that we are heading towards the fight or flight response are listed in Table 3. It should be noted that even if the animal is considered a flight- or fight-risk from the profile listed earlier, there could be instances where it ends up in a different quadrant. For example, if a fight-risk animal is in a new social group and not feeling very food-motivated, it may be more inclined to flight. Rules and tables will never take the place of good practical experience. Thus, the tables presented here are simply guidelines and food for thought.

Table 3. Characteristics of a Fight or Flight Animal.

Fight | Flight
Latent responses | Anticipating
Delivering precursors (different in each animal) | Increasing distance from you
Heavy eyes | Eyes darting
Moving slower than normal or not moving | Higher energy behaviors than normal
Seems to be deliberating | Not thinking
Appears disinterested | Easily distracted
Stuck on a step | Sharp movements
Seems stubborn | Seems nervous
May look lazy | Moving more than normal
If in a social group may displace on others | If in social group may favor others over you

 

With the two situations, our responses as trainers would be slightly different.

Fight – our goal is to re-establish the animal’s faith in themselves and their ability to move.

Flight – our goal is to re-establish their faith in us and the situation.

When we see signs of imminent flight or fight, we need to remember that the animal is heading towards the autonomic stress zone (Figure 3). Freeze occurs after both flight and fight, and hopefully we will pre-empt this situation by rectifying what comes before. If an animal retreats to freeze, it is no doubt in high stress, and we need to go back a number of steps to reaffirm our relationship with the animal, its feeling of safety in a situation or the environment, and its self-confidence.

Our responses to animals in these two scenarios are, generally speaking, outlined in Table 4. Once again, remember that individuals and different species will require customised solutions. At the end of the day it is always important to apply our best knowledge to each specific situation as it arises.

Table 4. Trainer Responses to Fight or Flight.

Fight | Flight
Ask them for less | Let them burn off energy with high energy behaviors
Reward them more than normal for participation | Big reinforcement for their calm responses
Be more flexible | Be less animated
Go back a step, reinforce and then be variable in the session with fun creative learning | Go back a step, reinforce and then offer easy behaviors
Make your steps smaller | Make your steps simpler to understand
Switch gears and do fun behaviors they know well | Do behaviors they know well – consistency is key
Keep learning exciting and variable | Keep learning sessions short and simple
Get them to move, even if it means you move position. Reinforce movement strongly | Let them move and approximate them slowly to keep still
Don’t repeat steps too often in a session | Repeat steps more often

 

When diagnosing a problem behavior, noting this outline can provide clarity and assist us in finding a solution.

Case Study 1, Solution – End the battle:

In our first case study, we needed to resolve the fight concern. We had to stop the argument. A power-play had been created and this is a stress-enhancing scenario for the animal. Trainers were participating in the debate. There was more than one trainer doing this, and the subjective nature of the situation was therefore being communicated to Frodo in a confusing manner. This may have added to her frustration as we were not being clear and consistent with her. Our most important job was to end the possibility of her becoming frustrated. We started by reducing the number of trainers working with her. We avoided scenarios where she would push, thus not providing her the opportunity to rehearse the behavior and have it reinforced. We kept her sessions variable and exciting. We did not make them all about the hand stationing. We retrained her hand targeting within these sessions. We strongly reinforced any gentle hand stationing. Slowly we entered back into the water making gentle interactions the priority.

Case Study 2, Solution – Using control as a reinforcer:

To increase Khethiwe’s confidence in the new situation, we used her sense of control as a reinforcer. Initially, whenever she looked as though she wanted to move off during the session, we sent her to a second trainer (trainer B) away from the line-up. Thus, we paired her primary need for control with a cue of being sent away from the apparent stressor. This increased her faith in us and the line-up because she had the choice to get away. The cue of being sent away became a classically conditioned reinforcer. We then began sending her away before she showed a need to flee, thereby strongly negatively reinforcing her for staying. She was also positively reinforced by trainer B. We approximated an increase in time spent next to the other girls. In a short space of time her comfort level in the line-up increased. It was interesting to note that initially, if anything out of the ordinary occurred, she would resort to the flight response. In time and with sensitive management, this response has significantly reduced. The integration into the female group has been successful, with solo gating and group work all working well.

CONCLUSION

Ensuring an animal always has perceived control takes our training to the next level. To provide this control we need to be more perceptive in our relationships with them rather than simply applying the mechanics of operant conditioning. It takes gut instinct and logically analyzed feeling to assess where an animal is in relationship with us and the lesson. When we succeed at training in this way, we ensure that we are in a partnership with them as opposed to a dominant relationship. It also ensures that our lessons are more ethical, and more effective. If an animal has to seek out control on its own terms, we have failed to fulfil our commitment to be ethical animal trainers.

The author is willing to correspond with interested trainers. You can contact Gabby Harris at gabby@seaworld.org.za.

Editor's Note: This paper received the following award during the 43rd IMATA conference at Nassau, Bahamas: 2nd place Behavioral Training Award.

References

Frankl, V. E. Quote. Retrieved 4 August 2015 from www.brainyquote.com/quotes/quotes/v/viktorefr160380.html.

Harris, G., Susta, F., Parker, S., & Bodenstaff, C. (2013, September). Maintaining intrinsic motivation when using operant conditioning. Paper presented at the 41st IMATA Conference, Las Vegas, NV.

Klappenback, S., Davis, C., & Todd, M. (Eds.). (2005). Training and behavioral terms glossary. IMATA. Retrieved 1 August 2015, from www.imata.org.

Lupien, S. J., Maheu, F., Tu, M., Fiocco, A., & Schramek, T. E. (2007). The effects of stress and stress hormones on human cognition: Implications for the field of brain and cognition. Brain and Cognition, 65: 209–237. doi:10.1016/j.bandc.2007.02.007. PMID 17466428.

Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit-formation. Journal of Comparative Neurology and Psychology, 18: 459–482.

