Above: Mr B midway through his training to step and sit on the scale.
Training with Mr B, a Siamese cat from a rescue facility, has been filled with interesting variables.
First, once he “got” training he began purring, and he now purrs through every training session.
When you start an animal training program, you first establish the end goal. Most people then write a “training plan” that outlines the steps to get to the final behavior.
The lesson plan looked like this:
- train Mr B to understand the marker,
- get a stationary behavior away from the food bowl,
- maintain the behavior until released with a word,
- look at an object,
- investigate an object,
- step on the object,
- move another paw onto the object,
- move a third paw onto the object,
- move all four paws onto the object,
- sit on the object (scale).
To accomplish the training I use a noise-making device to “mark” the behavior.
Most people use a verbal marker, usually the word “good,” so that the animal knows the behavior is the right one.
The marker signals that something good is coming.
A reward can be praise, petting, play, food, or something else an animal enjoys.
Most people make the mistake of thinking that there always has to be a food reward. This is incorrect.
A reinforcer just needs to be something an animal enjoys.
When you use a marker, such as a clicker or a whistle, it signals the precise moment the animal is engaged in the right behavior–so the behavior is marked.
A recent study indicates that clickers might work better than a verbal marker because animals are already conditioned to voice intonations.
In Mr B’s case I have been using a clicker as an IOU for the reward. The “click” means “that is right, and a reward is coming.”
His primary reward is food but the clicker becomes a secondary reinforcer (reinforcing in itself) through its use.
Most animals do better if you go to a variable schedule of reinforcement versus a consistent one but you have to be skilled in order to use variable reinforcement.
Let me explain:
- In the beginning you must always do a consistent schedule. This means one click gets one treat.
- Later, when you introduce a variable, the reward might come on the first, second, or third click, and you vary it each time.
There are other types of variables you can use (like the time interval I mentioned as the “window of opportunity”) but the most common is varying the delivery of the primary reinforcer or reward.
Mr B understood the concept of training quickly and began purring because he liked training.
He invents games and so training school was really to his liking–plus he could now manipulate my behavior when all his previous attempts had failed!
I started him on the training program to redirect feeding away from the food bowl (where he is frenzied) and into other pursuits.
He is also in a weight loss program due to his gross obesity and is slimming down nicely, which is why his normal kibble is used as the reward: no extra treats for this big boy, who was having trouble breathing and moving around when we started.
Once he understood that one click equals one treat, he could experiment and then get rewarded for it.
The first thing I did was teach him to sit away from the food bowl outside of the kitchen.
He got that quickly.
This is replacing an undesirable behavior with an acceptable one.
Previously he would run into the kitchen in a frenzy thinking he would be fed.
He would also bat and bite if he was not fed…drawing blood from the owner is not acceptable.
Now he knows that if he wants to be fed he must go sit and wait…and so he does.
BUT what he also learned is that he should be on the carpet not on the hard flooring.
This became clear when I began training the scale behavior in the kitchen on the hard floor and then, only after he had learned the “step on the scale” lesson, transferred it to another location while we were working on variables.
If you always do something the same way, in the same place and at the same time, it conditions an animal to that behavior at that place and time.
Variables add variety and take away the cue (aka discriminative stimulus or SD) that was previously established by accident.
Before, the kitchen and the presence of the owner, sometimes at a specific time, were the signal for feeding.
Now the SD is the release word followed by the phrase, “come eat” or “come.”
When you have a pattern of behavior that has been reinforced for years, it is harder to extinguish than something more recent.
Mr B still has to be reminded with an “out of the kitchen” command, but he is getting better daily.
Anyway, Mr B is very diligent and persistent. So, he did not do well at first with the variable reinforcement schedule.
When I introduced the variable he stared at me.
Then, unsure what to do, he just sat down, which is the first behavior he learned and one that works for him in many cases.
So, I went back to a consistent schedule and raised the criteria for the behavior instead of using a variable to mark the various stages of behavior that he was doing correctly.
That means he had to do more to get a click and then a reward.
One foot on the scale was no longer enough for a click–he had to put two, and then he was rewarded at that level a few times before three feet were required.
Some animals do better than others with training concepts, and since this is so new to him I had to step back in the training process to make it clearer to him.
Part of the problem was that during the initial variable, instead of focusing on the training, he began looking for other ways to get rewarded.
He has long been a master of manipulation, so he moved through several stages. Here they are, with my interpretations:
- Mr B staring at the clicker for long, hard durations: “If I stare hard enough and will it–she will click.”
- Next, staring at the counter and food bucket: “If I can successfully jump onto the counter, I can get ALL the food rewards in the dispenser.”
- Finally, “It is easier to experiment around the object than to try and focus on other things.”
Training on the scale was a fun process, and now we are using another scale.
Variables can sometimes confuse an animal, so moving locations, changing the target object, and other activities can help establish the behavior and generalize it to other circumstances.
I’ll talk more about that next.