Behavior that has been acquired through either Pavlovian or operant conditioning can be reduced when the outcome or reinforcer is withheld following the conditioned stimulus (CS) or response. Although such extinction can eliminate behavior, it is known that it does not erase the original learning. For example, extinguished responding can return when the animal is removed from the physical context of extinction in both Pavlovian (Bouton & Bolles, 1979; Bouton & Peck, 1989) and operant (Bouton, Todd, Vurbic, & Winterbauer, 2011; Nakajima, Tanaka, Urushihara, & Imada, 2000) procedures. This return of responding is known as renewal. Renewal suggests that extinction results in the creation of new learning that is especially context-dependent, rather than erasing the original learning. In Pavlovian procedures, contextual cues appear to disambiguate a CS that now has two distinct meanings (from both acquisition and extinction; e.g., Bouton, 2002). However, recent evidence suggests that during operant extinction, animals learn to inhibit a specific response in a specific context (e.g., Bouton & Todd, 2014; Todd, 2013; Todd, Vurbic, & Bouton, 2014). Removal of contextual cues associated with inhibition of the response is enough to cause a return of responding. This is the case even when the contexts are matched for associative history (Todd, 2013). Renewal has also been shown when the response has been eliminated through punishment, rather than extinction (Bouton & Schepers, 2015; Marchant, Khuc, Pickens, Bonci, & Shaham, 2013).

Several other “relapse” phenomena that have been demonstrated after extinction make a similar point (e.g., Bouton & Woods, 2008; Vurbic & Bouton, 2014). For example, in experiments on resurgence, animals learn to perform one response (R1) to receive a food outcome. Once this behavior is established, R1 is extinguished and a newly introduced second response, R2, is now reinforced. When reinforcement for R2 behavior is subsequently removed, R1 responding increases or “resurges” (Leitenberg, Rawson, & Bath, 1970; Leitenberg, Rawson, & Mulick, 1975). Although there are several accounts of resurgence (see Leitenberg et al., 1970; Shahan & Sweeney, 2011), it has been suggested that resurgence is a special case of the renewal effect, in which the removal of reinforcers creates the contextual change necessary to produce relapse (e.g., Winterbauer & Bouton, 2010). According to this view, receiving reinforcers for R2 creates a context for the extinction of R1, and the removal of those reinforcers creates a context change that causes behavior to return. If resurgence occurs due to a lack of generalization between the context with reinforcers and the testing context without, then encouraging generalization between these two phases should theoretically decrease resurgence. Consistent with this idea, it has been shown across a wide array of experimental procedures that leaner schedules of R2 reinforcement (which should encourage greater generalization to the testing phase, during which no reinforcers are delivered) do attenuate (and sometimes abolish) the resurgence effect (Bouton & Schepers, 2014; Bouton & Trask, in press; Leitenberg et al., 1975; Schepers & Bouton, 2015; Sweeney & Shahan, 2013; Winterbauer & Bouton, 2012).

The contextual explanation of resurgence follows a research tradition suggesting that reinforcers have a discriminative, as well as a reinforcing, role. For example, Reid (1958) found, across rat, pigeon, and human subjects, that an extinguished operant behavior could return when a reinforcer that had previously been used to establish that behavior was presented noncontingently. Presenting the reinforcer returned a stimulus (or context) that had originally set the occasion for the response. In an experiment reported by Ostlund and Balleine (2007), rats learned to perform one response (R1) for one outcome (O1). However, R1 only produced O1 following a noncontingent presentation of a second outcome (O2), which served as the discriminative stimulus to signal the R1–O1 relationship (i.e., O2: R1–O1). Concurrently, animals were also taught to perform a second response (R2) to earn the O2 reinforcer, but only after free presentation of the O1 reinforcer (O1: R2–O2). When tested for reinstatement following O1 and O2 presentations, it was found that when O1 was presented, R2 was elevated relative to R1, and when O2 was presented, R1 was elevated relative to R2. In other words, each reinforcer selectively elevated the response that it preceded rather than the one it followed, further suggesting that the stimulus properties of the reinforcer (and not its reinforcing properties) accounted for the reinstatement effect. Stimulus properties of the reinforcer have been invoked to explain a number of interesting phenomena in animal learning (e.g., Neely & Wagner, 1974; Sheffield, 1949).

The idea that reinforcers can have discriminative properties has been expanded to include the idea that reinforcers can also control extinction performance. For example, in an experiment reported by Bouton, Rosengard, Achenbach, Peck, and Brooks (1993), rats received repeated cycles of conditioning, in which a tone CS was paired with a food US, followed by extinction, in which the CS was no longer paired with the US. One group of animals, Group HiLo, received unsignaled US presentations during the intertrial interval (ITI) throughout conditioning, but not extinction. A second group, Group LoHi, received unsignaled US presentations during the ITI as a feature of extinction, but not of conditioning. When finally tested with and without US presentations, animals in Group HiLo showed enhanced conditioned responding when the US was presented in the ITI, whereas Group LoHi showed a suppression of conditioned responding when the US was presented in the ITI. In other words, when the extra US presentations were a feature of acquisition, they enhanced responding, and when they were a feature of extinction they inhibited responding. Similarly, Lindblom and Jenkins (1981) reduced conditioned responding through either negatively correlated or noncorrelated presentations of the CS and US; both procedures involve unsignaled presentations of the US during response elimination (as in Bouton et al., 1993). They found that removal of the food US entirely resulted in a renewal-like return of responding, wherein the context change was created by removal of the US presentations.

In a direct test of the idea that removal of reinforcers in the resurgence paradigm causes renewal through context change, Bouton and Trask (in press, Exp. 2) found that distinct reinforcers associated with extinction can serve as an effective cue to inhibit operant responding. In that experiment, rats were taught to perform a response (R1) for a distinct outcome (O1). In a second phase, R1 was placed on extinction, whereas a newly introduced response (R2) produced a new reinforcer (O2). In a final phase, in which both levers were available but neither was reinforced, rats were tested in one of three conditions: with O1 reinforcers given freely (these had been associated with acquisition of R1), with O2 reinforcers given freely (these had been associated with acquisition of R2 and inhibition of R1), or with no reinforcers. Whereas animals tested with either O1 reinforcers or no reinforcers showed a robust increase in R1 responding (i.e., resurgence), animals tested with the O2 reinforcers showed no increase in R1 responding. The presentation of O2 reinforcers during the test made testing more similar to response elimination conditions. In terms of the context hypothesis, the animals had learned to suppress their R1 behavior in a distinct “context” signaled by the addition of O2.

If reinforcers delivered in extinction can come to suppress or inhibit performance, as the context account of resurgence suggests, then they should also be able to inhibit other examples of relapse after extinction. In the present experiments, we therefore asked whether a distinct O2 that had been presented during extinction might actively attenuate the renewal of a free-operant response. In all experiments, rats received operant conditioning in Context A, extinction in Context B, and then renewal testing in Context A. In Experiment 1, presentations of an O2 reinforcer delivered noncontingently during extinction attenuated operant ABA renewal. A second experiment replicated this effect and demonstrated that in order for O2 to attenuate ABA renewal, it had to be associated with response inhibition. A reinforcer that was not presented during extinction did not suppress behavior to the same degree. Experiment 3 then showed that when they were combined, changing the reinforcer context and the physical context could have additive effects. Although changing either was enough to cause an increase in responding, response recovery increased in magnitude when both were changed.

Experiment 1

In Experiment 1, we examined whether an O2 reinforcer presented during extinction could attenuate ABA renewal. In a within-subjects design, rats learned to press a lever for one outcome, O1 (either grain-based food pellets or sucrose-based food pellets, counterbalanced) in Context A. Once animals had acquired lever pressing, they were switched to a new context, Context B, where lever pressing no longer produced O1. At this time, however, a new reinforcer, O2 (either sucrose-based pellets or grain-based pellets, counterbalanced), was presented independently of responding throughout the session. Here, rats could potentially learn to inhibit lever pressing in both the physical Context B and the reinforcer Context O2. During the test, animals were switched back to Context A and tested for the renewal of lever pressing (where lever pressing produced no outcomes). They were tested both with free O2 reinforcers presented as they were during extinction and without reinforcers. If O2 reinforcers can cue or control the inhibition of responding that develops in extinction, then the renewal effect should be attenuated by the presentation of O2.

Method

Subjects

The subjects were 16 naïve female Wistar rats purchased from Charles River Laboratories (St. Constance, Quebec). They were between 75 and 90 days old at the start of the experiment and were individually housed in suspended wire mesh cages in a room maintained on a 16:8-h light:dark cycle. Experimentation took place during the light period of the cycle. The rats were food-deprived to 80 % of their initial body weights throughout the experiment.

Apparatus

Two sets of four conditioning chambers housed in separate rooms of the laboratory served as the two contexts (counterbalanced). Each chamber was housed in its own sound attenuation chamber. All boxes were of the same design (Med Associates Model ENV-008-VP, St. Albans, VT). They measured 30.5 × 24.1 × 21.0 cm (l × w × h). A recessed 5.1 × 5.1 cm food cup was centered in the front wall approximately 2.5 cm above the level of the floor. A retractable lever (Med Associates model ENV-112CM) positioned to the left of the food cup protruded 1.9 cm into the chamber. The chambers were illuminated by one 7.5-W incandescent bulb mounted to the ceiling of the sound attenuation chamber, approximately 34.9 cm from the grid floor at the front wall of the chamber. Ventilation fans provided background noise of 65 dBA.

In one set of boxes, the side walls and ceiling were made of clear acrylic plastic, whereas the front and rear walls were made of brushed aluminum. The floor was made of stainless steel grids (0.48 cm diameter) staggered such that odd- and even-numbered grids were mounted in two separate planes, one 0.5 cm above the other. This set of boxes had no distinctive visual cues on the walls or ceilings of the chambers. A dish containing 5 ml of Rite Aid lemon cleaner (Rite Aid Corporation, Harrisburg, PA) was placed outside of each chamber near the front wall.

The second set of boxes was similar to the lemon-scented boxes except for the following features. In each box, one side wall had black diagonal stripes, 3.8 cm wide and 3.8 cm apart. The ceiling had similarly spaced stripes oriented in the same direction. The grids of the floor were mounted on the same plane and were spaced 1.6 cm apart (center to center). A distinct odor was continuously presented by placing 5 ml of Pine-Sol (Clorox Co., Oakland, CA) in a dish outside the chamber.

The reinforcers were a 45-mg grain-based rodent food pellet (5-TUM: 181156) and a 45-mg sucrose-based food pellet (5-TUT: 1811251, TestDiet, Richmond, IN, USA). Both types of pellet were delivered to the same food cup. The apparatus was controlled by computer equipment located in an adjacent room.

Procedure

Magazine training

On the first day of the experiment, all rats were assigned to a box within each set of chambers. They then received one 30-min session of magazine training in Context A with their O1 reinforcer (grain-based or sucrose-based food pellet, counterbalanced). On the same day, the animals also received a second 30-min session of magazine training in Context B with their O2 reinforcer (sucrose-based or grain-based food pellet, counterbalanced). Half the animals were trained first in Context A, and half were trained first in Context B. The sessions were separated by approximately 1 h. Once all animals were placed in their respective chambers, a two-minute delay was imposed before the start of the session. In each magazine training session, approximately 60 reinforcers were delivered freely on a random time 30-s (RT 30-s) schedule. The levers were not present during this training.
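The random-time schedule described above can be illustrated with a short simulation. This is only a sketch: it assumes that deliveries are decided in discrete 1-s ticks, each with probability 1/30 and independent of behavior, which the source does not state, and the function names are hypothetical. Under that assumption, a 30-min session yields 1800/30 = 60 deliveries on average, in line with the approximately 60 reinforcers reported.

```python
import random


def rt_schedule(session_s=1800, mean_interval_s=30, seed=None):
    """Return delivery times (in s) for a random-time (RT) schedule.

    Assumption (not from the source): the session is divided into 1-s
    ticks, and each tick delivers a pellet with probability
    1 / mean_interval_s, regardless of what the animal is doing.
    """
    rng = random.Random(seed)
    p = 1.0 / mean_interval_s
    return [t for t in range(session_s) if rng.random() < p]


def expected_deliveries(session_s=1800, mean_interval_s=30):
    """Expected number of pellets in a session under the assumption above."""
    return session_s / mean_interval_s
```

Because deliveries never depend on a response, the same function would describe free reinforcer presentations whether or not a lever is present.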

Acquisition

On each of the next six days, all rats received two 30-min sessions of instrumental training in Context A. The sessions were separated by approximately 1 h. Following a 2-min delay, sessions were initiated by the insertion of the lever into the chamber. Throughout the sessions, presses on the lever delivered O1 reinforcers on a variable interval 30-s (VI 30-s) schedule of reinforcement. No hand shaping was necessary.
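Unlike the RT schedule used in magazine training, a VI schedule makes the reinforcer contingent on a response: a pellet becomes available after a variable interval and is delivered by the next lever press. The sketch below assumes exponentially distributed intervals with a mean of 30 s, which is only one common way to build a VI schedule; the source does not specify the interval distribution, and `vi_schedule` is a hypothetical name.

```python
import random


def vi_schedule(response_times, mean_interval_s=30.0, seed=None):
    """Return the response times (in s) that earn a reinforcer on a VI schedule.

    Assumption (not from the source): inter-reinforcer intervals are drawn
    from an exponential distribution with the given mean.
    """
    rng = random.Random(seed)
    reinforced = []
    # Time at which the next reinforcer becomes available ("armed").
    armed_at = rng.expovariate(1.0 / mean_interval_s)
    for t in sorted(response_times):
        if t >= armed_at:
            # First response after arming collects the reinforcer,
            # and a new interval starts timing from that response.
            reinforced.append(t)
            armed_at = t + rng.expovariate(1.0 / mean_interval_s)
    return reinforced
```

With no responses the function returns an empty list, which captures the key difference from the RT schedule: without lever pressing, no pellets are delivered.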

Extinction

On each of the next four days, all animals then received two sessions of response extinction in Context B. As before, following a 2-min delay, the lever was inserted and available for 30 min. During this phase, responding on the lever had no programmed consequences. During both the delay period and throughout the session, however, O2 reinforcers were delivered freely (i.e., not contingent on responding) according to an RT 30-s schedule of reinforcement.

Test

On the final day of the experiment, all rats were given two 10-min renewal tests in Context A. Following a 2-min delay, each test session began with the insertion of the lever and ended with the retraction of the lever. One test session occurred with O2 reinforcers delivered on the RT 30-s schedule during the delay and throughout the session. The other test session was conducted without any reinforcer presentation. Testing order was counterbalanced such that half of the animals were tested first with reinforcers present, and half of the animals were tested first without.

Data analysis

The data were subjected to either t tests or analysis of variance (ANOVA) as appropriate. For all statistical tests, the rejection criterion was set at p < .05.

Results

The results from acquisition (left panel), extinction (center panel), and test (right panel) are shown in Fig. 1. Animals increased responding throughout the acquisition phase and decreased responding throughout the extinction phase. During the test, animals showed an increase in responding (the standard ABA renewal effect) that was significantly attenuated by presentation of the O2 reinforcer.

Fig. 1. Results of Experiment 1: Acquisition in Context A with O1 (left panel), extinction in Context B with noncontingent O2 (center panel), and renewal testing in Context A (right panel) with both free O2 reinforcers and no reinforcers delivered. All available comparisons are performed within subjects. Note the changes in the y-axes.

Acquisition

As expected, the rats increased their responding over the 12 sessions of acquisition. This was confirmed by a repeated measures ANOVA conducted to assess responding throughout the 12 sessions of the acquisition phase, which showed a significant main effect of session, F(11, 165) = 21.67, MSE = 80.42, p < .001, ηp² = .59.
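As a check on the reported effect size, partial eta squared can be recovered from the F ratio and its degrees of freedom using the standard identity below. The symbols df_1 and df_2 (effect and error degrees of freedom) are our notation, not the authors'.

```latex
\eta_p^2 = \frac{F \cdot df_1}{F \cdot df_1 + df_2}
         = \frac{21.67 \times 11}{21.67 \times 11 + 165}
         = \frac{238.37}{403.37} \approx .59
```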

Extinction

Animals decreased their responding throughout the eight sessions of the extinction phase, as confirmed by a repeated measures ANOVA conducted to assess responding over this phase, F(7, 105) = 14.01, MSE = 29.64, p < .001, ηp² = .48. On the final day of extinction, rats that were to be tested first with reinforcers (M = 2.21) did not differ from those to be tested first without reinforcers (M = 4.89), t(14) = 0.73, p = .48.

Test

During the test, there was substantial responding in Context A. However, less responding occurred in test sessions in which free O2 reinforcers were presented on an RT 30-s schedule than in test sessions in which no reinforcers were presented. This was confirmed by a paired-samples t test that assessed responding throughout the first 3 min of each condition, t(15) = 2.42, p < .05, η² = .28. Additionally, t tests were conducted to assess renewal of responding from the last day of extinction to both the free O2 reinforcer and no-reinforcer conditions. These revealed significant renewal in both the free O2 reinforcer condition, t(15) = 4.79, p < .001, η² = .60, and the no-reinforcer condition, t(15) = 7.33, p < .001, η² = .78. Together, the results indicate a robust contextual renewal effect that was attenuated in the free O2 test relative to the no-reinforcer test.
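For readers who want to reproduce this style of analysis, the following sketch shows how a paired-samples t statistic and the corresponding eta squared could be computed. It is illustrative only: the function names are hypothetical, it uses the standard identity η² = t² / (t² + df), and it does not use or reconstruct the authors' raw data.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, df) for a paired-samples t test on two equal-length lists."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    # Standard error of the mean difference (sample SD of the differences).
    se = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / se, len(diffs) - 1


def eta_squared_from_t(t, df):
    """Effect size for a t test: t^2 / (t^2 + df)."""
    return t * t / (t * t + df)
```

Applied to the statistics reported above, `eta_squared_from_t(2.42, 15)` is approximately .28, `eta_squared_from_t(4.79, 15)` approximately .60, and `eta_squared_from_t(7.33, 15)` approximately .78, matching the reported values.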

Discussion

As predicted, animals responded significantly less in Context A when O2 reinforcers (previously associated with extinction) were presented than when tested without reinforcers. O2 presentations did not abolish the renewal effect entirely, as responding was still increased when animals were tested with reinforcers relative to the final day of extinction, at which time responding was relatively low. However, the suppressive effects of the reinforcers are consistent with the idea that their addition to the test in Context A made the testing context more similar to that of extinction and provided a retrieval cue to signal response inhibition. In the present experiment, although the physical context changed between extinction and both tests, the reinforcer context only remained consistent between extinction and the O2 test. Thus, during the test without reinforcers present, the contextual change was more complete as both the physical context and the reinforcer context had changed, resulting in a greater renewal of responding.

Experiment 2

One alternative explanation of the results of Experiment 1 is that reinforcer presentation might have a generally suppressive effect on behavior. For example, presentation of O2 during testing might merely cause the animal to engage in competing behaviors, such as entering the food magazine or eating. Shahan and Sweeney (2011) have also emphasized the potentially disruptive effects of presenting reinforcers during Phase 2 in the resurgence design. However, Bouton and Trask (in press, Exp. 3) found that presentation of an O2 reinforcer noncontingently in extinction after operant conditioning with O1 actually augmented, rather than suppressed, responding relative to a group that received no extra reinforcers (see also Baker, 1990; Rescorla & Skucy, 1969; Winterbauer & Bouton, 2011). Such results suggest that the presence of noncontingent reinforcers on their own does not necessarily suppress operant responding. The approach under investigation here implies that they need to be featured in extinction first.

In Experiment 2, we therefore examined whether noncontingent O2 presentations needed to be featured in extinction in order to suppress the renewal effect. As in Experiment 1, all rats acquired lever pressing for an O1 reinforcer in Context A; lever pressing was then extinguished in Context B in the presence of noncontingent O2 reinforcers. Animals were then returned to and tested in Context A. In one group, testing occurred as in Experiment 1, both with noncontingent O2 reinforcers and without any reinforcers. For a second group, animals were tested with both noncontingent O1 reinforcers and without any reinforcers. O1 was presented at the same rate that O2 reinforcers had been delivered during extinction. If O2 reinforcers create a unique context in which extinction learning took place, then O2 reinforcers, but not O1 reinforcers, should suppress the ABA renewal effect. In other words, we hypothesized that in order for a reinforcer to be an effective retrieval cue to signal inhibitory learning, it had to be uniquely associated with response inhibition.

Method

Subjects and apparatus

The subjects were 24 naïve female Wistar rats of the same stock and maintained in the same conditions as Experiment 1. The same apparatus was used. Each animal was given two daily sessions.

Procedure

Magazine training, acquisition, and extinction

Magazine training, acquisition, and extinction proceeded in exactly the same way as Experiment 1, with the sole exception being that for each animal, sessions were separated by 1.5 h instead of 1 h.

Test

On the final day of the experiment, rats were each given two 10-min renewal tests in Context A. During the first session, eight animals were tested with O1 reinforcers delivered on an RT 30-s schedule, eight were tested with O2 reinforcers delivered on an RT 30-s schedule, and eight were tested with no reinforcer delivery. During the second test, the animals that had been tested with either O1 or O2 reinforcers were now tested without any reinforcers. Half of the animals that had been tested first with no reinforcer presentation were now tested with O1 presentations delivered on an RT 30-s schedule, and half were tested with O2 presentations delivered on an RT 30-s schedule. In addition to a between-subjects comparison of the effects of O1, O2, and no reinforcer presentations on the first test, the experiment thus resulted in a group of rats that was tested both with and without O2, and another group that was tested both with and without O1. Although the counterbalancing of test orders in the latter grouping was not complete (eight rats were tested first with free reinforcers, and four were tested first without), an ANOVA on test data that included Test Order as a factor showed that test order did not interact with the other factors, Fs ≤ 1.55, ps ≥ .23.

Data analysis

All data were subjected to t tests or ANOVA where appropriate, with a rejection criterion of p < .05. One animal failed to learn extinction by the final day of extinction (when it still made 56.6 responses per minute, Z = 3.21) and was therefore excluded from all analyses (Field, 2005).
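The exclusion above rests on a standardized (Z) score. A minimal sketch of that kind of screen follows. It assumes the sample standard deviation is used and leaves the cutoff as a required argument, because the source reports only the excluded animal's Z value and not the threshold applied; both function names are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample mean and sample SD (an assumption)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def flag_outliers(values, cutoff):
    """Return indices whose absolute z score exceeds a caller-supplied cutoff."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > cutoff]
```

Keeping the cutoff explicit avoids implying a particular criterion that the source does not state.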

Results

The results from all three phases of Experiment 2 are shown in Fig. 2. Responding increased throughout acquisition (left panel) and decreased throughout extinction (center panel). During the test (right panel), responding was significantly attenuated by O2, but not by O1, presentations compared to the test with no reinforcers present.

Fig. 2. Results of Experiment 2: Acquisition in Context A with O1 for animals later tested on either O1 or O2 (left panel), extinction in Context B with noncontingent O2 for animals later tested on either O1 or O2 (center panel), and renewal testing in Context A (right panel) with both free reinforcers (either O1 or O2) and no reinforcers delivered. Error bars are only appropriate for between-group comparisons. Note the changes in the y-axes.

Acquisition

All animals increased responding throughout the 12 sessions of acquisition, as was confirmed by a 2 (Group) × 12 (Session) ANOVA in which a significant main effect of session was found, F(11, 231) = 54.59, MSE = 46.27, p < .001, ηp² = .72. Neither the main effect of group nor the Group × Session interaction was significant, Fs < 1.

Extinction

The rats decreased responding over the eight sessions of extinction. A 2 (Group) × 8 (Session) ANOVA revealed a significant main effect of session, F(7, 147) = 8.99, MSE = 20.98, p < .001, ηp² = .30, but no main effect of group, nor a Group × Session interaction, Fs < 1. A one-way ANOVA conducted on response rates on the final day of extinction confirmed that animals did not differ on the basis of whether they were to be tested first with O1 (M = 9.1), O2 (M = 7.4), or no reinforcers (M = 2.0), F < 1.

Test

As in Experiment 1, the rats responded less in the renewal test when O2 reinforcers were presented freely than when they were tested without the reinforcer presentations. However, animals tested with O1 reinforcers showed no difference in responding relative to when tested without reinforcers. Data for the rats tested with and without O1 and with and without O2 over the two test sessions are presented at right in Fig. 2. A 2 (Group) × 2 (Test Condition: free reinforcers vs. no reinforcers) ANOVA conducted over the first 2 min of the test revealed a significant main effect of test condition, F(1, 21) = 5.40, MSE = 55.61, p < .05, ηp² = .20, with rats responding less in the reinforcer condition than the no-reinforcer condition. There was no main effect of group, nor a significant interaction, Fs < 1. However, planned comparisons to examine within-subjects differences revealed that whereas Group O1 showed no difference between the free O1 reinforcer and no-reinforcer conditions, F < 1, Group O2 showed a significant reduction in responding when presented with free O2 reinforcers (as had been presented in extinction), as compared to when tested with no reinforcers, F(1, 21) = 5.17, p < .05, ηp² = .20. Interestingly, the groups did not differ in either the free-reinforcer condition or the no-reinforcer condition, Fs < 1.

Analyses that focused on the first test session supported the same conclusions. Recall that during the first test, eight rats each were tested with free O1, free O2, and no reinforcers (None). Animals with O1 reinforcers and no reinforcers showed a significant renewal effect when assessed during the test, but this was not true of the animals in Group O2. The pattern was confirmed by a 3 (Group) × 2 (Session) ANOVA conducted to assess responding from the last day of extinction to the first 3 min of the test, which showed a significant main effect of session, F(1, 20) = 16.22, MSE = 68.47, p < .001, ηp² = .45, with animals increasing responding in the test relative to extinction. Neither a main effect of group nor a significant interaction emerged, Fs < 1. Planned comparisons that assessed within-subject changes from extinction to the test (i.e., renewal) revealed that whereas the animals in both Group O1, F(1, 20) = 5.83, p < .05, ηp² = .55, and Group None, F(1, 20) = 9.81, p < .01, ηp² = .80, showed an increase in responding from the last day of extinction to the test (i.e., a renewal effect), Group O2 did not, F(1, 20) = 1.90, p = .18. The mean response rates on the last extinction session were 9.1, 7.4, and 2.0, and the mean rates on the first test session were 19.1, 13.1, and 15.8, for Groups O1, O2, and None, respectively. The results continue to suggest that the suppression of renewal depends on receiving a reinforcer that had been featured in extinction.

Discussion

As we predicted, subjects tested with response-independent O2, but not O1, showed a significant attenuation of responding when tested in the free reinforcer condition as compared to the no-reinforcer condition. In a complementary way, rats first tested with O1 presentations showed a significant renewal effect, whereas rats first tested with O2 presentations did not. This pattern, coupled with previous findings suggesting that reinforcers delivered independently of responding during or after extinction do not usually suppress responding (e.g., Baker, 1990; Bouton & Trask, in press; Rescorla & Skucy, 1969; Winterbauer & Bouton, 2011), suggests that in order for a reinforcer to attenuate ABA renewal, it needs to be a feature of extinction learning. The results are not consistent with the idea that the results of Experiment 1 were due to O2 unconditionally eliciting competing behaviors, or otherwise disrupting operant responding (e.g., Shahan & Sweeney, 2011); in that case, O1 should have been equally effective here. The results are instead consistent with the idea that noncontingent O2 presentations reduced renewal because they were associated with extinction or response inhibition.

Interestingly, and perhaps surprisingly, O1 presentation during testing did not augment or reinstate renewed responding above that seen when the rats were tested without reinforcers. This suggests that reinstatement by O1 reinforcers does not add to the renewal effect that occurs when testing occurs in Context A after extinction has occurred in Context B. This aspect of the findings will be discussed in more detail in the General Discussion.

Experiment 3

In resurgence, the context hypothesis proposes that removal of alternative reinforcers has a renewing effect that parallels the simple renewal effect that occurs when the animal is removed from the physical context of extinction. On this view, reinforcer removal and physical context change are held to have the same effect. The parallel may be similar to a parallel that has been noted previously regarding physical context change and a retention interval (i.e., a temporal context change; e.g., Bouton, 1993). Indeed, previous research has suggested that a physical context change and a temporal context change can have additive effects in producing recovery of an extinguished response (Rosas & Bouton, 1998) and in attenuating latent inhibition (Rosas & Bouton, 1997). In Experiment 3, we thus examined how changing both the physical context and the “reinforcer context” would impact behavior after extinction. As in Experiments 1 and 2, all rats lever pressed for a distinct O1 reinforcer in Context A and the response was then extinguished in Context B with noncontingent O2 reinforcers. For testing, animals were split into two groups. In Group ABA, rats were tested for responding back in Context A with O2 reinforcers presented as in extinction and with no reinforcers presented (in a counterbalanced order). For rats in Group ABB, however, testing occurred in Context B, again with O2 reinforcers presented as in extinction and with no reinforcers delivered. We hypothesized that Group ABA would replicate the effect of Experiments 1 and 2, with rats responding less in the free O2 reinforcer condition than in the no-reinforcer condition, thus attenuating the ABA renewal effect (due to greater generalization between extinction and the O2 test). We predicted the same pattern in Group ABB, but that overall responding would be lower in this group than in Group ABA, since there would be no change in the physical context between extinction and testing.
If the reinforcer/physical context relationship is similar to the temporal/physical context relationship (Rosas & Bouton, 1997, 1998), then the effects of changing the reinforcer and physical contexts should be additive: The ABA group tested without O2 (which received both physical and reinforcer context change) should show more response recovery than the ABA group tested with O2 reinforcers (which received physical context change only) and the ABB group tested without O2 (which received reinforcer context change only).

Method

Subjects and apparatus

The subjects were 32 naïve female Wistar rats of the same stock as Experiments 1 and 2. Animals were housed and maintained exactly as in Experiments 1 and 2. The same apparatus was used. Each animal was given two daily sessions.

Procedure

Magazine training, acquisition, and extinction

Magazine training, acquisition, and extinction proceeded in exactly the same way as Experiments 1 and 2, with the sole exception being that for each animal, sessions were separated by 2 h instead of 1 h (Exp. 1) or 1.5 h (Exp. 2).

Test

On the final day of the experiment, rats were separated into two groups (Groups ABA and ABB, ns = 16) and given two 10-min renewal tests during which responding had no programmed consequences. For each animal, one test session occurred with O2 reinforcers delivered freely according to an RT 30-s schedule, and the other test session occurred without any reinforcer presentation. For animals in Group ABA, these tests occurred in Context A. For animals in Group ABB, testing occurred in Context B. Testing order was counterbalanced so that half the animals in each group were tested first with free O2 reinforcers, and half were tested first without reinforcer presentations.
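The test design described above (two groups of 16, each with test order counterbalanced) can be sketched as a simple assignment routine. This is a minimal illustration, not the authors' procedure: the function name `assign_test_orders`, the rat ID numbering, and the use of a seeded random shuffle to form the groups are all assumptions, since the report does not state how animals were allocated.

```python
import random


def assign_test_orders(rat_ids, seed=0):
    """Split rats into Groups ABA and ABB and counterbalance test order.

    Hypothetical helper: random allocation is an assumption made for
    illustration; the report does not describe the allocation method.
    """
    rng = random.Random(seed)
    ids = list(rat_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    groups = {"ABA": ids[:half], "ABB": ids[half:]}

    schedule = {}
    for group, members in groups.items():
        # Group ABA is tested in Context A, Group ABB in Context B.
        test_context = "A" if group == "ABA" else "B"
        for i, rat in enumerate(members):
            # Alternate which test comes first so each order is used
            # by half of the animals in the group.
            order = ("O2", "none") if i % 2 == 0 else ("none", "O2")
            schedule[rat] = {
                "group": group,
                "test_context": test_context,
                "order": order,
            }
    return schedule


schedule = assign_test_orders(range(1, 33))
for rat in sorted(schedule)[:4]:
    print(rat, schedule[rat])
```

With 32 animals this yields 8 rats in each Group × Order cell, matching the counterbalancing described in the text.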

Data analysis

All data were subjected to t tests or ANOVA where appropriate, with a rejection criterion of p < .05.

Results

The results from Experiment 3 are shown in Fig. 3. As before, all rats increased responding throughout acquisition (left panel), and responding decreased during extinction (middle panel). In the test (right panel), clear and additive effects occurred of changing both the physical context (Group ABA vs. Group ABB) and the reinforcer context (free O2 vs. no-reinforcer conditions).

Fig. 3

Results of Experiment 3: Acquisition in Context A with O1 (left panel), extinction in Context B with noncontingent O2 (middle panel), and testing in both the free O2 reinforcer and no-reinforcer conditions (right panel). Error bars are only appropriate for between-group comparisons. Note the changes in the y-axes.

Acquisition

As in Experiment 1, all animals increased their responding over the 12 sessions of acquisition. This was confirmed by a 2 (Group) × 12 (Session) ANOVA in which a significant main effect of session was found, F(11, 330) = 78.74, MSE = 29.83, p < .001, $\eta_p^2$ = .72. Neither the main effect of group, F(1, 30) = 1.32, MSE = 928.32, p > .05, nor the interaction, F < 1, was significant.

Extinction

All animals decreased their responding throughout the extinction phase. A 2 (Group) × 8 (Session) ANOVA conducted over this phase revealed a significant main effect of session, F(7, 210) = 23.50, MSE = 15.80, p < .001, $\eta_p^2$ = .44. Neither the main effect of group, F < 1, nor the interaction, F(7, 210) = 1.73, MSE = 15.80, p > .05, was significant. Importantly, on the final day of extinction, the animals in Group ABA that were to be tested first with free O2 reinforcers (M = 3.75) did not differ from those that were to be tested first without reinforcers (M = 2.88), t(14) = 0.27, p > .05. Similarly, the animals in Group ABB that were to be tested first with reinforcers (M = 2.96) did not differ from those that were to be tested first without reinforcers (M = 1.31), t(14) = 1.37, p > .05.

Test

As in Experiments 1 and 2, the rats in Group ABA responded more in the test session in which no reinforcers were presented than in the session in which O2 was presented freely. The same was true of animals in Group ABB, although overall responding was substantially lower in this group. A 2 (Group) × 2 (Testing Condition: reinforcers vs. no reinforcers) ANOVA run to assess responding during the test showed significant main effects of both group (test context), F(1, 30) = 14.74, MSE = 26.70, p = .001, $\eta_p^2$ = .33, and reinforcer testing condition, F(1, 30) = 32.14, MSE = 9.93, p < .001, $\eta_p^2$ = .52, but no interaction between the two, F < 1. Pairwise comparisons revealed that animals in Group ABA responded less when tested in the free O2 condition than in the no-reinforcer condition, F(1, 30) = 19.45, p < .001, $\eta_p^2$ = .39. Similarly, animals in Group ABB showed more suppression of responding during the free O2 test than during the no-reinforcer test, F(1, 30) = 13.01, p = .001, $\eta_p^2$ = .30. In both the free O2 reinforcer condition, F(1, 30) = 12.54, MSE = 13.00, p = .001, $\eta_p^2$ = .29, and the no-reinforcer condition, F(1, 30) = 9.89, MSE = 23.63, p < .01, $\eta_p^2$ = .25, the animals in Group ABB responded less than the animals in Group ABA.
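The partial eta squared values reported in the Results can be recovered from each F ratio and its degrees of freedom using the standard identity partial eta squared = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies that identity to the F values reported in the text as a consistency check. The function name `partial_eta_squared` and the row labels are illustrative choices, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, effect size as reported in the text)
reported = [
    ("acquisition: session", 78.74, 11, 330, 0.72),
    ("extinction: session", 23.50, 7, 210, 0.44),
    ("test: group (context)", 14.74, 1, 30, 0.33),
    ("test: reinforcer condition", 32.14, 1, 30, 0.52),
    ("ABA: O2 vs. none", 19.45, 1, 30, 0.39),
    ("ABB: O2 vs. none", 13.01, 1, 30, 0.30),
    ("O2 test: ABA vs. ABB", 12.54, 1, 30, 0.29),
    ("no-reinforcer test: ABA vs. ABB", 9.89, 1, 30, 0.25),
]

computed = {}
for label, f_value, df1, df2, eta_reported in reported:
    computed[label] = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: computed {computed[label]:.3f}, reported {eta_reported:.2f}")
```

Each computed value agrees with the reported effect size to two decimal places.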

Discussion

For both Groups ABA and ABB, responding was significantly reduced during testing by noncontingent presentations of O2. However, in both testing conditions, responding in Group ABB was suppressed relative to that in Group ABA. Thus, both the physical context of extinction (B) and the reinforcer context of extinction (noncontingent O2) had suppressive effects on behavior. Consistent with the results of changing both the physical context and the temporal context (Rosas & Bouton, 1997, 1998), this pattern suggests that the reinforcer context and the physical context have separate but additive effects. The results replicate the findings of Experiment 1 and extend the notion of the hypothesized “reinforcer context” (see Bouton et al., 1993; Bouton & Schepers, 2014). The results are also consistent with the finding in pigeons that, in the resurgence paradigm, the combination of a context change (created by changing key light color) and the discontinuation of Phase-2 reinforcers causes more resurgence than reinforcer discontinuation alone (Kincaid, Lattal, & Spence, 2015).

General discussion

The results of the present experiments indicate that reinforcers that have been associated with extinction can attenuate the renewal effect when they are presented during the renewal test. Experiment 1 established this effect; Experiment 2 replicated it and showed that it depends on whether the reinforcer has been specifically associated with extinction. Experiment 3 then demonstrated that the removal of reinforcers (O2) from the context of extinction (B) produced a resurgence-like relapse effect. The results also suggested that the reinforcer context in which extinction is learned can have an effect that is separate from, but additive with, that produced by the physical context. Together, the results are consistent with the view that distinct reinforcers presented during extinction can serve to signal response inhibition during extinction. Through this mechanism, reinforcer presentations might also serve as a retrieval cue to attenuate renewal when responding is tested outside of the context of extinction. To our knowledge, this is the only evidence to date of a retrieval cue attenuating operant renewal.

The present results fit well with our interpretation of resurgence, which emphasizes the discriminative role of the alternative reinforcer. According to the context hypothesis, resurgence occurs when reinforcement is removed because animals have learned to inhibit their responding in the context of alternative reinforcement (e.g., Bouton & Schepers, 2014; Schepers & Bouton, 2015; Winterbauer & Bouton, 2010). Thus, removing reinforcers changes the context sufficiently to produce a relapse similar to ABC renewal (e.g., Bouton et al., 2011). This idea is clearly demonstrated in Experiment 3: Group ABB received training with O1 in Context A, extinction with free O2 in Context B, and testing with or without O2 in Context B. When the animals were tested with O2, the extinction conditions continued, and thus suppressed responding was maintained. But when they were tested without O2 (i.e., the alternative reinforcement was removed), this constituted a context change and the animals demonstrated a significant increase in responding. Although the present experiments did not examine resurgence per se, the results suggest a clear role for the reinforcer context in controlling relapse in that paradigm. They also extend the findings, reviewed in the introduction, indicating that Pavlovian extinction performance can likewise be cued by reinforcers presented in the background (e.g., Bouton et al., 1993).

It should be noted that in Experiment 2, the O1 reinforcer failed to reinstate responding beyond the level seen in the no-reinforcer condition. This finding suggests that presentation of a reinforcer from acquisition did not add to the basic renewal effect. Because the response had originally produced the O1 reinforcer in Context A, it would not have been surprising to see augmented or reinstated responding with O1 presentations. One reason why such an effect was not observed may be that experience with O2 presentations during extinction reduced the ability of the O1 reinforcer to reinstate extinguished behavior. Related studies have shown that free presentations of a reinforcer during extinction can reduce or eliminate the ability of that reinforcer to augment responding during a reinstatement test (e.g., Rescorla & Skucy, 1969; Winterbauer & Bouton, 2011); such a result is consistent with the idea that reinforcer presentations in extinction extinguish the reinforcer’s ability to set the occasion for the operant response. It is worth noting that the earlier studies did not use different reinforcers in conditioning and extinction as we did here. However, in a related design, Bouton and Trask (in press, Exp. 3) found that animals that received free presentations of an O2 reinforcer during extinction after initial acquisition with O1 likewise showed no augmenting effect of O1 presentations during a final test. Given these results, it seems that free reinforcer presentations in extinction may reduce the ability of similar reinforcers to augment or reinstate extinguished responding. Although the present O1 and O2 reinforcers differed in some sensory properties, they presumably shared other sensory as well as motivational properties. Any generalization between O2 and O1 could have allowed O2 presentations in extinction to reduce the possible reinstating effects of O1.

Previous writers have noted that renewal has interesting implications for relapse after treatment for drug abuse disorders (e.g., Bouton et al., 2011; Crombag & Shaham, 2002). The idea is that if a patient undergoes treatment in a therapeutic setting, the treated behavior could be susceptible to relapse following the cessation of that treatment (or simple removal from the therapeutic setting, or context). The present results suggest that a salient cue (in the present case, a reinforcer) from the treatment situation could potentially attenuate relapse or renewal if presented in the settings in which relapse is likely to occur. They thus extend previous research on renewal and spontaneous recovery in Pavlovian conditioning suggesting that relapse effects can be attenuated by presenting, just prior to the test, a retrieval cue that had been featured in extinction (Brooks & Bouton, 1993, 1994).