Discrete trial instruction (DTI) has been used for decades to improve academic achievement for children with autism and developmental delays (Koegel et al., 1977; Lovaas, 2003; Smith, 2001). DTI involves small units of three-term contingencies, including the discriminative stimuli, prompts (as necessary), student responses, contingent consequences, and intertrial intervals (Smith, 2001). Reinforcers (e.g., edibles, tokens, and praise) and correction procedures are widely used as consequences in DTI. Contingent positive reinforcement for correct responses and correction procedures contingent on incorrect responses may influence the efficiency of skill acquisition for children with autism and developmental delays in DTI (Cariveau et al., 2019; Leaf et al., 2010; Scott et al., 2000; Smith et al., 2006; Vladescu & Kodak, 2010; Worsdell et al., 2005).

A variety of studies have focused on the manipulation of differential reinforcement following correct responses to facilitate skill acquisition for children with disabilities (Boudreau et al., 2015; Johnson et al., 2017; Karsten & Carr, 2009; Vladescu & Kodak, 2010). Differential reinforcement refers to the delivery of higher-quality, higher-magnitude, shorter-delay, or denser schedules of reinforcement following independent correct responses and lower-quality, smaller-magnitude, longer-delay, or leaner schedules of reinforcement following prompted correct responses in an error-correction procedure (Jessel et al., 2020; Vladescu & Kodak, 2010). The results of these studies demonstrated that although differential reinforcement tended to be more efficient, nondifferential and differential reinforcement were often both effective in teaching new skills (e.g., tact and textual responding) to children with disabilities (Campanaro et al., 2020; Johnson et al., 2017; Karsten & Carr, 2009).

Differential reinforcement is commonly implemented together with error correction procedures that are delivered contingent on incorrect responses (Carroll et al., 2015, 2018; Joachim & Carroll, 2018; Kodak et al., 2016; McGhan & Lerman, 2013). A typical correction procedure consists of four components, including (a) prompts or models of the correct response immediately following the student’s emission of an incorrect response, (b) the student’s new response, (c) differential reinforcement of correct responses (e.g., the delivery of less preferred items in correction procedures), and (d) an additional work requirement following errors, which may function as a punishment contingency (e.g., re-present the discriminative stimulus and require the emission of the target responses multiple times; Worsdell et al., 2005). Several studies have compared learning rates under conditions that combined differential reinforcement with correction procedures and conditions that omitted correction procedures (Jessel et al., 2020; Kodak et al., 2016; Rapp et al., 2012; Worsdell et al., 2005). The results revealed that children demonstrated little improvement when correction procedures were omitted and improvement when correction procedures and differential reinforcement were present. Thus, the correction procedure appears to be, at minimum, a necessary component (Ward-Horner & Sturmey, 2010) of skill acquisition in DTI. Nevertheless, the role of the error correction procedure in skill acquisition needs further verification because the correction procedure itself involves positive reinforcement (e.g., the delivery of a less preferred but still reinforcing item following a prompted correct response).

Although a number of studies have investigated the synergistic effects of correction procedures and differential reinforcement in DTI for children with disabilities, few researchers have conducted a component analysis of skill acquisition consequences to compare the effects of reinforcement and correction procedures in DTI. Simonian and Brand (2022) directly analyzed the effects of and preference for positive and corrective feedback in learning novel dice games for 10 undergraduate students (18–24 years old) through Zoom instruction. The researchers delivered different types of feedback (i.e., confirmation of the correct steps, modeling of the correct steps with verbal statement, or no feedback) at the end of each round of a dice game. The results showed that none of the participants mastered the game in the positive feedback only condition, whereas all participants successfully mastered the game in the corrective feedback only condition. Furthermore, all participants demonstrated higher preference for the corrective feedback in the choice assessment following the acquisition phase. These findings suggested that corrective feedback was more effective and preferable than positive feedback in skill acquisition for undergraduate students. Further investigations on the effects of correction procedures for children with and without disabilities will elucidate whether a correction procedure is necessary and sufficient for skill acquisition. However, the ubiquitous application of differential reinforcement in DTI, which usually involves the delivery of less preferred items following prompted correct responses in the correction procedure, makes it difficult to isolate the effects of correction procedures from positive reinforcement operations in skill acquisition (Kodak et al., 2016; Rapp et al., 2012; Worsdell et al., 2005).

Learn unit (LU) instruction (see Footnote 1), a type of DTI, includes both positive reinforcement for independent correct responses and correction procedures for incorrect responses (Albers & Greer, 1991). Essentially, LU instruction moves the traditional antecedent strategy of prompts to the consequence portion of instruction, delivered during error correction, and includes no reinforcement for prompted responses. Such exclusion of reinforcement from the correction procedure allows direct comparison of the effects of different consequences in skill acquisition, or a component analysis of skill acquisition consequences for correct and incorrect responding (Albers & Greer, 1991; Greer, 2002; Greer & McDonough, 1999).

Because few studies have isolated the effects of the error correction procedure from differential reinforcement in skill acquisition, the purpose of this study was to analyze the component effects of skill acquisition consequences that involved contingent praise for independent correct responses (positive reinforcement operation) and corrections for incorrect responses in DTI. Considering that related research in DTI has focused more on speaker responses, we extended past findings to the acquisition of listener responses (i.e., auditory-visual conditional discriminations, AVCDs). In this study, LU instruction was compared to conditions that omitted individual consequence portions of instruction. In the correction-only-for-incorrect-responses (CI) condition, researchers ignored correct responses and implemented the correction procedure contingent on incorrect responses. In the praise-only-for-correct-responses (PC) condition, researchers delivered contingent praise for independent correct responses and ignored incorrect responses. Because LU instruction moves the antecedent strategy of prompts to the consequence portion of the correction procedure, the acquisition of listener responses in the PC condition was similar to trial-and-error learning: reinforcement for correct responses and no direct consequences for incorrect responses. The results of this study can potentially help teachers and therapists identify efficacious instructional procedures for teaching novel listener responses. Last, because previous research has involved only children with disabilities and undergraduate students, we extended past findings to preschoolers with and without disabilities. In Experiment 1, researchers measured the acquisition rate, duration, and maintenance of responses across the LU, CI, and PC conditions with educationally relevant stimuli. We replicated these procedures in Experiment 2 with additional controls for history by teaching listener responses to abstract stimuli.

Experiment 1

Method

Participants

The researcher selected six participants, ranging in age from four to five years old. All but one of the participants were educationally classified as preschoolers with a developmental delay. The six participants attended a privately run, publicly funded preschool for students with and without disabilities in a suburb of a large urban city. The classroom implemented the Comprehensive Application of Behavior Analysis to Schooling (CABAS®) model of instruction. Researchers obtained written consent from the parents of all participants to participate and to publish anonymized data before the experiment was conducted. Participants provided oral assent to complete sessions when the researcher asked, before the experiment, whether they wanted to play a special game of learning names of pictures. All participants responded “yes” to the researcher. Andrew and Adam were five-year-old White males with developmental delays. Tom was a four-year-old White male with developmental delays. Jack was a four-year-old typically developing White male. Celine and Amy were four-year-old White females with developmental delays. The participants’ repertoires were assessed using the Early Learner Curriculum and Achievement Record (ELCAR): A CABAS® Developmental Inventory (Greer et al., 2020) and Verbal Behavior Development Assessment (VBDA). Detailed information regarding the participants’ verbal behavior development is outlined in Table 1.

Table 1 Detailed information of relevant verbal behavior cusps for all participants

We selected the participants according to the following inclusion criteria. (1) The participants had an instructional history that included both 1:1 and group instruction. Specific to this study, the participants all had an instructional history that included positive reinforcement of correct responses and correction procedures of incorrect responses (the learn unit condition, described below). The participants also had an instructional history of learning AVCDs on the laptop using Microsoft PowerPoint. (2) The participants were able to follow multiple step vocal directions and sit at the table during instruction appropriately for at least 5 min. (3) The participants had mastered prerequisite repertoires for learning AVCDs such as generalized identity matching, imitating gross motor actions, and orienting to and observing two-dimensional stimuli with 100% accuracy. (4) The participants attended the same integrated classroom for the convenience of implementing the experiment. (5) Teachers’ praise functioned as a conditioned reinforcer for all participants—evidenced in the ELCAR assessment for listener and speaker responses, in which there were no differences between the percentage and rate of correct responses emitted by the participants in a probe session with the use of tangibles or tokens and praise. Each probe session included one response opportunity to each of the 10 antecedent discriminative stimuli. The researcher delivered tangibles/tokens or praise contingent on correct independent responses based on the respective conditions. Incorrect responses were unconsequated. 
All participants demonstrated the same rates of correct responses in the assessment of (a) following one- and two-step vocal directions, (b) imitating one- and two-step gross motor actions, (c) pointing to the known visual stimulus in an array of three stimuli (e.g., colors, shapes, and letters), and (d) responding to “wh” questions regarding personal information (e.g., what’s your mom’s name?), regardless of the use of tokens or praise as reinforcers for independent correct responses.

Setting and Materials

The experiment occurred in the participants’ classroom. The classroom consisted of nine students, including one typically developing student and eight students with developmental delays, one teacher, and two teaching assistants. The teacher and/or teaching assistants delivered instruction to non-participant students in 1:1 or small group settings while the researchers conducted the study. All instructional responses (for this experiment and during daily instruction) were measured with frequent interobserver agreement checks and ratings of teacher accuracy in implementing instruction using the Teacher Performance Rate and Accuracy (TPRA; Ingham & Greer, 1992) tool.

The classroom contained a play area, four small group stations, and a large rectangular communal table. The instruction occurred at the head of the communal table, where the participant sat with the researcher during the instructional sessions and the probe sessions for maintenance skills. The researcher used PowerPoint to present the two-dimensional stimuli during the pre- and post-instruction probes and during instruction. A laptop, data collection sheets, a timer, and pens with black ink were used in the procedure. The sets of educational stimuli used in the learning process and probe sessions for each participant are listed in Table 2 (top).

Table 2 List of 2D stimuli used in Experiments 1 and 2

Measurement

We measured four dependent variables in learning novel AVCDs for each participant. The primary dependent variable was the number of correct responses emitted by the participant in each instructional session. A correct response was defined as the participant pointing to the corresponding visual stimulus within 5 s of the delivery of the antecedent auditory stimulus. An incorrect response was defined as the participant pointing to a different stimulus or not responding within 5 s of the delivery of the antecedent. The correction procedure allowed two independent opportunities to respond correctly, but only the initial incorrect response was recorded for data purposes.

The second dependent variable was the number of trials required by the participant to meet criterion for each condition’s set of stimuli. We also measured the total duration required by the participant to meet criterion for each set of the stimuli in the three conditions (Carroll et al., 2015, 2018; Kodak et al., 2016). During each trial, we waited 5 s for the participant’s response, delivered the contingent consequences for the respective condition, and immediately moved on to present the next stimulus. The duration of a session was measured as the time elapsed from the presentation of the first visual comparison array until the end of the consequence of the last trial (e.g., the delivery of praise following the correct response and correction following the incorrect response in the learn unit condition). The total duration was calculated by adding the duration of all sessions until mastery. Additionally, maintenance probes were conducted to measure the number of correct AVCDs during biweekly assessments for up to 6 weeks following the completion of the teaching phase. During maintenance, we recorded correct and incorrect responses in the same manner as during acquisition.

Independent Variable

We conducted a component analysis of skill acquisition consequences. During the learn unit (LU) condition, researchers praised correct responses and implemented a correction procedure contingent on incorrect responses following the antecedent instructions. The correction procedure involved the researcher modeling the correct response and re-delivering the antecedent and allowing the participant the opportunity to independently respond. We compared LU instruction to two dropout component analysis conditions (Ward-Horner & Sturmey, 2010), whereby we omitted consequence portions of instruction. In the praise-only-for-correct responses (PC) condition, researchers applied positive reinforcement operations to correct responses and ignored incorrect responses. In the correction-only-for-incorrect responses (CI) condition, researchers ignored correct responses and implemented the correction procedure contingent on incorrect responses. We manipulated this independent variable with educational stimuli and measured acquisition and response maintenance.

Experimental Design

We used an adapted alternating treatment design (Sindelar, 1985) to study the effects of consequence conditions on the acquisition and maintenance of AVCDs. We paired the participants into three dyads randomly and counterbalanced the assignment of stimulus sets to the three consequence conditions across the dyads. Additionally, the researchers controlled and equated the difficulty level of the educational stimulus sets by limiting the number of syllables contained in each antecedent and the types of stimuli (Cariveau et al., 2021). The researchers chose nine novel two-syllable stimuli from each of two categories, insects and office items, according to ELCAR goals, and randomly assigned them to three stimulus sets. None of the stimuli in each set shared the same first syllable. The use of Microsoft PowerPoint on the laptop ensured the accuracy of presenting the antecedent visual stimuli, such that multiple exemplars were used for each stimulus in each session and the order of the array of the stimuli was rotated after each trial. During instruction, the researchers counterbalanced the daily order of instruction conditions within each dyad and the targets across the three dyads to decrease the possibility of sequence and carryover effects. One instructional session under each consequence condition was conducted for all participants per day. All sessions were conducted before lunch.

Procedure and Data Collection

The study included four phases: (a) probe the number of correct auditory-visual conditional discrimination responses for the three sets of educational stimuli, (b) implement instruction across the three consequence conditions, (c) if the participant failed to master the stimuli set in one condition, switch to the more effective learning procedure, and (d) probe the number of correct responses to the stimulus sets during bi-weekly maintenance assessments for up to six weeks.

Pre-experimental probes to identify target stimuli

Prior to the experiment, stimuli were equated and assigned to three sets. The researcher sat next to the participant at a rectangular table and presented two-dimensional stimuli in PowerPoint on a laptop. Each trial included an array of six pictures (three stimuli on top and three at the bottom on a slide), including four target stimuli and two distracter stimuli that were never taught. The position of the target stimuli changed across trials. Sessions included three presentations of each target relation for a total of 12 trials. For each target relation, there were three variations of the picture that corresponded to the discriminative stimulus (e.g., three different pictures of a mantis).

The researcher first conducted probes for speaker responses for each set of stimuli for all participants. If the participant emitted the name of a picture accurately within 5 s following the researcher pointing to a stimulus, we recorded it as a correct response and replaced the target with another stimulus of the same category that was not in the participants’ repertoire. The probe procedure for AVCDs was similar to that for the speaker responses, in which the participant was required to point to the target stimulus within 5 s of the researcher presenting the auditory sample stimulus “point to _ (e.g., mantis).” If the participant pointed to a target stimulus correctly during all probe trials, the researcher replaced the target with a novel stimulus of the same category. All probe trials were unconsequated.

Baseline

Following pre-experimental probes, three baseline probes were conducted with each participant. The procedure was the same as that used during pre-experimental probes.

General Instructional Procedures

The researcher sat next to the participant at a rectangular table and presented the two-dimensional stimuli on a laptop, as in baseline and probe sessions. The researcher presented the auditory sample stimulus (e.g., “point to [e.g., mantis]”) and varied consequences as a function of the specific condition (described below). As in prior phases, each session included 12 trials (4 distinct stimuli, 3 opportunities for each, 3 versions of pictures). The acquisition criterion for a set of stimuli was set at either 11/12 (92%) accuracy or higher across two consecutive sessions (Tom, Andrew, Jack, Celine, and Amy) or 12/12 (100%) accuracy for one session (Adam). We determined the acquisition criteria based on the criteria set for each participant’s previous programs for learning listener responses. Each day, the researcher ran each of the three learning conditions once, and this continued until the participant met criterion for two conditions, to ensure the same amount of exposure to the stimuli across conditions before terminating instruction and moving to the maintenance assessments. If the participant met the acquisition criterion for two conditions within 5 sessions, we continued instruction until there were five sessions, to provide more exposures for the third condition. The researcher determined that performance plateaued for the third condition if the participant failed to meet the acquisition criterion with the additional exposures. In the case of performance plateauing in one condition, we applied the “best treatment” (Carroll et al., 2015; Cengher et al., 2015), that is, the condition that met the acquisition criterion in the fewest sessions, to that plateaued condition. If the participant required the same number of sessions for both conditions, we decided the best treatment condition randomly by flipping a coin.
The acquisition criterion for “best treatment” was set at either 92% (11/12) accuracy or higher across two consecutive sessions or 100% (12/12) accuracy for one session.
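
The two acquisition criteria described above can be expressed as a single rule with two parameters. The following is a minimal sketch, not the authors' procedure: the function name, its parameters, and the example data are hypothetical, and it assumes that "sessions to criterion" is counted as the session in which the required run of qualifying sessions is completed.

```python
from typing import Optional, Sequence


def session_criterion_met(
    correct_counts: Sequence[int],
    threshold: int = 11,
    consecutive: int = 2,
    trials_per_session: int = 12,
) -> Optional[int]:
    """Return the 1-based session in which the criterion is first met, else None.

    The criterion is `consecutive` sessions in a row with at least
    `threshold` correct responses out of `trials_per_session`.
    """
    run = 0  # length of the current streak of qualifying sessions
    for session_number, correct in enumerate(correct_counts, start=1):
        if not 0 <= correct <= trials_per_session:
            raise ValueError(f"correct count {correct} outside 0..{trials_per_session}")
        run = run + 1 if correct >= threshold else 0
        if run >= consecutive:
            return session_number
    return None


# Hypothetical session data for illustration only (not taken from the study).
example = [6, 12, 9, 11, 11]

# Rule used for five participants: 11/12 or higher across two consecutive sessions.
two_session_rule = session_criterion_met(example, threshold=11, consecutive=2)

# Rule used for one participant: 12/12 in a single session.
one_session_rule = session_criterion_met(example, threshold=12, consecutive=1)

print(two_session_rule, one_session_rule)
```

With the same hypothetical data, the two rules are met in different sessions, which illustrates why session counts obtained under different criteria are not directly comparable.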

Learn Unit (LU) Instruction

LU instruction consists of three components: presentation of antecedent while the participant is attending, an opportunity for the participant to respond, and the delivery of reinforcement contingent on correct responses and a correction procedure contingent on incorrect responses. If the participant correctly pointed to the target picture within 5 s following the vocal antecedent “point to _ (e.g., mantis),” the researcher vocally praised the participant’s response (e.g., “Wow, you are right! It is mantis!”). If the participant responded incorrectly or did not respond within 5 s following the antecedent, the researcher implemented a correction procedure, including the researcher’s modeling the correct response once, re-presenting the antecedent, and requiring the participant to identify the target stimulus independently two times. The correct responses emitted by the participant during the correction procedure were unconsequated. The detailed procedure of LU instruction is outlined in Fig. 1.

Fig. 1 The general procedure of the learn unit (LU) instruction

Praise-only-for-Correct Responses (PC) Instruction

PC instruction was identical to LU instruction except that the researcher did not provide any consequences following an incorrect response and continued to present the next trial. The researcher vocally praised the participant’s correct responses.

Correction-only-for-Incorrect Responses (CI) Instruction

CI instruction was the same as LU instruction except that the researcher did not provide any consequences following correct responses and moved on to present the next trial. The researcher implemented the correction procedure following an incorrect response (see above).

Maintenance Probes

The researcher conducted bi-weekly maintenance probes for up to 6 weeks following instruction. The procedure was identical to that of the baseline probes. Conditions that were subjected to the “best treatment” were omitted from this analysis because the learning history included two teaching conditions.

Interobserver Agreement and Treatment Fidelity

Interobserver agreement (IOA) was collected using the trial-by-trial correspondence on the TPRA form (Ingham & Greer, 1992), in which an observer simultaneously and independently collected data on students’ responses. The observers calculated IOA on the number of correct responses in each session by comparing each individual trial to determine if each trial was scored the same. The researcher divided the number of agreed trials by the total number of trials then multiplied by 100 to determine IOA. Regarding the duration per session, we calculated IOA by dividing the duration in minutes recorded by the independent observer by the duration recorded by the researcher then multiplied by 100%. The independent observer collected data for 68.5% of the target identification and baseline sessions (55.6% sessions for Tom, Adam and Amy; 66.7% sessions for Andrew; 88.9% sessions for Jack and Celine), 51.1% of the instructional sessions (52.6% sessions for Tom; 51.7% sessions for Andrew; 42.9% sessions for Jack; 60% sessions for Adam; 65.0% sessions for Celine; 45.8% sessions for Amy), and 77.8% of maintenance sessions (100% sessions for Tom and Amy; 66.7% sessions for Andrew, Jack, Adam, and Celine), all with a mean agreement of 100%.
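
The two IOA calculations described above can be sketched as follows. This is an illustrative sketch only: the function names and the example records are hypothetical, and the duration formula follows the text literally (observer duration divided by researcher duration), so it is assumed that both durations are positive.

```python
from typing import Sequence


def trial_by_trial_ioa(researcher: Sequence[bool], observer: Sequence[bool]) -> float:
    """Percentage of trials that both data collectors scored the same way."""
    if len(researcher) != len(observer) or len(researcher) == 0:
        raise ValueError("both records must cover the same non-empty set of trials")
    agreements = sum(r == o for r, o in zip(researcher, observer))
    return agreements / len(researcher) * 100


def duration_ioa(observer_minutes: float, researcher_minutes: float) -> float:
    """Observer-recorded duration as a percentage of researcher-recorded duration."""
    if observer_minutes <= 0 or researcher_minutes <= 0:
        raise ValueError("durations must be positive")
    return observer_minutes / researcher_minutes * 100


# Hypothetical 12-trial session (True = scored correct); not data from the study.
researcher_record = [True] * 9 + [False] * 3
observer_record = [True] * 9 + [False] * 3

print(trial_by_trial_ioa(researcher_record, observer_record))
print(duration_ioa(4.0, 4.0))
```

Comparing the records trial by trial, rather than comparing session totals, means that two offsetting disagreements within a session still lower the agreement score.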

Treatment fidelity was also collected and measured using the TPRA form (Ingham & Greer, 1992), in which a supervisor observed the sessions and collected data on the extent to which the researcher adhered to all components of a trial as scripted in the experiment procedure. This included rating: (a) the accuracy of researchers presenting antecedents, (b) securing attending, (c) recording participant’s responses accurately, and (d) delivering contingent consequences as described in each condition (e.g., no reinforcement such as smiling or nodding following a correct response in the CI condition). The supervisor counted a trial as incorrect if the researcher incorrectly implemented one or more components in the trial. The researcher calculated treatment fidelity by dividing the number of trials implemented correctly by the total number of trials, multiplied by 100. During the baseline for sets of educational stimuli, the supervisor collected data for 37% of pre-probe sessions (22.2% sessions for Tom, Andrew, and Adam; 66.7% sessions for Jack; 44.4% sessions for Celine and Amy), 45.3% for instructional sessions (42.9% sessions for Tom, Jack, and Amy; 54.2% sessions for Andrew; 50% sessions for Adam and Celine), and 77.8% of maintenance sessions (100% sessions for Tom and Amy; 66.7% sessions for Andrew, Jack, Adam, and Celine), all with 100% fidelity.
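
The fidelity scoring rule above, in which a trial counts as incorrect if any one component is implemented incorrectly, can be sketched as follows. The class and field names are hypothetical labels for the four rated components and do not reproduce the TPRA form itself.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TrialRating:
    """Supervisor rating of one trial; field names are illustrative only."""
    antecedent_ok: bool    # (a) antecedent presented accurately
    attending_ok: bool     # (b) attending secured
    recording_ok: bool     # (c) participant response recorded accurately
    consequence_ok: bool   # (d) consequence delivered as the condition requires

    def implemented_correctly(self) -> bool:
        # One incorrect component makes the whole trial incorrect.
        return all((self.antecedent_ok, self.attending_ok,
                    self.recording_ok, self.consequence_ok))


def treatment_fidelity(ratings: Sequence[TrialRating]) -> float:
    """Percentage of trials implemented correctly on every component."""
    if not ratings:
        raise ValueError("at least one rated trial is required")
    correct = sum(r.implemented_correctly() for r in ratings)
    return correct / len(ratings) * 100


# Hypothetical 12-trial session in which every component was implemented correctly.
session = [TrialRating(True, True, True, True) for _ in range(12)]
print(treatment_fidelity(session))
```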

Results and Discussion

Figures 2 and 3 display the number of correct responses emitted by participants during baseline, acquisition, and maintenance across the three instructional conditions. Square data points represent those sessions when a participant’s performance met the acquisition mastery criterion. For all participants, baseline responding was at a stable low level ranging from 0 to 4 out of 12 correct responses, which was in the range of chance responding. With the exception of Celine, during all comparisons the praise-only-for-correct-response (PC) condition produced lower accuracy at the time when the participant mastered the learn unit (LU) and correction-only-for-incorrect-response (CI) conditions. The PC condition was switched to the “best treatment” for the remainder of the acquisition phase.

Fig. 2 Baseline, acquisition, best treatment, and maintenance data for Tom (Dyad 1), Jack (Dyad 2), and Celine (Dyad 3) with educational stimuli in Experiment 1. Note. Square data points represent sessions where a participant’s performance met the mastery criterion

Fig. 3 Baseline, acquisition, best treatment, and maintenance data for Andrew (Dyad 1), Adam (Dyad 2), and Amy (Dyad 3) with educational stimuli in Experiment 1. Note. Square data points represent sessions where a participant’s performance met the mastery criterion

During acquisition, Tom met the acquisition criterion (11/12 across two consecutive sessions) for AVCDs in the LU condition in 3 sessions and in the CI condition in 5 sessions (top panel, Fig. 2). We switched the teaching targets in the PC condition to LU, and Tom mastered those targets in 4 additional sessions. During maintenance, Tom responded with 12/12 accuracy in the LU condition and 10/12 in the CI condition at 6 weeks following the termination of instruction. Overall, Tom learned targets in the LU condition in 2 fewer sessions (40% fewer sessions than CI) and had higher maintenance in the LU condition as well. His dyad partner, Andrew, mastered the AVCDs in the LU condition in 8 sessions and in the CI condition in 5 sessions (38% fewer sessions than LU) (top panel, Fig. 3). The PC condition was switched to CI in the “best treatment,” and Andrew mastered those targets in 5 additional sessions. Consistent with the results during acquisition, Andrew responded with higher accuracy in the CI condition (12/12) than in the PC condition (4/12) at 6 weeks following instruction. However, both the LU and CI conditions produced better acquisition than the PC condition for participants in Dyad 1. Both participants in Dyad 2, Jack and Adam, required one fewer session (20% and 17% fewer sessions, respectively) to master the responses under the CI condition than under the LU condition (middle panels, Figs. 2 and 3). Specifically, Jack required 4 sessions until mastery (11/12 across two consecutive sessions) in the CI condition and 5 sessions in the LU condition (middle panel, Fig. 2). Adam met criterion (12/12 for one session) under the CI condition in 5 sessions and under the LU condition in 6 sessions (middle panel, Fig. 3). After we switched teaching targets in the PC condition to CI, both participants mastered those targets in 2 additional sessions.
During maintenance, both participants responded with slightly higher accuracy for targets learned under the LU condition: 7/12 in the LU condition and 6/12 in the CI condition at 6 weeks for Jack; 11/12 in the LU condition and 9/11 in the CI condition at 4 weeks for Adam. Celine was the only participant who successfully mastered the stimuli in the PC condition, in which she met criterion after 3 instructional sessions (bottom panel, Fig. 2). We decided to continue instruction after she reached mastery for the PC and LU conditions, considering the high level of responding accuracy in the CI condition. Celine mastered (11/12 across two consecutive sessions) the responses in the LU condition in 5 sessions (38% fewer sessions than the CI condition) and in the CI condition in 8 sessions. She responded with 12/12 accuracy in both the LU and CI conditions 6 weeks following instruction. Her dyad partner, Amy, required one fewer session (25% fewer sessions) to master the AVCDs in the LU condition (3 sessions) than in the CI condition (4 sessions) (bottom panel, Fig. 3). The PC condition was switched to LU in the “best treatment” after five instructional sessions, and Amy mastered those targets in 10 additional sessions. During maintenance, Amy responded with 9/12 accuracy in the LU condition and 10/12 in the CI condition at 6 weeks following the termination of instruction. Overall, Amy showed similar levels of acquisition and maintenance in the LU and CI conditions.
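
The "percent fewer sessions" figures reported above are consistent with dividing the difference in sessions by the session count of the slower condition. The sketch below recomputes them from the session counts given in the text; the function and variable names are hypothetical, and the formula is an inference from the reported values rather than a stated method.

```python
from typing import Dict, Tuple


def percent_fewer(fewer: int, more: int) -> float:
    """Reduction in sessions, as a percentage of the larger session count."""
    if fewer <= 0 or more <= 0 or fewer > more:
        raise ValueError("expected 0 < fewer <= more")
    return 100 * (more - fewer) / more


# Sessions to mastery in the LU and CI conditions, as reported in the text.
sessions_to_mastery: Dict[str, Dict[str, int]] = {
    "Tom": {"LU": 3, "CI": 5},
    "Andrew": {"LU": 8, "CI": 5},
    "Jack": {"LU": 5, "CI": 4},
    "Adam": {"LU": 6, "CI": 5},
    "Celine": {"LU": 5, "CI": 8},
    "Amy": {"LU": 3, "CI": 4},
}


def summarize(sessions: Dict[str, Dict[str, int]]) -> Dict[str, Tuple[str, float]]:
    """Map each participant to (faster condition, percent fewer sessions)."""
    summary = {}
    for name, by_condition in sessions.items():
        faster = min(by_condition, key=by_condition.get)
        slower = max(by_condition, key=by_condition.get)
        reduction = percent_fewer(by_condition[faster], by_condition[slower])
        summary[name] = (faster, round(reduction, 1))
    return summary


for name, (condition, reduction) in summarize(sessions_to_mastery).items():
    print(f"{name}: {condition} faster by {reduction}% of sessions")
```

Under this formula, LU is the faster condition for three participants and CI for the other three, which matches the pattern described in the text.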

Table 3 shows the total number of trials and the cumulative duration until mastery during acquisition for all participants. All participants required an average of 2 fewer trials (3% fewer) in the LU condition than in the CI condition. However, the average duration required to master the AVCDs was slightly shorter (0.2 min shorter) in the CI condition than in the LU condition. Specifically, three participants, Tom, Celine, and Amy, learned the targets in the LU condition in less time than in CI: 3.1 fewer min for Tom, 2.8 fewer min for Celine, and 0.6 fewer min for Amy. In contrast, Andrew, Jack, and Adam required shorter duration to master the responses in the CI condition than in LU: 7.3 fewer min for Andrew, 0.3 fewer min for Jack, and 0.2 fewer min for Adam. Overall, there was no considerable difference in regard to the total duration in learning the educational targets across the LU and CI conditions for all participants.

Table 3 The total number of trials and cumulative duration required until mastery in the LU, CI, and PC conditions in Experiments 1 and 2

Summary

The results demonstrated that the LU and CI conditions were both effective in teaching auditory-visual conditional discriminations. For every participant except Celine, the CI and LU conditions were more effective than the PC condition. Additionally, there was no consistent difference in duration between the LU and CI conditions across participants, suggesting that the LU condition, which involved both reinforcement and corrections, was not necessarily more effective or efficient than the CI condition in teaching AVCDs. The findings also suggested that the correction procedure was a necessary component in learning AVCDs for 5 out of 6 participants. Furthermore, the correction procedure alone was sufficient for 3 out of 6 participants; that is, for these participants the CI condition produced the same or faster acquisition than the LU condition (Ward-Horner & Sturmey, 2010).

Experiment 1 was limited in three respects. First, participants may have had previous or concurrent exposure to the educational stimuli used in the study, although we ensured that these responses were not targeted during school hours. Thus, we were unable to entirely rule out history as a threat to internal validity. The correct responses emitted by Celine increased from 3/12 during baseline to 10/12 in the first session of instruction in the PC condition. This increase may have resulted from fortuitous guessing; nevertheless, further controlling for target novelty and removing possible threats to internal validity associated with educationally relevant targets would help verify the effects of the correction procedure and praise components. Second, the acquisition criterion was not consistent across the participants. Except for Adam, the mastery criterion was set at 11/12 (92%) accuracy or higher across two consecutive sessions. The criterion was 12/12 (100%) accuracy for one session for Adam. Implementing a consistent mastery criterion would make comparisons of total trials and duration across conditions and participants more reliable. Last, the researcher incorrectly included scorpion, a three-syllable stimulus, in the CI condition for Tom and Andrew, the LU condition for Jack, and the PC condition for Amy. Controlling the number of syllables of all targets would help verify the effects of the LU, CI, and PC conditions. In Experiment 2, we conducted a systematic replication with three-syllable abstract stimuli to test the generality and reliability of the findings and to examine within-subject replication in a more controlled study.

Experiment 2

Method

Participants

All participants in Experiment 1 participated in Experiment 2.

Setting and Materials

The sets of abstract stimuli used in the learning process and probe sessions for each participant are listed in Table 2 (bottom). The other materials and settings were identical to those in Experiment 1.

Measurement

We measured the same dependent variables as described in Experiment 1. Due to the school closure during the COVID-19 pandemic, we collected maintenance data for only up to four weeks following instruction. The independent variable was identical to that in Experiment 1 except that participants learned AVCDs with abstract stimuli.

Procedure and Experimental Design

The researcher systematically replicated the procedure in Experiment 1 using abstract symbols for all participants. Targets included nonsense CVCV words and abstract symbols. The procedure was identical to that in Experiment 1 except that the acquisition criterion was set at 12/12 (100%) accuracy for one session for all participants. The researcher conducted bi-weekly maintenance probes for up to four weeks following mastery. All features of the design were the same as in Experiment 1.
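
The two acquisition criteria (11/12 or higher across two consecutive sessions for most participants in Experiment 1; 12/12 for one session in Experiment 2) can be expressed as a single rule with two parameters. The sketch below is illustrative only: the function name `sessions_to_mastery` and the session scores are hypothetical and are not data from the study.

```python
def sessions_to_mastery(scores, min_correct, consecutive):
    """Return the 1-based session at which the criterion is first met.

    scores: correct responses per session (out of 12 trials).
    min_correct: minimum correct responses a session needs to qualify.
    consecutive: number of consecutive qualifying sessions required.
    Returns None if the criterion is never met.
    """
    run = 0  # length of the current streak of qualifying sessions
    for session, correct in enumerate(scores, start=1):
        run = run + 1 if correct >= min_correct else 0
        if run >= consecutive:
            return session
    return None


# Hypothetical session scores, NOT data from the study.
example_scores = [6, 9, 11, 10, 12, 11]

# Experiment 1 style criterion: 11/12 or higher across two consecutive sessions.
exp1 = sessions_to_mastery(example_scores, min_correct=11, consecutive=2)
# Experiment 2 style criterion: 12/12 for one session.
exp2 = sessions_to_mastery(example_scores, min_correct=12, consecutive=1)
print(exp1, exp2)
```

With the same hypothetical scores, the two rules can yield different session counts, which is why a single criterion across participants makes sessions-to-mastery comparisons easier to interpret.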

Interobserver Agreement and Treatment Fidelity

Interobserver agreement (IOA) and treatment fidelity data were collected and reported following the same procedures as in Experiment 1. Regarding IOA, the independent observer collected data for 61.1% of the target stimuli identification and baseline sessions (66.7% of sessions for Tom and Adam; 55.6% of sessions for Andrew and Jack; 77.8% of sessions for Celine; 44.4% of sessions for Amy), 35.1% of the instructional sessions (47.6% of sessions for Tom; 38.9% of sessions for Andrew; 33.3% of sessions for Adam and Celine; 30.0% of sessions for Amy), and 90% of the maintenance sessions (100% of sessions for Tom, Andrew, Adam, and Celine; 50% of sessions for Amy), all with a mean agreement of 100%. Fidelity data were recorded during 48.1% of the target identification and baseline sessions, 30.7% of the instructional sessions, and 90% of the maintenance sessions (100% of sessions for Tom, Andrew, Adam, and Celine; 50% of sessions for Amy), all with a mean fidelity of 100%.

Results and Discussion

Figures 4 and 5 display the number of correct responses emitted by participants during baseline, acquisition, and maintenance across the three consequence conditions with the abstract stimuli. During acquisition, Tom mastered (12/12 for one session) the AVCDs in the learn unit (LU) and correction-only-for-incorrect-response (CI) conditions in 7 sessions. After we switched the teaching targets in the praise-only-for-correct-response (PC) condition to CI, Tom mastered those targets in 2 additional sessions (top panel, Fig. 4). During maintenance, Tom responded with 10/12 accuracy in both the LU and CI conditions at 4 weeks following the termination of instruction. There were no differences in the acquisition and maintenance of AVCDs for abstract stimuli across the two conditions for Tom. Similarly, Andrew required the same number of sessions to master the AVCDs in the LU and CI conditions (6 sessions) (top panel, Fig. 5). The PC condition was switched to CI in the “best treatment” phase, and Andrew mastered those targets in 4 additional sessions. Due to absence from illness, we collected maintenance data for Andrew only at two weeks following instruction. He had a similar level of accuracy during maintenance across the two conditions: 11/12 in the LU condition and 12/12 in the CI condition. Jack also learned the target responses within the same number of sessions in the LU and CI conditions (5 sessions) (middle panel, Fig. 4). We switched the teaching targets in the PC condition to LU, and Jack mastered those targets in 3 additional sessions. During maintenance, Jack responded with slightly higher accuracy for targets learned under the LU condition: 9/12 in the LU condition and 7/12 in the CI condition four weeks following mastery. His dyad partner, Adam, mastered the AVCDs in the LU condition in 4 sessions and the responses in the CI condition in 3 sessions (25% fewer sessions than in LU) (middle panel, Fig. 5). 
He mastered the targets assigned to the PC condition in 2 additional sessions under the best treatment of CI. Due to the school closure during the COVID-19 pandemic, we collected maintenance data for only up to two weeks for Adam. He responded with 6/12 accuracy in the LU condition and 12/12 accuracy in the CI condition. Celine required one fewer session (25% fewer) to master the responses in the LU condition than in the CI condition (bottom panel, Fig. 4). She reached criterion for the targets in the PC condition in 2 additional sessions after we switched those targets to the best treatment condition. During maintenance, Celine had 100% accuracy for both conditions four weeks following instruction. Her dyad partner, Amy, learned the abstract symbols slightly faster in the CI condition: 10 sessions in the LU condition and 8 sessions in the CI condition (20% fewer sessions than LU) (bottom panel, Fig. 5). We switched to the CI condition for the targets in the PC condition, and Amy mastered those targets in 8 additional sessions. She responded with 6/12 accuracy in both conditions at 4 weeks after instruction. Overall, both the LU and CI conditions produced better acquisition than the PC condition for all participants. Except for Celine, all participants learned at the same rate or faster in the CI condition than in the LU condition. All participants, except for Jack, demonstrated the same or a higher level of maintenance in the CI condition than in the LU condition. Consistent with the findings in Experiment 1, there were no evident differences in the acquisition and maintenance of abstract stimuli between the LU and CI conditions.
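
The "percent fewer sessions" figures above follow from the session counts, taking the slower condition as the base. A minimal sketch of that arithmetic, using only the session counts reported for Adam and Amy in Experiment 2 (the helper name `percent_fewer` is an assumption introduced here):

```python
def percent_fewer(fewer, more):
    """Percent reduction in sessions, relative to the slower condition."""
    return 100 * (more - fewer) / more


# Sessions to mastery reported in the text for Experiment 2.
adam = percent_fewer(fewer=3, more=4)   # CI (3 sessions) vs. LU (4 sessions)
amy = percent_fewer(fewer=8, more=10)   # CI (8 sessions) vs. LU (10 sessions)

print(f"Adam: {adam:.0f}% fewer sessions in CI than in LU")
print(f"Amy: {amy:.0f}% fewer sessions in CI than in LU")
```

Because the base is the larger session count, a one-session difference can correspond to a large percentage when few sessions are involved, as in Adam's case.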

Fig. 4

Baseline, acquisition, best treatment, and maintenance data for Tom (Dyad 1), Jack (Dyad 2), and Celine (Dyad 3) with abstract stimuli in Experiment 2. Note. Square data points represent sessions in which a participant’s performance met the mastery criterion.

Fig. 5

Baseline, acquisition, best treatment, and maintenance data for Andrew (Dyad 1), Adam (Dyad 2), and Amy (Dyad 3) with abstract stimuli in Experiment 2. Note. Square data points represent sessions in which a participant’s performance met the mastery criterion.

Table 3 shows the total number of trials and duration until mastery during acquisition for abstract stimuli. On average, participants required 4 fewer trials (6% fewer) in the CI condition than in the LU condition. Consistent with this, the average total duration in the CI condition was 1.0 min shorter than in the LU condition. Except for Celine, all participants mastered the targets in the CI condition in slightly less time than in the LU condition: 2 fewer min for Tom, 0.5 fewer min for Andrew, 0.2 fewer min for Jack, 0.9 fewer min for Adam, and 5.6 fewer min for Amy. In comparison, Celine required 3.1 fewer min to master the responses in the LU condition than in the CI condition. Overall, there were no considerable differences in the total number of trials and duration required to learn the abstract targets across the LU and CI conditions for all participants.

Summary

Similar to Experiment 1, the results of Experiment 2 demonstrated that the LU procedure, which included both praise and correction procedures, did not necessarily lead to faster acquisition and higher maintenance than the CI condition. After we controlled for target novelty and participants’ history of exposure by using abstract stimuli, all participants required a similar or smaller number of trials and a similar or shorter duration to master the AVCDs in the CI condition than in the LU condition. Furthermore, the CI procedure was necessary for all participants and sufficient for 5 out of 6 participants. Although all participants responded with overall higher levels of accuracy during instruction than during baseline in the PC condition, none of the participants had met the acquisition criterion in the PC condition, after at least five instructional sessions, by the time they mastered the LU and CI conditions. After switching to the best treatment of either LU or CI, all participants mastered the abstract symbols assigned to the PC condition under the new teaching arrangement. The results suggested that praise alone was not efficient for teaching AVCDs to typically developing and high-functioning preschoolers with developmental delays. In contrast, the correction procedure was necessary and occasionally sufficient for skill acquisition and maintenance.

General Discussion

This study was the first to analyze the component effects of skill acquisition consequences involving praise for correct responses and correction procedures for incorrect responses in DTI. Previous research focusing on the effects of correction procedures in DTI usually involved the implementation of differential reinforcement (Carroll et al., 2015, 2018; Jessel et al., 2020; Kodak et al., 2016; McGhan & Lerman, 2013). The results of past studies demonstrated that the package of differential reinforcement and multiple correction procedures (e.g., demonstration, active student response, and multiple response repetition) was effective in teaching new skills (e.g., tact and textual responding) to children with disabilities. Past research also revealed that children demonstrated little improvement when correction was absent, in contrast to the obvious improvement when correction procedures and differential reinforcement were present (Kodak et al., 2016; Rapp et al., 2012; Worsdell et al., 2005). This study isolated and compared the effects of corrections for incorrect responses and positive reinforcement for correct responses in skill acquisition and identified that the correction procedure might be the reason why children learned faster when the instruction involved both correction procedures and differential reinforcement. This finding was consistent with Simonian and Brand’s (2022) results for college students, in which all participants mastered a novel dice game in the corrective-feedback-only condition but failed to master the game in the positive-feedback-only (i.e., praise) condition. According to Baumeister et al. (2001), across various psychological camps there is evidence that aversive events are more potent than appetitive ones, which suggests the strength of negative reinforcement contingencies. 
Thus, the students may have learned faster when the instruction included correction procedures because the motivation to avoid the correction procedure (i.e., negative reinforcement) is higher than the motivation to receive praise (i.e., positive reinforcement) following an independent correct response. The critical nature of corrections in the learning process is also evident in studies conducted with peer tutoring and observational learning. Greer et al. (2004) and Neu and Greer (2019) found that elementary school students with and without developmental delays acquired novel skills (e.g., Korean terms and math problems) faster from observing peers who received only contingent corrections for incorrect responses during LU instruction than from observing peers who received only positive reinforcement following correct responses.

Additionally, the much slower learning overall in the PC condition than in the CI and LU conditions suggested the low efficiency of trial-and-error learning in the absence of corrections, because the learning rates depended on the occurrence rates of chance responding. According to Skinner (1965, pp. 248–249), trial-and-error learning is often an effect of random exploratory behavior in which the target response may follow the antecedent stimulus by accident; in this study, it resulted in only modest gains in performance, whereas the other consequence conditions resulted in criterion-level performances. The results of the current study extended previous findings that the correction procedure is a necessary component in skill acquisition. Since the CI procedure was sufficient for 3 out of 6 participants in Experiment 1 and 5 out of 6 participants in Experiment 2, these findings further suggest that the correction procedure may be sufficient in DTI for typically developing and high-functioning children with developmental delays.

The outcomes of the current study also extended past research findings on the possible sources of control in correction procedures. The effects of the correction procedure on increasing correct responses are commonly viewed as a result of negative reinforcement, such that additional prompted responses are required following the student’s emission of an incorrect response before the trial is terminated (Cariveau et al., 2019; Rodgers & Iwata, 1991). However, considering the consistent and overall fast acquisition and high maintenance level under the CI condition, in which contingent corrections were delivered following incorrect responses and no programmed reinforcers were delivered following correct responses (as compared to the LU condition), we hypothesize that other sources of stimulus control operate in addition to negative reinforcement in the correction procedure. According to Simonian and Brand (2022), corrective feedback (i.e., modeling of the correct response with a verbal statement) alone was effective in teaching novel skills to college students. Thus, the modeling of the correct response immediately following an error in the correction procedure might serve a critical function in promoting the transfer of stimulus control in auditory-visual conditional discrimination training. According to Fantino (2010), “useful” or positive information could function as a conditioned reinforcer and maintain observing responses. Thus, the modeling of the correct response in the correction procedure may provide “positive” information that maintains the observing response and enhances the strength of stimulus control in learning AVCDs.

Interestingly, we did not notice any increase in the emission of problem behaviors for any participant in the CI condition. The researchers implemented a classwide independent group contingency throughout the school day for all students to maintain high levels of student compliance and create a positive environment for learning. Data were recorded to measure each student’s compliance throughout the school day, but not specifically during this study. At the onset of the study, all participants were able to follow classroom rules independently with 90% accuracy for a 30-min interval. During the experiment, all participants demonstrated 100% accuracy in attending to the stimuli and following the teacher’s directions across the conditions.

This study had several limitations. First, due to absences and the school closure during the COVID-19 pandemic, we collected maintenance data for only up to two weeks for Andrew and Adam in Experiment 2. More data would help verify the effectiveness of the CI procedure on skill maintenance. Second, we continued to rotate among the three consequence conditions until the participants met the acquisition criterion for two conditions before moving to the maintenance assessments. Although this arrangement ensured that the participants had the same amount of exposure to the stimuli during instruction, the extra trials delivered after a participant reached the acquisition criterion might have led to higher maintenance levels for the first mastered condition. Future studies may rule out this possibility by terminating instruction and beginning the maintenance probes for a condition immediately following mastery. Third, the transition to the “best treatment” phase may make it look as though praise did not function as a reinforcer. However, we observed (a) indications of praise functioning as a reinforcer during pre-experimental assessments (see Participants section), and (b) an overall ascending trend in the PC condition for Jack, Adam, and Celine in Experiments 1 and 2. As noted earlier, the trial-and-error nature of the PC condition likely slowed acquisition. Eliminating the “best treatment” phase would have allowed for a more precise comparison of the PC condition to the LU and CI conditions. Fourth, because all participants had an instructional history with LU instruction, such experience might favor the effectiveness of LU instruction and influence the outcomes (Coon & Miguel, 2012). However, all participants in the current study demonstrated similar or faster acquisition rates in the CI condition than in the LU condition despite the lack of history with CI instruction. 
Nevertheless, participants without a history of LU instruction might respond differently to the comparison, and future research should replicate the procedure with children who have different histories of instruction.

The outcomes of the current study suggest several areas for future research. Because all participants in the study were high functioning with regard to language development (e.g., followed multiple-step instructions; independently made requests using full sentences) and had an instructional history of LU instruction, the overall high effectiveness and efficiency of the CI procedure might have been influenced by the participants’ high levels of compliance, robust verbal repertoires, and previous exposure to LU instruction. The generality of the results to children with lower levels of compliance, less robust verbal repertoires, and different instructional histories needs further verification. Future research may also replicate the procedure with different sets of visual stimuli and multiple instruction programs requiring different response topographies. The findings in the study also suggested a combined result of positive and negative reinforcement as well as antecedent control in the correction procedure. Future research may further test such hypotheses by conducting component analyses examining the effects of the correction procedures on skill acquisition. Since undergraduate students demonstrated a higher preference for corrective feedback than for positive feedback in skill acquisition (Simonian & Brand, 2022), future research may replicate their results and test the choice preference for positive and corrective feedback for children with and without disabilities either before or after the acquisition phase. Although praise functioned as a reinforcer for all the participants in the study, future research may further investigate and compare the effects of the LU and CI conditions using other reinforcers (e.g., tokens and edibles). The generality of the results with highly preferred items would help verify the hypothesis that the correction procedure is both a necessary and sufficient component of LU instruction in skill acquisition. 
The CI condition produced the same or faster learning rate than the LU condition for 3 out of 6 participants in Experiment 1 and 5 out of 6 participants in Experiment 2. Specifically, three out of six participants (Andrew, Jack, and Adam) required the same or fewer total trials in the CI condition than in the LU condition, whereas one participant (Celine) required fewer total trials in the LU condition than in the CI condition in both Experiments 1 and 2. Considering the different results across the participants, we hypothesize that the more efficient consequence procedure (LU or CI) in skill acquisition depends on personal characteristics (e.g., instructional history, verbal behavior development, and self-management skills). Future research may further verify this hypothesis with data from more participants. Last, since prompting was present in the correction procedure of the CI and LU conditions, additional comparisons among an error-statement-only condition without prompting (McGhan & Lerman, 2013), the PC condition, and the CI condition may help further assess the role of prompting in skill acquisition. While further research is needed to clarify the aforementioned concerns, the results reported herein clearly demonstrated the critical role of the correction procedure in skill acquisition.

The results of this study also provided valuable information for instructors regarding the arrangement of instructional procedures. Considering that the LU and CI conditions, both of which involved the correction procedure, produced much faster acquisition overall than the PC condition for all participants, we suggest that instructors include the correction procedure in the arrangement of instructional consequences to effectively facilitate students’ learning in diverse educational settings. Because the participants in the current study were able to sit at the table independently without emitting any problem behaviors for at least 5 min during academic instruction, we recommend that instructors use the LU condition, which involves positive reinforcement and the correction procedure, for children with lower levels of compliance to maintain appropriate behaviors. For children like those we studied (e.g., with high levels of compliance, robust verbal repertoires, and previous exposure to LU instruction), instructors may select either the LU or CI condition based on personal characteristics.