Abstract

Medical education has gradually shifted toward a competency-based approach over the past decade. To meet the expectations of accrediting bodies, medical educators have been asked to demonstrate that graduating students and residents are competent to advance to the next phase of training. However, traditional clinical skills evaluations are subjective and often based on one-on-one interactions with supervising faculty members. Additional concerns have been raised about uneven clinical experiences leading to variable clinical skill acquisition among graduates. Mastery learning curricula allow medical schools and graduate medical education (GME) programs to document clinical skill acquisition with high reliability and make valid decisions about trainee competency. In this chapter, we discuss historical standard setting methods used in health professions education and set a path forward for fair, reasonable, and evidence-based standard setting in mastery learning environments.


References

1. McGaghie WC, Miller GE, Sajid A, Telder TV. Competency-based curriculum development in medical education. Public Health Paper No. 68. Geneva: World Health Organization; 1978.
2. McGaghie WC. Mastery learning: it is time for medical education to join the 21st century. Acad Med. 2015;90(11):1438–41.
3. Yudkowsky R, Park YS, Lineberry M, Knox A, Ritter EM. Setting mastery learning standards. Acad Med. 2015;90(11):1495–500.
4. Norcini J, Guille R. Combining tests and setting standards. In: Norman GR, Van der Vleuten CPM, Newble DI, editors. International handbook of research in medical education. Dordrecht: Kluwer Academic Publishers; 2002. p. 811–34.
5. Downing SM, Tekian A, Yudkowsky R. Procedures for establishing defensible absolute passing scores on performance examinations in health professions education. Teach Learn Med. 2006;18:50–7.
6. Wayne DB, Fudala MJ, Butter J, Siddall VJ, Feinglass J, Wade LD, McGaghie WC. Comparison of two standard-setting methods for advanced cardiac life support training. Acad Med. 2005;80(10 Suppl):S63–6.
7. Wayne DB, Barsuk JH, Cohen E, McGaghie WC. Do baseline data influence standard setting for a clinical skills examination? Acad Med. 2007;82(10 Suppl):S105–8.
8. Wayne DB, Cohen E, Makoul G, McGaghie WC. The impact of judge selection on standard setting for a patient survey of physician communication skills. Acad Med. 2008;83(10 Suppl):S17–20.
9. Wayne DB, Butter J, Cohen ER, McGaghie WC. Setting defensible standards for cardiac auscultation skills in medical students. Acad Med. 2009;84(10 Suppl):S94–6.
10. Cohen ER, Barsuk JH, McGaghie WC, Wayne DB. Raising the bar: reassessing standards for procedural competence. Teach Learn Med. 2013;25(1):6–9.
11. Sharma R, Szmuilowicz E, Ogunseitan A, Montalvo J, O’Leary K, Wayne DB. Evaluation of a mastery learning intervention on hospitalists’ code status discussion skills. J Pain Symptom Manag. 2017;53(6):1066–70.
12. Angoff WH. Scales, norms, and equivalent scores. In: Thorndike RL, editor. Educational measurement. 2nd ed. Washington, DC: American Council on Education; 1971. p. 508–600.
13. Clauser BE, Mee J, Baldwin SG, Margolis MJ, Dillon GF. Judges’ use of examinee performance data in an Angoff standard-setting exercise for a medical licensing examination: an experimental study. J Educ Meas. 2009;46:390–407.
14. Hofstee WKB. The case for compromise in educational selection and grading. In: Anderson SB, editor. On educational testing. San Francisco: Jossey-Bass; 1983. p. 107–27.
15. Schindler N, Corcoran J, DaRosa D. Description and impact of using a standard-setting method for determining pass/fail scores in a surgery clerkship. Am J Surg. 2007;193(2):252–7.
16. Barsuk JH, Cohen ER, Wayne DB, McGaghie WC, Yudkowsky R. A comparison of approaches for mastery learning standard setting. Acad Med. 2018;93:1079.
17. McGaghie WC, Issenberg SB, Barsuk JH, Cohen ER, Wayne DB. Translational educational research: a necessity for effective health-care improvement. Chest. 2012;142(5):1097–103.
18. Wayne DB, Didwania A, Fudala MJ, Barsuk JH, Feinglass J, McGaghie WC. Simulation-based education improves quality of care during cardiac arrest team responses at an academic teaching hospital: a case-control study. Chest. 2008;133:56–61.
19. Barsuk JH, Cohen ER, Feinglass J, McGaghie WC, Wayne DB. Use of simulation-based education to reduce catheter-related bloodstream infections. Arch Intern Med. 2009;169(15):1420–3.
20. Gossett DR, Gilchrist-Scott D, Wayne DB, Gerber SE. Simulation training for forceps assisted vaginal delivery and rates of maternal perineal trauma. Obstet Gynecol. 2016;128(3):429–35.
21. Cohen ER, Feinglass J, Barsuk JH, Barnard C, O’Donnell A, McGaghie WC, Wayne DB. Cost savings from reduced catheter-related bloodstream infection after simulation-based education for residents in a medical intensive care unit. Simul Healthc. 2010;5:98–102.
22. Barsuk JH, Cohen ER, Feinglass J, Kozmic SE, McGaghie WC, Wayne DB. Cost savings of performing paracentesis procedures at the bedside. Simul Healthc. 2014;9(5):312–8.
23. Yudkowsky R, Tumuluru S, Casey P, Herlich N, Ledonne C. A patient safety approach to setting pass/fail standards for basic procedural skills checklists. Simul Healthc. 2014;9(5):277–82.
24. Prenner SB, McGaghie WC, Chuzi S, Cantey E, Didwania A, Barsuk JH. Effect of trainee performance data on standard setting judgments using the mastery Angoff method. J Grad Med Educ. 2018;10:301.
25. Ilgen JS, Ma IW, Hatala R, Cook DA. A systematic review of validity evidence for checklists versus global rating scales in simulation-based assessment. Med Educ. 2015;49(2):161–73.
26. Barsuk JH, Cohen ER, Caprio T, McGaghie WC, Simuni T, Wayne DB. Simulation-based education with mastery learning improves lumbar puncture skills. Neurology. 2012;79(2):132–7.
27. De Gruijter DNM. Compromise models for establishing examination standards. J Educ Meas. 1985;22(4):263–9.

Author information

Correspondence to Diane B. Wayne.

Appendix 6.1: Standard Setting Packets for Traditional Angoff and Hofstee and Mastery Angoff and Patient-Safety Methods for Simulated Lumbar Puncture


Performance data (reviewing these data may be useful for the traditional Angoff and Hofstee approaches)

This table shows sample pretest and posttest data from a pilot group of 57 internal medicine residents performing a simulated lumbar puncture procedure. Overall pretest and posttest means and standard deviations are displayed, as well as the frequency of each overall score at pretest and posttest.

% Correct | Pretest frequency | Posttest frequency
10%       | 1  | 0
19%       | 1  | 0
24%       | 6  | 0
29%       | 4  | 0
33%       | 5  | 0
38%       | 7  | 0
43%       | 6  | 0
48%       | 6  | 0
52%       | 3  | 1
57%       | 3  | 0
62%       | 4  | 0
67%       | 5  | 1
71%       | 1  | 0
76%       | 3  | 0
81%       | 1  | 1
86%       | 1  | 5
90%       | 0  | 8
95%       | 0  | 15
100%      | 0  | 26
          | Mean = 46.3% | Mean = 94.4%
          | SD = 17.6%   | SD = 8.5%
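As a sanity check, the published summary statistics can be recomputed from the frequency table above. This is an illustrative sketch only; because the % correct bins are rounded to whole percentages, the recomputed means and SDs may differ from the published figures by a few tenths of a percent.

```python
# Recompute means and sample SDs from the {score: frequency} table above.
import math

pretest = {10: 1, 19: 1, 24: 6, 29: 4, 33: 5, 38: 7, 43: 6, 48: 6,
           52: 3, 57: 3, 62: 4, 67: 5, 71: 1, 76: 3, 81: 1, 86: 1}
posttest = {52: 1, 67: 1, 81: 1, 86: 5, 90: 8, 95: 15, 100: 26}

def weighted_stats(freqs):
    """Mean and sample SD (n - 1 denominator) from a frequency table."""
    n = sum(freqs.values())
    mean = sum(s * f for s, f in freqs.items()) / n
    var = sum(f * (s - mean) ** 2 for s, f in freqs.items()) / (n - 1)
    return mean, math.sqrt(var)

for label, freqs in [("pretest", pretest), ("posttest", posttest)]:
    mean, sd = weighted_stats(freqs)
    print(f"{label}: n={sum(freqs.values())}, mean={mean:.1f}%, SD={sd:.1f}%")
```

Both columns sum to n = 57, matching the pilot group size stated above.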

A. Traditional Angoff Method

  1. Select the judges.
  2. Discuss the purpose of the test, the curriculum and assessment, the nature of the examinees, and what constitutes adequate and inadequate skills/knowledge. Review baseline performance data.
  3. Define the “borderline” group: a group that has a 50–50 chance of passing.
  4. Read the first item.
  5. Each judge estimates the proportion of the borderline group that would perform it correctly.
  6. The ratings are recorded for all to see, discuss, and change as appropriate.
  7. Repeat steps 4–6 for each item.
  8. Calculate the passing score by averaging the estimates of all judges for each item and summing across items.
  9. Use the checklist below^a to do this exercise.

Checklist item | Pilot pretest data (%) | % of borderline residents who perform each step correctly
Clean the skin with betadine (may not use chlorhexidine) × 3 | 30 | ____
Drape the patient | 91 | ____
Use 1% lidocaine to form a wheal at intended site | 54 | ____
Numb deeper structure (larger needle) | 54 | ____
Insert spinal needle advancing toward umbilicus (may be more cephalad depending on how flexed the spine) | 65 | ____
Bevel must be in correct direction | 46 | ____
Slowly advance the needle with periodic checking for CSF (removal of stylet) until entering the space | 23 | ____
Measure opening pressure | 14 | ____

^a This is a partial checklist adapted from Barsuk et al. [26]. In an actual standard setting exercise, insert the complete assessment tool with performance data.
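The arithmetic in step 8 can be sketched in a few lines of Python. The judge ratings below are hypothetical placeholders, not values from the chapter; in practice each row would hold one judge's item-by-item estimates for the full checklist.

```python
# Sketch of the Angoff passing-score arithmetic (steps 5-8).
# judge_ratings[j][i] = judge j's estimate of the proportion of
# "borderline" examinees who would perform item i correctly.
# These ratings are hypothetical examples.
judge_ratings = [
    [0.40, 0.90, 0.60, 0.55, 0.70, 0.50, 0.35, 0.25],  # judge 1
    [0.30, 0.85, 0.65, 0.60, 0.75, 0.45, 0.30, 0.20],  # judge 2
    [0.35, 0.95, 0.55, 0.50, 0.65, 0.55, 0.40, 0.30],  # judge 3
]

n_items = len(judge_ratings[0])
# Average the judges' estimates for each item, then sum across items.
item_means = [sum(j[i] for j in judge_ratings) / len(judge_ratings)
              for i in range(n_items)]
raw_cut = sum(item_means)            # expected number of items correct
pct_cut = 100 * raw_cut / n_items    # as a percentage of the checklist
print(f"passing score: {raw_cut:.2f} of {n_items} items ({pct_cut:.1f}%)")
```

The same average-and-sum computation applies to the mastery Angoff method in part C; only the reference group in the judges' minds changes, from "borderline" to "well prepared to succeed."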
B. Traditional Hofstee Method

  1. Select the judges.
  2. Discuss the purpose of the test, the curriculum and assessment, the nature of the examinees, and what constitutes adequate and inadequate skills/knowledge. Review baseline performance data.
  3. Review the test in detail.
  4. Ask the judges to answer four questions:
     (a) What is the minimum acceptable required passing score?
     (b) What is the maximum acceptable required passing score?
     (c) What is the minimum acceptable fail rate?
     (d) What is the maximum acceptable fail rate?
  5. After the test is given, graph the distribution of scores and select the cut score as described by De Gruijter.^a

Clinical skill standard setting: Hofstee method

               | Minimum acceptable required passing score | Maximum acceptable required passing score | Minimum acceptable fail rate | Maximum acceptable fail rate
Clinical skill | ____ | ____ | ____ | ____

^a De Gruijter [27]
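The compromise calculation in step 5 can be sketched as follows: the Hofstee cut score is where the observed fail-rate curve crosses the straight line joining (minimum passing score, maximum fail rate) and (maximum passing score, minimum fail rate). The scores and judge bounds below are hypothetical, chosen only to make the arithmetic concrete.

```python
# Sketch of the Hofstee/De Gruijter compromise cut score (step 5).

def hofstee_cut(scores, k_min, k_max, f_min, f_max, step=0.5):
    """Intersect the observed fail-rate curve with the line joining
    (k_min, f_max) and (k_max, f_min); return the compromise cut score."""
    n = len(scores)
    best_c, best_gap = k_min, float("inf")
    c = k_min
    while c <= k_max:
        observed_fail = sum(s < c for s in scores) / n
        # Acceptability line: the fail rate the judges allow at cut c.
        allowed_fail = f_max + (f_min - f_max) * (c - k_min) / (k_max - k_min)
        gap = abs(observed_fail - allowed_fail)
        if gap < best_gap:
            best_c, best_gap = c, gap
        c += step
    return best_c

# Hypothetical posttest percentage scores and judge-derived bounds.
scores = [62, 68, 71, 74, 77, 80, 82, 85, 88, 90, 92, 95]
cut = hofstee_cut(scores, k_min=60, k_max=80, f_min=0.0, f_max=0.25)
print(f"compromise cut score: {cut:.1f}%")
```

With continuous score distributions the intersection is usually read off the graph; the grid search above is just a numeric stand-in for that graphical step.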
C. Mastery Angoff Method

  1. Select the judges.
  2. Discuss the purpose of the test, the curriculum and assessment, the nature of the examinees, and what constitutes adequate and inadequate skills/knowledge.
     (a) Mastery learning: residents can continue to practice and retest until they achieve the passing standard (no penalty for taking a longer time or multiple retests).
     (b) Past performance data are not relevant, since residents can keep practicing until they can accomplish even difficult items.
  3. Define the “well prepared to succeed” group: the standard reflects the expected performance in the sim lab of residents who are:
     (a) Well prepared to perform the procedure
     (b) Safely and successfully
     (c) On live patients
     (d) With minimal supervision
  4. Read the first item.
  5. Each judge estimates the proportion of the “well prepared” group that would get it right (or the probability that any individual “well prepared” resident would get it right).
  6. The ratings are recorded for all to see, discuss, and change as appropriate.
  7. Repeat steps 4–6 for each item.
  8. Calculate the passing score by averaging the estimates of all judges for each item and summing across items.
  9. Use the checklist below^a to do this exercise.

Checklist item | % of well-prepared residents who accomplish this item correctly in the sim lab
Clean the skin with betadine (may not use chlorhexidine) × 3 | ____
Drape the patient | ____
Use 1% lidocaine to form a wheal at intended site | ____
Numb deeper structure (larger needle) | ____
Insert spinal needle advancing toward umbilicus (may be more cephalad depending on how flexed the spine) | ____
Bevel must be in correct direction | ____
Slowly advance the needle with periodic checking for CSF (removal of stylet) until entering the space | ____
Measure opening pressure | ____

^a This is a partial checklist adapted from Barsuk et al. [26]. In an actual standard setting exercise, insert the complete assessment tool.
D. Patient-Safety Method

  1. Select the judges.
  2. Discuss the purpose of the test, the curriculum, assessment, and the nature of the examinees.
     (a) Mastery learning: residents can continue to practice and retest until they achieve the passing standard (no penalty for taking a longer time or multiple retests).
  3. Determine dimensions relevant to patient safety. In this case we will consider the relevant dimensions to be:
     (a) Patient or provider safety
     (b) Patient comfort
     (c) The outcome of the procedure
  4. For each item, each judge indicates whether performance or non-performance of this item would impact each of these dimensions.
  5. Do this for the skills checklist below.^a
  6. Set standards separately for critical and non-critical items.
     (a) An item that impacts any one of the three dimensions is considered a critical item.
     (b) An item that does not impact any of these dimensions is considered a non-critical item.
  7. Average across judges to determine:
     (a) Which items are critical or non-critical
     (b) Passing scores for critical and non-critical items
  8. Standards are not connected: accomplishing non-critical items does not compensate for non-performance of critical items.

Checklist item^a | Impacts safety? | Impacts comfort? | Impacts outcome?
Clean the skin with betadine (may not use chlorhexidine) × 3 | Yes / No | Yes / No | Yes / No
Drape the patient | Yes / No | Yes / No | Yes / No
Use 1% lidocaine to form a wheal at intended site | Yes / No | Yes / No | Yes / No
Numb deeper structure (larger needle) | Yes / No | Yes / No | Yes / No
Insert spinal needle advancing toward umbilicus (may be more cephalad depending on how flexed the spine) | Yes / No | Yes / No | Yes / No
Bevel must be in correct direction | Yes / No | Yes / No | Yes / No
Slowly advance the needle with periodic checking for CSF (removal of stylet) until entering the space | Yes / No | Yes / No | Yes / No
Measure opening pressure | Yes / No | Yes / No | Yes / No

^a This is a partial checklist adapted from Barsuk et al. [26]. In an actual standard setting exercise, insert the complete assessment tool.
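The tallying in steps 6–8 can be sketched as follows. The judge votes, shortened item names, and the cut scores (100% for critical items, 80% for non-critical) are hypothetical placeholders; an actual exercise would use the full checklist and the panel's real ratings.

```python
# Sketch of steps 6-8: classify items as critical from the judges'
# yes/no votes, then apply separate, non-compensatory standards.
# All votes and cut scores below are hypothetical examples.

# votes[item] = one (safety, comfort, outcome) tuple per judge.
votes = {
    "clean skin":       [(True, False, True), (True, False, True), (True, True, True)],
    "drape patient":    [(False, False, False), (False, False, False), (True, False, False)],
    "lidocaine wheal":  [(False, True, False), (False, True, False), (False, True, True)],
    "measure pressure": [(False, False, True), (False, False, True), (True, False, True)],
}

def is_critical(judge_votes):
    """Critical if most judges flag an impact on at least one dimension."""
    flags = [any(v) for v in judge_votes]
    return sum(flags) / len(flags) >= 0.5

critical = {item for item, v in votes.items() if is_critical(v)}

def passes(performed, critical, cut_critical=1.0, cut_noncritical=0.8):
    """Both standards must be met; extra non-critical items cannot
    compensate for a missed critical item (step 8)."""
    crit = [done for item, done in performed.items() if item in critical]
    non = [done for item, done in performed.items() if item not in critical]
    ok_crit = (sum(crit) / len(crit) >= cut_critical) if crit else True
    ok_non = (sum(non) / len(non) >= cut_noncritical) if non else True
    return ok_crit and ok_non

performed = {"clean skin": True, "drape patient": True,
             "lidocaine wheal": True, "measure pressure": False}
print("critical items:", sorted(critical))
print("passes:", passes(performed, critical))  # False: a critical item was missed
```

Note the non-compensatory design: in this hypothetical run the resident performs three of four items yet fails, because one missed item is critical.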

Setting the Standard

The passing standard represents performance in the simulation lab, before performing the procedure on live patients. Residents can continue to practice and retest until they achieve the passing standard; there is no penalty for taking a longer time or multiple retests.

  1. What should be the passing standard for critical items, i.e., items that impact patient or provider safety, patient comfort, or procedure outcome? What proportion of critical items should residents perform correctly in the sim lab before performing the procedure on live patients with minimal supervision?

     ______%

  2. What should be the passing standard for non-critical items, i.e., items that do not impact patient or provider safety, patient comfort, or procedure outcome? What proportion of non-critical items should residents perform correctly in the sim lab before performing the procedure on live patients with minimal supervision?

     ______%

Please add any comments you may have about these standard setting procedures:


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Wayne, D.B., Cohen, E.R., Barsuk, J.H. (2020). Standard Setting for Mastery Learning. In: McGaghie, W., Barsuk, J., Wayne, D. (eds) Comprehensive Healthcare Simulation: Mastery Learning in Health Professions Education. Comprehensive Healthcare Simulation. Springer, Cham. https://doi.org/10.1007/978-3-030-34811-3_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34810-6

  • Online ISBN: 978-3-030-34811-3
