Item Response Theory

Test Data Engineering

Part of the book series: Behaviormetrics: Quantitative Approaches to Human Behavior ((BQAHB,volume 13))

Abstract

Item response theory (IRT; e.g., Lord, 1980; Hambleton and Swaminathan, 1984; Toyoda, 2012; van der Linden, 2016) is today the most widely used statistical model underlying test administration. Many global-scale tests, including the Programme for International Student Assessment, the Trends in International Mathematics and Science Study, and the Test of English as a Foreign Language, are designed based on IRT. In addition, the theory is becoming an essential subject not only for students of educational measurement but also for students of educational psychology in general.

Notes

  1. Multidimensional IRT (Reckase, 2009) can be applied when the ability scale is more than one-dimensional.

  2. Also called the item characteristic curve (ICC).

  3. Pr[A|B] stands for “the probability of A given B.”

  4. \(\mathrm{LHS}:=\mathrm{RHS}\) means “LHS is defined as RHS.”

  5. Often called the discrimination parameter.

  6. Often called the difficulty parameter.

  7. This is not true for the 3PLM and 4PLM.

  8. \(\exp (0)=e^0=2.7183^0=1\). Note that any nonzero number raised to the 0th power is 1.

  9. Note that this item is not highly discriminating across the entire \(\theta \) scale. There is no such globally discriminating item.

  10. However, the 4PLM was excluded.

  11. Also called the test characteristic curve (TCC).

  12. E[A|B] denotes the expectation of A given B. The expectation is the average of a random variable.

  13. However, it is an assumption rarely met in the real world.

  14. See also (p. 424).

  15. Yen (1984, 1993) also proposed an index of local independence, \(Q_3\).

  16. At age 12, the average scores of female students are generally higher in all subjects.

  17. This is similar to the assumption of data normality in analysis of variance. Strictly speaking, there is no random variable in the real world that is exactly normally distributed (Matloff, 2020). The normal distribution is a mental abstraction.

  18. \(\wedge \) means “and” (logical conjunction). In addition, \(\vee \) means “or” (logical disjunction).

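  19. If two dice are rolled, the joint probability of getting an even number on one and a three on the other is factored as

    $$\mathrm{Pr}[\mathrm{even}\wedge 3]=\mathrm{Pr}[\mathrm{even}]\times \mathrm{Pr}[3]=\frac{1}{2}\times \frac{1}{6}=\frac{1}{12}.$$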

  20. \(\Leftrightarrow \) represents “if and only if.” \(A\Leftrightarrow B\) means “A is true (false) if and only if B is true (false).”

  21. For real numbers a, b, c, and d, \(a\gtrless b\Leftrightarrow c\gtrless d\) means “\(a>b\) if \(c>d\) and \(a<b\) if \(c<d\).”

  22. For a continuous random variable x whose density is f(x), the expectation is \(\int xf(x)\mathrm {d}x\).

  23. For score x, a transformation such as \(ax+b\).

  24. All item parameters are predetermined.

  25. \(x^{1/2}=\sqrt{x}\). Generally, \(x^{1/a}=\root a \of {x}\).

  26. For the details of the derivation, see Sect. 4.4.5 (p. 114).

  27. \(x^{-a}=1/x^a\).

  28. For a method using the Gibbs sampler, see Albert (1992) and Shigemasu and Nakamura (1996).

  29. This is in fact a multiple (S-fold) integration, as follows:

    $$\begin{aligned} \int \limits _{-\infty }^{\infty } f(\boldsymbol{\theta })\mathrm {d}\boldsymbol{\theta } =\int \limits _{-\infty }^{\infty }\cdots \int \limits _{-\infty }^{\infty }\int \limits _{-\infty }^{\infty } f(\boldsymbol{\theta })\mathrm {d}\theta _1\mathrm {d}\theta _2\ldots \mathrm {d}\theta _S. \end{aligned}$$

  30. The order of integration and summation is interchangeable.

  31. \(n!=n\times (n-1)\times \cdots \times 2\times 1=\prod \limits _{i=1}^ni\).

  32. The hyperparameters can be treated as unknown. Such an analysis is called hierarchical Bayesian estimation (e.g., Mislevy, 1986).

  33. The constant term of the prior for the item parameters is excluded from this calculation.

  34. A bad starting point is set to clearly show the optimization process.

  35. In estimating the MLEs of the item parameters, the ELL \({ell}(\boldsymbol{U}|\boldsymbol{\Lambda }^{(t)})\) should be monitored.

  36. The results may differ across software programs due to differences in optimization methods and prior densities of the item parameters. In particular, care is needed as to whether the slope parameter includes the scaling factor (see p. 87). In this book, the scaling factor is included in the slope parameter.

  37. The IIF decreases as the lower asymptote parameter increases (see Sect. 4.4.3, p. 112).

  38. More precisely, this is the asymptotic PSD that approaches the true PSD as the sample size increases.

  39. More precisely, this is the asymptotic SE that approaches the true SE as the sample size increases.

  40. A square matrix all of whose off-diagonal elements are 0.

  41. The second-order derivative matrix is called the Hessian matrix. Thus, the Fisher information matrix is the negative expectation of the Hessian matrix.

  42. \(\partial f(\boldsymbol{x})/\partial x_j\) denotes the partial derivative of a function f of multiple variables \(\boldsymbol{x}\) with respect to a single variable \(x_j\). When differentiated with respect to \(x_j\), the other variables are regarded as constants, as follows:

    $$\displaystyle \frac{\partial }{\partial x_2} (ax_1^3+bx_1x_2^2+cx_2x_3^2) =0+2bx_1x_2+cx_3^2.$$

  43. Similarly, the (2, 1)-th element is the (cross) second-order partial derivative with respect to a and b, where \({ell}(\boldsymbol{u}_j|\boldsymbol{\lambda })\) is first differentiated with respect to a and then with respect to b (location parameter). The order of the differentiation is interchangeable.

  44. Note that \(\ln 0=-\infty \) and \(\ln 1=0\).

  45. The ELL (not the ELP) should be used even in the case where the estimates are MAPs. This choice is necessary, for example, when comparing the goodness of fit between the 2PLM, 3PLM, and 4PLM, because the priors used for them are different.

  46. Note that a scalar is a matrix with one row and one column.
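
The partial-derivative example in the notes, \(\partial (ax_1^3+bx_1x_2^2+cx_2x_3^2)/\partial x_2=2bx_1x_2+cx_3^2\), can be checked numerically. The following is a minimal Python sketch rather than anything taken from the chapter: the coefficient values, the evaluation point, the step size, and the function names are all illustrative assumptions. It compares the stated analytic result with a central finite difference in which only \(x_2\) is perturbed and \(x_1\) and \(x_3\) are held constant.

```python
# Illustrative coefficients (assumed values, not from the chapter).
A, B, C = 2.0, 3.0, 5.0


def f(x1, x2, x3):
    """Example function: a*x1^3 + b*x1*x2^2 + c*x2*x3^2."""
    return A * x1**3 + B * x1 * x2**2 + C * x2 * x3**2


def df_dx2_analytic(x1, x2, x3):
    """Stated partial derivative w.r.t. x2: 0 + 2*b*x1*x2 + c*x3^2."""
    return 2.0 * B * x1 * x2 + C * x3**2


def df_dx2_numeric(x1, x2, x3, h=1e-6):
    """Central finite difference in x2 only; x1 and x3 stay fixed."""
    return (f(x1, x2 + h, x3) - f(x1, x2 - h, x3)) / (2.0 * h)


# Arbitrary evaluation point (assumed for illustration).
point = (1.5, -0.7, 2.0)
analytic = df_dx2_analytic(*point)
numeric = df_dx2_numeric(*point)

print(f"analytic: {analytic:.6f}")
print(f"numeric:  {numeric:.6f}")
```

The term \(ax_1^3\) does not involve \(x_2\), so it contributes 0 to the derivative. Because the function is quadratic in \(x_2\), the two values should agree up to floating-point rounding.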

Author information

Correspondence to Kojiro Shojima.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this chapter

Shojima, K. (2022). Item Response Theory. In: Test Data Engineering. Behaviormetrics: Quantitative Approaches to Human Behavior, vol 13. Springer, Singapore. https://doi.org/10.1007/978-981-16-9986-3_4
