INTERDISCIPLINARY JOURNAL OF ENVIRONMENTAL AND SCIENCE EDUCATION
Research Article

Bayesian Versus Frequentist Estimation for Item Response Theory Models of Interdisciplinary Science Assessment

Interdisciplinary Journal of Environmental and Science Education, 2022, 18(4), e2297, https://doi.org/10.21601/ijese/12299

ABSTRACT

Along with the trend emphasizing interdisciplinary (ID) learning, several scholars have developed ID assessments to measure students’ ID understanding. The interdisciplinary science assessment for carbon cycling (ISACC) was developed to assess high school and college students’ ID understanding, that is, their ability to integrate knowledge from different science disciplines to explain a scientific phenomenon: global carbon cycling. The ISACC’s construct validity was examined with traditional item response theory (IRT) models in 2021. The current study was motivated by the desire to reveal how the IRT analysis results for the ISACC obtained with a Bayesian approach differ from those obtained with the traditional frequentist approach, because the Bayesian approach has several strengths over traditional IRT estimation. The results of the study imply the need for further research on developing and validating interdisciplinary science assessments with strong psychometric properties.
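
The methodological contrast at the heart of the study can be illustrated with a minimal sketch; the code below is not the author’s analysis. It simulates dichotomous responses to a single Rasch item and estimates the item’s difficulty once by maximum likelihood (the frequentist route) and once as a posterior mode under a standard normal prior, a deliberately simplified stand-in for a full MCMC-based Bayesian analysis. The data, abilities, and prior are illustrative assumptions only.

# A minimal sketch (not the study's code) contrasting frequentist (maximum likelihood)
# and Bayesian (prior-informed) estimation of item difficulty in a Rasch model,
# using simulated dichotomous responses. Person abilities are treated as known here
# purely to keep the illustration short; real IRT software estimates them jointly
# (e.g., via marginal maximum likelihood or MCMC sampling).
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import expit  # logistic function

rng = np.random.default_rng(0)

# Simulate responses of 40 examinees to one item (true difficulty b = 0.5).
theta = rng.normal(0.0, 1.0, size=40)          # assumed known person abilities
true_b = 0.5
y = rng.binomial(1, expit(theta - true_b))     # Rasch: P(correct) = logistic(theta - b)

def neg_log_likelihood(b):
    p = expit(theta - b)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Frequentist estimate: maximize the likelihood alone.
b_mle = minimize_scalar(neg_log_likelihood, bounds=(-4, 4), method="bounded").x

# Bayesian estimate (posterior mode): likelihood combined with a N(0, 1) prior on b.
# A full Bayesian analysis would sample the entire posterior with MCMC (e.g., Stan);
# the prior's shrinkage of the estimate toward 0 is already visible in the mode.
def neg_log_posterior(b):
    return neg_log_likelihood(b) + 0.5 * b**2  # -log N(0, 1) prior, up to a constant

b_map = minimize_scalar(neg_log_posterior, bounds=(-4, 4), method="bounded").x

print(f"MLE difficulty: {b_mle:.3f}, Bayesian (MAP) difficulty: {b_map:.3f}")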

KEYWORDS

interdisciplinary understanding, carbon cycling, assessment, item response theory, Bayesian approach

CITATION (APA)

You, H. (2022). Bayesian Versus Frequentist Estimation for Item Response Theory Models of Interdisciplinary Science Assessment. Interdisciplinary Journal of Environmental and Science Education, 18(4), e2297. https://doi.org/10.21601/ijese/12299

LICENSE

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.