Differential item functioning; Mantel-Haenszel; impact; Type I error inflation; dispersion
Abstract :
[en] Sum score-based methods for identifying differential item functioning (DIF), such as the Mantel-Haenszel (MH) approach, are known to suffer from Type I error inflation even when no DIF effect is present. This can happen when the items differ in discrimination and there is item impact. Outlier DIF methods, by contrast, have been developed to be robust against this Type I error inflation, although they are still based on the MH DIF statistic. The present paper explains why the common MH method is vulnerable to the inflation effect while the outlier DIF versions are not. In a simulation study, we were able to produce the Type I error inflation by inducing item impact and item differences in discrimination. In parallel with the Type I error inflation, the dispersion of the DIF statistic across items increased. As expected, the outlier DIF methods did not seem sensitive to impact or to differences in item discrimination.
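For readers unfamiliar with the MH DIF statistic mentioned in the abstract, the following is a minimal sketch of its usual textbook form for one dichotomous item, with examinees matched on the sum score. It is not the authors' simulation code. The function name `mantel_haenszel`, the list-of-lists data layout, and the group labels "R" (reference) and "F" (focal) are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def mantel_haenszel(responses, groups, item):
    """Textbook MH DIF statistics for one dichotomous item.

    responses: list of rows, each a list of 0/1 item scores (hypothetical layout)
    groups:    list of "R" (reference) or "F" (focal), one label per row
    item:      column index of the studied item
    Returns (alpha_mh, chi2, delta): the common odds ratio, the
    continuity-corrected chi-square, and the ETS delta (-2.35 * ln alpha).
    """
    # One 2x2 table per sum-score stratum: [A, B, C, D] =
    # [ref correct, ref incorrect, focal correct, focal incorrect].
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for row, group in zip(responses, groups):
        score = sum(row)  # matching criterion (studied item included)
        idx = (0 if row[item] == 1 else 1) + (0 if group == "R" else 2)
        tables[score][idx] += 1

    num = den = sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        if n < 2:
            continue  # the variance term is undefined for a single examinee
        n_ref, n_foc = a + b, c + d
        m1, m0 = a + c, b + d
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_e += n_ref * m1 / n  # expected A under no DIF
        sum_v += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))

    alpha = num / den if den > 0 else float("nan")
    # Continuity correction, truncated at zero.
    dev = max(abs(sum_a - sum_e) - 0.5, 0.0)
    chi2 = dev ** 2 / sum_v if sum_v > 0 else float("nan")
    delta = -2.35 * math.log(alpha) if alpha > 0 else float("nan")
    return alpha, chi2, delta


# Toy data: all examinees have sum score 1 on three items.
rows = [[1, 0, 0]] * 3 + [[0, 1, 0]] + [[1, 0, 0]] + [[0, 1, 0]] * 3
labels = ["R"] * 4 + ["F"] * 4
alpha, chi2, delta = mantel_haenszel(rows, labels, item=0)
print(alpha, chi2, delta)
```

In the setting the abstract describes, such a statistic would be computed for every item and compared with a chi-square critical value (one degree of freedom). The outlier DIF methods, as the abstract notes, start from the same statistic.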
Disciplines :
Education & instruction
Author, co-author :
Magis, David ; Université de Liège - ULiège > Département des sciences biomédicales et précliniques > Histologie
De Boeck, Paul
Language :
English
Title :
Type I error inflation in DIF identification with Mantel-Haenszel: an explanation and a solution