Differential Item Functioning of University Entrance Exam: Using Rasch Analysis

Document Type: Research article

Author

Iran University of Medical Science

Abstract

The present study investigates the presence of Differential Item Functioning (DIF) in terms of gender in a high-stakes language proficiency test, the National University Entrance Exam for Foreign Languages (NUEEFL). The participants (N = 5000) were selected randomly from a pool of examinees who had taken the NUEEFL as a university entrance requirement for English language studies (English literature, teaching, and translation). The results revealed that, of the 95 items, 40 exhibited DIF between males and females. Our investigation also revealed that the test is not unidimensional and that answering correctly requires knowledge, abilities, and skills other than those the items aim to measure. It is concluded that the NUEEFL scores are not free of construct-irrelevant variance and that the overall fairness of the test is in question. In addition, the current research offers several important implications for test designers, stakeholders, and administrators, as well as for teachers and students.

Keywords


References

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289-300.

Boone, W. J., Staver, J. R., & Yale, M. S. (2014). Rasch analysis in the human sciences. Dordrecht: Springer Science & Business Media.

Boyle, J. (1987). Sex differences in listening vocabulary. Language Learning, 37(2), 273-284.

Camilli, G. (2006). Test fairness. In R. L. Brennan (Ed.), Educational Measurement (4th ed., Vol. 4, pp. 221-256). Westport, CT: American Council on Education & Praeger.

Camilli, G., & Penfield, D. A. (1997). Variance estimation for differential test functioning based on the Mantel-Haenszel log-odds ratio. Journal of Educational Measurement, 34, 123–139.

Cole, N. S. (1997). The ETS gender study: How females and males perform in educational settings. Princeton, NJ: Educational Testing Service.

DeMars, C. (2010). Item response theory. Understanding statistics: Measurement. Oxford, UK: Oxford University Press.

Embretson, S. E., & Reise, S. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.

Furr, M. R., & Bacharach, V. R. (2007). Psychometrics: An Introduction. Thousand Oaks, CA: SAGE.

Geranpayeh, A., & Kunnan, A. J. (2007). Differential Item Functioning in terms of age in the Certificate in Advanced English Examination. Language Assessment Quarterly, 4, 190-222.

Holland, P. W., & Wainer, H. E. (2012). Differential item functioning. London, UK: Routledge.

Karami, H. (2015). A closer look at the validity of the University Entrance Exam: Dimensionality and generalizability (Unpublished doctoral dissertation). University of Tehran.

Linacre, J. M. (1991-2006). A user’s guide to Winsteps® Ministep Rasch-model computer programs. Retrieved January 10, 2007, from http://www.winsteps.com/aftp/winsteps.pdf

Linacre, J. M. (2012). A user’s guide to Winsteps® Ministep Rasch-model computer programs. Retrieved from https://www.winsteps.com/winman/copyright.htm

Linacre, J. M. (2016). Winsteps® (Version 3.92.1) [Computer Software]. Beaverton, OR: Winsteps.com. Retrieved from http://www.winsteps.com/

McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Malden, MA: Blackwell.

Ostini, R., & Nering, M. L. (2006). Polytomous item response theory models. Quantitative applications in the social sciences. Thousand Oaks, CA: SAGE.

Pae, H. (2011). Differential item functioning and unidimensionality in the Pearson Test of English Academic. Pearson Education Ltd.

Ramsey, P. A. (1993). Sensitivity review: The ETS experience as a case study. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 367-388). Hillsdale, NJ: Erlbaum.

Ryan, K., & Bachman, L. (1992). Differential item functioning on two tests of EFL proficiency. Language Testing, 9(1), 12-29.