An Argument-based Validation of the Vocabulary Subtest of MA Entrance Exam for English Majors

Document Type: Research Article

Author

Foreign Languages Department, Faculty of Humanities and Arts, Hazrat-e Masoumeh University, Qom, Iran

Abstract

Despite ample research on university entrance examinations, the vocabulary section of the M.A. entrance exam for English-related majors in Iran, which has long been criticized by both candidates and lecturers, has not been examined comprehensively in its own right. The present study evaluated the vocabulary sections of the exams administered over the past five years through argument-based validation. The participants included 194 English-major undergraduate students, 24 M.A. students, 16 university professors, and six native speakers of English, who responded to the vocabulary sections, a vocabulary size test serving as the criterion measure, and a questionnaire. The lexical items in the vocabulary sections were analyzed against general and specialized corpora as well as major word lists. To examine item characteristics, validity, and reliability, the researcher employed item analysis procedures, criterion-related validation, and internal consistency estimation (Cronbach's alpha), and the participants' questionnaire responses were analyzed qualitatively. The results indicated that 49.1% of the words were not of appropriate frequency in the BNC and COCA corpora, and 61.1% occurred rarely or not at all in specialized corpora relevant to English majors. The criterion-related validity and reliability indices of the test were 0.32 and 0.48, respectively. Many items showed problems with item difficulty (70% of the items), item discrimination (38%), and choice distribution (58%), and participants generally deemed the test unsuitable for admitting M.A. students. Consequently, the use of the test as a criterion for M.A. admissions was found to be neither justified nor defensible. Implications, limitations, and suggestions for further research are discussed.
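For readers unfamiliar with the statistics reported above, the following Python sketch illustrates how the classical indices named in the abstract (item difficulty, an upper-lower discrimination index, Cronbach's alpha, and a criterion-related validity correlation) are conventionally computed. This is not the study's own code; the variable names `responses` (a 0/1 scored examinee-by-item matrix) and `criterion` (e.g., vocabulary size test totals) are illustrative assumptions.

```python
# Minimal sketch of classical item-analysis and reliability statistics.
# Assumes `responses` is a dichotomously scored matrix (rows = examinees,
# columns = items) and `criterion` is a vector of criterion-test scores.
import numpy as np


def item_difficulty(responses: np.ndarray) -> np.ndarray:
    # Proportion of examinees answering each item correctly (p-value).
    return responses.mean(axis=0)


def item_discrimination(responses: np.ndarray, group_share: float = 0.27) -> np.ndarray:
    # Upper-lower discrimination index: p(top 27%) minus p(bottom 27%).
    totals = responses.sum(axis=1)
    order = np.argsort(totals)
    n = int(round(group_share * responses.shape[0]))
    lower, upper = responses[order[:n]], responses[order[-n:]]
    return upper.mean(axis=0) - lower.mean(axis=0)


def cronbach_alpha(responses: np.ndarray) -> float:
    # Internal consistency: alpha = k/(k-1) * (1 - sum(item variances) / total-score variance).
    k = responses.shape[1]
    item_vars = responses.var(axis=0, ddof=1)
    total_var = responses.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)


def criterion_validity(responses: np.ndarray, criterion: np.ndarray) -> float:
    # Pearson correlation between test total scores and the criterion measure.
    totals = responses.sum(axis=1)
    return float(np.corrcoef(totals, criterion)[0, 1])
```

Under this reading, the reported values of 0.32 and 0.48 correspond to the outputs of `criterion_validity` and `cronbach_alpha`, respectively, while the percentages of flagged items follow from conventional cut-offs applied to the difficulty and discrimination indices.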


