Investigating Bias in the Reading Comprehension Items of the PhD Admission Test for English Majors under Cognitive Diagnostic Assessment

Article Type: Research Article (Regular)

Authors

1 PhD Student in English Language Teaching, Faculty of Foreign Languages, Islamic Azad University, Central Tehran Branch, Tehran, Iran

2 Assistant Professor of English Language Teaching, Faculty of Foreign Languages, Islamic Azad University, Science and Research Branch, Tehran, Iran

3 Associate Professor of English Language Teaching, Faculty of Foreign Languages, Islamic Azad University, Central Tehran Branch, Tehran, Iran

4 Assistant Professor of Assessment and Measurement, Kharazmi University, Tehran, Iran

Abstract

In recent decades, examining ability levels on high-stakes tests from the perspective of cognitive diagnostic assessment has attracted the attention of researchers in evaluation, assessment, and measurement. Cognitive diagnostic models examine test takers' mastery or non-mastery of multiple attributes. Accordingly, the present study investigates differential item functioning and differential attribute functioning in the general English section of the PhD entrance examination for English. To this end, 3220 female and male test takers were randomly selected from the population of candidates sitting the examination administered by Iran's National Organization for Educational Testing (Sanjesh). Following an exploratory mixed-methods design, the qualitative and quantitative phases were carried out in sequence. A Q-matrix constructed through a think-aloud protocol was then used to analyze the data in R (RStudio). The G-DINA model was estimated within the cognitive diagnostic assessment framework, and the differential item functioning results revealed bias in the test items.
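To make the Q-matrix idea concrete, the following is a minimal sketch in R of what such a matrix looks like; the items, attribute labels, and entries are hypothetical illustrations, not the matrix built from the study's think-aloud protocol. Each row is a reading item, each column a cognitive attribute, and a 1 marks an attribute the item is assumed to require.

    # Toy Q-matrix (hypothetical items and attributes): rows = items,
    # columns = attributes; 1 = the item requires that attribute.
    q_matrix <- rbind(
      item1 = c(vocabulary = 1, syntax = 0, inference = 0),
      item2 = c(vocabulary = 0, syntax = 1, inference = 1),
      item3 = c(vocabulary = 1, syntax = 0, inference = 1)
    )
    print(q_matrix)  # inspect before passing it to a CDM estimation package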

Article Title [English]

Test Fairness Analysis of Reading Comprehension Items in the PhD Nationwide Admission Test under CDA

Authors [English]

  • Niloufar Shahmirzadi 1
  • Masood Siyyari 2
  • Hamid Marashi 3
  • Masoud Geramipour 4
1 PhD Candidate in Applied Linguistics, Islamic Azad University, Central Tehran Branch, Tehran, Iran
2 Assistant Professor of Applied Linguistics, Islamic Azad University, Science and Research Branch, Tehran, Iran
3 Associate Professor of Applied Linguistics, Islamic Azad University, Central Tehran Branch, Tehran, Iran
4 Assistant Professor of Assessment and Measurement, Kharazmi University, Tehran, Iran
Abstract [English]

Over the past few decades, documenting test takers' proficiency levels in large-scale assessments has increasingly relied on Cognitive Diagnostic Assessment (CDA), which yields fine-grained skills-mastery profiles of test takers. Accordingly, the present study scrutinizes the reading comprehension items of a high-stakes test under CDA. Differential attribute functioning (DAF) analysis was used to detect differences in the probability of attribute mastery among test takers, and differential item functioning (DIF) analysis was applied to compare item performance across genders. The participants were 3220 female and male candidates who sat the PhD nationwide admission test in Iran. Following a sequential exploratory mixed-methods design, the G-DINA model was run in R (RStudio). The results revealed test items suspected of DIF against female candidates. Finally, the findings are discussed in light of their implications for the language testing community, particularly the potential social harm arising from biased items in PhD nationwide admission exams.
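As a rough illustration of the pipeline the abstract describes, the R sketch below calibrates the G-DINA model and screens items for gender DIF using the GDINA package; it is a minimal sketch, not the authors' actual script, and the file names and the grouping variable are assumptions.

    # Minimal sketch, assuming dichotomous item responses and a binary
    # Q-matrix stored in hypothetical CSV files.
    library(GDINA)

    resp   <- read.csv("reading_responses.csv")  # one row per examinee, items scored 0/1
    Q      <- read.csv("q_matrix.csv")           # Q-matrix from the think-aloud protocol
    gender <- read.csv("examinees.csv")$gender   # grouping variable for DIF

    # Calibrate the saturated G-DINA model on the full sample.
    fit <- GDINA(dat = resp, Q = Q, model = "GDINA")

    # Per-examinee attribute-mastery estimates (EAP), the basis of DAF comparisons.
    mastery <- personparm(fit, what = "EAP")

    # Wald-test DIF analysis across gender groups; flagged items are the
    # candidates for the bias the study reports.
    dif_out <- dif(dat = resp, Q = Q, group = gender, method = "wald")
    print(dif_out)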

Keywords [English]

  • Reading Comprehension
  • DIF
  • DAF
  • CDA
  • PhD National Admission Exam