Construct-Irrelevant Factors and Test Validity: Investigating the Relationship among Gender, Age, Mother Tongue, Field of Study, and TOEFL iBT® Results

Document Type: Research Article

Author

Sharif University of Technology

Abstract

Factors extraneous to the construct a test is intended to measure can distort test results and threaten test validity; Ary, Jacobs, Sorensen, and Walker (2014) refer to these as construct-irrelevant factors. The computer, like other innovations, has aspects that should be investigated: in some cases, the computer can distort test results and become a source of construct-irrelevant variance. Besides computer familiarity, demographic factors such as gender, age, mother tongue, and field of study can also be sources of construct-irrelevant variance. The researchers therefore chose to analyze the role of these factors in the assessment procedure. To this end, one hundred participants completed a computer-familiarity questionnaire, an attitude-toward-computers questionnaire, and a computer-based TOEFL iBT®. In the first step, the participants' level of computer familiarity was investigated; the next stage probed their attitudes toward computers and the effect of three specified variables (age, gender, and mother tongue) on the test results. The results did not indicate a meaningful relationship between these variables and the test scores.
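In essence, the analysis reduces to a set of standard significance tests relating each background variable to the test scores. The sketch below, in Python, shows how such checks might be run; the data file, column names, and the particular tests (Welch's t-test for gender, a Pearson correlation for age, a one-way ANOVA for mother tongue) are illustrative assumptions rather than the authors' reported procedure.

```python
# Illustrative sketch only: file and column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("toefl_ibt_sample.csv")  # hypothetical dataset

# Gender (two groups): Welch's independent-samples t-test on scores.
male = df.loc[df["gender"] == "male", "score"]
female = df.loc[df["gender"] == "female", "score"]
t_stat, p_gender = stats.ttest_ind(male, female, equal_var=False)
print(f"Gender:        t = {t_stat:.2f}, p = {p_gender:.3f}")

# Age (continuous): Pearson correlation with test score.
r_age, p_age = stats.pearsonr(df["age"], df["score"])
print(f"Age:           r = {r_age:.2f}, p = {p_age:.3f}")

# Mother tongue (several groups): one-way ANOVA across L1 groups.
groups = [g["score"].to_numpy() for _, g in df.groupby("mother_tongue")]
f_stat, p_l1 = stats.f_oneway(*groups)
print(f"Mother tongue: F = {f_stat:.2f}, p = {p_l1:.3f}")
```

Under these assumptions, non-significant p-values on all three checks would correspond to the study's finding of no meaningful relationship between the demographic variables and the scores.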

Keywords

References
Alderson, J. C. (2000). Assessing reading. New York: Cambridge University Press.
 
American Psychological Association. (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
 
Ary, D., Jacobs, L., Sorensen, C., & Walker, D. (2014). Introduction to research in
education. Belmont: Cengage Learning.
 
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford
University Press.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.

Choi, I. C., Kim, K. S., & Boo, J. (2003). Comparability of a paper-based language test and a computer-based language test. Language Testing, 20(3), 295-320.

Clariana, R., & Wallace, P. (2002). Paper-based versus computer-based assessment: Key factors associated with the test mode effect. British Journal of Educational Technology, 33(5), 593-602.
 
Cohen, A. (1984). On taking tests: what the students report. Language Testing, 1, 70-81.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis for field
settings. New York: Rand McNally.
 
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281-302.
 
Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443-507). Washington, DC: American Council on Education.
 
Durndell, A., & Lightbody, P. (1993). Gender and computing: Change over time? Computers & Education, 21(4), 331-336.
 
Eignor, D. (1999). Selected technical issues in the creation of computer-adaptive tests of second language reading proficiency. In M. Chalhoub-Deville (Ed.), Issues in computer-adaptive testing of reading proficiency (pp. 167-181). Cambridge: Cambridge University Press.
 
Embretson, S. (1983). Construct validity: Construct representation versus nomothetic span.
Psychological Bulletin, 93, 179-197.
 
Fulcher, G. (1999). Computerizing an English language placement test. ELT Journal, 53(4), 289-
299.
Geissler, J. E., & Horridge, P. (1993). University students’ computer knowledge and
commitment to learning. Journal of Research on Computing in Education, 25(3), 347-365.
 
Geranpayeh, A., & Kunnan, A. J. (2007). Differential item functioning in terms of age in the Certificate in Advanced English examination. Language Assessment Quarterly, 4(2), 190-222.
Hambleton, R. K. (1984). Validating the test score. In R. A. Berk (Ed.), A guide to criterion-referenced test construction (pp. 199-230). Baltimore: Johns Hopkins University Press.
Kirsch, I., Jamieson, J., Taylor, C., & Eignor, D. (1998). Computer familiarity among TOEFL examinees. ETS Research Report Series, 1998(1), i-23.
 
Lee, J. A. (1986). The effects of past computer experience on computerized aptitude test performance. Educational and Psychological Measurement, 46(3), 727-733.
 
Lennon, R. T. (1956). Assumptions underlying the use of content validity. Educational and
Psychological Measurement, 16, 294-304.
 
Loevinger, J. (1957). Objective tests as instruments of psychological theory: Monograph Supplement 9. Psychological Reports, 3(3), 635-694.
 
Loyd, B. H., & Gressard, C. (1984). Reliability and factorial validity of computer attitude scales. Educational and Psychological Measurement, 44(2), 501-505.
 
Messick, S. (1975). The standard problem: Meaning and values in measurement and evaluation. American Psychologist, 30(10), 955-966.
 
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35(11), 1012-1027.
 
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-
103). New York: Macmillan.
 
Messick, S. (1993). Foundations of validity: Meaning and consequences in psychological assessment. ETS Research Report Series, 1993(2), i-18.
 
Odo, D. M. (2012). Computer familiarity and test performance on a computer-based cloze ESL
reading assessment. Teaching English with Technology, 12(3), 18-35.
 
Popham, W. J. (1978). Criterion-referenced measurement. Englewood Cliffs, NJ: Prentice-Hall.

Rahman, M. A. (2011). Teacher educators' attitudes towards computer: Perspective Bangladesh. Available

Rezaee, A., & Salehi, M. (2008). The construct validity of a language proficiency test: A multitrait-multimethod approach. TELL, 2(8), 93-110.
 
Salehi, M. (2011). On the construct validity of the reading section of the University of Tehran English Proficiency Test. Journal of English Language Teaching and Learning, Faculty of Literature and Humanities of Tabriz University, 22, 129-160.
 

Salehi, M. (2012). The construct validity of a test: A triangulation of approaches. Language Testing in Asia. https://doi.org/10.1186/2229-0443-2-2-102

 
Salehi, M., & Bagheri Sanjareh, H. (2013). On the comparability of C-test and cloze test: A
verbal protocol approach. English for Specific Purposes World, 14(39).
 
Salehi, M., & Tayebi, A. (2012). Differential item functioning in terms of gender in the reading sub-section of a high-stakes test. Iranian Journal of Applied Language Studies, 4(1), 135-168.
 
Schmidt, F. L., Hunter, J. E., Pearlman, K., Hirsh, H. R., Sackett, P. R., Schmitt, N., ... & Zedeck, S. (1985). Forty questions about validity generalization and meta-analysis. Personnel Psychology, 38(4), 697-798.
 
Shulman, L. S. (1970). Reconstruction of educational research. Review of Educational Research,
40(3), 371-396.
 
Taylor, C., Kirsch, I., Jamieson, J., & Eignor, D. (1999). Examining the relationship between
computer familiarity and performance on computer‐based language tasks. Language Learning, 49(2), 219-274.
 
Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Houndmills,
England: Palgrave Macmillan.
 
Weir, C., O’Sullivan, B., Jin, Y., & Bax, S. (2007). Does the computer make a difference? Reaction of candidates to a computer-based versus a traditional hand-written form of the IELTS Writing component: Effects and impact. IELTS Research Reports, 7, 311-347.
Winter, S. J., Chudoba, K. M., & Gutek, B. A. (1998). Attitudes toward computers: When do they predict computer use? Information & Management, 34(5), 275-284.