Alderson, J. C. (2000). Assessing reading. New York: Cambridge University Press.
American Psychological Association. (1985). Standards for Educational and
Psychological Testing. Washington, DC: American Psychological Association.
Ary, D., Jacobs, L., Sorensen, C., & Walker, D. (2014). Introduction to research in
education. Belmont: Cengage Learning.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford
Clariana, R., & Wallace, P. (2002). Paper-based versus computer-based assessment: Key factors
associated with the test mode effect. British Journal of Educational Technology, 33(5), 593-602.
Choi, I. C., Kim, K. S., & Boo, J. (2003). Comparability of a paper-based language test and a
computer-based language test. Language Testing, 20(3), 295-320.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the
multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81.
Cohen, A. (1984). On taking tests: what the students report. Language Testing, 1, 70-81.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis for field
settings. New York: Rand McNally.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological
Bulletin, 52(4), 281.
Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd
ed., pp. 443–507). Washington, DC: American Council on Education.
Durndell, A., & Lightbody, P. (1993). Gender and computing: change over time? Computers &
Education, 21(4), 331-336.
Eignor, D. (1999). Selected technical issues in the creation of computer-adaptive tests of second
language reading proficiency. In M. Chalhoub-Deville (Ed.), Issues in computer-adaptive testing of reading proficiency (pp. 167-181). UK: Cambridge University Press.
Embretson, S. (1983). Construct validity: Construct representation versus nomothetic span.
Psychological Bulletin, 93, 179-197.
Fulcher, G. (1999). Computerizing an English language placement test. ELT Journal, 53(4), 289-
Geissler, J. E., & Horridge, P. (1993). University students’ computer knowledge and
commitment to learning. Journal of Research on Computing in Education, 25(3), 347-365.
Geranpayeh, A., & Kunnan, A. J. (2007). Differential item functioning in terms of
age in the Certificate in Advanced English examination. Language Assessment Quarterly, 4(2), 190-22.
Hambleton, R. K. (1984). Validating the test score. In R. A. Berk (Ed.), A guide to criterion-
referenced test construction (pp. 199-230). Baltimore: Johns Hopkins University Press.
Kirsch, I., Jamieson, J., Taylor, C., & Eignor, D. (1998). Computer familiarity among TOEFL
examinees. ETS Research Report Series, 1998(1), i-23.
Lee, J. A. (1986). The effects of past computer experience on computerized aptitude test
performance. Educational and Psychological Measurement, 46(3), 727-733.
Lennon, R. T. (1956). Assumptions underlying the use of content validity. Educational and
Psychological Measurement, 16, 294-304.
Loevinger, J. (1957). Objective tests as instruments of psychological theory: Monograph
Supplement 9. Psychological Reports, 3(3), 635-694.
Loyd, B. H., & Gressard, C. (1984). Reliability and factorial validity of computer attitude
scales. Educational and Psychological Measurement, 44(2), 501-505.
Messick, S. (1975). The standard problem: Meaning and values in measurement and
evaluation. American Psychologist, 30(10), 955.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35(11),
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-
103). New York: Macmillan.
Messick, S. (1993). Foundations of validity: Meaning and consequences in psychological
assessment. ETS Research Report Series, 1993(2), i-18.
Odo, D. M. (2012). Computer familiarity and test performance on a computer-based cloze ESL
reading assessment. Teaching English with Technology, 12(3), 18-35.
Rahman, M. A. (2011). Teacher educators' attitudes towards computer: Perspective
Rezaee, A., & Salehi, M. (2008). The construct validity of a language proficiency test: A multitrait multimethod approach. TELL, 2(8), 93-110.
Popham, W. J. (1978). Criterion-referenced measurement. Englewood Cliffs, NJ: Prentice-Hall.
Salehi, M. (2011). On the construct validity of the reading section of the University of Tehran
English Proficiency Test. Journal of English Language Teaching and Learning, Faculty of Literature and Humanities of Tabriz University, 22, 129-160.
Salehi, M. (2012). The construct validity of a test: A triangulation of approaches.
Language Testing in Asia. https://doi.org/10.1186/2229-0443-2-2-102.
Salehi, M., & Bagheri Sanjareh, H. (2013). On the comparability of C-test and cloze test: A
verbal protocol approach. English for Specific Purposes World, 14(39).
Salehi, M., & Tayebi, A. (2012). Differential item functioning in terms of gender in the
reading sub-section of a high-stakes test. Iranian Journal of Applied Language Studies, 4(1), 135-168.
Schmidt, F. L., Hunter, J. E., Pearlman, K., Hirsh, H. R., Sackett, P. R., Schmitt, N., ... &
Zedeck, S. (1985). Forty questions about validity generalization and meta-analysis. Personnel Psychology, 38(4), 697-798.
Shulman, L. S. (1970). Reconstruction of educational research. Review of Educational Research,
Taylor, C., Kirsch, I., Jamieson, J., & Eignor, D. (1999). Examining the relationship between
computer familiarity and performance on computer-based language tasks. Language Learning, 49(2), 219-274.
Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Houndmills,
England: Palgrave Macmillan.
Weir, C., O’Sullivan, B., Jin, Y., & Bax, S. (2007). Does the computer make a difference?
Reaction of candidates to a computer-based versus a traditional hand-written form of the IELTS Writing component: effects and impact. IELTS Research Report, 7, 311-347.
Winter, S. J., Chudoba, K. M., & Gutek, B. A. (1998). Attitudes toward computers: when do they
predict computer use? Information & Management, 34(5), 275-284.