On the Validity of Reading Assessments: Relationships Between Teacher Judgements, External Tests and Pupil Self-Assessments

Stefan Johansson (Institutionen för pedagogik och specialpedagogik)
Gothenburg studies in educational sciences / Acta Universitatis Gothoburgensis, ISSN 0436-1121
University of Gothenburg
Kjell Härnqvistsalen, Pedagogen Hus A
Prof. Astrid Pettersson, PRIM-gruppen/Institutionen för matematikämnets och naturvetenskapsämnenas didaktik, Stockholms universitet
The purpose of this thesis is to examine validity issues in different forms of assessment in Swedish primary schools: teacher judgements, external tests, and pupil self-assessments. The data were selected from a large-scale study, PIRLS 2001, in which more than 11,000 pupils and some 700 teachers from grades 3 and 4 participated. The primary method used in the secondary analyses to investigate the validity of the assessment forms is multilevel Structural Equation Modeling (SEM) with latent variables. An argument-based approach to validity was adopted, in which possible weaknesses in the assessment forms were addressed. A fairly high degree of correspondence between teacher judgements and test results was found within classrooms: a correlation of .65 was obtained for 3rd graders, a finding well in line with results documented in previous research. Grade 3 teachers’ judgements correlated more highly with test results than those of grade 4 teachers. The longer period of time spent with the pupils, as well as the teachers’ different education, were suggested as plausible explanations. Gender and socioeconomic status (SES) of the pupils showed a significant effect on the teacher judgements, in that girls and pupils with higher SES received higher judgements from teachers than test results accounted for. Teachers with higher levels of formal competence were shown to have pupils with higher achievement levels, as measured by both teacher judgements and PIRLS test results. Furthermore, higher correspondence between judgements and test results was demonstrated for teachers with higher levels of competence. Comparing achievement across classrooms on the basis of teachers’ judgements was shown to be problematic: the judgements reflected different achievement levels, even though test results indicated similar performance levels across classrooms.
Pupil self-assessments correlated slightly lower with both teacher judgements and test results than teacher judgements and test results did with each other. However, in spite of their young age, pupils assessed their knowledge and skills in the reading domain relatively well. No differences in self-assessments were found between pupils of different gender or SES. In summary, the studies concluded that all three forms of assessment have certain limitations. Strengths and weaknesses of the different assessment forms were discussed.
Educational sciences
Validity; Validation; Assessment; Teacher judgements; External tests; PIRLS 2001; Self-assessment; Multilevel models; Structural Equation Modeling; Socioeconomic status; Gender
2013-03-20 12:57

