Higher Grades = Higher Evaluations
 
 
"How’m I Doing?" by W. M. Williams &
S. J. Ceci, Change, Sept./Oct./1997, pp.13-23.

 

We also looked at the degree to which students’ performance in the course influenced their ratings of the instructor and the course. As expected on the basis of past studies by Abrami et al., Feldman, and Greenwald, there was a strong positive correlation between grade received and overall course rating (N = 468, r = .42, p < .0001). This value is close to the "grand mean" of .40 reported in the largest meta-analysis done to date, by Abrami et al. (Grade reported also predicted all of the other instructor and course ratings, a common halo-type effect; see Greenwald for a review.)
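As a rough illustration of the statistic being reported, the following minimal Python sketch shows how a grade-rating correlation of this kind is computed and tested. The data here are entirely hypothetical, simulated only to have roughly the reported correlation; scipy is an assumption of the sketch, not a tool used in the study.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 468                                   # sample size reported above
grades = rng.normal(3.0, 0.5, n)          # hypothetical course grades (4-point scale)
# hypothetical ratings constructed to correlate about .4 with grades
noise = rng.normal(0.0, np.sqrt(1 - 0.4 ** 2), n)
ratings = 0.4 * stats.zscore(grades) + noise

r, p = stats.pearsonr(grades, ratings)    # Pearson correlation and two-sided p value
print(f"r = {r:.2f}, p = {p:.1e}")        # expect r near .4, p well below .0001

With a sample of 468, a correlation of .42 lies far beyond the p < .0001 threshold quoted above.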

Despite all of the limitations of a naturalistic case study, our modest study nevertheless shows that student ratings are far from the bias-free indicators of instructor effectiveness that many have touted them to be. Moreover, student ratings can make or break the careers of instructors on grounds unrelated to objective measures of student learning, and on the basis of factors correctable with minor coaching.

 

 

Anthony G. Greenwald and Gerald M. Gillmore,
"No Pain, No Gain? The Importance of Measuring Course Workload in Student Ratings of Instruction,"
Journal of Educational Psychology, Vol. 89, No. 4, 1997, pp. 743-751.
 

Samples of about 200 undergraduate courses were investigated in each of 3 consecutive academic terms. Course survey forms assessed evaluative ratings, expected grades, and course workloads. A covariance structure model was developed in exploratory fashion for the 1st term’s data, and then successfully cross-validated in each of the next 2 terms. The 2 major features of the successful model were that (a) courses that gave higher grades were better liked (a positive path from expected grades to evaluative ratings), and (b) courses that gave higher grades had lighter workloads (a negative relation between expected grades and workload). These findings support the conclusion that instructors’ grading leniency influences ratings. This effect of grading leniency also importantly qualifies the standard interpretation that student ratings are relatively pure indicators of instructional quality.
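The two paths of the model can be illustrated with a small simulation. This is a sketch only: the path values below are hypothetical stand-ins chosen for illustration, not the coefficients estimated in the article.

import numpy as np

rng = np.random.default_rng(1)
n = 200                                     # roughly the number of courses per term
grades = rng.normal(0.0, 1.0, n)            # expected grades, standardized

# path (a): higher expected grades -> higher evaluative ratings (positive path)
ratings = 0.5 * grades + rng.normal(0.0, np.sqrt(1 - 0.5 ** 2), n)

# path (b): higher expected grades go with lighter workloads (negative relation)
workload = -0.4 * grades + rng.normal(0.0, np.sqrt(1 - 0.4 ** 2), n)

print(np.corrcoef(grades, ratings)[0, 1])   # about +0.5
print(np.corrcoef(grades, workload)[0, 1])  # about -0.4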

Previous convergent validation studies (reviewed by Abrami, Cohen, & d’Apollonia, 1988) have found correlations averaging approximately r = .40 between evaluative ratings and measures of achievement in multisection validity designs. In these studies multiple sections of the same course receive grades based on the same or similar examinations, thereby controlling grading criteria. The r = .40 convergent validity figure may be seen either as an underestimate, because error of measurement and restriction of range of measures can attenuate validity estimates (Cohen, 1981, p. 301), or as an overestimate, because of uncontrolled third variables that might inflate validity estimates (Marsh & Dunkin, 1992, p. 170). Using a perhaps optimistic estimate of 20% of variance in ratings explained by the desirable correlation of student ratings with achievement differences, together with an approximate 20% explained by grading leniency, one still is left with an unexplained 60% of variance in student ratings. Even if this remaining 60% of ratings variance is uncorrelated with achievement, it may still be correlated with desirable attitudinal outcomes of instruction, such as liking for the course’s subject and interest in further study.
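The variance arithmetic in this paragraph is worth making explicit: a convergent validity of r = .40 corresponds to r-squared = .16, about 16% of ratings variance, rounded up here to the "perhaps optimistic" 20%. A quick check:

r_validity = 0.40
exact_share = r_validity ** 2            # 0.16 of ratings variance explained by achievement
optimistic_share = 0.20                  # the "perhaps optimistic" figure used in the text
leniency_share = 0.20                    # approximate share attributed to grading leniency
unexplained = 1.0 - optimistic_share - leniency_share
print(exact_share, unexplained)          # 0.16, 0.6 -> the unexplained 60% discussed above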

 

 

Anthony G. Greenwald and Gerald M. Gillmore,
"Grading Leniency Is a Removable Contaminant of Student Ratings,"
Journal of Educational Psychology, Vol. 89, No. 4, 1997, pp. 1209-1216.
 

Conclusions

 

Yes, I Can Get Higher Ratings by Giving Higher Grades

 

Recall that this conclusion has been previously supported by experimental studies in which grading policies were manipulated in natural classroom settings (see also Greenwald, 1997, this issue). Figures 1 and 3 suggest that the magnitude of this effect corresponds to a standardized path coefficient as high as .50. In the context of the grading-leniency interpretation, this .50 figure means that in the population of courses included in the University of Washington data sets, changing from giving grades one standard deviation below the university mean to one standard deviation above should produce a one standard deviation change in one’s standing on the university’s student ratings. A standard-deviation change from, say, half a standard deviation below the university mean rating to half a standard deviation above would be a change from the university’s 31st percentile of instructors to the 69th percentile. Giving high grades, by itself, might not be sufficient to ensure high ratings. Nevertheless, if an instructor varied nothing between two course offerings other than grading policy, higher ratings would be expected in the more leniently graded course.
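The percentile figures follow from treating ratings as roughly normally distributed across instructors; that normality assumption, and the use of scipy’s normal CDF, belong to this sketch rather than to the article. A short check:

from scipy.stats import norm

# with a standardized path coefficient of .50, a 2-SD change in grades
# (from -1 SD to +1 SD) implies a 0.50 * 2 = 1 SD change in expected ratings
path = 0.50
ratings_shift_sd = path * 2.0               # 1.0 standard deviation

# a rating 0.5 SD below the mean vs. 0.5 SD above, under a normal distribution
print(round(100 * norm.cdf(-0.5)))          # 31 -> the 31st percentile
print(round(100 * norm.cdf(+0.5)))          # 69 -> the 69th percentile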

 

Comment by Greenwald (p. 1216)

 

For the past two decades, the dominant view among researchers of student ratings has been that ratings provide valid and substantially bias-free measures of teaching effectiveness. The wide acceptance of this view is indicated by the prevalence of research efforts directed at establishing convergent validity, relative to ones directed at pursuing discriminant-validity criticisms.

It is all too easy for someone who had no role in constructing the dominant view to register dissatisfaction with it. With that confession of perspective made in advance, I confess further to some amazement at the respect accorded to this dominant view in the state-of-the-art articles by Marsh and Roche (1997, this issue), d’Apollonia and Abrami (1997, this issue), and McKeachie (1997, this issue) in the Current Issues section. These scholars, like many others, appear not to be disconcerted by convergent-validity findings that typically report correlations of .40 or less with nonratings indicators of teaching effectiveness (e.g., ratings typically explain only about 15% of the variance in achievement measures). I am similarly puzzled by the wariness with which these scholars and others treat research findings that, to my reading, indicate that ratings measures can be unfair. I have in mind not only evidence concerning grading leniency (with which this article is primarily concerned) but also evidence that students (in making ratings judgments) mistake enthusiastic teaching style for effective teaching (Abrami, Leventhal, & Perry, 1982) and that being assigned to teach large classes lowers one’s ratings (we have very reliably found this effect at the University of Washington). Our program of research at the University of Washington definitely aims to upset the dominant view.

 

Comment by Gillmore (p. 1216)

 

Take two courses that differ only in that students in one expect higher grades than those in the other. Although (by assumption) students in both courses learn the same amount and receive the same quality of instruction, our reading of the research evidence indicates the former course will receive generally higher evaluative ratings. Even excellent teachers whose outstanding pedagogy leads to high student achievement will receive elevated ratings if their students expect very high grades rather than just high grades. We advocate ratings adjustments for such situations.

Student instructional ratings provide data for three distinct functions: personnel decisions, instructional improvement, and information to students. It is in achieving fairness in personnel decisions that adjustments for grades, class size, and perhaps other variables are potentially most useful and justifiable. We do not think that teaching careers should be injured when faculty take on the difficult task of teaching large sections or when they uphold strict grading standards. The question is not whether adjustments will turn an imperfect measure into a perfect one but rather whether adjustments can improve decisions that must make use of a necessarily imperfect measure.
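Gillmore does not give an adjustment formula here, but one simple version of the idea is a regression adjustment: partial the expected-grade and class-size components out of the raw ratings and compare instructors on the residuals. The sketch below uses invented data, effect sizes, and variable names, and is only one way such an adjustment might be implemented, not the procedure used at the University of Washington.

import numpy as np

rng = np.random.default_rng(2)
n_courses = 200
expected_grade = rng.normal(3.2, 0.4, n_courses)     # hypothetical mean expected grade per course
class_size = rng.lognormal(3.5, 0.8, n_courses)      # hypothetical enrollments
raw_rating = (3.8
              + 0.5 * (expected_grade - 3.2)          # leniency effect to be removed
              - 0.002 * (class_size - class_size.mean())
              + rng.normal(0.0, 0.3, n_courses))      # everything else, incl. teaching quality

# ordinary least squares on: intercept, expected grade, class size
X = np.column_stack([np.ones(n_courses), expected_grade, class_size])
beta, *_ = np.linalg.lstsq(X, raw_rating, rcond=None)

# adjusted rating = residual, re-centered on the overall mean rating
adjusted = raw_rating - X @ beta + raw_rating.mean()
print(beta)           # fitted intercept and slopes for grade and size
print(adjusted[:5])   # ratings with grade and class-size components removed

Under this scheme, an instructor is compared with what their rating would be expected to be given their grading and class size, so strict graders and teachers of large sections are no longer penalized by those factors alone.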

 



"The classroom -- not the trench -- is the frontier of freedom now and forevermore." 
  
                                                - Lyndon Baines Johnson 

Society for a Return to Academic Standards 
  
State Legislators Must Act Promptly 
 



 
 

Last Updated: 24 April 1998