Peer Evaluation of Teaching
Thor A. Hansen, Western Washington University

There
has been a lot of discussion at Western recently about peer evaluation of faculty
teaching, which most faculty recognize as a needed balance and supplement to student
course evaluations. Indeed, a great deal has been written about such
evaluation in the assessment literature, although I do not intend to review that
literature here. Instead I will describe the system that the Geology Department
has recently implemented, and the thinking behind it, as a kind of nuts-and-bolts
example of one way that peer evaluation could work at Western.

Goals for Peer Evaluation

It
is important to recognize that there are two purposes behind faculty teaching
evaluation:

1) Accountability, in which a judgment is made about the teaching ability
of the professor, which in turn informs some authority (e.g., T&P committee,
Dean, Provost, state legislature) that then acts on this information; and
2) Assessment, in which the professor's abilities are gauged for the purpose
of self-correction and improvement. [1]

Though
differing in purpose, accountability and assessment should have a similar ultimate
goal, that of improving teaching ability; after all, if not to ensure we have
excellent teachers, why have such measures? Unfortunately, gathering information
for the purposes of accountability and assessment can have very different effects
on the person being evaluated. Let's consider, for example, this scenario (somewhat
extreme, though based on fact) where accountability, not assessment, is the primary
goal:

Professor
Smith is up for promotion next year. His colleague, Professor Bones, visits his
class for the purpose of forming a judgment about Smith's teaching abilities.
Smith sees Bones enter the classroom and knows that he is about to be "graded"
on his teaching. Professor Bones has never visited Smith's class before and will
likely never visit it again. Smith's teaching abilities are on the line now; it
is make or break. He had planned to try a new discussion technique in class today,
one he read about in a teaching journal. Flustered by this sudden change in events,
Smith fumbles his way through the discussion. The students sense his apprehension
and say virtually nothing during the discussion. Bones scribbles a few notes and
leaves halfway through. At the conclusion of the dismal class, Smith returns to
his office and scans the want ads in the Chronicle.

In
this example, the visitor's primary purpose is to make a judgment. The relationship
between Smith and Bones is somewhat like that of jobseeker and interviewer, except
in this case Smith already has the job, and the purpose of the visitor is to see
if he keeps it. The person being reviewed has little reason to welcome the visitor
and will probably be nervous during the class. The fact that Professor Bones makes
only one visit means that he will get a small and possibly biased sample of Smith's
abilities.

While more classroom visits might reduce the bias in Bones' sample, I suggest that
the fundamental relationship between the observer and the observed be changed,
from one of accountability to one of assessment. Rather than a culture where
teaching is a private endeavor, we should
strive to make it public; rather than an environment where members of a department
sit in judgment, we should strive to create an atmosphere where faculty avidly
seek out colleagues for pedagogical discussion and advice, where classroom visits
by fellow faculty are frequent and welcome. Yes, accountability will always be
present; we still need to make tenure decisions and at some point each faculty
member must pass judgment on others. But there are ways to ameliorate this adversarial
component while encouraging support. Such a culture rests on two pillars: 1) non-judgmental
feedback, and 2) frequent and multiple modes of assessment.

A
good parallel for this model can be found in what we consider good practice in
the teaching of writing. It is well known that student writing improves most quickly
if students are given frequent assignments and receive ungraded
comments on drafts. The grade is given only when the final report is handed in.
In this type of class it behooves the student to write the first draft as soon
as possible and go through as many revisions as possible, with input from the
instructor, in order to turn in the best final product on which the grade is based.

I
would also like to comment on the value of student evaluations in general. There
are those who think that student evaluations are basically worthless; that they
are governed mainly by how "popular" or easy a professor is. Some of
these people advocate eliminating student evaluations altogether. Personally, I
can't imagine not asking students how they felt about a course. Even in my introductory
courses, which most students, as science-phobes, would rather avoid, my primary
objectives are that students like taking my course, learn how science works, and
come to enjoy thinking in a scientific manner, perhaps even being inspired. If
most of the students don't like the course by the end of the quarter, then I have
failed.

Many
times I have heard the "statistic" that student evaluations are correlated
with grades, the implication being that easy-grading instructors get better evaluations.
Yet there have been over 1300 articles and books published which contain research
on the topic of student ratings, and when the data is synthesized it clearly indicates
that students "who receive higher course grades do not give higher course
ratings." [2] Another comment I often hear is that
students didn't like a particular course because it was "too rigorous."
Yet again, when the numerous data sources are synthesized, students "do not
give lower ratings to difficult or challenging courses that require a heavy work
load." [3] Actually, just the opposite is true:
a recent study of Western course evaluations found that students responded positively
to challenging courses. [4] Moreover, data synthesized
from national studies indicate that "students' overall ratings of course
quality and teaching effectiveness correlate positively with how much they actually
learn in the course (as measured by their performance on standardized final exams)."
[5] In my personal experience I have seen many very
rigorous yet popular professors. For example, one professor in the Geology Department
teaches a series of courses that are highly quantitative in nature and require
copious amounts of difficult homework. I regularly see crowds of students in the
lounge, calculators in hand, conferring over their problem sets for these courses; if
not exactly "enjoying" themselves, they are definitely fully engaged.
Yet in spite of their level of difficulty (and the fact that grade averages for
these courses are at or below the averages for the department), students flock
to these courses and give this professor among the highest course evaluations
in the department. Clearly, factors other than grades and ease of coursework are
at work here.

We can mimic the draft-and-revise practice used in teaching writing for faculty
development by creating a system of regular
evaluations of "drafts": for example, visits to classes by reviewers
who make observations, take notes, then review their observations with the instructor.
The reviewer would give a copy of their comments to the instructor only (because
this evaluation is primarily for self-review) and, at their discretion, keep a
copy for themselves. A reviewer's primary questions would be "What can this
person do to improve?" and "Is this person making progress?" Ideally,
over the course of a year several classroom visits would be made. The year-end
and tenure evaluations would be independent and separate from the classroom evaluations
but would be informed by the observations made during the year(s). At the point
when judgments must be made, the questions informing the case would be: "How
good a teacher is this person now?", "What is the potential for this
person in terms of teaching?", and ultimately, "Is this person good
enough to tenure?" In this system, there is an incentive for the probationary
professor to invite faculty into their classes and seek feedback.

It
is also important to vary the objects of evaluation. There are many sources of
information on teaching abilities besides classroom visits. The standard source,
of course, is student evaluations. These are fine as far as they go, but I have
found them, as presently constructed, to be a relatively undiscerning tool. Using
the standard evaluation, I can tell when my students like my class and when they
don't. But the machine-graded questions are far too general (and in some cases
useless) to inform my teaching, while the questions that elicit comments (What
did you like? What needs improvement?) do not provide the kind of reflective thought
I need for feedback. In order to get comprehensive feedback on my specific learning
outcomes and on the teaching techniques I employed, I append customized questions
to the standard evaluation form. For instance, I hand out and discuss a list of
course objectives and learning outcomes at the beginning of each course. When
I evaluate the course, I attach this list and ask each student to rate their improvement
on each item. I also ask specific questions about the efficacy of new teaching
techniques. With these directed questions, I get much fuller and more thoughtful
comments than the usual "Great (or lousy) course!" My experience with
these sorts of evaluations has convinced me that students are very discerning
and astute commentators if they are asked the right questions.

Other
sources for information on teaching include such course materials as syllabi,
exams, project assignments, etc. Online materials, too, especially those that
include multimedia and interactive components, can give us excellent insight into
the effort that is put into teaching. In the Geology Department we have an irregular
forum where one or more faculty demonstrate a teaching technique that they have
developed that they find particularly useful. These presentations are an excellent
low-pressure vehicle for demonstrating creativity in teaching methods. Interviews
with students, particularly graduate students, are also important, because they
touch on aspects that may not be reported in standard evaluations. Moreover, graduate
student interviews are useful for understanding a professor's abilities as a mentor.

As a way of bringing all their expertise together, faculty could assemble a teaching
portfolio, which would provide not only a place for assembling their materials
but, more importantly, a context for explaining and/or demonstrating their teaching
philosophy. Indeed, the power of teaching portfolios came to the attention of
the Geology Department during job searches conducted in the last four years. Our
position announcement demanded demonstration of both research and teaching expertise.
Those applicants who submitted a teaching portfolio along with their student
evaluations stood out from the crowd because 1) they cared enough about teaching
to create a teaching portfolio, and 2) the portfolio assembled their teaching
materials and philosophy into a coherent whole.

At
this point, having presented some alternative modes of peer teaching evaluations,
let's revisit the scenario presented earlier involving Professors Smith and Bones.
This time, rather than an adversarial approach to peer evaluation, let's imagine
that their department has embraced a peer evaluation model based on non-judgmental
feedback and the improvement of teaching, that is, on the idea of assessment rather than
accountability:

Professor
Bones enters Smith's classroom. Smith looks up and says "Ah, Bones! So happy
you're joining us! Today, I'm trying a new discussion technique and I would be
most interested in your feedback." At Smith's invitation, Bones has visited
his classes twice before. (Moreover, Smith's classes have been visited by two
other faculty at different times of the year.) Bones has also read Smith's teaching
portfolio and understands his interesting though sometimes unorthodox approach
to teaching. Bones has already had one discussion with Smith regarding his observations.
Smith has also given a short departmental presentation on an innovative classroom
demonstration he developed. On this day, Smith handles the class discussion moderately
well. Afterward Smith and Bones confer about the class and both agree that the
new technique has merit but could be improved by letting the students work in
groups for a few minutes. Smith looks forward to trying this idea out. When Professor
Bones writes the tenure evaluation for Smith, he has a file of observations from
multiple sources from which to draw and is aware of Smith's progress and potential.

Clearly,
this scenario is strikingly different from the first. For one thing, it follows
the "best practices" outlined in this essay by having multiple modes
of input and frequent observations. For another, it is rooted in an atmosphere
of trust: the understanding that the visitor is there to help and not to judge.

Granted, the first example, representing the accountability model, is somewhat extreme.
Nevertheless, it has been my experience that teaching and teaching evaluation in
most departments at most schools tend more towards that end of the spectrum.
In the accountability model, teaching is generally done in isolation with little
outside feedback. Student evaluations, when performed, are confidential and read
only by the professor until it is time for the annual evaluation. Student evaluations
are generally the only means of assessment, so there is pressure to make sure
they are high. If the professor is lucky enough to have had relatively high student
evaluations, there is now a disincentive to try new teaching techniques for fear
of lowering those scores. Worse, if the professor has low scores, there is incentive
to hide this fact and perhaps stop giving evaluations altogether.

At the other end of the spectrum, in the assessment model, the emphasis is on improvement
and self-correction, on collegiality and teaching creatively. Putting
this model into place can transform the atmosphere for peer evaluations from one
of wariness and skepticism to one of trust, can transform nerve-wracking stress
into meaningful hard work.

Conclusion

Most
importantly, we must accept the fact that there are many kinds of teaching with
different audiences and that even if the ideal system for improving teaching were
in place, not all faculty would excel in all modes of teaching. Large non-major
introductory courses require different teaching skills than those needed for mentoring
graduate students. I am an outstanding undergraduate classroom teacher, but I
am only moderately successful as a graduate mentor. Likewise, members of my department
display a wide variety of strengths. One, an only adequate large lecture teacher,
attracts graduate students and upper division undergraduates like bees to honey,
involving them in an endless variety of independent research projects. Another
professor is particularly gifted in teaching field classes. It cannot be stated
forcefully enough that teaching is not one size fits all; indeed, for a department
to have real strength, we need all types of teaching expertise. When making course
assignments, the trick lies in playing to an instructor's strengths while at the
same time trying to improve areas of weakness. Importantly, a teacher's varied
dimensions need to be recognized and appreciated by those who make tenure decisions.
Otherwise we run the risk of selecting teachers who score well on standard student
evaluations (such as those in large undergraduate classes) and neglecting those
whose strength lies in the role of mentor.

Finally,
all this talk about a less stressful and more meaningful peer evaluation model
is well and good, but where do we find the time for it? Although Geology's teaching
evaluation system contains all of these components, and generally occurs in a
positive and supportive atmosphere, it is by no means clockwork. Our classroom
visits tend to cluster in the quarter before evaluations are due, and the winter
and spring quarters prior to tenure applications see a flurry of professors, sometimes
two or three at a time, visiting others' classrooms. But when duty calls we respond,
and those probationary faculty who are up for review can be sure of having at
least three faculty visits in the quarter prior to their evaluation.

Clearly,
however, for peer evaluation techniques to change universally, the challenge would
indeed be one of incorporating changes systematically. Like anything new, there
would be transitional issues to address, and certainly not all instructors would
wildly embrace the changes. [6] Yet there is clearly
a need to make peer evaluations more meaningful if they are to have continuing
influence on hiring and tenure decisions.

* * *

1. Frye, Richard (February, 1999). Assessment, Accountability, and Student Learning Outcomes (Dialogue, Issue No. 2). Office of Institutional Assessment and Testing. Western Washington University, Bellingham, WA.
2. Cuseo, Joe (October, 2000). Evaluating New-Student Seminars & Other First-Year Courses via Course-Evaluation Surveys: Research-Based Recommendations Regarding Instrument Construction & Administration, Data Analysis, Data Summary, & Reporting Results. http://www.brevard.edu/fyc/fya/CuseoLink.htm
3. Ibid.
4. Frye, Richard (April, 2001). Teaching Evaluation Summary: An Analysis of Fall, 2000 Course Evaluations. (Unpublished.) Office of Institutional Assessment and Testing. Western Washington University, Bellingham, WA.
5. Ibid., Cuseo, Joe, et al.
6. As an example: when Western changed its phone system in the late 1980s, one professor, so accustomed to the old ways, refused to direct dial long-distance calls, but rather continued to have the department secretary make such connections for him. He retired having never made such a call from his office.

Published by Office of Institutional Assessment and Testing: Dr. Joseph E. Trimble, Director; Gary R. McKinney, General Editor.
Technical assistance by Center for Instructional Innovation: Dr. Kris Bulcroft, Director; Web Design by Karen Casto.
For copies of Dialogue, OIAT technical reports, Focus Research Summaries,
or InfoFacts, please contact Gary McKinney, Western Washington University, MS:
9010, Bellingham, WA 98225. Telephone: (360) 650-3409. FAX: (360) 650-6893. E-mail:
garyr@cc.wwu.edu. TTY: (800) 833-6388.
Join in discussions of Dialogue issues on the web at: http://www.ac.wwu.edu/~dialogue.