Harvard Macy Institute

Variation is good. In fact, we want to see variation in evaluations so that we can differentiate between learners. Learners can also grow from receiving specific, actionable feedback. After all, not every learner performs the same way. The goal, however, is to home in on the variation that is truly due to learner performance and not other causes. For example, we want to minimize or account for variation due to differences in learning environments, evaluators, or available support, because factors like these result in inequities in evaluations. Unfortunately, despite efforts to reduce unwanted variation, substantial parts of performance evaluations and grades are not explained by learner performance alone.

Not surprisingly, the largest contribution to variation in grades is usually attributed to the evaluators. Our Office of Assessment and Evaluations uses methods such as generalizability and decision studies to help us understand these differences, including intra-rater and inter-rater biases. Although these techniques might help us identify and potentially minimize the impact of these biases on evaluations and grades, they do not provide solutions for counteracting them. Our focus should be on all potential barriers to fair performance evaluation, but we should start by further investigating the barriers that unfairly impact learners because of their race, gender, disability, or sexual orientation.

To have equity in assessment, all learners must have an opportunity to learn, be evaluated, and be graded without any negative consequences from implicit or explicit bias from their peers and supervisors or from the negative effects of their learning environment. Such differences in the learning experience could have significant short- and long-term consequences for a medical student or resident. For example, it has been documented that medical trainees who identify as part of an underrepresented minority in medicine (URiM) may receive lower clerkship grades or evaluation scores compared to their non-URiM peers. In these studies, differences were not explained by observations of performance. In the short term, these negative impacts on evaluations could certainly affect the learner’s self-confidence, self-efficacy, and overall wellbeing. In the long term, they could affect who receives awards or honors (e.g., induction into honors societies), which, in turn, could lower the trainee’s chances of matching into a residency or fellowship program or obtaining a job.

We often refer to a framework to recognize areas of potential inequities within the assessment process. The framework's authors propose three components of equity in the assessment process. Bias can be introduced and then potentially compounded in each of these components.

Intrinsic equity (equity of the program of assessment and individual assessment tools): The language in assessment tools themselves can introduce gender and race bias. For example, if the language used to describe an “excellent” performance uses more masculine terms, trainees who identify as women may be less likely to be rated “excellent.” Similarly, the language used to describe learners in their narrative evaluations has been shown to differ by race and gender.
Contextual equity (fairness in the learning experience and the environment in which the assessment is conducted): Trainees have identified that certain learning environments may not allow them to learn or perform at their best.
Instrumental equity (how the results are shared with and used by the stakeholders): Schools or programs can also introduce bias in terms of how they interpret or use the results of evaluations. For example, most medical knowledge tests have been designed to assess minimal competency. However, schools and programs may use them to rank trainees. Using a result in a way it was not intended can either introduce additional bias or compound the bias in the original instrument.

We introduce this framework for considering equity in assessment because it also provides a scaffold for discussing potential strategies to reduce inequities in assessment while simultaneously reducing systemic oppression. It is not enough to adjust for potential biases of the individual evaluators or rotation. We must also embrace a shared mental model that all of our learners deserve a fair and safe environment in which to learn and be assessed, and we must recognize non-performance-based differences as manifestations of bias. In order to address these biases, we must look at each component of the assessment process to ensure that (i) our instruments are fair; (ii) we train faculty to evaluate learners appropriately and correct for faculty contributions to the variation in grades; (iii) our learning environments provide equitable opportunities for learning where each learner performs at their best; and (iv) we use data in a responsible manner without further compounding bias. And we must measure and track these efforts.

What can you do at your institution to start improving the equity of your assessment and evaluation process?

Did you know that the Harvard Macy Institute Community Blog has had more than 300 posts? Previous blog posts have explored topics including core EPAs for entering residency, maintaining academic integrity during remote proctoring, and systems of assessment in educational settings.

C. Jessica Dine, MD MSPH (Educators ’19, Leaders ’22) is Associate Professor in the Division of Pulmonary, Allergy, and Critical Care and Associate Dean of Assessment, Evaluation and Medical Education Research for the Perelman School of Medicine. Her education role focuses on the development of robust evaluation systems and professional development with a particular focus on career theory. Jessica can be followed on Twitter or contacted via email.

Horace M. DeLisser, MD is Associate Professor in the Division of Pulmonary, Allergy, and Critical Care and Associate Dean for Diversity and Inclusion at the Perelman School of Medicine. His areas of professional interest include social medicine, cultural competency, doctor-patient interactions, medical ethics, end-of-life issues, and religion and spirituality. Horace can be contacted via email.

HMI Staff

Addressing Evaluation Inequities