What were you studying and why is it important?
I recently learned that a disproportionate number of women in academic medicine
end up leaving their positions and even academia altogether. Although there are
likely multiple reasons for this, our long-term research goal is to determine
whether gender bias hinders women in academic medicine from obtaining
leadership positions or being promoted. This study is one of the first steps in
that longer research plan. We wanted to determine whether trainees use
different language to describe their male and female attending physicians.
What did your research reveal?
We found that the overall numeric ratings of teaching effectiveness did not
differ between genders. However, trainees did use different language to
describe their male and female attendings. For example, trainees were much more
likely to use the words “warm” or “model physician” for female attendings and
“master” or “master clinician” to describe their male attendings.
What do you think these findings mean?
We found language differences in the free-text comments: trainees use different
words to describe their male and female attending physicians. Even when the
attendings were deemed equally capable (as determined by their numeric scores),
language differences persisted. At first glance, it may not seem important that
the words used are different since the numeric scores are the same. However,
these words could potentially impact whether a physician is hired or promoted
based on the desired qualifications for the job or role. The next step in this
research will be to assess whether these differences do indeed carry forward
and impact such important decisions.
Were you surprised by anything you found or anything that happened during the research process?
One of the unique things about this particular research project is the use of
natural language processing, specifically text mining of the narrative comments
in evaluations. Similar gender-based differences in the language used to
describe men and women have been documented in other fields. Therefore, we
attempted to use the same lexicons to mine for gender-based language
differences. It turns out that these lexicons do not apply in medicine; we tend
to use different language in medicine to evaluate each other. Although I was not
entirely surprised by this, I was surprised that the gender-based differences
were not the same as those documented in other fields.
How do you hope these findings will be used?
We will now assess whether these language differences impact high-stakes
decisions such as hiring for leadership positions or promotions. I hope that
these findings will help us narrow gender gaps in academic medicine. In
addition, it is my personal view that we will not be able to eliminate any type
of bias. Although we have not yet demonstrated that these word differences are
due to bias, I suspect that they are. I hope that this kind of research will
create safety checks that determine whether any biases have been corrected for
when making high-stakes hiring or promotion decisions.
What impact does this research have on medical students and faculty?
I hope that as we learn about gender-based differences in the way we evaluate faculty (and there is no reason to believe that this does not occur for medical students or residents as well), we will start developing mechanisms to counteract or correct for such bias in any high-stakes decision-making that incorporates such evaluations.
Read the full study: “Assessment of Gender-Based Linguistic Differences in Physician Trainee Evaluations of Medical Faculty Using Automated Text Mining”