The Excellence Paradox: Selecting for excellence is self-defeating
Competitive research processes create many perverse incentives that put the researcher and the research process at odds. For example, because of selective publication pressures, researchers primarily end up publishing work that yields significant or novel results, despite longstanding evidence that this form of publication bias drastically reduces the validity of findings.
Another perverse incentive is the notion of “excellence” in a competitive landscape. It seems logical: select the best to get the best. That it seems logical does not make it consistent with reality. In this blog post I outline why selecting for excellence is self-defeating: you do not end up with the best by trying to select for the best.
By most practical definitions, excellence is reserved for the few. The excellent are the cream of the crop. As such, excellence is most often defined relatively. Consider this scenario: if there are 100 people, the top ten performers could be considered the excellent ones.[1]
Beyond being an alleged means to select the best, selecting for excellence also bakes in exclusivity, scarcity, and inflated value. Even if, hypothetically, every single one of the 100 people could be considered excellent to some degree, selecting for excellence often means only the top remain — in this case the top ten. Ninety people would be excluded, the work they could produce is removed from the equation, and the value of the remaining ten rises because of the lower supply.
By selecting for excellence, we create a form of professional Darwinism. This is self-perpetuating, because absolute performance does not matter. Even if you were considered one of the top ten performers five years ago, you may not make the cut today. No matter how good your absolute performance is, you need to keep proving your relative worth. This means that selecting for excellence not only bakes in exclusivity, scarcity, and inflated value; it also creates the ambiguous condition where your previous “excellence” is no guarantee for the future, in essence gaslighting you about how good you actually are. This goes strongly against the mental health advice not to compare your performance with others’, as doing so can be (severely) detrimental.[2]
Naive realism
At bottom, selecting for excellence adheres to a naive realist philosophy: the results you get are 100% accurate. There are no false positives and no false negatives. Because somebody gets selected as excellent, they become, by definition, excellent. In this world, what we observe becomes real, instead of our observations trying to reflect the real.
Fact of the matter is, none of the methods to assess relative excellence are that good. Even if we had a method with which to unambiguously rank order people on the skill in question, we would still face the problem that we do not know the distribution of what we are selecting on. If there is a large gap between the so-called “excellent” and the “non-excellent,” selection might work. If skill increases continuously as we go up “the excellence hierarchy,” simple measurement error will cause us to make mistakes. A person could be “more excellent” than someone else, but the measurement error on that occasion could make them appear “less excellent”[3] and miss the cut. Arbitrary cut-offs produce arbitrary results.
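To see how quickly measurement error scrambles a ranking near a cut-off, consider a minimal simulation. The skill values and noise level below are illustrative assumptions of mine, not estimates from any real assessment:

```python
import random

random.seed(1)

# Illustrative assumption: candidate A is genuinely more skilled than
# candidate B, but each assessment carries measurement noise.
true_skill_a, true_skill_b = 0.72, 0.70
noise_sd = 0.05  # measurement error on any single occasion

trials = 100_000
flips = sum(
    random.gauss(true_skill_b, noise_sd) > random.gauss(true_skill_a, noise_sd)
    for _ in range(trials)
)

print(f"B outranks A in {flips / trials:.0%} of assessments")
# Roughly 39%: the skill gap of 0.02 is small relative to the noise,
# so near a cut-off the "less excellent" candidate frequently makes the cut.
```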
Base rate fallacy
Which brings us to the fundamental theoretical problem of selecting for excellence. At its core, it is the base rate fallacy, caused by the low prevalence of excellence: if only the top 10% are considered excellent, the prevalence is 10%.
Let us consider a very optimistic scenario. In our decisions, we only have 1% false positives (α; “non-excellent” designated “excellent”) and 90% true positives (1−β; “excellent” designated “excellent”). That would result in a positive predictive value (PPV), the chance that an “excellent” designation is correct, of 91%. I calculate below using N = 100.
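Using the standard formula for the positive predictive value, the arithmetic with N = 100 runs:

```latex
\begin{aligned}
\mathrm{TP}  &= (1-\beta)\cdot \mathrm{prevalence}\cdot N = 0.90 \times 0.10 \times 100 = 9 \\
\mathrm{FP}  &= \alpha \cdot (1-\mathrm{prevalence})\cdot N = 0.01 \times 0.90 \times 100 = 0.9 \\
\mathrm{PPV} &= \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}} = \frac{9}{9+0.9} \approx 0.91
\end{aligned}
```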
In a visual representation, that could look like this:
You could say, not too bad — but remember, this is the best case scenario! Even in the best case scenario, just shy of 1 out of 10 “excellent” designations is wrong, and people will not get the career opportunities they supposedly deserve under the excellence framework. This can have all kinds of downstream consequences (housing, relationships, mental health). Again: this is the best scenario I will be offering here!
We can switch out the rates in our scenario quite easily to consider less optimistic scenarios. If we take the conventional error rate for false positives, often set at 5%, only about 2 out of 3 people selected as excellent would be correctly selected.
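The calculation is easy to play with. Here is a minimal sketch of the PPV formula used throughout; the function and its names are mine, for illustration:

```python
def ppv(prevalence: float, alpha: float, power: float = 0.90) -> float:
    """P(truly excellent | selected as excellent), per the base rate formula."""
    true_positives = power * prevalence          # excellent, selected
    false_positives = alpha * (1 - prevalence)   # non-excellent, selected anyway
    return true_positives / (true_positives + false_positives)

print(f"{ppv(prevalence=0.10, alpha=0.01):.2f}")  # best case: 0.91
print(f"{ppv(prevalence=0.10, alpha=0.05):.2f}")  # 5% false positives: 0.67
```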
Which in visual terms could look like this:
Again, this means that people’s livelihoods depend on a measure that fails 1 out of 3 times. And this is still a rather optimistic scenario — we can get more pessimistic yet, in terms of selection pressures.
Imagine a world ten years down the line. Researchers still want to get tenure. The government has kept research funds at parity and has not substantively increased them to increase capacity. However, the number of researchers seeking tenure has grown, because of the larger number of graduates. As a result, competition has increased, and only the top 1% is considered excellent, instead of the top 10%.
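Assuming the false positive rate stays at 5% and the true positive rate at 90%, the arithmetic behind this scenario (again with N = 100) would be:

```latex
\begin{aligned}
\mathrm{TP}  &= 0.90 \times 0.01 \times 100 = 0.9 \\
\mathrm{FP}  &= 0.05 \times 0.99 \times 100 = 4.95 \\
\mathrm{PPV} &= \frac{0.9}{0.9+4.95} \approx 0.15
\end{aligned}
```

Only about 15% of those selected as excellent would be the genuine article.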
When looking at the visuals, it seems a fruitless exercise, with more mistakes than anything else.
In this scenario, selecting for excellence fails for 17 out of 20 researchers (85%). Selecting for excellence fails 17 out of 20 people’s lives and livelihoods. These are people who may have moved to another country for this opportunity, who may face all kinds of circumstances that add stress to their lives. Would it not be unacceptable to treat this as a reliable mechanism for making career- and life-altering decisions? Unethical?
Of course, we can never know the reality of this excellence mechanism. I call it naive realism; others may simply say it is realism. All probability theory tells us is that even in the best case scenarios selecting for excellence fails; in worse cases it fails miserably — especially as excellence becomes an ever more exclusive concept.
What to do?
It could be straightforward: reduce selection pressures. Of course university budgets matter; of course the number of graduate students matters. So many things matter, but at the end of the day all of these feed into the selection pressures, and we must reduce them.
Even if we do not reduce the pressures, we must at least keep capacity at parity. If available resources (whether funding, journal space, or otherwise) do not grow at least at the same rate as the pool of candidates, selection pressures will only increase.
Better than keeping things at parity is to widen the pool of people we consider excellent. We would be far better off selecting for what is good enough, rather than for what is excellent. What if good enough were the top 30%, instead of the excellent top 10%? We would approach the best case scenario very quickly, with a positive predictive value of 89%.
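With the same assumed 5% false positive rate and 90% true positive rate as before:

```latex
\begin{aligned}
\mathrm{TP}  &= 0.90 \times 0.30 \times 100 = 27 \\
\mathrm{FP}  &= 0.05 \times 0.70 \times 100 = 3.5 \\
\mathrm{PPV} &= \frac{27}{27+3.5} \approx 0.89
\end{aligned}
```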
A practice like selecting for excellence exists outside academia too: it is called “rank and yank.” It caused much trouble for companies like Microsoft over the years, until Microsoft discontinued it in 2013. Yet in academia, ranking and yanking thrives at every level, to everybody’s discontent and to the failure of its own purpose.
In a future blog post I will evaluate journal rejection rates in light of the excellence paradox. Stay tuned!
Absolute excellence can exist too, if the rule defining excellence is strictly specified. Because in most research situations excellence is highly ambiguous, I consider relative excellence the primary form of excellence that is of interest. ↩︎
Someone is always going to be better than you in one specific thing, even if they are not performing as well as you on all the things you do! ↩︎
Talking about excellence makes very little sense to me, as I notice in writing this. ↩︎
Sorry for the screenshots. I tried embedding the formulae directly and it got real messy 😔 If anybody knows how to properly embed LaTeX in Ghost CMS, I'd love to hear! ↩︎