Richard Marr
Jul 27, 2018 · 12 min read

Hiring can be a hot mess of bust-up traditions, psychometric snake oil, technotopian crypto-babble and downright quackery… but I’m still optimistic about its future. In this post I’ll try and summarise the four main problem areas using a rather pedestrian apocalypse metaphor (that I’m secretly quite proud of).


There are plenty of reasons to feel downbeat about hiring right now, and you wouldn’t be alone; articles like “The Decline of Recruiting” and Abbie’s “The Future of Job Hunting is a Black Mirror Hellscape You Can’t Avoid” aren’t exactly rays of sunshine. Their negative view is totally understandable, but I think there are some important factors that might make the future slightly rosier.

Since the problems seemed to fall into four areas I’m existentially obliged to roll out a full blown Revelations metaphor. I’ll take you through each of the four horsemen and explain the pain points, but then why it might not be so bad after all.

Excuse me while I enjoy this. And no, this isn’t juvenile at all. Shut up.

Figure 1. Detail of “Four Horsemen of the Apocalypse”, Viktor Vasnetsov, 1887. Vasnetsov now has a minor planet named after him, one of the 213 discovered by Lyudmila Zhuravlyova, which is nice. This was the painting that clued me into the fact that our ill-tempered equestrian provocateurs are a lot more ambiguous than pop culture pretends, for example, yonder fella with the gold hat has been interpreted as either Pestilence, or War, or as Christ himself. The format they pioneered, referred to in occult circles as “four harbingers, one hat”, was an early influence for modern woe-heralds The Monkees.

Horseman 1: Sweat

Looking for a job has always been time consuming and emotionally draining. In addition to maintaining CVs, talking to recruiters, job searches, screening calls, phone interviews, regular interviews, negotiation, rejection, and all the waiting around, lots of companies are adopting new and burdensome testing techniques, sometimes taking days.

A few examples of sweat

  • Creating and submitting recorded video. Even just 5 minutes of footage requires preparation, a quiet location, spending time on appearance, and likely re-take after re-take until people are happy with their video.
  • Numeracy tests, reasoning tests, cognitive ability tests, personality tests, psychometric tests and situational judgement tests. These typically take 30 to 60 minutes, and can be pretty stressful.
  • Technical tests, or even projects to be completed at home. Even if they say “just spend a couple of hours on it” the context demands otherwise. People spend days on this stuff.

People are changing jobs more frequently, so the new stream of tasks, tests, and evaluations starts to resemble the chap in figure 2 below:

Figure 2. King Sisyphus, Johann Vogel, 1649. The Greek god Zeus condemned King Sisyphus to eternally receive Privacy Policy update emails over and over again… driving him to endlessly roll a giant boulder over his Samsung Galaxy S7 to try and silence the notifications, not realising that model phone is a curse sent by Hades and cannot be destroyed by mortal man. Vogel used Sisyphus as a symbol for continuing senseless war, and that metaphor seems to work for modern job hunting too, but with fewer boulders usually.

Knowing which candidates are most likely to do well is a huge competitive advantage, so it’s understandable that organisations are looking for better ways to assess them, but there’s a trade-off to be made between asking candidates to demonstrate ability and respecting people’s time, labour, and stress levels.

But there are a few rays of sunshine

Below are a few pressures that act on companies to limit the work they can get away with asking candidates to do.

  • Being quick to offer is an advantage. While employers are starting to move away from “time to hire” as their primary success metric in the direction of quality of hire, it will never go away entirely. Candidates will often accept their first good offer.
  • Hiring brand. Companies care about how people perceive them, even rejected candidates. In 2016 it was ranked #2 of all issues concerning HR folk. The desire to be liked puts pressure on companies to minimise sweat. This is especially true for companies with a consumer brand or hiring in small talent pools like tech.
  • Less sweat elsewhere. While there aren’t always net reductions in time and effort for candidates, some of these assessments replace rather than add to existing processes, e.g. replacing CVs/résumés. If people no longer need to maintain and customise a CV for every job that’s a good thing. Similarly companies asking for recorded video submissions may reduce the number of on-site interviews meaning less travel, days off, etc.
  • Status updates and feedback. Since some of these techniques build structured data about the performance of candidates, that data can be communicated back to those candidates, so that they can learn from the process rather than just being given a flat “no” or whatever diluted safe-speak a recruiter is prepared to give them.
  • Reject earlier. The best of these tests improve on the ability of existing processes to predict ability (if they don’t then frankly what’s the point). Where there are such improvements, fewer people’s time would be wasted pulling them into interviews for jobs they’re not going to get.

Horseman 2: Bias

While many vintage hiring techniques are already steeped in hand-made artisanal bias, we also need to be extremely wary of biases introduced or persisted by new assessment techniques.

Figure 3. “The Bible”, John Huston, 1966. In this epic tale of bias, original troll Abraham tells Isaac & Ishmael about the time he told King Abimelech that his wife Sarah was his sister and God had to intervene to stop Abimelech marrying her. Sarah was like “LOL”. Abraham later decided that Isaac would inherit his board game collection, including his limited edition Settlers of Catan expansion, and that Ishmael had to live in the shed.

Some examples of modern bias

Not all of the following are strictly new, but they’re certainly relevant.

  • Amplifying existing biases. Take recorded video interviews as an example. No bias is inherently removed by video as a format, although some providers do claim less bias via machine learning. Attractiveness bias is already a problem, but (I suspect) amplified in those who know how to present themselves well on camera… an Instagramisation of hiring. Then there will likely be an additional demographic bias caused by better access to time, quiet space, equipment, and differing degrees of entitlement.
  • Numeracy testing. Leading test providers in this space see skew towards men of between 8% and 25%. There’s no such severe skew in higher education or in the population as a whole, which (barring some strange differential self-selection) suggests that some incidental non-mathematical detail of the tests somehow triggers this performance gap.
  • Ineffective bias training. US corporates spend $8bn annually training their staff to combat bias, but studies of the efficacy of that training show it to be of mixed value, and suggest it can even have a detrimental effect.
  • Machine Learning. Training a machine requires lots of high quality unbiased data and rigorous training methods, which is hard, especially when so many data sources (e.g. CVs) are full of bias triggers and only weakly predictive of ability. Such systems require checks against bias that are as strong as (or stronger than) those applied to human decision makers.
  • Keeping the weakest links. If teams’ final decisions are biased, then it’s a waste of time fixing the earlier stages of the hiring funnel. For example, a recent study found that shortlists that included a single woman or minority had a near-zero chance of hiring them. But those that included two were 80 times more likely to hire one of them, even after accounting for the increased number.
  • Making the weakest links even weaker. Daniel Effron’s work on moral licensing indicates that giving people a reason to feel virtuous prior to a decision can license them to behave in a more biased way than they otherwise would. In hiring, this means that de-biasing some steps of the process but failing to put all decisions under adequate scrutiny may result in worse decisions.
  • CV blinding. Hiding names does half a job. To properly de-bias a CV you also need to remove schools, addresses, company names, even indications of age (such as dates, and numbers of past jobs). Then you have bias inherent in the content itself; people with a lucky start to their career (eg. a good school, or the right connection) will be able to capitalise on that luck, and their CVs will inherit the hiring bias of each organisation they pass through, further reducing the predictive signal strength of that CV.
  • Imperfect bias correction. Some assessments claim reduction or removal of demographic bias, but in practice only try to correct for it. Rather than operating a bias-free assessment, they plot candidates against a known performance curve for people in similar demographic categories. Since the correction algorithm will contain assumptions and rely on finite data sets, this approach can be made to look unbiased, but will never actually be unbiased. It’s just harder to see the pattern.

Reasons to be more optimistic about bias

All that bias sounds pretty bad, and it is, but we’re starting to get better at measuring it and countering it. Maybe machine learning isn’t a silver bullet, but here are some things that can help:

  • Bias is error. Being swayed by non-predictive information is by definition an error, so companies that correctly avoid it will naturally build better teams. Companies are already starting to catch on to this idea, creating a positive push away from bias-prone hiring methods.
  • Other companies’ bias is an opportunity. The presence of bias error in legacy ‘best practice’ means that there are currently untapped pools of talent that can be accessed by savvy organisations. There’s profit in bypassing the pitfalls other employers are tripping over. For example, high performance companies are starting to drop educational attainment as a filter because it’s not a strong predictor of ability; they lose too many great people. So there’s an opportunity to find similarly great people who are being overlooked by other hirers.
  • Things will settle down. This space is currently going through an unusually rapid transition, so the tools and processes we’re seeing today won’t be the same as the processes we’ll see in 10 years. I’m too old to be an idealist, but I still think that transitions like this tend to result in net improvements to ‘best practice’.
  • Options are already available. Most (or all?) types of bias that affect hiring have techniques and tools that can counter or mitigate them, and it’s not difficult to measure each stage of a hiring process, allowing you to monitor and adapt it. That process of constant measurement, trial and error, and adaptation is what will really solve bias. Another name for that process is science.
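
As a rough illustration of what measuring each stage might look like, here’s a minimal Python sketch. Everything in it is an assumption made for the example: the stage names, the group labels, the numbers, and the helper functions `make_records`, `pass_rates` and `rate_ratio` are all hypothetical, and a real process would need far more data and more careful statistics. It simply counts how often each group passes each stage, then compares the lowest pass rate with the highest.

```python
from collections import defaultdict


def make_records(stage, group, passed, total):
    """Build hypothetical (stage, group, passed?) records for the example."""
    return [(stage, group, i < passed) for i in range(total)]


# Made-up data: stage names, group labels and counts are illustrative only.
RECORDS = (
    make_records("cv_screen", "group_a", 8, 10)
    + make_records("cv_screen", "group_b", 4, 10)
    + make_records("interview", "group_a", 4, 8)
    + make_records("interview", "group_b", 2, 4)
)


def pass_rates(records):
    """Return {stage: {group: pass_rate}} from (stage, group, passed) records."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [passed, total]
    for stage, group, passed in records:
        tally = counts[stage][group]
        tally[0] += int(passed)
        tally[1] += 1
    return {
        stage: {group: p / t for group, (p, t) in groups.items()}
        for stage, groups in counts.items()
    }


def rate_ratio(rates_by_group):
    """Lowest pass rate divided by the highest; 1.0 means no gap between groups."""
    highest = max(rates_by_group.values())
    lowest = min(rates_by_group.values())
    return lowest / highest if highest > 0 else 1.0


for stage, rates in pass_rates(RECORDS).items():
    print(stage, rates, "ratio:", round(rate_ratio(rates), 2))
```

With this made-up data the gap shows up at the CV screen rather than the interview, which is the kind of pointer that tells you which stage to experiment on first.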

Horseman 3: Snooping

Like the insurance sector, recruitment attempts to predict the future. It looks at your past and your present, and tries to guess what your future looks like. Like insurance, there’s always more data about you that hirers want to see, something that might reveal a red flag, or give them an edge. Privacy is eroded, little by little.

A few examples of snooping

  • Prospective employers asking for Facebook passwords.
  • Companies that use machine learning to examine your social media presence and make predictions about your personality, ability, and your team fit.
  • If recent research makes the obvious jump to hiring, machines will soon be able to watch your interview video and use your eye movements to predict your personality traits. These traits, such as diligence and agreeableness, are often linked to aspects of workplace performance, so supposedly give insights employers might want.
  • Personality testing in general feels a bit questionable to be honest. “Personality”, like “culture”, is a word that covers a multitude of things that we don’t adequately understand, and some personality traits are heritable.

Figure 4. Detail from Jeremy Bentham’s design for the Panopticon Penitentiary, drawn by Willey Reveley, 1791. A single guard could observe the activity of all inmates, who could not themselves tell whether they were being observed at any given moment, like Love Island but where they watch us. The inability to observe the solitary prison guard became problematic when it was discovered that they were using that privacy to perform shameful acts, such as watching Mrs Brown’s Boys on BBC iPlayer. Several Panopticons have existed over the years, but the last of them was finally defeated by Optimus Prime in 1987.

Most people will be in the workforce for a good portion of their lives, so that adds up to quite a large pile of privacy invasion.

(As an aside, my local council is switching from weekly bin collections to every other week, which is likely to make my street smell like nappies and rotten eggs. If any insurance or recruitment companies want to collect my bins in exchange for rummaging rights then Mondays are good for me. Ta.)

Reasons employers might get less Snoopy

So, yeah… you get this by now.

  • Snooping can result in low quality data. Not all data was created equal. Companies want to hire better, and the most predictive indicators of good hires are closely related to the actual tasks performed in the job. Aside from being patchy, the predictive power of ‘clues’ found on Facebook will be zero, or near-zero.
  • Snooping risks adding bias. Looking at social media can trigger bias, both in a demographic sense and in terms of all the other ways humans can be misled, e.g. if LinkedIn shows a candidate has a mutual contact that you don’t get on with, or if you discover a potential hire is a Tottenham supporter. Even using specialist job-related snooping is problematic. For example not all devs have a GitHub account, and of those that do only 7% push code more than 10 times per year. That 7% are hugely skewed towards white men.
  • Competition & hiring brand. Just like in the previous two areas, these elements work in job seekers’ favour. While they’re no panacea, most companies have a hard time attracting and retaining talent. Laying the groundwork for employee resentment before they even join is a poor strategy.
  • Some tools help job-seekers snoop back. While some privacy concerns are absolute, some are relative. Tools that swing the power balance back from employers and empower job-seekers may help. LinkedIn lets you stalk employees at a company to see who you can get an intro to, or how their CVs compare to yours. Glassdoor lets you listen to a company’s disgruntled employees, and sometimes CEOs in disguise saying that things aren’t so bad after all.

Horseman 4: Hubris

Two full decades after Schmidt & Hunter (1998) published their landmark meta-study, plenty of folks are still basing decisions off the low-predictivity data from CVs and unstructured interviews, and it looks like plenty of other questionable techniques are jumping into the ring too.

In a world full of evidence, failure to look for that evidence is an indication of false confidence, so some of the blame here lands with CEOs, HR, and hiring managers… but there are also vendors out there selling systems and training making claims that are inadequately supported by evidence.

Figure 5. Illustration for Paradise Lost, Gustave Doré, 1866. In John Milton’s 1667 tale of biblical hubris, Paradise Lost, Lucifer is punished for copying God’s act of creation by being cast into Tartarus, where he met his daughter and things got bleak. John… u okay hon? Debate among theologians suggests the world would be a better place if God had just given Lucifer an ASBO, an ankle tag, and a taxpayer funded holiday to Magaluf.

Some of the hubris and quackery in hiring

  • Basing decisions on CVs, despite the low predictive power and the large number of triggers for bias.
  • Myers-Briggs has no basis in science. It’s astrology for business. Just stop.
  • Continued use of training techniques that have actually been shown to have no effect, or even a detrimental effect.
  • Failure to instrument hiring processes. In product design, you measure what you care about and those metrics help shape your decisions. If you care about your hiring funnel (i.e. your competitive advantage) then you instrument it to understand where it needs improvement.
  • Continued use of “gut” and “culture fit” in hiring, when evidence shows these to be problematic.
  • Failure to de-bias performance data and promotion decisions.
  • Failure to link that performance data to hiring practices. After all, how can you improve hiring if you don’t figure out what worked well and what didn’t?
  • Unstructured interviews; there’s just no excuse for that.
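
To make the “link performance data to hiring” point a little more concrete, here’s a small Python sketch that checks how well each hiring stage’s score tracks later performance. All of it is assumed for illustration: the field names (`cv_score`, `work_sample_score`, `performance`), the five made-up hires, and the helper `stage_predictiveness` are hypothetical, and five data points prove nothing. The only real technique used is a plain Pearson correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Made-up hires: scores given during hiring, plus a later performance rating.
HIRES = [
    {"cv_score": 3, "work_sample_score": 2, "performance": 1},
    {"cv_score": 1, "work_sample_score": 3, "performance": 2},
    {"cv_score": 4, "work_sample_score": 3, "performance": 3},
    {"cv_score": 2, "work_sample_score": 5, "performance": 4},
    {"cv_score": 3, "work_sample_score": 5, "performance": 5},
]


def stage_predictiveness(hires, stage_fields, outcome_field):
    """Correlate each stage's score with the later outcome, per stage."""
    outcome = [h[outcome_field] for h in hires]
    return {
        field: pearson([h[field] for h in hires], outcome)
        for field in stage_fields
    }


result = stage_predictiveness(HIRES, ["cv_score", "work_sample_score"], "performance")
for field, r in sorted(result.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{field}: r = {r:.2f}")
```

In this invented data set the work sample score tracks performance far more closely than the CV score does. Real data may well look different, which is exactly why it’s worth measuring.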

It might not be such a pile of quack

Aww, this is the last one. I’m getting a bit teary.

  • Pressure to report hiring data. Silicon Valley is starting to compete on transparency to attract talent from under-represented groups. Here are a few examples from Google, Apple, Slack, and Facebook. Similarly, in the UK companies of over 250 people are now legally required to report Gender Pay Gap data.
  • There’s more and more science on this. Looking at published studies can help unpick what might be going on in any given organisation. There are issues with replication, some studies get under-publicised or over-interpreted, but we’re learning more and more.
  • The usual suspects. As I’ve mentioned before, the market for this stuff is currently transitioning and things will settle down. As long as the free market does its job properly the new ‘best practices’ we settle on will be a compromise between predictive power, labour required, privacy and bias reduction… because that’s how to hire good teams.

Now what?

These horsemen are a pain but they’re not unstoppable, and we as a society get to choose what to do next.

We could wait for companies to fix the problem, but even well-designed markets have imperfections. In this case, buying decisions for tools and training are made by companies, not job-seekers, and the companies doing the buying have varying degrees of knowledge, enthusiasm, and inertia. That’s a market in which the candidate’s priorities will come second, since providers of recruitment tools will typically compete to impress the buyer.

“If you’re not the customer, you’re the product.”

I’m actively trying to address some of this stuff in my day job, but it’d be hubris to assume that we can change the entire sector. That’s why I think it’s worth writing about how things could be, and how we might get there… to raise expectations amongst HR folk and job-seekers alike, and gently put pressure on the industry to up its game.

On that basis I’ll follow this up with a manifesto* of sorts, to provide some positive concrete steps that employers and service providers can take in order to take things in a better direction…

Maybe nobody else will ever read it, and that’s fine, I’ll use it to guide our product roadmap.


* If you want to comment on (or contribute to) an early draft please get in touch.


Richard Marr is co-founder and CTO of Applied, a SaaS platform that increases hiring precision and reduces bias. While he talks big, he’s actually full of crap, and rather spuriously claims to have invented A/B testing.

Finding Needles in Haystacks

The ongoing story of Applied, a team obsessed with using science to make workplaces fairer and more efficient by removing hiring bias and replacing it with things more predictive of potential. (photo: Shibuya Crossing © Joshua Damasio)

Thanks to Andrew Babbage

Written by Richard Marr

Occasional outrageous claims. Moderate or good. Founded @BeApplied, @HeyGuevara. Formerly @Justgiving. Invented A/B testing.
