The systemic decay of tech hiring
Complaining about tech interviews is the favorite pastime of software engineers. We know they're broken, we've seen exactly how they're broken, but, despite a decade of collective hand-waving, we haven't fixed them. In fact, we haven't even understood why they're broken: "interviewers are stupid" is a tempting explanation, but most people I've worked with are quite smart, so it can't be the only reason.
I stepped into a hiring manager's shoes just as the job market was descending into AI panic. I've run over a hundred interviews, made a few dozen hires (with a few epic fails), and eventually led a redesign of our process for the AI reality. After months of experiments and reflection, I think I see how this works.
In this article, we'll trace how an industry full of smart, well-intentioned engineers created the lunacy of the modern tech interview. This story has it all: chance and fate, fear and bravery, good people crushed by a system that's set up to degrade. I might not have an easy fix, but at least we'll understand the mechanics of this death loop. Let's go!
Hiring is not deterministic
We engineers expect actions to have a predictable outcome. Assign a variable in code, beep bop, the value in memory changes. Cause and effect. Naturally, we want an algorithm for hiring that reliably separates good engineers from bad ones. Nice as it would be, this is not a realistic expectation.
Imagine: I'm a good engineer, your interview process is great, you hire me. One month in, I decide that work is not really my thing, and I won't try that hard any more. The process was fine, but it failed to predict my performance, because the future is inherently uncertain.
Hiring is, in essence, a binary classification problem: based on some observations, predict whether a candidate will perform well on our project. Apart from the good outcomes, we have two types of errors:
- False positive: we hire a candidate who turns out really bad at the actual job.
- False negative: we don't hire a candidate who would have been a great fit.
With enough hires, we're statistically certain to make both types of errors. But their effects are drastically different.
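To put a number on "statistically certain", here's a back-of-the-envelope sketch. The error rates are made up for illustration, and the decisions are assumed to be independent, which real hiring decisions aren't quite.

```python
def prob_at_least_one(error_rate, decisions):
    """Chance that at least one of `decisions` independent decisions is an error."""
    return 1 - (1 - error_rate) ** decisions


# Hypothetical per-decision error rates, chosen only for illustration.
FALSE_POSITIVE_RATE = 0.05  # we hire someone who turns out bad
FALSE_NEGATIVE_RATE = 0.20  # we reject someone who would have been great

for n in (1, 10, 50, 100):
    fp = prob_at_least_one(FALSE_POSITIVE_RATE, n)
    fn = prob_at_least_one(FALSE_NEGATIVE_RATE, n)
    print(f"{n:>3} decisions: P(some false positive) = {fp:.3f}, "
          f"P(some false negative) = {fn:.3f}")
```

Even with a modest 5% false positive rate, a hundred decisions make at least one bad hire all but guaranteed.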
The error asymmetry
A bad hire (false positive) is costly and visible. On top of the wasted hiring budget, a truly bad engineer creates security risks, damages architecture, and brings down production, all in a day's work. That's the kind of mistake that gets everyone's attention and causes unpleasant questions along the lines of "Vladimir, what were you thinking hiring that guy?"
Missed good hires (false negatives), on the other hand, are practically invisible. Usually, the rejected candidate just vanishes. In the rare case we accidentally learn they turned out to be a great success somewhere else, it's easy to shake it off — just because they're a great fit over there, doesn't mean they would've been any good over here. We might get some extra time-to-hire and interview load, but this is nothing compared to the costs of a bad hire, right?
So, the conventional wisdom is "false positives, avoid at all costs; false negatives, fine". After a bad hire, we'd run a postmortem to understand what went wrong, and how to avoid all this trouble in the future. Focusing on false positives collapses the hiring problem into one dimension: if a low performer sneaked through, the bar was too low, and we must raise it by adding complexity. This makes tech interviews progressively harder. But is harder always better?
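Here's a toy simulation of what raising the bar does to both error types. Everything in it is an assumption made for illustration: skill is a single number drawn from a normal distribution, "good" means above-average skill, and the interview score is skill plus random noise.

```python
import random


def count_errors(bar, noise_sd=1.0, candidates=10_000, seed=42):
    """Return (false_positives, false_negatives) for a given hiring bar.

    Toy model: skill ~ N(0, 1), interview score = skill + N(0, noise_sd).
    A candidate is "good" if skill > 0 and is hired if score > bar.
    The fixed seed means every bar is tested on the same candidates.
    """
    rng = random.Random(seed)
    false_positives = false_negatives = 0
    for _ in range(candidates):
        skill = rng.gauss(0, 1)
        score = skill + rng.gauss(0, noise_sd)
        good, hired = skill > 0, score > bar
        if hired and not good:
            false_positives += 1
        elif good and not hired:
            false_negatives += 1
    return false_positives, false_negatives


for bar in (0.0, 1.0, 2.0):
    fp, fn = count_errors(bar)
    print(f"bar={bar}: {fp} bad hires, {fn} missed good hires")
```

In this model, every step up in the bar trades a shrinking number of bad hires for a growing pile of missed good ones. The bar only moves us along the trade-off; it does nothing about the noise that causes it.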
The tale of two complexities
Welcome to our software engineering interview. Here, a surprise awaits: to pass, you must win a game of chess against Grandmaster Anatoly Karpov, former World Chess Champion. This kind of interview will be very hard, but would it be good at selecting productive software engineers? Probably not, because chess is very loosely related to software engineering.
While real-world interviews rarely involve chess (I wouldn't mind if they occasionally did), the core principle is clear. Interview complexity has two parts:
- Relevant complexity predicts the candidate's performance.
- Incidental complexity randomly eliminates candidates by unrelated criteria.
Both kinds make the interview harder, but only the relevant produces useful signals. Incidental complexity produces random noise, and is essentially a time-consuming equivalent of making hiring decisions through dice rolls.
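The dice roll comparison can be made concrete with some funnel arithmetic. This is a sketch under strong assumptions: one relevant stage that passes good and bad candidates at different rates, plus some number of purely incidental stages that pass everyone with the same probability. All the numbers are hypothetical, and real stages are a mix of both kinds.

```python
def funnel(good, bad, pass_good, pass_bad, dice_pass, dice_stages):
    """Expected hires after one relevant stage and several incidental ones.

    The relevant stage passes good and bad candidates at different rates.
    Each incidental stage passes anyone with probability `dice_pass`,
    regardless of skill, like a dice roll.
    Returns (good_hires, bad_hires, share_of_good_among_hires).
    """
    survive_dice = dice_pass ** dice_stages
    good_hires = good * pass_good * survive_dice
    bad_hires = bad * pass_bad * survive_dice
    share_good = good_hires / (good_hires + bad_hires)
    return good_hires, bad_hires, share_good


for stages in (0, 1, 3):
    g, b, share = funnel(good=100, bad=100, pass_good=0.8, pass_bad=0.2,
                         dice_pass=0.7, dice_stages=stages)
    print(f"{stages} incidental stages: {g:.1f} good hires, "
          f"{b:.1f} bad hires, {share:.0%} of hires are good")
```

With these numbers, the share of good engineers among hires stays at 80% no matter how many incidental stages we add, while the number of good hires drops from 80 to about 27 after three of them. The interview got harder and the hires didn't get any better.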
"Got it, don't involve chess in the interview and I'll be fine," I hear you say. But in practice, isolating relevant complexity can be tricky. Say we're hiring for a realtime dashboard team, and we add a pretty relevant topic: WebSockets. Are we filtering on genuine experience, or on recent exposure to the technology? Does current WebSocket knowledge really predict the overall performance, say, 3 months in? Maybe we've made the right call, maybe we're playing chess; probably a bit of both.
So, we're stuck piling up complexity based on our gut feeling. Every new topic inevitably adds some incidental complexity, causing more false negatives. This is bad enough for a single team, but once we zoom out to the organization level, things escalate.
Scaling the noise
All across the company, hiring managers are ramping up complexity. At some point, the inefficiency and inconsistency become obvious: some teams go so hard they spend all their bandwidth on endless interviews and rejections; others devolve into low-performer silos. In a noble attempt to balance the complexity, spread the load evenly, and cancel out individual biases, we introduce a shared interviewer pool composed of engineers from different teams. When done naively, this makes things worse.
Say, a candidate writes all the code in a single huge function. Interviewer A thinks this rushed slop demonstrates a lack of decomposition skills. To interviewer B, this is pragmatic speed — the candidate prioritized delivering the product over architectural meandering. When hiring for their respective teams, both interviewers may be right. But now the context needed to ground this subjective decision is gone. If the interviewers swap candidates, we get two rejections. When both interview any candidate, at least one is likely to call a "no hire", because their preferences don't intersect. If the interviewers blindly act on each other's "green flag", they get inept candidates. Everybody loses.
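The single-function standoff fits in a few lines of code. The candidates, their names, and the two boolean "interviewers" below are a deliberately crude, hypothetical model: each interviewer approves exactly one coding style.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    style: str  # "decomposed" or "single_function"


def interviewer_a(candidate):
    """Interviewer A wants to see decomposition skills."""
    return candidate.style == "decomposed"


def interviewer_b(candidate):
    """Interviewer B wants to see pragmatic speed."""
    return candidate.style == "single_function"


def unanimous(candidate):
    """Hire only if both interviewers approve."""
    return interviewer_a(candidate) and interviewer_b(candidate)


def any_green_flag(candidate):
    """Hire if either interviewer approves."""
    return interviewer_a(candidate) or interviewer_b(candidate)


candidates = [Candidate("Ana", "decomposed"), Candidate("Bo", "single_function")]

print([c.name for c in candidates if unanimous(c)])       # []
print([c.name for c in candidates if any_green_flag(c)])  # ['Ana', 'Bo']
```

Requiring agreement rejects everyone, and trusting any single approval lets everyone through, including candidates who'd be a poor fit for the team that actually gets them. Neither rule recovers the team context that made each interviewer's judgement meaningful.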
To counteract this, we standardize the process through a shared problem pool and scoring rubrics. A perfect rubric is both relevant and objective, but this turns into an impossible balancing act. Leaning into alignment too hard dumbs down the interview with easily measurable questions like "can list the eight JavaScript data types". Any space for nuance and judgement is also space for subjectivity and misalignment: we're stuck arguing whether the single huge function is a valid balance of speed and maintainability. Balancing between dumbery and misalignment, we get a bit of both.
The alienation
Sadly, misalignment is child's play compared to the structural collapse that lies ahead. A hiring manager fully responsible for their hiring is not a perfect setup, but it gets one thing right: process and outcome are owned by the same actor. The labor of interviewing, the benefits of a good hire, and the pain of a bad one are all yours. A shared interviewer pool fractures this incentive structure.
The individual interviewers get the accountability for a bad hire, but gain nothing from a good one. This leaves no incentive to make a more adventurous hire every now and then: when in doubt, just say "no". Group ownership of the hiring decision creates a perfect setup for the bystander effect. Facing a barely-there candidate who narrowly misses all the red flags, an interviewer might think: if they're really bad, one of the other five interviewers will surely reject them. Once all the interviewers act on this logic, the barely competent hire happily progresses.
The interplay between these effects results in a schizophrenic process that favors mediocrity. We reject "spiky" candidates who don't strictly fit our one-size-fits-all profile (say, a great systems engineer with subpar SQL). Meanwhile, borderline candidates get hired via the bystander effect.
Facing the full cost of running the interviews with a net-negative outcome, some interviewers grow disengaged. Our rubric, designed to help a motivated hiring manager stay objective, backfires. To a burnt-out cog in the hiring machine, it gives an easy way out: just mechanically check the boxes on our "green / red flag" list, and you'll be fine. Unknowingly, we've set things up for the final act.
The snake bites its tail
We've seen how the fear of bad hires makes interviews harder and more random, and ultimately cements them into a rigid process that's not owned by anyone in particular. But to close the feedback loop and see how this causes more bad hires, we must look at the other party — the candidates.
Faced with brutal multi-stage hiring processes with lots of incidental complexity and a significant chance of random rejection, candidates adapt by overfitting to the incidental complexity of the interviews (aka interview prep). Instead of mastering what you do best, you grind LeetCode problems or, even worse, practice cheating with AI. Instead of telling a messy war story, you polish a STAR advertisement with doctored success metrics.
This is a classic example of Goodhart's law: when a measure becomes a target, it ceases to be a good measure. Before you know it, you're hiring great interview takers instead of great engineers. A strict rubric additionally sets things up for a nightmare scenario. For top companies, it's very easy to find not just generic interview prep, but a crowdsourced list of specific tasks used in the interviews. Once that happens, the process decays into who's done their homework.
And behold, the snake has swallowed its tail: in an effort to avoid bad hires, we inadvertently set in motion the exact machinery that eventually made bad hires more likely. Time for some bar raising!
Now that I've laid out all the pieces of the puzzle, I should propose the fix. Well, here you go. To make no hiring mistakes, don't hire. To avoid challenges of growth, don't grow.
Once we break these rules, the snake starts eating its tail. We trade bad hires for random rejections. We trade nuance for alignment. We can make the snake eat slower, or pull out a few inches of the tail, but we can't save the snake, because it's driven by probability and the human mind. The best we can do is embrace the mess and keep at least some decency.
Tomorrow, I'll log into Zoom. I'll give the candidate the best problem we've got. I know it's not 100% relevant, but at least it's refreshing. The candidate won't cheat with AI because we explicitly allow and encourage AI-assisted solutions. If they hit any red flags on our rubric, I'll note this in my report. If they show exceptional promise in some unscored aspect, I'll note that too. At any rate, I'll be as friendly and engaged as I can, because even failing miserably at a random task doesn't mean you're a bad engineer. And even being a bad engineer doesn't mean you don't deserve respect.