Startups & Business | June 13, 2026 | 6 min read

Hiring Senior Engineers in 2026: The Trial Project That Beats the Whiteboard

Whiteboard interviews keep failing us on senior hires. Here's the paid trial project we run instead — what it costs, what it measures, and the failure modes to watch.

We've hired roughly forty engineers across the last six years, and the pattern is embarrassingly consistent: every bad senior hire came from an interview loop that leaned hard on whiteboard puzzles and "system design" theatre. Every great one came from a process where we watched them actually work. So a couple of years ago we ripped the loop apart and rebuilt it around a paid trial project. Here's how that works in 2026, what it costs us, and the parts we still get wrong.

Why the standard loop keeps failing on seniors

The canonical four-stage loop — recruiter screen, coding test, system design, behavioural — was designed by big-tech recruiting teams optimising for throughput at a 1% acceptance rate. Agencies and product startups don't have that problem. We have the opposite problem: a small pool of candidates we actually want, and a high cost per mis-hire because a senior engineer is touching client work or core architecture within two weeks.

The loop has three specific failure modes for senior hires:

It rewards interview athletes. People who grind LeetCode for a quarter look indistinguishable from people who can lead a delivery. They are not the same person.
It punishes pragmatists. The engineers we most want — the ones who'd push back on a bad ticket, refactor the smallest thing that unblocks the team, and ship — often interview like they're bored. Because they are.
It ignores collaboration. A senior hire's job is half code, half negotiation with PMs, clients, and juniors. A whiteboard tells you nothing about that.

LLM-assisted cheating made it worse. By 2024 we were watching candidates ace take-homes that they couldn't explain in the follow-up. By 2026, assume every async coding test is being co-piloted. That's fine — we want people who use AI tools well — but it means the test no longer measures what it used to.

The trial project, end to end

The replacement is a paid, scoped, two-to-four-day trial project that runs in the open. Candidates sign a short contractor agreement, get paid a fair day rate (we land somewhere between a junior contractor rate and a senior one, and frame it as an honorarium rather than market-rate consulting), and work on a real-but-isolated problem from our backlog.

The shape:

1. A real ticket, sandboxed

We keep a rotating pool of three or four "trial tickets" — small features or refactors pulled from internal tools, never from live client work. Examples:

Add a webhook retry layer with idempotency to an internal billing service
Build a small Next.js dashboard that consumes an existing GraphQL endpoint
Migrate a Postgres table with a tricky backfill, zero downtime

The ticket has a written brief, acceptance criteria, and deliberate ambiguity. We want to see what they ask.

2. A kickoff call, then async work

Thirty-minute kickoff with the hiring manager and one engineer from the team. We walk through the brief, answer questions, and explicitly tell the candidate: "Use whatever tools you want, including AI. We care about the result and your reasoning."

Then they work async over two to four calendar days (usually around 12–16 hours of actual effort — we ask them to log it honestly). They have Slack access to the hiring engineer for questions, the same way a new hire would on day one.

3. A walkthrough, not a defence

At the end, a 60-minute session where they walk us through:

What they built and why
What they cut and why
What they'd do with another week
Any code they leaned on AI for, and how they validated it

This is the part that replaces the system design round. It's vastly more honest, because it's grounded in something they actually built two days ago.

What we're actually measuring

The scorecard has four axes, weighted roughly equally:

Code quality on a real-world problem. Not algorithmic elegance — boring, readable, tested code with sensible boundaries.
Judgement under ambiguity. Did they ask the right questions? Did they cut the right corners? Did they over-engineer?
Communication. Commit messages, PR description, the walkthrough itself. Can they explain trade-offs to a non-technical PM?
AI fluency. Are they using assistants as a force multiplier, or are they pasting in output they don't understand?

Here's the rough rubric we use internally — feel free to steal it:

trial_scorecard:
  code_quality:
    weight: 25
    signals: [readability, test_coverage_intent, error_handling, boundaries]
  judgement:
    weight: 25
    signals: [scope_cuts, clarifying_questions, over_engineering_avoided]
  communication:
    weight: 25
    signals: [pr_description, commit_hygiene, walkthrough_clarity]
  ai_fluency:
    weight: 25
    signals: [tool_choice, verification_habits, can_defend_every_line]
  red_flags:
    - cannot_explain_own_code
    - silent_for_days_then_data_dump
    - ignored_acceptance_criteria
    - hostile_to_feedback_in_walkthrough

Two reviewers score independently before comparing. If the scores diverge by more than 20%, we talk it out before deciding.
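A minimal sketch of how that comparison could be scripted. It rests on two assumptions the rubric above does not spell out: each reviewer rates every axis on a 0 to 5 scale, and "diverge by more than 20%" means the two weighted totals differ by more than 20 points out of 100. The function names and the rating scale are hypothetical; only the weights come from the rubric.

```python
# Weights copied from the trial_scorecard rubric (they sum to 100).
WEIGHTS = {
    "code_quality": 25,
    "judgement": 25,
    "communication": 25,
    "ai_fluency": 25,
}
MAX_RATING = 5               # assumption: reviewers rate each axis 0-5
DIVERGENCE_THRESHOLD = 20.0  # assumption: "20%" read as 20 points out of 100


def weighted_total(ratings):
    """Turn per-axis ratings (0..MAX_RATING) into a 0-100 weighted score."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing axes: {sorted(missing)}")
    return sum(WEIGHTS[axis] * ratings[axis] / MAX_RATING for axis in WEIGHTS)


def needs_discussion(reviewer_a, reviewer_b):
    """True when two independent totals differ by more than the threshold."""
    gap = abs(weighted_total(reviewer_a) - weighted_total(reviewer_b))
    return gap > DIVERGENCE_THRESHOLD
```

Red flags are deliberately left out of the arithmetic here: they read better as a separate veto than as a number folded into the total.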

What it costs

This process is more expensive than a traditional loop, and we think that's the point — it forces us to only invite candidates we're already serious about.

Rough numbers from our last hiring cycle:

Trial honoraria: a few hundred to low four figures per candidate, depending on seniority and locale
Engineer time to set up the ticket: amortised, since we rotate the same tickets
Engineer time to review the work and run the walkthrough: about 3 hours per candidate

We only invite candidates to the trial after a screening call and one technical conversation. So we're paying for maybe two or three trials per hire, not ten. The total cost per hire is comparable to a recruiter fee on a single placement — and the mis-hire rate, in our experience, has dropped sharply.
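If you want to sanity-check that comparison against your own numbers, the arithmetic is simple enough to sketch. Everything here is an assumption for illustration: the function name, the parameters, and the choice to ignore ticket setup time (treated as amortised, as above). Plug in your own honorarium and internal engineer cost.

```python
def trial_cost_per_hire(trials_per_hire, honorarium, review_hours, engineer_hourly_cost):
    """Estimated trial spend for one hire: honoraria plus reviewer time.

    Ticket setup is excluded on the assumption that it is amortised
    across many candidates.
    """
    if trials_per_hire <= 0:
        raise ValueError("trials_per_hire must be positive")
    # Cost of running one trial: what the candidate is paid plus the
    # engineer hours spent on review and the walkthrough.
    per_trial = honorarium + review_hours * engineer_hourly_cost
    return trials_per_hire * per_trial
```

Compare the result with the recruiter fee you would otherwise pay on a single placement.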

The failure modes (we hit all of them)

Scoping the ticket too big

Our first version of this had a trial ticket that took us, the people who designed it, about a day. Candidates were spending three. We were filtering for free time, not skill. Now we time-box at 12–16 hours of effort and explicitly tell candidates to stop and document if they hit the cap.

Letting it turn into spec work

If the trial ticket is something you actually need shipped, you've crossed a line. The output has to be throwaway, or at minimum, something you'd happily pay for whether or not you hire the person. Otherwise you're exploiting candidates and the word gets around fast.

Hiring the best trial, not the best engineer

The trial is one signal. We've had candidates ace the trial and then bomb the team-fit conversation with a senior PM, and we passed on them. The trial replaces the technical theatre. It doesn't replace the rest of the loop.

Not paying enough

We started low and saw drop-off from exactly the candidates we wanted — the ones who already had jobs and didn't need a side gig. Pay enough that it reads as respect, not as bargain labour.

When this doesn't work

A few honest caveats. It doesn't work well for very junior hires — they don't have the context to navigate ambiguity, and the trial measures things they haven't learned yet. For juniors we still use a structured pairing exercise.

It also doesn't work in markets or jurisdictions where short paid contractor work is legally painful. Check with your accountant before rolling this out across borders.

And it's slow. Getting from first call to offer usually takes three to four weeks. If you're hiring against a competing offer with a one-week fuse, you'll lose. We've decided we're okay losing those.

Where we'd start

If you're running an agency or a small product team and your last two senior hires underwhelmed, don't redesign your whole interview loop this week. Do this instead:

Pick one small, real-but-non-critical ticket from your backlog. Write a one-page brief with acceptance criteria and one deliberate ambiguity.
Decide a fair day rate and a time cap. Put it in writing.
Run it on the next senior candidate who passes your screening call. Keep the rest of your loop intact for now.
Compare what you learned in the walkthrough against what you'd have learned in a whiteboard round. Be honest.

You'll know within one or two trials whether this is the right shape for your team. If you want help thinking through the engineering hiring stack — or you'd rather hand the build to a team that already ships — that's most of what we do at 72Technologies.

#hiring #engineering management #agency #startups #interviews
