DevOps & Cloud | June 4, 2026 | 6 min read

We Moved Our Vercel ISR Cache to S3 + CloudFront. Here's the Math.

Vercel's bandwidth and function invocation costs got loud at scale. We moved the hot read path to S3 + CloudFront while keeping the DX. Here's the architecture, the numbers, and what broke.

A content-heavy marketing site we run on Next.js was costing us more in Vercel function invocations and bandwidth than the rest of the stack combined. The pages didn't need server rendering on every request — they needed honest, boring HTTP caching. So we kept Vercel for previews and the dashboard, but moved the hot read path to S3 + CloudFront. The savings were real. The migration was not free.

Why Vercel ISR stopped making sense at our scale

Incremental Static Regeneration is a great default. You get a stale-while-revalidate model baked in, edge caching across Vercel's network, and you don't have to think about invalidation. For a launch, a blog, or anything under a few million requests a month, it's the right call.

Our problem was shape, not Vercel. The site had roughly 40k pages, most of them long-tail content with low individual traffic. The aggregate was high — tens of millions of requests a month — but the cache hit ratio on Vercel's edge was lower than we wanted because each PoP had to warm independently for low-traffic URLs. That meant more function invocations than the marketing pitch implied, and bandwidth-heavy pages (lots of inline SVG, embedded JSON-LD) added up.

In our experience, once you're consistently above ~20M monthly requests on content that genuinely doesn't change per-user, the Vercel pricing model stops being the cheapest path. That's not a knock on Vercel — it's a tool fit issue.

What we actually measured before deciding

We spent two weeks instrumenting before we touched architecture. The questions we wanted answered:

What percentage of requests hit a warm ISR cache versus trigger a regeneration?
What's the byte size distribution of our HTML responses?
Which pages account for the top 80% of bandwidth?
What's our p50/p95 TTFB across regions?

We pulled this from Vercel Analytics, our own OpenTelemetry traces (we instrument the Next.js server with @vercel/otel), and a sampled log of CDN responses. The headline finding: about 18% of requests were hitting a cold ISR path, and those were responsible for the bulk of our function bill.
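To make those questions concrete, here is a minimal sketch of the kind of summary you could compute from a sampled response log. It is not our actual pipeline: the record shape (`path`, `bytes`, `cacheStatus`, `ttfbMs`) and the function names are assumptions for illustration.

```javascript
// Hypothetical record shape: { path, bytes, cacheStatus: 'HIT' | 'MISS', ttfbMs }

// Nearest-rank percentile over a list of numbers (p in 0..100).
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Share of requests that did not hit a warm cache.
function coldRate(records) {
  if (records.length === 0) return 0;
  const cold = records.filter((r) => r.cacheStatus !== 'HIT').length;
  return cold / records.length;
}

// Smallest set of paths (largest first) that covers `share` of total bytes.
function topBandwidthPaths(records, share = 0.8) {
  const bytesByPath = new Map();
  let total = 0;
  for (const r of records) {
    bytesByPath.set(r.path, (bytesByPath.get(r.path) || 0) + r.bytes);
    total += r.bytes;
  }
  const ranked = [...bytesByPath.entries()].sort((a, b) => b[1] - a[1]);
  const picked = [];
  let covered = 0;
  for (const [path, bytes] of ranked) {
    if (covered >= share * total) break;
    picked.push(path);
    covered += bytes;
  }
  return picked;
}
```

Running `percentile` over `ttfbMs` per region, `coldRate` over the whole sample, and `topBandwidthPaths` over the byte counts answers the first, third, and fourth questions; the byte size distribution falls out of `percentile` applied to `bytes`.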

The architecture we landed on

The idea is simple. A build job renders every page to static HTML and ships it to S3. CloudFront serves it. A separate worker handles regeneration on a schedule and on webhook triggers from our CMS. Vercel still runs preview deployments and the authenticated parts of the app.

CMS webhook ──► SQS ──► Renderer (Fargate) ──► S3 ──► CloudFront ──► User
                                  │
                                  └──► CloudFront invalidation API

The renderer is a small Node service that imports the same Next.js page modules and calls renderToString for each route. We didn't want two source-of-truth codebases, so the renderer lives in the same monorepo and shares the page components.
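One detail the diagram hides is how a route becomes an S3 object and an invalidation path. The sketch below shows one possible mapping. The `<route>/index.html` key layout, the message shape, and the function names are assumptions rather than a description of our renderer, and that layout would also need a URL rewrite at the edge, since CloudFront's default root object only applies to the root URL.

```javascript
// Map a site route to the S3 object key the renderer would write.
// Assumption: each route is stored as <route>/index.html.
function routeToS3Key(route) {
  const trimmed = route.replace(/^\/+|\/+$/g, '');
  return trimmed === '' ? 'index.html' : `${trimmed}/index.html`;
}

// Build the work item for one CMS webhook message.
// Hypothetical message shape: { route: '/blog/some-post' }
function planRenderJob(message) {
  return {
    route: message.route,
    s3Key: routeToS3Key(message.route),
    // CloudFront invalidation paths are URL paths and must start with "/".
    invalidationPath: message.route,
  };
}
```

Keeping this mapping in one pure function means the renderer and the invalidation step cannot disagree about where a page lives.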

Cache headers that actually do work

The headers matter more than the architecture. We settled on:

Cache-Control: public, max-age=60, s-maxage=86400, stale-while-revalidate=604800

Browsers cache for a minute (so a normal reload picks up changes within about a minute), CloudFront holds for a day, and we serve stale for up to a week while a background revalidation happens. The stale-while-revalidate directive is honored by CloudFront when you've set the cache policy to respect origin headers, which is easy to miss in the console.
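If the three windows are hard to keep straight, this small sketch classifies a cached response by its age from a shared cache's point of view. It follows the general semantics of `s-maxage` and `stale-while-revalidate`, not CloudFront's internals, and the function names are ours for illustration.

```javascript
// Parse the directives we care about from a Cache-Control header value.
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [name, value] = part.trim().split('=');
    directives[name.toLowerCase()] = value === undefined ? true : Number(value);
  }
  return directives;
}

// Classify a cached response of a given age (in seconds) for a shared cache.
function sharedCacheState(header, ageSeconds) {
  const d = parseCacheControl(header);
  // Shared caches prefer s-maxage over max-age when both are present.
  const freshFor = d['s-maxage'] ?? d['max-age'] ?? 0;
  const staleFor = d['stale-while-revalidate'] ?? 0;
  if (ageSeconds < freshFor) return 'fresh';
  if (ageSeconds < freshFor + staleFor) return 'stale-while-revalidate';
  return 'expired';
}
```

With the header above, an object that is an hour old is `fresh`, one that is two days old is served stale while a revalidation runs, and one older than eight days (one day fresh plus seven days stale) has to be fetched from the origin before it can be served.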

For pages that genuinely never change (legal, archived posts), we go further:

Cache-Control: public, max-age=3600, s-maxage=31536000, immutable

The numbers, with honest caveats

I'll give ranges rather than exact figures because your traffic shape will be different. For our workload:

Vercel monthly spend before: roughly 6–7x what we now spend on AWS for the same property
After migration: S3 storage was negligible (~$3/mo), CloudFront bandwidth landed around 35% of what Vercel was charging us for equivalent traffic, and the Fargate renderer runs about $40/mo
TTFB p95: improved by ~80ms on average, mostly because CloudFront's PoP coverage is denser for our user base than Vercel's edge for our plan tier
Cache hit ratio: went from ~82% on Vercel edge to ~97% on CloudFront, because we're pre-warming rather than lazy-filling
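The hit ratio change looks small until you turn it into misses. A quick worked example, using a hypothetical 30M requests a month (an illustrative figure, not one from our bill):

```javascript
// Requests per month that miss the edge cache for a given hit ratio.
function cacheMisses(totalRequests, hitRatio) {
  return Math.round(totalRequests * (1 - hitRatio));
}

const monthlyRequests = 30000000; // hypothetical volume for illustration
const missesBefore = cacheMisses(monthlyRequests, 0.82); // 18% miss rate
const missesAfter = cacheMisses(monthlyRequests, 0.97); // 3% miss rate
const reduction = missesBefore / missesAfter;
```

Going from 82% to 97% cuts the miss rate from 18% to 3%, so at any volume that is roughly six times fewer requests falling through the cache.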

The caveats: we're not counting engineering time, which was about three weeks of one senior engineer. We're not counting the ongoing maintenance, which is real. And the comparison only holds because our content is genuinely static between CMS edits. If you have per-user content, this entire architecture is wrong for you.

What we lost

This is the part most migration posts skip. We lost:

Per-request personalization. Anything that needed cookies or headers now happens client-side via a small hydration call to a separate API.
Instant preview of production changes. A CMS edit now takes 30–90 seconds to propagate, because the renderer has to run and CloudFront has to invalidate.
Vercel's analytics on those routes. We backfilled with our own OpenTelemetry pipeline into Grafana, but it's not as polished.
Easy A/B testing at the edge. We moved experiments to a client-side framework, which has its own latency cost.

If you're going to do this, write the loss list before you write the gain list. It's the honest way to make the decision.

Invalidation: the part that bit us

CloudFront invalidations are free for the first 1,000 paths per month, then $0.005 each. Sounds cheap. It is not cheap if you naively invalidate on every CMS save and your editors are busy.
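A back-of-the-envelope sketch of that pricing, using only the two figures quoted above (the function and constant names are ours):

```javascript
const FREE_PATHS_PER_MONTH = 1000;
const PRICE_PER_EXTRA_PATH_USD = 0.005;

// Monthly invalidation cost for a given number of invalidated paths.
function monthlyInvalidationCostUsd(pathsInvalidated) {
  const billable = Math.max(0, pathsInvalidated - FREE_PATHS_PER_MONTH);
  return billable * PRICE_PER_EXTRA_PATH_USD;
}

// Hypothetical editorial load: 200 saves a day, 3 paths invalidated per save.
const pathsPerMonth = 200 * 30 * 3;
const estimatedCostUsd = monthlyInvalidationCostUsd(pathsPerMonth);
```

At that hypothetical load, 18,000 paths a month comes to about $85 for invalidations alone, which is more than the renderer and storage line items combined.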

We made two specific mistakes in week one:

We invalidated /blog/* on every post update. Wildcard invalidations count as one path, but they nuke the entire blog cache. Our cache hit ratio dropped to 40% for a day.
We didn't debounce. An editor saving a draft five times in a minute triggered five invalidations and five renderer jobs.

The fix was a 60-second debounce window in the SQS consumer and targeted path invalidations instead of wildcards. We also added a deny-list for draft saves — we only invalidate on publish events.

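import Redis from 'ioredis'; // assumes the ioredis client, which accepts this argument style

const redis = new Redis(); // defaults to localhost:6379; configure for your setup

// Debounce key: post ID, window: 60s.
// SET with EX 60 and NX succeeds only if the key is absent, so the first
// event in a window wins and later ones get null until the key expires.
async function handleInvalidation(postId) { // hypothetical handler name
  const shouldInvalidate = await redis.set(
    `invalidate:${postId}`,
    '1',
    'EX', 60,
    'NX'
  );
  if (!shouldInvalidate) return; // Another job already queued
}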
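The other half of the fix, publish-only events and targeted paths, could look something like the sketch below. The event shape, the event type names, and the URL layout (`/blog/<slug>`, `/blog/tag/<tag>`) are assumptions for illustration, not our CMS's actual payload.

```javascript
// Hypothetical CMS event shape:
// { type: 'draft.saved' | 'post.published', slug: string, tags?: string[] }
function invalidationPathsFor(event) {
  // Draft saves never invalidate; only publish events do.
  if (event.type !== 'post.published') return [];

  // Invalidate the post itself and the listing pages that link to it,
  // instead of a /blog/* wildcard that empties the whole blog cache.
  const paths = [`/blog/${event.slug}`, '/blog'];
  for (const tag of event.tags || []) {
    paths.push(`/blog/tag/${tag}`);
  }
  return paths;
}
```

Each returned path counts toward the monthly invalidation quota, so the list is worth keeping short: the page that changed plus the few indexes that reference it.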

Observability after the move

We lost Vercel's built-in dashboards, so we wired up the basics ourselves. CloudFront ships access logs to S3, which we tail into a small consumer that emits OpenTelemetry metrics: cache hit ratio per path prefix, origin fetch latency, 4xx/5xx rates. Sentry catches client-side errors as before.

The one metric I'd insist on if you do this: origin fetch rate. If it spikes, either an invalidation went wide or your renderer is failing. We alert at >2% origin fetches over a 5-minute window. That's caught two bad deploys so far.
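As a rough sketch of that alert rule: given parsed access log records, compute the share of requests in the last five minutes that went to the origin and compare it to the 2% threshold. The record shape and function names are hypothetical, and treating only the `Miss` result type as an origin fetch is a simplifying assumption (CloudFront's `x-edge-result-type` field also has values such as `Hit`, `RefreshHit`, and `Error`).

```javascript
const WINDOW_MS = 5 * 60 * 1000; // 5-minute window
const THRESHOLD = 0.02; // alert above 2% origin fetches

// Hypothetical parsed log record: { timestampMs: number, resultType: string }
function originFetchRate(records, nowMs) {
  const recent = records.filter(
    (r) => r.timestampMs > nowMs - WINDOW_MS && r.timestampMs <= nowMs
  );
  if (recent.length === 0) return 0;
  const misses = recent.filter((r) => r.resultType === 'Miss').length;
  return misses / recent.length;
}

function shouldAlert(records, nowMs) {
  return originFetchRate(records, nowMs) > THRESHOLD;
}
```

In practice you would also want a minimum request count before alerting, so a quiet five minutes with one miss does not page anyone.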

When you should not do this

If any of these are true, stay on Vercel ISR:

Your traffic is under ~5M requests/month. The engineering cost isn't worth it.
Your pages have meaningful per-user content. You'll end up with a hybrid that's worse than either pure approach.
Your team doesn't have someone who's comfortable with CloudFront cache policies, S3 lifecycle rules, and IaC. This is not a weekend project.
You rely on Vercel-specific features like Edge Config, Edge Middleware for auth, or their image optimization at scale. You can replicate them, but the math changes.

Where we'd start

If you're staring at a Vercel bill that's growing faster than your traffic, don't migrate first. Instrument first. Spend a sprint understanding your actual cache hit ratio, your function invocation breakdown, and which routes drive the cost. Half the teams we've talked to who were ready to leave Vercel found that fixing their revalidate values and consolidating a few API routes solved 70% of the problem.

If the numbers still say move, start with a single high-traffic route. Render it to S3, point CloudFront at it, and run it in parallel with Vercel for a week. Compare TTFB, cache hit ratio, and error rates with real traffic before you commit to the full migration. We help teams work through exactly this kind of tradeoff on our DevOps and cloud engagements — usually the answer isn't "leave Vercel," it's "use Vercel for what it's good at and stop paying it for what it isn't."
