Sentry Performance Quotas Blew Up Our Bill: What We Changed
A war story about Sentry transactions, span ingestion, and a 6x bill spike — plus the dynamic sampling, SDK config, and quota guardrails we now ship by default.

We turned on Sentry Performance for a Next.js app on a Tuesday. By Friday, our monthly Sentry spend had jumped roughly 6x and we had a Slack thread full of confused engineers asking why a tool we thought of as "error tracking" was suddenly the second-biggest line item on our observability bill. This is the post-mortem, the fixes, and the defaults we now ship on every project.
How a quiet error tracker became our most expensive tool
For years we used Sentry the boring way: capture exceptions, attach breadcrumbs, page someone if error rate crosses a threshold. Errors are sparse by nature, so the bill stayed predictable.
Performance monitoring is a different animal. Once you enable tracing, the unit of ingestion stops being "a bad thing happened" and starts being "a request happened". On a backend that handles a few hundred requests per second, every percentage point of sample rate is millions of transactions per month. And in Sentry's current pricing model, transactions and spans are billed separately — and spans can outnumber transactions by 20x or more on a modern app with database calls, HTTP fetches, and a few instrumented libraries.
We found this out the expensive way.
What we actually shipped
The rollout looked innocent. A junior engineer (with our blessing) added this to the Next.js config during a reliability sprint:
// sentry.server.config.ts: the version that hurt
import * as Sentry from '@sentry/nextjs';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  profilesSampleRate: 1.0,
  integrations: [
    Sentry.prismaIntegration(),
    Sentry.httpIntegration(),
  ],
});
tracesSampleRate: 1.0 means every transaction is sent. With Prisma and HTTP integrations on, each transaction dragged 30–80 spans with it. Add profiling at 100% and you've also turned on a second, separately-billed product.
We deployed it on Tuesday afternoon. The quota alert that should have caught it was set at 80% of plan, but the plan was monthly. We burned through that 80% in about 36 hours, and by the time anyone looked, the alert email had long been buried under the day's other notifications.
Where the money actually goes
If you only remember one thing: in Sentry, spans are the meter that spins fastest. Transactions get the headlines, but a single transaction with a fan-out of database queries and outbound HTTP calls can easily produce 50+ spans. We pulled a week of data from one of our services and the ratio was roughly:
- 1 transaction → ~42 spans on average
- p95 was ~110 spans (checkout flow with payment provider + tax service + inventory)
- Error events were a rounding error in comparison
So when you set tracesSampleRate: 0.1, you cut transaction volume by roughly 10x, but the span multiplier stays the same: every transaction you keep still arrives with its full set of spans. Cost scales with sample_rate × avg_spans_per_transaction × request_volume. That middle term is the one nobody talks about.
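To make that middle term concrete, here is a back-of-envelope sketch of the product above. The function name and the traffic figure (300 requests per second over a 30-day month) are illustrative assumptions, not numbers from our account, and this is a volume estimate rather than Sentry's billing formula. The 42 and 14 spans per transaction are the averages quoted elsewhere in this post.

```typescript
// Rough monthly ingestion estimate. Inputs are assumptions you supply;
// this models volume only, not Sentry's actual pricing.
interface VolumeEstimate {
  transactions: number;
  spans: number;
}

function estimateMonthlyVolume(
  sampleRate: number,             // e.g. the value of tracesSampleRate
  avgSpansPerTransaction: number, // measured from your own traces
  requestsPerSecond: number,
  daysInMonth = 30,
): VolumeEstimate {
  const requests = requestsPerSecond * 60 * 60 * 24 * daysInMonth;
  const transactions = Math.round(requests * sampleRate);
  const spans = Math.round(transactions * avgSpansPerTransaction);
  return { transactions, spans };
}

// Hypothetical service at 300 req/s.
const full = estimateMonthlyVolume(1.0, 42, 300);    // everything sampled
const sampled = estimateMonthlyVolume(0.1, 42, 300); // 10% sample rate
const trimmed = estimateMonthlyVolume(0.1, 14, 300); // 10% and fewer spans
console.log({ full, sampled, trimmed });
```

Lowering the rate shrinks transactions and spans by the same factor, so spans still outnumber transactions 42 to 1. Only reducing spans per transaction changes that ratio.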
The other expensive surprise: profiling. It's billed in profile-hours and it's easy to forget you turned it on. At 100% sample rate on a busy service, it adds up fast. We now treat it like a debugger you switch on for a week, not a permanent setting.
The fixes, in the order we'd do them again
We didn't get this right on the first try. Here's the sequence that actually worked, ordered by impact-per-hour-of-engineering-time.
1. Stop ingesting traffic you don't care about
The single biggest win was filtering. Health checks, /_next/static requests, bot traffic, internal cron pings — none of it deserved a transaction. We added a tracesSampler that returns 0 for known-boring paths and a sane default otherwise:
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampler: (ctx) => {
    // Transaction name, e.g. "GET /api/health"
    // (recent SDK versions expose it directly on the sampling context)
    const name = ctx.name ?? '';

    // Drop noise entirely
    if (name.includes('/api/health')) return 0;
    if (name.includes('/_next/static')) return 0;
    if (name.startsWith('GET /robots.txt')) return 0;

    // Honour an upstream decision to sample so distributed traces stay connected
    if (ctx.parentSampled === true) return 1;

    // Sample checkout and auth more heavily
    if (name.includes('/checkout') || name.includes('/auth')) return 0.5;

    // Everything else
    return 0.05;
  },
  profilesSampleRate: 0, // turn it back on per-incident
});
That alone cut transaction volume by about 70% with no loss of signal we cared about. The parentSampled check is important: when an upstream service decides to sample a trace, you want the downstream to honour that so traces stay connected.
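A sampler like this is plain logic, so it can be pulled out and unit-tested without booting the SDK. The sketch below is one way to do that. `SamplerInput`, `NOISE`, `HIGH_VALUE`, and `sampleRateFor` are hypothetical names of ours, and the input shape is a simplified stand-in for the SDK's sampling context, not its real type. It also matches `/robots.txt` by substring rather than by prefix, which is a small simplification.

```typescript
// Simplified stand-in for the SDK's sampling context (an assumption).
interface SamplerInput {
  name: string;
  parentSampled?: boolean;
}

const NOISE = ['/api/health', '/_next/static', '/robots.txt'];
const HIGH_VALUE = ['/checkout', '/auth'];

// Pure version of the sampling rules: first matching rule wins.
function sampleRateFor(input: SamplerInput): number {
  // Known-boring paths are dropped before anything else is considered.
  if (NOISE.some((path) => input.name.includes(path))) return 0;
  // Keep traces that an upstream service already decided to sample.
  if (input.parentSampled === true) return 1;
  // Business-critical flows get a higher rate.
  if (HIGH_VALUE.some((path) => input.name.includes(path))) return 0.5;
  // Default for everything else.
  return 0.05;
}

console.log(sampleRateFor({ name: 'GET /api/health' }));
console.log(sampleRateFor({ name: 'POST /checkout/confirm' }));
console.log(sampleRateFor({ name: 'GET /products', parentSampled: true }));
```

Inside `Sentry.init`, the `tracesSampler` callback would then just forward the transaction name and `parentSampled` from whatever sampling context your SDK version provides. Because the order of the checks matters (a health check is dropped even when its parent was sampled), a few assertions in CI are a cheap way to stop a refactor from quietly changing the bill.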
2. Use dynamic sampling and stop guessing rates
Sentry's dynamic sampling (server-side) will retain rare and interesting transactions even when your client SDK is set to a low rate. The pattern we now use: set tracesSampleRate low at the SDK (say 0.05 to 0.1), and let dynamic sampling rules keep slow transactions, error-bearing transactions, and specific releases. This gives you a useful signal without paying for the long tail of boring 200s.
The tradeoff: if you set the SDK rate to 0.01 you'll have great cost control and terrible visibility into low-traffic endpoints. We landed on 0.05 as a default for high-traffic services and 0.2 for low-traffic internal tools, then let dynamic rules do the rest.
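The visibility side of that tradeoff is easy to put numbers on. The sketch below estimates how many transactions a single endpoint would actually send per day at a given SDK rate. Both function names and the 200-requests-per-day endpoint are hypothetical, and the probability calculation assumes each request is sampled independently.

```typescript
// Expected number of sampled transactions per day for one endpoint.
function expectedSamplesPerDay(requestsPerDay: number, sampleRate: number): number {
  return requestsPerDay * sampleRate;
}

// Probability of capturing at least one transaction in a day,
// assuming every request is sampled independently at `sampleRate`.
function chanceOfAtLeastOneSample(requestsPerDay: number, sampleRate: number): number {
  return 1 - Math.pow(1 - sampleRate, requestsPerDay);
}

// Hypothetical low-traffic internal endpoint: 200 requests per day.
for (const rate of [0.01, 0.05, 0.2]) {
  console.log(
    rate,
    expectedSamplesPerDay(200, rate),
    chanceOfAtLeastOneSample(200, rate).toFixed(3),
  );
}
```

At a 0.01 rate that endpoint yields about two transactions a day, and on some days none at all, which leaves server-side rules with almost nothing to choose from. At 0.2 it yields about forty.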
3. Tame the span count per transaction
This one is underrated. Some integrations are chatty. The Prisma integration, for example, will emit a span for every query — including the ones your ORM fires under the hood that you didn't write. The HTTP integration spans every outbound fetch, including fire-and-forget telemetry calls to other tools.
We started auditing spans per transaction and cutting:
- Wrapped repeated low-value queries (SELECT 1 health probes, session lookups on every request) and marked them to skip
- Disabled the HTTP integration for outbound calls to internal services that already had their own tracing
- Set maxSpans defensively so a runaway loop couldn't blow a single transaction up to 10,000 spans (we saw one do exactly that during a retry storm)
Result: average spans-per-transaction dropped from ~42 to ~14.
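The audit itself needs nothing fancier than a mean and a p95 over per-transaction span counts. The sketch below is a minimal version. It assumes you have already exported those counts into an array of numbers (how you export them is up to you), and `spanStats` is a hypothetical helper, not part of any SDK.

```typescript
interface SpanStats {
  mean: number;
  p95: number;
  max: number;
}

// Summarise spans-per-transaction from a list of per-transaction span counts.
// p95 uses the nearest-rank method.
function spanStats(spanCounts: number[]): SpanStats {
  if (spanCounts.length === 0) {
    throw new Error('need at least one transaction');
  }
  const sorted = [...spanCounts].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, n) => sum + n, 0) / sorted.length;
  const rank = Math.ceil(0.95 * sorted.length); // 1-based nearest rank
  return { mean, p95: sorted[rank - 1], max: sorted[sorted.length - 1] };
}

// Made-up sample of five transactions.
console.log(spanStats([10, 12, 14, 14, 20]));
```

Tracking the max alongside the mean is worthwhile: a retry storm shows up there long before it moves the average.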
4. Set quotas you actually notice
Sentry lets you set per-category spend caps and on-call notifications. We now configure:
- A hard cap at 120% of expected monthly usage (errors are still ingested, performance gets dropped first)
- A Slack alert at 50%, 75%, and 90% of plan, with the channel set to one humans read
- A separate alert when daily ingestion is 3x the trailing 7-day median
That last one is the killer. Monthly thresholds give you days of warning at low traffic and hours of warning at high traffic. A daily anomaly alert catches a bad deploy on the same afternoon.
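The anomaly rule is simple enough to run as a scheduled script if your alerting tool can't express it directly. The sketch below assumes you can obtain one ingestion count per day from somewhere (a usage export, a metrics store); the function names and the sample numbers are hypothetical.

```typescript
// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error('need at least one value');
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when today's ingestion is at least `factor` times the trailing median.
function isIngestionAnomalous(
  trailingDailyCounts: number[], // e.g. the previous 7 days
  today: number,
  factor = 3,
): boolean {
  return today >= factor * median(trailingDailyCounts);
}

// Made-up trailing week (median 100) followed by a bad deploy.
const lastWeek = [100, 110, 90, 105, 95, 120, 100];
console.log(isIngestionAnomalous(lastWeek, 600)); // true
console.log(isIngestionAnomalous(lastWeek, 250)); // false
```

Using the median rather than the mean keeps one earlier spike in the trailing window from raising the baseline and masking the next one.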
What we'd push back on
We like Sentry. The error tracking is genuinely good and the trace UI for debugging a slow request is faster than stitching it together in a generic OTel backend. But two things deserve a clear-eyed look:
- Span-based billing rewards chatty instrumentation. That's an awkward incentive. We've seen teams add more spans because "more detail is better" without realising they were doubling their bill. Audit your integrations the same way you'd audit a dependency.
- Performance monitoring overlaps heavily with whatever OTel pipeline you already run. If you're sending traces to Sentry and to a separate OTel collector feeding Grafana Tempo or Honeycomb, you're paying twice for the same data. Pick a primary and downgrade the other to errors-only or a much lower sample rate. We wrote about our broader observability stack choices on the blog if you want the longer version.
Where we'd start tomorrow
If you're staring at a Sentry bill that just doubled, do these four things this week:
- Open the Performance usage view and sort transactions by spans ingested, not transaction count. Find your top three offenders.
- Add a tracesSampler that drops health checks and static assets, and lowers your base rate to 0.05–0.1.
- Turn off profilesSampleRate unless you're actively debugging something. Treat it like console.log in production.
- Set a daily anomaly alert at 3x trailing median. Monthly caps are not enough.
This is the kind of work that's deeply unglamorous and pays back in the first invoice cycle. If you'd rather not own it in-house, our team does this kind of DevOps and reliability work for product teams — usually as a focused two-week engagement rather than an ongoing contract. Either way, the lesson is the same: observability tools meter what you give them. Give them less, more deliberately.