DevOps & Cloud · June 12, 2026 · 6 min read

Sentry Performance Quotas Blew Up Our Bill: What We Changed

A war story about Sentry transactions, span ingestion, and a 6x bill spike — plus the dynamic sampling, SDK config, and quota guardrails we now ship by default.

We turned on Sentry Performance for a Next.js app on a Tuesday. By Friday, our monthly Sentry spend had jumped roughly 6x and we had a Slack thread full of confused engineers asking why a tool we thought of as "error tracking" was suddenly the second-biggest line item on our observability bill. This is the post-mortem, the fixes, and the defaults we now ship on every project.

How a quiet error tracker became our most expensive tool

For years we used Sentry the boring way: capture exceptions, attach breadcrumbs, page someone if error rate crosses a threshold. Errors are sparse by nature, so the bill stayed predictable.

Performance monitoring is a different animal. Once you enable tracing, the unit of ingestion stops being "a bad thing happened" and starts being "a request happened". On a backend that handles a few hundred requests per second, every percentage point of sample rate is millions of transactions per month. And in Sentry's current pricing model, transactions and spans are billed separately — and spans can outnumber transactions by 20x or more on a modern app with database calls, HTTP fetches, and a few instrumented libraries.

We found this out the expensive way.

What we actually shipped

The rollout looked innocent. A junior engineer (with our blessing) added this to the Next.js config during a reliability sprint:

// sentry.server.config.ts — the version that hurt
import * as Sentry from '@sentry/nextjs';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  profilesSampleRate: 1.0,
  integrations: [
    Sentry.prismaIntegration(),
    Sentry.httpIntegration(),
  ],
});

tracesSampleRate: 1.0 means every transaction is sent. With Prisma and HTTP integrations on, each transaction dragged 30–80 spans with it. Add profiling at 100% and you've also turned on a second, separately-billed product.

We deployed it on Tuesday afternoon. The quota alert that should have caught it was set at 80% of plan, but the plan was monthly, and we burned through that 80% in about 36 hours. By then the alert email was buried under the usual daily notification noise.

Where the money actually goes

If you only remember one thing: in Sentry, spans are the meter that spins fastest. Transactions get the headlines, but a single transaction with a fan-out of database queries and outbound HTTP calls can easily produce 50+ spans. We pulled a week of data from one of our services and the ratio was roughly:

1 transaction → ~42 spans on average
p95 was ~110 spans (checkout flow with payment provider + tax service + inventory)
Error events were a rounding error in comparison

So when you set tracesSampleRate: 0.1, you cut transaction volume by 10x, but every transaction you keep still carries its full span count. The sample rate does nothing about the span multiplier. Cost scales with sample_rate × avg_spans_per_transaction × request_volume. That middle term is the one nobody talks about.
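The formula above can be made concrete with a rough estimator. This is a minimal sketch: the 300 requests per second and the 30-day month are illustrative assumptions, the 42 and 14 spans per transaction are the averages quoted in this post, and the function counts events only. It does not model Sentry's actual price tiers.

```typescript
// Rough volume estimator: sample_rate × avg_spans_per_transaction × request_volume.
// All inputs are illustrative; this counts events, not dollars.
interface TraceVolumeInput {
  requestsPerSecond: number;
  sampleRate: number; // 0..1, the value you would pass as tracesSampleRate
  avgSpansPerTransaction: number;
  daysInMonth?: number; // defaults to 30
}

interface TraceVolumeEstimate {
  transactions: number;
  spans: number;
}

function estimateMonthlyVolume(input: TraceVolumeInput): TraceVolumeEstimate {
  const days = input.daysInMonth ?? 30;
  const requests = input.requestsPerSecond * 60 * 60 * 24 * days;
  const transactions = Math.round(requests * input.sampleRate);
  const spans = Math.round(transactions * input.avgSpansPerTransaction);
  return { transactions, spans };
}

// Same traffic, three configurations.
const full = estimateMonthlyVolume({ requestsPerSecond: 300, sampleRate: 1.0, avgSpansPerTransaction: 42 });
const sampled = estimateMonthlyVolume({ requestsPerSecond: 300, sampleRate: 0.1, avgSpansPerTransaction: 42 });
const trimmed = estimateMonthlyVolume({ requestsPerSecond: 300, sampleRate: 0.1, avgSpansPerTransaction: 14 });

console.log({ full, sampled, trimmed });
```

The two levers multiply. Under these assumptions, dropping the rate to 0.1 cuts spans by 10x, and trimming spans per transaction from 42 to 14 at that same rate brings the total to 30x fewer spans than the full-rate config.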

The other expensive surprise: profiling. It's billed in profile-hours and it's easy to forget you turned it on. At 100% sample rate on a busy service, it adds up fast. We now treat it like a debugger you switch on for a week, not a permanent setting.
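One way to enforce the "switch it on for a week" habit is to give profiling an expiry date instead of a boolean. The sketch below is an assumption about how you might wire that up, not something the SDK provides: the helper name, the 0.1 rate, and the environment variable mentioned afterwards are all hypothetical.

```typescript
// Time-boxed profiling: returns a non-zero rate only before an explicit deadline.
// `enabledUntil` is an ISO timestamp, e.g. "2026-06-19T00:00:00Z".
function profilingRate(
  enabledUntil: string | undefined,
  rateWhenEnabled: number,
  now: Date = new Date(),
): number {
  if (!enabledUntil) return 0; // not configured: profiling stays off
  const deadline = Date.parse(enabledUntil);
  if (Number.isNaN(deadline)) return 0; // unparseable value: fail closed
  return now.getTime() < deadline ? rateWhenEnabled : 0;
}
```

You would pass the result as profilesSampleRate, for example profilingRate(process.env.PROFILING_ENABLED_UNTIL, 0.1). Because the value is computed once when the SDK initialises, a long-running process keeps profiling until its next restart after the deadline, so treat the date as a backstop rather than a precise cut-off.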

The fixes, in the order we'd do them again

We didn't get this right on the first try. Here's the sequence that actually worked, ordered by impact-per-hour-of-engineering-time.

1. Stop ingesting traffic you don't care about

The single biggest win was filtering. Health checks, /_next/static requests, bot traffic, internal cron pings — none of it deserved a transaction. We added a tracesSampler that returns 0 for known-boring paths and a sane default otherwise:

import * as Sentry from '@sentry/nextjs';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampler: (ctx) => {
    // Transaction name, e.g. "GET /api/health"
    const name = ctx.name ?? '';

    // Drop noise entirely
    if (name.includes('/api/health')) return 0;
    if (name.includes('/_next/static')) return 0;
    if (name.startsWith('GET /robots.txt')) return 0;

    // Honour an upstream decision to sample so distributed traces stay connected
    if (ctx.parentSampled === true) return 1;

    // Sample checkout and auth more heavily
    if (name.includes('/checkout') || name.includes('/auth')) return 0.5;

    // Everything else
    return 0.05;
  },
  profilesSampleRate: 0, // turn it back on per-incident
});

That alone cut transaction volume by about 70% with no loss of signal we cared about. The parentSampled check is important: when an upstream service decides to sample a trace, you want the downstream to honour that so traces stay connected.
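Sampling rules like these tend to grow, and a wrong rule is invisible until the invoice arrives. A minimal sketch of one way to keep them testable is to move the decision into a pure function with no SDK imports. The function and constant names here are hypothetical; the paths and rates mirror the sampler described above.

```typescript
// Table-driven sampling decision, free of SDK imports so it can be unit-tested.
type Matcher = (name: string) => boolean;

interface RateRule {
  matches: Matcher;
  rate: number;
}

// Transactions that never deserve to be sent.
const DROP: Matcher[] = [
  (n) => n.includes('/api/health'),
  (n) => n.includes('/_next/static'),
  (n) => n.startsWith('GET /robots.txt'),
];

// Transactions sampled above the default. First match wins.
const BOOST: RateRule[] = [
  { matches: (n) => n.includes('/checkout') || n.includes('/auth'), rate: 0.5 },
];

const DEFAULT_RATE = 0.05;

function decideSampleRate(name: string, parentSampled?: boolean): number {
  if (DROP.some((matches) => matches(name))) return 0;
  if (parentSampled === true) return 1; // keep distributed traces connected
  const boosted = BOOST.find((rule) => rule.matches(name));
  return boosted ? boosted.rate : DEFAULT_RATE;
}
```

The tracesSampler callback then shrinks to a single call that passes the transaction name and parentSampled from the sampling context. Note the ordering: noise is dropped before the parentSampled check, so a sampled upstream trace still cannot force a health check through.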

2. Use dynamic sampling and stop guessing rates

Sentry's dynamic sampling (server-side) prioritizes rare and interesting transactions among those your SDK actually sends, so a low client rate does not have to mean a uniformly thin sample. The pattern we now use: set tracesSampleRate low at the SDK (say 0.05 to 0.1), and let dynamic sampling rules keep slow transactions, error-bearing transactions, and specific releases. This gives you a useful signal without paying for the long tail of boring 200s.

The tradeoff: if you set the SDK rate to 0.01 you'll have great cost control and terrible visibility into low-traffic endpoints. We landed on 0.05 as a default for high-traffic services and 0.2 for low-traffic internal tools, then let dynamic rules do the rest.
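The visibility problem is easy to see with arithmetic: an internal tool serving 200 requests a day produces about 2 traces a day at 0.01 and about 40 at 0.2. A rough sketch of picking a rate from a target trace count follows. The helper names, the traffic numbers, and the floor and ceiling are illustrative assumptions, not values from our config.

```typescript
// Expected number of sampled traces per day for a given rate.
function expectedTracesPerDay(requestsPerDay: number, rate: number): number {
  return requestsPerDay * rate;
}

// Pick a rate that yields roughly `targetPerDay` traces, clamped to a floor
// and a ceiling so busy services stay cheap and quiet ones stay visible.
function rateForTarget(
  requestsPerDay: number,
  targetPerDay: number,
  minRate: number = 0.01,
  maxRate: number = 1,
): number {
  if (requestsPerDay <= 0) return maxRate; // no traffic data: keep everything
  const raw = targetPerDay / requestsPerDay;
  return Math.min(maxRate, Math.max(minRate, raw));
}

console.log(expectedTracesPerDay(200, 0.01)); // quiet internal tool at a very low rate
console.log(rateForTarget(200, 100)); // quiet service needs a high rate
console.log(rateForTarget(5_000_000, 100)); // busy service is clamped to the floor
```

A fixed per-service default like 0.05 or 0.2 is simpler to reason about. A target-based rate is worth considering only when traffic varies by orders of magnitude across services.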

3. Tame the span count per transaction

This one is underrated. Some integrations are chatty. The Prisma integration, for example, will emit a span for every query — including the ones your ORM fires under the hood that you didn't write. The HTTP integration spans every outbound fetch, including fire-and-forget telemetry calls to other tools.

We started auditing spans per transaction and cutting:

Wrapped repeated low-value queries (SELECT 1 health probes, session lookups on every request) and marked them to skip
Disabled the HTTP integration for outbound calls to internal services that already had their own tracing
Set maxSpans defensively so a runaway loop couldn't blow a single transaction up to 10,000 spans (we saw one do exactly that during a retry storm)

Result: average spans-per-transaction dropped from ~42 to ~14.
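The audit itself does not need anything fancy. A minimal sketch, assuming you have a sample of transactions with their spans exported to plain objects (the shapes and the function name below are hypothetical and far smaller than real Sentry events), is to count spans by operation and look at the top of the list.

```typescript
// Minimal shapes for the audit; real events carry many more fields.
interface SpanLike {
  op?: string; // e.g. "db", "http.client"
  description?: string;
}

interface TransactionLike {
  name: string;
  spans: SpanLike[];
}

interface OpCount {
  op: string;
  count: number;
  share: number; // fraction of all spans in the sample
}

function auditSpans(transactions: TransactionLike[]): { avgSpans: number; byOp: OpCount[] } {
  const counts = new Map<string, number>();
  let total = 0;
  for (const tx of transactions) {
    for (const span of tx.spans) {
      const op = span.op ?? 'unknown';
      counts.set(op, (counts.get(op) ?? 0) + 1);
      total += 1;
    }
  }
  // Chattiest operations first.
  const byOp = Array.from(counts.entries())
    .map(([op, count]) => ({ op, count, share: total === 0 ? 0 : count / total }))
    .sort((a, b) => b.count - a.count);
  const avgSpans = transactions.length === 0 ? 0 : total / transactions.length;
  return { avgSpans, byOp };
}
```

Running this before and after each change gives you a number to track, rather than a feeling that traces look smaller.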

4. Set quotas you actually notice

Sentry lets you set per-category spend caps and on-call notifications. We now configure:

A hard cap at 120% of expected monthly usage (errors are still ingested, performance gets dropped first)
A Slack alert at 50%, 75%, and 90% of plan, with the channel set to one humans read
A separate alert when daily ingestion is 3x the trailing 7-day median

That last one is the killer. Monthly thresholds give you days of warning at low traffic and hours of warning at high traffic. A daily anomaly alert catches a bad deploy on the same afternoon.
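The daily anomaly check is small enough to run as a scheduled job. This is a sketch under assumptions: the function names are hypothetical, the daily counts are made up, and how you obtain them (from Sentry's usage stats or your own counters) is left out.

```typescript
// Median of a list of numbers; returns 0 for an empty list.
function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when today's ingestion exceeds `multiplier` times the trailing median.
// With no history there is no baseline, so nothing is flagged.
function isIngestionAnomaly(
  today: number,
  trailingDays: number[],
  multiplier: number = 3,
): boolean {
  if (trailingDays.length === 0) return false;
  return today > median(trailingDays) * multiplier;
}

// Seven quiet days, then a deploy that quadruples ingestion.
const lastWeek = [1_000_000, 1_100_000, 900_000, 1_200_000, 1_000_000, 1_050_000, 950_000];
console.log(isIngestionAnomaly(4_000_000, lastWeek)); // above 3x the median of 1,000,000
console.log(isIngestionAnomaly(2_500_000, lastWeek)); // busy day, but under the threshold
```

The median matters here. A mean would be dragged upward by a single earlier spike, which raises the threshold just when you need it to stay put.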

What we'd push back on

We like Sentry. The error tracking is genuinely good and the trace UI for debugging a slow request is faster than stitching it together in a generic OTel backend. But two things deserve a clear-eyed look:

Span-based billing rewards chatty instrumentation. That's an awkward incentive. We've seen teams add more spans because "more detail is better" without realising they were doubling their bill. Audit your integrations the same way you'd audit a dependency.
Performance monitoring overlaps heavily with whatever OTel pipeline you already run. If you're sending traces to Sentry and to a separate OTel collector feeding Grafana Tempo or Honeycomb, you're paying twice for the same data. Pick a primary and downgrade the other to errors-only or a much lower sample rate. We wrote about our broader observability stack choices on the blog if you want the longer version.

Where we'd start tomorrow

If you're staring at a Sentry bill that just doubled, do these four things this week:

Open the Performance usage view and sort transactions by spans ingested, not transaction count. Find your top three offenders.
Add a tracesSampler that drops health checks and static assets, and lowers your base rate to 0.05–0.1.
Turn off profilesSampleRate unless you're actively debugging something. Treat it like console.log in production.
Set a daily anomaly alert at 3x trailing median. Monthly caps are not enough.

This is the kind of work that's deeply unglamorous and pays back in the first invoice cycle. If you'd rather not own it in-house, our team does this kind of DevOps and reliability work for product teams — usually as a focused two-week engagement rather than an ongoing contract. Either way, the lesson is the same: observability tools meter what you give them. Give them less, more deliberately.

Tags: Sentry, Observability, Cost Optimization, Reliability, DevOps
