SEO & Growth | May 25, 2026 | 6 min read

AdSense Brand Safety at Scale: A Pre-Publish Content Gate That Actually Works

AdSense doesn't tell you which page got demonetized — it just quietly tanks your RPM. Here's the pre-publish content gate we wire into CI to catch policy-risky pages before they ship.

AdSense doesn't email you a list of demonetized URLs. It just quietly drops your RPM, and you spend three weeks A/B testing layouts before realizing the real culprit was a single category page that mentioned a banned product term. If you run a content site with more than a few thousand URLs, you need a pre-publish gate — not a post-hoc audit.

This is how we build that gate at 72Technologies, what rules belong in it, and where the false-positive tax actually lives.

Why post-publish audits lose

The usual workflow looks like this: publish, wait for Search Console and AdSense to react, find the bad pages, fix or noindex them. That works fine when you're shipping ten posts a week. It collapses the moment you go programmatic.

Three reasons:

Detection lag. AdSense policy actions are not always surfaced per-URL. You often see aggregate "limited ad serving" warnings days after the offending content went live.
Cohort contamination. If 0.5% of a 40,000-page site trips policy rules, the site-wide ad quality score drops. Your clean pages earn less because of the dirty ones.
Reindexing cost. Once a URL is in the index with a policy flag, removing it doesn't immediately restore trust. You're paying for that page for weeks.

A pre-publish gate flips the economics. The cheapest moment to kill a bad page is before it has a URL.

What the gate is actually checking

AdSense's program policies and Google Publisher Policies are public, but they're written for humans, not regex. You have to translate them into testable rules. We group ours into four buckets:

Hard blockers — categorical content that's never acceptable (e.g., content sexualizing minors, instructions for violence, recognized hate speech targets). These return a failing score immediately and route to human review, not auto-publish.
Restricted categories — alcohol, gambling, prescription drugs, firearms, adult themes. Allowed in some markets with disclosures; need geotargeting and ad-serving config changes.
Quality signals — thin content, scraped phrasing, broken templates, missing author/source attribution.
YMYL hygiene — "Your Money or Your Life" topics (health, finance, legal). Not banned, but need stricter sourcing and disclaimers, and we route them through a separate review queue.

Note we don't try to detect everything. The gate is a filter, not a judge. Its job is to catch the top 80% of policy-risky pages cheaply, and escalate the ambiguous ones.

The rule format

Every rule is a small, testable unit with a category, severity, and a check function. We keep them in a flat YAML registry so non-engineers can read and propose edits:

- id: restricted_gambling_terms
  category: restricted
  severity: 0.7
  applies_to: [body, title, h1]
  match:
    type: phrase_list
    list: gambling_terms_v3
    case_sensitive: false
  action: route_to_review
  notes: >
    Terms commonly associated with online gambling. Requires market-specific
    config and AdSense restricted-category enablement before publish.

- id: thin_body_word_count
  category: quality
  severity: 0.4
  applies_to: [body]
  match:
    type: function
    name: word_count_below
    args: { threshold: 250 }
  action: flag

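Severity is a 0–1 weight, not a boolean. Any hard-blocker hit (severity 0.9 or above) blocks the page outright; otherwise the page's score is the sum of the severities of the soft signals that fired, and that sum is compared against the review and flag thresholds. We tuned those thresholds against a labeled set of ~600 pages, about half clean and half known-bad from prior policy hits.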
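A minimal sketch of how registry entries like the two above might be turned into executable checks. The entries are shown as already-parsed Python dicts (the YAML loading step is left out), and `PHRASE_LISTS`, `CHECK_FUNCTIONS`, `run_rule`, and the sample gambling terms are illustrative assumptions, not the production registry.

```python
import re

# Hypothetical phrase lists, keyed by the name a rule refers to.
PHRASE_LISTS = {"gambling_terms_v3": ["online casino", "sports betting"]}


def word_count_below(text: str, threshold: int) -> bool:
    """True when the text has fewer than `threshold` whitespace-separated words."""
    return len(text.split()) < threshold


# Named check functions a rule may reference with `type: function`.
CHECK_FUNCTIONS = {"word_count_below": word_count_below}

# The two registry entries from above, as they might look after YAML parsing.
RULES = [
    {"id": "restricted_gambling_terms", "severity": 0.7,
     "applies_to": ["body", "title", "h1"],
     "match": {"type": "phrase_list", "list": "gambling_terms_v3",
               "case_sensitive": False}},
    {"id": "thin_body_word_count", "severity": 0.4,
     "applies_to": ["body"],
     "match": {"type": "function", "name": "word_count_below",
               "args": {"threshold": 250}}},
]


def run_rule(rule: dict, page: dict) -> bool:
    """Return True if the rule matches any page field it applies to."""
    match = rule["match"]
    for field in rule["applies_to"]:
        text = page.get(field, "")
        if match["type"] == "phrase_list":
            flags = 0 if match.get("case_sensitive") else re.IGNORECASE
            for phrase in PHRASE_LISTS[match["list"]]:
                # Word boundaries stop a phrase matching inside a longer word.
                if re.search(rf"\b{re.escape(phrase)}\b", text, flags):
                    return True
        elif match["type"] == "function":
            if CHECK_FUNCTIONS[match["name"]](text, **match.get("args", {})):
                return True
    return False


page = {"title": "Best Online Casino Bonuses", "body": "Short stub."}
matched = [r["id"] for r in RULES if run_rule(r, page)]
print(matched)
```

Keeping the check functions in a plain name-to-callable table means a registry entry can only reference code that has been reviewed, which fits the goal of letting non-engineers propose edits safely.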

Wiring it into CI

For a programmatic site, "publish" usually means: a content row gets a status = ready flag, a build job picks it up, renders a page, and pushes to the CDN. The gate runs between ready and rendered.

Here's a sketch of the check runner:

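from dataclasses import dataclass


@dataclass
class Hit:
    rule_id: str
    severity: float
    evidence: object  # whatever the rule returns as proof, e.g. the matched phrase


@dataclass
class Verdict:
    outcome: str  # "pass", "flag", "review", or "block" (field names are illustrative)
    hits: list


def evaluate_page(page, rules, lists):
    # Collect a Hit for every applicable rule that matches the page.
    hits = []
    for rule in rules:
        if not rule.applies_to_page(page):
            continue
        result = rule.run(page, lists)
        if result.matched:
            hits.append(Hit(rule_id=rule.id,
                            severity=rule.severity,
                            evidence=result.evidence))

    # Any hit at or above 0.9 is a hard blocker and short-circuits scoring.
    hard = [h for h in hits if h.severity >= 0.9]
    if hard:
        return Verdict("block", hits)

    # Soft signals add up; the thresholds decide between review, flag, and pass.
    score = sum(h.severity for h in hits)
    if score >= 1.2:
        return Verdict("review", hits)
    if score >= 0.5:
        return Verdict("flag", hits)
    return Verdict("pass", hits)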

Four verdicts, four destinations:

pass → render and deploy.
flag → render and deploy, but log to a dashboard so editors can sample.
review → hold in a queue; a human approves, edits, or kills.
block → write to a quarantine table; do not render, do not generate a URL.

The quarantine table matters more than people expect. You will get false positives, and you need a way to bulk-review and re-run them when you tune a rule. Don't just delete the row.
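The four destinations above can be sketched as a small dispatcher. This is an illustration only: `Pipeline` and `route` are hypothetical names, and the in-memory lists stand in for whatever deploy job, dashboard, review queue, and quarantine table a real pipeline uses.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """In-memory stand-ins for the real destinations (all hypothetical)."""
    deployed: list = field(default_factory=list)
    dashboard: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)
    quarantine: list = field(default_factory=list)


def route(page_id: str, verdict: str, pipeline: Pipeline) -> None:
    """Send a page to the destination that matches its verdict."""
    if verdict == "pass":
        pipeline.deployed.append(page_id)
    elif verdict == "flag":
        # Flagged pages still ship, but are logged so editors can sample them.
        pipeline.deployed.append(page_id)
        pipeline.dashboard.append(page_id)
    elif verdict == "review":
        # Held until a human approves, edits, or kills the page.
        pipeline.review_queue.append(page_id)
    elif verdict == "block":
        # Keep the row for later bulk review; never render or mint a URL.
        pipeline.quarantine.append(page_id)
    else:
        raise ValueError(f"unknown verdict: {verdict!r}")


pipeline = Pipeline()
for page_id, verdict in [("a", "pass"), ("b", "flag"), ("c", "review"), ("d", "block")]:
    route(page_id, verdict, pipeline)
```

Raising on an unknown verdict is a deliberate choice in this sketch: a typo in a verdict name should fail the build rather than silently publish.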

Phrase lists, not regex soup

The single biggest source of false positives is over-eager regex. "Shot" matches "shotgun" but also "screenshot". "High" matches drug slang and also "high-resolution". We've moved almost everything to curated phrase lists with optional context windows:

- phrase: "buy guns online"
  context_required: false
- phrase: "shot"
  context_required: true
  context_window: 6
  context_terms: ["firearm", "caliber", "rifle"]

This is more work upfront and dramatically less work over the next six months. Maintain the lists like you'd maintain a dependency — versioned, code-reviewed, with changelogs.
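A minimal sketch of how a phrase entry with a context window could be evaluated. It assumes whole-token matching, a window measured in tokens on each side of the phrase, and single-word context terms; `tokenize` and `phrase_hits` are hypothetical helper names.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_hits(text: str, entry: dict) -> bool:
    """True if the phrase occurs and, when required, a context term is nearby."""
    tokens = tokenize(text)
    phrase = tokenize(entry["phrase"])
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != phrase:
            continue
        if not entry.get("context_required", False):
            return True
        window = entry.get("context_window", 0)
        # Tokens within `window` positions before and after the phrase.
        nearby = tokens[max(0, i - window):i] + tokens[i + n:i + n + window]
        if any(term in nearby for term in entry.get("context_terms", [])):
            return True
    return False


shot = {"phrase": "shot", "context_required": True, "context_window": 6,
        "context_terms": ["firearm", "caliber", "rifle"]}
buy = {"phrase": "buy guns online", "context_required": False}

print(phrase_hits("He fired a shot from the rifle", shot))  # True
print(phrase_hits("Take a screenshot of the page", shot))   # False
print(phrase_hits("Great shot of the sunset", shot))        # False
print(phrase_hits("Where to buy guns online today", buy))   # True
```

Because matching is done on whole tokens, "screenshot" never counts as "shot", and an ambiguous word only fires when one of its context terms appears close by.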

The override workflow

A gate without a clean override path becomes shadow IT within a month. Editors will start renaming their pages to slip past it. Build the escape hatch on purpose.

We use a per-page override record:

{
  "page_id": "prog-loc-1882",
  "override_rule": "thin_body_word_count",
  "reason": "Glossary stub; intentional, linked from 14 hub pages.",
  "approved_by": "editor:meera",
  "expires_at": "2026-09-01"
}

Three rules about overrides:

Per-rule, not per-page blanket. An override silences one specific rule on one page, nothing else.
Always expires. Force re-review. Six months is our default.
Auditable. Every override is logged with the human who approved it. When AdSense complains, you want a paper trail.
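A minimal sketch of applying override records at evaluation time, using the record shape shown earlier. `active_overrides` and `apply_overrides` are hypothetical helpers, and treating the `expires_at` date as the last valid day is an assumption.

```python
from datetime import date


def active_overrides(overrides: list[dict], page_id: str, today: date) -> set[str]:
    """Rule ids currently silenced for this page; expired records are ignored."""
    return {
        o["override_rule"]
        for o in overrides
        if o["page_id"] == page_id
        # Assumption: an override is still valid on its expiry date.
        and date.fromisoformat(o["expires_at"]) >= today
    }


def apply_overrides(hit_rule_ids: list[str], overrides: list[dict],
                    page_id: str, today: date) -> list[str]:
    """Drop only the hits whose specific rule is overridden on this page."""
    silenced = active_overrides(overrides, page_id, today)
    return [rule_id for rule_id in hit_rule_ids if rule_id not in silenced]


overrides = [{
    "page_id": "prog-loc-1882",
    "override_rule": "thin_body_word_count",
    "reason": "Glossary stub; intentional, linked from 14 hub pages.",
    "approved_by": "editor:meera",
    "expires_at": "2026-09-01",
}]
hits = ["thin_body_word_count", "restricted_gambling_terms"]

before_expiry = apply_overrides(hits, overrides, "prog-loc-1882", date(2026, 6, 1))
after_expiry = apply_overrides(hits, overrides, "prog-loc-1882", date(2026, 9, 2))
```

Filtering hits by rule id, rather than skipping the page entirely, is what keeps an override scoped to one rule: any other rule that fires on the same page still counts toward its verdict.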

Measuring the gate itself

The gate is a system. Treat it like one. We track:

Block rate — % of pages stopped before render. If this drops to near-zero, either your content is genuinely clean or your rules have rotted.
Review queue latency — how long pages sit waiting for human eyes. Above 48 hours, content velocity suffers and editors start abusing overrides.
False positive rate — sampled monthly by having an editor review 50 blocked or reviewed pages and judge them.
Escapes — pages the gate passed that later got policy-flagged by AdSense or removed by editorial. This is the metric that matters most.

In our experience, a healthy gate sits around a 2–5% review rate, sub-1% block rate, and single-digit monthly escapes on a six-figure URL site. Your numbers will be different; the point is to watch them move.
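A minimal sketch of computing three of these numbers from a batch of gate results (review queue latency is left out because it needs timestamps). `gate_metrics` is a hypothetical helper, the sample figures are invented for illustration, and each entry in `sampled_judgements` is assumed to be True when the editor judged a stopped page to be fine, meaning a false positive.

```python
from collections import Counter


def gate_metrics(verdicts: list[str], sampled_judgements: list[bool],
                 escapes: int) -> dict:
    """Summarise gate health for one reporting period."""
    counts = Counter(verdicts)
    total = len(verdicts)
    return {
        "block_rate": counts["block"] / total if total else 0.0,
        "review_rate": counts["review"] / total if total else 0.0,
        # Share of sampled blocked/reviewed pages the editor judged to be fine.
        "false_positive_rate": (
            sum(sampled_judgements) / len(sampled_judgements)
            if sampled_judgements else 0.0
        ),
        "escapes": escapes,
    }


# Invented sample data: 100 evaluated pages and a 50-page editor sample.
verdicts = ["pass"] * 95 + ["review"] * 4 + ["block"] * 1
sample = [True] * 5 + [False] * 45
metrics = gate_metrics(verdicts, sample, escapes=3)
print(metrics)
```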

Don't forget the templates

Programmatic sites have a quirk: a single bad template can mint thousands of bad pages overnight. Run the gate not just on individual records but on a sample render of every template change. A template-level CI check that renders 20 randomized pages and runs the gate against them has saved us from at least three policy incidents.
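A minimal sketch of that template-level check. `check_template`, the toy `render_*` functions, and `toy_gate` are hypothetical stand-ins for a real renderer and gate; the fixed seed is an assumption made so CI failures are reproducible.

```python
import random
from typing import Callable


def check_template(render: Callable[[dict], str],
                   records: list[dict],
                   gate: Callable[[str], str],
                   sample_size: int = 20,
                   seed: int = 0) -> list[tuple[dict, str]]:
    """Render a random sample of records and return the ones the gate stops."""
    rng = random.Random(seed)  # fixed seed keeps CI runs reproducible
    sample = rng.sample(records, min(sample_size, len(records)))
    failures = []
    for record in sample:
        verdict = gate(render(record))
        if verdict in ("review", "block"):
            failures.append((record, verdict))
    return failures


# Toy stand-ins: one clean template, one that injects a risky phrase everywhere.
def render_good(record: dict) -> str:
    return f"Things to do in {record['city']}"


def render_bad(record: dict) -> str:
    return f"Buy guns online in {record['city']}"


def toy_gate(html: str) -> str:
    return "block" if "buy guns online" in html.lower() else "pass"


records = [{"city": f"City {i}"} for i in range(100)]
good_failures = check_template(render_good, records, toy_gate)
bad_failures = check_template(render_bad, records, toy_gate)
```

In CI, a non-empty failure list would fail the template change before the build job ever runs it against the full dataset.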

Where this fits with the rest of the stack

The gate is one layer. It works because the other layers exist:

A clean data model so pages aren't generated from junk inputs.
Schema validation so structured data doesn't lie about content type.
A content review sample on a rolling basis, even for pass verdicts.
A kill-switch for noindexing entire template families when something slips.

If you want to see how we structure the underlying data and rendering pipeline, our team has written more about that side of the build under /blog, and the broader content engineering work sits under /services.

Where we'd start

If you're staring at a programmatic site with no gate today, don't build the whole thing in week one. Do this instead:

Pull your last 90 days of removed, noindexed, or manually edited pages. Cluster them by reason.
Write the five rules that would have caught 80% of those cases. Just five.
Put them in CI as warnings only for two weeks. Watch the false positive rate.
Promote the cleanest two rules to blocking. Leave the rest as flags.
Add a review queue and an override record format before you add more rules.
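The first two steps can be sketched as a simple count over past removals. This assumes each removed page has already been labeled with a `reason` string (the labeling itself is manual work); `top_reasons` is a hypothetical helper and the sample data is invented.

```python
from collections import Counter


def top_reasons(removed_pages: list[dict], coverage: float = 0.8,
                max_rules: int = 5) -> list[tuple[str, int]]:
    """Most common removal reasons, stopping once `coverage` of pages is explained."""
    counts = Counter(p["reason"] for p in removed_pages)
    total = sum(counts.values())
    chosen, covered = [], 0
    for reason, n in counts.most_common(max_rules):
        if total and covered / total >= coverage:
            break
        chosen.append((reason, n))
        covered += n
    return chosen


# Invented sample: 100 pages removed over the last 90 days.
removed = (
    [{"reason": "thin_content"}] * 50
    + [{"reason": "gambling_terms"}] * 30
    + [{"reason": "broken_template"}] * 10
    + [{"reason": "missing_author"}] * 5
    + [{"reason": "other"}] * 5
)
candidates = top_reasons(removed)
print(candidates)
```

Each reason that comes back is a candidate for one of the first five rules; anything in the long tail can wait until the review queue and override format exist.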

The gate earns trust by being right more than it's wrong. Build slowly, log everything, and let the rules graduate from flag to review to block as they prove themselves.

Tags: SEO, AdSense, Content Ops, CI/CD, Programmatic SEO
