E-commerce · June 26, 2026 · 7 min read

Cart Abandonment Webhooks Lie: Building a Recovery Pipeline That Actually Attributes Revenue

Shopify's abandoned checkout webhook is noisy, late, and bad at attribution. Here's the event pipeline we build instead so recovery emails, SMS, and WhatsApp don't double-count or miss revenue.

Every store we audit has a cart recovery flow. Almost none of them can tell us, with a straight face, how much revenue that flow actually drives versus what those customers would have bought anyway. The culprit is usually the same: a thin integration on top of Shopify's checkouts/update webhook, wired to an ESP, with attribution windows pulled out of a hat.

This is a breakdown of why that setup lies to you, and the event pipeline we build for mid-market clients who want recovery numbers they can defend in a board meeting.

Why the abandoned checkout webhook is a bad source of truth

Shopify exposes checkouts/create and checkouts/update webhooks, plus the Abandoned Checkouts API. On paper, that's everything you need: a customer starts a checkout, you get a payload, you wait, you send a recovery message.

In practice, three things break:

  1. Latency is unpredictable. checkouts/update fires on nearly every field-level change (email entered, shipping selected, discount applied). You get dozens of events per session, and the "final" abandoned state is just whichever one happened to be last before the customer left. There's no explicit "this one is abandoned" signal.
  2. The same customer generates multiple checkout tokens. Switch device, clear cookies, come back via a different ad — new token, new "abandonment." Your ESP sees three abandoned carts; the customer sees one indecisive afternoon.
  3. Attribution is done by the ESP, not by you. Klaviyo, Omnisend, and similar tools attribute a conversion to a flow if the order happens within an attribution window (often 5 days) of an email open or click. If the customer also got an SMS, a WhatsApp nudge, and saw a retargeting ad, every channel claims the sale.

The result: leadership sees "abandoned cart flow drove $X this month" numbers that, when summed across channels, exceed total revenue. We've seen this in audits more than once.
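To make the double counting concrete, here is a minimal sketch of window-based attribution run independently per channel. The types, the `claimedRevenueByChannel` helper, and the sample orders are illustrative assumptions, not any ESP's actual logic.

```typescript
type ChannelName = "email" | "sms" | "whatsapp";
type ChannelTouch = { channel: ChannelName; sentAt: number }; // epoch milliseconds
type PlacedOrder = { revenue: number; placedAt: number; touches: ChannelTouch[] };

const DAY_MS = 24 * 60 * 60 * 1000;

// Each channel claims the full order if any of its own touches falls inside the window.
function claimedRevenueByChannel(
  orders: PlacedOrder[],
  windowDays: number
): Record<ChannelName, number> {
  const claimed: Record<ChannelName, number> = { email: 0, sms: 0, whatsapp: 0 };
  for (const order of orders) {
    const claimants = new Set<ChannelName>();
    for (const touch of order.touches) {
      const age = order.placedAt - touch.sentAt;
      if (age >= 0 && age <= windowDays * DAY_MS) claimants.add(touch.channel);
    }
    claimants.forEach((channel) => {
      claimed[channel] += order.revenue;
    });
  }
  return claimed;
}

// Hypothetical data: two orders of 100 each, one touched by all three channels.
const sampleOrders: PlacedOrder[] = [
  {
    revenue: 100,
    placedAt: 3 * DAY_MS,
    touches: [
      { channel: "email", sentAt: 1 * DAY_MS },
      { channel: "sms", sentAt: 2 * DAY_MS },
      { channel: "whatsapp", sentAt: 2.5 * DAY_MS },
    ],
  },
  { revenue: 100, placedAt: 4 * DAY_MS, touches: [{ channel: "email", sentAt: 3 * DAY_MS }] },
];

const claimedTotals = claimedRevenueByChannel(sampleOrders, 5);
const summedClaims = claimedTotals.email + claimedTotals.sms + claimedTotals.whatsapp;
const actualRevenue = sampleOrders.reduce((sum, o) => sum + o.revenue, 0);
console.log({ claimedTotals, summedClaims, actualRevenue });
```

With this sample data the three channels together claim 400 against 200 of actual revenue, which is the pattern that shows up in audits.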

What we actually want to measure

Before touching code, get agreement on the question. "How much did recovery drive?" is too vague. The defensible versions are:

  • Incremental revenue: revenue from customers who would not have converted without a recovery touch. Requires a holdout group.
  • First-touch recovery revenue: revenue from orders where the first post-abandonment touch came from a specific channel.
  • Last-touch recovery revenue: same, but last touch before the order.

Pick one as the headline metric and report the others as context. We default to first-touch with a holdout for mid-market stores, because it's the cleanest story and the holdout keeps the team honest.
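As a rough illustration of the incremental version, the sketch below estimates incremental revenue by using the holdout's revenue per abandoner as the baseline for the treated group. The `CohortTotals` shape and `incrementalRevenue` helper are hypothetical names, and the estimate assumes the two cohorts were assigned randomly.

```typescript
type CohortTotals = { abandoners: number; revenue: number };

// Incremental revenue = (treated revenue per abandoner - holdout revenue per abandoner)
// multiplied by the number of treated abandoners.
function incrementalRevenue(treatment: CohortTotals, holdout: CohortTotals): number {
  if (treatment.abandoners === 0 || holdout.abandoners === 0) {
    throw new Error("both cohorts need at least one abandoner");
  }
  const treatedPerAbandoner = treatment.revenue / treatment.abandoners;
  const baselinePerAbandoner = holdout.revenue / holdout.abandoners;
  return (treatedPerAbandoner - baselinePerAbandoner) * treatment.abandoners;
}

// Hypothetical month: 9,000 treated abandoners and a 1,000-person holdout.
const estimate = incrementalRevenue(
  { abandoners: 9000, revenue: 90000 },
  { abandoners: 1000, revenue: 8000 }
);
console.log(estimate); // 18000
```

In this made-up month the flow would be credited with 18,000 of incremental revenue, not the full 90,000 that came from treated abandoners.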

The pipeline

Here's the shape we deploy. It sits between Shopify and whatever messaging tools you use — Klaviyo, Postscript, a WhatsApp BSP, your own service.

Shopify webhooks ──► Ingest (Cloudflare Worker / Lambda)
                          │
                          ▼
                    Event store (Postgres or BigQuery)
                          │
          ┌───────────────┼────────────────┐
          ▼               ▼                ▼
   Identity resolver  Abandonment    Attribution
                       detector       resolver
                          │
                          ▼
                 Outbound dispatcher
                 (email / SMS / WhatsApp)

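Four responsibilities sit upstream of the dispatcher, deliberately separated: ingest, identity resolution, abandonment detection, and attribution resolution.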

Ingest and normalize

We take the raw webhook, drop the noisy fields, and write a canonical event. The point is to be able to replay history later when someone changes the definition of "abandoned."

type CheckoutEvent = {
  eventId: string;          // hash of token + updated_at
  checkoutToken: string;    // Shopify checkout token; not stable across devices or sessions
  customerKey: string;      // see identity resolution
  email: string | null;
  phone: string | null;
  cartValue: number;        // amount in the transaction currency below
  currency: string;         // ISO 4217 code of the transaction currency
  lineItems: Array<{ variantId: string; qty: number }>;
  step: 'contact' | 'shipping' | 'payment' | 'completed';
  occurredAt: string;       // ISO from Shopify
  receivedAt: string;       // our clock
};

Deduplicate on eventId. Shopify retries webhooks, and your pipeline must be idempotent or you'll trigger the same recovery flow twice.
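A minimal sketch of that idempotency check, assuming only the `token` and `updated_at` fields of the payload. The `InMemoryEventStore` class is a hypothetical stand-in for a unique constraint on `eventId` in Postgres or BigQuery, and a plain composite string stands in for the hash.

```typescript
// Subset of the webhook payload this sketch relies on.
type RawCheckoutPayload = { token: string; updated_at: string };

// Stand-in for "hash of token + updated_at": the composite string is already a stable key.
// A production version might hash it (for example with SHA-256) to get a fixed-length ID.
function toEventId(raw: RawCheckoutPayload): string {
  return `${raw.token}:${raw.updated_at}`;
}

// In-memory stand-in for the event store.
class InMemoryEventStore {
  private readonly seen = new Set<string>();

  // Returns true when the event is new, false when it is a retry we already stored.
  insertIfNew(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    return true;
  }
}

function ingestWebhook(
  store: InMemoryEventStore,
  raw: RawCheckoutPayload
): "stored" | "duplicate" {
  return store.insertIfNew(toEventId(raw)) ? "stored" : "duplicate";
}

const demoStore = new InMemoryEventStore();
const delivery: RawCheckoutPayload = { token: "abc123", updated_at: "2026-06-26T10:00:00Z" };
console.log(ingestWebhook(demoStore, delivery)); // "stored"
console.log(ingestWebhook(demoStore, delivery)); // "duplicate" (retry of the same delivery)
console.log(ingestWebhook(demoStore, { ...delivery, updated_at: "2026-06-26T10:01:00Z" })); // "stored"
```

Only events that come back as "stored" should be allowed to trigger anything downstream.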

Identity resolution

This is where most homegrown pipelines fall apart. A customerKey should survive across checkout tokens. We build it with a priority chain:

  1. Logged-in customer ID, if present.
  2. Normalized email (lowercased, trimmed).
  3. Normalized phone (E.164).
  4. A first-party cookie ID passed through from the storefront.

When two keys collide later — for example, a session with only a cookie ID later supplies an email — we merge the historical events under the higher-priority key. Keep a merges table so you can audit it.
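The priority chain and the merge step might look something like the sketch below. The key prefixes, the `IdentityHints` shape, and the `mergeKeys` helper are assumptions for illustration, and the phone normalization is deliberately crude.

```typescript
type IdentityHints = {
  customerId?: string | null;
  email?: string | null;
  phone?: string | null;
  cookieId?: string | null;
};
type KeyedEvent = { customerKey: string; eventId: string };
type MergeRecord = { fromKey: string; toKey: string; mergedAt: string };

// Rough E.164 normalization: strips formatting and assumes the country code is present.
// A production version would use a proper phone-number library.
function normalizePhone(phone: string): string | null {
  const digits = phone.replace(/\D/g, "");
  return digits.length >= 8 && digits.length <= 15 ? `+${digits}` : null;
}

// Walk the priority chain and return the first key we can build.
function resolveCustomerKey(hints: IdentityHints): string | null {
  if (hints.customerId) return `customer:${hints.customerId}`;
  const email = hints.email?.trim().toLowerCase();
  if (email) return `email:${email}`;
  const phone = hints.phone ? normalizePhone(hints.phone) : null;
  if (phone) return `phone:${phone}`;
  if (hints.cookieId) return `cookie:${hints.cookieId}`;
  return null;
}

// Re-key historical events under the higher-priority key and log the merge for auditing.
function mergeKeys(
  events: KeyedEvent[],
  merges: MergeRecord[],
  fromKey: string,
  toKey: string,
  mergedAt: string
): KeyedEvent[] {
  if (fromKey === toKey) return events;
  merges.push({ fromKey, toKey, mergedAt });
  return events.map((e) => (e.customerKey === fromKey ? { ...e, customerKey: toKey } : e));
}

// A cookie-only session later supplies an email.
const before = resolveCustomerKey({ cookieId: "c-42" });
const after = resolveCustomerKey({ cookieId: "c-42", email: "  Ana@Example.com " });
console.log(before, after); // "cookie:c-42" "email:ana@example.com"
```

Prefixing each key with its source keeps the merges table readable and avoids accidental collisions between, say, a cookie ID and a customer ID.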

Abandonment detection

Don't define abandonment as "we got a checkouts/update and then nothing for 30 minutes." That definition depends on webhook delivery being timely and treats each checkout token in isolation. Instead, run a scheduled job (every 5 minutes is fine) that asks:

For each customerKey with a checkout event in the last 24 hours where step != 'completed', has there been any activity in the last N minutes?

N is a business decision, not a technical one. We typically start at 45 minutes for the first touch and tune from there.

Crucially, the detector also checks for completed orders on other tokens. If the customer started checkout twice and finished the second one, the first is not abandoned in any meaningful sense. The orders/create webhook feeds the same event store so this lookup is one query.
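Here is one way the detector's query could be expressed over in-memory rows, as a sketch rather than a finished job. The `StoredCheckoutEvent` and `StoredOrder` shapes and the `findAbandonedCustomers` function are hypothetical; in a real deployment this would more likely be a SQL query against the event store.

```typescript
type StoredCheckoutEvent = {
  customerKey: string;
  checkoutToken: string;
  step: "contact" | "shipping" | "payment" | "completed";
  occurredAt: string; // ISO timestamp
};
type StoredOrder = { customerKey: string; createdAt: string };

function findAbandonedCustomers(
  events: StoredCheckoutEvent[],
  orders: StoredOrder[],
  now: Date,
  quietMinutes: number,
  lookbackHours = 24
): string[] {
  const nowMs = now.getTime();
  const windowStart = nowMs - lookbackHours * 60 * 60 * 1000;

  // Group recent checkout events by customer, across all checkout tokens.
  const byCustomer = new Map<string, StoredCheckoutEvent[]>();
  for (const e of events) {
    const t = Date.parse(e.occurredAt);
    if (t < windowStart || t > nowMs) continue;
    const list = byCustomer.get(e.customerKey) ?? [];
    list.push(e);
    byCustomer.set(e.customerKey, list);
  }

  const abandoned: string[] = [];
  byCustomer.forEach((list, customerKey) => {
    const times = list.map((e) => Date.parse(e.occurredAt));
    const firstSeen = Math.min(...times);
    const lastSeen = Math.max(...times);
    // Conservative simplification: any order since the first checkout event in the
    // window counts as completion, whichever token it came from.
    const completed =
      list.some((e) => e.step === "completed") ||
      orders.some((o) => o.customerKey === customerKey && Date.parse(o.createdAt) >= firstSeen);
    const quietLongEnough = nowMs - lastSeen >= quietMinutes * 60 * 1000;
    if (!completed && quietLongEnough) abandoned.push(customerKey);
  });
  return abandoned;
}

const sampleEvents: StoredCheckoutEvent[] = [
  { customerKey: "customer:A", checkoutToken: "t1", step: "shipping", occurredAt: "2026-06-26T11:00:00Z" },
  { customerKey: "customer:B", checkoutToken: "t2", step: "contact", occurredAt: "2026-06-26T11:50:00Z" },
  { customerKey: "customer:C", checkoutToken: "t3", step: "payment", occurredAt: "2026-06-26T10:00:00Z" },
  { customerKey: "customer:C", checkoutToken: "t4", step: "payment", occurredAt: "2026-06-26T10:30:00Z" },
];
const sampleOrderRows: StoredOrder[] = [{ customerKey: "customer:C", createdAt: "2026-06-26T10:35:00Z" }];

const abandonedNow = findAbandonedCustomers(
  sampleEvents,
  sampleOrderRows,
  new Date("2026-06-26T12:00:00Z"),
  45
);
console.log(abandonedNow); // ["customer:A"]
```

Customer B is still inside the 45-minute quiet period, and customer C finished on a second token, so only customer A is flagged.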

Attribution resolution

When an order lands, the attribution resolver walks the event timeline for that customerKey and answers:

  • Was there an abandonment event before this order?
  • What recovery touches did we send between the abandonment and the order?
  • Which channel was first? Which was last?
  • Was the customer in a holdout cohort?

Write this answer into an order_attribution row at conversion time. Don't compute it on the fly in your dashboard — you'll get different numbers every time the definition shifts.
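A sketch of what the resolver could compute for one order is below. The `TimelineEntry` and `OrderAttribution` shapes are assumptions, as is the choice to anchor on the most recent abandonment before the order; the returned object is what would be persisted as the order_attribution row.

```typescript
type RecoveryChannel = "email" | "sms" | "whatsapp";
type TimelineEntry =
  | { kind: "abandonment"; at: string }
  | { kind: "touch"; at: string; channel: RecoveryChannel }
  | { kind: "order"; at: string; orderId: string };

type OrderAttribution = {
  orderId: string;
  customerKey: string;
  followedAbandonment: boolean;
  firstTouch: RecoveryChannel | null;
  lastTouch: RecoveryChannel | null;
  touchCount: number;
  inHoldout: boolean;
};

function attributeOrder(
  customerKey: string,
  timeline: TimelineEntry[],
  orderId: string,
  inHoldout: boolean
): OrderAttribution {
  const order = timeline.find((e) => e.kind === "order" && e.orderId === orderId);
  if (!order) throw new Error(`order ${orderId} not found on timeline`);
  const orderAt = Date.parse(order.at);

  // Most recent abandonment before the order, if any.
  const abandonmentTimes = timeline
    .filter((e) => e.kind === "abandonment" && Date.parse(e.at) < orderAt)
    .map((e) => Date.parse(e.at));
  const abandonedAt = abandonmentTimes.length > 0 ? Math.max(...abandonmentTimes) : null;

  // Touches sent between that abandonment and the order, oldest first.
  const touches: { at: number; channel: RecoveryChannel }[] = [];
  for (const e of timeline) {
    if (e.kind !== "touch" || abandonedAt === null) continue;
    const at = Date.parse(e.at);
    if (at >= abandonedAt && at <= orderAt) touches.push({ at, channel: e.channel });
  }
  touches.sort((a, b) => a.at - b.at);

  return {
    orderId,
    customerKey,
    followedAbandonment: abandonedAt !== null,
    firstTouch: touches.length > 0 ? touches[0].channel : null,
    lastTouch: touches.length > 0 ? touches[touches.length - 1].channel : null,
    touchCount: touches.length,
    inHoldout,
  };
}

const sampleTimeline: TimelineEntry[] = [
  { kind: "abandonment", at: "2026-06-26T10:00:00Z" },
  { kind: "touch", at: "2026-06-26T10:45:00Z", channel: "email" },
  { kind: "touch", at: "2026-06-26T12:00:00Z", channel: "whatsapp" },
  { kind: "order", at: "2026-06-26T13:00:00Z", orderId: "1001" },
];
const attributionRow = attributeOrder("email:ana@example.com", sampleTimeline, "1001", false);
console.log(attributionRow.firstTouch, attributionRow.lastTouch); // "email" "whatsapp"
```

Because the row stores both first and last touch along with the holdout flag, any of the three headline metrics can be reported from the same table without recomputing history.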

Holdouts without sandbagging revenue

The pushback on holdouts is always the same: "You want us to not email 10% of abandoners?" Yes. That's the only way to know if the flow works.

Make it cheap politically by:

  • Keeping the holdout small (5–10%).
  • Rotating cohorts monthly so no customer is permanently excluded.
  • Reporting holdout conversion rate alongside treatment conversion rate every month. When the treatment is genuinely working, the gap is visible and the conversation ends.
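One way to get small, rotating cohorts without storing assignments is to hash the customerKey together with the calendar month. The sketch below assumes that approach; `fnv1a32` and `isInHoldout` are hypothetical helpers, and FNV-1a is used only because it is short and dependency-free.

```typescript
// FNV-1a 32-bit hash: small, dependency-free, and stable across runs.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Deterministic assignment: the same customer gets the same answer all month,
// and the cohort reshuffles when the month changes.
function isInHoldout(customerKey: string, at: Date, holdoutPercent = 10): boolean {
  const month = at.toISOString().slice(0, 7); // for example "2026-06" (UTC)
  return fnv1a32(`${month}:${customerKey}`) % 100 < holdoutPercent;
}

const june = new Date("2026-06-26T12:00:00Z");
const july = new Date("2026-07-03T12:00:00Z");
console.log(isInHoldout("email:ana@example.com", june));
console.log(isInHoldout("email:ana@example.com", july));
```

Reshuffling by month means a customer can land in the holdout two months running by chance, but nobody is excluded permanently, and the assignment can be recomputed at attribution time instead of being looked up.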

In our experience, recovery flows on well-run stores produce a 10–30% lift in abandoner conversion versus holdout. On poorly targeted flows, the lift is statistically indistinguishable from zero, which is exactly the finding that justifies the rebuild.
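To check whether a measured lift is distinguishable from zero, a standard two-proportion z-test is usually enough. The sketch below is one such check under the assumption of random cohort assignment; the `CohortResult` shape and `liftVersusHoldout` helper are hypothetical, and both absolute and relative lift are returned since "lift" can be read either way.

```typescript
type CohortResult = { abandoners: number; conversions: number };

type LiftReport = {
  treatmentRate: number;
  holdoutRate: number;
  absoluteLift: number; // difference in conversion rate
  relativeLift: number; // absoluteLift divided by the holdout rate
  zScore: number;       // two-proportion z-test using a pooled rate
};

function liftVersusHoldout(treatment: CohortResult, holdout: CohortResult): LiftReport {
  const treatmentRate = treatment.conversions / treatment.abandoners;
  const holdoutRate = holdout.conversions / holdout.abandoners;
  const pooled =
    (treatment.conversions + holdout.conversions) / (treatment.abandoners + holdout.abandoners);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / treatment.abandoners + 1 / holdout.abandoners)
  );
  const absoluteLift = treatmentRate - holdoutRate;
  return {
    treatmentRate,
    holdoutRate,
    absoluteLift,
    relativeLift: absoluteLift / holdoutRate,
    zScore: absoluteLift / standardError,
  };
}

// Hypothetical month: 12% conversion in treatment versus 10% in a 1,000-person holdout.
const liftReport = liftVersusHoldout(
  { abandoners: 9000, conversions: 1080 },
  { abandoners: 1000, conversions: 100 }
);
console.log(liftReport);
```

In this example the relative lift is about 20%, but the z-score is roughly 1.86, just under the usual 1.96 threshold for 95% confidence. A small holdout can need more than one month of data before the gap is defensible.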

Where WhatsApp and SMS change the math

For stores in LATAM, India, MENA, and Southeast Asia, WhatsApp recovery messages often outperform email by a wide margin on open rate. But they also cost real money per send (template message fees through the BSP) and burn customer goodwill faster.

Two rules we apply:

  • Channel selection happens in the dispatcher, not the ESP. The pipeline decides whether this customer gets WhatsApp, SMS, or email based on consent, locale, and prior engagement. Don't let three tools each decide independently.
  • Cap total touches across channels. A customer who got a WhatsApp message at hour 1 should not also get an SMS at hour 2 and an email at hour 4. The dispatcher enforces a global frequency cap keyed on customerKey.
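Both rules can live in one small decision function inside the dispatcher. The sketch below is an assumption-heavy illustration: the `CustomerProfile` shape, the locale list, and the one-touch-per-24-hours cap in the example are all hypothetical choices, not recommendations.

```typescript
type RecoveryChannel = "whatsapp" | "sms" | "email";
type CustomerProfile = {
  customerKey: string;
  locale: string; // for example "pt-BR"
  consent: Record<RecoveryChannel, boolean>;
  lastEngagedChannel?: RecoveryChannel;
};
type SentTouch = { customerKey: string; channel: RecoveryChannel; sentAt: number }; // epoch ms
type FrequencyCap = { maxTouches: number; windowHours: number };

// Hypothetical locale preferences; anything not listed falls back to email first.
const WHATSAPP_FIRST_LOCALES = ["pt-BR", "es-MX", "en-IN", "ar-AE", "id-ID"];

function chooseChannel(
  profile: CustomerProfile,
  sent: SentTouch[],
  now: number,
  cap: FrequencyCap
): RecoveryChannel | null {
  // Global cap: count touches on every channel for this customerKey.
  const windowStart = now - cap.windowHours * 60 * 60 * 1000;
  const recent = sent.filter(
    (t) => t.customerKey === profile.customerKey && t.sentAt > windowStart && t.sentAt <= now
  );
  if (recent.length >= cap.maxTouches) return null;

  const byLocale: RecoveryChannel[] = WHATSAPP_FIRST_LOCALES.includes(profile.locale)
    ? ["whatsapp", "sms", "email"]
    : ["email", "sms", "whatsapp"];

  // Prior engagement moves a channel to the front of the queue.
  const preferred = profile.lastEngagedChannel;
  const ranked = preferred ? [preferred, ...byLocale.filter((c) => c !== preferred)] : byLocale;

  // Consent is a hard filter, never a preference.
  return ranked.find((c) => profile.consent[c]) ?? null;
}

const HOUR_MS = 60 * 60 * 1000;
const shopper: CustomerProfile = {
  customerKey: "phone:+5511999990000",
  locale: "pt-BR",
  consent: { whatsapp: true, sms: true, email: true },
};
const dailyCap: FrequencyCap = { maxTouches: 1, windowHours: 24 };

console.log(chooseChannel(shopper, [], 10 * HOUR_MS, dailyCap)); // "whatsapp"
console.log(
  chooseChannel(
    shopper,
    [{ customerKey: shopper.customerKey, channel: "whatsapp", sentAt: 9 * HOUR_MS }],
    10 * HOUR_MS,
    dailyCap
  )
); // null, because the cap is already used up
```

Returning null rather than falling through to another channel is the point: once the cap is hit, no tool downstream gets to send anything.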

If you're wiring conversational commerce into Shopify, our e-commerce engineering team has written more on the pattern, but the short version is: the pipeline owns the decision, the channels execute it.

Things that will bite you

A partial list, from real projects:

  • Test orders and bot traffic generate abandoned checkouts. Filter them at ingest by IP reputation and known test emails, or your conversion rates will look worse than they are.
  • B2B customers with net terms don't "abandon" — they get a quote and pay later. Segment them out of the recovery audience entirely.
  • GDPR and consent. A webhook payload containing email and phone is personal data. Your event store needs a deletion path keyed on customerKey, and you need to actually run it when requested.
  • Currency. If you operate in multiple markets, store cartValue in both transaction currency and a normalized reporting currency at event time. Retroactive FX conversion will haunt your finance team.
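For the currency point, the idea is to freeze the FX rate on the event itself. A minimal sketch, assuming a hypothetical `ratesToReporting` lookup supplied by whatever FX source you already trust, and rounding to two decimals (which does not suit zero-decimal currencies such as JPY):

```typescript
type CartValueSnapshot = {
  amount: number;            // in the transaction currency
  currency: string;
  reportingAmount: number;   // converted at event time
  reportingCurrency: string;
  fxRate: number;            // rate used, stored so the conversion can be audited
};

function snapshotCartValue(
  amount: number,
  currency: string,
  reportingCurrency: string,
  ratesToReporting: Record<string, number>
): CartValueSnapshot {
  const fxRate = currency === reportingCurrency ? 1 : ratesToReporting[currency];
  if (fxRate === undefined) throw new Error(`no FX rate for ${currency}`);
  return {
    amount,
    currency,
    reportingAmount: Math.round(amount * fxRate * 100) / 100,
    reportingCurrency,
    fxRate,
  };
}

// Hypothetical rate table: 1 EUR = 1.1 USD at the time the event was received.
const cartSnapshot = snapshotCartValue(100, "EUR", "USD", { EUR: 1.1 });
console.log(cartSnapshot.reportingAmount); // 110
```

Storing the rate alongside both amounts lets finance reproduce the reporting figure later without depending on a historical FX lookup.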

Where we'd start

If you have an existing recovery flow and no idea whether it works, don't rebuild everything on day one. Do this instead:

  1. Stand up the event store and start logging Shopify webhooks into it. Two weeks of data is enough to begin.
  2. Add a 10% holdout to your existing flow. Don't change anything else.
  3. After 30 days, compare holdout versus treatment conversion on abandoners. That single number tells you whether the rebuild is worth funding.
  4. If the lift is real but small, the issue is usually targeting and channel mix, not the messages themselves. Rebuild the dispatcher first.
  5. If the lift is invisible, you have permission to question whether the flow should exist at all — and the data to defend that conversation.

The pipeline is the boring part. The holdout is the part that changes how the team argues about CRO.

#Shopify #CRO #Attribution #Webhooks #Checkout

72Technologies builds production software for the kind of teams who actually read this blog.
