ESPs do not maliciously inflate open rates. They report what their tracking pixel records, and the tracking pixel records what HTTP requests arrive at the server. The problem is that the world has changed around that measurement without the measurement changing to match.
To quantify how far off the reported numbers are, we ran identical campaigns through seven major ESPs and compared the ESP-reported open rate to an independently measured human-engagement estimate. The gaps are substantial, and the shape of the distortion varies by vendor.
Methodology
For each ESP we:
- Sent the same 20,000-message campaign to the same recipient pool.
- Recorded the ESP-reported open rate after 48 hours.
- Independently measured a "human-plausible" open rate by filtering the raw HTTP logs for opens from non-proxy, non-scanner source IPs carrying consumer-client User-Agents, and requiring a subsequent click or reply on the same message.
- Computed the ratio of reported to human-plausible — the inflation factor.
The recipient pool was a realistic mix: 42% Apple Mail, 31% Gmail (mixed web/mobile), 19% Outlook (mixed consumer/enterprise), 8% other.
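The filtering steps above can be sketched in a few lines. This is an illustrative reconstruction, not our production pipeline: the `OpenEvent` fields, the simplified Apple-IP check, and the scanner token list are all assumptions standing in for the real log schema and range lists.

```python
from dataclasses import dataclass

# Hypothetical log record; field names are illustrative, not any ESP's schema.
@dataclass
class OpenEvent:
    message_id: str
    source_ip: str
    user_agent: str
    engaged: bool  # True if a click or reply followed within 48 hours

def is_apple_proxy(ip: str) -> bool:
    # Simplified stand-in: the real check tests membership in Apple's
    # published CIDR ranges, not just the 17.0.0.0/8 prefix.
    return ip.startswith("17.")

SCANNER_UA_TOKENS = ("proofpoint", "mimecast", "barracuda")

def human_plausible(events):
    """Unique messages whose opens are neither MPP-proxied nor
    scanner-driven, and which were followed by a click or reply."""
    return {
        e.message_id for e in events
        if not is_apple_proxy(e.source_ip)
        and not any(t in e.user_agent.lower() for t in SCANNER_UA_TOKENS)
        and e.engaged
    }

def inflation_factor(events):
    reported = {e.message_id for e in events}  # unique reported opens
    human = human_plausible(events)
    return len(reported) / max(len(human), 1)
```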
Results
Vendor names anonymised as A–G. Ratios shown are reported open rate divided by human-plausible estimate.
- Vendor A (marketing automation, mid-market) — reported 52%, human 11%, inflation factor 4.7x.
- Vendor B (ecommerce-focused ESP) — reported 48%, human 14%, inflation 3.4x.
- Vendor C (enterprise suite with MPP-adjustment toggle OFF) — reported 58%, human 13%, inflation 4.5x.
- Vendor C (same campaign with MPP-adjustment toggle ON) — reported 28%, human 13%, inflation 2.2x.
- Vendor D (transactional-focused) — reported 41%, human 17%, inflation 2.4x.
- Vendor E (cold-outreach platform) — reported 72%, human 12%, inflation 6.0x.
- Vendor F (newsletter creator platform) — reported 44%, human 15%, inflation 2.9x.
- Vendor G (SMB bulk sender) — reported 61%, human 13%, inflation 4.7x.
The best-behaved vendor (C with MPP-adjustment on) still over-reports by 2.2x. The worst over-reports by 6x. No vendor produces a number within 50% of the human-plausible estimate out of the box.
Why the gaps vary so much
Three factors explain most of the variation between vendors.
1. Whether they filter Apple MPP opens
MPP opens arrive from Apple's CIDR ranges with stripped User-Agents. A few seconds of log filtering can identify them. Vendor C's optional toggle does exactly this and cuts the inflation factor in half.
Of the seven vendors audited, only two (C and D) filter MPP opens by default or behind an opt-in toggle. The remaining five count every MPP event as a regular open.
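An MPP filter of this kind is a small amount of code. The sketch below uses Python's standard `ipaddress` module; the 17.0.0.0/8 range is Apple's well-known allocation, but the full, current list of MPP egress ranges is published by Apple and changes over time, so treat the range list and the stripped-UA heuristic as assumptions.

```python
import ipaddress

# Illustrative range only; Apple publishes the authoritative, evolving
# list of MPP / Private Relay egress ranges.
APPLE_RANGES = [ipaddress.ip_network("17.0.0.0/8")]

def is_mpp_open(source_ip: str, user_agent: str) -> bool:
    try:
        ip = ipaddress.ip_address(source_ip)
    except ValueError:
        return False  # malformed IP in the log line
    in_apple_range = any(ip in net for net in APPLE_RANGES)
    # MPP strips the real client UA; treat an empty or bare generic
    # string as stripped (heuristic, not a spec-defined value).
    stripped_ua = user_agent.strip() in ("", "Mozilla/5.0")
    return in_apple_range and stripped_ua
```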
2. Whether they filter known security scanners
Scanner opens arrive with identifiable User-Agents: Proofpoint, Mimecast, Barracuda. Filtering them out is straightforward. Vendor D does this by default. The others mostly do not.
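Scanner filtering is similarly mechanical: a case-insensitive match on vendor tokens in the User-Agent string. Real scanner UA strings vary by product and version, so a loose pattern like this is a starting heuristic rather than an exhaustive rule.

```python
import re

# Vendor tokens to look for; actual scanner UA strings vary, so match loosely.
SCANNER_PATTERN = re.compile(
    r"proofpoint|mimecast|barracuda|cisco|symantec|forcepoint|trend\s*micro",
    re.IGNORECASE,
)

def is_scanner_open(user_agent: str) -> bool:
    return bool(SCANNER_PATTERN.search(user_agent))
```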
3. Whether they count duplicate pixel fires as one open
A single message can generate multiple pixel requests — MPP plus a subsequent human view, a scanner plus a human view, an AI summariser plus a human view. Some vendors count each pixel load as a separate open event, which pushes the inflation factor higher. The cleaner behaviour is "unique open per message", which most vendors nominally implement but do not all enforce rigorously.
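"Unique open per message" amounts to collapsing repeated pixel fires down to one record per message, typically keeping the earliest timestamp. A minimal sketch, assuming events arrive as (message_id, timestamp) pairs:

```python
def unique_opens(events):
    """Collapse multiple pixel fires per message into one open,
    keeping the earliest fire's timestamp.

    events: iterable of (message_id, timestamp) pairs.
    Returns: {message_id: first_timestamp}.
    """
    first_fire = {}
    for message_id, ts in sorted(events, key=lambda e: e[1]):
        first_fire.setdefault(message_id, ts)
    return first_fire
```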
The cold-outreach outlier
Vendor E stands out at 6x inflation. Cold-outreach platforms tend to send to Apple-Mail-heavy founder and executive lists, which saturates the MPP contribution. They also rarely filter scanners because many of their target domains are enterprise with heavy scanner presence, and filtering would drop the headline number in a way that hurts the sales pitch.
The practical consequence: when a cold-outreach tool advertises an "80% open rate" case study, the underlying human engagement is probably closer to 13%. The 80% is real in the sense that the pixel fired 80% of the time; it is misleading in the sense that the reader is supposed to imagine 80 humans out of 100 read the message.
What honest dashboards would look like
A few vendors are making moves toward honesty. Features we have seen across the industry:
- "MPP-adjusted opens" as a separate column or toggle.
- Per-provider open rate breakdowns, which make the Apple inflation obvious when the Apple column hits 95%+.
- "Verified engagement" metrics that combine opens with clicks or replies and are much harder to fake.
- Optional warning banners when a campaign's open rate is suspiciously high relative to its click rate, indicating pre-fetch dominance.
None of these are universal. Most are buried in settings. Most are off by default because the default-on behaviour would make the headline number smaller, and vendors' own marketing teams prefer the bigger number.
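The open-versus-click warning is the simplest of these features to build. A sketch, with the caveat that the 20:1 threshold here is an illustrative assumption, not an industry standard:

```python
def prefetch_warning(open_rate: float, click_rate: float,
                     max_ratio: float = 20.0) -> bool:
    """Flag campaigns whose opens dwarf clicks, suggesting that
    pre-fetch (MPP, scanners) dominates the open count.

    The 20:1 default threshold is an illustrative choice.
    """
    if click_rate == 0:
        return open_rate > 0  # opens with zero clicks is the extreme case
    return open_rate / click_rate > max_ratio
```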
How to audit your own ESP
You can reproduce a rough version of our audit in an afternoon, even without a cross-vendor comparison. What to check:
- Export raw open events with timestamp, source IP, User-Agent. Most ESPs provide this via API or CSV export.
- Count MPP-proxy events: source IP in Apple CIDR ranges plus empty or generic Apple User-Agent. Expect 30–60% of total opens.
- Count scanner events: User-Agent matching Proofpoint, Mimecast, Barracuda, Cisco, Symantec, Forcepoint, Trend Micro. Expect 5–15%.
- Compute the "plausibly human" remainder: opens that are neither MPP nor scanner, and that are followed by a click or reply within 48 hours.
- Divide reported opens by plausibly human. That is your dashboard's inflation factor.
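The whole five-step audit fits in one short function over an exported CSV. The column names below (`source_ip`, `user_agent`, `message_id`, `engaged`) are assumptions; adjust them to whatever your ESP's export actually produces, and note the same simplifications as earlier: 17.0.0.0/8 stands in for Apple's full range list, and the scanner tokens are a loose heuristic.

```python
import csv
import io

SCANNER_TOKENS = ("proofpoint", "mimecast", "barracuda", "cisco",
                  "symantec", "forcepoint", "trend micro")

def audit(csv_text: str) -> dict:
    """Rough dashboard audit over an exported open-events CSV with
    columns: timestamp,source_ip,user_agent,message_id,engaged."""
    total = mpp = scanners = 0
    reported, human = set(), set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        total += 1
        reported.add(row["message_id"])
        ua = row["user_agent"].lower()
        if row["source_ip"].startswith("17."):   # Apple's 17.0.0.0/8 block
            mpp += 1
        elif any(t in ua for t in SCANNER_TOKENS):
            scanners += 1
        elif row["engaged"] == "1":              # click or reply within 48h
            human.add(row["message_id"])
    return {
        "mpp_share": mpp / max(total, 1),
        "scanner_share": scanners / max(total, 1),
        "inflation": len(reported) / max(len(human), 1),
    }
```

On a real export, expect `mpp_share` in the 30–60% band and `scanner_share` in the 5–15% band described above; `inflation` is your dashboard's inflation factor.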
Every ESP we have seen produces an inflation factor between 2x and 6x with this methodology. If yours comes in at 2x, you have a relatively honest dashboard. If it comes in at 5x+, you are making decisions based mostly on robot activity.
Inbox Check does not rely on pixels at all. We send your campaign to real seed mailboxes at major providers and report where each one landed — inbox, promotions, spam. No inflation factor, no filtering required. Free test at the homepage.
The accountability question
Should ESPs fix this? Yes, and slowly they are, but there is a commercial friction: the vendor whose dashboard shows 20% opens will lose comparison shopping against the vendor whose dashboard shows 50%, even when the 20% is more honest. Market pressure rewards the inflated number.
The practical solution is for buyers to demand the honest number. If every RFP required "MPP-adjusted, scanner-filtered, unique open rate" as the primary metric, vendors would standardise on it. Until then, the dashboards will lie, and the responsibility for reading them correctly falls on the buyer.