Every enterprise buys a secure email gateway. Every secure email gateway opens your mail, renders the HTML, and clicks your links. For any B2B sender, between 10% and 30% of "clicks" in a typical campaign come from bots, not buyers. This article is the rough taxonomy we use internally when we audit ESP data.
The main scanner families
Not every gateway leaves the same fingerprint, but there are enough recurring patterns that a quick signature match filters most noise.
- Microsoft SafeLinks / Defender: rewrites URLs through safelinks.protection.outlook.com. Usually leaks MSOffice or an empty User-Agent. Most common.
- Barracuda Link Protection: rewrites to linkprotect.cudasvc.com. Fetches with Barracuda in the User-Agent.
- Proofpoint URL Defense: rewrites to urldefense.proofpoint.com, sometimes urldefense.com. UA usually contains proofpoint.
- Mimecast URL Protect: rewrites to protect-*.mimecast.com. Scanner IPs are in AS-34440.
- Cisco Secure Email / IronPort: rewrites to secure-web.cisco.com. UA is CiscoSecureEmail or similar.
- Trend Micro Email Security: various rewrites, UA TMES.
- FortiMail / Sophos / F-Secure: less common for SMB, but Fortinet and Sophos leak identifiable UAs.
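The rewrite hosts in the list above can be turned into a small lookup. The sketch below assumes your click log exposes the referrer or the rewritten URL as a full URL string; the function name `vendor_from_url` and the vendor labels are illustrative, and the wildcard for regional SafeLinks subdomains is our addition, not part of the list.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Rewrite hosts from the list above, as shell-style patterns.
# The "*." SafeLinks entry covers regional subdomains (an assumption we add).
REWRITE_HOSTS = {
    "microsoft": ["safelinks.protection.outlook.com",
                  "*.safelinks.protection.outlook.com"],
    "barracuda": ["linkprotect.cudasvc.com"],
    "proofpoint": ["urldefense.proofpoint.com", "urldefense.com"],
    "mimecast": ["protect-*.mimecast.com"],
    "cisco": ["secure-web.cisco.com"],
}


def vendor_from_url(url):
    """Return the vendor whose rewrite host matches `url`, or None.

    URLs without a scheme have no parseable hostname and return None.
    """
    host = (urlparse(url).hostname or "").lower()
    for vendor, patterns in REWRITE_HOSTS.items():
        if any(fnmatchcase(host, pattern) for pattern in patterns):
            return vendor
    return None
```

Run it over the referrer column first: a match there attributes the hit to a vendor even when the User-Agent is empty.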
A simple click-log audit
Export one hour of raw click events from your ESP. You want at minimum: timestamp, recipient, link_id, url, ip, user_agent, referrer. Load it into a notebook and ask three questions:
- What fraction of clicks happen within 5 seconds of delivery? Humans are not that fast. If it is above 5%, you have scanner noise.
- How many recipients clicked every single link in the email? Real humans click one, maybe two. Bots click all.
- What is the distribution of clicks per recipient in the first 30 seconds? A long tail above 3 is bots.
    -- Assumes each row in clicks carries both clicked_at and delivered_at.
    SELECT
        recipient,
        COUNT(*) AS clicks_in_first_30s,
        COUNT(DISTINCT link_id) AS unique_links_clicked
    FROM clicks
    WHERE clicked_at - delivered_at < INTERVAL '30 seconds'
    GROUP BY recipient
    HAVING COUNT(DISTINCT link_id) >= 3
    ORDER BY unique_links_clicked DESC;

Every row in that output is almost certainly a bot. Mark those recipients' clicks and exclude them.
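The first two audit questions can be answered in the notebook without SQL. This is a minimal sketch, assuming each click is a dict with `recipient`, `link_id`, `clicked_at` and `delivered_at` (the last two as `datetime` objects) and that you know the set of link IDs in the message; the function name and field names are hypothetical.

```python
from collections import defaultdict
from datetime import timedelta


def audit_clicks(clicks, links_in_email, fast=timedelta(seconds=5)):
    """Answer questions 1 and 2 of the audit for one campaign."""
    total = 0
    fast_clicks = 0
    links_by_recipient = defaultdict(set)
    for click in clicks:
        total += 1
        # Question 1: was the click within `fast` of delivery?
        if click["clicked_at"] - click["delivered_at"] < fast:
            fast_clicks += 1
        links_by_recipient[click["recipient"]].add(click["link_id"])

    # Question 2: recipients whose clicked links cover every link in the email.
    wanted = set(links_in_email)
    clicked_all = sorted(
        recipient
        for recipient, links in links_by_recipient.items()
        if links >= wanted
    )
    return {
        "fast_fraction": fast_clicks / total if total else 0.0,
        "clicked_every_link": clicked_all,
    }
```

Compare `fast_fraction` against the 5% threshold above, and treat the recipients in `clicked_every_link` as candidates for exclusion.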
User-Agent signatures worth blocking
The following substrings are unambiguous and safe to filter:
safelinks
barracuda
proofpoint
mimecast
cisco
ironport
sophos
fortimail
trend
symantec
microsoft office
msoffice
ms-office
google-safety
google-inspectiontool
googleimageproxy
slack-imgproxy
outlookwebapp
checkpoint
tmes
f-secure

A few more are context-dependent. For example, HeadlessChrome and PhantomJS are sometimes used by sandboxing engines, and while neither belongs in marketing click logs, you may see them from headless browsers on your own side if you prefetch links in your dashboard. Check before blacklisting.
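Applied to a log, the list above is a case-insensitive substring match. The sketch below is one way to write it; the three labels and the function name `classify_ua` are our own, and the context-dependent entries are reported for review rather than filtered.

```python
# Substrings copied from the list above; matched case-insensitively.
SCANNER_UA_SUBSTRINGS = (
    "safelinks", "barracuda", "proofpoint", "mimecast", "cisco", "ironport",
    "sophos", "fortimail", "trend", "symantec", "microsoft office",
    "msoffice", "ms-office", "google-safety", "google-inspectiontool",
    "googleimageproxy", "slack-imgproxy", "outlookwebapp", "checkpoint",
    "tmes", "f-secure",
)

# Context-dependent signatures: flag for a manual check, do not auto-filter.
REVIEW_UA_SUBSTRINGS = ("headlesschrome", "phantomjs")


def classify_ua(user_agent):
    """Return 'scanner', 'review', or 'unknown' for a User-Agent string."""
    ua = (user_agent or "").lower()
    if any(s in ua for s in SCANNER_UA_SUBSTRINGS):
        return "scanner"
    if any(s in ua for s in REVIEW_UA_SUBSTRINGS):
        return "review"
    return "unknown"
```

An empty or missing User-Agent comes back as `unknown` here on purpose, since it needs signals other than the UA string.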
The empty-UA problem
Many scanners send no User-Agent or set it to the literal string "-". About a third of all bot hits in our sample had no UA. Relying on UA alone will miss them. The fallback signals:
- IP in a range owned by a gateway vendor's AS. Use whois or a passive DNS service.
- No Accept or Accept-Language header.
- First click on the link is within 2 seconds of delivery.
- No cookie set, ever. Real browsers store and return cookies; scanners drop them.
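The four fallback signals can be evaluated together for hits with no usable User-Agent. This is a sketch under stated assumptions: the CIDR ranges are placeholders from the RFC 5737 documentation blocks, the hit fields are hypothetical names, the header check fires when either header is missing, and the threshold of two signals is an arbitrary starting point.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks). Replace them with the
# ranges you look up for each gateway vendor's AS.
VENDOR_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def fallback_signals(hit):
    """Evaluate the four fallback signals for one click.

    `hit` is a dict with: ip (str), headers (dict with lower-cased names),
    seconds_since_delivery (number), is_first_click (bool), cookie_seen (bool).
    """
    ip = ipaddress.ip_address(hit["ip"])
    headers = hit.get("headers", {})
    return {
        "vendor_ip": any(ip in net for net in VENDOR_NETWORKS),
        # Interpretation: fires if either header is absent.
        "no_accept_headers": "accept" not in headers
        or "accept-language" not in headers,
        "fast_first_click": bool(hit["is_first_click"])
        and hit["seconds_since_delivery"] <= 2,
        "no_cookie": not hit["cookie_seen"],
    }


def looks_like_scanner(hit, threshold=2):
    """True when at least `threshold` fallback signals fire."""
    return sum(fallback_signals(hit).values()) >= threshold
```

Requiring more than one signal keeps a single weak indicator, such as a privacy-minded user with cookies disabled, from marking a real click as a bot.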
Scanner User-Agents change every six months. Static blocklists go stale fast. We recommend storing scanner signatures in a database you update weekly, not in a hard-coded file. Review it during every quarterly deliverability audit.
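One way to keep signatures out of a hard-coded file is a small table with a review date per entry. The sketch below uses SQLite from the Python standard library; the table name, columns, and function names are hypothetical, and any database you already run would do as well.

```python
import sqlite3
from datetime import date


def open_store(path=":memory:"):
    """Open (and if needed create) the signature table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scanner_signatures (
               substring     TEXT PRIMARY KEY,
               vendor        TEXT,
               last_reviewed TEXT NOT NULL  -- ISO date of the last review
           )"""
    )
    return conn


def upsert_signature(conn, substring, vendor, reviewed=None):
    """Insert a signature or refresh its vendor and review date."""
    reviewed = (reviewed or date.today()).isoformat()
    conn.execute(
        "INSERT OR REPLACE INTO scanner_signatures VALUES (?, ?, ?)",
        (substring.lower(), vendor, reviewed),
    )
    conn.commit()


def load_signatures(conn):
    """Return all substrings, for use by the click filter."""
    rows = conn.execute(
        "SELECT substring FROM scanner_signatures ORDER BY substring"
    )
    return [row[0] for row in rows]


def stale_signatures(conn, cutoff):
    """Return substrings not reviewed since `cutoff` (a date)."""
    rows = conn.execute(
        "SELECT substring FROM scanner_signatures "
        "WHERE last_reviewed < ? ORDER BY substring",
        (cutoff.isoformat(),),
    )
    return [row[0] for row in rows]
```

The weekly update calls `upsert_signature`; the quarterly audit starts from `stale_signatures` to see which entries nobody has confirmed recently.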
Why you still want to see the bot clicks
Strip them from CTR, but keep them in raw logs. Bot activity is a reliable signal of delivery. If a scanner never hits your link, it often means the message went to spam and the scanner never rendered it. The ratio of "scanners hit the link" to "humans hit the link" is itself a deliverability KPI.
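The scanner-to-human ratio can be computed per campaign from the same raw log. This sketch assumes each event already carries a boolean `is_bot` flag and a `campaign` identifier; both field names are hypothetical.

```python
from collections import Counter


def scanner_human_ratio(events):
    """Per-campaign counts of scanner and human hits, plus their ratio."""
    bot, human = Counter(), Counter()
    for event in events:
        (bot if event["is_bot"] else human)[event["campaign"]] += 1

    report = {}
    for campaign in sorted(set(bot) | set(human)):
        humans = human[campaign]
        report[campaign] = {
            "scanner_hits": bot[campaign],
            "human_hits": humans,
            # None when there are no human hits to divide by.
            "ratio": bot[campaign] / humans if humans else None,
        }
    return report
```

A campaign whose `scanner_hits` is zero while comparable sends show plenty is worth a spam-placement check.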
Two practical workflows
Workflow A: post-hoc clean-up
Ship your CTR as-is to the dashboard, but add a ctr_human metric alongside. Finance and exec teams keep seeing what they expect; analysts and growth teams use the clean number.
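Computing both numbers side by side is a small change. The sketch below assumes CTR is defined as unique clicking recipients divided by delivered messages (your ESP may define it differently) and that each click carries an `is_bot` flag; `ctr_metrics` is an illustrative name.

```python
def ctr_metrics(delivered, clicks):
    """Return the unchanged CTR and a ctr_human that ignores bot clicks.

    delivered: number of delivered messages.
    clicks: iterable of dicts with `recipient` and boolean `is_bot`.
    """
    if delivered <= 0:
        raise ValueError("delivered must be positive")

    all_clickers = set()
    human_clickers = set()
    for click in clicks:
        all_clickers.add(click["recipient"])
        if not click["is_bot"]:
            human_clickers.add(click["recipient"])

    return {
        "ctr": len(all_clickers) / delivered,
        "ctr_human": len(human_clickers) / delivered,
    }
```

A recipient with both a scanner hit and a later human click still counts toward `ctr_human`, because the exclusion is per click, not per recipient.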
Workflow B: pre-blast seed test
Before the real send, blast to a seed list that has no scanner between sender and inbox. Every click on a seed list is a human-ish click (or your own test), so you get a noise-free baseline. Inbox Check seeds are exactly this: real mailboxes, no gateway in front.