You have a deploy pipeline. It runs tests, builds artifacts, ships to prod, notifies Slack. What it almost certainly does not do: verify that your transactional email still lands in the inbox afterward.
Which is a problem because a surprising share of deploys break email. A template change. An ESP credential rotation. A new SPF include. A link-wrapping tool added to the sending path. Each can tank placement for days before anyone notices.
The usual failure modes: template HTML that breaks rendering, DNS changes that break SPF/DKIM, ESP configuration drift, content regressions (new trigger words), and authentication misalignment. Thirty seconds of automation catches all of the above before a customer ever reports it.
What to check, exactly
A good post-deploy email check answers four questions:
- Did the template render? Send a real message via the production path, fetch it back, verify it parses and the key visible elements are present.
- Does authentication pass? SPF, DKIM, DMARC all evaluate correctly.
- Does it land in the inbox? Across Gmail, Outlook, Yahoo.
- Is the spam score reasonable? SpamAssassin score below 3, no obvious red flags.
The first question catches template regressions. The second catches DNS/ESP changes. The third catches the compounded effect of the above plus content drift. The fourth is a leading indicator.
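The first two questions come down to string matching once the test message has been fetched back. Below is a minimal sketch in POSIX shell, not a definitive implementation. The function names `check_rendered` and `auth_passes` are hypothetical, the `{{...}}` placeholder syntax is an assumption about the template engine, and `auth_passes` assumes the caller has already extracted the value of the message's `Authentication-Results` header.

```shell
# check_rendered FILE MARKER...
# Succeeds only if FILE is non-empty, contains every MARKER string,
# and has no leftover {{placeholder}} (assumed template syntax).
check_rendered() (
  file=$1
  shift
  [ -s "$file" ] || { echo "empty or missing message body: $file" >&2; exit 1; }
  bad=0
  for marker in "$@"; do
    if ! grep -qF -- "$marker" "$file"; then
      echo "missing element: $marker" >&2
      bad=1
    fi
  done
  # A surviving placeholder means a variable was never substituted.
  if grep -qE '\{\{[^}]*\}\}' "$file"; then
    echo "unrendered placeholder found" >&2
    bad=1
  fi
  exit "$bad"
)

# auth_passes HEADER_VALUE
# Succeeds only if the Authentication-Results value reports "pass"
# for SPF, DKIM and DMARC.
auth_passes() (
  for mech in spf dkim dmarc; do
    case "$1" in
      *"$mech=pass"*) ;;
      *) echo "$mech did not pass" >&2; exit 1 ;;
    esac
  done
  exit 0
)
```

A call such as `check_rendered body.html "Reset your password" "https://app.acme.io/reset"` would cover a password-reset template; the marker strings are whatever a customer must be able to see for the email to be useful.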
Hooking into CI/CD
The check runs after the deploy step, not as part of it. You do not want a slow placement check blocking a rollback. Run it in parallel or as a post-deploy job that can page if it fails.
GitHub Actions example
# .github/workflows/deploy.yml
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh
  post_deploy_email_check:
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      - name: Run inbox placement check
        env:
          INBOX_CHECK_API_KEY: ${{ secrets.INBOX_CHECK_API_KEY }}
        run: |
          # Start a placement test for the welcome template
          TEST_ID=$(curl -s -X POST https://check.live-direct-marketing.online/api/check \
            -H "Authorization: Bearer $INBOX_CHECK_API_KEY" \
            -H "Content-Type: application/json" \
            -d '{"senderDomain":"mail.acme.io","templateId":"welcome","source":"ci"}' \
            | jq -r '.id')
          # Wait for completion (up to 30 polls, 10 seconds apart)
          for i in {1..30}; do
            STATUS=$(curl -s https://check.live-direct-marketing.online/api/check/$TEST_ID \
              -H "Authorization: Bearer $INBOX_CHECK_API_KEY" | jq -r '.status')
            [[ "$STATUS" == "complete" ]] && break
            sleep 10
          done
          # Stop here if the check never finished; otherwise the threshold
          # comparison below would run against a missing rate
          if [[ "$STATUS" != "complete" ]]; then
            echo "::error::Inbox placement check did not complete in time"
            exit 1
          fi
          RATE=$(curl -s https://check.live-direct-marketing.online/api/check/$TEST_ID \
            -H "Authorization: Bearer $INBOX_CHECK_API_KEY" | jq -r '.summary.inboxRate')
          echo "Inbox placement: $RATE%"
          # Fail the job (and notify via GH notifications) if below threshold
          if (( $(echo "$RATE < 80" | bc -l) )); then
            echo "::error::Inbox placement $RATE% is below 80% threshold"
            exit 1
          fi
      - name: Post to Slack on failure
        if: failure()
        run: |
          curl -X POST "${{ secrets.SLACK_WEBHOOK }}" \
            -H "Content-Type: application/json" \
            -d '{"text":":warning: Post-deploy email check failed on release ${{ github.sha }}"}'
GitLab CI example
# .gitlab-ci.yml
post_deploy_email:
  stage: verify
  needs: [deploy]
  script:
    - ./scripts/post_deploy_email_check.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  allow_failure: false
Which templates to test after deploy
Testing every template on every deploy is slow and wasteful. Tag templates by criticality instead, and test the critical ones plus whatever actually changed.
- Critical (always test). Password reset, 2FA, payment receipt, order confirmation. Anything where a customer support ticket gets opened if placement drops.
- Changed in this deploy. Diff the template files; if one changed, test it specifically.
- Monthly rotation. Cycle through non-critical templates on staging, a batch each week, so that each one is tested about once a month.
The critical list is usually under 10 templates even at a large product. Test all of them on every deploy that touches anything email-adjacent.
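The selection rule (critical templates always, plus anything touched by the deploy) can be sketched as a small shell filter. This is an illustration under stated assumptions: the `templates/email/<id>.html` layout, the `CRITICAL_TEMPLATES` list, and the function name `select_templates` are all hypothetical and would need to match your repository.

```shell
# Hypothetical list of critical template IDs (always tested).
CRITICAL_TEMPLATES="password_reset two_factor payment_receipt order_confirmation"

# select_templates: reads changed file paths on stdin (one per line)
# and prints the template IDs to test, one per line, without duplicates.
select_templates() {
  {
    # Critical templates are included regardless of the diff.
    printf '%s\n' $CRITICAL_TEMPLATES
    # Assumed layout: templates/email/<id>.html maps to template ID <id>.
    sed -n 's|^templates/email/\([^/]*\)\.html$|\1|p'
  } | LC_ALL=C sort -u
}
```

In a pipeline it could be fed from the deploy diff, for example `git diff --name-only "$PREV_SHA" "$NEW_SHA" | select_templates`, where `PREV_SHA` and `NEW_SHA` are placeholders for the previous and current release commits.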
What to fail the deploy on
Be careful with the fail conditions. If your threshold is too aggressive, you will block deploys for placement noise. If it is too lax, you miss the regression you built the check for.
- Fail hard on authentication failure. SPF/DKIM/DMARC not passing on a production domain is a real bug; block the deploy and roll back.
- Fail hard on inbox rate below 60%. That is an unambiguous regression on a previously-working sender.
- Warn (do not fail) on an inbox rate of 60 to 80% or a drop of 10 percentage points from baseline. Post to Slack; let a human triage.
- Warn on spam score over 4. Borderline but fixable; not a rollback trigger.
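The four rules above fit in one decision function. A minimal sketch in POSIX shell follows; the function name `decide_outcome` and its argument order are hypothetical, and the thresholds are simply the ones listed above.

```shell
# decide_outcome AUTH RATE BASELINE SPAM
#   AUTH      "pass" if SPF, DKIM and DMARC all passed, anything else otherwise
#   RATE      inbox placement for this run, in percent
#   BASELINE  usual inbox placement for this sender, in percent
#   SPAM      spam score for this run
# Prints one of: fail, warn, pass.
decide_outcome() (
  auth=$1 rate=$2 baseline=$3 spam=$4
  # Authentication failure is always a hard failure.
  if [ "$auth" != "pass" ]; then
    echo fail
    exit 0
  fi
  # awk handles the decimal comparisons that plain sh cannot.
  awk -v rate="$rate" -v base="$baseline" -v spam="$spam" 'BEGIN {
    if (rate < 60)              print "fail"   # unambiguous regression
    else if (rate < 80)         print "warn"   # 60 to 80%: human triage
    else if (base - rate >= 10) print "warn"   # 10 points below baseline
    else if (spam > 4)          print "warn"   # borderline content
    else                        print "pass"
  }'
)
```

The caller would map `fail` to a non-zero exit code and `warn` to a Slack message that leaves the job green, for example `[ "$(decide_outcome pass 72 90 2.1)" = "fail" ] && exit 1`.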
Run the same check against a staging sender domain (warmed, SPF aligned, different selector) as part of the pre-deploy pipeline. A template regression caught on staging is a 5-minute fix; one caught in production after a deploy is an incident.
Runbook when the check fails
- Roll back if auth failed. This is a production email regression. Deploy the previous version.
- If rate dropped, diff the templates. Template HTML changes account for 70%+ of post-deploy placement drops.
- Check the DNS. Did someone add an ESP include to SPF and push past the 10-lookup limit?
- Check ESP config. Credential rotation, new sending subdomain, domain verification expiry.
- Rerun the check. Some placement drops are transient; if the rerun passes, log the incident but do not roll back.
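For the DNS step, a quick first pass is to count the lookup-triggering terms in the SPF record. Under RFC 7208 the `include`, `a`, `mx`, `ptr` and `exists` mechanisms and the `redirect` modifier each count toward the limit of 10. The sketch below (function name `spf_lookup_count` is hypothetical) counts only the terms in the single record it is given, so the result is a lower bound: lookups made inside included records count too and are not followed here.

```shell
# spf_lookup_count RECORD
# Prints how many terms in one SPF record trigger a DNS lookup.
# Nested includes are NOT resolved, so treat the result as a lower bound.
spf_lookup_count() (
  set -f   # disable globbing: a term like "?all" must not expand to file names
  record=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  count=0
  for term in $record; do
    # Strip an optional qualifier (+, -, ~ or ?).
    case "$term" in
      [+~?-]*) term=${term#?} ;;
    esac
    case "$term" in
      include:*|a|a:*|a/*|mx|mx:*|mx/*|ptr|ptr:*|exists:*|redirect=*)
        count=$((count + 1)) ;;
    esac
  done
  echo "$count"
)
```

The record itself can be fetched with `dig +short TXT mail.acme.io`. A top-level count that is already close to 10 is a strong hint that the newest include pushed the total over the limit.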