Mailtrap has been the de facto standard email sandbox for the past decade. Every backend engineer has at some point pointed their staging SMTP credentials at sandbox.smtp.mailtrap.io and breathed easier, knowing that a broken merge template or a runaway loop cannot actually mail 50,000 customers. It is a brilliantly simple product. It also does one thing — and by design, it does not do the other thing you need.
The thing it does not do: tell you how your production email lands in a real Gmail inbox. That is not a criticism. Mailtrap is a sandbox. Its entire value is that mail never leaves the sandbox. So the moment your code ships to production, you lose all visibility.
Inbox Check is not a replacement for Mailtrap. It is the missing production half. This article maps the boundary and shows you the CI/CD pattern to use both together.
Problem 1: "Does the email render correctly and contain the right variables?" — this is Mailtrap territory.
Problem 2: "Does the email land in the inbox, in Spam, or in Promotions at each major provider?" — this is Inbox Check territory. No sandbox can answer it because no sandbox goes through the real receiver's filters.
What Mailtrap does well
Mailtrap Email Testing (the sandbox) is perfect for the bugs you catch early:
- Template rendering — did the Handlebars merge actually resolve {{firstName}}, or is the raw placeholder in the production email?
- Broken HTML — unclosed tags, missing alt text, inline-CSS regressions.
- Spam score — static SpamAssassin score of the content itself, independent of sender reputation.
- Plain-text fallback — did the multipart version actually include the text body?
- Link checks — are all links valid, not broken staging URLs?
- Runaway prevention — no email ever leaves. If your staging job loops, no customer is harmed.
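The template-rendering check in particular is cheap to automate before any mail reaches a transport: scan the rendered HTML for merge tags that survived rendering. A minimal sketch — the naive `renderTemplate` helper and sample template are illustrative, not Mailtrap or Handlebars API:

```javascript
// Returns merge tags that survived rendering, e.g. ["{{planName}}"].
function findUnresolvedPlaceholders(html) {
  return html.match(/{{\s*[\w.]+\s*}}/g) ?? [];
}

// Naive merge for this example only; a real app would use Handlebars itself.
// Unknown tags are left untouched, mimicking a merge-variable miss.
function renderTemplate(template, data) {
  return template.replace(/{{\s*([\w.]+)\s*}}/g, (m, key) =>
    key in data ? data[key] : m
  );
}

const template = "<p>Hi {{firstName}}, your plan is {{planName}}.</p>";

const ok = renderTemplate(template, { firstName: "Ada", planName: "Pro" });
console.log(findUnresolvedPlaceholders(ok));     // []

const broken = renderTemplate(template, { firstName: "Ada" }); // planName missing
console.log(findUnresolvedPlaceholders(broken)); // [ '{{planName}}' ]
```

A check like this fails the build on exactly the bug that looks fine in a quick visual review but lands as `{{firstName}}` in a customer inbox.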
For all of this, Mailtrap is the correct tool. Use it in every staging environment, every CI run, every PR preview.
What Mailtrap cannot do
Deliverability is not content-only. Gmail's filter decides based on: sender IP reputation, sender domain reputation, SPF / DKIM / DMARC alignment, engagement history at your seed mailboxes, complaint rate, unsubscribe compliance, content, and feedback loops. A sandbox can inspect only three of those — the content, the authentication headers, and unsubscribe compliance. The rest exist only in the real production path.
This is why teams that rely on Mailtrap alone often ship to production, see "everything is green," and two days later get a panicked message: "nobody is opening our emails." The email renders perfectly. It just went to Spam at Gmail because the sending IP got listed last night.
The CI/CD pattern: both tools, one pipeline
The pattern we run at customers with high email volume is:
- Dev / PR previews — Mailtrap sandbox. SMTP credentials injected via a MAIL_URL env var. All email from the preview lands in a per-branch Mailtrap inbox.
- Staging — Mailtrap sandbox. Same credentials as previews, but with content-QA rules enabled (spam score, HTML validation, broken links).
- Staging release candidate — switch to a staging ESP (real sender, dedicated subdomain), and trigger an Inbox Check seed test. Placement must be above threshold or the release is blocked.
- Production deploy — post-deploy hook fires one more Inbox Check seed on the live sender. A regression between staging and production (different DNS, different IP pool, different content) is caught before customers notice.
- Production monitoring — scheduled Inbox Check every N hours, alerting into Slack on score drop.
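The per-environment switch in the stages above can live in one place in application code: parse a single MAIL_URL and hand the result to your mailer, so promoting a build never means touching transport code. A sketch, assuming an `smtp://user:pass@host:port` URL convention — the `parseMailUrl` helper is ours, not a library API:

```javascript
// Derive SMTP transport settings from one MAIL_URL env var, e.g.
//   smtp://USER:PASS@sandbox.smtp.mailtrap.io:2525   (previews / staging)
//   smtp://USER:PASS@smtp.your-esp.com:587           (production)
function parseMailUrl(mailUrl) {
  const u = new URL(mailUrl);
  return {
    host: u.hostname,
    port: Number(u.port || 587), // default to submission port if omitted
    user: decodeURIComponent(u.username),
    pass: decodeURIComponent(u.password),
  };
}

const cfg = parseMailUrl(
  process.env.MAIL_URL ?? "smtp://user:secret@sandbox.smtp.mailtrap.io:2525"
);
console.log(cfg.host, cfg.port);
```

The point of the single variable is that the diff between environments is configuration only; the code path that builds and sends the message is identical everywhere.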
Example pipeline YAML
Below is a simplified GitHub Actions workflow that invokes both. Mailtrap runs in the test job; Inbox Check runs in the post-deploy job. Secrets live in GitHub Secrets.
```yaml
name: email-ci

on:
  push:
    branches: [main]

jobs:
  test-emails-in-mailtrap:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm ci
      - name: Run email rendering tests
        env:
          SMTP_HOST: sandbox.smtp.mailtrap.io
          SMTP_PORT: 2525
          SMTP_USER: ${{ secrets.MAILTRAP_USER }}
          SMTP_PASS: ${{ secrets.MAILTRAP_PASS }}
        run: npm run test:emails
      - name: Fail if Mailtrap spam score > 5
        run: npm run mailtrap:check-spam-score

  deploy:
    needs: test-emails-in-mailtrap
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to production
        run: ./scripts/deploy.sh

  seed-check-after-deploy:
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      - name: Create Inbox Check placement test
        id: seed
        run: |
          RESP=$(curl -s -X POST \
            https://check.live-direct-marketing.online/api/tests \
            -H "Content-Type: application/json" \
            -d '{
              "subject": "Post-deploy seed",
              "from": "alerts@yourbrand.com",
              "html": "<p>Deploy #${{ github.run_number }}</p>"
            }')
          echo "token=$(echo "$RESP" | jq -r .token)" >> "$GITHUB_OUTPUT"
      - name: Wait 5 minutes for placement results
        run: sleep 300
      - name: Fetch placement score
        run: |
          SCORE=$(curl -s \
            https://check.live-direct-marketing.online/api/tests/${{ steps.seed.outputs.token }} \
            | jq -r .score)
          echo "Score: $SCORE"
          # jq emits "null" if the result is not ready; treat that as a failure too.
          if [ "$SCORE" = "null" ] || [ "$SCORE" -lt 70 ]; then
            echo "Placement below threshold"
            exit 1
          fi
```

A native integration is in private beta. Instead of scripting curl + jq + sleep, you will have a single GitHub Action and a Slack app that reports placement on every deploy.
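The fixed `sleep 300` is the weakest link in that workflow: a slow test wastes five minutes, and a very slow one still fails. A polling loop is more robust. A sketch — the `/api/tests/:token` endpoint and its `.score` field come from the workflow; `waitForScore` and the injected `fetchScore` callback are our own names:

```javascript
// Poll until a placement score appears, instead of sleeping a fixed interval.
// fetchScore is injected so the loop itself can be exercised without network:
// it should resolve to a number, or null while the test is still pending.
async function waitForScore(fetchScore, { attempts = 30, delayMs = 10_000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const score = await fetchScore();
    if (score !== null) return score;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error("placement result not ready in time");
}

// Real use would wrap fetch() against the API shown in the workflow:
// const score = await waitForScore(async () => {
//   const res = await fetch(`https://check.live-direct-marketing.online/api/tests/${token}`);
//   const body = await res.json();
//   return typeof body.score === "number" ? body.score : null;
// });
```

Injecting the fetch function keeps the retry policy (attempts, delay, timeout error) unit-testable while the HTTP call stays a thin, obvious wrapper.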
Seed on every real send
The deeper pattern, beyond CI/CD hooks, is this: for every real production send of a new template, include your own seed mailbox addresses in the recipient list. If you send a campaign to 100,000 users, the cost of adding 20 more seed addresses is zero. Those seeds are your production canaries. They tell you where the real email landed, not where a sandbox thinks it would have landed.
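In code, the seed pattern is one small function at the point where the recipient list is finalized. A sketch — the seed addresses below are placeholders; in practice they would come from your Inbox Check seed list:

```javascript
// Illustrative seed mailboxes only; real ones come from your seed-list provider.
const SEED_ADDRESSES = [
  "seed-gmail-01@example.com",
  "seed-outlook-01@example.com",
  "seed-yahoo-01@example.com",
];

// Append seeds to every production send of a new template.
// Set-based dedupe ensures an address is never mailed twice.
function withSeeds(recipients, seeds = SEED_ADDRESSES) {
  return [...new Set([...recipients, ...seeds])];
}

const list = withSeeds(["alice@customer.example", "bob@customer.example"]);
console.log(list.length); // 5
```

Because the seeds ride along on the real campaign — same sender, same IP pool, same content, same send time — what lands in them is exactly what landed for customers.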
This closes the loop that Mailtrap alone cannot close. Mailtrap validates what the code produced; seeds validate what the receiver did with it. Both are needed. Neither replaces the other.
TL;DR
Mailtrap is a staging sandbox. Inbox Check is production placement verification. If your CI only runs Mailtrap, you are testing 30% of the email delivery pipeline. Add an Inbox Check step after deploy and a seed on every send, and you cover the other 70% for free.