Mailtrap has been the de facto standard email sandbox for the past decade. Every backend engineer has at some point pointed their staging SMTP credentials at sandbox.smtp.mailtrap.io and breathed easier, knowing that a broken merge template or a runaway loop cannot actually mail 50,000 customers. It is a brilliantly simple product. It also does one thing — and by design, it does not do the other thing you need.
The thing it does not do: tell you how your production email lands in a real Gmail inbox. That is not a criticism. Mailtrap is a sandbox. Its entire value is that mail never leaves the sandbox. So the moment your code ships to production, you lose all visibility.
Inbox Check is not a replacement for Mailtrap. It is the missing production half. This article maps the boundary and shows you the CI/CD pattern to use both together.
Problem 1: "Does the email render correctly and contain the right variables?" — this is Mailtrap territory.
Problem 2: "Does the email land in the inbox, in Spam, or in Promotions at each major provider?" — this is Inbox Check territory. No sandbox can answer it because no sandbox goes through the real receiver's filters.
What Mailtrap does well
Mailtrap Email Testing (the sandbox) is perfect for the bugs you catch early:
- Template rendering — did the Handlebars merge actually resolve {{firstName}}, or is the raw placeholder in the production email?
- Broken HTML — unclosed tags, missing alt text, inline-CSS regressions.
- Spam score — static SpamAssassin score of the content itself, independent of sender reputation.
- Plain-text fallback — did the multipart version actually include the text body?
- Link checks — are all links valid, not broken staging URLs?
- Runaway prevention — no email ever leaves. If your staging job loops, no customer is harmed.
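The template-rendering check in particular is cheap to automate before any mail reaches a transport: scan the rendered HTML for merge tags that survived rendering. A minimal sketch — the naive `renderTemplate` helper and sample template are illustrative, not Mailtrap or Handlebars API:

```javascript
// Returns merge tags that survived rendering, e.g. ["{{planName}}"].
function findUnresolvedPlaceholders(html) {
  return html.match(/{{\s*[\w.]+\s*}}/g) ?? [];
}

// Naive merge for this example only; a real app would use Handlebars itself.
// Unknown tags are left untouched, mimicking a merge-variable miss.
function renderTemplate(template, data) {
  return template.replace(/{{\s*([\w.]+)\s*}}/g, (m, key) =>
    key in data ? data[key] : m
  );
}

const template = "<p>Hi {{firstName}}, your plan is {{planName}}.</p>";

const ok = renderTemplate(template, { firstName: "Ada", planName: "Pro" });
console.log(findUnresolvedPlaceholders(ok));     // []

const broken = renderTemplate(template, { firstName: "Ada" }); // planName missing
console.log(findUnresolvedPlaceholders(broken)); // [ '{{planName}}' ]
```

A check like this fails the build on exactly the bug that looks fine in a quick visual review but lands as `{{firstName}}` in a customer inbox.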
For all of this, Mailtrap is the correct tool. Use it in every staging environment, every CI run, every PR preview.
What Mailtrap cannot do
Deliverability is not content-only. Gmail's filter decides based on: sender IP reputation, sender domain reputation, SPF / DKIM / DMARC alignment, engagement history at your seed mailboxes, complaint rate, unsubscribe compliance, content, and feedback loops. A sandbox can inspect only three of those — the content, the authentication headers, and unsubscribe compliance. The rest exist only in the real production path.
This is why teams that rely on Mailtrap alone often ship to production, see "everything is green," and two days later get a panicked message: "nobody is opening our emails." The email renders perfectly. It just went to Spam at Gmail because the sending IP got listed last night.
The CI/CD pattern: both tools, one pipeline
The pattern we run at customers with high email volume is:
- Dev / PR previews — Mailtrap sandbox. SMTP credentials injected via a MAIL_URL env var. All email from the preview lands in a per-branch Mailtrap inbox.
- Staging — Mailtrap sandbox. Same credentials as previews, but with content-QA rules enabled (spam score, HTML validation, broken links).
- Staging release candidate — switch to a staging ESP (real sender, dedicated subdomain), and trigger an Inbox Check seed test. Placement must be above threshold or the release is blocked.
- Production deploy — post-deploy hook fires one more Inbox Check seed on the live sender. A regression between staging and production (different DNS, different IP pool, different content) is caught before customers notice.
- Production monitoring — scheduled Inbox Check every N hours, alerting into Slack on score drop.
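The per-environment switch in the stages above can live in one place in application code: parse a single MAIL_URL and hand the result to your mailer, so promoting a build never means touching transport code. A sketch, assuming an `smtp://user:pass@host:port` URL convention — the `parseMailUrl` helper is ours, not a library API:

```javascript
// Derive SMTP transport settings from one MAIL_URL env var, e.g.
//   smtp://USER:PASS@sandbox.smtp.mailtrap.io:2525   (previews / staging)
//   smtp://USER:PASS@smtp.your-esp.com:587           (production)
function parseMailUrl(mailUrl) {
  const u = new URL(mailUrl);
  return {
    host: u.hostname,
    port: Number(u.port || 587), // default to submission port if omitted
    user: decodeURIComponent(u.username),
    pass: decodeURIComponent(u.password),
  };
}

const cfg = parseMailUrl(
  process.env.MAIL_URL ?? "smtp://user:secret@sandbox.smtp.mailtrap.io:2525"
);
console.log(cfg.host, cfg.port);
```

The point of the single variable is that the diff between environments is configuration only; the code path that builds and sends the message is identical everywhere.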
Example pipeline YAML
Below is a simplified GitHub Actions workflow that invokes both. Mailtrap runs in the test job; Inbox Check runs in the post-deploy job. Secrets live in GitHub Secrets.
```yaml
name: email-ci

on:
  push:
    branches: [main]

jobs:
  test-emails-in-mailtrap:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm ci
      - name: Run email rendering tests
        env:
          SMTP_HOST: sandbox.smtp.mailtrap.io
          SMTP_PORT: 2525
          SMTP_USER: ${{ secrets.MAILTRAP_USER }}
          SMTP_PASS: ${{ secrets.MAILTRAP_PASS }}
        run: npm run test:emails
      - name: Fail if Mailtrap spam score > 5
        run: npm run mailtrap:check-spam-score

  deploy:
    needs: test-emails-in-mailtrap
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to production
        run: ./scripts/deploy.sh

  seed-check-after-deploy:
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      - name: Create Inbox Check placement test
        id: seed
        run: |
          RESP=$(curl -s -X POST \
            https://check.live-direct-marketing.online/api/tests \
            -H "Content-Type: application/json" \
            -d '{
              "subject": "Post-deploy seed",
              "from": "alerts@yourbrand.com",
              "html": "<p>Deploy #${{ github.run_number }}</p>"
            }')
          echo "token=$(echo "$RESP" | jq -r .token)" >> "$GITHUB_OUTPUT"
      - name: Wait 5 minutes for placement results
        run: sleep 300
      - name: Fetch placement score
        run: |
          SCORE=$(curl -s \
            https://check.live-direct-marketing.online/api/tests/${{ steps.seed.outputs.token }} \
            | jq -r .score)
          echo "Score: $SCORE"
          # jq emits "null" if the result is not ready; treat that as a failure too.
          if [ "$SCORE" = "null" ] || [ "$SCORE" -lt 70 ]; then
            echo "Placement below threshold"
            exit 1
          fi
```

A native integration is in private beta. Instead of scripting curl + jq + sleep, you will have a single GitHub Action and a Slack app that reports placement on every deploy.
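The fixed `sleep 300` is the weakest link in that workflow: a slow test wastes five minutes, and a very slow one still fails. A polling loop is more robust. A sketch — the `/api/tests/:token` endpoint and its `.score` field come from the workflow; `waitForScore` and the injected `fetchScore` callback are our own names:

```javascript
// Poll until a placement score appears, instead of sleeping a fixed interval.
// fetchScore is injected so the loop itself can be exercised without network:
// it should resolve to a number, or null while the test is still pending.
async function waitForScore(fetchScore, { attempts = 30, delayMs = 10_000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const score = await fetchScore();
    if (score !== null) return score;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error("placement result not ready in time");
}

// Real use would wrap fetch() against the API shown in the workflow:
// const score = await waitForScore(async () => {
//   const res = await fetch(`https://check.live-direct-marketing.online/api/tests/${token}`);
//   const body = await res.json();
//   return typeof body.score === "number" ? body.score : null;
// });
```

Injecting the fetch function keeps the retry policy (attempts, delay, timeout error) unit-testable while the HTTP call stays a thin, obvious wrapper.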
Seed on every real send
The deeper pattern, beyond CI/CD hooks, is this: for every real production send of a new template, include your own seed mailbox addresses in the recipient list. If you send a campaign to 100,000 users, the cost of adding 20 more seed addresses is zero. Those seeds are your production canaries. They tell you where the real email landed, not where a sandbox thinks it would have landed.
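In code, the seed pattern is one small function at the point where the recipient list is finalized. A sketch — the seed addresses below are placeholders; in practice they would come from your Inbox Check seed list:

```javascript
// Illustrative seed mailboxes only; real ones come from your seed-list provider.
const SEED_ADDRESSES = [
  "seed-gmail-01@example.com",
  "seed-outlook-01@example.com",
  "seed-yahoo-01@example.com",
];

// Append seeds to every production send of a new template.
// Set-based dedupe ensures an address is never mailed twice.
function withSeeds(recipients, seeds = SEED_ADDRESSES) {
  return [...new Set([...recipients, ...seeds])];
}

const list = withSeeds(["alice@customer.example", "bob@customer.example"]);
console.log(list.length); // 5
```

Because the seeds ride along on the real campaign — same sender, same IP pool, same content, same send time — what lands in them is exactly what landed for customers.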
This closes the loop that Mailtrap alone cannot close. Mailtrap validates what the code produced; seeds validate what the receiver did with it. Both are needed. Neither replaces the other.
TL;DR
Mailtrap is a staging sandbox. Inbox Check is production placement verification. If your CI only runs Mailtrap, you are testing 30% of the email delivery pipeline. Add an Inbox Check step after deploy and a seed on every send, and you cover the other 70% for free.