Every platform team running their own observability stack ends up asking the same question about deliverability: why is the business-critical signal (inbox rate) stuck in a third-party SaaS dashboard when every other metric on the wall is in Grafana? This article fixes that. We scrape the Inbox Check API into Prometheus, expose it as a proper time series, and import a pre-built Grafana dashboard with per-provider inbox rate, DMARC alignment, DNSBL counts, and alert rules.
Inbox Check API → small Node exporter → Prometheus scrape → Grafana dashboard. Alerts via Grafana Unified Alerting → PagerDuty / Slack. No SaaS dashboard. No extra vendor.
Why Grafana beats SaaS dashboards for serious teams
SaaS dashboards are great for a team of one. They break down for teams that already have observability discipline. Three reasons Grafana wins once you have a platform team:
- One pane of glass. Inbox rate next to API error rate next to queue depth. Correlations are obvious; in separate tabs they are invisible.
- Alerting in the same place as everything else. Your on-call already handles Grafana alert rules for the rest of the stack; do not bolt on a second alerting story just for email.
- Retention under your control. Most SaaS dashboards keep 30–90 days. Prometheus + Thanos / Mimir gives you whatever retention you want.
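To make the retention point concrete: once the data is in your own TSDB, retention is a flag, not a pricing tier. A sketch with illustrative durations (the flags are real Prometheus and Thanos options; the values are yours to pick):

```shell
# Plain Prometheus: keep a year of raw data locally
prometheus --storage.tsdb.retention.time=365d

# Or with Thanos, let the compactor enforce per-resolution retention
thanos compact \
  --objstore.config-file=bucket.yml \
  --retention.resolution-raw=90d \
  --retention.resolution-5m=365d \
  --retention.resolution-1h=2y
```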
Architecture
Two moving parts. The first is a small exporter that calls the Inbox Check API on a schedule, keeps the last known good result per monitor, and exposes it at /metrics in Prometheus text format. The second is the Grafana dashboard that queries Prometheus for those series.
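Concretely, the exporter's /metrics endpoint serves plain Prometheus exposition text. The values below are illustrative; the metric names match the exporter script later in the article:

```
# HELP inbox_check_inbox_rate Inbox rate 0..1 by provider and sender
# TYPE inbox_check_inbox_rate gauge
inbox_check_inbox_rate{sender="example.com",provider="gmail"} 0.92
inbox_check_inbox_rate{sender="example.com",provider="outlook"} 0.78
inbox_check_dnsbl_listings{sender="example.com"} 0
```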
You do not strictly need Prometheus — Grafana has a JSON API datasource plugin that can hit the Inbox Check API directly. That works, but you lose time-series retention, joins with other metrics, and recording rules. For a one-off panel it is fine; for a production dashboard, use the exporter.
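The recording-rule point deserves an example. With the exporter in place you can precompute, say, a per-sender average that panels and alert rules share (the rule name below is only a naming suggestion):

```yaml
groups:
  - name: deliverability-recording
    rules:
      - record: sender:inbox_check_inbox_rate:avg
        expr: avg by (sender) (inbox_check_inbox_rate)
```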
Two options: direct JSON or Prom exporter
Direct JSON (quick and dirty)
Install the marcusolsson-json-datasource Grafana plugin. Point it at https://check.live-direct-marketing.online/api. Add the Authorization: Bearer ic_live_... header. Query paths like /check/{id} in individual panels. You will not get time series, though: every panel shows only the current value.
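If you do go the direct-JSON route, panels extract values from the response with JSONPath expressions. A sketch, assuming the response shape the exporter script below parses (providers[].name, providers[].inboxRate, spamAssassinScore):

```
Path:   /check/{id}
Fields: $.providers[*].name       → provider
        $.providers[*].inboxRate  → inbox rate
        $.spamAssassinScore       → SpamAssassin score
```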
Prometheus exporter (recommended)
A 60-line Node process that does two jobs: (1) runs a setInterval every N minutes to kick off new placement tests, (2) serves Prometheus metrics at :9464/metrics. Prometheus scrapes it on its normal schedule. Everything else is plumbing you already have.
Exporter script
Minimal but production-adjacent:
import http from 'node:http';
import client from 'prom-client';

const KEY = process.env.INBOX_CHECK_API_KEY;
if (!KEY) throw new Error('INBOX_CHECK_API_KEY is required');
const BASE = 'https://check.live-direct-marketing.online';
const SENDERS = (process.env.SENDERS || '')
  .split(',')
  .map((s) => s.trim())
  .filter(Boolean);
const INTERVAL_MIN = Number(process.env.INTERVAL_MIN || 60);

const reg = new client.Registry();
const inboxRate = new client.Gauge({
  name: 'inbox_check_inbox_rate',
  help: 'Inbox rate 0..1 by provider and sender',
  labelNames: ['sender', 'provider'],
  registers: [reg],
});
const dmarcPass = new client.Gauge({
  name: 'inbox_check_dmarc_pass',
  help: '1 if DMARC passes with alignment, else 0',
  labelNames: ['sender'],
  registers: [reg],
});
const dnsblCount = new client.Gauge({
  name: 'inbox_check_dnsbl_listings',
  help: 'Count of DNSBL listings detected',
  labelNames: ['sender'],
  registers: [reg],
});
const spamScore = new client.Gauge({
  name: 'inbox_check_spamassassin_score',
  help: 'SpamAssassin score (higher is worse)',
  labelNames: ['sender'],
  registers: [reg],
});

async function runOne(sender) {
  // Kick off a new placement test for this sender
  const start = await fetch(`${BASE}/api/check`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      senderDomain: sender,
      subject: 'Observability probe',
      html: '<p>metric probe</p>',
    }),
  }).then((r) => r.json());

  // Poll for completion: up to 40 attempts, 5s apart
  let res;
  for (let i = 0; i < 40; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    res = await fetch(`${BASE}/api/check/${start.id}`, {
      headers: { Authorization: `Bearer ${KEY}` },
    }).then((r) => r.json());
    if (res.status === 'complete') break;
  }
  if (!res || res.status !== 'complete') return; // keep last known good values

  for (const p of res.providers) {
    inboxRate.set({ sender, provider: p.name }, p.inboxRate);
  }
  dmarcPass.set({ sender }, res.auth.dmarc === 'pass' ? 1 : 0);
  dnsblCount.set({ sender }, res.dnsbl?.listingsCount ?? 0);
  spamScore.set({ sender }, res.spamAssassinScore ?? 0);
}

const tick = () => {
  for (const s of SENDERS) runOne(s).catch(console.error);
};
tick(); // run once at startup instead of waiting a full interval
setInterval(tick, INTERVAL_MIN * 60_000);

http
  .createServer(async (req, res) => {
    if (req.url !== '/metrics') {
      res.statusCode = 404;
      return res.end();
    }
    res.setHeader('Content-Type', reg.contentType);
    res.end(await reg.metrics());
  })
  .listen(9464, () => console.log('exporter on :9464'));

Scrape config on the Prometheus side is the usual:
scrape_configs:
  - job_name: inbox-check
    scrape_interval: 60s
    static_configs:
      - targets: ['inbox-check-exporter.observability.svc:9464']

The dashboard JSON
A partial but representative excerpt — the full file is in the inbox-check/dashboards repo. Import via Grafana → Dashboards → New → Import, paste JSON, select your Prometheus datasource.
{
  "title": "Email Deliverability",
  "tags": ["deliverability", "email"],
  "schemaVersion": 39,
  "timezone": "",
  "panels": [
    {
      "type": "timeseries",
      "title": "Inbox rate by provider",
      "targets": [
        {
          "expr": "inbox_check_inbox_rate",
          "legendFormat": "{{sender}} · {{provider}}"
        }
      ],
      "fieldConfig": {
        "defaults": {
          "unit": "percentunit",
          "min": 0,
          "max": 1,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "red", "value": null },
              { "color": "orange", "value": 0.6 },
              { "color": "green", "value": 0.8 }
            ]
          }
        }
      }
    },
    {
      "type": "stat",
      "title": "DMARC alignment (now)",
      "targets": [{ "expr": "inbox_check_dmarc_pass" }]
    },
    {
      "type": "bargauge",
      "title": "DNSBL listings",
      "targets": [{ "expr": "inbox_check_dnsbl_listings" }]
    }
  ]
}

Alerting rules
Two alerts cover most real incidents:
# inbox_rate_drop.yml (Grafana Unified Alerting or Prometheus)
groups:
  - name: deliverability
    rules:
      - alert: InboxRateDrop
        expr: avg by (sender) (inbox_check_inbox_rate) < 0.8
        for: 15m
        labels:
          severity: warning
          team: growth
        annotations:
          summary: "Inbox rate below 80% on {{ $labels.sender }}"
      - alert: DNSBLListing
        expr: inbox_check_dnsbl_listings > 0
        for: 0m
        labels:
          severity: page
        annotations:
          summary: "{{ $labels.sender }} appears on {{ $value }} DNSBL(s)"

Panel ideas
- Inbox rate over time, per provider. Your top panel. One line per (sender, provider) pair.
- DMARC alignment heatmap. X-axis time, Y-axis sender, cell colour pass/fail. Spot patterns like "transactional fails alignment on the 15th of every month".
- DNSBL listing count stat. Big red number when it matters.
- SpamAssassin score time-series. Lets you catch slow content drift before it lands you in Spam.
- Per-provider placement last run. Stat panels showing today's number vs yesterday's, with sparklines.
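The "today vs yesterday" comparison in that last panel idea is a single PromQL expression using offset:

```promql
# Change in inbox rate versus the same time yesterday
inbox_check_inbox_rate - inbox_check_inbox_rate offset 1d
```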
Monitor the exporter itself
Expose a counter for API calls, a histogram for API latency, and a gauge for your current quota usage. Deliverability monitoring silently turning off because you ran out of test credits is exactly the sort of failure an observability-first team should not miss.
Integrating with PagerDuty via Grafana
Grafana Unified Alerting has a PagerDuty contact point out of the box. Create one, paste your PagerDuty integration key, route the severity=page alerts to it and severity=warning to Slack. The DNSBL alert earlier is the classic page-worthy event: the moment one of your sending domains shows up on Spamhaus is the moment an on-call engineer wants to know.
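If you provision Grafana from files, the contact point and the severity-based route can live in the alerting provisioning directory. A sketch of the file format, with placeholder uid, name, and integration key (the Slack route for severity=warning is omitted here):

```yaml
# provisioning/alerting/deliverability.yml
apiVersion: 1
contactPoints:
  - orgId: 1
    name: pagerduty-deliverability
    receivers:
      - uid: pd-deliverability
        type: pagerduty
        settings:
          integrationKey: $PAGERDUTY_INTEGRATION_KEY
policies:
  - orgId: 1
    receiver: pagerduty-deliverability
    routes:
      - receiver: pagerduty-deliverability
        object_matchers:
          - ['severity', '=', 'page']
```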