Project Overview
Pulse Check is a productized website audit that runs against any public URL and returns a scored report — a single Pulse Score plus four pillar scores covering performance, modernization, compliance, and ecosystem signals — in under a minute. It is the public lead-magnet on thepulse.net and the scoring engine behind every customer engagement we run.
We built it for ourselves first: a tool that could replace the hour we used to spend manually triaging a prospect's site before a sales call, and could explain in concrete, evidence-backed terms why a site felt slow, fragile, or out of date. The same pipeline now runs as a service for visitors and as the qualification step for inbound leads.
The Challenge
An audit tool sounds simple until you try to ship it as production infrastructure. The requirements pulled in opposite directions:
- Real-browser fidelity — Static fetches lie. Modern sites only resolve correctly under a real browser with JavaScript, layered fonts, and cookies. The audit had to drive Chromium against the live site, not parse raw HTML.
- True dark-mode rescan — Reading `prefers-color-scheme: dark` from CSS isn't enough. Many sites apply theme decisions at boot time in JavaScript, so we had to be willing to relaunch the browser under a dark `BrowserContext` and re-navigate the page.
- Sub-minute end-to-end — From form submit to scored report, the wall-clock budget was 60 seconds. That ruled out cold container starts and disk-bound scratch space on every run.
- Three environments, one worker — Local, staging, and production all had to share a single warm worker without cross-database reads or noisy-neighbor failures. Spinning up dedicated workers per environment was wasteful for a service with bursty, human-driven traffic.
- Storage that mirrors lifetime — A scored report has fields with very different access patterns: small structured rows queried often, multi-megabyte HTML blobs written once and never queried, and Playwright traces only opened during incident response.
- Honest scoring — Anyone can publish a score. Defending it required versioned, deterministic rules with the rubric checked in next to the code.
Technical Solution
Architecture
Pulse Check is intentionally split across two .NET 10 processes that communicate through Azure Service Bus rather than HTTP:
| Component | Process | Responsibility |
|---|---|---|
| Web UI | ThePulseNet.UI | Blazor Server form, live status, public report rendering |
| Audit Worker | ThePulseNet.AuditWorker | Headless Chromium driver, scoring, persistence |
| Transport | Azure Service Bus | Topic-per-tenant routing, retry, dead-letter |
| Persistence | SQL + Blob + Table | Federated stores, each sized to its data lifetime |
Why split processes at all? Playwright is heavy — Chromium, fonts, the browser binaries themselves — and unpredictable. Co-locating it with the user-facing Blazor app would have meant every memory spike or hung navigation could land on a paying user's session. Keeping the worker on the other side of a queue lets us scale, restart, or even fail it without anyone on the website noticing.
Why one shared Container App
We deploy a single audit worker to Azure Container Apps and route work to it from local, staging, and production environments using SQL filters on a single Service Bus topic. Each environment publishes messages tagged with its own Environment property; the worker holds three subscriptions, one per filter, and pulls from whichever has work.
This means there is exactly one warm Chromium pool to keep healthy, one set of secrets to rotate, and one place to read logs — but no cross-environment data leakage, because each subscription is isolated and each message carries its own SQL connection string and storage account back to the right tenant.
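A sketch of that routing with the Azure.Messaging.ServiceBus SDK — the topic, subscription, rule, and variable names here are illustrative, not the production ones:

```csharp
using Azure.Messaging.ServiceBus;
using Azure.Messaging.ServiceBus.Administration;

// One-time setup: one subscription per environment on the shared topic,
// each guarded by a SQL filter on the Environment property.
var admin = new ServiceBusAdministrationClient(connectionString);
await admin.CreateSubscriptionAsync(
    new CreateSubscriptionOptions("audit-requests", "staging"),
    new CreateRuleOptions("EnvFilter", new SqlRuleFilter("Environment = 'staging'")));

// Publish side: every audit request is stamped with its environment,
// so it can only land in that environment's subscription.
await using var client = new ServiceBusClient(connectionString);
var sender = client.CreateSender("audit-requests");

var message = new ServiceBusMessage(auditRequestJson)
{
    ApplicationProperties = { ["Environment"] = "staging" }
};
await sender.SendMessageAsync(message);
```

The SQL filter is evaluated broker-side, so a mis-deployed worker cannot accidentally drain another environment's messages.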
KEDA scales the worker on queue depth: zero replicas when nobody is auditing, scaling out under burst, scaling back down inside a minute.
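In Container Apps terms, that behavior is a custom KEDA scale rule on the Service Bus subscription. A minimal sketch, with illustrative names and thresholds:

```yaml
# Container Apps scale block (illustrative); uses the KEDA azure-servicebus scaler.
scale:
  minReplicas: 0          # fully idle when nobody is auditing
  maxReplicas: 5          # cap the warm Chromium pool under burst
  rules:
    - name: audit-queue-depth
      custom:
        type: azure-servicebus
        metadata:
          topicName: audit-requests
          subscriptionName: production
          messageCount: "5"        # target messages per replica
        auth:
          - secretRef: servicebus-connection
            triggerParameter: connection
```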
The Audit Pipeline
An audit is a fixed nine-stage pipeline. Stages are pure transforms over a context object, each emitting structured signals into a single SignalLog consumed by the scorer:
- Resolve — Normalize the URL, follow redirects, surface the canonical hostname.
- Network probe — TLS, HSTS, HTTP/2, Brotli, certificate chain, DNS metadata.
- Light navigation — Drive Playwright against the page, capture HTML, headers, console errors, and a light-mode screenshot.
- Dark rescan — If the site advertises `prefers-color-scheme: dark`, re-navigate under a fresh dark `BrowserContext` to capture the real dark theme.
- Brand extraction — Parse computed CSS for the primary palette, then ask Gemini to confirm and label the colors against the captured screenshots; both the CSS source and the AI annotation are kept in the report.
- SEO / metadata — Open Graph, Twitter Card, JSON-LD, sitemap, robots.
- Compliance sweep — Privacy and Terms pages, cookie banner shape, security headers (CSP, HSTS, Referrer-Policy).
- Ecosystem — PWA manifest, service worker, analytics surface, third-party network calls.
- Score & persist — Apply the rubric, write the row, upload the artifacts, return the report ID.
Stages can fail independently. A missing service worker is a yellow signal, not a dead audit; an unreachable origin is a hard fail with a useful error.
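The stage contract can be sketched as a small interface — the real type names aren't shown in this write-up, so treat these as hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical shape of the pipeline contract described above.
public interface IAuditStage
{
    string Name { get; }
    bool IsFatal { get; }   // e.g. Resolve: an unreachable origin ends the audit
    Task ExecuteAsync(AuditContext context, CancellationToken ct);
}

public sealed class AuditContext
{
    public required Uri Target { get; init; }
    public List<(string Stage, string Signal)> SignalLog { get; } = new();
}

public static class AuditPipeline
{
    // Stages fail independently: a soft failure becomes a signal and the
    // pipeline continues; only a fatal stage aborts the run.
    public static async Task RunAsync(
        IReadOnlyList<IAuditStage> stages, AuditContext ctx, CancellationToken ct)
    {
        foreach (var stage in stages)
        {
            try { await stage.ExecuteAsync(ctx, ct); }
            catch (Exception ex) when (!stage.IsFatal)
            {
                ctx.SignalLog.Add((stage.Name, $"failed: {ex.Message}"));
            }
        }
    }
}
```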
Pulse Score
The score is deterministic and versioned. Each pillar is a weighted sum of signals, the rubric is checked in beside the code, and every report is stamped with the rubric version it was scored against — so a 2026-rubric report will not silently change when we ship a 2027 rubric.
| Pillar | Weight | Representative Signals |
|---|---|---|
| Performance | 40% | HTTP/2, Brotli/Gzip, TLS handshake time, image format mix |
| Modernization | 30% | Framework fingerprint, build pipeline hints, JS module/legacy ratio |
| Compliance | 15% | HSTS, CSP, privacy/terms presence, cookie banner shape |
| Ecosystem | 15% | PWA manifest, OG/Twitter cards, JSON-LD, analytics |
The Pulse Score itself is the weighted aggregate, presented as a single number plus the four pillars so a reader can immediately see where the deduction came from rather than guessing.
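As a sketch, the aggregation is just a weighted sum over pillar scores, stamped with the rubric version so a later rubric release never mutates old reports (type names hypothetical; weights from the table above):

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed record Rubric(string Version, IReadOnlyDictionary<string, double> Weights);

public static class Scorer
{
    // Pillar scores are 0–100; the Pulse Score is their weighted sum.
    public static (double PulseScore, string RubricVersion) Aggregate(
        Rubric rubric, IReadOnlyDictionary<string, double> pillars) =>
        (rubric.Weights.Sum(w => w.Value * pillars[w.Key]), rubric.Version);
}

// e.g. Weights = { Performance: 0.40, Modernization: 0.30,
//                  Compliance: 0.15, Ecosystem: 0.15 }
```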
Federated Persistence
One report writes to three stores, each chosen for the data's actual lifetime:
| Store | Holds | Why |
|---|---|---|
| SQL Server | Report row, signal log, score breakdown, rubric version | Indexed, queryable, joined against organizations and historic scans |
| Azure Blob Storage | Raw HTML, light/dark screenshots, computed-CSS dumps | Large, write-once, served back behind signed URLs only when a user opens the report |
| Azure Table Storage | Playwright traces, console logs, network HAR slices | Cheap, append-only, opened only during incident triage |
This separation means the day-to-day path — list the latest reports, render the Pulse Score, draw the pillar bars — is a single small SQL round trip. We never page megabytes through SQL Server just to display a score.
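The artifact side of that split can be sketched with Azure.Storage.Blobs — container and path names here are illustrative, and `GenerateSasUri` assumes the client was built with shared-key credentials:

```csharp
using System;
using Azure.Storage.Blobs;
using Azure.Storage.Sas;

// Write-once artifact upload; nothing in the hot SQL path touches this.
var container = new BlobContainerClient(storageConnectionString, "audit-artifacts");
var blob = container.GetBlobClient($"{reportId}/screenshot-light.png");

await blob.UploadAsync(screenshotStream, overwrite: true);

// Served back only when a user opens the report, behind a short-lived
// read-only SAS link rather than a public container.
Uri signed = blob.GenerateSasUri(BlobSasPermissions.Read,
                                 DateTimeOffset.UtcNow.AddMinutes(10));
```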
A representative report renders like this:
thePulse Inc. (https://thepulse.net) — the site advertises `prefers-color-scheme: dark`, so the report notes that we re-navigated under a dark `BrowserContext` to capture what its boot-time theme JavaScript actually renders.
Operational Surface
Pulse Check exposes the same observability we expect from the platforms we build for clients:
- Live audit status — The Blazor UI subscribes to per-audit progress over SignalR, so the user sees the pipeline advance stage by stage rather than staring at a spinner.
- Structured logs end-to-end — Serilog with a correlation ID per audit, threaded from the web request through the Service Bus message into every worker stage. Looking up a single audit's full trace is one filter.
- Replay from queue — Failed audits dead-letter cleanly. Replay is a tooling concern, not a database surgery concern.
- Rubric diffing — Because rubric versions are recorded with each report, we can re-score historical scans against a new rubric without touching the original artifacts.
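The correlation threading can be sketched with Serilog's `LogContext` — this assumes `Enrich.FromLogContext()` is configured on the logger, and the variable names are illustrative:

```csharp
using Serilog;
using Serilog.Context;

// One correlation ID per audit, pushed onto the log context so every
// event inside the scope carries it. The same ID travels as a Service Bus
// message property, so web and worker logs join on a single filter.
using (LogContext.PushProperty("AuditId", auditId))
{
    Log.Information("Starting stage {Stage} for {Target}", stage.Name, ctx.Target);
    await stage.ExecuteAsync(ctx, ct);
    Log.Information("Finished stage {Stage}", stage.Name);
}
```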
Key Outcomes
| Outcome | Detail |
|---|---|
| End-to-end latency | Median audit completes in well under 60 seconds, including a true dark rescan when warranted |
| One warm worker | Local, staging, and production share a single Container App via SQL-filtered Service Bus subscriptions, with no cross-environment data access |
| Scale-to-zero | KEDA collapses the worker to zero replicas when idle and recovers under burst within seconds |
| Federated persistence | SQL, Blob, and Table each carry the data they're best at — no megabyte payloads through SQL, no structured queries against blobs |
| Versioned rubric | Every report is stamped with the rubric version it was scored against, so historical scores remain stable across releases |
| Lead-magnet that pays for itself | Pulse Check both ships as a free public tool and runs as the qualification step on every inbound engagement |
Technology Stack
| Layer | Choice |
|---|---|
| Framework | ASP.NET Core / Blazor Server (.NET 10) |
| Worker host | Azure Container Apps with KEDA queue-depth scaling |
| Browser automation | Playwright for .NET (Chromium) |
| Messaging | Azure Service Bus (single topic, SQL-filtered subscriptions) |
| AI extraction | Google Gemini for brand-color labeling and recommendation drafting |
| Persistence | SQL Server (rows), Azure Blob Storage (artifacts), Azure Table Storage (traces) |
| ORM | Entity Framework Core 10 |
| Auth | OpenID Connect (PulsePass) for authenticated history; anonymous one-shot audits for public visitors |
| Real-time UI | SignalR for live pipeline progress |
| Logging | Serilog with per-audit correlation IDs |
What Made This Project Interesting
True dark rescan, not prefers-color-scheme guessing. The first version of Pulse Check just toggled a CSS media query and snapped a second screenshot. The results were wrong on every site that decides its theme in boot-time JavaScript, which turns out to be most modern marketing sites. Switching to a real second navigation under a dark BrowserContext doubled the audit time on dark-aware sites, but it gave us screenshots we could actually trust — and made the brand-extraction step honest.
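The second navigation is a genuinely dark context in Playwright for .NET, not a media-query toggle on the already-loaded page — a minimal sketch, with `targetUrl` as a placeholder:

```csharp
using Microsoft.Playwright;

using var playwright = await Playwright.CreateAsync();
await using var browser = await playwright.Chromium.LaunchAsync();

// A fresh context that reports dark mode from the very first request,
// so boot-time theme JavaScript makes the same decision a dark-mode user sees.
await using var darkContext = await browser.NewContextAsync(
    new BrowserNewContextOptions { ColorScheme = ColorScheme.Dark });

var page = await darkContext.NewPageAsync();
await page.GotoAsync(targetUrl);
await page.ScreenshotAsync(new PageScreenshotOptions
{
    Path = "dark.png",
    FullPage = true
});
```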
Scoring as code, not policy. The rubric is a versioned object next to the scorer; reports record which version scored them. That means we can ship rubric improvements without invalidating last quarter's reports, and we can re-score historical scans on demand to see what a fairness change would have done. It is the same idea as database migrations, applied to the question of how good a website is.
Federated persistence as a default. The temptation in a small team is to put everything in SQL Server because that's where the auth schema already lives. Pulse Check is the project where we stopped doing that. Splitting the data across SQL, Blob, and Table — each chosen for the access pattern of that data — kept the hot path small, made the cold path cheap, and made it obvious where new data should land the next time a stage gets added.