cloudflare-workers
Cloudflare KV
Cloudflare D1
analytics

Counting views at the edge for a dollar a month without a single cookie

A deep dive into the privacy-first view counter behind my portfolio: a Cloudflare Workers KV hot path, a D1 read model, two cron jobs, and a salted hash that forgets you at midnight.

Azraf Al Monzim
213 views
Listen to this post··:··

TL;DR — My portfolio counts views with a Cloudflare Worker that writes to KV on the hot path, mirrors into D1 every 15 minutes, and snapshots a daily time-series at 00:05 UTC. It stores no cookies, no raw IP addresses, and never blocks a page render. The whole thing costs well under a dollar a month and fits in your head.

There is a specific kind of itch that comes from shipping a personal site: you want to know which posts people actually read, but you do not want to bolt Google Analytics onto a page you spent a weekend making fast, and you definitely don't want a cookie banner apologizing for it. The off-the-shelf privacy-friendly tools (Plausible, Fathom, Umami) solve this beautifully — but I was already running everything on a single Cloudflare Worker, and the counter I needed was small enough that pulling in a hosted service felt heavier than just building it.

So I built it. This post is the full teardown: every design decision, the one bug that quietly ate my data for a day, and the tradeoffs I'd flag before you copy it.

The constraints I started with were non-negotiable:

  1. Never block the response. Analytics is the least important thing happening during a page load. It must run after the user already has their bytes.
  2. Store zero personally identifiable information. No cookies, no localStorage, no raw IPs sitting in a database waiting to become a liability.
  3. Be cheap and boring. Edge-only, no extra services, no queue to babysit, idempotent enough that a double-fired cron can't corrupt the numbers.
  4. Still produce real charts. Counts per post, unique vs. total, and a daily trend line — the stuff that's actually useful.

Those four pull in different directions, and the architecture is mostly a story about resolving the tension between them.

Architecture at a glance

At the component level it's one Worker sitting between the browser and two storage layers, with a cron stitching them together (Figure 1).

System architecture: browser, Cloudflare Worker (classifier, session hash, recordView, scheduled cron router), Workers KV write model, D1 read model, and the dashboard

Figure 1. System architecture. A single Cloudflare Worker classifies the request, derives a salted session id, and calls recordView() on the hot path (solid). Its own scheduled() cron router (pink) flushes KV into D1 and snapshots a daily series. The dashboard reads D1 only.

Zooming in on the request itself, the same pieces line up as a flow (Figure 2).

End-to-end view-counter data flow: browser to Worker to KV, cron-flushed into D1, snapshotted daily, read by the dashboard

Figure 2. End-to-end data flow. The request-time path (solid) writes to KV and returns immediately. Two cron jobs (pink) reconcile KV into D1, which is the only thing the dashboard ever reads.

There are two data stores doing two different jobs, and a cron between them:

  • Workers KV is the write model. Every countable view does a couple of tiny KV operations and nothing else. KV is the live source of truth for "what is the count right now."
  • D1 (Cloudflare's SQLite) is the read model. A cron flushes KV into it every 15 minutes. D1 is where the admin dashboard runs ORDER BY, GROUP BY, and SUM.
  • Two cron triggers glue them together: one drains KV → D1, one snapshots D1 into a daily time-series table.

If you squint, it's CQRS in miniature — a write-optimized store and a read-optimized store, reconciled asynchronously instead of kept in lockstep. Let's walk it from the outside in.

Identifying a visitor without tracking them

The hardest part of cookieless analytics isn't counting — it's deciding what "a visitor" even means when you've forbidden yourself every persistent identifier. The industry-standard answer, the one Plausible and Fathom both use, is a daily-rotating salted hash [2] [3]. I used the same recipe (Figure 3).

Salted daily SHA-256 session id derivation and the 30-minute sliding dedup window

Figure 3. Session id derivation and dedup. Four inputs in, a 128-bit id out. The UTC date is part of the hash, so the id rotates at midnight even though the salt is static. The 30-minute dedup window decides unique vs. repeat.

The derivation is intentionally tiny — it lives in analytics-session.ts and is the whole of the "who" logic:

export async function deriveSessionId(opts: {
  ip: string;
  userAgent: string;
  salt: string;
  now?: Date;
}): Promise<string> {
  const date = utcDateString(opts.now); // YYYY-MM-DD
  const material = `${opts.ip}|${opts.userAgent}|${date}|${opts.salt}`;
  const digest = await crypto.subtle.digest("SHA-256", enc.encode(material));
  const bytes = new Uint8Array(digest);
  let hex = "";
  for (let i = 0; i < bytes.length; i++) {
    hex += bytes[i].toString(16).padStart(2, "0");
  }
  return hex.slice(0, 32); // first 32 hex chars = 128 bits
}

Four properties fall out of those four lines, and each one is load-bearing:

  • It's irreversible. SHA-256 plus a secret salt (ANALYTICS_SESSION_SALT, set via wrangler secret put) means the stored id can't be walked back to an IP. Lose the salt and every historical id becomes un-reconstructable — which is exactly what you want.
  • It rotates at midnight UTC. Because the date string is in the hash material, the same person on the same network produces a different id tomorrow. There's no salt-deletion cron to run; the date does the rotation for free.
  • It never persists the inputs. The IP and User-Agent exist only as transient hash inputs inside a Worker isolate. Nothing about them is written anywhere.
  • If the salt is unset, tracking simply turns off. The pageview tracker checks for the salt first and returns an empty session id when it's missing, so a misconfigured environment degrades to "count nothing" rather than "leak something."

The honest tradeoff — and it's the same one every cookieless tool makes — is that a visitor who comes back five days running counts as five uniques, not one. Daily rotation is what buys the GDPR-grade anonymization, and inflated weekly uniques is the price [2]. For a portfolio, that's a trade I'll take every time.

One small divergence from Plausible worth naming: they rotate and delete the salt every 24 hours; mine is a static secret and relies on the date-in-hash for rotation. The id still rotates daily, but cross-day irreversibility rests on the salt staying secret rather than on it being destroyed. For my threat model (a personal site, no IPs stored anywhere downstream) that's fine; for anything regulated, prefer salt rotation.

Deciding what's even countable

Before any of that runs, a classifier (analytics-context.ts) throws out everything that isn't a real human reading a real page. It skips HEAD requests, anything under /admin, /ops, or /api, static-asset extensions (a regex over .js, .css, .png, fonts, …), and — critically — bots, via the isbot package matched against the User-Agent. Route classification is a plain switch:

function classifyRoute(pathname: string) {
  if (pathname === "/" || pathname === "") return { type: "home", slug: "" };
  const parts = pathname.split("/").filter(Boolean);
  if (parts[0] === "blogs" && parts[1]) return { type: "blog", slug: parts[1] };
  if (parts[0] === "projects" && parts[1])
    return { type: "project", slug: parts[1] };
  return { type: "page", slug: "" };
}

The type is deliberately a free-form string. When I later wanted to count gallery image opens and résumé downloads, no schema migration was needed — they're just new type values.

The hot path: recordView and the bug that ate a day of data

Here's where constraint #1 — never block the response — gets real. The pageview tracker is a TanStack Start server function called from the root route's loader on every navigation. There's a subtlety: on client-side navigations the server-fn request URL is /_serverFn/…, not the page the user is on, so the pathname is passed explicitly and the real URL is synthesized for classification. The headers (IP, UA) still come from the real request.

Once a view is deemed countable, it calls recordView(), which is void and fire-and-forget. The counting work is deferred so the user's response ships first:

waitUntil(
  (async () => {
    const [counts, dedupExisting] = await Promise.all([
      kv.get<ViewCounts>(ck, "json"), // vc:counts:{type}:{slug}
      kv.get(dk), // vc:dedup:{sessionId}:{type}:{slug}
    ]);
 
    const isUnique = !dedupExisting;
    const next = {
      unique: (counts?.unique ?? 0) + (isUnique ? 1 : 0),
      total: (counts?.total ?? 0) + 1,
    };
 
    await Promise.all([
      kv.put(ck, JSON.stringify(next), { metadata }),
      kv.put(dk, "1", { expirationTtl: 1800 }), // 30-min sliding window
    ]);
  })(),
);

Two keys, two reads, two writes. The vc:dedup:* key is the unique/total discriminator: a view is "unique" only if the dedup key was absent when we read it. Every view re-writes that key with a fresh 1800-second TTL, so the window slides — it tracks activity, not a fixed 30-minute clock from first contact.

Now the bug. Notice that waitUntil. The first version of this code wrapped the KV work in a bare async IIFE and didn't hand it to waitUntil. It looked correct and passed every local test. In production, the dashboard stayed stubbornly empty.

The reason is the entire personality of the Workers runtime: the moment a server function returns its response, the isolate can be torn down. Any promise still in flight — like my detached KV writes — is abandoned. Every recordView() was returning instantly and dropping its writes on the floor. The fix is to explicitly tell the runtime "keep the isolate alive until this settles," which is precisely what waitUntil (imported from cloudflare:workers) does [5]. One import, one wrap, and the counters came alive.

It's a great reminder that at the edge, finishing your async work is opt-in. If you've ever seen background work mysteriously not happen in a Worker, this is the first place to look.

The hot path also refuses to ever throw at a user — the whole block is wrapped in try/catch that logs and moves on. Analytics failing should be invisible, never a 500.

Why two storage layers

You might reasonably ask: if KV already holds the counts, why involve D1 at all? Because KV and a SQL database are good at opposite things, and the dashboard needs the things KV is bad at (Figure 4).

KV versus D1: KV is the fast write model, D1 is the queryable read model, reconciled by a cron

Figure 4. Two storage layers. KV is a fast, write-heavy counter store with no query language. D1 is a durable, indexed, queryable mirror that isn't on the hot path. The cron reconciles one into the other.

KV gives me single-digit-millisecond reads at the edge and dirt-cheap writes, which is exactly what the per-view hot path wants. But KV is eventually consistent (a write can take up to ~60 seconds to propagate globally) [1], its list() returns keys without values, and it has no notion of GROUP BY date. You cannot build a 14-day trend chart out of it without doing aggregation yourself, on every request.

D1 is the inverse. It's real SQLite: indexed, durable, and it speaks SQL. The schema is two tables (migration 011_view_counts.sql):

CREATE TABLE view_counts (
  type TEXT NOT NULL,
  slug TEXT NOT NULL,
  unique_views INTEGER NOT NULL DEFAULT 0,
  total_views  INTEGER NOT NULL DEFAULT 0,
  src  TEXT,                 -- optional asset URL (gallery images)
  data TEXT,                 -- optional JSON blob for extensibility
  last_synced_at TEXT NOT NULL,
  PRIMARY KEY (type, slug)
);
 
CREATE TABLE view_counts_daily (
  type TEXT NOT NULL,
  slug TEXT NOT NULL,
  date TEXT NOT NULL,        -- YYYY-MM-DD (UTC)
  unique_views_snapshot INTEGER NOT NULL DEFAULT 0,
  total_views_snapshot  INTEGER NOT NULL DEFAULT 0,
  PRIMARY KEY (type, slug, date)
);

view_counts is the current cumulative count per asset; view_counts_daily is an end-of-day snapshot that makes time-series possible. Neither is ever written by a user request — they're both downstream of the cron.

Two crons, two jobs

The reconciliation lives entirely in scheduled handlers. The Worker's scheduled() entry point switches on the raw crontab string and dispatches the matching job, each wrapped in ctx.waitUntil so the runtime keeps the isolate alive until the writes finish. Failures are logged but never rethrown — a flaky run shouldn't poison the next one.

The two jobs and their cadence are shown in Figure 5.

Two crons: a 15-minute KV-to-D1 flush and a nightly 00:05 UTC snapshot

Figure 5. Cron lifecycle. Every 15 minutes the flush drains KV into D1. Once a day at 00:05 UTC the snapshot freezes the cumulative totals into a dated row. Both are idempotent.

The 15-minute flush

flushViewCounters() paginates through every vc:counts:* key, reads each value, and upserts it into view_counts in a single batched round trip:

INSERT INTO view_counts (type, slug, unique_views, total_views, src, data, last_synced_at)
VALUES (?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(type, slug) DO UPDATE SET
  unique_views = excluded.unique_views,
  total_views  = excluded.total_views,
  src  = COALESCE(excluded.src,  view_counts.src),
  data = COALESCE(excluded.data, view_counts.data),
  last_synced_at = excluded.last_synced_at;

The key property is idempotency: KV holds the cumulative value, and D1 is overwritten to match it rather than incremented. If the cron fires twice in a row with no new views in between, the second pass writes the identical numbers. There's no addition happening in D1, so there's nothing to double-count. (KV's list() not returning values means one extra read per key — but at a few hundred keys per flush, that's noise.)

The nightly snapshot — and a SQLite gotcha

At 00:05 UTC, snapshotDailyViewCounts() copies the current cumulative totals into view_counts_daily, stamped with yesterday's date:

INSERT INTO view_counts_daily (type, slug, date, unique_views_snapshot, total_views_snapshot)
SELECT type, slug, ?, unique_views, total_views FROM view_counts WHERE true
ON CONFLICT(type, slug, date) DO UPDATE SET
  unique_views_snapshot = excluded.unique_views_snapshot,
  total_views_snapshot  = excluded.total_views_snapshot;

That WHERE true is not decoration. Without it, SQLite's parser reads the ON of ON CONFLICT as the start of a join clause on the preceding SELECT and rejects the whole statement with near "DO": syntax error. The dummy WHERE disambiguates INSERT … SELECT … ON CONFLICT — a documented parsing ambiguity in SQLite's upsert grammar [4]. It cost me a confusing ten minutes the first time; now it's a one-line comment in the codebase pointing at the SQLite docs.

Because the primary key is (type, slug, date), re-running the snapshot for the same day just overwrites with the same numbers — idempotent again, and safe to backfill.

The daily delta — "how many views did this post get yesterday?" — isn't stored at all. It's computed in the read layer as today's snapshot − yesterday's snapshot, which means the snapshot table only ever holds monotonic cumulative values and the deltas can never drift out of sync with the totals.

Client events and anti-forgery

Pageviews ride the server-fn path, but some things only happen in the browser — opening a gallery image, copying or downloading the résumé. Those fire a small beacon to /api/track-event carrying the session id (which the page exposes in a <meta name="x-mz-sid"> tag).

A beacon with a client-supplied session id is, of course, forgeable. The endpoint defends against that by re-deriving the session id server-side from the request's own IP + UA + date + salt, and comparing:

const expected = await deriveSessionId({ ip, userAgent: ua, salt });
if (expected !== body.s) return new Response(null, { status: 204 }); // silently drop

Since an attacker doesn't know the salt, they can't forge an id that matches their own IP and UA. A mismatch is dropped silently with a 204. The same recordView() machinery then records the event under a descriptive type (gallery_open, gallery_impression, resume_copy, resume_download) — those free-form types paying off exactly as designed.

Reading it back

The admin dashboard at /ops/views reads only from D1, never KV. Two queries do most of the work: a listViewCounts() that sorts the current totals (by total, unique, or recency), and a daily aggregate that powers the trend chart:

SELECT date,
       CAST(SUM(total_views_snapshot)  AS INTEGER) AS total_views,
       CAST(SUM(unique_views_snapshot) AS INTEGER) AS unique_views
FROM view_counts_daily
WHERE date >= ?
GROUP BY date
ORDER BY date ASC;

From there it's recharts: a 14-day pulse chart (daily deltas), a donut of views by content type, a per-asset 30-day trend dialog, and the usual KPI tiles. There's also a manual "Flush KV → D1" button for when I deploy and don't want to wait up to 15 minutes for the first cron — handy, and safe precisely because the flush is idempotent.

Tradeoffs, and what I'd change

No design is free. The ones I'd want a reader to walk away knowing:

  • Eventual consistency means counts can briefly disagree across regions. For a portfolio, nobody cares. For anything where the number is contractual (billing, rate limits), KV is the wrong primitive.
  • The flush reads every counter key every 15 minutes. At a few hundred assets it's trivial. At tens of thousands it'd want a dirty-set (only flush keys touched since last run) instead of a full scan. I'd reach for that the day it matters, not before.
  • No raw event log. I store aggregates, not events, so I can never retroactively ask a new question of old traffic ("what referrers did this post get last March?"). That's a deliberate privacy choice — you can't leak what you never recorded — but it is a real analytical ceiling.
  • Static salt vs. rotated salt. As noted, a regulated context should rotate-and-delete the salt daily rather than lean on date-in-hash with a fixed secret.

Takeaways

The thing I like most about this system is how little of it there is. A hash function, two KV keys, two SQL tables, two cron jobs. No queue, no external analytics vendor, no cookie, no PII. It rides entirely on primitives the Worker already had, and the parts compose:

  • Split the write model from the read model when they want opposite things from storage. KV for the fast write, D1 for the rich read, a cron to reconcile — and make the reconciliation idempotent so retries and double-fires are non-events.
  • Defer the unimportant work with waitUntil, and remember that at the edge, finishing your async work is opt-in. The most pernicious bug here was silent precisely because the runtime did exactly what it promised: it tore the isolate down the instant the response was sent.
  • You can identify sessions without surveilling people. A salted, date-rotated hash is enough to separate "two reads by one person" from "two people," and it forgets everyone at midnight.

If you're on Cloudflare and you've been putting off analytics because the privacy story felt like a project, it doesn't have to be. It can be a hash and a cron.

Tags:
cloudflare-workers
Cloudflare KV
Cloudflare D1
analytics
edge
Privacy
Azraf Al Monzim

Written by Azraf Al Monzim

Platform Engineer passionate about building scalable systems and sharing knowledge through writing.