Deep Dives17 min read

How SonicJS Uses Cloudflare D1 at the Edge

A deep dive into the SonicJS D1 database layer โ€” how the Document Model, prepared statements, KV caching, and read replicas combine into an edge-native CMS data layer. (v2 history included.)

SonicJS Team

Isometric visualization of a distributed edge database with floating SQLite shards, glowing query streams, and a central compute layer on a dark slate background

How SonicJS Uses Cloudflare D1 at the Edge

Note: This post was originally written for v2 of SonicJS. In v3, content no longer lives in per-feature tables managed by Drizzle ORM. Instead, all content is stored in the Document Model โ€” 5 tables (documents, document_types, document_references, document_facets, document_permissions) accessed via DocumentRepository and DocumentsService using raw SQL prepared statements. Drizzle ORM is retained for auth tables only (Better Auth / 0001_core.sql). The v2 schema descriptions below are kept for historical reference; v3 specifics are called out inline.

TL;DR โ€” SonicJS runs on Cloudflare D1, a serverless SQLite database deployed across Cloudflare's global network. In v3, content is stored in the Document Model (5 tables, raw SQL via DocumentRepository/DocumentsService). Drizzle ORM is used only for auth tables. Migrations are version-controlled SQL files bundled at build time, and hot reads are layered behind a Cloudflare KV cache for sub-10ms global latency.

Key Stats:

  • D1 = SQLite at the edge โ€” full SQL with indexes, joins, transactions, and PRAGMAs
  • 300+ Cloudflare locations route reads to a regional read replica (Sessions API)
  • 10 GB per database, 50K databases per account on the paid plan
  • v3 has exactly 2 migrations: 0001_core.sql (auth tables) and 0002_documents.sql (document repository). Queryable scalar fields added via MigrationService.ensureDocumentGeneratedColumns() โ€” no new migration files needed.
  • 5โ€“15ms cached reads via KV vs 20โ€“60ms cold D1 reads at the edge

If you've used Postgres or MySQL with a traditional CMS, the SonicJS data layer will feel both familiar and refreshingly different. Familiar because it's still SQL, still relational, still backed by typed query helpers you can reason about. Different because it runs inside the same Cloudflare Worker that serves your requests โ€” no connection pool, no separate database tier, no host: db.us-east-1.amazonaws.com to provision.

This deep dive walks through how SonicJS uses Cloudflare D1, the engineering choices behind the schema, and the trade-offs you should know about before you ship to production.

What Cloudflare D1 Actually Is

Cloudflare D1 is a serverless SQLite database that lives on Cloudflare's edge network. Under the hood, every D1 database is a real SQLite file managed by Cloudflare, with one primary region and (on the paid plan) read replicas spread across other regions.

A few specifics that matter for CMS workloads:

  • SQLite-compatible โ€” supports the full SQLite SQL dialect: indexes, joins, foreign keys, JSON1, full-text search via FTS5, and PRAGMA introspection.
  • No connection pooling โ€” Workers connect to D1 via the binding (env.DB), not over TCP. There are no pools to size or exhaust.
  • Global by default โ€” you bind D1 to a Worker, and the Worker accesses it from any data center.
  • Sessions API โ€” Workers can pin a logical "session" to a particular replica to get read-your-writes consistency without paying the latency of round-tripping to the primary on every read.
  • Limits per database โ€” 10 GB max size and 100 KB row size on current plans. Enough for a real CMS, but not the place for blob storage (use R2 for that โ€” see the database overview for the full architecture).

The mental model is: a SQLite file you can query from any of 300+ Cloudflare locations, with a smart routing layer between you and it.

D1's Read Replicas and the Consistency Model

Read replicas are where edge databases get philosophically interesting. By default, every D1 query is routed to the primary region of the database. That gives you strong consistency โ€” every read sees every write โ€” but at the cost of geographic latency. A user in Singapore hitting a database whose primary is in Virginia pays 200+ ms per query.

D1 solves this with the Sessions API:

// Pseudo-code: D1 Sessions API at the edge
const session = env.DB.withSession('first-unconstrained')
const results = await session
  .prepare('SELECT * FROM content WHERE status = ?')
  .bind('published')
  .all()

// The bookmark identifies the latest write the session has observed.
// Subsequent queries on this session won't read older data.
const bookmark = session.getBookmark()

The session carries a bookmark โ€” a logical clock that says "I have observed at least this much of the write history." When the same user makes a follow-up request, you pass the bookmark back to the next session and D1 routes the read to a replica that has caught up at least that far. This gives you causal / read-your-writes consistency without the round-trip to the primary.

For a CMS like SonicJS, the trade-off shakes out cleanly:

  • Reads of published content (most of your traffic) โ†’ served from the nearest replica, eventually consistent within seconds.
  • Author editing flows โ†’ use a session bookmark so the editor always sees their own latest save.
  • Writes โ†’ always go to the primary region.

Most CMS traffic is read-dominant, which is exactly the workload D1 replicas are designed for.

How SonicJS Structures the Schema (v3)

v2 note: The v2 schema was a Drizzle-managed collection of per-feature tables (users, collections, content, media, settings, and plugin-specific tables) defined in packages/core/src/db/schema.ts. That approach is described later in this section for reference. v3 replaces all content tables with the Document Model.

v3: The Document Model (5 tables, raw SQL)

In v3, packages/core/src/db/schema.ts contains only auth tables (managed by Better Auth). All content lives in 5 tables introduced in 0002_documents.sql and accessed exclusively via DocumentRepository and DocumentsService using raw env.DB.prepare(sql).bind(...).run() / env.DB.batch([...]) calls:

TablePurpose
document_typesRegistered schemas (code/plugin-owned).
documentsEvery content record + every historical version. Queryable scalar fields exposed as indexed JSON VIRTUAL generated columns (q_*) via json_extract(data, '$.path').
document_referencesTyped strong/weak edges between documents; powers "where used" and reference-aware delete.
document_facetsIndexed rows for multi-valued scalar fields (e.g. tags) โ€” the one case generated columns can't cover.
document_permissionsPer-document ACL overrides on top of type-level base grants.

Collections are code-only: registered via registerCollections([...]) in the app entry point and held in the in-memory CollectionRegistry singleton. There is no collections table in v3.

To add a queryable scalar field, add a q_* entry in MigrationService.ensureDocumentGeneratedColumns() โ€” no new migration file needed. New migration files are only for structural changes.

v3: Auth tables (Drizzle scope)

Drizzle ORM is retained for Better Auth tables only โ€” auth_user, auth_session, auth_account, etc. โ€” defined in packages/core/src/db/schema.ts and migrated by 0001_core.sql. Drizzle is not used for any content, media, or plugin data in v3.

v2 reference: per-feature tables (historical)

The v2 schema defined users, collections, content, media, settings, and plugin-specific tables in Drizzle. Here is a representative slice preserved for historical context:

// packages/core/src/db/schema.ts โ€” v2 (historical, auth_user replaces this in v3)
import { sqliteTable, text, integer } from 'drizzle-orm/sqlite-core'
import { createInsertSchema, createSelectSchema } from 'drizzle-zod'

// v2 Users โ€” superseded by Better Auth auth_user in v3
export const users = sqliteTable('users', {
  id: text('id').primaryKey(),
  email: text('email').notNull().unique(),
  username: text('username').notNull().unique(),
  firstName: text('first_name').notNull(),
  lastName: text('last_name').notNull(),
  passwordHash: text('password_hash'),
  role: text('role').notNull().default('viewer'),
  isActive: integer('is_active', { mode: 'boolean' }).notNull().default(true),
  lastLoginAt: integer('last_login_at'),
  createdAt: integer('created_at').notNull(),
  updatedAt: integer('updated_at').notNull(),
})

// v2 Collections โ€” code-only via CollectionRegistry in v3; no DB table
export const collections = sqliteTable('collections', { /* ... */ })

// v2 Content โ€” replaced by documents table in v3
export const content = sqliteTable('content', {
  id: text('id').primaryKey(),
  collectionId: text('collection_id').notNull().references(() => collections.id),
  slug: text('slug').notNull(),
  title: text('title').notNull(),
  data: text('data', { mode: 'json' }).notNull(),
  status: text('status').notNull().default('draft'),
  publishedAt: integer('published_at', { mode: 'timestamp' }),
  authorId: text('author_id').notNull().references(() => users.id),
  createdAt: integer('created_at', { mode: 'timestamp' }).notNull().$defaultFn(() => new Date()),
  updatedAt: integer('updated_at', { mode: 'timestamp' }).notNull().$defaultFn(() => new Date()),
})

export type User = typeof users.$inferSelect
export const insertUserSchema = createInsertSchema(users)
export const selectUserSchema = createSelectSchema(users)

The "fixed core, flexible payload" pattern (preserved in v3)

Both v2 and v3 use the same fundamental idea: a fixed set of columns the CMS reasons about, plus a single data column storing collection-specific fields as JSON. In v3 this is documents.data, with queryable scalar fields surfaced as VIRTUAL generated columns (q_*) that D1 can index.

When you define a BlogPost collection with body, excerpt, and coverImage fields, those become keys inside documents.data. MigrationService.ensureDocumentGeneratedColumns() adds a q_body, q_excerpt etc. generated column backed by json_extract so you can filter and sort without a full-table scan.

The plugin settings pattern

Plugins like OAuth, magic link, OTP, and AI search all need to store configuration. Rather than each plugin owning its own table, SonicJS uses a generic settings table keyed by (category, key) with a JSON value:

-- packages/core/migrations/018_settings_table.sql (v2 reference)
CREATE TABLE IF NOT EXISTS settings (
  id TEXT PRIMARY KEY,
  category TEXT NOT NULL,
  key TEXT NOT NULL,
  value TEXT NOT NULL,
  created_at INTEGER NOT NULL,
  updated_at INTEGER NOT NULL,
  UNIQUE(category, key)
);

CREATE INDEX IF NOT EXISTS idx_settings_category ON settings(category);
CREATE INDEX IF NOT EXISTS idx_settings_category_key ON settings(category, key);

A plugin reads its config with WHERE category = 'oauth' AND key = 'github_client_id'. This avoids a migration for every plugin install and keeps the schema stable.

Sessions, audit, and workflow (v2)

Other v2 tables included user_sessions, workflow_history, content_versions, media, api_tokens, plugins / plugin_hooks / plugin_routes, and plugin-specific tables added by migrations. In v3 these roles are covered by the document repository and Better Auth session management.

The full current schema is documented in the database reference.

Querying D1: DocumentRepository (v3) and Drizzle (auth only)

v3: DocumentRepository and DocumentsService

In v3, DocumentRepository.list() is the read chokepoint for all content queries. DocumentsService handles all writes (create, saveDraft, publish, unpublish, erase). Both use raw env.DB.prepare(sql).bind(...) โ€” never Drizzle query-builder objects for content. This is enforced because Drizzle objects cannot be used inside env.DB.batch([...]), which is required for atomic multi-statement document writes.

// v3: reading published content via DocumentRepository
import { DocumentRepository } from '@sonicjs-cms/core'

app.get('/api/posts', async (c) => {
  const repo = new DocumentRepository(c.env.DB, 'default') // tenant = 'default'
  const { rows } = await repo.listPublished('blog-post', {
    limit: 20,
    orderBy: 'updated_at',
  })
  return c.json({ data: rows })
})
// v3: writing a document via DocumentsService (raw SQL batch internally)
import { DocumentsService } from '@sonicjs-cms/core'

const svc = new DocumentsService(c.env.DB, 'default')
await svc.create({
  type: 'blog-post',
  data: { title: 'Hello', body: '...' },
  authorId: user.userId,
})

v3: Drizzle for auth tables only

Drizzle is still used for Better Auth tables (auth_user, auth_session, auth_account). The connection helper is intentionally narrow:

// packages/core/src/db/index.ts โ€” v3 (auth only)
import { drizzle } from 'drizzle-orm/d1'
import * as schema from './schema' // auth tables only

export function createDb(d1: D1Database) {
  return drizzle(d1, { schema })
}

export * from './schema'

Prepared statements and env.DB.batch()

D1's batch() ships a list of prepared statements as a single round-trip and runs them as an implicit transaction. All document writes in v3 go through batch() internally:

// v3 pattern inside DocumentsService (raw SQL, not Drizzle)
const now = Math.floor(Date.now() / 1000) // documents uses seconds, not ms
const batch = [
  env.DB.prepare(
    'INSERT INTO documents (id, tenant_id, root_id, type_id, version_number, data, is_current_draft, created_at, updated_at) VALUES (?,?,?,?,?,?,1,?,?)'
  ).bind(id, 'default', rootId, typeId, 1, JSON.stringify(data), now, now),
  // ... facet inserts, reference inserts
]
await env.DB.batch(batch)

The batch is atomic and runs in a single network hop to the primary โ€” far faster than await in a loop.

v2 reference: Drizzle as content query path (historical)

In v2, Drizzle was the recommended way to query content in SonicJS. A typical route handler looked like:

// v2 (historical) โ€” Drizzle content query
import { eq, and, desc } from 'drizzle-orm'
import { createDb, content, users, collections } from '@sonicjs-cms/core'

app.get('/api/posts', async (c) => {
  const db = createDb(c.env.DB)
  const posts = await db
    .select({ id: content.id, title: content.title, slug: content.slug })
    .from(content)
    .innerJoin(users, eq(content.authorId, users.id))
    .where(and(eq(collections.name, 'posts'), eq(content.status, 'published')))
    .orderBy(desc(content.publishedAt))
    .limit(20)
  return c.json({ data: posts })
})

In v3, replace this pattern with DocumentRepository.listPublished(typeId, opts).

The Migration Workflow

v3: exactly 2 migrations

In v3, packages/core/migrations/ contains exactly two files:

packages/core/migrations/
โ”œโ”€โ”€ 0001_core.sql        โ€” Better Auth tables (auth_user, auth_session, auth_account, โ€ฆ)
โ””โ”€โ”€ 0002_documents.sql   โ€” Document repository (5 tables + q_* generated cols + partial unique indexes)

That's it. There are no per-feature migrations, no plugin-specific migration files. Adding a queryable scalar field to the document model does not require a new migration file โ€” you add a q_* entry to MigrationService.ensureDocumentGeneratedColumns(), which runs an idempotent ALTER TABLE at bootstrap via table_xinfo introspection.

// Adding a queryable field โ€” no new migration file (v3 pattern)
// In packages/core/src/services/migrations.ts:
await ensureDocumentGeneratedColumns(db, [
  { name: 'q_title',  expr: "json_extract(data, '$.title')" },
  { name: 'q_slug',   expr: "json_extract(data, '$.slug')" },
  { name: 'q_status', expr: "json_extract(data, '$.status')" },
  // add new entries here โ€” idempotent ALTER TABLE runs at startup
])

The next free migration number is 0003_*. Use a new file only for structural schema changes (new tables, index drops, column type changes).

The build-time bundler

Workers don't have filesystem access at runtime, so SonicJS bundles every SQL file into TypeScript at build time. The prebuild npm script runs scripts/generate-migrations.ts, which reads packages/core/migrations/*.sql and emits src/db/migrations-bundle.ts with the SQL embedded as string constants. At runtime, MigrationService reads from the bundle, applies pending migrations, and tracks applied state in a migrations table:

// What the runtime sees (v3)
import { bundledMigrations, getMigrationSQLById } from './db/migrations-bundle'

const status = await migrationService.getMigrationStatus()
// { totalMigrations: 2, appliedMigrations: 2, pendingMigrations: 0, ... }

await migrationService.runPendingMigrations()
// No-op on a fresh install after setup:db

After editing any packages/core/migrations/*.sql file, regenerate the bundle: cd packages/core && npm run generate:migrations, then re-sync the my-sonicjs-app/migrations/ copies (byte-identical).

For local development you can also use wrangler d1 migrations apply DB --local to run the same SQL against your local D1 instance directly.

Why not drizzle-kit generate only?

Drizzle Kit is used only for auth table development (v3). The document repository (0002_documents.sql) contains VIRTUAL generated columns and partial / expression UNIQUE indexes that Drizzle Kit cannot express. The shipped artifact for document model changes is always hand-curated SQL. Two reasons:

  1. Determinism. Generated SQL diffs are sensitive to small schema reorderings; reviewing hand-written SQL keeps every column rename and index intentional.
  2. Data migrations. Many SonicJS migrations seed default rows or backfill columns. That's awkward to express in a schema diff but trivial in a hand-written INSERT OR IGNORE.

The result: Drizzle gives you typed auth queries, SQL files give you reviewable history, and the bundler gives you Worker-compatible packaging.

v2 reference: 36 migrations (historical)

The v2 packages/core/migrations/ directory contained 36 numbered files (001_initial_schema.sql through 036_analytics_events.sql), one per schema change. That approach accumulated per-feature tables for every plugin. v3 replaces this with the 2-file document model above.

Performance: Cold Reads, Warm Reads, and KV-Cached Reads

In practice, the SonicJS data layer has three latency tiers.

TierWhat runsTypical latency
KV cache hitEdge KV GET on a precomputed query key5โ€“15 ms globally
D1 replica readDrizzle/D1 query routed to nearest replica20โ€“60 ms (region-dependent)
D1 primary read/writeRound-trip to the primary region40โ€“200 ms (region-dependent)

The third tier is the only one users notice โ€” and it's where the layered cache matters. SonicJS uses a three-tier read path for hot content:

// Conceptual: memory โ†’ KV โ†’ D1
async function getPublishedPost(slug: string) {
  // 1. In-isolate memory cache (instant for the duration of this isolate)
  const memo = memoryCache.get(slug)
  if (memo) return memo

  // 2. Cloudflare KV โ€” globally replicated, eventually consistent
  const cached = await env.CACHE_KV.get(`post:${slug}`, 'json')
  if (cached) {
    memoryCache.set(slug, cached)
    return cached
  }

  // 3. D1 โ€” source of truth (v3: DocumentRepository; v2 used Drizzle .select())
  const repo = new DocumentRepository(env.DB, 'default')
  const { rows } = await repo.listPublished('blog-post', { filter: { q_slug: slug }, limit: 1 })
  const row = rows[0] ?? null

  if (row) {
    memoryCache.set(slug, row)
    await env.CACHE_KV.put(`post:${slug}`, JSON.stringify(row), { expirationTtl: 3600 })
  }
  return row ?? null
}

The KV layer absorbs the long tail of repeated published-content reads, so D1 only handles writes, cache misses, and authenticated admin queries. For more on the cache design (including invalidation), see the caching strategy guide.

Indexes still matter

D1 is SQLite, so indexes are still the difference between a 5 ms query and a 500 ms one. In v3, 0002_documents.sql ships indexes on documents(tenant_id, type_id, is_current_draft), the q_* generated columns, and a partial unique index on (root_id, version_number) WHERE is_current_draft = 1 for concurrent-write safety. Auth tables in 0001_core.sql carry their own indexes on auth_user.email and session lookup columns.

-- v3: representative indexes from 0002_documents.sql
CREATE INDEX idx_documents_tenant_type ON documents(tenant_id, type_id);
CREATE INDEX idx_documents_q_slug ON documents(q_slug) WHERE is_current_draft = 1;
CREATE UNIQUE INDEX idx_documents_unique_version ON documents(root_id, version_number);

In v2, indexes covered the per-feature tables: content(collection_id, status), content(author_id, status, published_at DESC), users(email) WHERE is_active = 1, etc. Those tables are decommissioned in v3.

Trade-offs vs Postgres and MySQL

D1 is a great fit for the CMS workload SonicJS targets, but it isn't a drop-in replacement for every Postgres deployment. The honest comparison:

Where D1 wins:

  • Zero ops. No instances to size, no failover to script, no connection pool to tune.
  • Edge latency. Replicas in regions where Postgres would require setting up read replicas yourself.
  • Cost. Generous free tier (5 GB, 25M reads/day) and pay-per-request scaling that maps directly to traffic.
  • Tight coupling with Workers. The binding is faster and simpler than any TCP driver.

Where Postgres still wins:

  • Database size. Postgres scales to terabytes; D1 caps at 10 GB per database (use multiple databases or push blob data to R2).
  • Concurrency on the primary. SQLite serializes writes โ€” fine for CMS workloads, problematic for write-heavy SaaS analytics. (See our REST API guide for how SonicJS partitions hot write paths.)
  • Advanced types. Postgres has jsonb, arrays, enums, RANGE, and PostGIS. D1 has JSON1 and that's it.
  • Stored procedures and PL/pgSQL. D1 doesn't run server-side procedural code.

For a content-driven workload where 95% of traffic is reads of a known set of slugs, D1 + KV is hard to beat. For a write-heavy OLTP system, you'd reach for Postgres on a managed platform and use SonicJS with Hyperdrive โ€” also supported, but a different architectural conversation.

Internal Links

  • Database reference โ€” full schema documentation, every table, every index
  • Caching strategy โ€” three-tier memory โ†’ KV โ†’ D1 in detail
  • Collections โ€” how collection schemas turn into document types in v3
  • REST API โ€” endpoint reference for everything described here

Key Takeaways

  • D1 is SQLite at the edge โ€” ACID, indexes, joins, no server to manage. In v3, SonicJS uses it as the system of record via the 5-table Document Model.
  • In v3, Drizzle ORM is for auth tables only (0001_core.sql). All content reads go through DocumentRepository.list() / listPublished(); all writes go through DocumentsService โ€” both use raw prepared SQL and env.DB.batch().
  • Schemas use a "fixed core, flexible payload" pattern โ€” documents has stable structural columns plus a data JSON blob. Queryable scalar fields are surfaced as VIRTUAL generated columns (q_*) added via MigrationService.ensureDocumentGeneratedColumns() โ€” no new migration files.
  • v3 has exactly 2 migrations: 0001_core.sql (auth) and 0002_documents.sql (document repository). The v2 approach of 36 per-feature migrations is replaced by the self-healing generated-column pattern.
  • Hot reads are served from KV, cold reads hit a regional D1 replica, writes go to the primary. Use D1's Sessions API for read-your-writes consistency where needed.
  • D1 trades some Postgres flexibility for zero-ops and edge latency โ€” the right call for content-driven workloads, an honest conversation for write-heavy ones.

The data layer is one of the most opinionated parts of SonicJS, and almost every choice is downstream of one decision: the database has to live where the request handler lives. Once you accept that, D1 + the Document Model + KV is a remarkably small amount of moving parts for what it gives you.

Have questions about extending the schema, adding a custom collection, or tuning queries for your workload? Join us on Discord or open an issue on GitHub.

Happy querying!

#database#cloudflare-d1#drizzle-orm#edge-computing#sqlite#performance

Share this article

Related Articles