Learning Bytes


Technical Deep Dive: How LearningBytes Was Built in 1 day

A complete technical guide to building LearningBytes with Astro, Notion as CMS, local image rewriting, Pagefind search, and Cloudflare Pages deployment.

If you want to build a fast, content-driven engineering blog without operating a traditional backend, this architecture is a strong pattern: Notion as the authoring CMS, Astro as the static site generator, and Cloudflare Pages for globally distributed hosting.

In this post, I will break down exactly how learningbytes.sheraj.org is built end-to-end, including the tricky parts like media URL expiry, schema mapping, and keeping the build pipeline reproducible in CI.

1) Overview

learningbytes.sheraj.org is a static content platform designed for technical write-ups (Bytes) and source material references. The core goals were fast page loads, easy publishing, builds driven by local Markdown files, and low operational overhead.

  • Authoring should feel non-technical (Notion editor)
  • Delivery should be web-performance-first (Astro static output)
  • Search should be local/static, with no hosted search backend
  • Deployments should be automated and cheap (Cloudflare Pages)

2) Tech Stack

The stack is intentionally lean and composable:

  • Astro: static site generation, route/content orchestration
  • React Islands: only hydrate interactive UI components where needed
  • Tailwind CSS v4: utility-first styling with streamlined config
  • MDX: allows prose + component-level enrichment when needed
  • Pagefind: static full-text search generated at build time

This combination gives excellent default performance because most pages ship as static HTML/CSS with minimal JavaScript.

Astro + React islands pattern

---
import Layout from "../layouts/BaseLayout.astro";
import SearchDialog from "../components/SearchDialog.tsx";
const { post } = Astro.props;
---

<Layout title={post.title}>
  <article class="prose prose-invert max-w-3xl mx-auto">
    <h1>{post.title}</h1>
    <Fragment set:html={post.html} />
  </article>

  <!-- client:idle hydrates once the browser is idle, after the initial page load -->
  <SearchDialog client:idle />
</Layout>

3) Content Pipeline (Notion as CMS)

The CMS layer uses two Notion databases:

  1. Bytes database: canonical posts (title, slug, tags, status, format, dates, body)
  2. Sources database: references, links, and source material that can relate to one or more Bytes

During build prep, a fetch script queries Notion, converts page blocks to Markdown, normalizes metadata, and writes local files under src/content (or equivalent). Astro then builds from local artifacts rather than live Notion responses.

This separation keeps builds deterministic and avoids runtime dependence on Notion API availability.
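As a rough sketch of the "writes local files" step, the fetch script could serialize each normalized Byte into a Markdown file with YAML frontmatter. The `ByteMeta` shape, its field names, and the `toContentFile` helper are illustrative assumptions, not the site's actual code.

```typescript
// Hypothetical shape for normalized metadata; the field names are assumptions.
interface ByteMeta {
  title: string;
  slug: string;
  tags: string[];
  publishedAt: string; // ISO date string
}

// Build the text of a content file: YAML frontmatter followed by the body.
// Values go through JSON.stringify because JSON strings and arrays are also
// valid YAML flow syntax, which keeps quotes and colons in titles safe.
export function toContentFile(meta: ByteMeta, body: string): string {
  const frontmatter = [
    "---",
    `title: ${JSON.stringify(meta.title)}`,
    `slug: ${JSON.stringify(meta.slug)}`,
    `tags: ${JSON.stringify(meta.tags)}`,
    `publishedAt: ${JSON.stringify(meta.publishedAt)}`,
    "---",
  ].join("\n");
  return `${frontmatter}\n\n${body.trim()}\n`;
}
```

The returned string can then be written to a path such as `src/content/<slug>.md` with `fs.writeFile`; the exact directory layout depends on how the content collections are configured.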

4) Notion Integration (@notionhq/client + notion-to-md)

The integration layer is a small Node script that uses @notionhq/client for API calls and notion-to-md for block conversion.

import { Client } from "@notionhq/client";
import { NotionToMarkdown } from "notion-to-md";

const notion = new Client({ auth: process.env.NOTION_API_KEY });
const n2m = new NotionToMarkdown({ notionClient: notion });

const BYTES_DB_ID = process.env.NOTION_BYTES_DATABASE_ID!;
const SOURCES_DB_ID = process.env.NOTION_SOURCES_DATABASE_ID!;

Recommended environment variables:

NOTION_API_KEY=ntn_xxx
NOTION_BYTES_DATABASE_ID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
NOTION_SOURCES_DATABASE_ID=yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
SITE_URL=https://learningbytes.sheraj.org

Fetch only published content so drafts never reach the site:

const bytes = await notion.databases.query({
  database_id: BYTES_DB_ID,
  filter: {
    and: [
      { property: "Status", select: { equals: "Published" } },
      { property: "Format", select: { equals: "Post" } }
    ]
  },
  sorts: [{ property: "Published date", direction: "descending" }]
});
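Notion list endpoints return results in pages (up to 100 items per request) and signal further pages through `has_more` and `next_cursor`. A single query like the one above therefore only sees the first page. A minimal sketch of a pagination loop follows; `collectAll` is a hypothetical helper, written against a generic page shape so it does not depend on a particular SDK version.

```typescript
// Minimal shape of a paginated Notion list response.
interface Page<T> {
  results: T[];
  has_more: boolean;
  next_cursor: string | null;
}

// Hypothetical helper: keep requesting pages until has_more is false.
export async function collectAll<T>(
  fetchPage: (cursor?: string) => Promise<Page<T>>
): Promise<T[]> {
  const items: T[] = [];
  let cursor: string | undefined;
  do {
    const page = await fetchPage(cursor);
    items.push(...page.results);
    // Continue only when the API reports more data and supplies a cursor.
    cursor = page.has_more && page.next_cursor ? page.next_cursor : undefined;
  } while (cursor);
  return items;
}

// Possible usage with the query above (start_cursor is the API's paging parameter):
//   const pages = await collectAll((cursor) =>
//     notion.databases.query({ database_id: BYTES_DB_ID, start_cursor: cursor })
//   );
```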

For each result page, convert content blocks into markdown:

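const mdBlocks = await n2m.pageToMarkdown(pageId);
// In notion-to-md v3, toMarkdownString returns an object; `parent` holds the page body.
// (v2 returned a plain string, so drop `.parent` there.)
const mdString = n2m.toMarkdownString(mdBlocks).parent;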
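Metadata normalization can be sketched the same way. The example below reads a few properties from a Notion page's `properties` object and fails loudly when a name or type has changed. The property names "Title", "Slug", and "Tags" and their types are assumptions based on the field list in section 3 ("Published date" appears in the query above); the hand-written `NotionProp` type covers only the shapes used here.

```typescript
// Narrow, hand-written types for the Notion property shapes used in this sketch.
type NotionProp =
  | { type: "title"; title: { plain_text: string }[] }
  | { type: "rich_text"; rich_text: { plain_text: string }[] }
  | { type: "multi_select"; multi_select: { name: string }[] }
  | { type: "date"; date: { start: string } | null };

// Read one property and throw if its name or type does not match.
function expectProp<K extends NotionProp["type"]>(
  props: Record<string, NotionProp>,
  name: string,
  type: K
): Extract<NotionProp, { type: K }> {
  const prop = props[name];
  if (!prop || prop.type !== type) {
    throw new Error(`Expected "${name}" to be a ${type} property`);
  }
  return prop as Extract<NotionProp, { type: K }>;
}

// Hypothetical mapper from Notion properties to flat metadata.
export function mapByte(props: Record<string, NotionProp>) {
  // Rich text arrives as segments; join them into one plain string.
  const text = (parts: { plain_text: string }[]) =>
    parts.map((p) => p.plain_text).join("");
  return {
    title: text(expectProp(props, "Title", "title").title),
    slug: text(expectProp(props, "Slug", "rich_text").rich_text),
    tags: expectProp(props, "Tags", "multi_select").multi_select.map((t) => t.name),
    publishedAt: expectProp(props, "Published date", "date").date?.start ?? null,
  };
}
```

Throwing on a mismatch turns a renamed or retyped Notion column into a failed build rather than a silently empty field.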

5) Image Handling (Critical)

One of the most important implementation details: Notion-hosted file URLs are signed and expire quickly (often around 1 hour). If your static site references those URLs directly, images will break after expiry.

Solution: during fetch-notion, download every required asset and rewrite references to local paths.

  1. Read cover image + inline image URLs from page properties/blocks
  2. Download each image into public/images/bytes/<slug>/…
  3. Replace original Notion URLs in markdown/frontmatter with local /images/… paths
  4. Commit or generate these assets in CI so production output is self-contained

import fs from "node:fs/promises";
import path from "node:path";

async function localizeImage(url: string, slug: string, filename: string) {
  const outDir = path.join("public", "images", "bytes", slug);
  await fs.mkdir(outDir, { recursive: true });

  const outPath = path.join(outDir, filename);
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Failed: ${url}`);

  const buf = Buffer.from(await res.arrayBuffer());
  await fs.writeFile(outPath, buf);

  return `/images/bytes/${slug}/${filename}`;
}

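This makes builds resilient: images are served from the site's own origin, so their URLs never expire and they can be cached at the CDN edge like any other static asset.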
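The rewrite step (step 3 in the list above) could look roughly like the sketch below. It finds Markdown image references, derives a stable filename from the URL without its query string, and swaps in the local path. `rewriteImageUrls`, `stableFilename`, and the hashing scheme are assumptions for illustration; the regular expression is deliberately simple and does not handle image titles or URLs containing a closing parenthesis.

```typescript
import { createHash } from "node:crypto";
import { extname } from "node:path";

// Matches Markdown images of the form ![alt](http(s)://url).
const IMAGE_PATTERN = /!\[([^\]]*)\]\((https?:\/\/[^)\s]+)\)/g;

// Derive a filename from origin + path only. The signed query string changes
// on every fetch, so hashing it would produce a new filename each build.
export function stableFilename(url: string): string {
  const { origin, pathname } = new URL(url);
  const hash = createHash("sha256").update(origin + pathname).digest("hex").slice(0, 12);
  return `${hash}${extname(pathname)}`;
}

// Replace remote image URLs with local paths and report what must be downloaded.
export function rewriteImageUrls(markdown: string, slug: string) {
  const downloads: { url: string; filename: string }[] = [];
  const rewritten = markdown.replace(IMAGE_PATTERN, (_match, alt: string, url: string) => {
    const filename = stableFilename(url);
    downloads.push({ url, filename });
    return `![${alt}](/images/bytes/${slug}/${filename})`;
  });
  return { markdown: rewritten, downloads };
}
```

Each entry in `downloads` can then be passed to `localizeImage(url, slug, filename)` from the snippet above, so the file on disk and the path in the Markdown always agree.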

6) Build Pipeline (3 stages)

The production build is intentionally explicit and staged:

  1. fetch-notion: sync content + assets from Notion into local content files
  2. astro build: generate static HTML/CSS/JS
  3. pagefind: index generated output for client-side static search

{
  "scripts": {
    "fetch-notion": "tsx scripts/fetch-notion.ts",
    "build:site": "astro build",
    "build:search": "pagefind --site dist",
    "build": "pnpm fetch-notion && pnpm build:site && pnpm build:search"
  }
}

This sequence guarantees that search indexes the final built HTML, not raw source.

7) Hosting & Deployment

Hosting is done on Cloudflare Pages with CI/CD from Git. Typical flow:

  • Push to main branch
  • CI installs dependencies and runs pnpm build
  • dist/ output is deployed to Cloudflare Pages
  • Custom domain learningbytes.sheraj.org points to the Pages project

Cloudflare gives global edge caching, TLS, and simple rollback/version history out of the box.

Example Cloudflare build settings

  • Build command: pnpm build
  • Build output directory: dist
  • Environment variables: NOTION_API_KEY, NOTION_BYTES_DATABASE_ID, NOTION_SOURCES_DATABASE_ID
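Because a missing variable in the Pages dashboard otherwise surfaces as an opaque API error mid-build, it can help to check the environment before any Notion call. The sketch below is one way to do that; the variable names come from the build settings above, while `missingEnv` is a hypothetical helper.

```typescript
// Variables the fetch script needs before it can talk to Notion.
const REQUIRED_ENV = [
  "NOTION_API_KEY",
  "NOTION_BYTES_DATABASE_ID",
  "NOTION_SOURCES_DATABASE_ID",
] as const;

// Return the names of required variables that are unset or blank.
export function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]?.trim());
}

// At the top of scripts/fetch-notion.ts, one might write:
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) {
//     throw new Error(`Missing environment variables: ${missing.join(", ")}`);
//   }
```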

8) Search with Pagefind

Pagefind is ideal for static sites because it creates a compact search index at build time and runs entirely in the browser.

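import { useEffect, useState } from "react";

export function useSearch(query: string) {
  // Loaded result data (url, excerpt, meta) for each hit.
  const [results, setResults] = useState<any[]>([]);

  useEffect(() => {
    if (!query) {
      setResults([]); // clear stale results when the query is emptied
      return;
    }
    let cancelled = false;
    (async () => {
      // Assumes a script has exposed the module from /pagefind/pagefind.js (in dist) as window.pagefind
      // @ts-ignore window.pagefind is not declared in the DOM typings
      const { search } = await window.pagefind;
      const res = await search(query);
      // Pagefind results are lazy handles; data() loads each hit's content.
      // Loading only the first 10 is an arbitrary cap for this example.
      const loaded = await Promise.all(res.results.slice(0, 10).map((r: any) => r.data()));
      if (!cancelled) setResults(loaded);
    })();
    return () => {
      cancelled = true; // ignore responses that arrive after the query changed
    };
  }, [query]);

  return results;
}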

Because indexing happens post-build, content changes become searchable as soon as a deployment goes live; no external indexing service is required.

9) Key Learnings

What was tricky

  • Notion media URL expiry: must localize images during build prep
  • Schema drift: keeping Notion property names/types stable is essential
  • Markdown fidelity: complex Notion blocks may need custom transformers
  • Build determinism: avoid mixing runtime API calls into static rendering
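To illustrate the custom-transformer point, here is a minimal sketch that renders a Notion callout block as an MDX component. The `<Callout>` component is hypothetical, and the block type below lists only the fields this example reads. notion-to-md exposes `setCustomTransformer` for registering functions like this.

```typescript
// Minimal shape of a Notion callout block; only the fields used here.
interface CalloutBlock {
  type: "callout";
  callout: {
    rich_text: { plain_text: string }[];
    icon?: { type: string; emoji?: string } | null;
  };
}

// Render a callout as a hypothetical <Callout> MDX component.
export function calloutToMdx(block: CalloutBlock): string {
  const text = block.callout.rich_text.map((t) => t.plain_text).join("");
  const icon = block.callout.icon?.emoji ?? "";
  // Blank lines around the text let MDX parse the body as Markdown.
  return `<Callout icon="${icon}">\n\n${text}\n\n</Callout>`;
}

// Registration with notion-to-md would look roughly like:
//   n2m.setCustomTransformer("callout", async (block) =>
//     calloutToMdx(block as unknown as CalloutBlock)
//   );
```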

What worked well

  • Notion as authoring UI: fast content iteration without touching code
  • Astro architecture: excellent performance with minimal hydration
  • Pagefind: zero-backend search with great UX for technical content
  • Cloudflare Pages: low-friction global deploys and reliable CI/CD

If you replicate this architecture, start with a strict Notion schema, make image localization non-optional, and keep the build pipeline staged and deterministic. That combination gives you a robust, scalable content system with very low ops overhead.

Conclusion

LearningBytes demonstrates that you can combine a writer-friendly CMS (Notion) with modern static tooling (Astro + Pagefind) to create a fast, maintainable publication stack. With the same pattern, you can ship your own knowledge base, engineering blog, or docs site quickly while retaining full control of code, content, and deployment.

#technical #astro #notion #cloudflare #tutorial
