Technical Deep Dive: How LearningBytes Was Built in 1 day
A complete technical guide to building LearningBytes with Astro, Notion as CMS, local image rewriting, Pagefind search, and Cloudflare Pages deployment.
If you want to build a fast, content-driven engineering blog without operating a traditional backend, this architecture is a strong pattern: Notion as the authoring CMS, Astro as the static site generator, and Cloudflare Pages for globally distributed hosting.
In this post, I will break down exactly how learningbytes.sheraj.org is built end-to-end, including the tricky parts like media URL expiry, schema mapping, and keeping the build pipeline reproducible in CI.
1) Overview
learningbytes.sheraj.org is a static content platform designed for technical write-ups (Bytes) and source material references. The core goals were: fast page loads, easy publishing, markdown-native source control for builds, and low operational overhead.
- Authoring should feel non-technical (Notion editor)
- Delivery should be web-performance-first (Astro static output)
- Search should be local/static, with no hosted search backend
- Deployments should be automated and cheap (Cloudflare Pages)
2) Tech Stack
The stack is intentionally lean and composable:
- Astro: static site generation, route/content orchestration
- React Islands: only hydrate interactive UI components where needed
- Tailwind CSS v4: utility-first styling with streamlined config
- MDX: allows prose + component-level enrichment when needed
- Pagefind: static full-text search generated at build time
This combination gives excellent default performance because most pages ship as static HTML/CSS with minimal JavaScript.
Astro + React islands pattern
---
import Layout from "../layouts/BaseLayout.astro";
import SearchDialog from "../components/SearchDialog.tsx";
const { post } = Astro.props;
---
<Layout title={post.title}>
  <article class="prose prose-invert max-w-3xl mx-auto">
    <h1>{post.title}</h1>
    <Fragment set:html={post.html} />
  </article>
  <!-- client:idle hydrates once the browser is idle, not on first interaction -->
  <SearchDialog client:idle />
</Layout>
3) Content Pipeline (Notion as CMS)
The CMS layer uses two Notion databases:
- Bytes database: canonical posts (title, slug, tags, status, format, dates, body)
- Sources database: references, links, and source material that can relate to one or more Bytes
During build prep, a fetch script queries Notion, converts page blocks to Markdown, normalizes metadata, and writes local files under src/content (or equivalent). Astro then builds from local artifacts rather than live Notion responses.
This separation keeps builds deterministic and avoids runtime dependence on Notion API availability.
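As a rough illustration of the "normalize metadata and write local files" step, the sketch below renders one post into a frontmatter-plus-Markdown string. The `ByteMeta` shape, its field names, and the `toContentFile` helper are assumptions made for this example, not the site's actual schema.

```typescript
// Hypothetical metadata shape; the real field names depend on your Notion schema.
interface ByteMeta {
  title: string;
  slug: string;
  tags: string[];
  publishedAt: string; // ISO date string
}

// Render a deterministic content file: the same input always yields the same bytes.
// JSON.stringify output is valid YAML for strings and flat string arrays,
// so titles containing colons or quotes stay safely escaped.
function toContentFile(meta: ByteMeta, body: string): string {
  const frontmatter = [
    "---",
    `title: ${JSON.stringify(meta.title)}`,
    `slug: ${JSON.stringify(meta.slug)}`,
    `tags: ${JSON.stringify(meta.tags)}`,
    `publishedAt: ${JSON.stringify(meta.publishedAt)}`,
    "---",
  ].join("\n");
  return `${frontmatter}\n\n${body.trim()}\n`;
}
```

The returned string can then be written to a path such as src/content/<slug>.md with fs.writeFile. Keeping the rendering pure makes it easy to diff the output between builds.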
4) Notion Integration (@notionhq/client + notion-to-md)
The integration layer is a small Node script that uses @notionhq/client for API calls and notion-to-md for block conversion.
import { Client } from "@notionhq/client";
import { NotionToMarkdown } from "notion-to-md";
const notion = new Client({ auth: process.env.NOTION_API_KEY });
const n2m = new NotionToMarkdown({ notionClient: notion });
const BYTES_DB_ID = process.env.NOTION_BYTES_DATABASE_ID!;
const SOURCES_DB_ID = process.env.NOTION_SOURCES_DATABASE_ID!;
Recommended environment variables:
NOTION_API_KEY=ntn_xxx
NOTION_BYTES_DATABASE_ID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
NOTION_SOURCES_DATABASE_ID=yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
SITE_URL=https://learningbytes.sheraj.org
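Because the script reads these variables with non-null assertions, a missing value would only surface later as a confusing API error. One option is to fail fast at startup. The `requireEnv` helper below is a hypothetical sketch, not part of the original script.

```typescript
// Hypothetical helper: verify required variables up front and report all missing names at once.
function requireEnv(
  names: readonly string[],
  env: Record<string, string | undefined> = process.env
): Record<string, string> {
  const missing = names.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const values: Record<string, string> = {};
  for (const name of names) {
    values[name] = env[name] as string;
  }
  return values;
}
```

Calling `requireEnv(["NOTION_API_KEY", "NOTION_BYTES_DATABASE_ID", "NOTION_SOURCES_DATABASE_ID"])` at the top of the fetch script would stop a misconfigured CI run before any network request is made.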
Fetch only publishable content to avoid accidental previews:
const bytes = await notion.databases.query({
  database_id: BYTES_DB_ID,
  filter: {
    and: [
      // If "Status" is a Notion status-type property, filter with `status` instead of `select`.
      { property: "Status", select: { equals: "Published" } },
      { property: "Format", select: { equals: "Post" } }
    ]
  },
  sorts: [{ property: "Published date", direction: "descending" }]
});
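A single query call returns at most 100 results, so a larger database needs cursor-based pagination. The sketch below shows the loop in a client-agnostic form. `queryAll` and `QueryPage` are hypothetical names. The `has_more` and `next_cursor` fields mirror the shape of Notion's paginated responses.

```typescript
// Minimal shape of a paginated Notion response.
interface QueryPage<T> {
  results: T[];
  has_more: boolean;
  next_cursor: string | null;
}

// Keep requesting pages until the API reports there are no more.
async function queryAll<T>(
  fetchPage: (cursor?: string) => Promise<QueryPage<T>>
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined;
  do {
    const page = await fetchPage(cursor);
    all.push(...page.results);
    cursor = page.has_more && page.next_cursor ? page.next_cursor : undefined;
  } while (cursor);
  return all;
}

// Usage with the query from this section (not executed here):
// const pages = await queryAll((cursor) =>
//   notion.databases.query({ database_id: BYTES_DB_ID, start_cursor: cursor, filter, sorts })
// );
```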
For each result page, convert content blocks into markdown:
const mdBlocks = await n2m.pageToMarkdown(pageId);
// notion-to-md v2 returns a string here; v3 returns an object with the page body under `parent`.
const mdString = n2m.toMarkdownString(mdBlocks);
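Alongside the body, each page's properties need to be mapped into plain metadata. The following is a minimal sketch of that mapping which fails loudly when the Notion schema changes. The property names "Title", "Slug", and "Tags" and their types are assumptions. Only "Published date" appears in the query above.

```typescript
// Subset of Notion property value shapes used by this sketch.
type NotionProperty =
  | { type: "title"; title: { plain_text: string }[] }
  | { type: "rich_text"; rich_text: { plain_text: string }[] }
  | { type: "multi_select"; multi_select: { name: string }[] }
  | { type: "date"; date: { start: string } | null };

type Properties = Record<string, NotionProperty | undefined>;

// Return the property only if it exists with the expected type; otherwise stop the build.
function expectProp<K extends NotionProperty["type"]>(
  props: Properties,
  name: string,
  type: K
): Extract<NotionProperty, { type: K }> {
  const prop = props[name];
  if (!prop || prop.type !== type) {
    throw new Error(
      `Notion schema drift: expected "${name}" to be a ${type} property, got ${prop?.type ?? "nothing"}`
    );
  }
  return prop as Extract<NotionProperty, { type: K }>;
}

function mapByteProperties(props: Properties) {
  const plain = (parts: { plain_text: string }[]) =>
    parts.map((part) => part.plain_text).join("").trim();

  const title = plain(expectProp(props, "Title", "title").title);
  const slug = plain(expectProp(props, "Slug", "rich_text").rich_text);
  const tags = expectProp(props, "Tags", "multi_select").multi_select.map((tag) => tag.name);
  const publishedAt = expectProp(props, "Published date", "date").date?.start ?? null;

  if (!slug) throw new Error(`Byte "${title}" has an empty slug`);
  return { title, slug, tags, publishedAt };
}
```

Throwing on a mismatch means a renamed or retyped Notion column breaks the build instead of silently publishing posts with missing metadata.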
5) Image Handling (Critical)
One of the most important implementation details: Notion-hosted file URLs are signed and expire quickly (often around 1 hour). If your static site references those URLs directly, images will break after expiry.
Solution: during fetch-notion, download every required asset and rewrite references to local paths.
- Read cover image + inline image URLs from page properties/blocks
- Download each image into public/images/bytes/<slug>/…
- Replace original Notion URLs in markdown/frontmatter with local /images/… paths
- Commit or generate these assets in CI so production output is self-contained
import fs from "node:fs/promises";
import path from "node:path";

// Downloads one remote image into public/images/bytes/<slug>/ and returns its site-relative path.
// Relies on the global fetch available in Node 18+.
async function localizeImage(url: string, slug: string, filename: string) {
  const outDir = path.join("public", "images", "bytes", slug);
  await fs.mkdir(outDir, { recursive: true });
  const outPath = path.join(outDir, filename);
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Failed: ${url}`);
  const buf = Buffer.from(await res.arrayBuffer());
  await fs.writeFile(outPath, buf);
  return `/images/bytes/${slug}/${filename}`;
}
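Because the images ship inside the static output, they are served from the same CDN as the pages, can be cached long-term, and never depend on a signed URL that can expire.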
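The rewrite step can be sketched as a pure function that finds Markdown image links, derives a stable filename for each, and returns the list of downloads still to perform. `rewriteImages` and `stableFilename` are hypothetical helpers. Hashing the URL without its query string is one possible naming scheme, chosen so that a fresh signature on the same file maps to the same local name. The ".png" fallback extension is likewise an assumption.

```typescript
import { createHash } from "node:crypto";
import { extname } from "node:path";

interface ImageJob {
  url: string; // original (signed) URL to download
  filename: string; // stable local filename
  localPath: string; // site-relative path written into the markdown
}

// Hash origin + pathname only: the signed query string changes on every fetch.
function stableFilename(url: string): string {
  const { origin, pathname } = new URL(url);
  const hash = createHash("sha256").update(origin + pathname).digest("hex").slice(0, 12);
  const ext = extname(pathname).toLowerCase() || ".png"; // fallback extension is an assumption
  return `${hash}${ext}`;
}

// Replace every ![alt](http...) link with a local path and collect the downloads to run.
function rewriteImages(markdown: string, slug: string): { markdown: string; jobs: ImageJob[] } {
  const jobs: ImageJob[] = [];
  const rewritten = markdown.replace(
    /!\[([^\]]*)\]\((https?:\/\/[^\s)]+)\)/g,
    (_match: string, alt: string, url: string) => {
      const filename = stableFilename(url);
      const localPath = `/images/bytes/${slug}/${filename}`;
      jobs.push({ url, filename, localPath });
      return `![${alt}](${localPath})`;
    }
  );
  return { markdown: rewritten, jobs };
}
```

Each returned job can then be passed to `localizeImage(job.url, slug, job.filename)`. Cover images stored in page properties would need the same treatment before the frontmatter is written.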
6) Build Pipeline (3 stages)
The production build is intentionally explicit and staged:
- fetch-notion: sync content + assets from Notion into local content files
- astro build: generate static HTML/CSS/JS
- pagefind: index generated output for client-side static search
{
  "scripts": {
    "fetch-notion": "tsx scripts/fetch-notion.ts",
    "build:site": "astro build",
    "build:search": "pagefind --site dist",
    "build": "pnpm fetch-notion && pnpm build:site && pnpm build:search"
  }
}
This sequence guarantees that search indexes the final built HTML, not raw source.
7) Hosting & Deployment
Hosting is done on Cloudflare Pages with CI/CD from Git. Typical flow:
- Push to main branch
- CI installs dependencies and runs pnpm build
- dist/ output is deployed to Cloudflare Pages
- Custom domain learningbytes.sheraj.org points to the Pages project
Cloudflare gives global edge caching, TLS, and simple rollback/version history out of the box.
Example Cloudflare build settings
- Build command: pnpm build
- Build output directory: dist
- Environment variables: NOTION_API_KEY, NOTION_BYTES_DATABASE_ID, NOTION_SOURCES_DATABASE_ID
8) Search with Pagefind
Pagefind is ideal for static sites because it creates a compact search index at build time and runs entirely in the browser.
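import { useEffect, useState } from "react";

export function useSearch(query: string, limit = 10) {
  const [results, setResults] = useState<unknown[]>([]);
  useEffect(() => {
    if (!query) {
      setResults([]); // clear stale results when the query is emptied
      return;
    }
    let cancelled = false;
    (async () => {
      // @ts-ignore assumes window.pagefind was set from /pagefind/pagefind.js in dist
      const { search } = await window.pagefind;
      const res = await search(query);
      // Each hit is a lazy handle; data() loads its url, excerpt, and meta.
      const hits = await Promise.all((res.results || []).slice(0, limit).map((r: any) => r.data()));
      if (!cancelled) setResults(hits); // ignore responses for superseded queries
    })();
    return () => { cancelled = true; };
  }, [query, limit]);
  return results;
}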
Because indexing happens post-build, content changes become searchable immediately after deployment—no external indexing service required.
9) Key Learnings
What was tricky
- Notion media URL expiry: must localize images during build prep
- Schema drift: keeping Notion property names/types stable is essential
- Markdown fidelity: complex Notion blocks may need custom transformers
- Build determinism: avoid mixing runtime API calls into static rendering
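For the Markdown fidelity point, notion-to-md lets you register a custom transformer per block type. The sketch below turns a callout block into a blockquote. Rendering callouts this way is an assumption about the desired output, and `calloutToMarkdown` is a hypothetical helper rather than the site's actual transformer.

```typescript
// Minimal shape of a Notion callout block, limited to the fields this sketch reads.
interface CalloutBlock {
  type: "callout";
  callout: {
    rich_text: { plain_text: string }[];
    icon: { emoji?: string } | null;
  };
}

// Render a callout as a Markdown blockquote, with the emoji icon prefixed to the first line.
function calloutToMarkdown(block: CalloutBlock): string {
  const { rich_text, icon } = block.callout;
  const text = rich_text.map((part) => part.plain_text).join("");
  const emoji = icon?.emoji ? `${icon.emoji} ` : "";
  return text
    .split("\n")
    .map((line, index) => `> ${index === 0 ? emoji : ""}${line}`)
    .join("\n");
}

// Registration with notion-to-md (not executed here):
// n2m.setCustomTransformer("callout", async (block) => calloutToMarkdown(block as CalloutBlock));
```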
What worked well
- Notion as authoring UI: fast content iteration without touching code
- Astro architecture: excellent performance with minimal hydration
- Pagefind: zero-backend search with great UX for technical content
- Cloudflare Pages: low-friction global deploys and reliable CI/CD
If you replicate this architecture, start with a strict Notion schema, make image localization non-optional, and keep the build pipeline staged and deterministic. That combination gives you a robust, scalable content system with very low ops overhead.
Conclusion
LearningBytes demonstrates that you can combine a writer-friendly CMS (Notion) with modern static tooling (Astro + Pagefind) to create a fast, maintainable publication stack. With the same pattern, you can ship your own knowledge base, engineering blog, or docs site quickly—while retaining full control of code, content, and deployment.