Migrate from Paligo + Webpack to Docusaurus 3 #3
Open
matenadasdi wants to merge 3 commits into main
Replaces the custom Paligo CMS + Webpack pipeline with a Docusaurus 3 + MDX site
backed by an XML→Markdown migration script. Preserves every existing live URL
(444/444 coverage), the integrations layer (OneTrust, GTM, Intercom, Search
Widget), and the Bitrise visual identity.
Key components
- migration/convert.py — Paligo XML export → Markdown/MDX. Walks the structure
tree, pulls content from canonical resources via UUID origin lookup,
renders DocBook to Markdown (admonitions, tables, procedures, code blocks,
cross-references). Detects topichead nodes (non-clickable categories) and
linktype="import" subtrees (skipped — content is reused via xi:include).
- migration/build_url_map.py — extracts live page titles from the Paligo HTML
output to drive slug assignment.
- migration/apply_slugs_v2.py — section-aware slug matcher; only applies a
slug when the matched URL lives in the same top-level section, preventing
cross-section misroutes for common titles like "Getting started".
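The section-aware matching described for apply_slugs_v2.py can be pictured with a minimal sketch. This is not the script itself: the function names, the `/<lang>/<section>/...` URL shape, the rule that exactly one candidate must remain, and the sample URLs are all assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def top_level_section(url: str) -> str:
    """Return the first path segment after the language prefix."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Assumed URL shape: /<lang>/<section>/...; shorter paths have no section.
    return parts[1] if len(parts) > 1 else ""


def pick_slug(title: str, doc_section: str, url_map: dict) -> Optional[str]:
    """Return the live URL for `title` only when it sits in `doc_section`."""
    same_section = [
        url for url in url_map.get(title, [])
        if top_level_section(url) == doc_section
    ]
    # Apply a slug only when exactly one candidate survives the section filter.
    return same_section[0] if len(same_section) == 1 else None


# Hypothetical title-to-URL map; both URLs are made up for illustration.
url_map = {
    "Getting started": [
        "/en/bitrise-ci/getting-started.html",
        "/en/bitrise-build-cache/getting-started.html",
    ],
}

print(pick_slug("Getting started", "bitrise-build-cache", url_map))
print(pick_slug("Getting started", "insights", url_map))
```

Returning None instead of the closest match means a page with a common title keeps its generated slug rather than inheriting a URL from another section.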
Docusaurus features wired up
- Tabs / TabItem from XML <para role="tabs"> + <procedure role="tab-content">
patterns, emitted as MDX components.
- Glossary as an HTML <dl> with anchor IDs; inline glossterms become
<GlossTerm baseform="..."> components that show a hover tooltip and click
through to the glossary anchor.
- Reusable content fragments via MDX partials. Every xi:include target gets
one file at src/partials/<readable-slug>.mdx
(e.g. opening-the-workflow-editor.mdx). Section-context references render
the partial as a JSX component; list-context references use a placeholder
line that the Docusaurus markdown.preprocessor expands inline before MDX
parsing, which keeps step numbering continuous while preserving a single
source of truth.
- Topichead components in the Paligo structure render as toggle-only sidebar
categories (link: null) so they expand children without navigating, matching
the live site behavior.
- Sidebar labels driven by migration/live_nav_labels.json extracted from the
Paligo HTML toc, so navigation matches the published site (e.g. "Insights
for the Build Cache" instead of just "Insights").
- Sidebar wiring: every category with an index.md gets link.type=doc; topichead
categories get link: null.
- Image references rewritten to /img/_paligo/uuid-<hex>.<ext> using the Paligo
HTML export as the canonical image store (665 files), since the XML's src/
remap attributes often pointed to legacy filenames that don't exist on disk.
- Portal landing page (src/pages/index.tsx) rebuilt as a React component with
six product cards, Bitrise brand styling, and the search input wired for the
Google Search Widget.
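To illustrate the sidebar wiring above, here is a small sketch of how the category metadata could be produced on the migration side. The helper name, its parameters, the labels, and the doc id are hypothetical; only the two link shapes (a doc link and link: null) come from the description.

```python
import json
from typing import Optional


def category_metadata(label: str, is_topichead: bool,
                      index_doc_id: Optional[str]) -> dict:
    """Build the contents of one _category_.json file (hypothetical helper)."""
    meta = {"label": label}
    if is_topichead:
        # Toggle-only category: expands its children without navigating.
        meta["link"] = None
    elif index_doc_id is not None:
        # Category backed by an index.md: clicking it opens that doc.
        meta["link"] = {"type": "doc", "id": index_doc_id}
    return meta


# Labels and doc id below are made up for illustration.
print(json.dumps(category_metadata("Workflows and Pipelines", True, None), indent=2))
print(json.dumps(category_metadata("Workflows", False, "workflows/index"), indent=2))
```

json.dumps serializes Python's None as null, so the topichead case comes out as "link": null in the file.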
Build pipeline
- docusaurus.config.ts handles markdown.preprocessor for: list-context partial
expansion, JSX-tag escaping for placeholder text like <Git provider> or
<connected-app-id>, and the kebab-case {placeholder} pattern that MDX would
otherwise read as a JSX expression.
- format: 'detect' so .mdx files are parsed as MDX and .md files stay plain.
- Site builds clean (zero broken-link errors) at 444/444 URL coverage.
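The preprocessing steps listed above can be sketched as plain string transforms. The real hook is a function in docusaurus.config.ts; the version below is written in Python only to match the migration scripts. Every regular expression, the Partial_ naming, and the escape forms are assumptions, not the patterns used in this PR, and the sketch does not skip fenced code blocks.

```python
import re

# Assumed placeholder shape: a numbered list item holding only a partial tag.
PLACEHOLDER = re.compile(r"^\s*\d+\.\s*<Partial_(\w+)\s*/>\s*$")
# Assumed prose placeholders: "<Git provider>" (words with spaces) or
# "<connected-app-id>" (lowercase kebab-case); neither carries attributes.
TEXT_TAG = re.compile(r"<([A-Za-z]\w*(?: \w+)+|[a-z]+(?:-[a-z]+)+)>")
# Assumed brace placeholders: "{kebab-case}" with at least one hyphen.
BRACE = re.compile(r"\{([a-z0-9]+(?:-[a-z0-9]+)+)\}")


def expand_list_partials(content: str, partials: dict) -> str:
    """Splice a partial's list items in place of its placeholder line."""
    out = []
    for line in content.splitlines():
        match = PLACEHOLDER.match(line)
        if match and match.group(1) in partials:
            out.extend(partials[match.group(1)].splitlines())
        else:
            out.append(line)
    return "\n".join(out)


def escape_placeholders(content: str) -> str:
    """Escape prose placeholders so MDX does not parse them as JSX."""
    content = TEXT_TAG.sub(lambda m: "&lt;" + m.group(1) + "&gt;", content)
    return BRACE.sub(lambda m: "\\{" + m.group(1) + "\\}", content)


def preprocess(content: str, partials: dict) -> str:
    # Expand first so that text inside partials is escaped as well.
    return escape_placeholders(expand_list_partials(content, partials))


# Made-up consumer file and partial body for illustration.
source = "1. <Partial_open />\n1. Select <Git provider> and enter {app-slug}."
partials = {"open": "1. Open the app.\n1. Click Workflows."}
print(preprocess(source, partials))
```

Because the partial's items become ordinary list lines before MDX parsing, the Markdown renderer numbers them together with the consumer's own steps.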
Removes nothing — the old paligo.js / webpack.config.js / middleware.js stay
in tree until cutover is approved.
Step-by-step instructions covering Node.js install, repo clone, npm install, dev server, production build, and troubleshooting, written for someone with no prior Docusaurus or Node.js knowledge.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single source of truth for anyone editing the docs, humans or LLMs. Ports the Bitrise Style Guide (voice, capitalization, terminology, lists, titles, punctuation) from Confluence into actionable rules with side-by-side examples. Adds the codebase-specific mechanics that aren't on Confluence: frontmatter, .md vs .mdx, partial imports and the build-time list-context expansion, GlossTerm usage, image conventions, redirect rules, and the JSX-escape pitfalls the migration surfaced. Claude Code reads CLAUDE.md automatically every session, so well-written guidance here becomes both a human reference and an LLM context primer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Replaces the custom Paligo CMS + Webpack pipeline with a Docusaurus 3 site backed by an XML → Markdown/MDX migration. Preserves every existing live URL (444/444), ports all integrations (OneTrust, GTM, Intercom, Google Search Widget), and matches the live site's visual identity and navigation.
What's in the box
Migration pipeline (migration/)
- convert.py: Paligo XML export → Markdown/MDX. Walks the <e:structure> tree, resolves linked components by origin UUID, converts DocBook bodies (sections, admonitions, tables, procedures, code blocks, lists, cross-references). Detects topichead nodes (non-clickable categories) and linktype="import" subtrees (skipped; content is reused via xi:include).
- build_url_map.py: extracts live page titles from out/en/*.html to drive slug assignment.
- apply_slugs_v2.py: section-aware slug matcher; only applies a slug when the matched URL lives in the same top-level section, preventing cross-section misroutes for common titles like "Getting started".
Docusaurus features wired up
- Tabs: <para role="tabs"> + <procedure role="tab-content"> patterns become MDX <Tabs> / <TabItem> components.
- Glossary: an HTML <dl> with anchor IDs. Inline glossterm references become <GlossTerm baseform="..."> components that show a hover tooltip and click through to the glossary anchor.
- Reusable content fragments: every xi:include target gets one file at src/partials/<readable-slug>.mdx (e.g. opening-the-workflow-editor.mdx). Section-context references render the partial as a JSX component; list-context references use a placeholder line that the Docusaurus markdown.preprocessor expands inline before MDX parsing, which keeps step numbering continuous while preserving a single source of truth.
- Topichead categories: rendered as _category_.json entries with link: null, so they expand children without navigating.
- Sidebar labels: taken from the live navigation (migration/live_nav_labels.json) so navigation reads identically to the published site, e.g. "Insights for the Build Cache" instead of just "Insights".
- Image references: rewritten to /img/_paligo/uuid-<hex>.<ext> using the Paligo HTML export as the canonical image store (665 files), since the XML's src/remap attrs often pointed to legacy filenames.
- Portal landing page (src/pages/index.tsx): rebuilt as a React component with the six product cards, Bitrise brand styling, and search input wired for the Google Search Widget.
Integrations
- headTags
- customFields.gtmId (env var)
- customFields.genSearchWidgetConfigId (env var)
- customFields.intercomAppId
Why this approach
- Each reusable fragment lives in src/partials/ once. Editing the partial cascades to every consumer on next build, so there is no more hunting through dozens of duplicated copies.
- An MDX <Component /> inside a numbered list renders as a block and breaks step numbering. The on-disk consumer file has 1. <Partial_X />; the build splices in the partial's actual list items before MDX parses.
Reviewer notes
- The old pipeline (paligo.js, webpack.config.js, middleware.js, build.js, src/html/, src/js/, etc.) is left in tree on purpose. Cutover happens in a separate PR after this one is verified end-to-end.
- migration/docs/ is in .gitignore: it's the Python script's intermediate output, copied into docs/ by the bash pipeline.
- static/img/_paligo/ holds the 665 Paligo-exported images. Largest part of the diff in raw line count.
- src/partials/ has 592 generated MDX partial files. Each has a readable filename derived from the resource title (e.g. opening-the-workflow-editor.mdx).
- The markdown.preprocessor in docusaurus.config.ts does several things in one pass: list-context partial expansion, JSX-tag escaping for placeholder text like <Git provider> or <connected-app-id>, and {kebab-case} placeholder escaping. Each step has an inline comment explaining the trigger pattern.
Test plan
- npm install && npx docusaurus build: should complete with zero broken-link errors.
- Landing page (/): six product cards render, search input is present, OneTrust banner shows in production builds.
- Page with a list-context partial (/en/bitrise-ci/workflows-and-pipelines/workflows/creating-a-workflow.html): 7 steps render in continuous numbering, first 2 from "Opening the Workflow Editor".
- Edit src/partials/opening-the-workflow-editor.mdx, rebuild, confirm every consumer reflects the change.
🤖 Generated with Claude Code
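A coverage figure such as 444/444 could be checked with a set comparison along these lines. This is only a sketch: the function name and both input lists are hypothetical, and how built pages map back to live URLs is assumed to be handled before the comparison.

```python
def url_coverage(live_urls, built_urls):
    """Return (covered, total, missing) for live URLs against built URLs."""
    live = set(live_urls)
    missing = sorted(live - set(built_urls))
    return len(live) - len(missing), len(live), missing


# Made-up inputs; the second live URL is intentionally absent from the build.
live = [
    "/en/bitrise-ci/workflows-and-pipelines/workflows/creating-a-workflow.html",
    "/en/bitrise-ci/getting-started.html",
]
built = [
    "/en/bitrise-ci/workflows-and-pipelines/workflows/creating-a-workflow.html",
]

covered, total, missing = url_coverage(live, built)
print(f"{covered}/{total} covered; missing: {missing}")
```

Listing the missing URLs, rather than only the count, makes a regression in a later migration run easy to trace to a specific page.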