Archive, search, and browse Discord server messages with a fast web viewer.
- Full archive: channels, threads, attachments, reactions, embeds, users
- Incremental updates: only fetches new messages on re-runs
- Full-text search: FTS5-powered search with #channel and @user filters
- Keyboard-driven: use-kbd omnibar (Cmd+K), shortcuts, arrow-key navigation
- Deployable: Cloudflare Workers + D1 + Pages, with GitHub Actions CI/CD
- Versioned data: DVX-tracked archive with S3 remote cache
# 1. Set your Discord bot token and guild ID
export DISCORD_TOKEN="your-bot-token"
export DISCORD_GUILD="your-guild-id"
# 2. Archive all messages
./archive.py
# 3. Build the SQLite database
./build_db.py
# 4. Start the local API server
./server.py &
# 5. Start the viewer
cd app && pnpm install && pnpm dev
# Open http://localhost:5272/discord-archive/
archive.py # Discord API → JSON (incremental, per-channel files)
build_db.py # JSON → SQLite (normalized, FTS5 search index)
build_index.py # JSON → index.json (for static viewer)
server.py # Local dev API server (Starlette + SQLite)
archive/ # DVX-tracked raw JSON archive + attachments
archive.db # SQLite database (derived from archive/)
app/ # Vite + React viewer
api/ # Cloudflare Worker (D1-backed API)
d1-import.sh # Full SQLite → D1 import
d1-sync.py # Incremental D1 sync (zero downtime)
actions/ # Composite GH Actions for downstream repos
.github/workflows/ # CI (typecheck)
Archives all messages from a Discord guild to per-channel JSON files.
./archive.py # archive all channels + threads
./archive.py --no-threads # skip thread messages
./archive.py --no-attachments # skip downloading attachments
./archive.py --backfill-attachments # re-fetch expired CDN URLs and download
./archive.py -g 123456789 # specify guild ID
./archive.py -o my-archive # custom output directory

Requires DISCORD_TOKEN env var (bot token with Message Content intent).
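
The incremental behavior can be pictured with a small sketch. This is not the code in archive.py. It assumes each per-channel file holds a JSON list of message objects with an `id` field, and that the newest stored ID is sent as the `after` parameter of Discord's `GET /channels/{channel.id}/messages` endpoint. The file layout and the helper names `newest_message_id` and `next_fetch_params` are hypothetical.

```python
import json
from pathlib import Path


def newest_message_id(messages):
    """Return the largest snowflake ID in a list of message dicts, or None."""
    # Snowflake IDs are numeric strings, so compare them as integers.
    ids = [int(m["id"]) for m in messages]
    return str(max(ids)) if ids else None


def next_fetch_params(channel_file, limit=100):
    """Build query parameters that request only messages newer than the archive.

    Assumes (hypothetically) that channel_file contains a JSON list of messages.
    """
    path = Path(channel_file)
    messages = json.loads(path.read_text()) if path.exists() else []
    params = {"limit": limit}  # Discord caps this endpoint at 100 per request
    last_id = newest_message_id(messages)
    if last_id is not None:
        params["after"] = last_id  # only messages after this ID are returned
    return params
```

Comparing IDs as integers matters here: as strings, `"9"` would sort after `"10"`.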
Builds a normalized SQLite database from the JSON archive.
./build_db.py # default: archive/ → archive.db
./build_db.py -i my-archive -o my.db # custom paths

Creates tables: channels, messages, users, attachments, reactions, embeds, threads, plus a messages_fts FTS5 index.
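
To show how a table like messages_fts can sit next to a messages table, here is a minimal in-memory sketch. The column names and the `build_demo_db` and `search` helpers are assumptions for illustration; the real schema produced by build_db.py may differ. It also assumes the Python `sqlite3` module was built with FTS5 enabled.

```python
import sqlite3


def build_demo_db(messages):
    """Create an in-memory database with a messages table and an FTS5 index."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE messages (
            id TEXT PRIMARY KEY,
            channel_id TEXT NOT NULL,
            content TEXT NOT NULL
        );
        -- message_id is stored but not tokenized, so it can be returned with hits.
        CREATE VIRTUAL TABLE messages_fts USING fts5(content, message_id UNINDEXED);
        """
    )
    for m in messages:
        db.execute(
            "INSERT INTO messages (id, channel_id, content) VALUES (?, ?, ?)",
            (m["id"], m["channel_id"], m["content"]),
        )
        db.execute(
            "INSERT INTO messages_fts (content, message_id) VALUES (?, ?)",
            (m["content"], m["id"]),
        )
    db.commit()
    return db


def search(db, query):
    """Return IDs of messages matching an FTS5 query, best match first."""
    rows = db.execute(
        "SELECT message_id FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]
```

For example, `search(db, "deploy")` returns the IDs of messages whose content contains that word; the default tokenizer ignores case.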
Local development API server.
./server.py # serves archive.db on :5273

Endpoints: /api/channels, /api/channels/:id/messages, /api/messages/:id, /api/search, /api/users. Also serves downloaded attachments from /attachments/.
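
A small client can be handy for poking at these endpoints from a script. The sketch below uses only the standard library. It assumes the server is running on port 5273 and returns JSON bodies; the helper names are hypothetical, and the `q` query parameter in the usage note is a guess, since the parameter names are not listed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://localhost:5273"  # port from the server.py line above


def api_url(path, base=API_BASE, **params):
    """Join base and path, appending any query parameters."""
    url = f"{base.rstrip('/')}/{path.lstrip('/')}"
    return f"{url}?{urlencode(params)}" if params else url


def get_json(path, **params):
    """Fetch an endpoint and decode its body as JSON (assumed response format)."""
    with urlopen(api_url(path, **params)) as resp:
        return json.load(resp)
```

With the server running, `get_json("/api/channels")` fetches the channel list, and `get_json("/api/search", q="deploy")` would run a search if the parameter is indeed called `q`.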
React + TypeScript + Vite application with:
- Virtual scrolling (TanStack Virtual) for large channels
- Full-text search with #channel and @user autocomplete
- Permalink URLs (#channelId/messageId)
- Message grouping, reactions with tooltips, embed rendering
- Keyboard navigation via use-kbd (Cmd+K omnibar, / search, ? shortcuts)
- Responsive layout (collapsible sidebar, mobile support)
- Prefetch on hover (channels, search results, mentions)
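
The #channel and @user search filters and the #channelId/messageId permalink form can be illustrated with two small parsers. The viewer itself is TypeScript; this Python sketch only mirrors the idea and is not taken from the app. The exact filter syntax the viewer accepts is an assumption, and the function names are hypothetical.

```python
def parse_search(query):
    """Split a query into channel filters, user filters, and free text."""
    channels, users, terms = [], [], []
    for token in query.split():
        if token.startswith("#") and len(token) > 1:
            channels.append(token[1:])
        elif token.startswith("@") and len(token) > 1:
            users.append(token[1:])
        else:
            terms.append(token)
    return {"channels": channels, "users": users, "text": " ".join(terms)}


def parse_permalink(fragment):
    """Split a '#channelId/messageId' fragment into (channel_id, message_id)."""
    channel_id, _, message_id = fragment.lstrip("#").partition("/")
    # A fragment with no slash points at a channel rather than a message.
    return channel_id, (message_id or None)
```

Under these assumptions, `parse_search("#general @alice deploy worker")` yields one channel filter, one user filter, and the text `"deploy worker"`.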
cd app
pnpm install
pnpm dev # http://localhost:5272 (proxies /api to :5273)
pnpm build # production build

The api/ directory contains a Cloudflare Worker that serves the same API backed by D1.
cd api
pnpm install
# Create D1 database
npx wrangler d1 create my-discord-archive
# Update wrangler.toml with the database_id
# Import data
./d1-import.sh ../archive.db # local D1
./d1-import.sh --remote ../archive.db # remote D1
# Deploy worker
npx wrangler deploy
# Deploy viewer
cd ../app
VITE_API_BASE=https://your-worker.workers.dev pnpm build
npx wrangler pages deploy dist --project-name my-discord-archive

./archive.py # fetch new messages
./build_db.py # rebuild SQLite
cd api && ./d1-sync.py --remote # sync delta to D1 (zero downtime)

This repo provides composite actions that downstream repos use in their workflows:
# Deploy the viewer to Cloudflare Pages
- uses: Open-Athena/discord-agent/actions/deploy-app@v1
  with:
    pages_project_name: my-discord-archive
    vite_api_base: https://my-worker.workers.dev
    cloudflare_token: ${{ secrets.CLOUDFLARE_TOKEN }}
    cloudflare_account_id: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
# Deploy the Worker API
- uses: Open-Athena/discord-agent/actions/deploy-worker@v1
  with:
    cloudflare_token: ${{ secrets.CLOUDFLARE_TOKEN }}
    cloudflare_account_id: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
    wrangler_toml: wrangler.toml # your project's wrangler.toml
# Update the archive (fetch + rebuild + sync D1)
- uses: Open-Athena/discord-agent/actions/update-archive@v1
  with:
    discord_token: ${{ secrets.DISCORD_TOKEN }}
    cloudflare_token: ${{ secrets.CLOUDFLARE_TOKEN }}
    wrangler_toml: wrangler.toml

See actions/ for full input documentation.
The archive/ directory is tracked with DVX (a DVC fork). Each archive update creates a new snapshot; individual file blobs are deduplicated.
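
The snapshot-plus-deduplication idea can be sketched as content-addressed storage. This is a generic illustration, not DVX's actual storage format or hashing scheme: the `snapshot` helper, the in-memory `store` dict, and the choice of SHA-256 are all assumptions.

```python
import hashlib


def snapshot(files, store):
    """Record a snapshot as {path: digest}, adding each unique blob to store once.

    files: mapping of path -> bytes; store: mapping of digest -> bytes.
    """
    manifest = {}
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # identical content is stored only once
        manifest[path] = digest
    return manifest
```

Two snapshots that share unchanged files then point at the same digests, so only new or modified blobs add to the store.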
dvx add archive # track archive state
dvx push # push to S3 remote
dvx pull # restore archive from remote

- Go to the Discord Developer Portal
- Create a new application, add a bot
- Enable Message Content Intent under Bot settings
- Generate a bot token → set as DISCORD_TOKEN
- Invite the bot to your server with Read Message History + Read Messages permissions
- Find your guild ID (right-click server name → Copy Server ID) → set as DISCORD_GUILD
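
Before running ./archive.py, it can help to confirm that both variables from the steps above are set. The check below is a hypothetical helper, not part of this repo; it only assumes that a guild ID copied from Discord is a numeric string.

```python
import os


def load_discord_config(env=os.environ):
    """Read DISCORD_TOKEN and DISCORD_GUILD, failing early with a clear message."""
    token = env.get("DISCORD_TOKEN", "").strip()
    guild = env.get("DISCORD_GUILD", "").strip()
    missing = [
        name
        for name, value in (("DISCORD_TOKEN", token), ("DISCORD_GUILD", guild))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    if not guild.isdigit():
        # Server IDs copied via "Copy Server ID" are numeric snowflakes.
        raise ValueError("DISCORD_GUILD should be a numeric server ID")
    return {"token": token, "guild_id": guild}
```

Calling `load_discord_config()` with no arguments reads the current process environment; passing a plain dict makes the check easy to test.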