ensk.is is a free, open English-Icelandic dictionary web app (https://ensk.is). Based on Geir T. Zoëga's 1896 dictionary, extended with modern entries.
Python 3.12, FastAPI, Jinja2, SQLite, Uvicorn. Linted with ruff, type-checked with pyright.
Dictionary data lives in plain text files under data/dict/ (one file per letter: a.txt–z.txt). Modern additions go in data/dict/_add.txt. All generated artifacts (SQLite DB, JSON, CSV, PDF) are derived from these files.
Entry format: `WORD CATEGORY. definition in Icelandic`
Categories: n. (noun), s. (verb/sögn), l. (adjective/lýsingarorð), ao. (adverb/atviksorð), fs. (preposition/forsetning), gr. (article/greinir), st. (conjunction/samtenging). Multiple meanings separated by ;. Bracketed text [like this] denotes usage examples or clarification.
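The entry format above can be sketched as a small parser. This is a minimal illustration, not the code in `dict.py`: the names `Entry`, `ENTRY_RE` and `parse_entry` are hypothetical, the sample line in the usage note is made up rather than copied from the data files, and the sketch assumes that bracketed text never contains a `;`.

```python
import re
from dataclasses import dataclass

# Category abbreviations from the list above, mapped to English names.
CATEGORIES = {
    "n": "noun",
    "s": "verb",
    "l": "adjective",
    "ao": "adverb",
    "fs": "preposition",
    "gr": "article",
    "st": "conjunction",
}

# Hypothetical pattern: headword, whitespace, category abbreviation plus ".",
# whitespace, then the Icelandic definition.
ENTRY_RE = re.compile(
    r"^(?P<word>.+?)\s+(?P<cat>"
    + "|".join(sorted(CATEGORIES, key=len, reverse=True))
    + r")\.\s+(?P<definition>.+)$"
)


@dataclass
class Entry:
    word: str
    category: str
    meanings: list[str]


def parse_entry(line: str) -> Entry:
    """Parse one 'WORD CATEGORY. definition' line into an Entry."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Not a valid entry line: {line!r}")
    # Meanings are separated by ';' (assumes no ';' inside [brackets]).
    meanings = [m.strip() for m in match["definition"].split(";") if m.strip()]
    return Entry(match["word"], match["cat"], meanings)
```

For example, `parse_entry("cat n. köttur; kisa [gæludýr]")` gives the headword `cat`, category `n`, and two meanings, with the bracketed clarification kept as part of the second one.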
# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt
# Verify dictionary source files (syntax/integrity checks)
python gen/verify.py
# Generate dictionary data (SQLite DB, JSON, CSV, etc.)
python gen/gen.py
# Run the web app
uvicorn app:app --host localhost --port 8000
# Run tests
python -m pytest
No Makefile; commands are run directly.
CI runs on every push/PR. Steps in order:
1. `ruff check *.py routes/*.py gen/*.py tests/*.py`
2. `curlylint templates/*.html`
3. `pyright *.py routes/*.py gen/*.py tests/*.py`
4. `python gen/verify.py`
5. `python gen/gen.py`
6. `python -m pytest`
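Since there is no Makefile, the same sequence can be reproduced locally with a short script. The sketch below is an assumption about how one might do that, not part of the repository: `STEPS`, `run_steps` and `_run_shell` are hypothetical names, and stopping at the first failing step is assumed behaviour.

```python
import subprocess
import sys
from collections.abc import Callable, Sequence

# The commands listed above, in order. They are run through the shell
# so that globs such as *.py are expanded.
STEPS = [
    "ruff check *.py routes/*.py gen/*.py tests/*.py",
    "curlylint templates/*.html",
    "pyright *.py routes/*.py gen/*.py tests/*.py",
    "python gen/verify.py",
    "python gen/gen.py",
    "python -m pytest",
]


def _run_shell(command: str) -> int:
    """Run one command in a shell and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_steps(
    steps: Sequence[str], run: Callable[[str], int] = _run_shell
) -> int:
    """Run steps in order; stop and return the exit code of the first failure."""
    for step in steps:
        code = run(step)
        if code != 0:
            print(f"FAILED ({code}): {step}", file=sys.stderr)
            return code
    return 0
```

Running `sys.exit(run_steps(STEPS))` from the repository root would execute all six checks. The `run` parameter exists only so the ordering logic can be exercised without invoking the real tools.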
- Ruff with Black defaults: 88 char lines, 4-space indent, double quotes
- Target: Python 3.12
- Config in `.ruff.toml`
app.py          FastAPI app factory, middleware, CSP headers
dict.py         Dictionary parsing, definition unpacking
db.py           SQLite database singleton and queries
util.py         Shared utilities
settings.py     Configuration (pydantic-settings, reads .env)
routes/
  api.py        REST API: /api/search, /api/item, /api/suggest, /api/metadata
  web.py        HTML pages: /, /item/{word}, /about, /files, /stats, etc.
  core.py       Shared route logic, caching, DB init
  static.py     Static file serving
gen/
  gen.py        Main generator: text files → SQLite + exports
  verify.py     Dictionary syntax and integrity validation
  audio.py      Speech synthesis (macOS only, uses `say`)
  macos.py      Apple Dictionary bundle generation
  pdf/          PDF export
data/
  dict/         Dictionary source text files (a.txt–z.txt, _add.txt)
  wordlists/    English word lists for validation
  ipa/          IPA pronunciation data (uk/, us/)
  freq/         Word frequency data
  syllables/    Syllable breakdowns
templates/      Jinja2 HTML templates
static/         CSS, JS, images, fonts, audio, generated files
tests/          pytest tests (test_dict.py, test_routes.py, test_util.py)
tools/          Analysis scripts (IPA, syllables, frequency, duplicates, etc.)
- The `data/dict/` text files are canonical. Edit those, then regenerate with `gen/gen.py`.
- `_add.txt` contains all modern additions (not in Zoëga's original). New entries go here unless they are corrections to existing entries in a letter file.
- Letter files (a.txt–z.txt) are sorted alphabetically, one entry per line.
- `gen/verify.py` should pass before `gen/gen.py` is run.
- The web app opens `dict.db` in read-only mode. Regenerate the DB to pick up text file changes.
build_deploy.sh runs gen.py + audio.py, then rsyncs to root@ensk.is:/www/ensk.is/html/.