16 families still lack source provenance — help needed identifying upstream repos #10381

@felipesanches

Note: This post was generated by an AI agent (Claude) working under the guidance of @felipesanches, but submitted without human review. @felipesanches himself would still need to participate in the PR thread if he wants to contribute to the review.

Context

Over the past week, we've made significant progress documenting source provenance for the Google Fonts library. Through systematic investigation of METADATA.pb files, upstream repo discovery, and the archival of 982+ bare git mirrors (55 GB), we've brought source block coverage from 83% to 96%+ of all 2,002 families.

Key milestones:

The remaining 16 families

After all known repos are accounted for, 16 families still have no identified upstream source repository. We're asking for help from the community — if you know where the sources for any of these fonts live, please comment here.

| Family | Designer / Foundry | Scripts | Last Binary Update (commit) | Onboarding Date |
|---|---|---|---|---|
| Black And White Picture | AsiaSoft Inc. | Korean | 2018-03-13 (16680f8688) | 2018-02-27 |
| Chenla | Danh Hong | Khmer | 2021-11-08 (84b31698cb) | 2011-03-02 |
| Content | Danh Hong | Khmer | 2015-03-07 (90abd17b4f) | 2011-03-02 |
| Cute Font | TypoDesign Lab. Inc | Korean | 2018-03-13 (16680f8688) | 2018-02-23 |
| Dokdo | FONTRIX | Korean | 2018-03-13 (16680f8688) | 2018-02-23 |
| Gaegu | JIKJI SOFT | Korean | 2018-03-13 (16680f8688) | 2018-02-28 |
| Poor Story | Yoon Design | Korean | 2018-03-13 (1ef157d393) | 2018-02-23 |
| PT Mono | ParaType | Cyrillic | 2015-03-07 (90abd17b4f) | 2012-02-29 |
| PT Sans | ParaType | Cyrillic | 2015-03-07 (90abd17b4f) | 2010-09-21 |
| PT Serif Caption | ParaType | Cyrillic | 2015-03-07 (90abd17b4f) | 2011-02-09 |
| Single Day | DXKorea Inc | Korean | 2018-03-13 (81997650b4) | 2018-02-23 |
| Sitara | Neelakash Kshetrimayum | Devanagari | 2015-06-08 (7e42686751) | 2015-06-10 |
| Song Myung | JIKJI | Korean | 2018-03-13 (16680f8688) | 2018-02-23 |
| Stylish | AsiaSoft Inc | Korean | 2018-03-13 (16680f8688) | 2018-02-27 |
| Sunflower | JIKJISOFT | Korean | 2018-03-16 (5ea1323a54) | 2018-02-27 |
| Uchen | Christopher J. Fynn | Tibetan | 2019-12-11 (b44e8365d6) | 2019-12-07 |

(Noto Color Emoji Compat Test is also without a source block, but it's a test font created entirely within google/fonts — no upstream exists.)

Patterns

  • 8 Korean families from foundries with no known GitHub presence (AsiaSoft, TypoDesign Lab, FONTRIX, JIKJI, Yoon Design, DXKorea, JIKJISOFT)
  • 3 ParaType PT families — sources appear to be proprietary/internal
  • 2 Danh Hong Khmer families: the danhhong GitHub user has repos for Khmer and Siemreap, but not Chenla or Content
  • 1 Tibetan family (Uchen) — designer Christopher J. Fynn; Savannah project free-tibetan exists but may not contain this specific font
  • 1 Devanagari family (Sitara) — designer has no known GitHub presence
  • 1 Korean family (Sunflower) from JIKJISOFT with Hangul script

What we've searched

For each of these families, we've already checked:

  • GitHub search (by family name, designer name, foundry name)
  • The googlefontdirectory-hg monorepo (pre-GitHub canonical source)
  • The librefonts/ GitHub org (TTX mirrors only, not original sources)
  • Font binary name tables for embedded URLs
  • FONTLOG.txt and DESCRIPTION.en_us.html for repo references
  • Google Code Archive, SourceForge, Launchpad, Font Library
  • google/fonts git commit history and PR bodies
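For anyone who wants to repeat the text-based checks above (FONTLOG.txt, DESCRIPTION.en_us.html, strings dumped from name tables), here is a minimal sketch of a URL scan. It is not the tooling used for this investigation. The `REPO_HOSTS` list and the `find_repo_urls` helper are assumptions made up for illustration, and the scan only finds URLs that are written out literally in the text.

```python
import re
from html import unescape
from urllib.parse import urlsplit

# Hosts treated as "code hosting" here are an assumption for this sketch;
# extend the tuple for whatever hosts are relevant.
REPO_HOSTS = (
    "github.com",
    "gitlab.com",
    "bitbucket.org",
    "sourceforge.net",
    "launchpad.net",
    "code.google.com",
)

# Matches http(s) URLs up to whitespace, quotes, angle brackets or ")".
URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def find_repo_urls(text):
    """Return de-duplicated URLs in `text` whose host is a known code host."""
    found = []
    for match in URL_RE.findall(unescape(text)):
        url = match.rstrip(".,;")  # drop trailing sentence punctuation
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:
            continue  # malformed URL, skip it
        if host.startswith("www."):
            host = host[4:]
        is_repo_host = any(host == h or host.endswith("." + h) for h in REPO_HOSTS)
        if is_repo_host and url not in found:
            found.append(url)
    return found
```

Running it over the contents of a family's DESCRIPTION.en_us.html or FONTLOG.txt returns only the links that point at a hosting site, in the order they appear.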

Next steps

Beyond resolving these 16 families, the planned next phase of this work is:

  1. Reproducible builds: Ensuring the full library can be reliably built from sources. We've already tested 1,381 families (306 byte-identical, 921 compiler-version match). The build system improvements (multi-license-dir support, local archive extraction, legacy source classification) are ready for broader testing.

  2. Source preservation: We've built a permanent repo archive of 982+ bare git mirrors (55 GB). The upstream source preservation investigation documented that 20%+ of families had source reliability issues (force-pushed repos, deleted repos, missing commits). The archive ensures these sources survive upstream changes.

  3. Preventing drift: Establishing tooling and processes to ensure source provenance information stays accurate as new fonts are onboarded and existing fonts are updated. This includes periodic verification that all METADATA.pb commit hashes remain reachable in their upstream repos.
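The periodic verification in step 3 could look roughly like the sketch below. It rests on assumptions: that METADATA.pb carries a `source { ... }` block with quoted `repository_url` and `commit` fields, and that a local bare mirror of the upstream repo is available. `parse_source_block` and `commit_exists` are hypothetical helper names, not existing tooling. Note that `git cat-file -e` only confirms the commit object is present in the mirror, which is a weaker check than reachability from a branch or tag.

```python
import re
import subprocess

# Matches lines such as:  commit: "abc123"
FIELD_RE = re.compile(r'(\w+):\s*"([^"]*)"')


def parse_source_block(metadata_text):
    """Return top-level string fields of the first `source { ... }` block.

    Assumes one field or brace per line; nested blocks are skipped.
    Returns {} if no source block is found.
    """
    fields = {}
    depth = 0
    in_source = False
    for line in metadata_text.splitlines():
        stripped = line.strip()
        if not in_source:
            if re.fullmatch(r"source\s*\{", stripped):
                in_source, depth = True, 1
            continue
        if stripped.endswith("{"):
            depth += 1  # entering a nested block
        elif stripped == "}":
            depth -= 1
            if depth == 0:
                break  # end of the source block
        elif depth == 1:
            m = FIELD_RE.match(stripped)
            if m:
                fields[m.group(1)] = m.group(2)
    return fields


def commit_exists(mirror_dir, commit):
    """True if `commit` names a commit object in the bare repo at `mirror_dir`."""
    if not re.fullmatch(r"[0-9a-fA-F]{7,40}", commit):
        return False  # refuse anything that is not a plain hex hash
    result = subprocess.run(
        ["git", "--git-dir", mirror_dir, "cat-file", "-e", commit + "^{commit}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A scheduled job could call `parse_source_block` on each family's METADATA.pb, pass the resulting `commit` to `commit_exists` against the matching archived mirror, and report any family where the check fails.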

cc @NicholasJohnson @nicholasjohnson-monde @AsiaSoftInc (if any of these accounts are active) — any leads on the Korean font sources would be very helpful. The nicholasjohnson/* repos (23 fonts) were previously deleted but some may have been re-hosted elsewhere.

cc @davelab6 @rsheeter — would appreciate any leads on the remaining 16 families, especially the Korean and ParaType ones.
