fix: do not fetch metadata for self-hosted Gitea/Forgejo #2788

Open
serhalp wants to merge 2 commits into main from serhalp/fix/do-not-fetch-gitea-forgejo-selfhost-metadata

Conversation

@serhalp
Member

@serhalp serhalp commented May 24, 2026

🔗 Linked issue

N/A

🧭 Context

Repo URLs come from npm package metadata, so package publishers can specify any hostname. As this is effectively user-controlled input that can point at a malicious user-controlled server, this would put us at risk of Server-Side Request Forgery (SSRF). The scope of what SSRF here could accomplish is quite limited, but it would be better to nip this in the bud now in case of future changes or exploit chaining.

📚 Description

Only support allowlisted hosts.

Repo URLs come from npm package metadata, so package publishers can specify any hostname. As this is
effectively user-controlled input that can point at a malicious user-controlled server, this would
put us at risk of Server-Side Request Forgery (SSRF). Thus we only support allowlisted hosts.
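
A minimal sketch of the allowlist idea described above, assuming a hypothetical `ALLOWED_REPO_HOSTS` set and a hypothetical helper named `toAllowlistedUrl` (neither name appears in the PR). It parses the publisher-supplied URL and returns it only when the hostname is an exact allowlist member, so no request is ever built for an arbitrary host:

```typescript
// Hypothetical allowlist for illustration only; the PR defines its own constants.
const ALLOWED_REPO_HOSTS: ReadonlySet<string> = new Set(['gitea.com', 'next.forgejo.org'])

/**
 * Returns the parsed URL only when it is HTTPS and its hostname exactly
 * matches an allowlisted host. Anything else yields undefined.
 */
function toAllowlistedUrl(rawUrl: string): URL | undefined {
  let parsed: URL
  try {
    parsed = new URL(rawUrl)
  } catch {
    return undefined // not a valid absolute URL
  }
  if (parsed.protocol !== 'https:') return undefined
  // The URL parser lowercases the hostname and separates out any userinfo,
  // so 'https://gitea.com@evil.example/' has hostname 'evil.example'.
  return ALLOWED_REPO_HOSTS.has(parsed.hostname) ? parsed : undefined
}
```

Comparing the parsed `hostname` rather than searching the raw string is the design choice that matters here: substring or prefix checks on the raw URL are easy to bypass with userinfo or lookalike suffixes.
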
@vercel
Contributor

vercel Bot commented May 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs.npmx.dev | Ready | Preview, Comment | May 24, 2026 2:19pm |
| npmx.dev | Ready | Preview, Comment | May 24, 2026 2:19pm |

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| npmx-lunaria | Ignored | | May 24, 2026 2:19pm |

@coderabbitai
Contributor

coderabbitai Bot commented May 24, 2026

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features
    • Added explicit support for allowlisted Gitea and Forgejo instances.
    • Enabled caching for gitea.com requests during server-side rendering.

Walkthrough

The pull request restricts Forgejo and Gitea provider detection to explicit allowlists (FORGEJO_HOSTS and GITEA_HOSTS) to mitigate SSRF risk. Provider matching logic is simplified from pattern-based detection to direct allowlist checks. API origins are extended to include the allowlisted hosts, gitea.com is added to fetch cache configuration, and test coverage is updated to verify allowlist-only behaviour.

Changes

Host allowlist and provider matching

| Layer / File(s) | Summary |
| --- | --- |
| Allowlist definitions and provider matching logic (shared/utils/git-providers.ts, shared/utils/repository-meta.ts) | Introduces FORGEJO_HOSTS and GITEA_HOSTS allowlist constants with SSRF documentation. Forgejo and Gitea provider matchHost methods are updated to check membership in these allowlists instead of pattern matching. ALL_KNOWN_GIT_API_ORIGINS is extended with HTTPS origins derived from the allowlists. Adapter comments in repository-meta.ts are updated to indicate support for exact allowlisted instances. |
| Fetch cache configuration (shared/utils/fetch-cache-config.ts) | Adds gitea.com to FETCH_CACHE_ALLOWED_DOMAINS so requests to the allowlisted Gitea host are eligible for caching during server-side rendering. |
| Test coverage for allowlist behaviour (test/nuxt/composables/use-repo-meta.spec.ts, test/unit/shared/utils/git-providers.spec.ts) | Removes tests expecting generic git.* and gitea.* subdomain patterns and the forgejo.example.com case. Adds tests verifying that parseRepositoryInfo correctly handles the allowlisted gitea.com and computes expected metadata. |
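
The changes summarised above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: the `FORGEJO_HOSTS` values are the ones quoted later in this conversation, `GITEA_HOSTS` is assumed to contain only `gitea.com`, and the `ProviderSketch` interface, `detectProvider`, and `allowlistedApiOrigins` are hypothetical names.

```typescript
// Forgejo hosts as quoted in the review thread; the Gitea list is an assumption.
const FORGEJO_HOSTS = ['next.forgejo.org', 'try.next.forgejo.org']
const GITEA_HOSTS = ['gitea.com']

// Hypothetical, simplified provider shape: only the host-matching part is modelled.
interface ProviderSketch {
  id: 'forgejo' | 'gitea'
  matchHost(host: string): boolean
}

// Matching is a plain membership check instead of a hostname pattern.
const providers: ProviderSketch[] = [
  { id: 'forgejo', matchHost: host => FORGEJO_HOSTS.includes(host) },
  { id: 'gitea', matchHost: host => GITEA_HOSTS.includes(host) },
]

// HTTPS origins derived from the allowlists, so the two cannot drift apart.
const allowlistedApiOrigins = [...FORGEJO_HOSTS, ...GITEA_HOSTS].map(host => `https://${host}`)

// Returns the provider id for an allowlisted host, or undefined for any other host.
function detectProvider(host: string): ProviderSketch['id'] | undefined {
  return providers.find(provider => provider.matchHost(host))?.id
}
```

Deriving the origin list from the same constants used for matching keeps a single source of truth for which hosts the server may contact.
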
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix: do not fetch metadata for self-hosted Gitea/Forgejo' clearly and specifically describes the main security fix implemented across multiple files to restrict metadata fetching to allowlisted hosts only. |
| Description check | ✅ Passed | The description provides clear context explaining the SSRF security risk from user-controlled hostnames in npm package metadata and describes the solution of supporting only allowlisted hosts. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codecov

codecov Bot commented May 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@shared/utils/git-providers.ts`:
- Around line 46-51: FORGEJO_HOSTS and GITEA_HOSTS are currently mutable arrays;
make them immutable to prevent runtime mutation widening the trusted hosts (and
update any derived exports like ALL_KNOWN_GIT_API_ORIGINS). Replace the mutable
exports with readonly constants (e.g., frozen arrays or TypeScript readonly
tuples/ReadonlyArray<string>) and ensure ALL_KNOWN_GIT_API_ORIGINS is produced
from those readonly values (not mutated later); update any consumers to accept
ReadonlyArray<string> if necessary and remove any in-place mutations of
FORGEJO_HOSTS or GITEA_HOSTS.

In `@test/unit/shared/utils/git-providers.spec.ts`:
- Around line 341-352: Add negative tests alongside the existing Gitea allowlist
spec to assert that lookalike/non-allowlisted hosts are rejected by
parseRepositoryInfo: add test cases (e.g., "rejects non-allowlisted Gitea-like
hosts" and "rejects Forgejo-like hosts") that call parseRepositoryInfo with URLs
such as 'https://gitea.com.evil/owner/repo' or
'https://forgejo.example/owner/repo' and assert it either returns undefined/null
or throws (match existing failure behavior), and verify no provider is set to
'gitea'/'forgejo' and no rawBaseUrl is produced; locate tests near the existing
Gitea block in test/unit/shared/utils/git-providers.spec.ts and mirror the
style/assertion pattern used for allowlisted success cases.
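
The first inline comment above (immutable allowlists) could look something like the sketch below. The `SKETCH_*` constants and `tryToWiden` are hypothetical names used only for illustration, and the Gitea host list is an assumption. `readonly string[]` blocks mutation at compile time, and `Object.freeze` additionally blocks it at runtime.

```typescript
// Frozen, readonly allowlists: TypeScript rejects .push() on these, and the
// frozen arrays also refuse mutation at runtime if the type is cast away.
const SKETCH_GITEA_HOSTS: readonly string[] = Object.freeze(['gitea.com'])
const SKETCH_FORGEJO_HOSTS: readonly string[] = Object.freeze([
  'next.forgejo.org',
  'try.next.forgejo.org',
])

// Derived once from the readonly values and frozen as well.
const SKETCH_API_ORIGINS: readonly string[] = Object.freeze(
  [...SKETCH_FORGEJO_HOSTS, ...SKETCH_GITEA_HOSTS].map(host => `https://${host}`),
)

// Demonstrates the runtime guard: pushing onto a frozen array throws a TypeError.
function tryToWiden(hosts: readonly string[], extra: string): boolean {
  try {
    ;(hosts as string[]).push(extra)
    return true
  } catch {
    return false
  }
}
```

With this shape, a call such as `tryToWiden(SKETCH_GITEA_HOSTS, 'evil.example')` fails and the list keeps its original single entry.
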

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 310e3555-4dd3-4424-82c6-3fa9c5409590

📥 Commits

Reviewing files that changed from the base of the PR and between 0e9f1c8 and 8e09e1c.

📒 Files selected for processing (5)
  • shared/utils/fetch-cache-config.ts
  • shared/utils/git-providers.ts
  • shared/utils/repository-meta.ts
  • test/nuxt/composables/use-repo-meta.spec.ts
  • test/unit/shared/utils/git-providers.spec.ts

Comment thread shared/utils/git-providers.ts
Comment on lines +341 to 352
 describe('Gitea support', () => {
   it('parses exact allowlisted Gitea hosts', () => {
     const result = parseRepositoryInfo({
-      url: 'git+ssh://git@forgejo.myserver.com/user/project.git',
+      url: 'https://gitea.com/owner/repo',
     })
     expect(result).toMatchObject({
-      provider: 'forgejo',
-      owner: 'user',
-      repo: 'project',
-      host: 'forgejo.myserver.com',
+      provider: 'gitea',
+      owner: 'owner',
+      repo: 'repo',
+      host: 'gitea.com',
+      rawBaseUrl: 'https://gitea.com/owner/repo/raw/branch/main',
     })
Contributor

@coderabbitai coderabbitai Bot May 24, 2026

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add explicit deny-path tests for non-allowlisted Gitea/Forgejo-like hosts.

This block only proves the allowlisted success path. Please also assert that lookalike non-allowlisted hosts are rejected, so the SSRF boundary is locked in by tests.

Proposed test addition
 describe('Gitea support', () => {
   it('parses exact allowlisted Gitea hosts', () => {
     const result = parseRepositoryInfo({
       url: 'https://gitea.com/owner/repo',
     })
     expect(result).toMatchObject({
       provider: 'gitea',
       owner: 'owner',
       repo: 'repo',
       host: 'gitea.com',
       rawBaseUrl: 'https://gitea.com/owner/repo/raw/branch/main',
     })
   })
+
+  it('rejects non-allowlisted Gitea/Forgejo-like hosts', () => {
+    expect(
+      parseRepositoryInfo({ url: 'https://gitea.example.com/owner/repo' }),
+    ).toBeUndefined()
+    expect(
+      parseRepositoryInfo({ url: 'https://forgejo.example.com/owner/repo' }),
+    ).toBeUndefined()
+  })
 })

Member Author

I thought about that but decided against it. It would be a very strange, porous heuristic test that speculates about what future patterns someone might implement. I think it would only provide false confidence.

Contributor

@serhalp, that's a fair point. Negative tests on a handful of example hostnames wouldn't actually prove the allowlist is exhaustive — they'd just show those specific strings don't match today, which could give a false sense of security if the underlying logic ever drifted. The positive test on the allowlisted host combined with the explicit allowlist constants in the source is the right place to enforce the boundary. Happy to drop this suggestion.

(ᓀ‸ᓂ)
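
To make the boundary discussed in this thread concrete, here is a small comparison sketch. The regular expression is a hypothetical stand-in for the generic `git.*` / `gitea.*` detection that the walkthrough says was removed; the actual former pattern is not shown in this conversation. `looksLikeGitea` and `isExactGitea` are illustrative names, and the single-entry host list is an assumption.

```typescript
// Hypothetical pattern-based matcher: any host starting with "git." or "gitea.".
const looksLikeGitea = (host: string): boolean => /^(git|gitea)\./.test(host)

// Exact-match check against an explicit list (assumed here to be just gitea.com).
const EXACT_GITEA_HOSTS: readonly string[] = ['gitea.com']
const isExactGitea = (host: string): boolean => EXACT_GITEA_HOSTS.includes(host)

// Compare both approaches on an allowlisted host and two publisher-chosen hosts.
const comparison = ['gitea.com', 'gitea.attacker.example', 'git.internal.example'].map(host => ({
  host,
  pattern: looksLikeGitea(host),
  exact: isExactGitea(host),
}))
```

In this sketch the pattern accepts all three hosts while the exact check accepts only `gitea.com`. With exact membership, the set of reachable hosts is fully described by the constant itself.
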


🧠 Learnings used
Learnt from: ghostdevv
Repo: npmx-dev/npmx.dev PR: 2195
File: test/nuxt/components/Header/MobileMenu.spec.ts:7-11
Timestamp: 2026-03-23T01:33:11.889Z
Learning: When using or mocking the `useConnector` composable in `npmx-dev/npmx.dev`, treat its public API as `isConnected` (not `connected`). Even though the internal `ConnectorState` uses a `connected` field, the composable’s returned object exposes `isConnected: computed(() => state.value.connected)`. Therefore, in tests/components, mocks and usages should reference `isConnected` and should not be flagged as incorrect for using `isConnected`.

Learnt from: serhalp
Repo: npmx-dev/npmx.dev PR: 2459
File: test/unit/server/utils/likes-leaderboard.spec.ts:137-165
Timestamp: 2026-04-26T00:02:20.876Z
Learning: In this Nuxt project (npmx-dev/npmx.dev), the `Packument` type is globally available via Nuxt auto-imports from `shared/types/` (exported from `shared/types/npm-registry.ts`). Therefore, do not raise or require missing `import type { Packument } from '#shared/types'` (or any equivalent) when `Packument` is referenced, including in unit test files.

* is effectively user-controlled input that can point at a malicious user-controlled server, this
* would put us at risk of Server-Side Request Forgery (SSRF). Thus we only support allowlisted hosts.
*/
export const FORGEJO_HOSTS = ['next.forgejo.org', 'try.next.forgejo.org']
Contributor

codeberg? or is that done somewhere else?

Member Author

codeberg is handled as a separate provider

@serhalp serhalp requested review from a team and ghostdevv May 24, 2026 18:09