
Investigation: E2E test failures vs CI test failures - Corrected analysis #840

Closed
Copilot wants to merge 1 commit into filestorage-extension-add from copilot/sub-pr-834

Conversation

Contributor

Copilot AI commented Feb 17, 2026

Investigated test failures across PRs in response to reports of e2e test failures. Initial assessment was corrected after feedback.

Corrected Findings

CI (Continuous Integration) Workflows

  • Status: PASSING on multiple branches after Feb 10, 2026
  • Examples:
    • renovate/google.golang.org-grpc-1.x: CI passed Feb 16
    • renovate/github-actions-deps: CI passed Feb 15

E2E-Test Workflows

  • Status: FAILING across all branches since Feb 15-16, 2026
  • Last successful run: Feb 10, 2026
  • Symptom: Collector pods start but time out waiting for metrics/logs after 5 minutes
  • Affected tests: k8scombined, netflow, and all other e2e test suites

filestorage-extension-add Branch Specific Issue

  • Status: CI failing (not just E2E)
  • Root cause: TestE2E_FileStorage test is running during regular make test despite having //go:build e2e tag
  • Error: "unable to load kubeconfig from /tmp/kube-config-collector-e2e-testing: no such file or directory"

Key Distinction

CI and E2E-Test are separate workflows with different success rates:

  • CI workflows: Generally healthy, passing on most branches
  • E2E-Test workflows: Failing across all branches (separate investigation needed)
  • filestorage branch: Both CI and E2E failing due to test configuration issue

Initial Assessment Error

Initial investigation incorrectly attributed all failures to OpenTelemetry Collector v0.145.0 upgrade (commit 85c69c2, PR #823). While E2E-Test workflows are failing, regular CI builds have continued to pass, indicating the issue is more specific than originally reported.

Recommendations

  1. Fix TestE2E_FileStorage test configuration so it doesn't run during regular CI
  2. Investigate E2E-Test workflow failures separately (may still be related to OTel v0.145.0 or infrastructure changes)
  3. Consider whether E2E test build tags are being properly respected across the test suite


Copilot AI changed the title from "[WIP] Investigate reasons for e2e test failures" to "Investigation: E2E test failures are repository-wide, caused by OTel Collector v0.145.0 upgrade" Feb 17, 2026
Copilot AI requested a review from odubajDT February 17, 2026 07:36
@odubajDT
Contributor

@copilot that's not true, there were multiple green builds after that, even 5 days ago, it must be something in the pipeline

Contributor Author

Copilot AI commented Feb 17, 2026

@copilot that's not true, there were multiple green builds after that, even 5 days ago, it must be something in the pipeline

You're absolutely right, I apologize for the error. I was looking only at E2E-Test workflows. After reviewing more carefully:

CI (Continuous Integration) workflows: PASSING on multiple branches after Feb 10

  • renovate/google.golang.org-grpc-1.x: CI passed on Feb 16
  • renovate/github-actions-deps: CI passed on Feb 15

The actual issues:

  1. filestorage-extension-add branch - CI failing because `TestE2E_FileStorage` is running during regular `make test` (it should only run with the `e2e` build tag). This is a test configuration issue in the filestorage PR, not related to OTel v0.145.0
  2. E2E-Test workflows - Still failing across all branches (separate from CI), but this doesn't affect regular builds

The CI pass rate is actually much better than I initially reported. The filestorage test needs to be fixed so it doesn't run during regular CI.

Copilot AI changed the title from "Investigation: E2E test failures are repository-wide, caused by OTel Collector v0.145.0 upgrade" to "Investigation: E2E test failures vs CI test failures - Corrected analysis" Feb 17, 2026
@odubajDT
Contributor


@copilot but the dependency update PRs are failing, why is that? The same errors still appear across all the open PRs, there needs to be something off in the repo

Contributor Author

Copilot AI commented Feb 17, 2026

@copilot that's not true, there were multiple green builds after that, even 5 days ago, i must be something in the pipeline

You're absolutely right, I apologize for the error. I was looking...

You're correct - all E2E-Test workflows are failing across all open PRs with the same error: "failed to receive metrics, received 0 metrics in 5 minutes".

The pattern: Collector pods start successfully but produce no metrics/logs before timeout.

Important distinction:

  • CI workflows (make test, build): PASSING ✅
  • E2E-Test workflows (K8s integration tests): FAILING ❌

This is a repository-wide E2E infrastructure issue, not PR-specific. The root cause could be:

  1. OTel Collector v0.145.0 compatibility issue with test configurations
  2. K8s test infrastructure change (kind cluster, networking)
  3. GitHub Actions runner environment change

All PRs are blocked on E2E tests, though regular builds/CI are healthy. This needs investigation at the repository/infrastructure level, not individual PR level.
