Skip to content

fix(test-e2b): skip bench-command tests on Windows (POSIX shell only)#426

Merged
0xmariowu merged 1 commit intomainfrom
fix/cross-platform-skip-bash-only-tests
Apr 26, 2026
Merged

fix(test-e2b): skip bench-command tests on Windows (POSIX shell only)#426
0xmariowu merged 1 commit intomainfrom
fix/cross-platform-skip-bash-only-tests

Conversation

@0xmariowu
Copy link
Copy Markdown
Owner

Summary

tests/scripts/test_e2b_bench_commands.py builds shell command strings that target the e2b Linux sandbox (bash). On Windows runners, pytest invokes shell=True → cmd.exe, which fails on \$HOME and .venv/bin/python syntax with 'system cannot find the path specified' — completely unrelated to the shell-quoting behavior the test actually verifies.

Add module-level pytestmark = pytest.mark.skipif(sys.platform == 'win32').

Background

Cross-Platform Tests workflow run 24952635233 failed on this test. The previous Cross-Platform fix (#425) addressed test_judge.py Windows path-separator issue; this is the next casualty in the same workflow's Windows backlog.

Test plan

  • pytest tests/scripts/test_e2b_bench_commands.py -x -q → 2 passed (macOS, no skip)
  • Ruff clean
  • Cross-Platform Tests workflow re-run on the merged commit (Windows job will SKIP these tests)

Note

These tests are valuable on Linux/macOS — they catch real shell-injection regressions in the bench-command builders. Skipping on Windows only loses coverage on a platform that doesn't run the production code path anyway (e2b sandbox is always Linux).

The bench-command builders construct strings for the e2b Linux sandbox
(uses bash). On Windows pytest runners the test invokes 'shell=True'
which falls through to cmd.exe, where '\$HOME' and '.venv/bin/python'
syntax fail with 'system cannot find the path specified' — unrelated
to the actual shell-quoting behavior under test.

Add module-level pytestmark = pytest.mark.skipif(sys.platform == 'win32')
so Cross-Platform Tests workflow no longer flags these as red.
@0xmariowu 0xmariowu enabled auto-merge (squash) April 26, 2026 08:55
Copilot AI review requested due to automatic review settings April 26, 2026 08:55
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 26, 2026

Warning

Rate limit exceeded

@0xmariowu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 16 minutes and 26 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 16 minutes and 26 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3528ce0c-b1da-4983-94d8-18680ee1be29

📥 Commits

Reviewing files that changed from the base of the PR and between d095b93 and af1531c.

📒 Files selected for processing (1)
  • tests/scripts/test_e2b_bench_commands.py

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@0xmariowu 0xmariowu merged commit e3bddec into main Apr 26, 2026
17 checks passed
@0xmariowu 0xmariowu deleted the fix/cross-platform-skip-bash-only-tests branch April 26, 2026 08:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to address Windows failures in the Cross-Platform Tests workflow by conditionally skipping tests/scripts/test_e2b_bench_commands.py on win32, since the tests currently execute POSIX-oriented shell command strings that are incompatible with cmd.exe.

Changes:

  • Add pytest import and module-level pytestmark = pytest.mark.skipif(sys.platform == "win32", ...).
  • Document why the tests are skipped on Windows (POSIX-shell assumptions vs cmd.exe behavior).

Comment on lines +10 to +21
import pytest

# These tests assert that bench-command builders correctly escape shell
# metacharacters when the resulting string is fed to a POSIX shell — the same
# environment the e2b Linux sandbox uses in production. On Windows pytest
# runners (Cross-Platform Tests workflow) the command relies on `$HOME` and
# `.venv/bin/python` syntax that cmd.exe does not understand, so the tests
# fail for environment reasons unrelated to the quoting behavior under test.
pytestmark = pytest.mark.skipif(
sys.platform == "win32",
reason="bench commands target POSIX shell (e2b Linux sandbox); cmd.exe cannot run them",
)
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing a module-level pytestmark = pytest.mark.skipif(sys.platform == "win32", ...) is a test suppression and conflicts with the repo’s stated hard rule to not skip tests to resolve failures (see .github/copilot-instructions.md rules #2 and #4). Instead of skipping Windows entirely, consider making the test platform-independent by not executing via shell=True (cmd.exe) and asserting POSIX-shell safety structurally (e.g., parse the generated command as POSIX with shlex.split(..., posix=True) and assert the malicious input remains a single argument), or execute under an explicit POSIX shell invocation (bash -lc ...) with a conditional only if bash is unavailable.

Copilot generated this review using guidance from repository custom instructions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants