
Migrate Benchmark #731

Open
faridyagubbayli wants to merge 4 commits into master from migrate-benchmark

faridyagubbayli (Collaborator) commented May 14, 2026

Migrates the MATLAB benchmark.m script to Python, with some additions.

benchmarks/README.md shows sample usage.

Example output JSON file:
{
  "comp_size": [
    32768,
    65536,
    131072,
    262144,
    524288
  ],
  "comp_time": [
    2.813780007263025,
    2.5228265166903534,
    2.5327161628132067,
    4.796307735455533,
    10.499971745846173
  ],
  "options": {
    "data_cast": "off",
    "heterogeneous_media": true,
    "absorbing_media": true,
    "nonlinear_media": false,
    "binary_sensor_mask": true,
    "number_sensor_points": 100,
    "number_time_points": 1000,
    "num_averages": 3,
    "start_size": 32,
    "x_scale_array": [
      1,
      2,
      2,
      2,
      4,
      4,
      4,
      8,
      8,
      8,
      16,
      16
    ],
    "y_scale_array": [
      1,
      1,
      2,
      2,
      2,
      4,
      4,
      4,
      8,
      8,
      8,
      16
    ],
    "z_scale_array": [
      1,
      1,
      1,
      2,
      2,
      2,
      4,
      4,
      4,
      8,
      8,
      8
    ],
    "domain_size": 0.022,
    "sensor_radius": 0.01,
    "pml_size": 10,
    "pml_inside": true,
    "report_mem_usage": false,
    "backend": "python",
    "device": "gpu",
    "computer_name": "k-instance",
    "python_version": "3.11.14",
    "platform": "Linux-5.10.0-39-cloud-amd64-x86_64-with-glibc2.31",
    "kwave_python_version": "0.6.1"
  },
  "output_path": "<filename>",
  "error_reached": false,
  "error_message": ""
}
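
As a reading aid for the sample above, here is a minimal sketch of how the reported values relate. It assumes that each comp_size entry is the total number of grid points, that is, start_size multiplied by each of the three scale factors; this matches the five entries shown but is inferred from the numbers rather than taken from the benchmark source. The sample lists five results, which line up with the first five entries of the scale arrays. The helper name grid_points and the throughput figure are illustrative additions.

```python
# Values copied from the sample output above (first five scale entries only).
start_size = 32
x_scale = [1, 2, 2, 2, 4]
y_scale = [1, 1, 2, 2, 2]
z_scale = [1, 1, 1, 2, 2]
comp_size = [32768, 65536, 131072, 262144, 524288]
comp_time = [
    2.813780007263025,
    2.5228265166903534,
    2.5327161628132067,
    4.796307735455533,
    10.499971745846173,
]


def grid_points(start: int, sx: int, sy: int, sz: int) -> int:
    """Total grid points for one case (hypothetical helper, assumed formula)."""
    return (start * sx) * (start * sy) * (start * sz)


# Recompute comp_size from the options and compare with the reported values.
derived = [
    grid_points(start_size, sx, sy, sz)
    for sx, sy, sz in zip(x_scale, y_scale, z_scale)
]
assert derived == comp_size

# Grid points processed per second of averaged runtime, per case.
throughput = [n / t for n, t in zip(comp_size, comp_time)]
for n, t, rate in zip(comp_size, comp_time, throughput):
    print(f"{n:>7d} points  {t:7.3f} s  {rate:10.0f} points/s")
```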

Greptile Summary

This PR ports the MATLAB benchmark.m script to Python, implementing a 3D solver scaling benchmark that runs kspaceFirstOrder across a sequence of increasing grid sizes and records average runtimes. It also adds the PeakMemorySampler polling approach to replace the previously flagged resource.getrusage peak-RSS approach, and fixes the earlier case-identity and nan-in-JSON issues.

  • benchmarks/helpers.py: Introduces BenchmarkOptions (frozen dataclass), grid construction helpers, a PeakMemorySampler context manager that polls current RSS in a background thread, and cross-platform memory readers for Linux (/proc/self/statm), macOS/Linux-fallback (ps), and Windows (PSAPI via ctypes).
  • benchmarks/benchmark.py: Implements the run() loop, per-average timing and memory averaging, partial-result persistence after every solver call, and a main() entry point with CLI argument parsing.
  • tests/test_benchmark.py: Adds unit tests covering timing aggregation, case-index correctness, report_mem_usage path, unsupported platform early failure, and partial-result preservation on solver error.
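
For readers unfamiliar with the polling approach described for benchmarks/helpers.py, the following is a rough sketch of the general technique, not the code in this PR. The class name PeakSampler, its constructor arguments, and read_rss_linux are hypothetical; the only details taken from the summary are that current RSS is polled from a background thread inside a context manager and that Linux readings come from /proc/self/statm.

```python
import os
import threading
import time
from typing import Callable, Optional


class PeakSampler:
    """Track the largest value returned by read_bytes while the block runs."""

    def __init__(self, read_bytes: Callable[[], int], interval: float = 0.01):
        self._read_bytes = read_bytes
        self._interval = interval
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None
        self.peak_bytes = 0

    def _sample(self) -> None:
        self.peak_bytes = max(self.peak_bytes, self._read_bytes())

    def _poll(self) -> None:
        # Event.wait returns False on timeout, so this samples once per interval
        # until the stop event is set.
        while not self._stop.wait(self._interval):
            self._sample()

    def __enter__(self) -> "PeakSampler":
        self._sample()  # baseline reading before the work starts
        self._thread = threading.Thread(target=self._poll, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
        self._sample()  # final reading once the polling thread has stopped


def read_rss_linux() -> int:
    """Current resident set size in bytes (Linux only).

    The second field of /proc/self/statm is the resident size in pages.
    """
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


# Demonstration with a fake reader so the sketch runs on any platform.
readings = iter([100, 300, 200])
with PeakSampler(lambda: next(readings, 200), interval=0.001) as sampler:
    time.sleep(0.02)  # stands in for the solver call
print(sampler.peak_bytes)
```

Unlike a peak-RSS counter for the whole process lifetime, a polled sampler only sees what happens inside the block, at the cost of possibly missing spikes shorter than the polling interval.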

Confidence Score: 5/5

The benchmark is new infrastructure with no impact on existing library code paths; it is safe to merge.

All changed files are new additions under benchmarks/ and tests/. The run loop, rolling averages, partial-result persistence, and cross-platform memory readers are all logically correct and well-tested. Previously identified issues (case-identity collision, nan-in-JSON, unsupported-platform early failure, start_size validation, and ru_maxrss misuse) have been fully addressed. No regressions to existing library code are possible since no existing files are modified.
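
To illustrate what partial-result persistence and JSON-compatible output can look like together, here is a small sketch under stated assumptions. The names json_safe, save_results, and run_cases are illustrative (save_results echoes a label in the flowchart, but its real signature is not shown here), mapping nan to null is one possible fix rather than the confirmed one, and the write-then-rename step is an extra precaution added for the example.

```python
import json
import math
import os
import tempfile


def json_safe(value):
    """Replace non-finite floats with None so the output is strict JSON."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_safe(item) for item in value]
    return value


def save_results(result: dict, path: str) -> None:
    """Write the results so far; allow_nan=False raises if a nan slips through."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(json_safe(result), f, indent=2, allow_nan=False)
    os.replace(tmp_path, path)  # swap in the new file in one step


def run_cases(sizes, solver, path: str) -> dict:
    result = {"comp_size": [], "comp_time": [],
              "error_reached": False, "error_message": ""}
    for size in sizes:
        try:
            elapsed = solver(size)
        except Exception as exc:
            # Keep everything measured so far and record why the run stopped.
            result["error_reached"] = True
            result["error_message"] = str(exc)
            save_results(result, path)
            break
        result["comp_size"].append(size)
        result["comp_time"].append(elapsed)
        save_results(result, path)  # persist after every solver call
    return result


def flaky_solver(size: int) -> float:
    """Fake solver that fails on the third case."""
    if size == 3:
        raise RuntimeError("out of memory")
    return float(size)


with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "benchmark.json")
    result = run_cases([1, 2, 3, 4], flaky_solver, out_path)
    with open(out_path) as f:
        saved = json.load(f)
```

After the failure on the third case, the file on disk still holds the first two measurements along with the error flag and message.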

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| benchmarks/helpers.py | Core helpers, dataclass, and PeakMemorySampler: well-structured, previously flagged nan/platform issues resolved; logic is sound. |
| benchmarks/benchmark.py | Main run loop and CLI entry point: case-index-based result tracking is correct, early validation for mem_usage added, elapsed_time scoping is safe. |
| tests/test_benchmark.py | Test suite covers happy path, mem_usage path, unsupported platform, and partial-result on error; all previously uncovered paths now exercised. |
| benchmarks/__init__.py | Trivial package marker with docstring. |
| benchmarks/README.md | Usage documentation and output schema description: accurate and complete. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[run] --> B{report_mem_usage?}
    B -- yes --> C[validate_memory_bytes early check]
    B -- no --> D[for each case_index / nx,ny,nz,scale]
    C --> D
    D --> E[build_case: kgrid, medium, source, sensor]
    E --> F[for loop_num in 1..num_averages]
    F --> G{report_mem_usage?}
    G -- yes --> H[PeakMemorySampler polls RSS in background thread]
    H --> I[solver]
    I --> H2[exit: final sample + thread join]
    H2 --> J[rolling_average mem_usage]
    G -- no --> I2[solver]
    I2 --> K[record elapsed_time]
    J --> K
    K --> L[rolling_average loop_time]
    L --> M[store_case_result by case_index]
    M --> N[save_results to JSON partial persistence]
    N --> F
    F -- all averages done --> D
    D -- all cases done OR exception --> O{error?}
    O -- yes --> P[set error_reached + save_results + break]
    O -- no --> Q[return result dict]
    P --> Q
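
The rolling_average steps in the flowchart can be read as an incremental mean that is updated after each repetition, so a partial result is always a valid average of the runs completed so far. A minimal sketch, assuming a three-argument signature that is not confirmed by this page:

```python
def rolling_average(previous_mean: float, new_value: float, count: int) -> float:
    """Mean of `count` values, given the mean of the first `count - 1` values."""
    return previous_mean + (new_value - previous_mean) / count


# Example: three timed repetitions of one benchmark case (made-up timings).
loop_times = [3.0, 2.0, 4.0]
mean_time = 0.0
for loop_num, loop_time in enumerate(loop_times, start=1):
    mean_time = rolling_average(mean_time, loop_time, loop_num)

print(mean_time)
```

The final value equals the ordinary mean of the three timings, and the intermediate values are the means after one and two repetitions.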

Reviews (4): Last reviewed commit: "Merge branch 'master' into migrate-bench..."

codecov Bot commented May 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.68%. Comparing base (66c256d) to head (f3109c7).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #731      +/-   ##
==========================================
+ Coverage   75.57%   75.68%   +0.10%     
==========================================
  Files          57       57              
  Lines        8195     8195              
  Branches     1600     1600              
==========================================
+ Hits         6193     6202       +9     
+ Misses       1381     1370      -11     
- Partials      621      623       +2     
| Flag | Coverage Δ |
| --- | --- |
| 3.10 | 75.64% <ø> (+0.10%) ⬆️ |
| 3.11 | 75.64% <ø> (+0.10%) ⬆️ |
| 3.12 | 75.64% <ø> (+0.10%) ⬆️ |
| 3.13 | 75.64% <ø> (+0.10%) ⬆️ |
| macos-latest | 75.58% <ø> (+0.10%) ⬆️ |
| ubuntu-latest | 75.58% <ø> (+0.10%) ⬆️ |
| windows-latest | 75.42% <ø> (+0.10%) ⬆️ |


Comment thread benchmarks/benchmark.py Outdated
Comment thread benchmarks/benchmark.py Outdated
Comment thread benchmarks/benchmark.py Outdated
Comment thread tests/test_benchmark.py
Comment thread benchmarks/helpers.py Outdated
faridyagubbayli and others added 2 commits June 4, 2026 14:24
Validate benchmark inputs and memory reporting so generated results stay complete and JSON-compatible.
@faridyagubbayli faridyagubbayli requested a review from waltsims June 4, 2026 14:31
@faridyagubbayli
Collaborator Author

@greptile review

@greptile-apps
Contributor

greptile-apps Bot commented Jun 4, 2026
