incorrect bytes() constructor usage in buf_filled_with #3077

Merged
mr-tz merged 3 commits into master from fix/buf_filled_with on May 16, 2026
Collaborator

@mike-hunhoff mike-hunhoff commented May 15, 2026

Bug Fix: Incorrect bytes() constructor usage in buf_filled_with

Description

In capa/features/extractors/strings.py:buf_filled_with, the code used bytes(character) * SLICE_SIZE to create a repeating chunk of a specific character. However, in Python 3, calling bytes() with an integer argument n creates a zero-filled buffer of length n, rather than a single byte with the value n.

This caused the chunked comparison for large buffers (>= 4096 bytes) to fail for characters in the REPEATS set (like 'A', 0xFE, 0xFF), because the constructed dupe_chunk had an incorrect length and content. This effectively disabled the optimization intended to skip large blocks of filler data (like section padding), causing unnecessary regex scanning overhead.
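
The difference between the two constructor forms can be checked in isolation. The snippet below is a standalone illustration, not code from this PR; the value SLICE_SIZE = 4096 is an assumption taken from the ">= 4096 bytes" threshold mentioned above.

```python
SLICE_SIZE = 4096  # assumed from the threshold in the description

character = 0x41  # ord("A")

# bytes(int) allocates that many zero bytes
wrong = bytes(character)
assert wrong == b"\x00" * 65

# bytes([int]) builds a one-byte buffer holding the value
right = bytes([character])
assert right == b"A"

wrong_chunk = bytes(character) * SLICE_SIZE
right_chunk = bytes([character]) * SLICE_SIZE

print(len(wrong_chunk))  # 266240, i.e. 65 * 4096 zero bytes
print(len(right_chunk))  # 4096
```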

Fix

Changed bytes(character) to bytes([character]) to correctly create a 1-byte buffer containing the character value before multiplying by SLICE_SIZE.
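
For context, a chunked comparison using the corrected construction might look roughly like the sketch below. This is a hypothetical stand-in written for illustration, not the function as it exists in capa: the name buf_filled_with_sketch, the SLICE_SIZE value, and the empty-buffer behavior are all assumptions.

```python
SLICE_SIZE = 4096  # assumed chunk size


def buf_filled_with_sketch(buf: bytes, character: int) -> bool:
    """Return True if every byte of buf equals character (hypothetical stand-in)."""
    if not buf:
        return False  # assumption: an empty buffer is treated as not filled

    if len(buf) < SLICE_SIZE:
        # small buffers: a plain per-byte check is cheap enough
        return all(b == character for b in buf)

    # the corrected construction: one byte with the given value, repeated
    dupe_chunk = bytes([character]) * SLICE_SIZE

    for offset in range(0, len(buf), SLICE_SIZE):
        current_chunk = buf[offset : offset + SLICE_SIZE]
        # the final chunk may be shorter than SLICE_SIZE, so trim the reference
        if current_chunk != dupe_chunk[: len(current_chunk)]:
            return False
    return True
```

In this sketch, swapping bytes([character]) back to bytes(character) would compare every chunk against zero bytes, so a large buffer of 'A' would be reported as not filled.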

Tests

Added unit tests in tests/test_strings.py with buffers larger than 4096 bytes to ensure coverage of the chunked processing logic in buf_filled_with.
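
As an illustration of the kind of inputs that exercise the chunked path, the sketch below generates buffers at and just past the chunk boundary. It does not reproduce the tests added in this PR; boundary_cases, reference, and the SLICE_SIZE value are hypothetical names and assumptions introduced here.

```python
SLICE_SIZE = 4096  # assumed chunk size


def boundary_cases(character: int, other: int):
    """Yield (buf, expected) pairs around the chunk boundary (illustrative only)."""
    fill = bytes([character])
    bad = bytes([other])
    for size in (SLICE_SIZE, SLICE_SIZE + 1, 2 * SLICE_SIZE, 2 * SLICE_SIZE + 7):
        # buffer made up entirely of the fill character
        yield fill * size, True
        # mismatch in the very last byte, possibly inside a short final chunk
        yield fill * (size - 1) + bad, False
        # mismatch in the last byte of the first chunk
        yield fill * (SLICE_SIZE - 1) + bad + fill * (size - SLICE_SIZE), False


def reference(buf: bytes, character: int) -> bool:
    # slow but obviously correct oracle
    return len(buf) > 0 and all(b == character for b in buf)


for buf, expected in boundary_cases(0x41, 0x42):
    assert reference(buf, 0x41) is expected
```

In a real test module, the oracle would be replaced by the function under test, so that each generated buffer checks the chunked implementation against the expected result.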

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed
  • This submission includes AI-generated code and I have provided details in the description.

@github-actions github-actions Bot left a comment

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

@github-actions github-actions Bot dismissed their stale review May 15, 2026 21:20

CHANGELOG updated or no update needed, thanks! 😄

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request fixes a bug in the buf_filled_with function where the comparison chunk was incorrectly initialized, and adds new test cases to verify behavior with large buffers. The reviewer suggested further improving test coverage by adding specific cases that target potential boundary issues in the chunked processing logic.

Comment thread tests/test_strings.py
@mike-hunhoff mike-hunhoff requested review from a team and williballenthin May 15, 2026 21:22
Collaborator

@williballenthin williballenthin left a comment

nice catch

@mr-tz mr-tz merged commit db0e153 into master May 16, 2026
39 checks passed
@mr-tz mr-tz deleted the fix/buf_filled_with branch May 16, 2026 11:14