Skip to content

Keep source/destination arrays alive across array.copy's per-element loop#13450

Open
vouillon wants to merge 2 commits into
bytecodealliance:mainfrom
vouillon:array-copy-fix
Open

Keep source/destination arrays alive across array.copy's per-element loop#13450
vouillon wants to merge 2 commits into
bytecodealliance:mainfrom
vouillon:array-copy-fix

Conversation

@vouillon
Copy link
Copy Markdown
Contributor

translate_per_element_copy derives raw src_elem_addr / dst_elem_addr pointers once and uses only those raw pointers inside the per-element forward/backward loop. As a result, the original source and destination array gc-refs are dead in CLIF after the address computation and are not in the stack maps at safe points inside the loop.

This was harmless until the DRC collector started firing force_gc from inside the read barrier when the over-approximated-stack-roots list grew past 1024 entries (#13422). At that point, a GC could run mid-copy with neither the source nor the destination array marked from any frame's stack map. Sweep could then free the arrays out from under the copy.

Fix this by extracting the array gc-refs from the source and destination entities (when they are arrays) and threading them through the loop's forward and backward iteration blocks as block parameters. After the fix, the stack map at the in-barrier force_gc inside array.copy's loop lists the source and destination arrays in addition to the freshly-read element.

@vouillon vouillon requested review from a team as code owners May 22, 2026 15:39
@vouillon vouillon requested review from fitzgen and removed request for a team May 22, 2026 15:39
@vouillon vouillon force-pushed the array-copy-fix branch 2 times, most recently from 72d9dab to 9c28001 Compare May 22, 2026 16:32
@fitzgen fitzgen mentioned this pull request May 22, 2026
10 tasks
Copy link
Copy Markdown
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

@fitzgen
Copy link
Copy Markdown
Member

fitzgen commented May 22, 2026

Looks like this has some conflicts that need to be resolved before it can merge.

In `translate_per_element_copy` (used by `array.copy` on arrays whose
elements are GC references) the per-element loop derives raw
`src_elem_addr`/`dst_elem_addr` pointers once and uses only those raw
pointers inside the loop body. The source and destination *array*
gc-refs themselves are dead in CLIF inside the loop and were absent
from the stack maps at safe points there.

Once the DRC collector started firing `force_gc` from inside the read
barrier (when the over-approximated-stack-roots list grew past 1024
entries) a GC could run mid-copy with neither array marked from any
frame's stack map. Sweep would drop the arrays from OASR, freeing them
if they had no other ref-count contributions and cascading dec-refs
through their elements, after which the rest of the copy read from the
freed memory and produced dangling references in the destination.

Surfaces in practice as a `wasm trap: cast failure` deep in OCaml's
sort routines when running jsoo-compiled wasm.

The test uses a small `gc_heap_reservation` and a churn-allocation
pass after the copy to force slot reuse so the corruption (stale
references in the destination array) is observable as wrong values in
the verify pass.
…py loop

`translate_per_element_copy` derives raw `src_elem_addr` /
`dst_elem_addr` pointers once and uses only those raw pointers inside
the per-element forward/backward loop. As a result, the original source
and destination *array* gc-refs (the `CheckedEntity::Array { array,
.. }` values that go through `read_field_at_addr` /
`write_field_at_addr`) are dead in CLIF after the address computation
and are not in the stack maps at safe points inside the loop.

The latent stack-map gap was harmless until the DRC collector started
firing `force_gc` from inside the read barrier when the
over-approximated-stack-roots list grew past 1024 entries (bytecodealliance#13422). At
that point, a GC could run mid-copy with neither the source nor the
destination array marked from any frame's stack map; sweep would
remove them from OASR, and if their only ref-count contribution was
the OASR membership they would be freed, cascading dec-refs through
their elements. The rest of the copy would then read freed memory and
write dangling references into the destination.

Fix this by extracting the array gc-refs from the source and
destination entities (when they are arrays) and threading them through
the loop's forward and backward iteration blocks as block parameters,
calling `declare_value_needs_stack_map` on the per-iteration copies so
they are tracked at safe points inside the loop. After the fix, the
stack map at the in-barrier `force_gc` inside `array.copy`'s loop
lists the source and destination arrays in addition to the
freshly-read element.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants