Add wasm-bindgen support #23493
Does rustc then read the wasm to find those function names, and pass those names to [...]
In general, if we need to read metadata-type info from the wasm, then we have a minimal parser in tools/webassembly.py. If we need something more complex, a binaryen pass is an option.
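As a rough illustration of what "reading metadata-type info from the wasm" can involve, here is a minimal, self-contained sketch that lists the exported function names of a wasm binary. It is not the code in tools/webassembly.py; the names `read_uleb128` and `list_function_exports` are hypothetical, and the sketch only understands the section framing and the export section.

```python
EXPORT_SECTION_ID = 7  # id of the export section in the wasm binary format
FUNCTION_KIND = 0      # export kind byte for functions


def read_uleb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def list_function_exports(wasm: bytes) -> list[str]:
    """Return the names of all exported functions in a wasm binary."""
    if wasm[:4] != b'\0asm':
        raise ValueError('not a wasm binary')
    pos = 8  # skip the 4-byte magic and the 4-byte version
    names = []
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb128(wasm, pos + 1)
        section_end = pos + size
        if section_id == EXPORT_SECTION_ID:
            count, pos = read_uleb128(wasm, pos)
            for _ in range(count):
                name_len, pos = read_uleb128(wasm, pos)
                name = wasm[pos:pos + name_len].decode('utf-8')
                pos += name_len
                kind = wasm[pos]
                _index, pos = read_uleb128(wasm, pos + 1)
                if kind == FUNCTION_KIND:
                    names.append(name)
        # Always jump to the end of the section, whether or not it was parsed.
        pos = section_end
    return names
```

Calling `list_function_exports(open('out.wasm', 'rb').read())` on a linked module (the file name is only an example) would give the list of names that a later step could filter.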
wasm-bindgen itself is two pieces: a library that allows you to annotate your rust code marking things to be exported, and a tool that consumes a .wasm file and reads those annotations to produce a companion js file. rustc knows about those function names because wasm-bindgen as a library provided the annotations. If rustc invokes the linker itself, it's able to pass that information along. However, because we need to also build C++, we're only using rustc to compile and not drive the whole process, so we need to have it output that information elsewhere. One (very naive) possibility is to have rustc invoke a fake linker that just writes the [...]
Export all C symbols, except perhaps those from system libraries.
Automatically infer what symbols to export for wasm-bindgen
I believe this is ready for review! :D
Sorry for the delay reviewing this. Just getting to it now. Is the PR description up to date with the current state of things?
    'CACHE',
    'PORTS',
    'COMPILER_WRAPPER',
    'WASM_BINDGEN',
Could we avoid the new config setting completely and just rely on wasm-bindgen being in the PATH when a user adds -sWASM_BINDGEN?
This is what we do for tsc, I believe. I'm loath to add more config settings if we can possibly avoid it.
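A minimal sketch of the PATH-based approach suggested here, assuming the tool is looked up only when the setting is enabled. The function name `find_wasm_bindgen` and the error text are hypothetical and are not taken from the PR; `shutil.which` is the standard-library lookup.

```python
import shutil
from typing import Callable, Optional


def find_wasm_bindgen(enabled: bool,
                      which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Locate the wasm-bindgen executable on PATH, only when the feature is enabled."""
    if not enabled:
        # No lookup (and no error) for builds that never asked for wasm-bindgen.
        return None
    path = which('wasm-bindgen')
    if path is None:
        raise RuntimeError('-sWASM_BINDGEN requires wasm-bindgen to be in the PATH')
    return path
```

Passing `which` as a parameter is only there to make the lookup easy to fake in a test; a real implementation would likely report the error through the project's own diagnostics.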
    for file_path in bindgen_tsd:
      with open(file_path, encoding='utf-8') as file:
        for line in file:
          out += f'{line}'
Can these 3 lines be replaced with just out += read_file(file_path)?
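To show why the two forms are interchangeable, here is a small sketch comparing the line-by-line loop with a whole-file read. The `read_file` below is a local stand-in written for this example (assumed to return the file's text decoded as UTF-8); the two wrapper function names are hypothetical.

```python
def read_file(file_path: str) -> str:
    """Stand-in helper: return the full contents of a UTF-8 text file."""
    with open(file_path, encoding='utf-8') as fh:
        return fh.read()


def concat_line_by_line(paths: list[str]) -> str:
    """The loop as quoted in the review snippet."""
    out = ''
    for file_path in paths:
        with open(file_path, encoding='utf-8') as file:
            for line in file:
                out += f'{line}'
    return out


def concat_with_read_file(paths: list[str]) -> str:
    """The suggested replacement: one read per file."""
    out = ''
    for file_path in paths:
        out += read_file(file_path)
    return out
```

Both functions produce the same string, since iterating a text file yields each line with its newline intact.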
    """

    value: str
    is_file: int
Perhaps putting this in cmdline.py would make more sense?
I'm still not clear why we need to move this, but trying to understand now.
Previously it was just defined in emcc.py. Moving it here lets us also use it in link.py. That lets us ensure that all entries in linker_args are LinkFlags and not a combination of LinkFlags and strings.
Happy to move it if you'd like.
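A sketch of the idea, under assumptions: the two fields (`value: str`, `is_file: int`) come from the quoted snippet, but the dataclass form, the `to_link_flags` helper, and the "does not start with `-`" heuristic are illustrative only and may not match the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkFlag:
    """A linker argument plus a marker for whether it names an input file."""
    value: str
    is_file: int  # declared as int in the quoted snippet; used here as a 0/1 marker


def to_link_flags(args: list) -> list[LinkFlag]:
    """Wrap bare strings so that every entry in the result is a LinkFlag."""
    flags = []
    for arg in args:
        if isinstance(arg, LinkFlag):
            flags.append(arg)
        else:
            # Assumption for illustration: anything not starting with '-' is an input file.
            flags.append(LinkFlag(arg, int(not arg.startswith('-'))))
    return flags
```

With a single entry type, code that walks `linker_args` no longer needs to branch on whether an entry is a string or a `LinkFlag`.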
No worries at all!
I just updated it now. I will address your other comments later this week. Thanks again for taking a look!
Now that the unpushed work in the two forks has been pushed to public
branches on nobodywho-ooo (and walkingeyerobot's emscripten has been
forked into nobodywho-ooo/emscripten for transparency), the build no
longer needs to point at local checkout paths.
Workspace Cargo.toml:
- `wasm-bindgen = { path = "/Users/user/git/wasm-bindgen" }`
→ `wasm-bindgen = { git = "https://github.com/nobodywho-ooo/wasm-bindgen",
branch = "emscripten-descriptor-fixes" }`
- Drop the `[patch."https://github.com/nobodywho-ooo/llama-cpp-rs"]`
block entirely — `core/Cargo.toml` already declares
`llama-cpp-2 / llama-cpp-sys-2 = { git = ..., branch = "wasm-emscripten" }`,
and the branch now contains the three previously-unpushed commits
(CMAKE_SYSTEM_PROCESSOR=wasm32, MA_NO_* + -fexceptions for mtmd).
README "Outstanding" section: replaced the two prose-only fork notes
with three explicit list items linking to the public branch URLs:
- nobodywho-ooo/llama-cpp-rs branch wasm-emscripten
- nobodywho-ooo/wasm-bindgen branch emscripten-descriptor-fixes
- nobodywho-ooo/emscripten branch wbg-walkingeyerobot
(fork of walkingeyerobot/emscripten, which itself carries the
-sWASM_BINDGEN flag — emscripten-core/emscripten#23493)
Cargo.lock updated to reflect the new wasm-bindgen git source pin
(commit f4fc33dc7).
Verified:
- `cargo check --workspace --exclude nobodywho-js` (after `cargo clean
-p nobodywho -p nobodywho-python` to flush the stale incremental
cache from the patch change): clean
- `cargo check -p nobodywho-js --target wasm32-unknown-emscripten`: clean
     }
    -''' % locals(),
    -'a: loaded\na: b (prev: (null))\na: c (prev: b)\n', cflags=extra_args)
    +''', 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n', cflags=extra_args)
This looks like an unrelated cleanup?
    create_file('post.js', '''
      Module.onRuntimeInitialized = () => {
        out(Module.rs_add(17, 25));
      };
Use a single line function here? `Module.onRuntimeInitialized = () => out(Module.rs_add(17, 25));`
      '--post-js',
      'post.js',
      '-lexports.js',
    ]
Does this all fit on a single line? Note: you can make --post-js=post.js into a single arg.
        '--print-file-name', '--quiet']
    nm_args += input_files

    result = run_process(nm_args, stdout=subprocess.PIPE)
Maybe use check_call here? Then it will show up when you run with emcc -v.
    -def create_tsd(metadata, embind_tsd):
    +def create_tsd(metadata, embind_tsd, bindgen_tsd=None):
Does this need a default? (i.e. do all callsites pass this arg?)
    -def link_lld(args, target, external_symbols=None):
    +def link_lld(args, target, external_symbols=None, linker_inputs=None):
Does this need a default? (i.e. do all callsites pass this arg?)
    -#if SUPPORT_BIG_ENDIAN
    +#if SUPPORT_BIG_ENDIAN || WASM_BINDGEN
     /** @type {!DataView} */
     var HEAP_DATA_VIEW;
How/why does WASM_BINDGEN use HEAP_DATA_VIEW?
This is an early draft PR for the purposes of gathering feedback early. There are also pending changes to wasm-bindgen.
This is ready for review.

How this works:
1. [...] `wasm32-unknown-emscripten` into a `.a` file (staticlib). This `.a` file includes some annotations needed by wasm-bindgen later.
2. [...] `.a` file.
3. [...] `wasm-ld` to link the C++ and Rust together into a single `.wasm` file.
4. [...] `.wasm` file, removing the annotations needed by wasm-bindgen and producing a new `.wasm` file, a `library.js` file, and a `pre.js` file.
5. [...] `.js`, integrating the wasm-bindgen `.js` files.

You can see a demo more easily at https://github.com/walkingeyerobot/cxx-rust-demo.

Some TODOs:
- Figure out how to pass the exported symbols from the rust compiler to Emscripten. These are symbols that need to be passed to `wasm-ld` so they're not removed in the final `.wasm` but that may not necessarily be present after wasm-bindgen processes the `.wasm`. wasm-bindgen at compile time puts the information it needs to generate JS inside the `.wasm` file itself in the form of `_describe` functions. These functions are then removed after JS generation.
- Merge the `.js` files produced by wasm-bindgen. This shouldn't be that hard; I just haven't gotten around to it yet. This would simplify the code for both Emscripten and wasm-bindgen.
- Get wasm-bindgen tests to pass. Early efforts here have revealed some very odd compiler differences between `-unknown` and `-emscripten` that I'll have to fix.
- Have this work end-to-end via wasm-pack. I'll have a draft PR for this soon (tm). My work here didn't pan out, but there's a new PR for this here: feat: support wasm32-unknown-emscripten target wasm-bindgen/wasm-pack#1583

I'm mostly looking for feedback on the first point about exported symbols and about the general addition of `-sWASM_BINDGEN` to Emscripten. Again, this is very early, but it's a pretty big feature, so I thought it best to start discussions now.

cc @daxpedda @guybedford @RReverser, who I've been working with on the wasm-bindgen side.
(updated May 18 2026 to be more accurate as to the current state of things)
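The first TODO, keeping the describe functions alive through `wasm-ld`, could be approached along these lines. This is only a sketch under assumptions: the prefix `__wbindgen_describe_` is used for illustration and should be checked against the wasm-bindgen version in use, the function name is hypothetical, and the PR may pass symbols to the linker in a different way. `--export=<symbol>` is an existing `wasm-ld` option.

```python
DESCRIBE_PREFIX = '__wbindgen_describe_'  # assumed naming, for illustration only


def export_flags_for_describe(symbols: list[str]) -> list[str]:
    """Build wasm-ld --export flags for the symbols that look like describe functions."""
    # A set removes duplicates (the same symbol can be reported more than once);
    # sorting keeps the resulting command line deterministic.
    describe = sorted({s for s in symbols if s.startswith(DESCRIBE_PREFIX)})
    return [f'--export={name}' for name in describe]
```

The input list could come from a symbol dump of the Rust static library; everything that does not match the prefix is left for the normal export logic to handle.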