dlopen many cgo libs#3845
Conversation
d1b0295 to
e5f07a0
Compare
d758e03 to
584281e
Compare
| go version | ||
| echo "Building Go shared server library..." | ||
| go build -buildmode=c-shared -o "$SERVER_SO_FILE" "$SERVER_GO_FILE" | ||
|
|
||
| echo "Building Go shared file ops library..." | ||
| go build -buildmode=c-shared -o "$FILEOPS_SO_FILE" "$FILEOPS_GO_FILE" |
There was a problem hiding this comment.
Shouldn't we test shared libs from different Go versions? Was that the user's case?
There was a problem hiding this comment.
The user uses same version v1.24.11. I also found that (gdb) info functions Syscall6 only returns one internal/runtime.syscall.Syscall6 despite that this test app load two libs. And it always returns the one from the first lib the app dlopen'd (I flipped the order of my dlopen calls to verify). So, I am not sure what will happen if different go versions are used 💀
| void* handle = dlopen(so_path.c_str(), RTLD_LAZY); | ||
| if (!handle) { | ||
| // These dlopen() flags are provided by the user. | ||
| void* server_so_handle = dlopen(server_so_path.c_str(), RTLD_LAZY | RTLD_NODELETE | RTLD_DEEPBIND); |
There was a problem hiding this comment.
Huh. RTLD_NODELETE makes me thing that maybe they're loading and unloading the same library multiple times 👀
We should test it probably
There was a problem hiding this comment.
Huh. Also RTLD_LAZY looks fishy as hell 🤔
Not sure why it even works with RTLD_LAZY right now. Not sure how exactly layer "hooks" the symbol either (does it overwrite pointer in the symbol table? does it overwrite instructions at the start of the hooked function itself?). But RTLD_LAZY sounds like there are other trampolines and function replacements in the game.
There was a problem hiding this comment.
RTLD_NODELETE
Yes, worth asking the user about this in my call with them next week. dlclose() will unload the go runtime code, maybe they don't want that to happen.
RTLD_LAZY
The one we hook is from static library, thus the symbols are bind already when the go lib is compiled? This flag affects all functions in dynamic library I think.
| go version | ||
| echo "Building Go shared server library..." | ||
| go build -buildmode=c-shared -o "$SERVER_SO_FILE" "$SERVER_GO_FILE" | ||
|
|
||
| echo "Building Go shared file ops library..." | ||
| go build -buildmode=c-shared -o "$FILEOPS_SO_FILE" "$FILEOPS_GO_FILE" |
There was a problem hiding this comment.
The user uses same version v1.24.11. I also found that (gdb) info functions Syscall6 only returns one internal/runtime.syscall.Syscall6 despite that this test app load two libs. And it always returns the one from the first lib the app dlopen'd (I flipped the order of my dlopen calls to verify). So, I am not sure what will happen if different go versions are used 💀
| void* handle = dlopen(so_path.c_str(), RTLD_LAZY); | ||
| if (!handle) { | ||
| // These dlopen() flags are provided by the user. | ||
| void* server_so_handle = dlopen(server_so_path.c_str(), RTLD_LAZY | RTLD_NODELETE | RTLD_DEEPBIND); |
There was a problem hiding this comment.
RTLD_NODELETE
Yes, worth asking the user about this in my call with them next week. dlclose() will unload the go runtime code, maybe they don't want that to happen.
RTLD_LAZY
The one we hook is from static library, thus the symbols are bind already when the go lib is compiled? This flag affects all functions in dynamic library I think.
fd1c8a0 to
3240409
Compare
Added a new cgo lib that does basic file ops. The cpp app now dlopen both cgo libs.
The address needs to seat in the middle of the function so stack frame unwind works normally.
1. Correct warning message about missing main.go files. 2. Use go 1.24 to build test app in CI. 3. Fix wrong comments on assembly code.
CPP main app dlopen CPP shared object that wraps Go c-archive library.
📝 WalkthroughWalkthroughThis pull request adds experimental support for dlopen-based loading of multiple C-shared Go 1.24 libraries on Linux x86_64 and aarch64 architectures. It introduces low-level syscall interception infrastructure, refactors the test application with modular server and fileops components supporting both c-shared and c-archive build modes, and updates test integration points accordingly. Changes
Sequence Diagram(s)sequenceDiagram
actor App as Go App
participant Detour as Syscall6<br/>Detour
participant StackMgr as Stack<br/>Manager
participant ABI as ABI<br/>Wrapper
participant Handler as Syscall<br/>Handler
participant GoRuntime as Go Runtime<br/>(g, m, TLS)
App->>Detour: internal_runtime_syscall_syscall6()<br/>(args on stack)
alt Exit Syscall (SYS_EXIT/SYS_EXIT_GROUP)
Detour->>GoRuntime: bypass detour,<br/>invoke directly
else Non-Exit Syscall
Detour->>StackMgr: check if on system stack
StackMgr->>GoRuntime: save/swap TLS & g state
StackMgr->>ABI: switch to system stack
ABI->>ABI: copy args from stack<br/>to registers (SystemV AMD64)
ABI->>Handler: c_abi_syscall6_handler
Handler->>Handler: invoke actual syscall
Handler-->>ABI: return result
ABI->>ABI: restore stack frame
ABI-->>StackMgr: return with result
StackMgr->>GoRuntime: restore g/m/TLS state
StackMgr-->>Detour: ready to return
end
Detour->>Detour: normalize errno & return value
Detour-->>App: return to Go caller
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@mirrord/layer/src/go/c_shared/mod.rs`:
- Around line 211-218: The inline comment is inverted for the assembly sequence
that loads g->sched->ctxt into r9 and then does "test r9, r9" / "jz 1f" / "call
{runtime_abort}" / "1: ret"; update the comment near that block (the lines
around "mov r9, QWORD PTR [r14+0x50]", "test r9, r9", "jz 1f", "call
{runtime_abort}", "1:", "ret") so it correctly states that when ctxt == 0 the
code returns and when ctxt != 0 the runtime_abort is called (e.g., "If
g->sched->ctxt == 0 return, otherwise call runtime_abort.").
- Around line 188-251: The g->sched->pc must point to a real
runtime.systemstack_switch-style marker, not an arbitrary NOP sled; update
gosave_systemstack_switch so the "lea r9, [rip + {dummy} + 0x8]" uses the actual
systemstack_switch marker or a synthetic symbol whose frame layout exactly
matches Go's systemstack_switch frame: replace the dummy symbol with a symbol
that either references the real runtime.systemstack_switch (if available from
the Go runtime) or implement a new naked function whose prologue/epilogue, saved
registers and frame size match the Go runtime's systemstack_switch marker (match
its return address offset and frame layout used by unwinder), and keep
runtime_abort and gosave_systemstack_switch logic but point g->sched->pc to that
correct symbol instead of the current dummy.
In `@mirrord/layer/tests/apps/dlopen_cgo/build_test_app.sh`:
- Around line 37-40: The error message for the shared C++ source is misleading
and there's no pre-check for the c-archive source; update the check that uses
C_SHARED_CPP_FILE to print the correct filename (main_c_shared.cpp) or a generic
variable-based message and add an analogous existence check for
C_ARCHIVE_CPP_FILE (main_c_archive.cpp), exiting with a clear error if missing
so the build step that compiles the archive (invoking the compiler for
C_ARCHIVE_CPP_FILE) fails fast with a helpful message.
In `@mirrord/layer/tests/apps/dlopen_cgo/main_c_archive.cpp`:
- Around line 22-27: get_exe_dir currently calls readlink without checking its
return, which can produce an out-of-bounds write; change get_exe_dir to validate
ssize_t len = readlink("/proc/self/exe", exe_path, sizeof(exe_path) - 1) by
checking for len == -1 (or len >= sizeof(exe_path)-1) and handle the error path
(e.g., log/return a safe default or throw) before writing exe_path[len] = '\0';
apply the same guard to the identical implementation in main_c_shared.cpp to
avoid undefined behavior.
In `@mirrord/layer/tests/apps/dlopen_cgo/main_c_shared.cpp`:
- Around line 25-30: The get_exe_dir function calls readlink and unconditionally
uses its return value as an index, which will crash if readlink returns -1;
update get_exe_dir to check the ssize_t len result from readlink (handle len ==
-1 as an error path—return an empty string or propagate/log an error), ensure
you clamp len to the buffer size (use sizeof(exe_path)-1) before writing the
terminating '\0', and only call dirname on a valid, null-terminated exe_path;
reference get_exe_dir to locate where to add the len check and error handling.
- Around line 32-38: Update the stale build comment in the header block (the
third step that currently references `out.cpp_dlopen_cgo`) to the actual
produced binary name `out.dlopen_cgo_c_shared` so the documentation matches the
build output; also scan the surrounding comment for any other references to the
old name and correct them (this affects the comment in main_c_shared.cpp that
describes the g++ build invocation).
🧹 Nitpick comments (5)
mirrord/layer/tests/apps/dlopen_cgo/build_test_app.sh (1)
64-64: Inconsistent variable quoting.Lines 64, 70, and 73 leave
$SERVER_CPP_FILE,$SERVER_C_ARCHIVE_LIB, etc. unquoted, while the rest of the script consistently double-quotes path variables. This is unlikely to cause issues in a CI-controlled path, but for consistency (and to avoid surprises if paths ever contain spaces), quote them.Example for line 64
-g++ -fPIC -shared $SERVER_CPP_FILE $SERVER_C_ARCHIVE_LIB -o $SERVER_CPP_WRAPPER_LIB -lpthread -ldl +g++ -fPIC -shared "$SERVER_CPP_FILE" "$SERVER_C_ARCHIVE_LIB" -o "$SERVER_CPP_WRAPPER_LIB" -lpthread -ldlmirrord/layer/src/go/mod.rs (1)
13-17: Redundantcfgattribute — the file-level#![cfg]already applies the same gate.Lines 1–4 gate the entire file with
#![cfg(all(any(target_arch = "x86_64", target_arch = "aarch64"), target_os = "linux"))], which is identical to the attribute on thec_sharedmodule (lines 13–16). The module-level#[cfg]is a no-op and adds unnecessary noise.Suggested simplification
-#[cfg(all( - any(target_arch = "x86_64", target_arch = "aarch64"), - target_os = "linux" -))] pub(crate) mod c_shared;mirrord/layer/tests/apps/dlopen_cgo/fileops/main.go (1)
11-21:ReadFileToStringnever returnsnilon error, but callers check forNULL.On error, this returns a
C.CString(msg)(non-null), so theif (!file_content)null-check inmain_c_shared.cpp(line 96) andmain_c_archive.cpp(line 101) will never trigger — the error message will be printed as if it were file content. Since this is a test app where the error message showing up in output is diagnostic enough, this is fine for now.mirrord/layer/tests/apps/dlopen_cgo/main_c_archive.cpp (1)
96-103: Null check onReadFileToStringwon't trigger — Go side never returnsNULL.As noted in the
fileops/main.goreview,ReadFileToStringreturns an error message string on failure, neverNULL. Theif (!file_content)guard on line 101 is dead code. The same applies tomain_c_shared.cppline 96. This is fine for a test app — errors will just show up as printed content.mirrord/layer/src/go/c_shared/mod.rs (1)
157-186: Acknowledged duplication with go_1_25'sc_abi_wrapper— consider extracting if more versions need it.The doc comment correctly notes this is an exact copy. This is fine for two versions but if Go 1.26+ also needs this same wrapper, consider extracting it to a shared location. No action needed now.
| /// Implemented based on [`gosave_systemstack_switch`](https://github.com/golang/go/blob/go1.24.11/src/runtime/asm_amd64.s#L823) | ||
| /// | ||
| /// Smashes r9. | ||
| #[unsafe(naked)] | ||
| unsafe extern "C" fn gosave_systemstack_switch() { | ||
| naked_asm!( | ||
| // TODO: In Go's implementation, it loads the address of `runtime.systemstack_switch` + 8 bytes. | ||
| // [`runtime.systemstack_switch`](https://github.com/golang/go/blob/go1.24.11/src/runtime/asm_amd64.s#L475) | ||
| // is a dummy marker function. If it is directly called, it will hit a `ud2` trap. | ||
| // Go only cares that g->sched->pc is set to an address between the prologue and epilogue. | ||
| // | ||
| // I don't know if it matters for the address to be between the actual `systemstack_switch` function. | ||
| // Or a simple dummy function will work? | ||
| "lea r9, [rip + {dummy} + 0x8]", | ||
| // Store r9 in g->sched->pc. | ||
| "mov QWORD PTR [r14+0x40],r9", | ||
| "lea r9, [rsp+0x8]", | ||
| // Store r9 in g->sched->sp. | ||
| "mov QWORD PTR [r14+0x38],r9", | ||
| // Store 0 in g->sched->ret. | ||
| "mov QWORD PTR [r14+0x58],0x0", | ||
| // Store rbp in g->sched->bp. | ||
| "mov QWORD PTR [r14+0x68],rbp", | ||
| // Store g->sched->ctxt in r9. | ||
| "mov r9, QWORD PTR [r14+0x50]", | ||
| // If g->sched->ctxt == 0 abort runtime, otherwise return. | ||
| "test r9, r9", | ||
| "jz 1f", | ||
| "call {runtime_abort}", | ||
| "1:", | ||
| "ret", | ||
|
|
||
| runtime_abort = sym runtime_abort, | ||
| dummy = sym dummy, | ||
| ); | ||
| } | ||
|
|
||
| /// Implemented based on [`runtime.abort`](https://github.com/golang/go/blob/fed3b0a298464457c58d1150bdb3942f22bd6220/src/runtime/asm_amd64.s#L1237) | ||
| /// | ||
| /// This function crashes the runtime. `int3` is recognized by debuggers. | ||
| /// The rest of the function is an infinite trap loop. | ||
| #[unsafe(naked)] | ||
| unsafe extern "C" fn runtime_abort() { | ||
| naked_asm!("int3", "1:", "jmp 1b",); | ||
| } | ||
|
|
||
| /// A dummy function that wraps some `nop` between prologue and epilogue. | ||
| #[unsafe(naked)] | ||
| unsafe extern "C" fn dummy() { | ||
| naked_asm!( | ||
| "push rbp", | ||
| "mov rbp,rsp", | ||
| "nop", | ||
| "nop", | ||
| "nop", | ||
| "nop", | ||
| "nop", // this is the address we save in g->sched->pc | ||
| "nop", | ||
| "nop", | ||
| "pop rbp", | ||
| "ret", | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Go runtime g->sched->pc stack scanning requirements systemstack_switch
💡 Result:
In the Go runtime, g.sched is the “saved register context” the runtime/GC uses to start unwinding (and thus scanning) a goroutine’s user stack when that goroutine isn’t actively running on its own stack.
What stack scanning needs from g->sched
- When a goroutine is not running, stack unwinding initializes its starting
(pc, sp[, lr])from:g.syscallpc/g.syscallspif the goroutine is in a syscall, elseg.sched.pc/g.sched.sp(andg.sched.lron LR architectures).
This is exactly what the unwinder does. (go.dev)
- Therefore,
g.sched.pcmust be a valid Go PC suitable for unwinding/stack maps, andg.sched.spmust match the stack pointer for that PC’s frame (same forbp/lrwhere used). In syscall transitions, the runtime explicitly captures callerpc/sp/bpand saves them viasave(pc, sp, bp), then copiesg.sched.*intog.syscall*for later unwinding/scanning. (go.dev)
Why systemstack_switch exists (and why g.sched.pc is set to it)
systemstack(fn)switches execution from a user goroutine stack to the per-thread system stack (g0) for runtime work that must not grow the user stack / must not be preempted. (go.dev)- While doing that switch, the assembly saves the current goroutine state into
g->schedand “pretends” the savedpcissystemstack_switch:
“save our state in g->sched. Pretend to be systemstack_switch if the G stack is scanned.” (go.dev) systemstack_switchis a dummy frame marker left at the bottom of the goroutine stack so the runtime can distinguish:- the marker at the bottom of the user goroutine stack vs
- the real
systemstackframe at the top of the system (g0) stack, which is treated as terminating the stack walk (“see topofstack()”). (go.dev)
- On amd64, the runtime further notes: the
systemstack_switchframe layout must matchsystemstackso unwinding/scanning logic can safely treat the saved context as that marker frame. (tip.golang.org)
Key takeaway
When the runtime switches stacks (e.g., systemstack), it must leave behind a self-consistent saved context in g.sched. Setting g.sched.pc to systemstack_switch is a deliberate requirement so that GC/unwind code can safely interpret the boundary between user-stack frames and system-stack frames during stack walking/scanning. (go.dev)
Citations:
- 1: https://go.dev/src/runtime/traceback.go
- 2: https://go.dev/src/runtime/proc.go?utm_source=openai
- 3: https://go.dev/src/runtime/HACKING.md?utm_source=openai
- 4: https://go.dev/src/runtime/asm_386.s?utm_source=openai
- 5: https://go.dev/src/runtime/asm_386.s?utm_source=openai
- 6: https://tip.golang.org/src/runtime/asm_amd64.s?utm_source=openai
- 7: https://go.dev/src/runtime/asm_386.s?utm_source=openai
The dummy function approach for g->sched->pc has correctness issues — the TODO identifies a critical problem that needs resolution.
Go's runtime requires g->sched->pc to be a valid, properly-formed Go PC suitable for unwinding and stack map interpretation. More importantly, systemstack_switch is not just any address between a prologue/epilogue — it is a dummy frame marker that the GC and stack unwinding code must be able to distinguish during stack walking to correctly interpret the boundary between user-stack and system-stack frames.
The current approach of using an arbitrary NOP-sled dummy function does not guarantee the correct frame layout that the unwinder expects. This mismatch could cause GC/unwinding failures during stack scanning, potentially leading to crashes or memory corruption. The implementation should either:
- Use the actual
runtime.systemstack_switchaddress (if possible from the running Go process), or - Ensure the
dummyfunction's frame layout precisely matches what the unwinder expects for asystemstack_switchframe boundary.
This is not a pragmatic workaround — it is a correctness issue that should be resolved before merging.
🤖 Prompt for AI Agents
In `@mirrord/layer/src/go/c_shared/mod.rs` around lines 188 - 251, The
g->sched->pc must point to a real runtime.systemstack_switch-style marker, not
an arbitrary NOP sled; update gosave_systemstack_switch so the "lea r9, [rip +
{dummy} + 0x8]" uses the actual systemstack_switch marker or a synthetic symbol
whose frame layout exactly matches Go's systemstack_switch frame: replace the
dummy symbol with a symbol that either references the real
runtime.systemstack_switch (if available from the Go runtime) or implement a new
naked function whose prologue/epilogue, saved registers and frame size match the
Go runtime's systemstack_switch marker (match its return address offset and
frame layout used by unwinder), and keep runtime_abort and
gosave_systemstack_switch logic but point g->sched->pc to that correct symbol
instead of the current dummy.
| // Store g->sched->ctxt in r9. | ||
| "mov r9, QWORD PTR [r14+0x50]", | ||
| // If g->sched->ctxt == 0 abort runtime, otherwise return. | ||
| "test r9, r9", | ||
| "jz 1f", | ||
| "call {runtime_abort}", | ||
| "1:", | ||
| "ret", |
There was a problem hiding this comment.
Comment on line 213 is inverted — code aborts when ctxt != 0, not when == 0.
test r9, r9 / jz 1f jumps to ret when r9 == 0. The call runtime_abort is reached when r9 != 0. The comment says the opposite.
📝 Proposed fix
- // If g->sched->ctxt == 0 abort runtime, otherwise return.
+ // If g->sched->ctxt != 0 abort runtime, otherwise return.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Store g->sched->ctxt in r9. | |
| "mov r9, QWORD PTR [r14+0x50]", | |
| // If g->sched->ctxt == 0 abort runtime, otherwise return. | |
| "test r9, r9", | |
| "jz 1f", | |
| "call {runtime_abort}", | |
| "1:", | |
| "ret", | |
| // Store g->sched->ctxt in r9. | |
| "mov r9, QWORD PTR [r14+0x50]", | |
| // If g->sched->ctxt != 0 abort runtime, otherwise return. | |
| "test r9, r9", | |
| "jz 1f", | |
| "call {runtime_abort}", | |
| "1:", | |
| "ret", |
🤖 Prompt for AI Agents
In `@mirrord/layer/src/go/c_shared/mod.rs` around lines 211 - 218, The inline
comment is inverted for the assembly sequence that loads g->sched->ctxt into r9
and then does "test r9, r9" / "jz 1f" / "call {runtime_abort}" / "1: ret";
update the comment near that block (the lines around "mov r9, QWORD PTR
[r14+0x50]", "test r9, r9", "jz 1f", "call {runtime_abort}", "1:",
"ret") so it correctly states that when ctxt == 0 the code returns and when ctxt
!= 0 the runtime_abort is called (e.g., "If g->sched->ctxt == 0 return,
otherwise call runtime_abort.").
| if [[ ! -f "$C_SHARED_CPP_FILE" ]]; then | ||
| echo "ERROR: main.cpp not found in $SCRIPT_DIR" | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
Misleading error message and missing check for c-archive source.
Line 38 prints "main.cpp not found" but C_SHARED_CPP_FILE points to main_c_shared.cpp. Also, there is no existence check for C_ARCHIVE_CPP_FILE (main_c_archive.cpp) — if it's missing, the build on line 73 will fail with a less informative compiler error.
Proposed fix
if [[ ! -f "$C_SHARED_CPP_FILE" ]]; then
- echo "ERROR: main.cpp not found in $SCRIPT_DIR"
+ echo "ERROR: main_c_shared.cpp not found in $SCRIPT_DIR"
+ exit 1
+fi
+
+if [[ ! -f "$C_ARCHIVE_CPP_FILE" ]]; then
+ echo "ERROR: main_c_archive.cpp not found in $SCRIPT_DIR"
exit 1
fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if [[ ! -f "$C_SHARED_CPP_FILE" ]]; then | |
| echo "ERROR: main.cpp not found in $SCRIPT_DIR" | |
| exit 1 | |
| fi | |
| if [[ ! -f "$C_SHARED_CPP_FILE" ]]; then | |
| echo "ERROR: main_c_shared.cpp not found in $SCRIPT_DIR" | |
| exit 1 | |
| fi | |
| if [[ ! -f "$C_ARCHIVE_CPP_FILE" ]]; then | |
| echo "ERROR: main_c_archive.cpp not found in $SCRIPT_DIR" | |
| exit 1 | |
| fi |
🤖 Prompt for AI Agents
In `@mirrord/layer/tests/apps/dlopen_cgo/build_test_app.sh` around lines 37 - 40,
The error message for the shared C++ source is misleading and there's no
pre-check for the c-archive source; update the check that uses C_SHARED_CPP_FILE
to print the correct filename (main_c_shared.cpp) or a generic variable-based
message and add an analogous existence check for C_ARCHIVE_CPP_FILE
(main_c_archive.cpp), exiting with a clear error if missing so the build step
that compiles the archive (invoking the compiler for C_ARCHIVE_CPP_FILE) fails
fast with a helpful message.
| string get_exe_dir() { | ||
| char exe_path[PATH_MAX]; | ||
| ssize_t len = readlink("/proc/self/exe", exe_path, sizeof(exe_path) - 1); | ||
| exe_path[len] = '\0'; | ||
| return string(dirname(exe_path)); | ||
| } |
There was a problem hiding this comment.
readlink return value not checked for error — potential out-of-bounds write.
If readlink fails (returns -1), line 25 writes to exe_path[-1], which is undefined behavior. Same issue exists in main_c_shared.cpp line 28. Low risk since /proc/self/exe should always be resolvable in the test environment, but worth a quick guard.
🛡️ Proposed fix
string get_exe_dir() {
char exe_path[PATH_MAX];
ssize_t len = readlink("/proc/self/exe", exe_path, sizeof(exe_path) - 1);
+ if (len < 0) {
+ cerr << "readlink(/proc/self/exe) failed\n";
+ exit(1);
+ }
exe_path[len] = '\0';
return string(dirname(exe_path));
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| string get_exe_dir() { | |
| char exe_path[PATH_MAX]; | |
| ssize_t len = readlink("/proc/self/exe", exe_path, sizeof(exe_path) - 1); | |
| exe_path[len] = '\0'; | |
| return string(dirname(exe_path)); | |
| } | |
| string get_exe_dir() { | |
| char exe_path[PATH_MAX]; | |
| ssize_t len = readlink("/proc/self/exe", exe_path, sizeof(exe_path) - 1); | |
| if (len < 0) { | |
| cerr << "readlink(/proc/self/exe) failed\n"; | |
| exit(1); | |
| } | |
| exe_path[len] = '\0'; | |
| return string(dirname(exe_path)); | |
| } |
🤖 Prompt for AI Agents
In `@mirrord/layer/tests/apps/dlopen_cgo/main_c_archive.cpp` around lines 22 - 27,
get_exe_dir currently calls readlink without checking its return, which can
produce an out-of-bounds write; change get_exe_dir to validate ssize_t len =
readlink("/proc/self/exe", exe_path, sizeof(exe_path) - 1) by checking for len
== -1 (or len >= sizeof(exe_path)-1) and handle the error path (e.g., log/return
a safe default or throw) before writing exe_path[len] = '\0'; apply the same
guard to the identical implementation in main_c_shared.cpp to avoid undefined
behavior.
| string get_exe_dir() { | ||
| char exe_path[PATH_MAX]; | ||
| ssize_t len = readlink("/proc/self/exe", exe_path, sizeof(exe_path)-1); | ||
| exe_path[len] = '\0'; | ||
| return string(dirname(exe_path)); | ||
| } |
There was a problem hiding this comment.
Same readlink error-handling gap as main_c_archive.cpp.
If readlink returns -1, line 28 indexes exe_path[-1]. See the fix proposed for main_c_archive.cpp.
🤖 Prompt for AI Agents
In `@mirrord/layer/tests/apps/dlopen_cgo/main_c_shared.cpp` around lines 25 - 30,
The get_exe_dir function calls readlink and unconditionally uses its return
value as an index, which will crash if readlink returns -1; update get_exe_dir
to check the ssize_t len result from readlink (handle len == -1 as an error
path—return an empty string or propagate/log an error), ensure you clamp len to
the buffer size (use sizeof(exe_path)-1) before writing the terminating '\0',
and only call dirname on a valid, null-terminated exe_path; reference
get_exe_dir to locate where to add the len check and error handling.
| /** | ||
| * This test app calls `dlopen()` to load two different c-shared cgo libraries. | ||
| * To compile the app, run the script `./build_test_app.sh`, which does the following: | ||
| * 1. build the go c-shared server library: `go build -buildmode=c-shared -o server/libgo_server.so server/main.go`. | ||
| * 2. build the go c-shared file ops library: `go build -buildmode=c-shared -o fileops/libgo_fileops.so fileops/main.go`. | ||
| * 3. build the cpp app: `g++ main.cpp -o out.cpp_dlopen_cgo -ldl`. | ||
| **/ |
There was a problem hiding this comment.
Stale build comment — output name doesn't match the renamed binary.
The comment on line 37 says out.cpp_dlopen_cgo but the actual output name (per common/mod.rs line 1194) is out.dlopen_cgo_c_shared.
📝 Proposed fix
- * 3. build the cpp app: `g++ main.cpp -o out.cpp_dlopen_cgo -ldl`.
+ * 3. build the cpp app: `g++ main_c_shared.cpp -o out.dlopen_cgo_c_shared -ldl`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /** | |
| * This test app calls `dlopen()` to load two different c-shared cgo libraries. | |
| * To compile the app, run the script `./build_test_app.sh`, which does the following: | |
| * 1. build the go c-shared server library: `go build -buildmode=c-shared -o server/libgo_server.so server/main.go`. | |
| * 2. build the go c-shared file ops library: `go build -buildmode=c-shared -o fileops/libgo_fileops.so fileops/main.go`. | |
| * 3. build the cpp app: `g++ main.cpp -o out.cpp_dlopen_cgo -ldl`. | |
| **/ | |
| /** | |
| * This test app calls `dlopen()` to load two different c-shared cgo libraries. | |
| * To compile the app, run the script `./build_test_app.sh`, which does the following: | |
| * 1. build the go c-shared server library: `go build -buildmode=c-shared -o server/libgo_server.so server/main.go`. | |
| * 2. build the go c-shared file ops library: `go build -buildmode=c-shared -o fileops/libgo_fileops.so fileops/main.go`. | |
| * 3. build the cpp app: `g++ main_c_shared.cpp -o out.dlopen_cgo_c_shared -ldl`. | |
| **/ |
🤖 Prompt for AI Agents
In `@mirrord/layer/tests/apps/dlopen_cgo/main_c_shared.cpp` around lines 32 - 38,
Update the stale build comment in the header block (the third step that
currently references `out.cpp_dlopen_cgo`) to the actual produced binary name
`out.dlopen_cgo_c_shared` so the documentation matches the build output; also
scan the surrounding comment for any other references to the old name and
correct them (this affects the comment in main_c_shared.cpp that describes the
g++ build invocation).
| // If g->sched->ctxt == 0 abort runtime, otherwise return. | ||
| "test r9, r9", | ||
| "jz 1f", | ||
| "call {runtime_abort}", | ||
| "1:", |
There was a problem hiding this comment.
Critical Logic Error: Inverted condition comment
The comment states "If g->sched->ctxt == 0 abort runtime, otherwise return" but the code does the opposite:
jz 1fjumps to return (label 1) when r9 is zero- Falls through to
call {runtime_abort}when r9 is non-zero
This means the code aborts when ctxt != 0, not when ctxt == 0. The comment is backwards and misleading.
// Store g->sched->ctxt in r9.
"mov r9, QWORD PTR [r14+0x50]",
// If g->sched->ctxt != 0 abort runtime, otherwise return.
"test r9, r9",
"jz 1f",
"call {runtime_abort}",This documentation error could cause future maintainers to misunderstand the code's behavior and introduce bugs when modifying it.
| // If g->sched->ctxt == 0 abort runtime, otherwise return. | |
| "test r9, r9", | |
| "jz 1f", | |
| "call {runtime_abort}", | |
| "1:", | |
| // If g->sched->ctxt != 0 abort runtime, otherwise return. | |
| "test r9, r9", | |
| "jz 1f", | |
| "call {runtime_abort}", | |
| "1:", | |
Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
This PR implements go detours for go v1.24.*. These detours are used only for Linux amd64 dlopen hook.
The go detours use the same approach in our 1.25 detours, but do not use the
gosave_systemstack_switchfunction resolved from the go binary. Instead, the whole call stack is implemented with Rust naked functions.This approach shall handle edge cases when multiple c-shared go libs are opened.