Lock hang in SdfLayer::_OpenLayerAndUnlockRegistry when compiling with C++ 20 #4017

@jfpanisset

I am trying to build USD 26.03 in a VFX Platform 2026 compliant environment. In my case I am using:

  • Rocky Linux 8.10 / glibc 2.28
  • gcc 14.2.1 from gcc-toolset-14
  • OpenUSD 26.03
  • oneTBB 2022.1
  • C++ 20 standard
  • Dual Xeon E5-2687W v3 system, total 40 HT cores (but issue also happens when running single core)

OpenUSD 26.03 forces C++ 17 in cmake/defaults/CXXDefaults.cmake:

# Require C++17
set(CMAKE_CXX_STANDARD 17)

If I force compiling with C++ 20 by commenting out that line and specifying CMAKE_CXX_STANDARD=20 when configuring CMake, I can build USD, but when I try to run any of the utilities such as usdcat, the application hangs, spinning somewhere in SdfLayer::_OpenLayerAndUnlockRegistry() in pxr/usd/sdf/layer.cpp. This happens whether I force the application to run single threaded with taskset -c 0 or I allow it to start worker threads on each CPU core (default behavior).
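Since the hang never terminates on its own, one way to check a given build is to run the tool under a time limit. The following is a minimal sketch, assuming coreutils `timeout` is available; the helper name `is_hung` is made up for illustration, and `sleep` stands in for the hung tool so the snippet runs anywhere.

```shell
# Report whether a command is still running after a time limit.
# Relies on coreutils `timeout`, which exits with status 124 when the limit is hit.
is_hung() {
  limit="$1"; shift
  timeout "$limit" "$@" >/dev/null 2>&1
  [ "$?" -eq 124 ]
}

# Stand-in demonstration: `sleep 3` plays the role of a hung tool.
if is_hung 1 sleep 3; then
  echo "hung"
else
  echo "finished"
fi
```

Against a real build this would be something like `is_hung 30 taskset -c 0 usdcat scene.usda`, where `scene.usda` is a placeholder for any USD file and the 30 second limit is an arbitrary choice.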

In gdb, if I add a breakpoint in that function and look at the assembly code, the instruction where the hang happens is the jmp at offset 608 from the start of the function:

   0x000072b02db8b34f <+591>:   mov    %rax,%rdx
   0x000072b02db8b352 <+594>:   mov    %r14,%rcx
   0x000072b02db8b355 <+597>:   mov    %rbx,%rsi
   0x000072b02db8b358 <+600>:   mov    %rbp,%rdi
   0x000072b02db8b35b <+603>:   call   0x72b02d78d0f0 <_ZNK34pxrInternal_v0_26_3__pxrReserved__17Sdf_LayerRegistry4FindERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_@plt>
   0x000072b02db8b360 <+608>:   jmp    0x72b02db8b360 <_ZN34pxrInternal_v0_26_3__pxrReserved__8SdfLayer27_OpenLayerAndUnlockRegistryIN3tbb6detail2d116queuing_rw_mutex11scoped_lockEEENS_8TfRefPtrIS0_EERT_RKNS0_20_FindOrOpenLayerInfoEb+608>
   0x000072b02db8b362 <+610>:   nopw   0x0(%rax,%rax,1)
   0x000072b02db8b368 <+616>:   lea    (%rax,%rsi,1),%rdi
   0x000072b02db8b36c <+620>:   cmp    $0x1,%r15
   0x000072b02db8b370 <+624>:   je     0x72b02db8b561 <_ZN34pxrInternal_v0_26_3__pxrReserved__8SdfLayer27_OpenLayerAndUnlockRegistryIN3tbb6detail2d116queuing_rw_mutex11scoped_lockEEENS_8TfRefPtrIS0_EERT_RKNS0_20_FindOrOpenLayerInfoEb+1121>
   0x000072b02db8b376 <+630>:   mov    (%rsp),%rsi
   0x000072b02db8b37a <+634>:   mov    %r15,%rdx

This code appears to make no sense, as it's a jmp to the same address, and indeed once execution reaches that instruction, the app will just spin over and over on the same instruction.
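The mangled names in the listing are hard to read, so it can help to demangle them. This is a small sketch assuming `c++filt` from GNU binutils is installed; the helper name `demangle` is just for illustration. The symbol is the callee of the `call` at offset +603, with the `@plt` suffix dropped.

```shell
# Demangle a symbol name copied from the gdb listing.
# c++filt (GNU binutils) reads mangled names on stdin and prints readable ones.
demangle() {
  printf '%s\n' "$1" | c++filt
}

# Callee of the `call` immediately before the self-referencing jmp.
demangle '_ZNK34pxrInternal_v0_26_3__pxrReserved__17Sdf_LayerRegistry4FindERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_'
```

This should print a signature for the const member function `Sdf_LayerRegistry::Find` taking two `std::string const&` arguments, which means the self-jump sits right after a call to `Sdf_LayerRegistry::Find`.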

Although I have not yet been able to identify exactly which line this corresponds to in SdfLayer::_OpenLayerAndUnlockRegistry(), the hang happens in the template instantiation for Lock type tbb::detail::d1::queuing_rw_mutex::scoped_lock, which would tend to point the finger at one of the calls to lock.release() at either line 3445 or 3463.

When I leave CXXDefaults.cmake alone and allow compilation in C++ 17 mode, no such "infinite jmp" code is emitted and the standalone utilities like usdcat work as expected. It is unclear whether this is a compiler bug or an issue with oneTBB 2022.1; OpenUSD itself may just be the victim here.

Here are the specific compilation lines:

  • C++ 20 (exhibits hang)
/opt/rh/gcc-toolset-14/root/usr/bin/c++ -DBOOST_NO_CXX98_FUNCTION_BASE -DGLX_GLXEXT_PROTOTYPES -DGL_GLEXT_PROTOTYPES -DMFB_ALT_PACKAGE_NAME=sdf -DMFB_PACKAGE_MODULE=Sdf -DMFB_PACKAGE_NAME=sdf -DPXR_BUILD_LOCATION=usd -DPXR_GL_SUPPORT_ENABLED -DPXR_PLUGIN_BUILD_LOCATION=../plugin/usd -DPXR_X11_SUPPORT_ENABLED -DSDF_EXPORTS=1 -Dsdf_EXPORTS -I/tmp/OpenUSD-26.03/build/pxr/usd/sdf -I/tmp/OpenUSD-26.03/pxr/usd/sdf -I/tmp/OpenUSD-26.03/build/include -Wall -Wformat-security -Wmismatched-tags -pthread -Wno-deprecated -Wno-deprecated-declarations -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-class-memaccess  -O3 -DNDEBUG -std=c++20 -fPIC -MD -MT pxr/usd/sdf/CMakeFiles/sdf.dir/layer.cpp.o -MF CMakeFiles/sdf.dir/layer.cpp.o.d -o CMakeFiles/sdf.dir/layer.cpp.o -c /tmp/OpenUSD-26.03/pxr/usd/sdf/layer.cpp
  • C++ 17 (works as expected)
/opt/rh/gcc-toolset-14/root/usr/bin/c++ -DBOOST_NO_CXX98_FUNCTION_BASE -DGLX_GLXEXT_PROTOTYPES -DGL_GLEXT_PROTOTYPES -DMFB_ALT_PACKAGE_NAME=sdf -DMFB_PACKAGE_MODULE=Sdf -DMFB_PACKAGE_NAME=sdf -DPXR_BUILD_LOCATION=usd -DPXR_GL_SUPPORT_ENABLED -DPXR_PLUGIN_BUILD_LOCATION=../plugin/usd -DPXR_X11_SUPPORT_ENABLED -DSDF_EXPORTS=1 -Dsdf_EXPORTS -I/tmp/OpenUSD-26.03/build/pxr/usd/sdf -I/tmp/OpenUSD-26.03/pxr/usd/sdf -I/tmp/OpenUSD-26.03/build/include -Wall -Wformat-security -Wmismatched-tags -pthread -Wno-deprecated -Wno-deprecated-declarations -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-class-memaccess  -O3 -DNDEBUG -std=c++17 -fPIC -MD -MT pxr/usd/sdf/CMakeFiles/sdf.dir/layer.cpp.o -MF CMakeFiles/sdf.dir/layer.cpp.o.d -o CMakeFiles/sdf.dir/layer.cpp.o -c /tmp/OpenUSD-26.03/pxr/usd/sdf/layer.cpp

The only difference between these two is -std=c++20 vs -std=c++17, and no exotic compilation options are specified, only -O3.
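For anyone who wants to verify that claim mechanically, here is a rough sketch that prints the flags present in only one of two compile command lines. The helper name `flag_diff` is made up, and the two command strings are shortened stand-ins for the full lines above, not the real ones.

```shell
# Print the flags that appear in only one of two compile command lines.
# Each command line is split on spaces, sorted, and compared with `comm`.
flag_diff() {
  a=$(mktemp); b=$(mktemp)
  printf '%s\n' "$1" | tr -s ' ' '\n' | sort > "$a"
  printf '%s\n' "$2" | tr -s ' ' '\n' | sort > "$b"
  # comm -3 drops lines common to both files; strip its column tabs.
  comm -3 "$a" "$b" | tr -d '\t'
  rm -f "$a" "$b"
}

# Shortened stand-ins for the C++ 20 and C++ 17 command lines.
flag_diff "c++ -O3 -DNDEBUG -std=c++20 -fPIC -c layer.cpp" \
          "c++ -O3 -DNDEBUG -std=c++17 -fPIC -c layer.cpp"
```

With the stand-in strings, only the two `-std=` flags are printed. The comparison ignores flag order and duplicates, which is fine for a quick sanity check but not a full command-line diff.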
