[TENT][Sunrise] Add sunrise_link transport, platform support, and UT …#1915
HomeDish wants to merge 4 commits into kvcache-ai:main
Code Review
This pull request introduces the Sunrise Link Transport, a GPU backend for the Mooncake TENT framework leveraging the Tang Runtime. The changes encompass the transport and platform implementations, build system updates, documentation, and tests. Feedback identifies several issues, including data races in memory registration, device context risks with thread-local streams, and the need for safer memory management using std::vector instead of malloc. Suggestions also include optimizing remote segment invalidation and ensuring asynchronous transfer logic respects configuration settings.
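Two of the review points (synchronizing the memory-registration map and preferring std::vector over malloc) can be illustrated with a minimal sketch. It is not the PR's actual code: `RegionInfo`, `RegistrationMap`, and `make_probe_buffer` are hypothetical names chosen for illustration, and no Tang Runtime calls are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical description of a registered memory region (illustrative only).
struct RegionInfo {
    std::size_t length;
    int device_id;
};

// Hypothetical registry: every access to the map takes the same mutex,
// so concurrent register/unregister calls cannot race on the container.
class RegistrationMap {
public:
    // Returns false if the address is already registered.
    bool add(std::uintptr_t addr, std::size_t length, int device_id) {
        std::lock_guard<std::mutex> guard(mutex_);
        return regions_.emplace(addr, RegionInfo{length, device_id}).second;
    }

    // Returns false if the address was not registered.
    bool remove(std::uintptr_t addr) {
        std::lock_guard<std::mutex> guard(mutex_);
        return regions_.erase(addr) == 1;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return regions_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::uintptr_t, RegionInfo> regions_;
};

// Scratch buffer owned by std::vector: it is released automatically on every
// return path, which a malloc/free pair does not guarantee.
std::vector<std::uint8_t> make_probe_buffer(std::size_t bytes) {
    assert(bytes > 0);
    return std::vector<std::uint8_t>(bytes, 0);
}
```

A single mutex is the simplest option here; a reader/writer lock might suit a lookup-heavy path better, but that is a design choice the review summary does not specify.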
…coverage Integrate Sunrise platform/transport wiring across TENT runtime and examples, add SunriseLink end-to-end unit tests, and fix RDMA error logging pointer formatting to avoid crash during registration failure paths. Made-with: Cursor
Bump pre-commit hook revisions to current releases so local checks and CI use newer lint/format toolchains consistently. Made-with: Cursor
Address review feedback in SunriseLink transport/platform paths (stream/device context, registration map synchronization, safer probe/allocator handling, and cache-refresh strategy), and remove the obsolete transfer_engine_sunrise_bench CMake target now that its source no longer exists.
@HomeDish Could you give more information about the run_rise transport?
@stmatengss Thanks for the question, here is a quick summary:
Description
This PR integrates SunriseLink as a new transport backend in TENT Transfer Engine and wires Sunrise platform support end-to-end, including transport loading, platform probing/allocation, benchmark integration, and build system updates.
Key changes:
- tent/src/transport/sunrise_link/*
- tent/include/tent/transport/sunrise_link/*
- tent/src/platform/sunrise/*
- tent/include/tent/platform/sunrise.h
- tent/src/runtime/transport_loader.cpp
- tent/src/runtime/platform.cpp
- tent/src/runtime/transfer_engine_impl.cpp
- tent/include/tent/common/types.h
- Build system updates (CMakeLists.txt) for Sunrise components.
- sunrise_link: transfer_engine_bench.cpp
- docs/source/zh_archive/sunrise_link_transport.md

Module
- mooncake-transfer-engine
- mooncake-store
- mooncake-ep
- mooncake-integration
- mooncake-p2p-store
- mooncake-wheel
- mooncake-pg
- mooncake-rl

Type of Change
How Has This Been Tested?
- transfer_engine_bench successfully with Sunrise enabled.
- transfer_engine_bench on sunrise_link protocol.
- (read+write, all non-self pairs) in transfer_engine_bench.cpp
- tent/tests/sunrise_link_transport_test.cpp, and the test passes.

Checklist
- ./scripts/code_format.sh before submitting.
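
The coverage described under "How Has This Been Tested?" (read and write over all non-self pairs) can be sketched as a case enumerator. This is a guess at the shape of that coverage, not the code in transfer_engine_bench.cpp: `Op`, `TransferCase`, and `enumerate_cases` are hypothetical names, and treating the pairs as indices of endpoints is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operation kinds exercised by the benchmark.
enum class Op { Read, Write };

// Hypothetical test case: one transfer between two distinct endpoints.
struct TransferCase {
    std::size_t src;
    std::size_t dst;
    Op op;
};

// Builds every ordered (src, dst) pair with src != dst, once per operation.
std::vector<TransferCase> enumerate_cases(std::size_t endpoint_count) {
    std::vector<TransferCase> cases;
    for (std::size_t src = 0; src < endpoint_count; ++src) {
        for (std::size_t dst = 0; dst < endpoint_count; ++dst) {
            if (src == dst) continue;  // skip self pairs
            cases.push_back({src, dst, Op::Read});
            cases.push_back({src, dst, Op::Write});
        }
    }
    // n endpoints give n * (n - 1) ordered pairs, each with two operations.
    assert(endpoint_count == 0 ||
           cases.size() == endpoint_count * (endpoint_count - 1) * 2);
    return cases;
}
```

With three endpoints this yields 12 cases (6 ordered pairs, each run as a read and as a write); with a single endpoint it yields none.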