
Add documents for rosidl::Buffer features#6440

Open
nvcyc wants to merge 3 commits into rolling from
nvcyc/rosidl_buffer

Conversation


@nvcyc nvcyc commented Apr 21, 2026

Description

Introduces documentation for the new rosidl::Buffer feature and its pluggable backend architecture.

Pages created in this pull request to cover the features are:

  • Concepts/Intermediate/About-Buffer-Backends.rst — conceptual overview:
    the Buffer<T> / BufferImplBase<T> / BufferBackend split, the
    descriptor round-trip, discovery hooks, and the "base backend vs
    composed backend" pattern (CUDA as a base; PyTorch as a composed
    backend that can layer on top of multiple bases). Also clarifies that
    the plugin interface is RMW-agnostic — get_descriptor_type_support()
    returns a generic aggregate handle, and the RMW (currently
    rmw_fastrtps_cpp) resolves it internally.

  • How-To-Guides/Using-Buffer-Backends.rst — user guide for enabling a
    backend on a subscription via
    SubscriptionOptions::acceptable_buffer_backends
    ("" / "cpu" / "any" / comma-separated list), C++ and Python
    examples, a per-RMW support matrix, and the three practical rules for a
    compatible pub/sub pair (same backend installed on both sides, same RMW,
    aligned package versions). Also emphasises that intra-process /
    inter-process / inter-host transport scope is a property of each
    backend, not of rosidl::Buffer.

  • Tutorials/Advanced/Writing-a-Buffer-Backend.rst — vendor-facing
    step-by-step guide for implementing and packaging a new BufferBackend
    plugin: interface walkthrough, descriptor design (including the 4096-byte
    kMaxBufferDescriptorSize limit), BufferImplBase<T> and
    BufferBackend implementation, pluginlib registration, CMake/package.xml
    scaffolding, user-facing API patterns (allocate_msg, from_buffer,
    to_buffer), the composed-backend pattern, and a ship checklist. Uses a
    generic mydev backend in prose and cross-links the real CUDA, Torch,
    and demo backends for concrete reference.

  • Tutorials/Demos/GPU-Buffer-Transport.rst — runnable end-to-end demo
    based on robot_arm_demo from ros2/rosidl_buffer_backends_tutorials,
    comparing CUDA zero-copy vs CPU-serialised transport at several
    resolutions, with the benchmark table reproduced from the demo README.

Other registration / discoverability page updates:

  • Concepts/Intermediate.rst, How-To-Guides.rst,
    Tutorials/Advanced.rst, Tutorials/Demos.rst — new pages added to the
    respective toctrees.
  • Glossary.rst — new entries for "Buffer", "Buffer backend", "Base
    backend", "Composed backend", "Buffer descriptor", and "Acceptable
    backend list".
  • The-ROS2-Project/Features.rst — added a "Pluggable buffer backends"
    row (marked experimental; currently supported in rmw_fastrtps_cpp, with C++ user-facing APIs only).
  • Related-Projects.rst — added a "Community rosidl::Buffer backends"
    section linking to ros2/rosidl_buffer_backends and
    ros2/rosidl_buffer_backends_tutorials.

Did you use Generative AI?

Yes. Cursor with Claude Opus 4.7 was used to assist with the draft version of the docs included in this pull request.


github-actions Bot commented Apr 21, 2026

HTML artifacts: https://github.com/ros2/ros2_documentation/actions/runs/24916423129/artifacts/6635453352.

To view the resulting site:

  1. Click on the above link to download the artifacts archive
  2. Extract it
  3. Open html-artifacts-6440/index.html in your favorite browser

that describes how to locate or reconstruct the payload on the receiving
side.
For a CPU-only backend the descriptor would just carry the bytes;
for a GPU backend it typically carries an IPC handle plus metadata.

@yuanknv yuanknv Apr 21, 2026


Not sure if this is accurate; the descriptor generally doesn't carry any IPC handle. We use an FD to import the GPU pointer, and the FD is transmitted through a socket.

Collaborator

I would suggest: "a small reference that the receiving side uses to re-attach to the payload; the exact mechanism is backend-specific."

[this](sensor_msgs::msg::Image::SharedPtr msg) {
  if (msg->data.get_backend_type() == "cuda") {
    // Zero-copy GPU path: read the device pointer directly.
    auto rh = cuda_buffer_backend::from_buffer(msg->data, stream_);

The input msg needs to be const in order to get a read handle; for non-const, it will return a write handle.

* - RMW implementation
- Support status
- Notes
* - ``rmw_fastrtps_cpp``

I think we should also mention which QoS settings are not supported.

* **CPU** -- the frame is rendered on the GPU, copied back to host memory
with ``cudaMemcpy``, and then serialised through the RMW as a regular
``uint8[]``.
No buffer backend is involved.

Technically, I think it's still using the CPU buffer backend.

Collaborator

@fujitatomoya fujitatomoya left a comment


  • No need to keep the lines to a certain length, as long as there is a single sentence per line.

Comment on lines +18 to +22
The feature was introduced to let vendors transport large binary payloads
(camera images, point clouds, tensors, ...) through the existing ROS 2
pub/sub API with as few copies as the underlying memory technology allows,
while keeping every piece of existing code that treats a ``uint8[]`` field as
a ``std::vector<uint8_t>`` working unchanged.
Collaborator

Developers and users may think that this feature is available for services and actions, because services and actions in ROS 2 are built on top of message types and endpoints, plus generated services and topics under the hood.
But I do not think this is supported yet. (Not sure whether services and actions will be supported in the future, or only topic types; call it a limitation for now, or a design spec.)

I would add an explicit statement that services and actions are not supported with this feature.
I think this is not limited by design; rather, the registration is not yet implemented (like rmw_subscription_options_t), and the descriptor path with zero-copy GPU transport is, in the current merge, only wired through the topic transport.

that describes how to locate or reconstruct the payload on the receiving
side.
For a CPU-only backend the descriptor would just carry the bytes;
for a GPU backend it typically carries an IPC handle plus metadata.
Collaborator

I would suggest: "a small reference that the receiving side uses to re-attach to the payload; the exact mechanism is backend-specific."

:widths: 25 25 50
:header-rows: 1

* - RMW implementation
Collaborator

Probably we need to add rmw_zenoh_cpp here as a Tier 1 implementation.


* intra-process (same Python/C++ process);
* inter-process on the same host, same GPU, same user (via CUDA VMM IPC);
* inter-host is not supported; the RMW falls back to CPU serialization.
Collaborator

I actually think this is against the design? The RMW doesn't decide what works across hosts; the backend does. The architecture's whole point is that the RMW is backend-agnostic at the transport-mechanism level.

Given the current major scope, I do understand that all the buffers are probably and likely managed under inter-process communication, but I believe this is not a hard limitation. For example, GPUDirect RDMA is designed for cross-host GPU-to-GPU transfer over RoCE/InfiniBand, and there is CXL coherent memory?


* A CUDA-capable GPU and the CUDA Toolkit (>= 11.8).
* SDL2, GLEW, OpenGL, X11 development packages.
* A ROS 2 Rolling source workspace.
Collaborator

Probably for the future we could say Lyrical Luth or later, so that users can see the minimum distro version.

@@ -0,0 +1,199 @@
About ``rosidl::Buffer`` backends
Collaborator

How about explicitly adding the NITROS / Isaac ROS relationship to this? It could be one of the most common reader questions, and the docs currently appear to leave it unaddressed. Explaining that native buffers operate at the rosidl layer rather than via REP-2007/2011 type adaptation, that they work cross-process out of the box, and that they coexist with type adaptation rather than competing with it, would head off a lot of confusion, I guess?

* ``cuda_buffer_backend``: a realistic **base backend** built on CUDA VMM
and CUDA IPC, with intra-process, inter-process same-host, and CPU-fallback
paths.
See `ros2/rosidl_buffer_backends <https://github.com/ros2/rosidl_buffer_backends>`__.
Collaborator

obviously we need to merge ros2/rosidl_buffer_backends#1 before this doc is published.

* ``demo_buffer_backend``: a minimal CPU-to-CPU backend used by the
``rosidl::Buffer`` system tests.
Useful as a pedagogical example with no device dependencies.
See ``rcl_buffer/demo_buffer_backend`` in the workspace used by this
Collaborator

where is this package?
