The .proto definition of a PBNode is this (spec):
message PBNode {
  // refs to other objects
  repeated PBLink Links = 2;

  // opaque user data
  optional bytes Data = 1;
}
Implementations are expected to write the repeated Links messages to the output buffer first, then the Data field, even though the field IDs are ordered the opposite way. The protobuf spec does not disallow this, though some off-the-shelf encoders write fields in ID order.
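As a rough illustration of that byte layout, here is a minimal sketch of a hand-rolled encoder that emits Links (field 2) before Data (field 1). The names `encode_varint` and `encode_pbnode` are hypothetical, and each link is assumed to be an already-serialized PBLink message; this is not taken from any particular implementation.

```python
from typing import List, Optional

# Protobuf tag byte = (field number << 3) | wire type.
# Both fields are length-delimited (wire type 2).
LINKS_TAG = (2 << 3) | 2  # 0x12
DATA_TAG = (1 << 3) | 2   # 0x0a


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_pbnode(links: List[bytes], data: Optional[bytes]) -> bytes:
    """Serialize a PBNode with every Links entry first, then Data.

    `links` holds pre-encoded PBLink messages (an assumption made to
    keep the sketch short).
    """
    out = bytearray()
    for link in links:
        out.append(LINKS_TAG)
        out += encode_varint(len(link))
        out += link
    if data is not None:
        out.append(DATA_TAG)
        out += encode_varint(len(data))
        out += data
    return bytes(out)
```

With this layout the `0x12` entries precede the single `0x0a` entry on the wire, which is the reverse of field-ID order.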
When the PBNode represents a directory, the Links objects could be either flat directory entries or HAMT shard entries. The information needed to tell which is contained in the Data field.
This means when processing an incoming PBNode message, we typically read all of the Links, and then use the Data to decide how to process them.
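The decision step can be sketched as follows. This assumes the Data bytes hold a UnixFS `Data` message whose `Type` is field 1 (a varint), with `Directory = 1` and `HAMTShard = 5`; those values come from the UnixFS schema rather than from the text above, and the helper names (`unixfs_type`, `link_layout`) are hypothetical.

```python
from typing import Optional, Tuple

# Assumed UnixFS DataType enum values (not stated in this proposal).
DIRECTORY = 1
HAMT_SHARD = 5


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint at `pos`; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def unixfs_type(data: bytes) -> Optional[int]:
    """Return the value of field 1 (Type) in the Data bytes, if present."""
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field, wire_type = tag >> 3, tag & 7
        if wire_type == 0:  # varint field
            value, pos = read_varint(data, pos)
            if field == 1:
                return value
        elif wire_type == 2:  # length-delimited field, skipped
            length, pos = read_varint(data, pos)
            pos += length
        else:
            raise ValueError("unexpected wire type %d" % wire_type)
    return None


def link_layout(data: bytes) -> str:
    """Say how the Links of a directory-like PBNode should be read."""
    kind = unixfs_type(data)
    if kind == DIRECTORY:
        return "flat"
    if kind == HAMT_SHARD:
        return "hamt"
    return "other"
```

Only once `link_layout` has run on the Data bytes does a reader know whether a Link name is a plain directory entry or a HAMT shard entry.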
If we are performing a graph traversal, a streaming parser would let us process the PBNode message as it arrives and select the Link message we wish to traverse through or resolve to. This is not currently possible, since we need to process the Data field before we can return a Link. This can add significant overhead when there are many thousands of Links.
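To make the buffering cost concrete, here is a minimal sketch of a streaming reader that walks the top-level PBNode fields in wire order and counts how many Links entries have to be held back before the Data field shows up. The function names are hypothetical, and `io.BytesIO` stands in for a network stream.

```python
import io
from typing import BinaryIO, Iterator, Optional, Tuple


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one varint from the stream; return None on a clean EOF."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise ValueError("truncated varint")
        result |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return result
        shift += 7


def iter_fields(stream: BinaryIO) -> Iterator[Tuple[int, bytes]]:
    """Yield (field number, payload) pairs in the order they arrive."""
    while True:
        tag = read_varint(stream)
        if tag is None:
            return
        field, wire_type = tag >> 3, tag & 7
        if wire_type != 2:
            raise ValueError("PBNode fields are length-delimited")
        length = read_varint(stream)
        if length is None:
            raise ValueError("truncated field")
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated payload")
        yield field, payload


def links_buffered_before_data(stream: BinaryIO) -> int:
    """Count the Links (field 2) seen before Data (field 1) arrives."""
    buffered = 0
    for field, _payload in iter_fields(stream):
        if field == 1:
            return buffered
        if field == 2:
            buffered += 1
    return buffered


links_first = io.BytesIO(b"\x12\x01\xaa\x12\x01\xbb\x0a\x02\x08\x01")
data_first = io.BytesIO(b"\x0a\x02\x08\x01\x12\x01\xaa\x12\x01\xbb")
print(links_buffered_before_data(links_first))  # 2
print(links_buffered_before_data(data_first))   # 0
```

With Links first, the count grows with the size of the directory; with Data first it stays at zero, so each Link could be handled as soon as it is read.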
We should allow ordering the Data field first. This would enable the streaming use case and help with IPFS code running in resource-constrained environments such as web browsers.
Since IPIP-499 we now have CID profiles, which could be an upgrade path for the network. This proposal could be part of a unixfs-v1-2026 profile.