Summary
When upgrading from a version before v2.7 to v2.7+, tables whose primary key / distribution key contains JSONB can compute a different vnode for the same logical JSONB value.
This is because vnode assignment hashes the distribution-key datum, and JSONB hashing relies on Rust's Hash trait behavior for the underlying JSONB representation. That hash behavior changed across the upgrade even when the JSONB value is logically equal and has the same key-serialized representation.
This can leave old rows persisted under the old vnode while new operations compute a new vnode for the same JSONB key.
#23497 -> risingwavelabs/jsonbb#7
Simple Repro
On a pre-v2.7 cluster:
CREATE TABLE t (a JSONB PRIMARY KEY);
INSERT INTO t VALUES ('"foo"'::jsonb);
Upgrade the cluster to v2.7+.
Then insert or otherwise operate on the same logical key:
INSERT INTO t VALUES ('"foo"'::jsonb);
The same logical JSONB key may now be assigned to a different vnode.
A symptom can be:
SELECT DISTINCT a FROM t;
returning duplicate-looking rows, while equality still says they are equal:
SELECT t1.a, t2.a, t1.a = t2.a
FROM t AS t1, t AS t2;
What Leads To This
The persisted Hummock table key is effectively:
vnode bytes + memcomparable primary-key bytes
The vnode prefix is computed separately from the serialized key payload.
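The key layout described above can be sketched as follows. This is a minimal illustration, not the actual Hummock encoding: the helper name `build_table_key` is hypothetical, and the 2-byte big-endian vnode prefix is an assumption made for the example.

```rust
/// Hypothetical helper: concatenates a vnode prefix with the already
/// serialized (memcomparable) primary-key bytes.
/// Assumption: the vnode is encoded as 2 big-endian bytes.
fn build_table_key(vnode: u16, pk_bytes: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(2 + pk_bytes.len());
    // The prefix comes from the vnode computation, not from the payload.
    key.extend_from_slice(&vnode.to_be_bytes());
    // The payload is appended unchanged.
    key.extend_from_slice(pk_bytes);
    key
}
```

Two keys built from the same payload but different vnodes share the same suffix and differ only in the prefix, which is the situation this issue describes.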
For a table like:
CREATE TABLE t (a JSONB PRIMARY KEY);
the JSONB column a is both the primary key and the distribution key.
The write path computes the vnode from the primary/distribution key:
StateTable::serialize_pk
-> compute_vnode_by_pk
-> compute_vnode
-> VirtualNode::compute_row
-> hash distribution-key row with Crc32FastBuilder
-> hash % vnode_count
Relevant code paths:
src/stream/src/common/table/state_table.rs
src/common/src/hash/table_distribution.rs
src/common/src/hash/consistent_hash/vnode.rs
src/common/src/row/mod.rs
src/common/src/types/jsonb.rs
The JSONB datum hash ultimately relies on Rust's Hash trait implementation for the underlying JSONB value. That behavior changed across the upgrade.
As a result, the derived vnode can change even if:
- the JSONB value is logically equal
- SQL equality still returns true
- display/string output is the same
- memcomparable primary-key serialization is the same
The key payload can remain stable while the vnode prefix changes.
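To make this concrete, here is a small sketch of how two `Hash` implementations for the same logical value can yield different vnodes. Everything in it is illustrative: `OldRepr`, `NewRepr`, and `vnode_of` are hypothetical names, the FNV-1a hasher is a stand-in for `Crc32FastBuilder`, and the difference between the two `Hash` impls is invented for the example rather than taken from the actual jsonbb change.

```rust
use std::hash::{Hash, Hasher};

/// Deterministic stand-in hasher (32-bit FNV-1a), used only for illustration.
struct Fnv1a(u32);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0x811c_9dc5)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u32::from(b);
            self.0 = self.0.wrapping_mul(0x0100_0193);
        }
    }

    fn finish(&self) -> u64 {
        u64::from(self.0)
    }
}

/// Hypothetical "old" representation: feeds only the raw bytes to the hasher.
struct OldRepr<'a>(&'a str);

impl Hash for OldRepr<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write(self.0.as_bytes());
    }
}

/// Hypothetical "new" representation: delegates to `str`'s `Hash` impl,
/// which feeds additional data beyond the raw bytes.
struct NewRepr<'a>(&'a str);

impl Hash for NewRepr<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.0.hash(state);
    }
}

/// Mirrors the "hash, then take the remainder by vnode_count" step.
/// Assumes vnode_count > 0.
fn vnode_of<T: Hash>(value: &T, vnode_count: u64) -> u64 {
    let mut hasher = Fnv1a::default();
    value.hash(&mut hasher);
    hasher.finish() % vnode_count
}
```

With a vnode count of 256, `OldRepr("foo")` and `NewRepr("foo")` land on different vnodes even though the wrapped string, and therefore any serialization of it, is identical.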
Why This Is Unsafe
Rust's Hash trait is not intended to provide stable persisted/cross-version hash output. The official docs discuss this in the Hash trait portability notes:
https://doc.rust-lang.org/std/hash/trait.Hash.html
Therefore, using dependency-derived Hash output for persisted vnode/distribution-key assignment can introduce upgrade incompatibility.
Impact
Possible symptoms include:
- the same logical JSONB primary key being stored under different vnode prefixes before and after upgrade
- point lookup/write/delete computing only the new vnode and missing rows under the old vnode
- storage scans seeing logically equal keys separated by vnode
- DISTINCT, GROUP BY, or aggregation paths relying on sorted input producing duplicate groups
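The missed lookup and the duplicate-looking rows can be reproduced in a toy model. The sketch below is not Hummock: `ToyStore` is a hypothetical in-memory map keyed by (vnode, serialized primary key), and the vnode numbers passed to it are arbitrary values chosen by the caller.

```rust
use std::collections::BTreeMap;

/// Toy store keyed by (vnode, serialized primary key). Hypothetical type,
/// used only to model a vnode-prefixed key space.
struct ToyStore {
    rows: BTreeMap<(u16, Vec<u8>), String>,
}

impl ToyStore {
    fn new() -> Self {
        ToyStore { rows: BTreeMap::new() }
    }

    /// Returns true when no row existed under this exact (vnode, pk) key,
    /// i.e. the write did not detect a primary-key conflict.
    fn insert(&mut self, vnode: u16, pk: &[u8], value: &str) -> bool {
        self.rows
            .insert((vnode, pk.to_vec()), value.to_string())
            .is_none()
    }

    /// Point lookup: only the given vnode is consulted.
    fn get(&self, vnode: u16, pk: &[u8]) -> Option<&str> {
        self.rows.get(&(vnode, pk.to_vec())).map(String::as_str)
    }

    /// Full scan in key order, returning only the primary-key payloads.
    fn scan_pks(&self) -> Vec<&[u8]> {
        self.rows.keys().map(|(_, pk)| pk.as_slice()).collect()
    }
}
```

If a row is inserted under one vnode and the same payload is later looked up or inserted under another, `get` returns `None`, `insert` reports no conflict, and `scan_pks` returns the same payload twice.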
Suggested Fix Direction
Avoid using dependency-derived Hash for persisted vnode/distribution-key computation of JSONB.
Possible approaches:
- Hash a stable canonical JSONB representation.
- Hash the same bytes used by the memcomparable primary-key serialization.
- Preserve the pre-v2.7 JSONB hash behavior for existing tables.
- Add upgrade tests covering JSONB primary/distribution keys across v2.6.x -> v2.7+.
- Add compatibility/migration handling for tables whose distribution key includes JSONB.
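One way the "hash stable bytes" and "add upgrade tests" ideas could fit together is sketched below. It is only a sketch under assumptions: `stable_vnode` and `golden_values_hold` are hypothetical names, FNV-1a stands in for whatever fixed algorithm would actually be chosen, and the input is assumed to be the serialized key bytes rather than a value hashed through the `Hash` trait.

```rust
/// Pinned 32-bit FNV-1a. A stand-in for a fixed, version-independent
/// algorithm; the real choice is up to the implementation.
fn fnv1a32(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Hypothetical: derives the vnode from serialized key bytes only, so the
/// result does not depend on any dependency's `Hash` implementation.
/// Assumes vnode_count > 0.
fn stable_vnode(serialized_key: &[u8], vnode_count: u32) -> u32 {
    fnv1a32(serialized_key) % vnode_count
}

/// Golden values recorded once for vnode_count = 256. A regression test can
/// compare against them on every release to catch accidental hash changes.
const GOLDEN: &[(&str, u32)] = &[("", 197), ("foo", 215)];

fn golden_values_hold() -> bool {
    GOLDEN
        .iter()
        .all(|&(key, expected)| stable_vnode(key.as_bytes(), 256) == expected)
}
```

The design point is that the hash input and algorithm are both owned by the storage layer, so a dependency upgrade cannot silently change vnode assignment, and the golden-value check fails loudly if it ever does.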