Breaking change >=2.7: JSONB distribution-key hash change can remap the same logical key to a different vnode #25336

@maybe-vibe

Summary

When upgrading from a version before v2.7 to v2.7+, tables whose primary key or distribution key contains a JSONB column can be assigned a different vnode for the same logical JSONB value.

This is because vnode assignment hashes the distribution-key datum, and JSONB hashing relies on the behavior of Rust's Hash trait for the underlying JSONB representation. That hash behavior changed across the upgrade, even when the JSONB value is logically equal and its memcomparable key serialization is byte-for-byte the same.

This can leave old rows persisted under the old vnode while new operations compute a new vnode for the same JSONB key.

Related: #23497 -> risingwavelabs/jsonbb#7

Simple Repro

On a pre-v2.7 cluster:

CREATE TABLE t (a JSONB PRIMARY KEY);
INSERT INTO t VALUES ('"foo"'::jsonb);

Upgrade the cluster to v2.7+.

Then insert or otherwise operate on the same logical key:

INSERT INTO t VALUES ('"foo"'::jsonb);

The same logical JSONB key may now be assigned to a different vnode.

A symptom can be:

SELECT DISTINCT a FROM t;

returning duplicate-looking rows, while equality still says they are equal:

SELECT t1.a, t2.a, t1.a = t2.a
FROM t AS t1, t AS t2;

What Leads To This

The persisted Hummock table key is effectively:

vnode bytes + memcomparable primary-key bytes

The vnode prefix is computed separately from the serialized key payload.
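
To make the layout concrete, here is a minimal Rust sketch of how such a key could be assembled. The function name `table_key`, the 2-byte big-endian vnode encoding, and the sample payload are assumptions for illustration only, not the actual Hummock code.

```rust
/// Builds a storage key as `vnode bytes + primary-key bytes`.
/// Assumption: the vnode is encoded as 2 big-endian bytes.
fn table_key(vnode: u16, pk_bytes: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(2 + pk_bytes.len());
    // The prefix comes from the hash-derived vnode...
    key.extend_from_slice(&vnode.to_be_bytes());
    // ...while the payload comes from the serialized primary key.
    key.extend_from_slice(pk_bytes);
    key
}
```

Calling `table_key(17, payload)` and `table_key(201, payload)` with the same payload yields two different storage keys that share an identical suffix, which is the situation this issue describes.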

For a table like:

CREATE TABLE t (a JSONB PRIMARY KEY);

the JSONB column a is both the primary key and the distribution key.

The write path computes the vnode from the primary/distribution key:

StateTable::serialize_pk
  -> compute_vnode_by_pk
  -> compute_vnode
  -> VirtualNode::compute_row
  -> hash distribution-key row with Crc32FastBuilder
  -> hash % vnode_count

Relevant code paths:

src/stream/src/common/table/state_table.rs
src/common/src/hash/table_distribution.rs
src/common/src/hash/consistent_hash/vnode.rs
src/common/src/row/mod.rs
src/common/src/types/jsonb.rs

The JSONB datum hash ultimately relies on Rust's Hash trait implementation for the underlying JSONB value. That behavior changed across the upgrade.

As a result, the derived vnode can change even if:

  • the JSONB value is logically equal
  • SQL equality still returns true
  • display/string output is the same
  • memcomparable primary-key serialization is the same

The key payload can remain stable while the vnode prefix changes.
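
The divergence can be reproduced in isolation with a small Rust sketch. Everything here is hypothetical: `OldJson` and `NewJson` stand in for two versions of the JSONB representation whose `Hash` impls feed different bytes to the hasher, `Fnv1a` is a deterministic stand-in hasher (not `Crc32FastBuilder`), and `key_payload` stands in for the memcomparable serialization.

```rust
use std::hash::{Hash, Hasher};

/// Deterministic 64-bit FNV-1a hasher, used only as a stand-in.
struct Fnv1a(u64);

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = (self.0 ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01B3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hypothetical "old" representation: feeds tag 0, then the text bytes.
struct OldJson<'a>(&'a str);
/// Hypothetical "new" representation: feeds tag 1, then the text bytes.
struct NewJson<'a>(&'a str);

impl Hash for OldJson<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u8(0);
        state.write(self.0.as_bytes());
    }
}

impl Hash for NewJson<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u8(1);
        state.write(self.0.as_bytes());
    }
}

/// Stand-in for the key serialization: it ignores the in-memory representation.
fn key_payload(text: &str) -> Vec<u8> {
    text.as_bytes().to_vec()
}

/// Hashes the key through its `Hash` impl, then reduces modulo the vnode count.
fn vnode_of<K: Hash>(key: &K, vnode_count: u64) -> u64 {
    let mut hasher = Fnv1a(0xCBF2_9CE4_8422_2325);
    key.hash(&mut hasher);
    hasher.finish() % vnode_count
}
```

With `OldJson("\"foo\"")` and `NewJson("\"foo\"")`, `key_payload` returns identical bytes for both, while `vnode_of(.., 256)` returns different vnodes, because the only input that changed is what the `Hash` impl feeds to the hasher.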

Why This Is Unsafe

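Rust's Hash trait does not promise hash output that is stable across versions, so it is not suitable for persisted state. The Portability section of the official Hash docs says that data fed to a Hasher should not be considered portable across platforms, and that the data passed by most standard library types should not be considered stable between compiler versions: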

https://doc.rust-lang.org/std/hash/trait.Hash.html

Therefore, using dependency-derived Hash output for persisted vnode/distribution-key assignment can introduce upgrade incompatibility.

Impact

Possible symptoms include:

  • the same logical JSONB primary key being stored under different vnode prefixes before and after upgrade
  • point lookup/write/delete computing only the new vnode and missing rows under the old vnode
  • storage scans seeing logically equal keys separated by vnode
  • DISTINCT, GROUP BY, or aggregation paths relying on sorted input producing duplicate groups
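
The first three symptoms can be modeled with a toy ordered store. This is a sketch under stated assumptions: `Store`, `point_get`, and `count_payload` are hypothetical names, the `(vnode, payload)` tuple key only mimics a vnode-prefixed key, and the vnode numbers used with it are arbitrary.

```rust
use std::collections::BTreeMap;

/// Toy ordered store keyed by (vnode, key payload), mimicking a vnode-prefixed key.
type Store = BTreeMap<(u16, Vec<u8>), &'static str>;

/// Point lookup: probes only the vnode computed by the current hash.
fn point_get(store: &Store, vnode: u16, pk: &[u8]) -> Option<&'static str> {
    store.get(&(vnode, pk.to_vec())).copied()
}

/// Full scan: counts rows whose payload equals `pk`, across all vnodes.
fn count_payload(store: &Store, pk: &[u8]) -> usize {
    store
        .keys()
        .filter(|(_, payload)| payload.as_slice() == pk)
        .count()
}
```

If a row is inserted under vnode 17 before the upgrade and the post-upgrade hash yields vnode 201, `point_get` with vnode 201 returns `None` even though the row exists. A second insert under vnode 201 then succeeds, and `count_payload` reports two rows for one logical primary key.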

Suggested Fix Direction

Avoid using dependency-derived Hash for persisted vnode/distribution-key computation of JSONB.

Possible approaches:

  1. Hash a stable canonical JSONB representation.
  2. Hash the same bytes used by the memcomparable primary-key serialization.
  3. Preserve the pre-v2.7 JSONB hash behavior for existing tables.
  4. Add upgrade tests covering JSONB primary/distribution keys across v2.6.x -> v2.7+.
  5. Add compatibility/migration handling for tables whose distribution key includes JSONB.
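
As a rough sketch of approach 2, the vnode could be derived from the serialized key bytes with a hash whose output is fixed by its specification, bypassing the `Hash` trait entirely. The function names `crc32` and `vnode_from_key_bytes` are hypothetical, and the hand-written bitwise CRC-32 is only there to keep the example dependency-free. Note that switching existing tables to any new scheme would itself remap keys, so this would still need to be combined with approach 3 or 5.

```rust
/// Bitwise CRC-32 (IEEE) over raw bytes. Its output is fixed by the algorithm's
/// definition, so it does not change when a dependency changes its `Hash` impl.
fn crc32(bytes: &[u8]) -> u32 {
    let mut state = 0xFFFF_FFFFu32;
    for &b in bytes {
        state ^= u32::from(b);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (state & 1).wrapping_neg();
            state = (state >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !state
}

/// Derives the vnode from the serialized key bytes instead of from `Hash`.
fn vnode_from_key_bytes(pk_bytes: &[u8], vnode_count: u32) -> u16 {
    (crc32(pk_bytes) % vnode_count) as u16
}
```

Because the input is the same byte string that forms the key payload, two values with the same serialization always land on the same vnode, regardless of how the in-memory JSONB type implements `Hash`.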
