Add Milvus 2.6 API support by richzw · Pull Request #112 · milvus-io/milvus-sdk-rust

richzw · 2026-04-01T10:04:12Z

Summary

Update milvus-proto submodule from v2.6.1 to v2.6.13 and regenerate all proto Rust files
Add full support for Milvus 2.6 new data types, RPCs, search features, and schema evolution

New Data Types

Type	DataType	Wire Format
Geometry	24	WKB (`GeometryArray`) / WKT (`GeometryWktArray`)
Text	25	String (used with BM25 function)
Timestamptz	26	i64 microseconds (`TimestamptzArray`)
Int8Vector	105	bytes (1 byte/dim)
Float16Vector	102	bytes (2 bytes/dim)
BFloat16Vector	103	bytes (2 bytes/dim)
SparseFloatVector	--	`SparseFloatArray`

Value enum: added Int8Vector, Float16Vector, BFloat16Vector, Geometry, GeometryWkt, Timestamptz, SparseFloat variants
ValueVec: added Geometry, GeometryWkt, Timestamptz, SparseFloat, StructArray, VectorArray variants
Zero unimplemented!() panics remaining in value.rs and data.rs
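
The byte widths in the table above imply how a flat wire buffer divides into rows. The following is a minimal standalone sketch of that arithmetic, not the SDK's code: `VecKind` and `split_rows` are hypothetical names introduced only for illustration.

```rust
/// Hypothetical stand-in for the byte-encoded vector types in the table
/// (not the SDK's real `DataType` enum).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum VecKind {
    Int8,
    Float16,
    BFloat16,
}

impl VecKind {
    /// Bytes used per dimension, as listed in the table.
    pub fn bytes_per_dim(self) -> usize {
        match self {
            VecKind::Int8 => 1,
            VecKind::Float16 | VecKind::BFloat16 => 2,
        }
    }
}

/// Splits a flat byte buffer into per-row slices. Returns `None` when the
/// row width is zero or the buffer is not a whole number of rows.
pub fn split_rows(kind: VecKind, dim: usize, bytes: &[u8]) -> Option<Vec<&[u8]>> {
    let row_width = dim.checked_mul(kind.bytes_per_dim())?;
    if row_width == 0 || bytes.len() % row_width != 0 {
        return None;
    }
    Some(bytes.chunks_exact(row_width).collect())
}
```

With a 16-byte buffer and `dim = 4`, this yields four Int8Vector rows or two Float16Vector rows.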

New RPCs

truncate_collection() -- clear data without dropping collection
batch_describe_collections() -- describe multiple collections in one call
add_collection_field() -- schema evolution (add nullable field)
add/alter/drop_collection_function() -- server-side function management
run_analyzer() -- test text analyzers, returns tokenized results

Search & Query Enhancements

COSINE metric type + HNSW index support
BM25 full-text search via schema functions (add_function() on CollectionSchemaBuilder)
Highlighter support: SearchOptions::highlighter(), SearchResult::highlight_results
Namespace for multi-tenancy across insert/upsert/search/query/hybrid_search/iterators
Per-query highlight result slicing (correct behavior with nq > 1)
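
Per-query slicing with nq > 1 amounts to cutting a flat, hit-ordered list at the boundaries given by each query's hit count. A minimal sketch of that idea follows, assuming the server returns one flat list plus a per-query count. `slice_per_query` is a hypothetical helper, not a function from this PR.

```rust
/// Splits a flat list of per-hit items into one `Vec` per query, given how
/// many hits each query returned. Returns `None` if the counts do not add up
/// to the length of the flat list.
pub fn slice_per_query<T: Clone>(flat: &[T], topks: &[usize]) -> Option<Vec<Vec<T>>> {
    let total: usize = topks.iter().sum();
    if total != flat.len() {
        return None;
    }
    let mut out = Vec::with_capacity(topks.len());
    let mut offset = 0;
    for &k in topks {
        // Each query owns the next `k` items of the flat list.
        out.push(flat[offset..offset + k].to_vec());
        offset += k;
    }
    Some(out)
}
```

Checking the total first keeps the slicing itself free of out-of-bounds panics when the counts and the flat list disagree.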

Schema Improvements

FieldSchema::set_nullable() -- required for schema evolution
FieldSchema::add_type_param() -- extra params like enable_analyzer, enable_match
CollectionSchemaBuilder::add_function() -- attach BM25/TextEmbedding/Rerank functions at creation
extra_type_params round-trip through describe/create (not dropped on describe)
Function output fields auto-marked with is_function_output

Index Types

Added: INVERTED, SPARSE_INVERTED_INDEX, SPARSE_WAND, RTREE, AutoIndex, DiskANN, GpuIvfFlat, GpuIvfPQ
Added metrics: COSINE, BM25

Test Plan

cargo build -- zero errors, deprecation warnings only
cargo test --lib -- 2/2 unit tests pass
cargo test --no-run -- all 21 test binaries compile
cargo run --example milvus26_features -- all 7 demos pass against live Milvus 2.6.13:
1. COSINE search with HNSW + INVERTED scalar index
2. Truncate collection
3. Batch describe collections
4. Partial upsert
5. Int8 vector insert
6. Timestamptz field insert + query
7. BM25 full-text search (analyzer + schema function + text_match query)

Update milvus-proto submodule to v2.6.13 and add comprehensive support for Milvus 2.6 features including new data types, RPCs, and search capabilities while maintaining backward compatibility with Milvus 2.5.

Key changes:
- Proto: update submodule to v2.6.13, handle oneof search_input breaking change
- Data types: Geometry (WKB/WKT), Text, Timestamptz, Int8Vector, Float16Vector, BFloat16Vector, SparseFloatVector, ArrayOfVector, ArrayOfStruct
- New RPCs: truncate_collection, batch_describe_collections, add_collection_field, add/alter/drop_collection_function, run_analyzer
- Search: namespace, highlighter, COSINE metric, BM25 full-text search
- Schema: add_function() builder, add_type_param(), nullable field, schema evolution
- Index: INVERTED, SPARSE_INVERTED_INDEX, SPARSE_WAND, RTREE, DiskANN, AutoIndex
- Mutate: UpsertOptions with partial_update, namespace support across all operations
- Testing: docker-compose updated to v2.6.13, integration tests and examples

Signed-off-by: Wei Zang <[email protected]>

sre-ci-robot · 2026-04-01T10:04:17Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: richzw
To complete the pull request process, please assign yah01 after the PR has been reviewed.
You can assign the PR to them by writing /assign @yah01 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mergify · 2026-04-01T10:04:52Z

@richzw Please associate the related issue to the body of your Pull Request. (eg. “issue: #187”)

Copilot

Pull request overview

Adds Milvus 2.6 API support to the Rust SDK by updating proto bindings and extending client/schema/value/query layers for new 2.6 types and RPCs.

Changes:

Regenerates/extends proto types and wires new Milvus 2.6 fields (namespace, highlighter, schema version, new datatypes).
Adds SDK support for new datatypes (e.g., Int8/Float16/BFloat16 vectors, Geometry, Timestamptz, SparseFloat, struct/vector arrays) and related serialization/deserialization.
Introduces new client APIs and options for Milvus 2.6 features (truncate, batch describe, schema evolution/function management, namespace/highlighter) plus new tests/examples and an updated docker-compose for Milvus 2.6.13.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.


File	Description
tests/milvus26.rs	Adds Milvus 2.6-focused integration tests (new types/RPCs/search/index types).
src/value.rs	Extends `Value`/`ValueVec` to represent new 2.6 scalar/vector/aggregate types and conversions.
src/schema.rs	Adds schema support for nullable fields, extra type params, schema functions, and new field constructors.
src/query.rs	Adds namespace/highlighter options and updates search request encoding (search_input oneof) + highlight slicing.
src/proto/milvus.proto.schema.rs	Updates generated schema protos (schema version, highlight results, new DataType variants).
src/proto/milvus.proto.msg.rs	Updates generated message protos (namespace on insert, create-collection schema field, etc.).
src/proto/milvus.proto.common.rs	Updates generated common protos (highlighter types, WAL enums, misc formatting).
src/mutate.rs	Adds namespace to insert, introduces `UpsertOptions` + partial update + new upsert options plumbing.
src/iterator.rs	Adds namespace plumbing to iterators and adapts search request encoding (search_input).
src/index/mod.rs	Adds new index/metric enums (e.g., INVERTED, COSINE, BM25) and makes index param parsing more tolerant.
src/data.rs	Updates column encoding/decoding for new datatypes and fixes row length computation for new vector formats.
src/collection.rs	Adds new Milvus 2.6 RPC helpers (truncate, batch describe, schema evolution/function mgmt, analyzer) and carries highlight results in `SearchResult`.
examples/milvus26_features.rs	Adds an end-to-end Milvus 2.6 feature demo example.
docker-compose.yml	Updates local dev stack to Milvus 2.6.13 and adds required dependencies (etcd).


Copilot · 2026-04-01T10:12:42Z

src/schema.rs

            chunk_size: (dim
                * match dtype {
                    DataType::BinaryVector => dim / 8,
                    _ => dim,
                }) as _,


chunk_size computed in From<schema::FieldSchema> is incorrect for vector fields: for FloatVector it becomes dim*dim and for BinaryVector it becomes dim*(dim/8). This will report the wrong per-row byte/element width to callers inspecting FieldSchema. Consider setting chunk_size to the actual row width (e.g., FloatVector = dim, BinaryVector = dim/8, Float16/BFloat16 = dim*2, Int8Vector = dim, scalars = 1).
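
To make the discrepancy concrete, here is a standalone sketch comparing the expression from the diff with the per-row widths the comment proposes. `DType`, `row_width`, and `buggy_chunk_size` are hypothetical names for illustration, and the widths are the ones listed in the comment, not values confirmed against the SDK.

```rust
/// Hypothetical stand-in for the SDK's `DataType` (illustration only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DType {
    Bool,
    Int64,
    FloatVector,
    BinaryVector,
    Float16Vector,
    BFloat16Vector,
    Int8Vector,
}

/// Per-row width as proposed in the review comment.
pub fn row_width(dtype: DType, dim: usize) -> usize {
    match dtype {
        DType::FloatVector => dim,                               // one f32 element per dim
        DType::BinaryVector => dim / 8,                          // one bit per dim
        DType::Float16Vector | DType::BFloat16Vector => dim * 2, // two bytes per dim
        DType::Int8Vector => dim,                                // one byte per dim
        DType::Bool | DType::Int64 => 1,                         // scalars
    }
}

/// The expression from the reviewed diff, reproduced for comparison.
pub fn buggy_chunk_size(dtype: DType, dim: usize) -> usize {
    dim * match dtype {
        DType::BinaryVector => dim / 8,
        _ => dim,
    }
}
```

For `dim = 128`, the diff's expression gives 16384 for FloatVector and 2048 for BinaryVector, where the proposed widths are 128 and 16.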

Copilot · 2026-04-01T10:12:42Z

src/mutate.rs

+    pub async fn upsert<S, O>(
        &self,
        collection_name: S,
        fields_data: Vec<FieldColumn>,
-        options: Option<InsertOptions>,
+        options: O,
    ) -> Result<crate::proto::milvus::MutationResult>
    where
        S: Into<String>,
+        O: IntoUpsertOptions,
    {
-        let options = options.unwrap_or_default();
+        let options = options.into_upsert_options();


Changing Client::upsert to take a generic O: IntoUpsertOptions breaks existing callers that pass Option<InsertOptions> (e.g., Some(InsertOptions::new()...)) and can also make None/option variables harder to type-infer. To preserve backwards compatibility, consider adding an IntoUpsertOptions impl for Option<InsertOptions> (or a generic impl<T: IntoUpsertOptions> IntoUpsertOptions for Option<T>) or keeping an overload that accepts Option<InsertOptions>.
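
The generic `impl<T: IntoUpsertOptions> IntoUpsertOptions for Option<T>` suggested above could look roughly like the sketch below. The struct fields and the free function `resolve` are assumptions made for illustration; only the trait name and `into_upsert_options` appear in the diff.

```rust
/// Simplified, hypothetical option structs (the real ones have more fields).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct InsertOptions {
    pub partition_name: String,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct UpsertOptions {
    pub partition_name: String,
    pub partial_update: bool,
}

pub trait IntoUpsertOptions {
    fn into_upsert_options(self) -> UpsertOptions;
}

impl IntoUpsertOptions for UpsertOptions {
    fn into_upsert_options(self) -> UpsertOptions {
        self
    }
}

impl IntoUpsertOptions for InsertOptions {
    fn into_upsert_options(self) -> UpsertOptions {
        UpsertOptions {
            partition_name: self.partition_name,
            partial_update: false,
        }
    }
}

/// Blanket impl: `Some(x)` converts `x`, `None` falls back to the default.
impl<T: IntoUpsertOptions> IntoUpsertOptions for Option<T> {
    fn into_upsert_options(self) -> UpsertOptions {
        self.map(T::into_upsert_options).unwrap_or_default()
    }
}

/// Stand-in for the options handling at the top of `upsert`.
pub fn resolve<O: IntoUpsertOptions>(options: O) -> UpsertOptions {
    options.into_upsert_options()
}
```

Existing `Some(InsertOptions { .. })` call sites keep compiling with this impl, but a bare `None` still needs an annotation such as `None::<InsertOptions>` because `T` cannot be inferred.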

Copilot · 2026-04-01T10:12:42Z

tests/milvus26.rs

+    let _new_field = FieldSchema::new_varchar("description", "added later", 256);
+    // The field must be nullable for schema evolution
+    // We set nullable via the proto conversion path
+    // Currently the FieldSchema struct does not expose nullable directly
+    // but the AddCollectionField RPC requires it.
+    // This test verifies the RPC call works; full nullable support is a follow-up.
+


This test currently does not exercise schema evolution: it creates _new_field but never calls client.add_collection_field(...), and the comments claim FieldSchema can’t set nullable even though FieldSchema::set_nullable() now exists. Update the test to set the new field nullable and actually invoke add_collection_field, then assert it appears in describe_collection/batch_describe_collections results.

Suggested change

-    let _new_field = FieldSchema::new_varchar("description", "added later", 256);
-    // The field must be nullable for schema evolution
-    // We set nullable via the proto conversion path
-    // Currently the FieldSchema struct does not expose nullable directly
-    // but the AddCollectionField RPC requires it.
-    // This test verifies the RPC call works; full nullable support is a follow-up.
+    let new_field = FieldSchema::new_varchar("description", "added later", 256)
+        .set_nullable(true);
+    // Perform schema evolution by adding the new field
+    client
+        .add_collection_field(&collection_name, new_field.clone())
+        .await?;
+    // Verify the new field appears in describe_collection results
+    let described_schema = client.describe_collection(&collection_name).await?;
+    assert!(described_schema.get_field("description").is_some());
+    // Verify the new field appears in batch_describe_collections results
+    let described_schemas = client
+        .batch_describe_collections(vec![collection_name.clone()])
+        .await?;
+    assert_eq!(1, described_schemas.len());
+    assert!(described_schemas[0].get_field("description").is_some());

Copilot · 2026-04-01T10:12:43Z

examples/milvus26_features.rs

+/// Demonstrates new capabilities introduced in Milvus 2.6:
+/// - COSINE metric with HNSW index
+/// - Truncate collection
+/// - Batch describe collections
+/// - Schema evolution (add_collection_field)
+/// - Partial upsert
+/// - BM25 full-text search function
+/// - Int8 vector field
+/// - Timestamptz field


The module-level docs list “Schema evolution (add_collection_field)”, but this example doesn’t demonstrate calling add_collection_field/schema evolution anywhere. Either add a short schema-evolution demo or remove/update the bullet to avoid misleading readers.

Copilot · 2026-04-01T10:12:43Z

src/schema.rs

            name: this.name,
            description: this.description,
            enable_dynamic_field: self.enable_dynamic_field,
+            functions: this.functions,
        })


In CollectionSchemaBuilder::build(), std::mem::replace(self, CollectionSchemaBuilder::new("".into(), "")) uses a String temporary where new() expects &str, which will not compile. Use string literals (e.g., CollectionSchemaBuilder::new("", "")) or otherwise pass &str references when resetting the builder.

Copilot · 2026-04-01T10:12:43Z

src/schema.rs

@@ -604,6 +828,7 @@ impl CollectionSchemaBuilder {
            name: this.name,
            description: this.description,
            enable_dynamic_field: self.enable_dynamic_field,


CollectionSchemaBuilder::build() uses enable_dynamic_field: self.enable_dynamic_field after self has been reset via std::mem::replace, so the built CollectionSchema will always have enable_dynamic_field = false even if enable_dynamic_field() was called. Use the moved-out builder state (this.enable_dynamic_field) instead.

Suggested change

-            enable_dynamic_field: self.enable_dynamic_field,
+            enable_dynamic_field: this.enable_dynamic_field,
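
The effect of reading from `self` after `std::mem::replace` can be reproduced in isolation. The sketch below uses a hypothetical, stripped-down `Builder` and `Schema` rather than the SDK's `CollectionSchemaBuilder`, and shows both variants side by side.

```rust
/// Hypothetical, stripped-down builder (not the SDK's type).
#[derive(Debug, Default)]
pub struct Builder {
    name: String,
    enable_dynamic_field: bool,
}

#[derive(Debug, PartialEq)]
pub struct Schema {
    pub name: String,
    pub enable_dynamic_field: bool,
}

impl Builder {
    pub fn new(name: &str) -> Self {
        Builder { name: name.to_string(), enable_dynamic_field: false }
    }

    pub fn enable_dynamic_field(&mut self) -> &mut Self {
        self.enable_dynamic_field = true;
        self
    }

    /// Reads the flag from `self`, which was just reset, so it is always false.
    pub fn build_buggy(&mut self) -> Schema {
        let this = std::mem::replace(self, Builder::new(""));
        Schema { name: this.name, enable_dynamic_field: self.enable_dynamic_field }
    }

    /// Reads the flag from `this`, the state moved out of the builder.
    pub fn build_fixed(&mut self) -> Schema {
        let this = std::mem::replace(self, Builder::new(""));
        Schema { name: this.name, enable_dynamic_field: this.enable_dynamic_field }
    }
}
```

Both variants compile without warnings, which is why this kind of mix-up is easy to miss without a test that sets the flag before calling `build`.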

richzw · 2026-04-01T10:16:01Z

Issue #113

…e upsert compatibility Signed-off-by: Wei Zang <[email protected]>

Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.


Copilot · 2026-04-01T12:29:39Z

src/mutate.rs

+    pub async fn upsert<S, O>(
        &self,
        collection_name: S,
        fields_data: Vec<FieldColumn>,
-        options: Option<InsertOptions>,
+        options: O,
    ) -> Result<crate::proto::milvus::MutationResult>
    where
        S: Into<String>,
+        O: IntoUpsertOptions,
    {
-        let options = options.unwrap_or_default();
+        let options = options.into_upsert_options();


Client::upsert now takes a generic options: O where O: IntoUpsertOptions. This makes common call sites like upsert(..., None) fail to compile due to type inference ambiguity (there are impls for both Option<InsertOptions> and Option<UpsertOptions>). Consider reverting the parameter to a concrete Option<UpsertOptions> (and keep From<InsertOptions> for UpsertOptions), or provide an additional overload/helper that preserves the previous Option<InsertOptions> signature while keeping None unambiguous.
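
The alternative proposed here, a concrete `Option<UpsertOptions>` parameter plus `From<InsertOptions>`, can be sketched as follows. The struct fields and the `resolve` function are illustrative assumptions, not the SDK's definitions.

```rust
/// Simplified, hypothetical option structs (illustration only).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct InsertOptions {
    pub partition_name: String,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct UpsertOptions {
    pub partition_name: String,
    pub partial_update: bool,
}

impl From<InsertOptions> for UpsertOptions {
    fn from(opts: InsertOptions) -> Self {
        UpsertOptions {
            partition_name: opts.partition_name,
            partial_update: false,
        }
    }
}

/// Stand-in for `upsert` with a concrete parameter type: `None` needs no
/// annotation because there is nothing left to infer.
pub fn resolve(options: Option<UpsertOptions>) -> UpsertOptions {
    options.unwrap_or_default()
}
```

Callers holding an `InsertOptions` would write `resolve(Some(opts.into()))`. That is a source change for them, but `resolve(None)` stays unambiguous.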

Copilot · 2026-04-01T12:29:39Z

src/collection.rs

+    pub async fn alter_collection_function<S>(
+        &self,
+        collection_name: S,
+        function_name: S,
+        function: proto::schema::FunctionSchema,
+    ) -> Result<()>
+    where
+        S: Into<String>,


alter_collection_function uses a single generic type parameter S for both collection_name and function_name, which forces callers to pass both arguments as the same concrete type (e.g., both &str), reducing ergonomics and causing avoidable type mismatches. Use separate generics (e.g., C: Into<String>, F: Into<String>) for these two parameters.

Suggested change

-    pub async fn alter_collection_function<S>(
-        &self,
-        collection_name: S,
-        function_name: S,
-        function: proto::schema::FunctionSchema,
-    ) -> Result<()>
-    where
-        S: Into<String>,
+    pub async fn alter_collection_function<C, F>(
+        &self,
+        collection_name: C,
+        function_name: F,
+        function: proto::schema::FunctionSchema,
+    ) -> Result<()>
+    where
+        C: Into<String>,
+        F: Into<String>,
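
The ergonomic difference is easy to see in a standalone sketch. `same_generic` and `separate_generics` are hypothetical free functions that mirror only the two signatures' generic parameters.

```rust
/// One type parameter: both arguments must have the same concrete type.
pub fn same_generic<S: Into<String>>(collection: S, function: S) -> (String, String) {
    (collection.into(), function.into())
}

/// Two type parameters: each argument can be any `Into<String>` type.
pub fn separate_generics<C, F>(collection: C, function: F) -> (String, String)
where
    C: Into<String>,
    F: Into<String>,
{
    (collection.into(), function.into())
}

// This call would fail to compile, because `S` cannot be both `String` and `&str`:
// same_generic(String::from("docs"), "bm25");
```

With separate parameters, `separate_generics(String::from("docs"), "bm25")` compiles and yields two owned strings.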

Copilot · 2026-04-01T12:29:40Z

src/collection.rs

+    pub async fn drop_collection_function<S>(
+        &self,
+        collection_name: S,
+        function_name: S,
+    ) -> Result<()>
+    where
+        S: Into<String>,
+    {


drop_collection_function uses a single generic type parameter S for both collection_name and function_name, forcing both arguments to have the same concrete type. This is unnecessarily restrictive for callers (e.g., mixing String and &str). Use separate generics for the two parameters.

Copilot · 2026-04-01T12:29:40Z

src/schema.rs

@@ -168,24 +172,55 @@ impl FieldSchema {
            chunk_size: 0,
            dim: 0,
            max_length: 0,
+            nullable: false,
+            extra_type_params: HashMap::new(),
        }
    }
+
+    #[deprecated(note = "use FieldSchema::empty() instead")]
+    pub fn const_default() -> Self {
+        Self::empty()
+    }


FieldSchema::const_default used to be a const fn but is now a regular function (and deprecated). This is a breaking change for downstream users that relied on calling it in const contexts. If API compatibility is a goal, consider keeping a const fn constructor (even if it can only initialize a minimal/default state) or introducing a new const-safe constructor and keeping the old one as const fn for a deprecation cycle.
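
One way a const-safe constructor might be kept is sketched below. It assumes the `const` was dropped because of the new `HashMap` field (`HashMap::new()` is not a `const fn` on stable Rust, to my knowledge) and swaps in a `Vec` of pairs purely for illustration. `Field` and `EMPTY_FIELD` are hypothetical names, and the SDK's actual field layout is different.

```rust
/// Hypothetical, reduced field struct (not the SDK's `FieldSchema`).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub dim: i64,
    pub nullable: bool,
    /// A `Vec` of pairs stands in for the map here, because `Vec::new()`
    /// is usable in const contexts.
    pub extra_type_params: Vec<(String, String)>,
}

impl Field {
    /// Const-safe constructor: every initializer is itself a const fn.
    pub const fn const_default() -> Self {
        Self {
            name: String::new(),
            dim: 0,
            nullable: false,
            extra_type_params: Vec::new(),
        }
    }
}

/// Downstream code can keep using the constructor in a const context.
pub const EMPTY_FIELD: Field = Field::const_default();
```

Whether a different container is acceptable for `extra_type_params` is a design decision for the maintainers. The sketch only shows that the const path is possible in principle.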

Copilot · 2026-04-01T12:29:40Z

src/data.rs

+            // Complex aggregate types: these represent the entire field data and are
+            // copied as a whole rather than pushed element by element.
+            (ValueVec::SparseFloat(dst), Value::SparseFloat(src)) => *dst = src.into_owned(),
+            (ValueVec::StructArray(dst), Value::StructArray(src)) => *dst = src.into_owned(),
+            (ValueVec::VectorArray(dst), Value::VectorArray(src)) => *dst = src.into_owned(),


For complex column types (SparseFloat, StructArray, VectorArray), FieldColumn::push overwrites the destination value (*dst = src.into_owned()) rather than appending a single row. This breaks search/query result slicing, because result construction pushes one value per hit; overwriting will leave the full original column (or the last assignment) instead of a per-hit subset and can make column lengths inconsistent with topk. Consider implementing per-row extraction/append semantics for these types (or updating result assembly to handle these types without per-row push).

Suggested change

-            // Complex aggregate types: these represent the entire field data and are
-            // copied as a whole rather than pushed element by element.
-            (ValueVec::SparseFloat(dst), Value::SparseFloat(src)) => *dst = src.into_owned(),
-            (ValueVec::StructArray(dst), Value::StructArray(src)) => *dst = src.into_owned(),
-            (ValueVec::VectorArray(dst), Value::VectorArray(src)) => *dst = src.into_owned(),
+            // Complex aggregate types: these represent the entire field data rather than a
+            // single row. Pushing them element-by-element is not supported, because it would
+            // overwrite the stored aggregate and break result slicing.
+            (ValueVec::SparseFloat(_), Value::SparseFloat(_))
+            | (ValueVec::StructArray(_), Value::StructArray(_))
+            | (ValueVec::VectorArray(_), Value::VectorArray(_)) => {
+                panic!(
+                    "per-row push is not supported for SparseFloat/StructArray/VectorArray; \
+                     construct these fields as full aggregates instead"
+                )
+            }
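
The other option mentioned in the comment, per-row append semantics, could look roughly like this for sparse vectors. `SparseRow` and `SparseColumn` are hypothetical types for illustration. The PR's `SparseFloatArray` is a generated proto type with a different layout.

```rust
/// One sparse vector as (index, value) pairs. Hypothetical representation.
pub type SparseRow = Vec<(u32, f32)>;

/// Hypothetical column that stores one sparse vector per row.
#[derive(Debug, Default, PartialEq)]
pub struct SparseColumn {
    pub rows: Vec<SparseRow>,
}

impl SparseColumn {
    /// Appends a single row instead of overwriting the whole column.
    pub fn push(&mut self, row: SparseRow) {
        self.rows.push(row);
    }

    pub fn len(&self) -> usize {
        self.rows.len()
    }

    /// Builds a per-hit subset by pushing one row per requested index, the way
    /// result assembly pushes one value per hit. Returns `None` if any index
    /// is out of range.
    pub fn gather(&self, indices: &[usize]) -> Option<SparseColumn> {
        let mut out = SparseColumn::default();
        for &i in indices {
            out.push(self.rows.get(i)?.clone());
        }
        Some(out)
    }
}
```

Because each push adds exactly one row, the gathered column's length always matches the number of hits. The overwrite-based version could not guarantee that.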

xiaofan-luan · 2026-04-01T21:16:23Z

Thanks for the contribution.

I'll start reviewing the commits once I have time.

mergify · 2026-04-02T07:42:35Z

@richzw Thanks for your contribution. Please submit with DCO, see the contributing guide https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.

Signed-off-by: Wei Zang <[email protected]>

Copilot AI review requested due to automatic review settings April 1, 2026 10:04

sre-ci-robot requested review from congqixia and yah01 April 1, 2026 10:04

sre-ci-robot added the size/XXL label Apr 1, 2026

Copilot started reviewing on behalf of richzw April 1, 2026 10:04 View session

mergify bot added the dco-passed label Apr 1, 2026

mergify bot added the do-not-merge/missing-related-issue label Apr 1, 2026

Copilot AI reviewed Apr 1, 2026

View reviewed changes

fix: correct v2.6 schema parsing, preserve dynamic fields, and restor…

bd4d6bb

…e upsert compatibility Signed-off-by: Wei Zang <[email protected]>

richzw requested a review from Copilot April 1, 2026 12:22

Copilot started reviewing on behalf of richzw April 1, 2026 12:23 View session

Copilot AI reviewed Apr 1, 2026

View reviewed changes

mergify bot added needs-dco and removed dco-passed labels Apr 2, 2026

fix: separate generics for the two parameters

5f05593

Signed-off-by: Wei Zang <[email protected]>

richzw force-pushed the main branch from 337fbaa to 5f05593 Compare April 2, 2026 09:25

mergify bot added dco-passed and removed needs-dco labels Apr 2, 2026

fix: stabilize complex field slicing and collection integration tests

54fbc37

Signed-off-by: Wei Zang <[email protected]>

