Update milvus-proto submodule to v2.6.13 and add comprehensive support
for Milvus 2.6 features including new data types, RPCs, and search
capabilities while maintaining backward compatibility with Milvus 2.5.
Key changes:
- Proto: update submodule to v2.6.13, handle oneof search_input breaking change
- Data types: Geometry (WKB/WKT), Text, Timestamptz, Int8Vector, Float16Vector,
BFloat16Vector, SparseFloatVector, ArrayOfVector, ArrayOfStruct
- New RPCs: truncate_collection, batch_describe_collections, add_collection_field,
add/alter/drop_collection_function, run_analyzer
- Search: namespace, highlighter, COSINE metric, BM25 full-text search
- Schema: add_function() builder, add_type_param(), nullable field, schema evolution
- Index: INVERTED, SPARSE_INVERTED_INDEX, SPARSE_WAND, RTREE, DiskANN, AutoIndex
- Mutate: UpsertOptions with partial_update, namespace support across all operations
- Testing: docker-compose updated to v2.6.13, integration tests and examples
Signed-off-by: Wei Zang <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: richzw. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files: Approvers can indicate their approval by writing
@richzw Please reference the related issue in the body of your Pull Request (e.g. “issue: #187”).
Pull request overview
Adds Milvus 2.6 API support to the Rust SDK by updating proto bindings and extending client/schema/value/query layers for new 2.6 types and RPCs.
Changes:
- Regenerates/extends proto types and wires new Milvus 2.6 fields (namespace, highlighter, schema version, new datatypes).
- Adds SDK support for new datatypes (e.g., Int8/Float16/BFloat16 vectors, Geometry, Timestamptz, SparseFloat, struct/vector arrays) and related serialization/deserialization.
- Introduces new client APIs and options for Milvus 2.6 features (truncate, batch describe, schema evolution/function management, namespace/highlighter) plus new tests/examples and an updated docker-compose for Milvus 2.6.13.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/milvus26.rs | Adds Milvus 2.6-focused integration tests (new types/RPCs/search/index types). |
| src/value.rs | Extends Value/ValueVec to represent new 2.6 scalar/vector/aggregate types and conversions. |
| src/schema.rs | Adds schema support for nullable fields, extra type params, schema functions, and new field constructors. |
| src/query.rs | Adds namespace/highlighter options and updates search request encoding (search_input oneof) + highlight slicing. |
| src/proto/milvus.proto.schema.rs | Updates generated schema protos (schema version, highlight results, new DataType variants). |
| src/proto/milvus.proto.msg.rs | Updates generated message protos (namespace on insert, create-collection schema field, etc.). |
| src/proto/milvus.proto.common.rs | Updates generated common protos (highlighter types, WAL enums, misc formatting). |
| src/mutate.rs | Adds namespace to insert, introduces UpsertOptions + partial update + new upsert options plumbing. |
| src/iterator.rs | Adds namespace plumbing to iterators and adapts search request encoding (search_input). |
| src/index/mod.rs | Adds new index/metric enums (e.g., INVERTED, COSINE, BM25) and makes index param parsing more tolerant. |
| src/data.rs | Updates column encoding/decoding for new datatypes and fixes row length computation for new vector formats. |
| src/collection.rs | Adds new Milvus 2.6 RPC helpers (truncate, batch describe, schema evolution/function mgmt, analyzer) and carries highlight results in SearchResult. |
| examples/milvus26_features.rs | Adds an end-to-end Milvus 2.6 feature demo example. |
| docker-compose.yml | Updates local dev stack to Milvus 2.6.13 and adds required dependencies (etcd). |
src/schema.rs
Outdated
| chunk_size: (dim | ||
| * match dtype { | ||
| DataType::BinaryVector => dim / 8, | ||
| _ => dim, | ||
| }) as _, |
chunk_size computed in From<schema::FieldSchema> is incorrect for vector fields: for FloatVector it becomes dim*dim and for BinaryVector it becomes dim*(dim/8). This will report the wrong per-row byte/element width to callers inspecting FieldSchema. Consider setting chunk_size to the actual row width (e.g., FloatVector = dim, BinaryVector = dim/8, Float16/BFloat16 = dim*2, Int8Vector = dim, scalars = 1).
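A minimal sketch of the width calculation this comment suggests, set next to the expression from the diff. The `DataType` enum here is a hypothetical stand-in with only the variants named in the comment, and `row_width` and `buggy_chunk_size` are illustrative names, not SDK functions.

```rust
// Hypothetical stand-in for the SDK's DataType enum; only the variants
// named in the review comment are modeled.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DataType {
    Int64,
    FloatVector,
    BinaryVector,
    Float16Vector,
    BFloat16Vector,
    Int8Vector,
}

/// Per-row width as proposed in the review comment.
pub fn row_width(dtype: DataType, dim: usize) -> usize {
    match dtype {
        DataType::FloatVector | DataType::Int8Vector => dim,
        DataType::BinaryVector => dim / 8, // 8 dimensions packed per byte
        DataType::Float16Vector | DataType::BFloat16Vector => dim * 2, // 2 bytes each
        _ => 1, // scalars
    }
}

/// The expression from the diff, kept only for comparison.
pub fn buggy_chunk_size(dtype: DataType, dim: usize) -> usize {
    dim * match dtype {
        DataType::BinaryVector => dim / 8,
        _ => dim,
    }
}
```

For `dim = 128`, the diff's expression yields 16384 for a FloatVector and 2048 for a BinaryVector, while the suggested widths are 128 and 16.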
| pub async fn upsert<S, O>( | ||
| &self, | ||
| collection_name: S, | ||
| fields_data: Vec<FieldColumn>, | ||
| options: Option<InsertOptions>, | ||
| options: O, | ||
| ) -> Result<crate::proto::milvus::MutationResult> | ||
| where | ||
| S: Into<String>, | ||
| O: IntoUpsertOptions, | ||
| { | ||
| let options = options.unwrap_or_default(); | ||
| let options = options.into_upsert_options(); |
Changing Client::upsert to take a generic O: IntoUpsertOptions breaks existing callers that pass Option<InsertOptions> (e.g., Some(InsertOptions::new()...)) and can also make None/option variables harder to type-infer. To preserve backwards compatibility, consider adding an IntoUpsertOptions impl for Option<InsertOptions> (or a generic impl<T: IntoUpsertOptions> IntoUpsertOptions for Option<T>) or keeping an overload that accepts Option<InsertOptions>.
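The generic `Option<T>` impl mentioned above could look roughly like the following. This is a sketch only: the struct fields are invented for illustration, and the free function `upsert` stands in for `Client::upsert` without any network call.

```rust
// Simplified stand-ins; the real option structs have different fields.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct InsertOptions {
    pub partition_name: String,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct UpsertOptions {
    pub partition_name: String,
    pub partial_update: bool,
}

impl From<InsertOptions> for UpsertOptions {
    fn from(o: InsertOptions) -> Self {
        UpsertOptions { partition_name: o.partition_name, partial_update: false }
    }
}

pub trait IntoUpsertOptions {
    fn into_upsert_options(self) -> UpsertOptions;
}

impl IntoUpsertOptions for UpsertOptions {
    fn into_upsert_options(self) -> UpsertOptions {
        self
    }
}

impl IntoUpsertOptions for InsertOptions {
    fn into_upsert_options(self) -> UpsertOptions {
        self.into()
    }
}

// One blanket impl covers Option<InsertOptions> and Option<UpsertOptions>.
impl<T: IntoUpsertOptions> IntoUpsertOptions for Option<T> {
    fn into_upsert_options(self) -> UpsertOptions {
        self.map(T::into_upsert_options).unwrap_or_default()
    }
}

/// Stand-in for Client::upsert; only the options plumbing is shown.
pub fn upsert<O: IntoUpsertOptions>(options: O) -> UpsertOptions {
    options.into_upsert_options()
}
```

Existing callers that pass `Some(InsertOptions { .. })` keep compiling under this sketch. A bare `None` still needs a type annotation such as `None::<InsertOptions>`, because nothing fixes `T`.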
tests/milvus26.rs
Outdated
| let _new_field = FieldSchema::new_varchar("description", "added later", 256); | ||
| // The field must be nullable for schema evolution | ||
| // We set nullable via the proto conversion path | ||
| // Currently the FieldSchema struct does not expose nullable directly | ||
| // but the AddCollectionField RPC requires it. | ||
| // This test verifies the RPC call works; full nullable support is a follow-up. | ||
|
|
This test currently does not exercise schema evolution: it creates _new_field but never calls client.add_collection_field(...), and the comments claim FieldSchema can’t set nullable even though FieldSchema::set_nullable() now exists. Update the test to set the new field nullable and actually invoke add_collection_field, then assert it appears in describe_collection/batch_describe_collections results.
| let _new_field = FieldSchema::new_varchar("description", "added later", 256); | |
| // The field must be nullable for schema evolution | |
| // We set nullable via the proto conversion path | |
| // Currently the FieldSchema struct does not expose nullable directly | |
| // but the AddCollectionField RPC requires it. | |
| // This test verifies the RPC call works; full nullable support is a follow-up. | |
| let new_field = FieldSchema::new_varchar("description", "added later", 256) | |
| .set_nullable(true); | |
| // Perform schema evolution by adding the new field | |
| client | |
| .add_collection_field(&collection_name, new_field.clone()) | |
| .await?; | |
| // Verify the new field appears in describe_collection results | |
| let described_schema = client.describe_collection(&collection_name).await?; | |
| assert!(described_schema.get_field("description").is_some()); | |
| // Verify the new field appears in batch_describe_collections results | |
| let described_schemas = client | |
| .batch_describe_collections(vec![collection_name.clone()]) | |
| .await?; | |
| assert_eq!(1, described_schemas.len()); | |
| assert!(described_schemas[0].get_field("description").is_some()); |
| /// Demonstrates new capabilities introduced in Milvus 2.6: | ||
| /// - COSINE metric with HNSW index | ||
| /// - Truncate collection | ||
| /// - Batch describe collections | ||
| /// - Schema evolution (add_collection_field) | ||
| /// - Partial upsert | ||
| /// - BM25 full-text search function | ||
| /// - Int8 vector field | ||
| /// - Timestamptz field |
The module-level docs list “Schema evolution (add_collection_field)”, but this example doesn’t demonstrate calling add_collection_field/schema evolution anywhere. Either add a short schema-evolution demo or remove/update the bullet to avoid misleading readers.
| name: this.name, | ||
| description: this.description, | ||
| enable_dynamic_field: self.enable_dynamic_field, | ||
| functions: this.functions, | ||
| }) |
In CollectionSchemaBuilder::build(), std::mem::replace(self, CollectionSchemaBuilder::new("".into(), "")) uses a String temporary where new() expects &str, which will not compile. Use string literals (e.g., CollectionSchemaBuilder::new("", "")) or otherwise pass &str references when resetting the builder.
src/schema.rs
Outdated
| @@ -604,6 +828,7 @@ impl CollectionSchemaBuilder { | |||
| name: this.name, | |||
| description: this.description, | |||
| enable_dynamic_field: self.enable_dynamic_field, | |||
CollectionSchemaBuilder::build() uses enable_dynamic_field: self.enable_dynamic_field after self has been reset via std::mem::replace, so the built CollectionSchema will always have enable_dynamic_field = false even if enable_dynamic_field() was called. Use the moved-out builder state (this.enable_dynamic_field) instead.
| enable_dynamic_field: self.enable_dynamic_field, | |
| enable_dynamic_field: this.enable_dynamic_field, |
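The effect of reading from `self` after `std::mem::replace` can be reproduced in isolation. The builder below is a reduced, hypothetical version with two fields; `build_buggy` mirrors the original line and `build` mirrors the suggested fix.

```rust
#[derive(Debug, PartialEq)]
pub struct CollectionSchema {
    pub name: String,
    pub enable_dynamic_field: bool,
}

// Reduced, hypothetical builder; the real one carries more state.
pub struct CollectionSchemaBuilder {
    name: String,
    enable_dynamic_field: bool,
}

impl CollectionSchemaBuilder {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_owned(), enable_dynamic_field: false }
    }

    pub fn enable_dynamic_field(&mut self) -> &mut Self {
        self.enable_dynamic_field = true;
        self
    }

    /// Mirrors the original code: `self` has already been reset here,
    /// so the flag read from it is always false.
    pub fn build_buggy(&mut self) -> CollectionSchema {
        let this = std::mem::replace(self, Self::new(""));
        CollectionSchema {
            name: this.name,
            enable_dynamic_field: self.enable_dynamic_field,
        }
    }

    /// Mirrors the suggestion: read the flag from the moved-out state.
    pub fn build(&mut self) -> CollectionSchema {
        let this = std::mem::replace(self, Self::new(""));
        CollectionSchema {
            name: this.name,
            enable_dynamic_field: this.enable_dynamic_field,
        }
    }
}
```

Calling `enable_dynamic_field()` and then `build_buggy()` returns a schema with the flag unset, which is the behavior the comment describes.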
Issue #113
…e upsert compatibility Signed-off-by: Wei Zang <[email protected]>
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.
| pub async fn upsert<S, O>( | ||
| &self, | ||
| collection_name: S, | ||
| fields_data: Vec<FieldColumn>, | ||
| options: Option<InsertOptions>, | ||
| options: O, | ||
| ) -> Result<crate::proto::milvus::MutationResult> | ||
| where | ||
| S: Into<String>, | ||
| O: IntoUpsertOptions, | ||
| { | ||
| let options = options.unwrap_or_default(); | ||
| let options = options.into_upsert_options(); |
Client::upsert now takes a generic options: O where O: IntoUpsertOptions. This makes common call sites like upsert(..., None) fail to compile due to type inference ambiguity (there are impls for both Option<InsertOptions> and Option<UpsertOptions>). Consider reverting the parameter to a concrete Option<UpsertOptions> (and keep From<InsertOptions> for UpsertOptions), or provide an additional overload/helper that preserves the previous Option<InsertOptions> signature while keeping None unambiguous.
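The first alternative in this comment, a concrete `Option<UpsertOptions>` parameter plus `From<InsertOptions>`, might be sketched as below. The struct fields are invented for illustration and the free function `upsert` stands in for `Client::upsert`.

```rust
// Simplified stand-ins; the real option structs have different fields.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct InsertOptions {
    pub partition_name: String,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct UpsertOptions {
    pub partition_name: String,
    pub partial_update: bool,
}

impl From<InsertOptions> for UpsertOptions {
    fn from(o: InsertOptions) -> Self {
        UpsertOptions { partition_name: o.partition_name, partial_update: false }
    }
}

/// Stand-in for Client::upsert with a concrete options type.
/// With no generic parameter, a bare `None` infers without help.
pub fn upsert(options: Option<UpsertOptions>) -> UpsertOptions {
    options.unwrap_or_default()
}

/// Callers that still hold InsertOptions convert explicitly.
pub fn upsert_with_insert_options(options: InsertOptions) -> UpsertOptions {
    upsert(Some(options.into()))
}
```

The trade-off in this sketch is that callers passing `Some(InsertOptions { .. })` must add `.into()`, in exchange for `upsert(None)` compiling without an annotation.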
src/collection.rs
Outdated
| pub async fn alter_collection_function<S>( | ||
| &self, | ||
| collection_name: S, | ||
| function_name: S, | ||
| function: proto::schema::FunctionSchema, | ||
| ) -> Result<()> | ||
| where | ||
| S: Into<String>, |
alter_collection_function uses a single generic type parameter S for both collection_name and function_name, which forces callers to pass both arguments as the same concrete type (e.g., both &str), reducing ergonomics and causing avoidable type mismatches. Use separate generics (e.g., C: Into<String>, F: Into<String>) for these two parameters.
| pub async fn alter_collection_function<S>( | |
| &self, | |
| collection_name: S, | |
| function_name: S, | |
| function: proto::schema::FunctionSchema, | |
| ) -> Result<()> | |
| where | |
| S: Into<String>, | |
| pub async fn alter_collection_function<C, F>( | |
| &self, | |
| collection_name: C, | |
| function_name: F, | |
| function: proto::schema::FunctionSchema, | |
| ) -> Result<()> | |
| where | |
| C: Into<String>, | |
| F: Into<String>, |
src/collection.rs
Outdated
| pub async fn drop_collection_function<S>( | ||
| &self, | ||
| collection_name: S, | ||
| function_name: S, | ||
| ) -> Result<()> | ||
| where | ||
| S: Into<String>, | ||
| { |
drop_collection_function uses a single generic type parameter S for both collection_name and function_name, forcing both arguments to have the same concrete type. This is unnecessarily restrictive for callers (e.g., mixing String and &str). Use separate generics for the two parameters.
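A small illustration of the difference, using free functions that only convert their arguments. Both function names are hypothetical and no RPC is issued.

```rust
/// Single generic: both arguments must have the same concrete type.
pub fn drop_function_single<S>(collection_name: S, function_name: S) -> (String, String)
where
    S: Into<String>,
{
    (collection_name.into(), function_name.into())
}

/// Separate generics: each argument converts independently.
pub fn drop_function_split<C, F>(collection_name: C, function_name: F) -> (String, String)
where
    C: Into<String>,
    F: Into<String>,
{
    (collection_name.into(), function_name.into())
}

// drop_function_single(String::from("docs"), "bm25") is rejected by the
// compiler: S is inferred as String from the first argument, so the &str
// second argument is a type mismatch.
```

With the split version, `drop_function_split(String::from("docs"), "bm25")` compiles, while the single-generic version accepts only same-typed pairs such as two `&str` values.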
| @@ -168,24 +172,55 @@ impl FieldSchema { | |||
| chunk_size: 0, | |||
| dim: 0, | |||
| max_length: 0, | |||
| nullable: false, | |||
| extra_type_params: HashMap::new(), | |||
| } | |||
| } | |||
|
|
|||
| #[deprecated(note = "use FieldSchema::empty() instead")] | |||
| pub fn const_default() -> Self { | |||
| Self::empty() | |||
| } | |||
FieldSchema::const_default used to be a const fn but is now a regular function (and deprecated). This is a breaking change for downstream users that relied on calling it in const contexts. If API compatibility is a goal, consider keeping a const fn constructor (even if it can only initialize a minimal/default state) or introducing a new const-safe constructor and keeping the old one as const fn for a deprecation cycle.
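One way a const-safe constructor could be kept is sketched below. It assumes the `const` qualifier was dropped because `HashMap::new()` is not a `const fn`, and it swaps the map for a `Vec` of pairs, which is a substitution made for this sketch and not the SDK's layout. The struct is a reduced, hypothetical `FieldSchema`.

```rust
// Reduced, hypothetical FieldSchema. extra_type_params is a Vec of pairs
// here because Vec::new() and String::new() are const fn, while
// HashMap::new() is not.
#[derive(Clone, Debug, PartialEq)]
pub struct FieldSchema {
    pub name: String,
    pub dim: i64,
    pub max_length: i32,
    pub nullable: bool,
    pub extra_type_params: Vec<(String, String)>,
}

impl FieldSchema {
    /// Usable in const contexts because every initializer is const.
    pub const fn const_default() -> Self {
        Self {
            name: String::new(),
            dim: 0,
            max_length: 0,
            nullable: false,
            extra_type_params: Vec::new(),
        }
    }
}

/// A const context that downstream code might rely on.
pub const EMPTY_FIELD: FieldSchema = FieldSchema::const_default();
```

Whether changing the storage type is acceptable depends on how `extra_type_params` is used elsewhere; the sketch only shows that a `const fn` constructor remains possible when all field initializers are const.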
src/data.rs
Outdated
| // Complex aggregate types: these represent the entire field data and are | ||
| // copied as a whole rather than pushed element by element. | ||
| (ValueVec::SparseFloat(dst), Value::SparseFloat(src)) => *dst = src.into_owned(), | ||
| (ValueVec::StructArray(dst), Value::StructArray(src)) => *dst = src.into_owned(), | ||
| (ValueVec::VectorArray(dst), Value::VectorArray(src)) => *dst = src.into_owned(), |
For complex column types (SparseFloat, StructArray, VectorArray), FieldColumn::push overwrites the destination value (*dst = src.into_owned()) rather than appending a single row. This breaks search/query result slicing, because result construction pushes one value per hit; overwriting will leave the full original column (or the last assignment) instead of a per-hit subset and can make column lengths inconsistent with topk. Consider implementing per-row extraction/append semantics for these types (or updating result assembly to handle these types without per-row push).
| // Complex aggregate types: these represent the entire field data and are | |
| // copied as a whole rather than pushed element by element. | |
| (ValueVec::SparseFloat(dst), Value::SparseFloat(src)) => *dst = src.into_owned(), | |
| (ValueVec::StructArray(dst), Value::StructArray(src)) => *dst = src.into_owned(), | |
| (ValueVec::VectorArray(dst), Value::VectorArray(src)) => *dst = src.into_owned(), | |
| // Complex aggregate types: these represent the entire field data rather than a | |
| // single row. Pushing them element-by-element is not supported, because it would | |
| // overwrite the stored aggregate and break result slicing. | |
| (ValueVec::SparseFloat(_), Value::SparseFloat(_)) | |
| | (ValueVec::StructArray(_), Value::StructArray(_)) | |
| | (ValueVec::VectorArray(_), Value::VectorArray(_)) => { | |
| panic!( | |
| "per-row push is not supported for SparseFloat/StructArray/VectorArray; \ | |
| construct these fields as full aggregates instead" | |
| ) | |
| } |
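The suggestion above rejects per-row pushes outright. The other option named in the comment, per-row append semantics, could be sketched as follows, assuming a sparse column can be stored as a list of rows. `Value`, `ValueVec`, and `SparseRow` here are simplified hypothetical types, not the SDK's definitions.

```rust
/// One sparse row: (dimension index, weight) pairs. Hypothetical layout.
pub type SparseRow = Vec<(u32, f32)>;

#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Long(i64),
    SparseFloat(SparseRow),
}

#[derive(Clone, Debug, PartialEq)]
pub enum ValueVec {
    Long(Vec<i64>),
    SparseFloat(Vec<SparseRow>),
}

impl ValueVec {
    /// Number of rows held by the column.
    pub fn len(&self) -> usize {
        match self {
            ValueVec::Long(v) => v.len(),
            ValueVec::SparseFloat(v) => v.len(),
        }
    }

    /// Appends exactly one row, so the column length tracks the number of
    /// pushes. A type mismatch is reported as an error instead of a panic.
    pub fn push(&mut self, value: Value) -> Result<(), String> {
        match (self, value) {
            (ValueVec::Long(dst), Value::Long(x)) => dst.push(x),
            (ValueVec::SparseFloat(dst), Value::SparseFloat(row)) => dst.push(row),
            (dst, v) => {
                return Err(format!("type mismatch: cannot push {:?} into {:?}", v, dst))
            }
        }
        Ok(())
    }
}
```

Under this layout, pushing one value per search hit leaves the column with one row per hit, which is the invariant result slicing relies on.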
Thanks for the contribution. I'll start reviewing the commits once I have time.
@richzw Thanks for your contribution. Please submit with DCO; see the contributing guide: https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.
Signed-off-by: Wei Zang <[email protected]>
Signed-off-by: Wei Zang <[email protected]>
Summary
- Update milvus-proto submodule from v2.6.1 to v2.6.13 and regenerate all proto Rust files
New Data Types
- […] GeometryArray) / WKT (GeometryWktArray)
- […] TimestamptzArray)
- […] SparseFloatArray
- Value enum: added Int8Vector, Float16Vector, BFloat16Vector, Geometry, GeometryWkt, Timestamptz, SparseFloat variants
- ValueVec: added Geometry, GeometryWkt, Timestamptz, SparseFloat, StructArray, VectorArray variants
- […] unimplemented!() panics remaining in value.rs and data.rs
New RPCs
- truncate_collection() -- clear data without dropping collection
- batch_describe_collections() -- describe multiple collections in one call
- add_collection_field() -- schema evolution (add nullable field)
- add/alter/drop_collection_function() -- server-side function management
- run_analyzer() -- test text analyzers, returns tokenized results
Search & Query Enhancements
- […] add_function() on CollectionSchemaBuilder)
- […] SearchOptions::highlighter(), SearchResult::highlight_results
- […] nq > 1)
Schema Improvements
- FieldSchema::set_nullable() -- required for schema evolution
- FieldSchema::add_type_param() -- extra params like enable_analyzer, enable_match
- CollectionSchemaBuilder::add_function() -- attach BM25/TextEmbedding/Rerank functions at creation
- extra_type_params round-trip through describe/create (not dropped on describe)
- […] is_function_output
Index Types
- Added: INVERTED, SPARSE_INVERTED_INDEX, SPARSE_WAND, RTREE, AutoIndex, DiskANN, GpuIvfFlat, GpuIvfPQ
- Added metrics: COSINE, BM25
Test Plan
- cargo build -- zero errors, deprecation warnings only
- cargo test --lib -- 2/2 unit tests pass
- cargo test --no-run -- all 21 test binaries compile
- cargo run --example milvus26_features -- all 7 demos pass against live Milvus 2.6.13: