Are we undercounting RAM used by vectors buffered in an in-memory segment? #15901

@mikemccand

Description

I'm looking at a fun segment trace of a nearly pure flat-vector (no HNSW graph) Lucene index (a production index for Amazon product search), and weirdly the flushed segments show ~112% RAM efficiency.

RAM efficiency measures size-on-disk of the newly flushed segment against size-in-RAM before it was flushed. For postings-heavy indices it's usually poor, maybe 20-30%: in-RAM postings storage carries overhead because it must grow/realloc as new documents are indexed, appending integers to tons of tiny int[] virtual slices (one, two, or three per unique term in the document being inverted). On disk those become well packed.
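The metric as described above, as a minimal sketch (the method name and the byte counts are illustrative, not Lucene API):

```java
// Sketch of the RAM efficiency metric: size-on-disk of the freshly
// flushed segment divided by its size in RAM just before flush.
// Method name and numbers are illustrative, not Lucene API.
public class RamEfficiency {

    static double ramEfficiency(long segmentBytesOnDisk, long bytesUsedInRam) {
        return (double) segmentBytesOnDisk / bytesUsedInRam;
    }

    public static void main(String[] args) {
        // A postings-heavy flush: grow/realloc overhead in RAM means the
        // well-packed on-disk segment is much smaller, e.g. ~25%.
        System.out.println(ramEfficiency(25_000_000L, 100_000_000L)); // 0.25
    }
}
```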

But for flat vectors, I think we just buffer the vectors in RAM until flush, so the RAM accounting ought to be simple?

This is with 7-bit quantization (aside: why do we now have both signed 7-bit and unsigned 8-bit KnnVectorsFormat options? Is there really a difference? Oh hmm, the javadocs say the 7-bit option exists for backwards compatibility ... shouldn't it be marked deprecated, so new Lucene indices don't choose it?). @msokolov pointed out that we increase index size by storing the one-byte-per-dimension quantized vectors in addition to the float32 vectors. But then I would have expected 125% RAM efficiency, so I'm confused why I see only about halfway there (~112%) ...
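The 125% expectation is simple back-of-the-envelope arithmetic, sketched here under the assumptions above: RAM buffers only the float32 vectors, while the flushed segment stores both the float32 vectors and a one-byte-per-dimension quantized copy (the class name and dimension count are made up for illustration):

```java
// If RAM buffers only the float32 vectors (4 bytes/dim) while the flushed
// segment stores both the float32 vectors and a 1-byte-per-dim quantized
// copy, the expected size-on-disk / size-in-RAM ratio is 5/4 = 125%,
// independent of the dimension count.
public class FlatVectorRatio {

    static double expectedEfficiency(int dims) {
        long ramPerVector  = 4L * dims;        // float32 buffer only
        long diskPerVector = 4L * dims + dims; // float32 + quantized copy
        return (double) diskPerVector / ramPerVector;
    }

    public static void main(String[] args) {
        System.out.println(expectedEfficiency(256)); // 1.25
    }
}
```

Since the per-vector vector-storage sizes cancel, any dimension count gives the same 125%, which is why the observed ~112% is puzzling.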

[Image: segment trace showing ~112% RAM efficiency for the flushed segments]
