These benchmarks show TopK's end-to-end query performance for hybrid vector search across different collection sizes and filter selectivity levels.
The metrics include median (p50), 95th percentile (p95), and 99th percentile (p99) latencies in milliseconds, as well as overall throughput in queries per second (QPS).
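As a rough illustration of how these metrics relate to raw measurements, the sketch below derives p50/p95/p99 and QPS from a list of per-query latencies. It assumes the nearest-rank percentile method and defines QPS as completed queries divided by wall-clock time; the benchmark harness itself is not described here, so `percentile` and `summarize` are hypothetical helper names, not part of TopK.

```python
import math


def percentile(sorted_ms, pct):
    """Nearest-rank percentile of an ascending list of latencies (ms)."""
    if not sorted_ms:
        raise ValueError("no latency samples")
    # Nearest rank: the smallest rank covering at least pct% of samples.
    rank = max(1, math.ceil(pct * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def summarize(latencies_ms, wall_clock_s):
    """Summarize one benchmark run (assumed definitions, see lead-in)."""
    ordered = sorted(latencies_ms)
    return {
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        # Throughput: completed queries per second of wall-clock time.
        "qps": len(ordered) / wall_clock_s,
    }
```

For example, `summarize(latencies, wall_clock_s=2.0)` on 100 samples reports the three latency percentiles alongside a throughput of 50 QPS.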
Selectivity is the fraction of the collection that is scanned, ranging from a full scan (100%) down to just 1% of vectors. Lower selectivity generally yields lower latency and higher throughput without affecting the quality of results.
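To make the selectivity definition concrete, here is a minimal sketch of the arithmetic. The collection size of 1,000,000 vectors is a made-up value for illustration, and both function names are hypothetical.

```python
def selectivity(scanned_vectors, total_vectors):
    """Fraction of the collection that is scanned."""
    return scanned_vectors / total_vectors


def vectors_scanned(total_vectors, selectivity_fraction):
    """Number of vectors scanned at a given selectivity."""
    return round(total_vectors * selectivity_fraction)


# Hypothetical collection size, for illustration only.
TOTAL = 1_000_000
full_scan = vectors_scanned(TOTAL, 1.00)    # selectivity 100%
narrow_scan = vectors_scanned(TOTAL, 0.01)  # selectivity 1%
```

Under these assumptions, a 1% selectivity query scans 100 times fewer vectors than a full scan over the same collection.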
| Parameters | Ingest + index |
| --- | --- |
| dim=768, k=10 | 1m3s |
| doc_non_zero=512, query_non_zero=32, k=10 | 6m |
| dim=768, k=10 | 10m30s |
| doc_non_zero=512, query_non_zero=32, k=10 | 15m50s |
| dim=768, k=10 | 1h44m |
| doc_non_zero=512, query_non_zero=32, k=10 | 2h52m |
| dim=768, k=10 | 17h12m |
| doc_non_zero=512, query_non_zero=32, k=10 | 29h50m |