These benchmarks show TopK's end-to-end query performance for nearest-neighbor search across different collection sizes and selectivity levels.
The metrics include median (p50), 95th percentile (p95), and 99th percentile (p99) latencies in milliseconds, as well as overall throughput in queries per second (QPS).
Selectivity is the fraction of the collection that a query scans, from a full scan (100%) down to just 1% of vectors.
Lower selectivity generally yields lower latency and higher throughput, but it requires effective filtering or indexing strategies.
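
As an illustration of how the reported metrics relate to raw measurements, the sketch below derives p50, p95, p99, and QPS from a list of per-query latencies. It is not TopK's benchmark harness: the `percentile` and `summarize` helpers, the nearest-rank percentile method, and the sample data are all assumptions made for this example.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, wall_clock_s):
    """Summarize a benchmark run: tail latencies in ms and throughput in queries per second."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "qps": len(latencies_ms) / wall_clock_s,
    }


# Hypothetical run: 100 queries with latencies of 1..100 ms, finished in 2 s of wall-clock time.
summary = summarize([float(i) for i in range(1, 101)], wall_clock_s=2.0)
print(summary)
```

QPS is computed from wall-clock time rather than from the sum of latencies, so it also reflects any concurrency in the client.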

Collection size: 1M documents

Selectivity | p50 (ms) | p95 (ms) | p99 (ms) | QPS |
---|---|---|---|---|
100% (Full Scan) | 6.8 | 9.1 | 13 | 54 |
10% (100K docs) | 5.25 | 7.3 | 10 | 61 |
1% (10K docs) | 4.9 | 6.7 | 9 | 63 |

Collection size: 10M documents

Selectivity | p50 (ms) | p95 (ms) | p99 (ms) | QPS |
---|---|---|---|---|
100% (Full Scan) | 16.5 | 20 | 25 | 36.5 |
10% (1M docs) | 11 | 15 | 19 | 46 |
1% (100K docs) | 9.5 | 13 | 16 | 49 |

Collection size: 100M documents

Selectivity | p50 (ms) | p95 (ms) | p99 (ms) | QPS |
---|---|---|---|---|
100% (Full Scan) | 24 | 34 | 41 | 27 |
10% (10M docs) | 18.5 | 29 | 34 | 34 |
1% (1M docs) | 17 | 27 | 33 | 36 |

Full scan (100% selectivity) by collection size:

Collection Size | p50 (ms) | p95 (ms) | p99 (ms) | QPS |
---|---|---|---|---|
1M | 6.8 | 9.1 | 13 | 54 |
10M | 16.5 | 20 | 25 | 36.5 |
100M | 24 | 34 | 41 | 27 |
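
To read the full-scan numbers as scaling factors, the short sketch below divides each row by the 1M baseline. The values are copied from the table above; the variable names are only illustrative.

```python
# Full-scan (100% selectivity) results copied from the table above.
results = {
    1_000_000: {"p50_ms": 6.8, "qps": 54},
    10_000_000: {"p50_ms": 16.5, "qps": 36.5},
    100_000_000: {"p50_ms": 24, "qps": 27},
}

baseline_size = 1_000_000
baseline = results[baseline_size]

ratios = {}
for size, row in results.items():
    ratios[size] = {
        "size_x": size // baseline_size,                           # growth in collection size
        "p50_x": round(row["p50_ms"] / baseline["p50_ms"], 2),     # growth in median latency
        "qps_x": round(row["qps"] / baseline["qps"], 2),           # remaining share of throughput
    }
    print(size, ratios[size])
```

On these figures, a 100x larger collection raises median full-scan latency by roughly 3.5x and halves QPS, so latency grows much more slowly than collection size.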

Queries per second (QPS) achieved with different numbers of concurrent clients and a single query path replica.
Higher QPS can be achieved by provisioning more query path replicas.
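
For rough capacity planning, a back-of-the-envelope estimate of the replica count might look like the sketch below. It assumes throughput scales linearly with the number of query path replicas, which the benchmarks here do not measure, and it treats the 54 QPS full-scan figure for the 1M collection as single-replica throughput. The `replicas_needed` helper and the 200 QPS target are hypothetical.

```python
import math


def replicas_needed(target_qps, qps_per_replica):
    """Estimate the replica count for a target QPS, assuming linear scaling (an assumption)."""
    if target_qps <= 0 or qps_per_replica <= 0:
        raise ValueError("QPS values must be positive")
    return math.ceil(target_qps / qps_per_replica)


# Hypothetical target of 200 QPS against an assumed 54 QPS per replica.
estimate = replicas_needed(target_qps=200, qps_per_replica=54)
print(estimate)
```

Real deployments may scale less than linearly, so an estimate like this is best treated as a starting point and checked with a load test.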