Benchmarks

These benchmarks show TopK's end-to-end query performance for hybrid vector search across different collection sizes and filter selectivity levels.

The metrics include median (p50), 95th percentile (p95), and 99th percentile (p99) latencies in milliseconds, as well as overall throughput in queries per second (QPS).
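
As an illustration of how these metrics relate to raw measurements, the sketch below derives p50, p95, p99, and QPS from a list of per-query latencies. It assumes the nearest-rank percentile method and a single measured wall-clock duration; the function names and the sample data are hypothetical and do not describe how TopK's benchmark harness actually computes its numbers.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    # Multiply before dividing so integer inputs give an exact rank.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms, wall_seconds):
    """Summarize one benchmark run (hypothetical helper, not part of any TopK API)."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        # Throughput: completed queries divided by elapsed wall-clock time.
        "qps": len(latencies_ms) / wall_seconds,
    }


# Synthetic latencies of 1..100 ms, purely for illustration.
sample_latencies = list(range(1, 101))
summary = summarize(sample_latencies, wall_seconds=2.0)
print(summary)
```

With several concurrent clients, QPS is measured over the shared wall-clock window, so it can rise even when per-query latency stays flat.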

Selectivity is the fraction of the collection that is scanned, ranging from a full scan (100%) down to just 1% of vectors. A lower selectivity percentage generally yields better performance without affecting the quality of results.
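
To make the term concrete, the following sketch computes the selectivity of a metadata filter as the share of documents that pass it. The document layout, the `bucket` field, and the predicates are assumptions made up for this example; they are not TopK's data model or query API.

```python
def selectivity(docs, predicate):
    """Fraction of documents that satisfy the filter predicate."""
    matched = sum(1 for doc in docs if predicate(doc))
    return matched / len(docs)


# Hypothetical collection: 1,000 documents spread evenly over 100 buckets.
docs = [{"id": i, "bucket": i % 100} for i in range(1000)]

full_scan = selectivity(docs, lambda d: True)            # every document passes
ten_percent = selectivity(docs, lambda d: d["bucket"] < 10)
one_percent = selectivity(docs, lambda d: d["bucket"] == 0)

print(f"{full_scan:.0%} {ten_percent:.0%} {one_percent:.0%}")
```

The three predicates correspond to the 100%, 10%, and 1% selectivity levels used in the charts.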

1M Document Collection

Dense Vector Search

dim=768, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 4.5 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4, 8, 16); y-axis from 0 to 800 QPS]

Ingest + index: 1m3s

Sparse Vector Search

doc_non_zero=512, query_non_zero=32, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 3.5 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4, 8, 16); y-axis from 0 to 2,400 QPS]

Ingest + index: 6m

10M Document Collection

Dense Vector Search

dim=768, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 18 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4, 8, 16); y-axis from 0 to 240 QPS]

Ingest + index: 10m30s

Sparse Vector Search

doc_non_zero=512, query_non_zero=32, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 16 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4, 8, 16); y-axis from 0 to 500 QPS]

Ingest + index: 15m50s

100M Document Collection

Dense Vector Search

dim=768, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 35 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4, 8, 16); y-axis from 0 to 110 QPS]

Ingest + index: 1h44m

Sparse Vector Search

doc_non_zero=512, query_non_zero=32, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 16 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4, 8, 16); y-axis from 0 to 220 QPS]

Ingest + index: 2h52m

1B Document Collection

Dense Vector Search

dim=768, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 65 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4); y-axis from 0 to 35 QPS]

Ingest + index: 17h12m

Sparse Vector Search

doc_non_zero=512, query_non_zero=32, k=10

[Figure: latency (ms) at p50, p95, and p99 by selectivity (100%, 10%, 1%); y-axis from 0 to 50 ms]
[Figure: QPS by number of concurrent clients (1, 2, 4, 8, 16); y-axis from 0 to 90 QPS]

Ingest + index: 29h50m