Retrieval-Augmented Generation

Ask questions over PDFs, scans, manuals, policies, and filings, and get answers tied back to the exact source. TopK handles parsing, OCR, embeddings, retrieval, and answer synthesis behind two primitives: upload and ask.

Producing an answer is easy. Making it reliable is the hard part.

Ask most tools a question and you'll always get an answer — confident, polished, and sometimes wrong. When the system misses the one place that holds the real answer, it fills the gap with something plausible, and nothing tells you it did. For the documents your business runs on — contracts, filings, policies, manuals — an answer no one can check is worse than no answer at all. The answer only matters if someone can verify it.

With TopK, a question goes in and a grounded, cited answer comes out:

[Demo: the question is broken into two sub-queries, then the answer is generated.]

Answer

NVIDIA grew 265% YoY to $22.1B in Q4 FY2024, far outpacing AMD's 24% growth to $7.7B:

  • NVIDIA Q4 FY2024: $22.1B revenue, +265% YoY [1]
  • AMD Q4 2024: $7.7B revenue, +24% YoY [2]
  • NVIDIA Data Center: $18.4B, +409% YoY — driven by AI chip demand [1]
  • AMD Data Center: $3.9B, +69% YoY [2]

TopK handles the parts that usually make document Q&A hard: parsing, OCR, embeddings, retrieval, and answer synthesis.

Build it with upload and ask

Whether you are building a search box, chat UI, workflow assistant, or internal agent, the flow is the same: upload your documents, then ask questions over them.

First, bring your documents into TopK. It parses, OCRs, and embeds every file on ingest — PDFs, scanned documents, and images, down to the tables and figures inside them — and they're ready to query shortly after.

topk upload --dataset service-manuals \
  "manuals/**/*.pdf"

Then ask, in plain language. Every answer comes back grounded in your documents, with its citations attached.

topk ask --dataset service-manuals \
  "What are the PPE requirements for liquid oxygen and nitrogen servicing?"

Finally, get an answer you can verify. It comes back as structured facts, the exact source each one cites, and a confidence score — so every claim traces back to the document it came from:

{
  "facts": [
    {
      "fact": "Wear a face shield or hard-hat shield combination for eye protection.",
      "ref_ids": ["1"]
    },
    {
      "fact": "Use a leather welder's gauntlet, or cloth gloves with inserts.",
      "ref_ids": ["2"]
    }
  ],
  "refs": {
    "1": { "doc_name": "LOX_LN2_Servicing_Manual.pdf", "doc_pages": [31] },
    "2": { "doc_name": "Cryogenic_Safety_Procedures.pdf", "doc_pages": [12] }
  },
  "confidence": 96.0
}
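To make the "every claim traces back" idea concrete, here is a minimal Python sketch that turns a response shaped like the one above into human-readable lines with their sources. It relies only on the fields shown in the example (facts, ref_ids, refs, doc_name, doc_pages, confidence). The function name render_answer and the min_confidence threshold are hypothetical, and the sketch assumes confidence is on a 0 to 100 scale, which the example suggests but does not state.

```python
import json

# Response copied from the example above; in a real app it would come from TopK.
RESPONSE = """
{
  "facts": [
    {"fact": "Wear a face shield or hard-hat shield combination for eye protection.",
     "ref_ids": ["1"]},
    {"fact": "Use a leather welder's gauntlet, or cloth gloves with inserts.",
     "ref_ids": ["2"]}
  ],
  "refs": {
    "1": {"doc_name": "LOX_LN2_Servicing_Manual.pdf", "doc_pages": [31]},
    "2": {"doc_name": "Cryogenic_Safety_Procedures.pdf", "doc_pages": [12]}
  },
  "confidence": 96.0
}
"""


def render_answer(response: dict, min_confidence: float = 90.0) -> list[str]:
    """Return one line per fact, each followed by the documents and pages it cites.

    min_confidence is a hypothetical, application-chosen cutoff, not a TopK setting.
    """
    if response["confidence"] < min_confidence:
        raise ValueError(f"confidence {response['confidence']} is below {min_confidence}")

    lines = []
    for item in response["facts"]:
        sources = []
        for ref_id in item["ref_ids"]:
            # A KeyError here would mean a fact cites a reference that is not in "refs".
            ref = response["refs"][ref_id]
            pages = ", ".join(str(page) for page in ref["doc_pages"])
            sources.append(f"{ref['doc_name']} p. {pages}")
        citation = "; ".join(sources)
        lines.append(f"{item['fact']} [{citation}]")
    return lines


for line in render_answer(json.loads(RESPONSE)):
    print(line)
```

Refusing to render below a threshold is one possible policy; an application could just as well show the answer with a warning and leave the decision to the reader.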

These examples use the CLI, but the same upload and ask flow is available in the Python SDK, TypeScript SDK, and MCP server. Pick the surface that fits your app; the document Q&A workflow stays the same.
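If you want to script the flow before choosing an SDK, the two CLI commands shown earlier can be driven from Python. The sketch below is not the Python SDK; it only builds the same argument lists as the commands above and runs them with the standard library. It assumes the topk binary is on your PATH and that the answer is written to standard output. The helper names upload_cmd, ask_cmd, run, and main are hypothetical.

```python
import subprocess


def upload_cmd(dataset: str, pattern: str) -> list[str]:
    """Build the argv for `topk upload`, mirroring the command shown earlier."""
    # The glob is passed through unexpanded, as in the quoted shell example.
    return ["topk", "upload", "--dataset", dataset, pattern]


def ask_cmd(dataset: str, question: str) -> list[str]:
    """Build the argv for `topk ask`, mirroring the command shown earlier."""
    return ["topk", "ask", "--dataset", dataset, question]


def run(cmd: list[str]) -> str:
    """Run a command without a shell and return whatever it printed to stdout."""
    # check=True raises CalledProcessError if the command exits with a non-zero status.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def main() -> None:
    # Requires the topk CLI to be installed; call main() yourself to run it.
    run(upload_cmd("service-manuals", "manuals/**/*.pdf"))
    answer = run(ask_cmd(
        "service-manuals",
        "What are the PPE requirements for liquid oxygen and nitrogen servicing?",
    ))
    print(answer)
```

Passing arguments as a list rather than a single shell string avoids quoting problems when a question contains quotes or other shell metacharacters.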

For questions that span many documents, the --mode research flag turns ask into an agent that plans its approach, works through your documents, and even runs the numbers itself, building up cited facts step by step before it answers.

Measurably more accurate

No tuning, no custom pipeline. On the ViDoRe V3 benchmark, judged by GPT-5 across four industry verticals, TopK File Search returns more accurate answers than Gemini File Search and Amazon Bedrock Knowledge Base in every category.

TopK File Search: 87%
Gemini File Search: 77.74%
Bedrock Knowledge Base: 71.38%

Answer accuracy judged by GPT-5 on ViDoRe V3 Industrial
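For readers comparing the three scores reported above, the gaps are differences in percentage points, not relative percentages. A small Python sketch of that arithmetic, using only the figures from this page (the margins helper is a hypothetical name):

```python
# Accuracy scores as reported on this page, in percent.
SCORES = {
    "TopK File Search": 87.0,
    "Gemini File Search": 77.74,
    "Bedrock Knowledge Base": 71.38,
}


def margins(scores: dict[str, float], baseline: str) -> dict[str, float]:
    """Return the baseline's lead over every other system, in percentage points."""
    base = scores[baseline]
    return {
        name: round(base - score, 2)  # round away floating-point noise
        for name, score in scores.items()
        if name != baseline
    }


print(margins(SCORES, "TopK File Search"))
# {'Gemini File Search': 9.26, 'Bedrock Knowledge Base': 15.62}
```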

Reliable document Q&A usually means owning the retrieval stack. With TopK, the core product loop is just upload and ask: ingest the source material, ask in plain language, and return answers with citations, confidence, and the controls needed to run it in production.

Build AI your users can trust.