Postgres is the lingua franca of databases. Decades of tools, drivers, and workflows are built around its wire protocol: psql, psycopg2, node-postgres, tokio-postgres, ORMs, dashboards, and more. They all expect a Postgres-shaped server on the other end. Until now, TopK was available only through our Python, JavaScript, and Rust SDKs.
Now it speaks Postgres too. Any client that talks to Postgres can connect to TopK and get state-of-the-art search quality as ordinary SQL:
    SELECT
      title,
      semantic_similarity(bio, 'an epic fantasy quest') AS score
    FROM books
    WHERE match_any(bio, 'dragon wizard')
    ORDER BY boost(score, published_year > 2010, 1.5) DESC
    LIMIT 10
Semantic search, keyword filtering, and metadata-aware ranking in one query, over a standard Postgres connection.
TopK SQL Specification
TopK SQL is a search-oriented dialect: Postgres-shaped where that helps, extended where search needs more. It supports schemaless tables with vector, sparse, and multi-vector types; SELECT with semantic, keyword, vector, and hybrid scoring; standard WHERE predicates plus search filters; INSERT / UPDATE / DELETE; and a Postgres-compatible wire protocol. For the full language reference, see the TopK SQL overview.
1.1 Schema(less)
Schemaless by default. A table has no fixed schema: rows can contain undeclared fields, a column can hold values of different types, and undeclared fields remain queryable and filterable. Declare a field when you want to index it or constrain its type.
CREATE TABLE defines the table and declared columns — indexes are declared inline on each column. DROP TABLE removes the table.
    CREATE TABLE books (
      title TEXT,
      published_year INTEGER,
      bio TEXT INDEX semantic_index(),
      embedding f32_vector(768) INDEX vector_index(metric = 'cosine')
    );
Declared columns are not the full document. Rows can include fields that never appeared in CREATE TABLE:
    INSERT INTO books (_id, title, rating, tags)
    VALUES ('earthsea', 'A Wizard of Earthsea', 4.8, ARRAY['fantasy', 'magic']);

    SELECT title, rating
    FROM books
    WHERE contains(tags, 'magic');
rating and tags were never declared; they are still stored, returned, and filterable.
1.2 Types
Standard Postgres scalar types (BOOLEAN, INT, FLOAT, TEXT, BYTEA, and JSONB) are supported, along with typed arrays such as BOOLEAN[], INT[], FLOAT[], and TEXT[]. We extend the type system with native support for the following vector and matrix shapes:
| Shape | Type | Precisions |
|---|---|---|
| Dense | *_vector(dim) | f32 f16 f8 u8 i8 |
| Sparse | *_sparse_vector | f32 f16 f8 u8 i8 |
| Multi-vector | *_matrix(dim) | f32 f16 f8 u8 i8 |
| Binary | binary_vector(dim) | 1-bit |
Vector values can be constructed with a JSON-string cast ('[...]'::f32_vector) or a constructor (f32_vector(ARRAY[...])).
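When vectors come from application code, the JSON-string cast can be produced with a small helper. The Python sketch below is illustrative: `f32_vector_literal` is a hypothetical name, and it assumes the cast accepts a standard JSON array of numbers, as the `'[...]'::f32_vector` form suggests.

```python
import json
import math

def f32_vector_literal(values):
    """Render a list of numbers as a '[...]'::f32_vector SQL literal."""
    floats = [float(v) for v in values]
    # Reject NaN and infinity, which are not valid JSON numbers.
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("vector components must be finite numbers")
    return f"'{json.dumps(floats)}'::f32_vector"

literal = f32_vector_literal([0.1, 0.2, 0.3])
```

Because every component is converted to a float before formatting, the helper cannot emit anything other than numbers inside the quoted literal.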
1.3 SELECT queries
Search queries in TopK SQL are SELECT statements built around scores. Semantic search, vector search, BM25, and multi-vector retrieval each produce scores that can be selected, aliased, combined, boosted, and sorted.
    SELECT
      _id,
      title,
      semantic_similarity(bio, 'tales of magic and adventure') AS score
    FROM books
    ORDER BY score DESC
    LIMIT 10;
The basic shape is: compute a relevance score, sort by it, and return the top results. From there, search and ranking can be tuned through composition: filter in WHERE, expose multiple scores in SELECT, and combine them in ORDER BY:
    SELECT
      _id,
      title,
      bm25_score() AS keyword_score,
      semantic_similarity(bio, 'an epic fantasy quest') AS semantic_score,
      vector_distance(embedding, '[...]'::f32_vector) AS vector_score
    FROM books
    WHERE match_any(bio, 'dragon wizard')
      AND published_year > 1950
    ORDER BY
      0.2 * keyword_score +
      0.5 * semantic_score +
      0.3 * vector_score DESC
    LIMIT 10;
This is hybrid search without multiple queries, client-side fusion, or reciprocal-rank fusion. Keyword matching contributes lexical relevance, semantic similarity contributes meaning, the vector score contributes similarity against your own embeddings, and the final ranking is a single SQL expression.
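To make the ORDER BY arithmetic concrete, here is a small Python sketch of the same weighted sum applied to rows that already carry the three scores. The rows and score values are invented for illustration, and the sketch assumes all three scores are oriented so that higher is better. It models the ranking expression only, not how TopK computes the individual scores.

```python
# Illustrative only: rows and score values are made up, not TopK output.
WEIGHTS = {"keyword_score": 0.2, "semantic_score": 0.5, "vector_score": 0.3}

def hybrid_score(row):
    """Weighted sum mirroring the ORDER BY expression in the query above."""
    return sum(weight * row[name] for name, weight in WEIGHTS.items())

def rank(rows, limit=10):
    """Sort by the combined score, highest first, and keep the top `limit`."""
    return sorted(rows, key=hybrid_score, reverse=True)[:limit]

rows = [
    {"_id": "a", "keyword_score": 1.0, "semantic_score": 0.2, "vector_score": 0.1},
    {"_id": "b", "keyword_score": 0.1, "semantic_score": 0.9, "vector_score": 0.8},
    {"_id": "c", "keyword_score": 0.5, "semantic_score": 0.5, "vector_score": 0.5},
]
top_ids = [row["_id"] for row in rank(rows)]
```

Row "a" has the strongest keyword match but ranks last, because the weights favour the semantic and vector signals.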
1.3.1 Search functions
TopK SQL exposes retrieval modes as scoring functions. Each function targets an index and returns a score. See the TopK SQL overview for the complete list of search functions and index types.
| Function | Index | Use it for |
|---|---|---|
| semantic_similarity(field, query) | semantic_index | query embedding, candidate generation, and reranking with Iso-ModernColBERT |
| vector_distance(field, vector) | vector_index | dense or sparse ANN against client-supplied vectors |
| multi_vector_distance(field, matrix) | multi_vector_index | late-interaction MaxSim retrieval |
| bm25_score() | keyword_index | keyword relevance from match_any(...) / match_all(...) predicates |
1.3.2 Filtering
Filters narrow the candidate set before ranking. TopK SQL supports standard predicates — comparisons, membership, text predicates, and regex — plus search-specific predicates.
    WHERE published_year > 2000
      AND in_print = true
      AND genre IN ('fantasy', 'fiction')
      AND match_any(bio, 'dragon wizard')
These predicates can be mixed in a single WHERE clause: ordinary metadata filters, text search predicates, regexes, list checks, and keyword predicates such as match_any().
1.3.3 Scoring
Scores are ordinary values. Alias them in SELECT, then combine them with arithmetic or ranking functions in ORDER BY.
The composition example above adds keyword, semantic, and vector scores directly. You can also fold metadata into the ranking expression:
ORDER BY boost(semantic_score, published_year > 2010, 1.5) DESC
Ranking stays inside the query: retrieval scores and metadata signals combine in one ordered expression instead of being merged in application code.
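A minimal Python sketch of what a conditional boost does to a ranking. It assumes boost(score, condition, factor) multiplies the score by factor when the condition holds and leaves it unchanged otherwise. That reading is inferred from the example, so check the TopK SQL overview for the exact definition. The documents and scores are invented.

```python
def boost(score, condition, factor):
    """Assumed semantics: scale the score when the condition is true."""
    return score * factor if condition else score

# Invented documents; semantic_score stands in for a score returned by TopK.
docs = [
    {"title": "Old classic", "published_year": 1968, "semantic_score": 0.80},
    {"title": "Recent release", "published_year": 2015, "semantic_score": 0.60},
]

ranked = sorted(
    docs,
    key=lambda d: boost(d["semantic_score"], d["published_year"] > 2010, 1.5),
    reverse=True,
)
titles = [d["title"] for d in ranked]
```

Under this assumption the newer book overtakes the older one even though its raw semantic score is lower.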
1.4 INSERT / UPDATE / DELETE
TopK SQL supports the same write operations you expect from Postgres. INSERT writes a full document, including undeclared fields, and has upsert semantics: inserting an existing _id replaces the document.
    INSERT INTO books (_id, title, published_year, tags)
    VALUES ('hobbit', 'The Hobbit', 1937, ARRAY['fantasy', 'adventure']);
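The upsert behaviour can be modelled with a dictionary keyed by _id. This is a toy in-memory model in Python, not TopK's storage layer: `insert` is a hypothetical helper standing in for INSERT. The point it shows is that a second insert with the same _id replaces the whole document rather than merging fields.

```python
table = {}  # toy stand-in for a TopK table, keyed by _id

def insert(table, doc):
    """Model INSERT with upsert semantics: same _id replaces the document."""
    table[doc["_id"]] = dict(doc)

insert(table, {"_id": "hobbit", "title": "The Hobbit", "published_year": 1937})
insert(table, {"_id": "hobbit", "title": "The Hobbit", "tags": ["fantasy"]})

# The first version is gone entirely, so published_year is no longer present.
stored = table["hobbit"]
```

If you only want to change some fields and keep the rest, that is the job of UPDATE rather than a second INSERT.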
UPDATE changes fields on documents identified by _id — either _id = '...' or _id IN (...):
UPDATE books SET in_print = true WHERE _id = 'hobbit';
DELETE removes the documents matched by a filter:
DELETE FROM books WHERE published_year < 1900;
Unlike UPDATE, DELETE accepts the same filter expressions as SELECT, so you can delete by ID or by any predicate.
1.5 Protocol
The SQL layer speaks the Postgres wire protocol, so standard clients can connect without custom adapters. psql, application drivers, prepared statements, and dashboard tools can all use the same endpoint.
TopK implements both simple and extended query modes. Connecting to TopK SQL requires only an API key in the connection password field — see the TopK SQL overview for connection setup.
psql "host=elastica.sql.topk.io password=<api-key>"
The host must be set to your desired region, in the format <region>.sql.topk.io. See the full list of supported regions at docs.topk.io/regions.
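If you build the connection string in code, the host format above reduces to a one-line helper. This Python sketch is illustrative: `topk_dsn` is a hypothetical name, and it assumes the API key contains no spaces or quotes that would need escaping in a libpq-style string.

```python
def topk_dsn(region, api_key):
    """Build a libpq-style connection string for a TopK SQL region."""
    if not region or not api_key:
        raise ValueError("region and api_key are both required")
    # Host follows the <region>.sql.topk.io format; the key goes in the password.
    return f"host={region}.sql.topk.io password={api_key}"

dsn = topk_dsn("elastica", "<api-key>")
```

The resulting string has the same shape as the psql argument shown earlier. Whether a given driver needs additional parameters is covered in the TopK SQL overview.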
1.5.1 Type resolution
One part of the Postgres protocol does matter for a schemaless database: column types are sent before rows. In Postgres this is natural because every selected column has a known type. In TopK, undeclared fields may not.
topk-sql tries to infer the type of each SELECT column in order:
    SELECT column
    ├─ explicit cast?       ──► Postgres OID (::int4, ::float8, ::text)
    ├─ declared column?     ──► type inferred from schema
    └─ unknown/mixed type?  ──► JSON
Standard Postgres drivers deserialize JSON values into native maps and lists. Use an explicit :: cast when you want a concrete wire type.
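The resolution order can be sketched as a small Python function. The function name and arguments are hypothetical, and the type names are plain labels. The actual topk-sql code maps to Postgres OIDs and may handle more cases.

```python
def resolve_wire_type(explicit_cast=None, declared_type=None):
    """Pick the type announced for one SELECT column, in priority order."""
    if explicit_cast is not None:   # e.g. rating::float8
        return explicit_cast
    if declared_type is not None:   # column declared in CREATE TABLE
        return declared_type
    return "json"                   # undeclared or mixed-type field

casted = resolve_wire_type(explicit_cast="float8", declared_type="text")
declared = resolve_wire_type(declared_type="text")
fallback = resolve_wire_type()
```

An explicit cast wins even on a declared column, which is why casting is the reliable way to pin down a wire type.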
1.5.2 Table catalog
Existing tables and their schemas can be inspected through information_schema.tables and information_schema.columns virtual tables. EXPLAIN returns the TopK query produced by the SQL parser, so you can see what will run before executing it.
1.6 Implementation
The parser, topk-sql, is open source at github.com/topk-io/topk, alongside topk-py, topk-js, and topk-rs. Like the SDKs, it is a thin mapping over the engine: it parses Postgres-flavored SQL into a TopK query rather than implementing a separate query planner. As a result, it provides the same semantics as the SDKs and benefits from every optimization we make to the planning and execution pipeline.
Pick your interface
SQL is a thin wrapper around TopK, not a second implementation of it. It's one more way in, next to the SDKs, mapping onto the same engine. The query you'd write in Python and the same query in SQL resolve to the same plan.
Reach for whichever fits where you're working: a notebook, a service, a dashboard, a psql prompt, or any JDBC-compatible tool. That means TopK can plug into the existing SQL ecosystem — from BI dashboards to federated query engines and warehouses.
Start today at console.topk.io, read the TopK SQL overview, or browse the parser source at github.com/topk-io/topk.