Full Text Search
Full text search allows you to find Convex documents that approximately match a search query.
Unlike normal document queries, search queries look within a string field to find the keywords. This can be used to build search features within your app like searching for messages that contain certain words.
Search queries are automatically reactive, consistent, transactional, and work seamlessly with pagination. They even include new documents created with a mutation!
Example: Search App
To use full text search you need to:
- Define a search index.
- Run a search query.
Search is currently a beta feature. If you have feedback or feature requests, let us know on Discord!
Defining search indexes
Like database indexes, search indexes are a data structure that is built in advance to enable efficient querying. Search indexes are defined as part of your Convex schema.
Every search index definition consists of:
- A name.
  - Must be unique per table.
- A searchField.
  - This is the field which will be indexed for full text search.
  - It must be of type string.
- [Optional] A list of filterFields.
  - These are additional fields that are indexed for fast equality filtering within your search index.
To add a search index onto a table, use the searchIndex method on your table's schema. For example, if you want an index which can search for messages matching a keyword in a channel, your schema could look like:
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  messages: defineTable({
    body: v.string(),
    channel: v.string(),
  }).searchIndex("search_body", {
    searchField: "body",
    filterFields: ["channel"],
  }),
});
You can specify search and filter fields on nested documents by using a dot-separated path like properties.name.
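As a sketch of the dot-separated path form, the schema below indexes a nested field. The products table, its properties object, and the index name search_name are hypothetical names chosen for illustration; only the searchIndex options themselves come from the text above.

```typescript
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  // Hypothetical table whose searchable text lives inside a nested object.
  products: defineTable({
    properties: v.object({
      name: v.string(),
      category: v.string(),
    }),
  }).searchIndex("search_name", {
    // Dot-separated paths point at fields of the nested object.
    searchField: "properties.name",
    filterFields: ["properties.category"],
  }),
});
```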
Running search queries
A query for "10 messages in channel '#general' that best match the query 'hello hi' in their body" would look like:
const messages = await ctx.db
  .query("messages")
  .withSearchIndex("search_body", (q) =>
    q.search("body", "hello hi").eq("channel", "#general"),
  )
  .take(10);
This is just a normal database read that begins by querying the search index!
The .withSearchIndex method defines which search index to query and how Convex will use that search index to select documents. The first argument is the name of the index and the second is a search filter expression. A search filter expression is a description of which documents Convex should consider when running the query.
A search filter expression is always a chained list of:
- 1 search expression against the index's search field defined with .search.
- 0 or more equality expressions against the index's filter fields defined with .eq.
Search expressions
Search expressions are issued against a search index, filtering and ranking documents by their relevance to the search expression's query. Internally, Convex will break up the query into separate words (called terms) and approximately rank documents matching these terms.
In the example above, the expression search("body", "hello hi") would internally be split into "hi" and "hello" and matched against words in your document (ignoring case and punctuation).
The behavior of search incorporates fuzzy and prefix matching rules.
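The term splitting described above can be approximated with a few lines of TypeScript. This is only a sketch: Convex's tokenizer is internal, and the function name splitIntoTerms and the ASCII-only punctuation rule are assumptions made for illustration.

```typescript
// Illustrative only: approximates "split into words, ignoring case and
// punctuation". Handles ASCII letters and digits; the real tokenizer may differ.
function splitIntoTerms(query: string): string[] {
  return query
    .toLowerCase()
    // Treat anything that is not a letter, digit, or whitespace as a separator.
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((term) => term.length > 0);
}

console.log(splitIntoTerms("Hello, hi!")); // [ 'hello', 'hi' ]
```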
Equality expressions
Unlike search expressions, equality expressions will filter to only documents that have an exact match in the given field. In the example above, eq("channel", "#general") will only match documents that have exactly "#general" in their channel field.
Equality expressions support fields of any type (not just text).
To filter to documents that are missing a field, use q.eq("fieldName", undefined).
Other filtering
Because search queries are normal database queries, you can also filter results using the .filter method!
Here's a query for "messages containing 'hi' sent in the last 10 minutes":
const messages = await ctx.db
  .query("messages")
  .withSearchIndex("search_body", (q) => q.search("body", "hi"))
  .filter((q) => q.gt(q.field("_creationTime"), Date.now() - 10 * 60000))
  .take(10);
For performance, always put as many of your filters as possible into .withSearchIndex.
Every search query is executed by:
- First, querying the search index using the search filter expression in withSearchIndex.
- Then, filtering the results one-by-one using any additional filter expressions.
Having a very specific search filter expression will make your query faster and less likely to hit Convex's limits because Convex will use the search index to efficiently cut down on the number of results to consider.
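The two execution steps above can be illustrated with a toy in-memory model. This is not Convex's implementation: the ChatMessage type, queryIndex, postFilter, and the sample data are all hypothetical, and the model only counts how many candidates reach the second step.

```typescript
// Toy model of the two execution steps; all names here are illustrative.
type ChatMessage = { body: string; channel: string; sentAt: number };

// Step 1: stand-in for the search index, which applies the search term and
// any equality filter together and returns only the surviving candidates.
function queryIndex(docs: ChatMessage[], term: string, channel?: string): ChatMessage[] {
  return docs.filter(
    (d) =>
      d.body.toLowerCase().split(/\s+/).includes(term) &&
      (channel === undefined || d.channel === channel),
  );
}

// Step 2: stand-in for .filter, which examines each candidate one by one.
function postFilter(
  candidates: ChatMessage[],
  predicate: (d: ChatMessage) => boolean,
): { results: ChatMessage[]; examined: number } {
  let examined = 0;
  const results = candidates.filter((d) => {
    examined += 1;
    return predicate(d);
  });
  return { results, examined };
}

// 2,000 hypothetical messages, alternating between two channels.
const docs: ChatMessage[] = Array.from({ length: 2000 }, (_, i) => ({
  body: "hi there",
  channel: i % 2 === 0 ? "#general" : "#random",
  sentAt: i,
}));
const recent = (d: ChatMessage) => d.sentAt >= 1990;

// Channel check done in step 2: every search match is examined.
const loose = postFilter(queryIndex(docs, "hi"), (d) => d.channel === "#general" && recent(d));
// Channel check done in step 1: only half of the matches reach step 2.
const tight = postFilter(queryIndex(docs, "hi", "#general"), recent);

console.log(loose.examined, tight.examined); // 2000 1000
console.log(loose.results.length === tight.results.length); // true
```

Both variants return the same documents, but the second hands far fewer candidates to the one-by-one filter, which is the effect the paragraph above describes.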
Retrieving results and paginating
Just like ordinary database queries, you can retrieve the results using .collect(), .take(n), .first(), and .unique().
Additionally, search results can be paginated using .paginate(paginationOpts).
Note that collect() will throw an exception if it attempts to collect more than the limit of 1024 documents. It is often better to pick a smaller limit and use take(n) or paginate the results.
Ordering
Search queries always return results in relevance order based on how well the document matches the search query. Other orderings of results are not supported.
Search Behavior
Fuzzy and Prefix Search
Convex full-text search is designed to power as-you-type search experiences. To achieve this, full-text search automatically applies fuzzy and prefix matching rules to search terms! This means that documents matched by a search query do not necessarily need to exactly match any of the query's terms.
Depending on the length of search terms, a fixed number of typos is permitted between matches. Typos are defined in terms of Levenshtein distance. The specific typo-tolerance rules are:
- Terms with length <= 4 allow no typos
- Terms with 5 <= length <= 8 allow 1 typo
- Terms with length > 8 allow 2 typos
For example, the expression search("body", "hello something") would match the following documents:
- "hillo"
- "somethan"
- "hallo somethan"
- "I left something in my car"
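The typo-tolerance rules above can be sketched as a length-based threshold on Levenshtein distance. The sketch assumes that terms of length 5 through 8 allow one typo, which is what the "hello" / "hillo" example implies. The function names are illustrative and this is not Convex's matching code.

```typescript
// Number of typos permitted for a query term, based on its length.
function allowedTypos(term: string): number {
  if (term.length <= 4) return 0;
  if (term.length <= 8) return 1;
  return 2;
}

// Standard dynamic-programming Levenshtein distance
// (insertions, deletions, and substitutions each cost 1).
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// A document word matches if it is within the typo budget of the query term.
function fuzzyMatches(queryTerm: string, docTerm: string): boolean {
  return levenshtein(queryTerm, docTerm) <= allowedTypos(queryTerm);
}

console.log(fuzzyMatches("hello", "hillo")); // true  (distance 1, 1 allowed)
console.log(fuzzyMatches("something", "somethan")); // true  (distance 2, 2 allowed)
console.log(fuzzyMatches("hi", "ho")); // false (distance 1, 0 allowed)
```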
In addition to typo-tolerance, the final search term has prefix search enabled, matching any term in a document that begins with the search term. For example, the expression search("body", "r") would match the documents:
- "rabbit"
- "Rakeeb searches"
- "send request"
- ...
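Prefix matching on the final term, as in the "r" example above, can be sketched as follows. To keep the example small, earlier terms are matched exactly and typo tolerance is left out. The helper names words and matchesWithPrefix are hypothetical, and a document is treated as a match when any query term matches.

```typescript
// Lowercase and split on anything that is not an ASCII letter or digit.
function words(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
}

// Only the final query term is treated as a prefix; earlier terms must match
// a whole word. Typo tolerance is deliberately omitted from this sketch.
function matchesWithPrefix(query: string, body: string): boolean {
  const terms = words(query);
  const docWords = words(body);
  return terms.some((term, i) => {
    const isFinal = i === terms.length - 1;
    return docWords.some((w) => (isFinal ? w.startsWith(term) : w === term));
  });
}

console.log(matchesWithPrefix("r", "rabbit")); // true
console.log(matchesWithPrefix("r", "Rakeeb searches")); // true
console.log(matchesWithPrefix("r", "send request")); // true
console.log(matchesWithPrefix("re hello", "send request")); // false: "re" is not the final term
```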
Relevance order
Relevance order is subject to change: the relevance of search results and the exact typo-tolerance rules Convex applies may change to improve the quality of search results.
Search queries return results in relevance order. Internally, Convex ranks the relevance of a document based on a combination of its BM25 score and several other criteria such as the number of typos of matched terms in the document, the proximity of matches, the number of exact matches, and more. The BM25 score takes into account:
- How many words in the search query appear in the field?
- How many times do they appear?
- How long is the text field?
If multiple documents have the same score, the newest documents are returned first.
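To make the three BM25 factors listed above concrete, here is a textbook BM25 scorer. The constants k1 = 1.2 and b = 0.75 and the IDF variant are common defaults assumed for illustration. Convex's internal scoring combines BM25 with other criteria and may use different parameters, so treat this as a sketch of the general technique only.

```typescript
// Textbook BM25 over pre-tokenized documents. k1, b, and the IDF formula are
// assumed defaults, not values confirmed for Convex.
function bm25Score(
  queryTerms: string[],
  doc: string[],
  corpus: string[][],
  k1 = 1.2,
  b = 0.75,
): number {
  const avgLen = corpus.reduce((sum, d) => sum + d.length, 0) / corpus.length;
  let score = 0;
  for (const term of queryTerms) {
    // How many times does the term appear in this document?
    const tf = doc.filter((w) => w === term).length;
    if (tf === 0) continue;
    // Rarer terms across the corpus contribute more.
    const docsWithTerm = corpus.filter((d) => d.includes(term)).length;
    const idf = Math.log(1 + (corpus.length - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
    // Longer text fields are penalized through length normalization.
    const norm = tf + k1 * (1 - b + (b * doc.length) / avgLen);
    score += idf * ((tf * (k1 + 1)) / norm);
  }
  return score;
}

const sampleCorpus = [
  ["hello", "hi"], // both query terms, short field
  ["hello", "hello", "world"], // one query term, repeated
  ["hello", "from", "a", "much", "longer", "message", "body"], // one term, long field
  ["unrelated", "text"], // no query terms
];
const scores = sampleCorpus.map((d) => bm25Score(["hello", "hi"], d, sampleCorpus));

// Scores decrease from the first document to the last; the last one is 0.
console.log(scores.map((s) => s.toFixed(2)));
```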
Limits
Search indexes must have:
- Exactly 1 search field.
- Up to 16 filter fields.
Search indexes count against the limit of 32 indexes per table.
Search queries can have:
- Up to 16 terms (words) in the search expression.
- Up to 8 filter expressions.
Additionally, search queries can scan up to 1024 results from the search index.
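If you build search expressions from user input, a small pre-flight check against the query limits above can be handy. The helper below is hypothetical: checkSearchLimits is not part of Convex, and counting terms by splitting on whitespace only approximates how Convex counts them.

```typescript
// Limits quoted from the list above.
const MAX_TERMS = 16;
const MAX_FILTERS = 8;

// Hypothetical helper: returns a list of human-readable limit violations.
function checkSearchLimits(queryText: string, filterCount: number): string[] {
  const problems: string[] = [];
  // Whitespace splitting is an approximation of Convex's term counting.
  const termCount = queryText.split(/\s+/).filter((t) => t.length > 0).length;
  if (termCount > MAX_TERMS) {
    problems.push(`too many terms: ${termCount} > ${MAX_TERMS}`);
  }
  if (filterCount > MAX_FILTERS) {
    problems.push(`too many filter expressions: ${filterCount} > ${MAX_FILTERS}`);
  }
  return problems;
}

console.log(checkSearchLimits("hello hi", 1)); // []
console.log(checkSearchLimits(Array(17).fill("word").join(" "), 9).length); // 2
```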
For information on other limits, see here.
Coming soon
We plan to expand Convex full text search to include:
- Snippeting (returning the matched positions in the search text)
- Faceted search (counting the number of results per filter field value)
- Additional languages
- Searching across multiple fields
If any of these features is important for your app, let us know on Discord!