Using Convex with Airbyte

Airbyte is a data-integration platform that allows you to sync your Convex data into other databases. This can be useful for handling queries and workloads that aren't supported by Convex directly. Some example use cases include:

  1. Analytics
    • Convex isn't optimized for queries that load huge amounts of data. A data platform like Databricks or Snowflake is more appropriate.
  2. Search
    • Convex doesn't yet efficiently support queries that look for substrings in text fields. Databases like Elasticsearch are designed for these queries.
  3. Machine Learning training
    • Convex isn't optimized for queries running computationally intensive machine learning algorithms.

Airbyte lets you replicate individual Convex tables into any of Airbyte's supported destinations.

Connecting to Airbyte

Contact Convex via email or Discord to request Airbyte support for your account.

If you haven't already, get started with Airbyte. You can use Airbyte Open Source for free, or pay Airbyte to host the pipeline on their cloud.

Within Airbyte, create a new Source. Select "Convex" from the dropdown.

  1. Go to the Convex Dashboard.
  2. Find your project and click "Settings".
  3. Copy the "Deploy key".
  4. Paste the deploy key as access_key in the Airbyte form.
  5. On the same Settings page in the Convex Dashboard, copy the "Deployment URL".
  6. Paste the deployment URL into the Airbyte form.
  7. Click "Set up source".
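
The two values collected in the steps above can be sanity-checked before pasting them into the form. The following is a minimal sketch, not part of Airbyte or Convex: the helper name build_convex_source_config is hypothetical, access_key is the field name mentioned above, and deployment_url is an assumed name for the deployment URL field. The URL and key shown are placeholders.

```python
from urllib.parse import urlparse


def build_convex_source_config(deployment_url: str, access_key: str) -> dict:
    """Return a source configuration dict after basic sanity checks.

    `access_key` is the field name used in the Airbyte form;
    `deployment_url` is an assumed name for the deployment URL field.
    """
    # Normalize the URL: trim whitespace and any trailing slash.
    url = deployment_url.strip().rstrip("/")
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("deployment URL should be a full https:// URL")

    key = access_key.strip()
    if not key:
        raise ValueError("deploy key must not be empty")

    return {"deployment_url": url, "access_key": key}


config = build_convex_source_config(
    "https://example-deployment-123.convex.cloud/",  # placeholder URL
    "placeholder-deploy-key",  # placeholder; never commit a real deploy key
)
```

A deploy key grants access to your deployment's data, so treat it like a password and keep it out of version control.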

You have now connected Convex to Airbyte! Follow the prompts to connect Airbyte to a data destination, such as Databricks or Elasticsearch.

Choose a replication frequency, which is how often data will be synced out of Convex and into the destination.

Choose the Convex tables to sync and how each one should be synced. Airbyte has several sync modes. Convex supports all of them and recommends "Incremental + Deduped History", which reduces sync times and makes the destination reflect document edits and deletes.
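
To illustrate why an incremental, deduplicated sync keeps the destination in step with edits and deletes, here is a conceptual sketch. It is not Airbyte's or Convex's implementation, and the record shape is an assumption made for illustration: each change carries a hypothetical "_id" (primary key), "_ts" (an increasing cursor value), and an optional "_deleted" flag.

```python
from typing import Any


def apply_changes(
    destination: dict[str, dict[str, Any]],
    changes: list[dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Apply a batch of change records to a table keyed by document id.

    Hypothetical record fields (assumptions, not a real connector schema):
      "_id"      - document id, used as the primary key
      "_ts"      - increasing cursor value; larger means newer
      "_deleted" - True when the document was deleted in the source
    """
    result = dict(destination)
    # Process oldest first so the newest version of each document wins.
    for change in sorted(changes, key=lambda c: c["_ts"]):
        doc_id = change["_id"]
        current = result.get(doc_id)
        if current is not None and current["_ts"] >= change["_ts"]:
            continue  # destination already holds this version or a newer one
        if change.get("_deleted"):
            result.pop(doc_id, None)  # propagate the delete
        else:
            result[doc_id] = change  # insert or overwrite (dedupe by id)
    return result


destination = {
    "a": {"_id": "a", "_ts": 1, "name": "Ada"},
    "b": {"_id": "b", "_ts": 2, "name": "Bob"},
}
changes = [
    {"_id": "a", "_ts": 5, "name": "Ada L."},        # later edit of "a"
    {"_id": "a", "_ts": 3, "name": "Ada Lovelace"},  # earlier edit of "a"
    {"_id": "b", "_ts": 4, "_deleted": True},        # "b" was deleted
    {"_id": "c", "_ts": 6, "name": "Cy"},            # new document
]
synced = apply_changes(destination, changes)
next_cursor = max(c["_ts"] for c in changes)  # where the next sync would resume
```

Only records newer than the saved cursor need to be read on each run, which is what keeps incremental syncs short. Deduplicating by id means the destination holds one row per live document rather than one row per change.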

Click "Set up connection" to start syncing. Once the data has synced, you can query your Convex data within the destination.
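
As a rough illustration of the kinds of queries this enables, the sketch below uses an in-memory SQLite database as a stand-in for a real destination. The messages table, its columns, and its rows are all hypothetical; the actual schema depends on your Convex tables and on how the destination lays out synced data.

```python
import sqlite3

# In-memory SQLite stands in for a real destination such as a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (_id TEXT PRIMARY KEY, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        ("m1", "ada", "hello world"),
        ("m2", "bob", "say hello"),
        ("m3", "ada", "goodbye"),
    ],
)

# Analytics-style aggregate: number of messages per author.
counts = conn.execute(
    "SELECT author, COUNT(*) FROM messages GROUP BY author ORDER BY author"
).fetchall()

# Substring search over a text field.
matches = conn.execute(
    "SELECT _id FROM messages WHERE body LIKE ? ORDER BY _id",
    ("%hello%",),
).fetchall()

conn.close()
```

Because these queries run against the destination, they put no load on your Convex deployment.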