
Connect to CrateDB

Connection Modes: Direct, SSH Tunnel, Agents

Connection

Cluvio supports connecting to self-hosted as well as managed CrateDB databases, such as those in CrateDB Cloud. To connect to a CrateDB database, select Add Datasource on the datasources overview page to open the Create Datasource dialog, then select CrateDB as the database type.


The Name is a required human-friendly identifier or description of the datasource in Cluvio. Datasource names need not be unique, but we recommend giving each datasource a unique and meaningful name for ease of identification, especially if your organization uses multiple datasources.

The Host is required and must contain a DNS name or IP address. The Port defaults to 5432, the standard port of CrateDB's PostgreSQL wire protocol, but can be changed if your CrateDB server is listening on a custom port.

The Database Name must contain the name of the database to connect to. Cluvio will fetch schema information for this database in order to populate the almanach in the report editor.

The Username and Password are used by Cluvio to authenticate the connections to your database. The database user must have access to the database specified as the Database Name. The password is required if the database user is configured to require password authentication. When using the Direct connection mode, password authentication together with a verified SSL/TLS connection is strongly recommended in order to secure your database connection.

The Connection Mode specifies how Cluvio connects to your database. A direct connection should generally require SSL/TLS as well as certificate verification. For certificate verification to succeed, the database server certificate must be issued by a well-known certificate provider whose root certificates are known to Cluvio. The list of trusted root certificates is regularly updated.
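Before entering the details in Cluvio, it can help to confirm that the same host, port, credentials and TLS settings work from a PostgreSQL-protocol client such as psql. The following is a minimal sketch, not part of Cluvio, that assembles a libpq-style connection URI with sslmode=verify-full, the libpq setting that requires TLS and verifies the server certificate and host name. The function name build_postgres_uri and all host, user and password values are hypothetical placeholders. How the client locates trusted root certificates depends on your client setup.

```python
from urllib.parse import quote


def build_postgres_uri(host, database, user, password, port=5432,
                       sslmode="verify-full"):
    """Build a libpq-style connection URI for a PostgreSQL-protocol client.

    host is assumed to be a DNS name or IPv4 address (IPv6 literals would
    additionally need square brackets).
    """
    # Percent-encode credentials and database name so that characters such
    # as '@', ':' or '/' cannot break the URI structure.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return (
        f"postgresql://{credentials}@{host}:{port}/"
        f"{quote(database, safe='')}?sslmode={sslmode}"
    )


if __name__ == "__main__":
    # Placeholder values only; replace them with your own settings.
    print(build_postgres_uri("crate.example.com", "crate",
                             "cluvio_reader", "s3cr@t"))
```

The resulting URI can be passed to a client that understands libpq connection URIs. If that client cannot connect with certificate verification enabled, a direct connection from Cluvio with verification is unlikely to succeed either.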

If you use an SSH tunnel, configure the tunnel user, host and port accordingly. If you use a Cluvio Agent, select one or more agents through which Cluvio will connect to your database.
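To illustrate what an SSH tunnel does, the sketch below builds the arguments for an equivalent OpenSSH local port forward that you could run yourself to check that the SSH server can reach the database. It does not describe how Cluvio implements its tunnel. The function name ssh_forward_command, the local port 15432 and the example hosts are hypothetical.

```python
import shlex


def ssh_forward_command(ssh_user, ssh_host, db_host, ssh_port=22,
                        db_port=5432, local_port=15432):
    """Return the argument list for an OpenSSH local port forward.

    Connections to localhost:local_port are forwarded through the SSH
    server to db_host:db_port, as seen from the SSH server.
    """
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-p", str(ssh_port),
        "-L", f"{local_port}:{db_host}:{db_port}",
        f"{ssh_user}@{ssh_host}",
    ]


if __name__ == "__main__":
    # Placeholder values; 10.0.0.12 stands for the database's private address.
    args = ssh_forward_command("tunnel", "bastion.example.com", "10.0.0.12")
    print(shlex.join(args))
```

While such a forward is running, a database client pointed at localhost on the chosen local port reaches the database through the SSH server, which mirrors the path a tunneled connection takes.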

When you have entered all the required information, select Test Connection to check that Cluvio can connect to your database. The connection test will report errors if the connection fails. See Troubleshooting for common problems.

Cloud Databases

To establish a direct connection to a CrateDB instance running in managed cloud infrastructure, the database must have a public IP address and allow inbound connections from Cluvio's static IP addresses.

To establish a connection through an SSH tunnel, the database server must allow inbound connections from the SSH server, which should itself be in a secure private network and allow inbound connections from Cluvio's static IP addresses.

To establish a connection through a Cluvio Agent, the only requirements are that the Cluvio Agent can establish outbound TCP connections and that the host running the agent is in a secure private network with access to the database.

Local Databases

Cluvio is generally used with databases running on servers, whether on cloud providers (AWS, Azure, Google Cloud, ...) or within your own server infrastructure. However, you can also connect Cluvio to a database running on your PC or laptop by installing and running a Cluvio Agent locally. A Cluvio Agent is open-source software that can run on your local computer and connects to Cluvio over HTTPS. Note that the SQL queries run in Cluvio can only succeed if your computer is running and the Cluvio Agent is running and connected to Cluvio.

Configuration

The Configuration tab of the datasource dialog shows settings that affect the datasource's behavior.


The Data Time Zone defaults to UTC and is the time zone that Cluvio assumes for any timestamps returned from queries that do not contain time zone information. See Data Time Zone for details.
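The effect of this setting can be pictured with a small sketch: a timestamp without time zone information is interpreted as being in the data time zone, while a timestamp that already carries an offset is left as it is. This is an illustration of the concept only, not Cluvio's implementation, and the function name assume_time_zone is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def assume_time_zone(ts, data_tz=timezone.utc):
    """Attach data_tz to a naive timestamp; leave aware timestamps unchanged."""
    if ts.tzinfo is None or ts.utcoffset() is None:
        # No zone information: interpret the wall-clock value in data_tz.
        return ts.replace(tzinfo=data_tz)
    return ts


if __name__ == "__main__":
    naive = datetime(2024, 3, 1, 12, 0, 0)

    # With the default (UTC), 12:00 is taken to mean 12:00 UTC.
    print(assume_time_zone(naive).isoformat())

    # With a data time zone of UTC+2, the same value means 10:00 UTC.
    plus_two = timezone(timedelta(hours=2))
    print(assume_time_zone(naive, plus_two).astimezone(timezone.utc).isoformat())
```

The same stored value therefore denotes a different instant depending on the configured data time zone, which is why the setting should match the zone in which your naive timestamps were written.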

The Maximum number of concurrent query executions controls the maximum concurrency that Cluvio allows for the datasource. This setting can be used to limit the load on your database. The default is 20.
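As a rough mental model of a concurrency cap, the sketch below runs a batch of tasks while a semaphore ensures that no more than a fixed number execute at the same time; the remaining tasks wait until a slot is free. This is a generic illustration under that assumption and says nothing about how Cluvio schedules queries internally. The function name run_with_limit is hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_limit(tasks, max_concurrent):
    """Run callables so that at most max_concurrent execute at once.

    Returns (results, peak), where peak is the highest number of tasks
    observed running at the same time.
    """
    gate = threading.Semaphore(max_concurrent)
    lock = threading.Lock()
    running = 0
    peak = 0

    def guarded(task):
        nonlocal running, peak
        with gate:  # blocks while max_concurrent tasks are already running
            with lock:
                running += 1
                peak = max(peak, running)
            try:
                return task()
            finally:
                with lock:
                    running -= 1

    # More workers than the limit, so the semaphore is what enforces the cap.
    with ThreadPoolExecutor(max_workers=max(1, 2 * max_concurrent)) as pool:
        results = list(pool.map(guarded, tasks))
    return results, peak


if __name__ == "__main__":
    def fake_query(i):
        time.sleep(0.05)  # stands in for query execution time
        return i

    queries = [lambda i=i: fake_query(i) for i in range(10)]
    results, peak = run_with_limit(queries, max_concurrent=3)
    print(results, "peak concurrency:", peak)
```

A lower limit reduces the peak load on the database at the cost of queries waiting longer when many are requested at once, for example when a large dashboard is loaded.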

The toggle Update schema nightly controls whether Cluvio queries your database schema nightly to ensure that the almanach in the report editor has up-to-date information on your database schema. Together with fetching schema information, Cluvio also tries to retrieve approximate row counts in each table. If you disable nightly schema updates, the almanach is only updated when you manually trigger a schema refresh on the datasource from the Cluvio datasources overview. Nightly schema updates are enabled by default.

The toggle Update exact row counts is only available when Update schema nightly is enabled. This setting controls whether Cluvio will determine exact row counts for every table in your database schema. Row counts are shown in the report editor almanach. Determining exact row counts usually involves issuing a COUNT(*) query on each table, which may cause undesirable load on your database. You can disable this setting to avoid these nightly queries. When disabled, the tables in the report editor almanach may not show row count information if the database does not provide approximate row counts.
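To get a sense of the load involved, the sketch below generates one COUNT(*) statement per table, which is the kind of query the paragraph above refers to. The exact statements Cluvio issues are not documented here, so treat this purely as an illustration. The function names and the example tables are hypothetical; doc is CrateDB's default schema.

```python
def quote_identifier(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def exact_count_queries(tables):
    """Yield one COUNT(*) statement per (schema, table) pair."""
    for schema, table in tables:
        yield (
            f"SELECT COUNT(*) FROM "
            f"{quote_identifier(schema)}.{quote_identifier(table)}"
        )


if __name__ == "__main__":
    # Placeholder table list; a real schema may contain many more tables.
    tables = [("doc", "orders"), ("doc", "customers")]
    for statement in exact_count_queries(tables):
        print(statement)
```

One statement per table means the nightly cost grows with both the number of tables and their size, which is the reason to disable exact row counts on large schemas.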

Troubleshooting

Firewalls

The most common connectivity problems with CrateDB databases result from firewall configurations when using a direct connection or an SSH tunnel. Please make sure that your firewall (or "security group" in AWS) permits inbound connections from Cluvio's static IP addresses.
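A quick way to narrow down such problems is to check whether the database port accepts TCP connections at all. The sketch below is a minimal reachability check; the function name can_connect and the example host are hypothetical. Note that a successful check from your own machine only shows that the server is listening and reachable from where you run it. It does not prove that the firewall admits Cluvio's static IP addresses.

```python
import socket


def can_connect(host, port=5432, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Placeholder host; replace it with your database or SSH server.
    host = "crate.example.com"
    print(host, "reachable:", can_connect(host, 5432))
```

A refused connection usually means nothing is listening on that port, whereas a timeout more often points to a firewall silently dropping packets.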