Connect to Presto
Connection Modes: Direct — SSH Tunnel — Agents
Connection
To connect to a Presto database server, select Add Datasource on the datasources overview page to open the Create Datasource dialog and select Presto as the database type.
The Name is a required human-friendly identifier or description of the datasource in Cluvio. Datasource names need not be unique, but we recommend giving each datasource a unique and meaningful name for ease of identification, especially if your organization uses multiple datasources.
The Host is required and must contain a DNS name or IP address. The Port defaults to the Presto standard port 8080 but can be changed if your Presto server is listening on a custom port.
The Database Name must contain the name of the database to connect to. Cluvio will fetch schema information for this database in order to populate the almanach in the report editor.
The Username and Password are used by Cluvio to authenticate the connections to your database. The database user must have access to the database specified as the Database Name. The password is required if the database user is configured to require password authentication. When using the Direct connection mode, password authentication together with a verified SSL/TLS connection is strongly recommended in order to secure your database connection.
The Connection Mode specifies how Cluvio connects to your database. A direct connection should generally require SSL/TLS as well as certificate verification. For certificate verification to succeed, the database server certificate must be issued by a well-known certificate provider whose root certificates are known to Cluvio. The list of trusted root certificates is regularly updated.
If you use an SSH tunnel, configure the tunnel user, host, and port accordingly. If you use a Cluvio Agent, select one or more agents through which Cluvio will connect to your database.
When you have entered all the required information, select Test Connection to check that Cluvio can connect to your database. The connection test will report errors if the connection fails. See Troubleshooting for common problems.
Cloud Databases
To establish a direct connection to a Presto instance running in managed cloud infrastructure, like Amazon EC2, the server must have a public IP address and allow inbound connections from Cluvio's static IP addresses.
To establish a connection through an SSH tunnel, the database server must allow inbound connections from the SSH server, which should itself be in a secure private network and allow inbound connections from Cluvio's static IP addresses.
To establish a connection through a Cluvio Agent, all that is required is that the Cluvio agent can establish outbound TCP connections and that the host running the Cluvio Agent is in a secure private network with access to the database.
Local Databases
Cluvio is generally used with databases running on servers, whether on cloud providers (AWS, Azure, Google Cloud, ...) or within your own server infrastructure. However, you can also connect Cluvio to a database running on your PC or laptop by installing and running a Cluvio Agent locally. A Cluvio Agent is open-source software that can run on your local computer and connects to Cluvio over HTTPS. Note that the SQL queries run in Cluvio can only succeed if your computer is running and the Cluvio Agent is running and connected to Cluvio.
Configuration
The Configuration tab of the datasource dialog shows settings that affect the datasource's behavior.
The Data Time Zone defaults to UTC and is the time zone that Cluvio assumes for any timestamps returned from queries that do not contain time zone information. See Data Time Zone for details.
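The effect of this setting can be illustrated with a minimal Python sketch. It is not Cluvio's implementation; the function name `interpret_timestamp` and its parameters are hypothetical. The sketch assumes that a timestamp without time zone information is taken to be in the configured data time zone, while a timestamp that already carries a time zone is left unchanged.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo


def interpret_timestamp(ts: datetime, data_time_zone: str = "UTC") -> datetime:
    """Attach the data time zone to naive timestamps (hypothetical helper).

    Timestamps that already carry time zone information are returned as-is.
    """
    if ts.tzinfo is None:
        return ts.replace(tzinfo=ZoneInfo(data_time_zone))
    return ts


# A query result without time zone information.
naive = datetime(2024, 1, 15, 12, 0, 0)

# With the default setting, the value is read as 12:00 UTC.
as_utc = interpret_timestamp(naive)

# With "Europe/Berlin", the same value is read as 12:00 Berlin time,
# which corresponds to 11:00 UTC in January.
as_berlin = interpret_timestamp(naive, "Europe/Berlin")
```

The same stored value therefore denotes a different instant depending on the setting, which is why the data time zone should match the zone in which your database stores naive timestamps.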
The Maximum number of concurrent query executions controls the maximum concurrency that Cluvio allows for the datasource. This setting can be used to control the maximum load on your database. The default is 20.
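As an illustration of how a concurrency limit bounds database load, the following Python sketch caps the number of simultaneously running queries with a semaphore. It is a generic sketch, not a description of how Cluvio enforces the limit; `QueryLimiter` and `fake_query` are hypothetical names, and the sleep stands in for a real query.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_QUERIES = 20  # mirrors the documented default


class QueryLimiter:
    """Allow at most `limit` queries to run at the same time (illustrative)."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.running = 0  # queries currently executing
        self.peak = 0     # highest concurrency observed so far

    def run(self, query_fn):
        # Block until a slot is free, then execute the query.
        with self._slots:
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return query_fn()
            finally:
                with self._lock:
                    self.running -= 1


def fake_query() -> str:
    time.sleep(0.01)  # stand-in for a real database round trip
    return "ok"


limiter = QueryLimiter(MAX_CONCURRENT_QUERIES)

# 100 requests arrive from 50 worker threads, but never more than
# 20 of them execute against the "database" at once.
with ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(lambda _: limiter.run(fake_query), range(100)))
```

Queries beyond the limit wait for a free slot rather than failing, so lowering the value trades dashboard latency for lower peak load on the database.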
The toggle Update schema nightly controls whether Cluvio queries your database schema nightly to ensure that the almanach in the report editor has up-to-date information on your database schema. Together with fetching schema information, Cluvio also tries to retrieve approximate row counts in each table. If you disable nightly schema updates, the almanach is only updated when you manually trigger a schema refresh on the datasource from the Cluvio datasources overview. Nightly schema updates are enabled by default.
The toggle Update exact row counts is only available when Update schema nightly is enabled. This setting controls whether Cluvio will determine exact row counts for every table in your database schema. Row counts are shown in the report editor almanach. Determining exact row counts usually involves issuing a COUNT(*) query on each table, which may cause undesirable load on your database. You can disable this setting to avoid these nightly queries. When disabled, the tables in the report editor almanach may not show row count information if the database does not provide approximate row counts.
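To get a sense of the load involved, the following Python sketch builds one COUNT(*) statement per table. The exact statements Cluvio issues are not documented here; the function name `exact_row_count_queries` and the schema and table names are hypothetical, and the double-quoted identifier style is an assumption.

```python
def quote_identifier(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def exact_row_count_queries(tables):
    """Build one COUNT(*) statement per (schema, table) pair (illustrative)."""
    return [
        f"SELECT COUNT(*) FROM {quote_identifier(schema)}.{quote_identifier(table)}"
        for schema, table in tables
    ]


# Hypothetical tables; a real schema may contain hundreds of them.
statements = exact_row_count_queries([("sales", "orders"), ("sales", "customers")])
for statement in statements:
    print(statement)
```

Each statement typically scans the full table, so the nightly cost grows with both the number of tables and their size.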
Troubleshooting
Firewalls
The most common connectivity problems with Presto databases result from firewall configurations when using a direct connection or an SSH tunnel. Please make sure that your firewall (or "security group" in AWS) permits inbound connections from Cluvio's static IP addresses.
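As a first diagnostic step, you can check whether the database port accepts TCP connections at all. The Python sketch below is a generic reachability check, not a Cluvio tool; `can_connect` and the host name `presto.example.com` are hypothetical. Note that it runs from your own machine, so a successful result only shows that the server is listening. It does not prove that your firewall admits Cluvio's static IP addresses.

```python
import socket


def can_connect(host: str, port: int = 8080, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Replace the placeholder host with your Presto server's address.
    reachable = can_connect("presto.example.com", 8080)
    print("reachable" if reachable else "not reachable")
```

If the check fails from a machine that should have access, the problem is more likely the server or network than the Cluvio-specific firewall rule.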