Connect to Athena
Connection Modes: Direct
Connection
To connect to Amazon Athena, select Add Datasource on the datasources overview
page to open the Create Datasource dialog and select Amazon Athena as the
database type.
The Name is a required human-friendly identifier or description of the
datasource in Cluvio. Datasource names need not be unique, but we recommend
giving each datasource a unique and meaningful name for ease of identification,
especially if your organization uses multiple datasources.
The AWS Region must contain the code of the AWS region (for example, eu-west-1)
that contains the Workgroup that you want to use with Cluvio.
The S3 Bucket must be an S3 URI of the form s3://<bucket-name> and identifies
the S3 bucket that is used to store query results.
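As a rough illustration of the expected s3://<bucket-name> form, the following Python sketch checks a value before it is entered in the dialog. The function name is_bucket_uri is hypothetical, the bucket naming rule is simplified, and the check assumes that nothing may follow the bucket name; Cluvio's own validation may differ.

```python
import re

# Simplified bucket naming rule (an assumption for this sketch): 3 to 63
# characters, lowercase letters, digits, dots and hyphens, starting and
# ending with a letter or digit. No key prefix or trailing slash is accepted.
_BUCKET_URI = re.compile(r"s3://[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_bucket_uri(value: str) -> bool:
    """Return True if value looks like s3://<bucket-name>."""
    return _BUCKET_URI.fullmatch(value) is not None
```

For example, is_bucket_uri("s3://my-athena-results") is accepted, while a bare bucket name without the s3:// scheme is rejected.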
The AWS Access Key and AWS Secret Key must refer to a valid access key and
secret of an IAM user. The IAM user must have permission to use Amazon Athena
in the configured AWS Region, as well as permission to read from the S3 buckets
that contain the data you want to query through Cluvio. It must also have
permission to store results in the configured S3 Bucket.
The IAM user whose access key is configured in Cluvio should be granted only the minimal permissions needed for Cluvio to query data and store query results. See also Identity and Access Management in Athena.
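The exact set of IAM actions that Cluvio needs is not listed here, so the following Python sketch is only a starting point. It builds a policy document covering the three permissions described above (using Athena, reading the data buckets, writing to the results bucket). The function name, the choice of actions, and the Glue Data Catalog statement are assumptions; review them against the AWS documentation before use.

```python
import json


def minimal_athena_policy(region, account_id, workgroup, results_bucket, data_buckets):
    """Build an IAM policy document (as a dict) for a query-only Athena user.

    The selected actions are an assumption, not an official Cluvio list.
    """
    def bucket_arns(bucket):
        # One ARN for the bucket itself, one for the objects inside it.
        return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Run and inspect queries in a single workgroup.
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetWorkGroup",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}",
            },
            {   # Read table metadata (assumes the AWS Glue Data Catalog is used).
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {   # Read the data that is queried.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [arn for b in data_buckets for arn in bucket_arns(b)],
            },
            {   # Store and read query results.
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:PutObject",
                ],
                "Resource": bucket_arns(results_bucket),
            },
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders.
    policy = minimal_athena_policy(
        "eu-west-1", "123456789012", "primary", "my-athena-results", ["my-data-lake"]
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy and then narrowed further, for example by restricting the Glue statement to specific databases.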
When you have entered all the required information, select Test Connection to
check that Cluvio can connect to Athena. The connection test will report
errors if the connection fails. See Troubleshooting for common problems.
Configuration
The Configuration tab of the datasource dialog shows settings that affect the
datasource's behavior.
The Data Time Zone defaults to UTC and is the time zone that Cluvio assumes
for any timestamps returned from queries that do not contain time zone
information. See Data Time Zone for details.
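The effect of this setting can be pictured with a small Python sketch. It is not Cluvio's implementation; the function name interpret_timestamp is hypothetical, and a fixed UTC offset stands in for a named time zone to keep the example dependency-free.

```python
from datetime import datetime, timedelta, timezone


def interpret_timestamp(ts: datetime, data_time_zone=timezone.utc) -> datetime:
    """Attach the configured data time zone to timestamps that carry none."""
    if ts.tzinfo is None:
        # Naive timestamp: assume it was recorded in the data time zone.
        return ts.replace(tzinfo=data_time_zone)
    # Timestamp already has time zone information: leave it unchanged.
    return ts


naive = datetime(2024, 1, 15, 12, 0)
as_utc = interpret_timestamp(naive)
as_minus_five = interpret_timestamp(naive, timezone(timedelta(hours=-5)))
```

With the default, the naive value is read as 12:00 UTC. With a data time zone five hours behind UTC, the same value refers to 17:00 UTC, which is why the setting should match how the timestamps were written.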
The Maximum number of concurrent query executions controls the maximum
concurrency that Cluvio allows for the datasource. This setting can be used to
control the maximum load on your database. The default is 20.
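Conceptually, a concurrency limit of this kind behaves like a counting semaphore: a query only starts once a slot is free. The Python sketch below illustrates the idea with hypothetical names (QueryLimiter, demo); it does not describe how Cluvio enforces the limit internally.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class QueryLimiter:
    """Allow at most max_concurrent queries to run at the same time."""

    def __init__(self, max_concurrent=20):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.running = 0  # queries currently executing
        self.peak = 0     # highest concurrency observed so far

    def run(self, query_fn):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return query_fn()
            finally:
                with self._lock:
                    self.running -= 1


def demo(num_queries=10, max_concurrent=3):
    """Submit num_queries fake queries at once and report the peak concurrency."""
    limiter = QueryLimiter(max_concurrent)

    def fake_query(i):
        time.sleep(0.01)  # stands in for a real query
        return i

    with ThreadPoolExecutor(max_workers=num_queries) as pool:
        results = list(
            pool.map(lambda i: limiter.run(lambda: fake_query(i)), range(num_queries))
        )
    return results, limiter.peak
```

Calling demo() submits ten queries at once, but the observed peak never exceeds three. Lowering the limit in the same way reduces the load on the database at the cost of longer queueing when many queries arrive together.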
The toggle Update schema nightly controls whether Cluvio queries your database
schema nightly to ensure that the almanach in the report editor has up-to-date
information on your database schema. Together with fetching schema information,
Cluvio also tries to retrieve approximate row counts in each table. If you
disable nightly schema updates, the almanach is only updated when you manually
trigger a schema refresh on the datasource from the Cluvio datasources
overview. Nightly schema updates are enabled by default.
The toggle Update exact row counts is only available when Update schema nightly
is enabled. This setting controls whether Cluvio will determine exact row
counts for every table in your database schema. Row counts are shown in the
report editor almanach. Determining exact row counts usually involves issuing a
COUNT(*) query on each table, which may cause undesirable load on your
database. You can disable this setting to avoid these nightly queries. When
disabled, the tables in the report editor almanach may not show row count
information if the database does not provide approximate row counts.
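To get a feel for the load involved, the Python sketch below generates one COUNT(*) statement per table. The exact statements that Cluvio issues are not documented here; the function name count_query and the database and table names are hypothetical.

```python
def count_query(database: str, table: str) -> str:
    """Build an exact row count query for one table."""
    def quote(identifier: str) -> str:
        # Double-quote identifiers and escape embedded double quotes.
        return '"' + identifier.replace('"', '""') + '"'

    return f"SELECT COUNT(*) AS row_count FROM {quote(database)}.{quote(table)}"


# Hypothetical schema: one query per table, every night.
tables = [("sales", "orders"), ("sales", "customers"), ("web", "page_views")]
queries = [count_query(db, tbl) for db, tbl in tables]
```

A schema with several hundred tables therefore results in several hundred such queries per night, which is the load that disabling the setting avoids.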
Troubleshooting
If you need help connecting to Amazon Athena, please contact support@cluvio.com.