Connect to Athena

Connection Modes: Direct

Connection

To connect to Amazon Athena, select Add Datasource on the datasources overview page to open the Create Datasource dialog and select Amazon Athena as the database type.

The Name is a required, human-friendly identifier or description of the datasource in Cluvio. Datasource names need not be unique, but we recommend giving each datasource a unique and meaningful name for ease of identification, especially if your organization uses multiple datasources.

The AWS Region must be the code of the AWS region (for example, us-east-1) that contains the Workgroup you want to use with Cluvio.

The S3 Bucket must be an S3 URI of the form s3://<bucket-name>. It identifies the S3 bucket that is used to store query results.
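
As a rough illustration of the expected s3://<bucket-name> form, the following Python sketch checks a value before you paste it into the dialog. The function name parse_results_bucket is hypothetical, and the pattern is a simplified version of the general S3 bucket naming rules. Whether Cluvio accepts a trailing slash or a key prefix is not stated here, so this sketch rejects both.

```python
import re

# Simplified bucket naming check: 3-63 characters, lowercase letters, digits,
# dots, and hyphens, starting and ending with a letter or digit.
# This is an illustration, not the complete S3 rule set.
_BUCKET_URI = re.compile(r"s3://([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])")


def parse_results_bucket(uri: str) -> str:
    """Return the bucket name from an s3://<bucket-name> URI, or raise ValueError."""
    match = _BUCKET_URI.fullmatch(uri.strip())
    if match is None:
        raise ValueError(f"expected s3://<bucket-name>, got {uri!r}")
    return match.group(1)
```

For example, parse_results_bucket("s3://my-athena-results") returns the bucket name, while an https:// URL or a value with a path after the bucket name raises ValueError.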

The AWS Access Key and AWS Secret Key must be a valid access key and secret of an IAM user. The access key must have permission to use Amazon Athena in the configured AWS Region, as well as permission to read from the S3 buckets that contain the data you want to query through Cluvio. It must also have permission to store results in the configured S3 Bucket.

IAM Permissions

The IAM user whose access key is configured in Cluvio should have only the minimal permissions needed for Cluvio to run queries and store their results. See also Identity and Access Management in Athena.
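
The following Python sketch shows what a minimal policy along these lines might look like. The exact set of actions Cluvio needs is not listed on this page, so the action lists are assumptions based on common Athena setups. The sketch also assumes that your tables are registered in the AWS Glue Data Catalog. The function name build_policy and the bucket names are hypothetical.

```python
import json


def bucket_arns(bucket: str) -> list[str]:
    """ARNs for a bucket itself and for all objects inside it."""
    return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]


def build_policy(results_bucket: str, data_buckets: list[str]) -> dict:
    """Assemble an illustrative least-privilege policy document (an assumption,
    not an official Cluvio policy)."""
    data_resources = [arn for b in data_buckets for arn in bucket_arns(b)]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Run queries and fetch their results. Can be narrowed to a workgroup ARN.
                "Sid": "AthenaQuery",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:StopQueryExecution",
                    "athena:GetWorkGroup",
                ],
                "Resource": "*",
            },
            {   # Read table metadata (assumes the Glue Data Catalog is used).
                "Sid": "CatalogRead",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {   # Read-only access to the buckets that hold the queried data.
                "Sid": "ReadData",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": data_resources,
            },
            {   # Read and write access to the query results bucket.
                "Sid": "WriteResults",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": bucket_arns(results_bucket),
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy("my-athena-results", ["my-data-lake"]), indent=2))
```

Keeping the data buckets read-only and limiting write access to the results bucket matches the requirements described above. Review the generated JSON against your own setup before attaching it to the IAM user.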

When you have entered all the required information, select Test Connection to check that Cluvio can connect to Athena. The connection test will report errors if the connection fails. See Troubleshooting for common problems.

Configuration

The Configuration tab of the datasource dialog shows settings that affect the datasource's behavior.

The Data Time Zone defaults to UTC and is the time zone that Cluvio assumes for any timestamps returned from queries that do not contain time zone information. See Data Time Zone for details.
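
To illustrate the idea, the following Python sketch attaches a configured time zone to timestamps that carry no time zone information and leaves all other timestamps unchanged. This is only a model of the behavior described above, not Cluvio's implementation, and the function name interpret_timestamp is hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def interpret_timestamp(ts: datetime, data_tz: tzinfo = timezone.utc) -> datetime:
    """Attach data_tz to a timestamp without time zone information.

    Timestamps that already carry an offset are returned unchanged.
    """
    if ts.utcoffset() is None:
        return ts.replace(tzinfo=data_tz)
    return ts


if __name__ == "__main__":
    naive = datetime(2024, 1, 15, 12, 0)
    # Default: the naive value is read as 12:00 UTC.
    print(interpret_timestamp(naive))
    # With a data time zone five hours behind UTC, the same value is 17:00 UTC.
    behind_utc = timezone(timedelta(hours=-5))
    print(interpret_timestamp(naive, behind_utc).astimezone(timezone.utc))
```

The example uses a fixed offset to stay self-contained. A named zone such as zoneinfo.ZoneInfo("America/New_York") from the Python standard library can be passed in the same way.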

The Maximum number of concurrent query executions controls the maximum concurrency that Cluvio allows for the datasource. You can use this setting to limit the load on your database. The default is 20.

The toggle Update schema nightly controls whether Cluvio queries your database schema nightly to ensure that the almanach in the report editor has up-to-date information on your database schema. Together with fetching schema information, Cluvio also tries to retrieve approximate row counts in each table. If you disable nightly schema updates, the almanach is only updated when you manually trigger a schema refresh on the datasource from the Cluvio datasources overview. Nightly schema updates are enabled by default.

The toggle Update exact row counts is only available when Update schema nightly is enabled. This setting controls whether Cluvio will determine exact row counts for every table in your database schema. Row counts are shown in the report editor almanach. Determining exact row counts usually involves issuing a COUNT(*) query on each table, which may cause undesirable load on your database. You can disable this setting to avoid these nightly queries. When disabled, the tables in the report editor almanach may not show row count information if the database does not provide approximate row counts.
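
To give a sense of the load involved, the following Python sketch generates one COUNT(*) statement per table. The SQL that Cluvio actually issues is not shown on this page, so the statement shape is an assumption. The function names and the example database and table names are hypothetical.

```python
def quote_identifier(name: str) -> str:
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def exact_count_queries(database: str, tables: list[str]) -> list[str]:
    """Build one COUNT(*) query per table (illustrative only)."""
    db = quote_identifier(database)
    return [f"SELECT COUNT(*) FROM {db}.{quote_identifier(t)}" for t in tables]


if __name__ == "__main__":
    for query in exact_count_queries("sales", ["orders", "customers"]):
        print(query)
```

The number of nightly queries grows with the number of tables in the schema, which is why the setting can be disabled independently.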

Troubleshooting

If you need help connecting to Amazon Athena, please contact support@cluvio.com.