Connect to MongoDB Atlas
Connection Modes: Direct
Connection
Cluvio supports connecting to MongoDB Atlas SQL federated database
instances.
Select Add Datasource
on the datasources overview page to open the
Create Datasource
dialog and select MongoDB Atlas
as the database type.
The Name
is a required human-friendly identifier or description of the
datasource in Cluvio. Datasource names need not be unique but we recommend
to give each datasource a unique and meaningful name for ease of identification,
especially if your organization uses multiple datasources.
The Host
is required and must contain the DNS name of your MongoDB Atlas
federated database instance.
The Database
must contain the name of the database to connect to. Cluvio
will fetch schema information for this database in order to populate the
almanach in the report editor.
The Username
and Password
are used by Cluvio to authenticate the connections
to your database. The database user must have access to the database specified
as the Database
. A password is required to secure the connection.
Cluvio always uses direct connections with verified SSL/TLS
to connect to MongoDB Atlas.
When you have entered all the required information, select Test Connection
to
check that Cluvio can connect to your database. The connection test will report
errors if the connection fails. See Troubleshooting for
common problems.
Configuration
The Configuration
tab of the datasource dialog shows settings that
affect the datasource's behavior.
The Data Time Zone
defaults to UTC
and is the time zone that Cluvio assumes
for any timestamps returned from queries that do not contain time zone
information. See Data Time Zone for details.
The Maximum number of concurrent query executions
control the maximum
concurrency that Cluvio allows for the datasource. This setting can be used to
control the maximum load on your database. The default is 20
.
The toggle Update schema nightly
controls whether Cluvio queries your database
schema nightly to ensure that the almanach in the
report editor has up-to-date information on your database schema. Together with
fetching schema information, Cluvio also tries to retrieve approximate row
counts in each table. If you disable nightly schema updates, the almanach is
only updated when you manually trigger a schema refresh on the datasource from
the Cluvio datasources overview. Nightly schema updates are enabled by default.
The toggle Update exact row counts
is only available when Update schema nightly
is enabled. This setting controls whether Cluvio will determine exact
row counts for every table in your database schema. Row counts are shown in the
report editor almanach. Determining exact row counts
usually involves issuing a COUNT(*)
query on each table, which may cause
undesirable load on your database. You can disable this setting to avoid these
nightly queries. When disabled, the tables in the report editor almanach may not
show row count information if the database does not provide approximate row
counts.
Troubleshooting
Firewalls
The most common connectivity problems with MongoDB Atlas federated database instances result from firewall configurations. Please make sure that the network access configuration in your MongoDB Atlas account permits inbound connections from Cluvio's static IP addresses.
GROUP BY Syntax
MongoDB Atlas SQL requires the use of aliases when referencing aggregation or
other expressions used in a GROUP BY
clause also in the SELECT
clause which
are not simple column names. For example, the following Cluvio query
-- THIS IS INCORRECT!
SELECT
{released:aggregation}, -- Alias required
count(*)
FROM
movies
WHERE
{released=timerange}
GROUP BY {released:aggregation} -- Alias required
ORDER BY 1 DESC
will produce the error
Field `released` in the `SELECT` clause at the 0 scope level not found.
The correct query looks as follows:
SELECT
{released:aggregation} AS released,
count(*)
FROM
movies
WHERE
{released=timerange}
GROUP BY released
ORDER BY 1 DESC
Alternatively, in MongoDB Atlas SQL the alias can also be defined in the
GROUP BY
clause:
-- This works as well.
SELECT
released,
count(*)
FROM
movies
WHERE
{released=timerange}
GROUP BY {released:aggregation} AS released
ORDER BY 1 DESC
Note that the alias in the GROUP BY
clause is not required when the
group-by expression is not also referenced in the SELECT
clause:
-- This works as well.
SELECT
count(*)
FROM
movies
WHERE
{released=timerange}
GROUP BY {released:aggregation}
ORDER BY 1 DESC