Cluvio allows not just running SQL and visualizing the results. You can also use the full power of R and run a custom R script to further process the SQL results. The following example shows how easy it is to add a running total column to the SQL results:
SQL:
SELECT {saletime:week}, SUM(pricepaid) as "revenue" FROM sales GROUP BY 1 ORDER BY 1
R:
sql.data %>% mutate(running_total=cumsum(revenue))
The SQL results are available to the R script in a 'sql.data' variable (as a data.frame).
At the moment, the following R libraries are available within the R script. Please contact hi@cluvio.com in case you would like to use a package that is not currently supported.
- R (3.4.4) - most standard R functions
- reshape2 (1.4.2)
- plyr (1.8.4)
- dplyr (0.8.3)
- tidyr (1.0.0)
- purrr (0.2.4)
- tibble (2.1.3)
- broom (0.5.2)
- stringr (1.4.0)
- lubridate (1.6.0)
- hms (0.5.3)
- magrittr (1.5)
- glue (1.4.2)
- zoo (1.8-3)
- DataCombine (0.2.21)
- googlesheets4 (0.2.0)
- bayesAB (1.1.1)
- prophet (0.6.1)
- tvm (0.4.0)
- pwr (1.3-0)
- sigr (1.1.3)
- jsonlite (1.8.0)
System and I/O functions are not available, only referentially transparent transparent transformations, i.e. from input to output. The results of the R script are given by the last command and are passed to the visualization as a data table. Supported R types are:
- primitive types, date, timestamp
- vectors of these (will appear as multiple rows of a single column)
- data.frame
You can name columns in the output as usual in R via the 'names' attribute, e.g. the following will produce a column named "Metric" with 2 rows of data:
res <- c(42, 84) names(res) <- "Metric" res
Important: Restrictions
The R script runs within a constrained environment:
1. The script has to finish within 1 minute (please contact support if you need this limit raised).
2. The script can use at most 100MB of computer memory.
3. Most system functions and functions for accessing the local filesystem are not available, i.e. the script is meant to process the SQL output in a referentially transparent manner.
4. The output of the R script has to be data (as stated above); there is not much use for generating visuals using R (e.g. using ggplot or shiny) in Cluvio, as the visualization step we apply is the same as with plain SQL results.