Data Source Details
Refer to the following information when creating a data source for your Vertica database in GoodData Cloud Native (GoodData.CN):
The following database versions are supported:
GoodData.CN uses client driver version 10.0.1-0.
The JDBC URL must be in the following format:
jdbc:vertica://<host>:<port>/<databaseName>
For security reasons, the query in the JDBC URL must not contain the
Basic authentication is supported. Specify
If you use native authentication inside your cloud platform (for example, Google Cloud Platform, Amazon Web Services, or Microsoft Azure), you do not have to provide the username and password.
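To illustrate the JDBC URL format given above, here is a minimal sketch of a helper that assembles such a URL. The function name `build_vertica_jdbc_url` and the host and database values are hypothetical; 5433 is used only because it is Vertica's usual default port.

```python
def build_vertica_jdbc_url(host: str, port: int, database_name: str) -> str:
    """Return a URL in the jdbc:vertica://<host>:<port>/<databaseName> format."""
    if not host or not database_name:
        raise ValueError("host and database_name must be non-empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # No query string is appended; only the three documented parts are used.
    return f"jdbc:vertica://{host}:{port}/{database_name}"


# Hypothetical example values.
url = build_vertica_jdbc_url("vertica.example.com", 5433, "analytics")
print(url)
```

The helper deliberately builds only the host, port, and database name, which keeps the URL free of any query parameters.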
Workspaces that use Vertica as a data source have access to the following additional features:
If your database holds a large amount of data, consider the following practices:
Denormalize the relational model of your database.
You can use flattened tables to assist with denormalization. Because Vertica is a columnar database, queries read only the required columns, and each column is compressed separately.
- RESEGMENT by the columns that are most frequently used for JOIN and aggregation operations. You can also RESEGMENT by a column with high cardinality so that loaded data is evenly distributed in your cluster.
- SORT by the columns that are most frequently used for JOIN and aggregation operations. Those columns are typically mapped to the attributes that are most frequently used for aggregations in insights.
- Use RLE encoding for low-cardinality columns (columns with few distinct values).
- If you have to build analytics for multiple, mutually exclusive use cases, define multiple projections on top of a table.
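The projection recommendations above can be sketched as a small generator for a `CREATE PROJECTION` statement. This is an illustration under assumptions, not a prescribed design: the table, projection, and column names are hypothetical, and the sketch assumes that segmentation is expressed through the `SEGMENTED BY HASH(...)` clause, sorting through `ORDER BY`, and RLE through a per-column `ENCODING RLE`.

```python
def projection_ddl(table, projection, columns, sort_by, segment_by, rle_columns=()):
    """Build a CREATE PROJECTION statement as a string (nothing is executed)."""
    # Low-cardinality columns get RLE encoding; other columns keep the default.
    col_defs = ", ".join(
        f"{c} ENCODING RLE" if c in rle_columns else c for c in columns
    )
    return (
        f"CREATE PROJECTION {projection} ({col_defs})\n"
        f"AS SELECT {', '.join(columns)} FROM {table}\n"
        # Sort by the columns most often used in JOINs and aggregations.
        f"ORDER BY {', '.join(sort_by)}\n"
        # Segment by a high-cardinality column to spread data evenly.
        f"SEGMENTED BY HASH({', '.join(segment_by)}) ALL NODES;"
    )


# Hypothetical table and columns.
ddl = projection_ddl(
    table="sales",
    projection="sales_by_region",
    columns=["region", "customer_id", "amount"],
    sort_by=["region", "customer_id"],
    segment_by=["customer_id"],
    rle_columns={"region"},
)
print(ddl)
```

For mutually exclusive use cases, the same function could be called several times with different sort and segmentation columns to produce one projection per use case.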
Utilize live aggregate projections. Live aggregate projections can store pre-aggregated data when using additive functions like COUNT or SUM. Vertica automatically selects the optimal projection to use. This may help you simplify your logical data model (LDM). Instead of declaring both the full and pre-aggregated datasets, you can create standard and pre-aggregated projections on top of a single table, declare only a single dataset in your LDM, and map that dataset to the table.
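To see why additive functions such as SUM and COUNT suit pre-aggregation, the following sketch compares totals computed from raw rows with totals rolled up from pre-aggregated groups. The data, the table name `sales`, and the DDL string are hypothetical and serve only as an illustration; the DDL is not executed.

```python
from collections import defaultdict

# Hypothetical DDL for a live aggregate projection (shown for context only).
LIVE_AGGREGATE_DDL = """
CREATE PROJECTION sales_daily_agg AS
SELECT sale_date, region, SUM(amount) AS total_amount, COUNT(*) AS order_count
FROM sales
GROUP BY sale_date, region;
"""

# Hypothetical raw rows: (sale_date, region, amount).
rows = [
    ("2024-01-01", "EU", 10.0),
    ("2024-01-01", "EU", 5.0),
    ("2024-01-01", "US", 7.0),
    ("2024-01-02", "EU", 3.0),
]


def pre_aggregate(rows):
    """Group by (sale_date, region), keeping SUM(amount) and COUNT(*)."""
    agg = defaultdict(lambda: [0.0, 0])
    for day, region, amount in rows:
        agg[(day, region)][0] += amount
        agg[(day, region)][1] += 1
    return {key: tuple(value) for key, value in agg.items()}


def region_totals_from_rows(rows):
    """Compute (SUM, COUNT) per region directly from the raw rows."""
    totals = defaultdict(lambda: [0.0, 0])
    for _, region, amount in rows:
        totals[region][0] += amount
        totals[region][1] += 1
    return {key: tuple(value) for key, value in totals.items()}


def region_totals_from_agg(agg):
    """Roll the pre-aggregated groups up to per-region (SUM, COUNT)."""
    totals = defaultdict(lambda: [0.0, 0])
    for (_, region), (total, count) in agg.items():
        totals[region][0] += total
        totals[region][1] += count
    return {key: tuple(value) for key, value in totals.items()}


# Both paths give the same answer because SUM and COUNT are additive.
print(region_totals_from_rows(rows) == region_totals_from_agg(pre_aggregate(rows)))
```

Because partial sums and counts can be combined without revisiting the raw rows, a coarser query can be answered from the smaller pre-aggregated data.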
Use hierarchical partitioning to avoid too many partitions (ROS containers) in a single projection.
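The idea behind hierarchical partitioning can be sketched as a function that maps each date to a partition group: recent dates keep daily granularity, older dates are grouped by month, and the oldest by year. This is a simplified emulation modeled loosely on Vertica's `CALENDAR_HIERARCHY_DAY` function; the function name `partition_group_key`, the thresholds, and the exact boundary semantics here are assumptions.

```python
from datetime import date


def partition_group_key(d: date, today: date, active_months: int = 2, active_years: int = 2) -> date:
    """Return the first day of the group (day, month, or year) that d falls into."""
    # Oldest data: one group per calendar year.
    if today.year - d.year >= active_years:
        return date(d.year, 1, 1)
    # Older data: one group per calendar month.
    months_back = (today.year - d.year) * 12 + (today.month - d.month)
    if months_back >= active_months:
        return date(d.year, d.month, 1)
    # Recent data: keep one group per day.
    return d


today = date(2024, 6, 15)  # hypothetical reference date
for d in [date(2024, 6, 10), date(2024, 3, 5), date(2021, 11, 30)]:
    print(d, "->", partition_group_key(d, today))
```

Grouping older partitions this way keeps the number of distinct partition groups small even when the table spans many years of daily data.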
Use Eon Mode to spin up sub-clusters based on user needs.
- Users with similar needs populate Eon depots with data that is likely to be reused.
- Isolate data transformation operations running in your database from the analytics generated by GoodData.CN.
Scale up based on user needs. Automate adding and removing secondary sub-clusters.
The query timeout is configurable per application instance. It is a parameter of the sql-executor service; the default value is 160 seconds.
The query timeout is closely related to the ACK timeout. For the system to be configured properly, the ACK timeout must be longer than the query timeout. The default ACK timeout is 170 seconds.
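The relationship between the two timeouts can be expressed as a simple sanity check. This is a minimal sketch: the function and constant names are hypothetical and do not correspond to actual GoodData.CN configuration keys; only the default values of 160 and 170 seconds come from the text above.

```python
DEFAULT_QUERY_TIMEOUT_S = 160  # default query timeout stated above
DEFAULT_ACK_TIMEOUT_S = 170    # default ACK timeout stated above


def validate_timeouts(query_timeout_s: int = DEFAULT_QUERY_TIMEOUT_S,
                      ack_timeout_s: int = DEFAULT_ACK_TIMEOUT_S) -> None:
    """Raise ValueError unless the ACK timeout is longer than the query timeout."""
    if query_timeout_s <= 0 or ack_timeout_s <= 0:
        raise ValueError("timeouts must be positive")
    if ack_timeout_s <= query_timeout_s:
        raise ValueError(
            f"ACK timeout ({ack_timeout_s}s) must be longer than "
            f"query timeout ({query_timeout_s}s)"
        )


validate_timeouts()  # the defaults satisfy the rule
try:
    validate_timeouts(query_timeout_s=180, ack_timeout_s=170)
except ValueError as err:
    print(err)
```

A check like this could run before deployment whenever the query timeout is raised, so that the ACK timeout is raised along with it.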
Note: When a query fails because of the query timeout, the REST API call returns error code 500. This behavior is subject to change in a future release.