Dremio

Disclaimer

Support for the Dremio Data Source Manager (DSM) is in beta. Beta features are available for users to test and provide feedback. Their implementation is not finalized, and their behavior or interface may change in the future.

Deployment

You can run Dremio (OSS version) in a Docker container. The image for Dremio is available on Docker Hub.

The following example Docker Compose file demonstrates how to start GoodData.CN with Dremio, using MinIO as S3-compatible storage:

version: '3.7'

services:
  gooddata-cn-ce:
    image: gooddata/gooddata-cn-ce:1.4.0
    ports:
      - "3000:3000"
      - "5432:5432"
    volumes:
      - gooddata-cn-ce-data:/data
    environment:
      LICENSE_AND_PRIVACY_POLICY_ACCEPTED: "YES"

  dremio:
    image: dremio/dremio-oss:17.0.0
    ports:
      - '9047:9047'
      - '31011:31010'
      - '45678:45678'
    volumes:
      # DB drivers
      - ./db-drivers/VERTICA/vertica-jdbc-10.0.1-2.jar:/opt/dremio/jars/3rdparty/vertica-jdbc-10.0.1-2.jar
      - ./db-drivers/SNOWFLAKE/snowflake-jdbc-3.12.9.jar:/opt/dremio/jars/3rdparty/snowflake-jdbc-3.12.9.jar
      # DB plugins
      - ./db-drivers/DREMIO/dremio-verticaarp-plugin.jar:/opt/dremio/jars/dremio-verticaarp-plugin.jar
      - ./db-drivers/DREMIO/dremio-snowflake-plugin.jar:/opt/dremio/jars/dremio-snowflake-plugin.jar
      # DATA volume
      - dremio-data:/opt/dremio/data

  minio:
    image: minio/minio:RELEASE.2021-08-25T00-41-18Z
    volumes:
      - minio-data:/data
    ports:
      - '19000:9000'
      - '19001:19001'
    environment:
      MINIO_ACCESS_KEY: tiger_abcde_k1234567
      MINIO_SECRET_KEY: tiger_abcde_k1234567_secret1234567890123
    command: server --console-address ":19001" /data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

volumes:
  gooddata-cn-ce-data:
  dremio-data:
  minio-data:

Prepare Dremio for GoodData.CN

To learn how to register Data Sources to Dremio, refer to the official Dremio documentation for connecting a Data Source. To access the Dremio web console, open http://localhost:9047 in your web browser. Register a user and note the username and password; you will need them later when you create the Data Source definition.

Depending on the Data Source you use, additional preparation may be necessary to integrate your Data Source Manager with GoodData.CN. For general considerations, refer to Preparing Data Source Managers for GoodData.CN.

Data Sources Providing Metadata

If you use a Data Source that accommodates metadata (for example, Postgres), consider the following to ensure that a scan of the Data Source returns data:

  • Database tables and views can be scanned only if they have been queried in Dremio.
  • Alternatively, you can create Dremio datasets on top of the tables or views to have them available as views without needing to query Dremio.
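
As an illustration of the second option, the following Python sketch builds a CREATE VDS statement and shows how it could be submitted through the Dremio REST API. The space name analytics, the source name pg, and the table public.orders are hypothetical, the target space is assumed to exist already, and the helper names are made up for this example. The login and SQL endpoints are used as they are commonly documented for Dremio; verify them against your Dremio version before relying on them.

```python
import json
import urllib.request

# Web console / REST port published in the compose file above.
DREMIO_URL = "http://localhost:9047"


def quote_path(parts):
    """Join path components into a double-quoted Dremio dataset path."""
    for part in parts:
        if '"' in part:
            raise ValueError(f"unsupported character in identifier: {part!r}")
    return ".".join(f'"{part}"' for part in parts)


def create_vds_sql(view_path, table_path):
    """Build a statement that exposes a source table as a Dremio dataset."""
    return (
        f"CREATE VDS {quote_path(view_path)} "
        f"AS SELECT * FROM {quote_path(table_path)}"
    )


def post_json(url, payload, token=None):
    """POST a JSON payload and return the decoded JSON response."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Dremio expects the login token prefixed with "_dremio".
        headers["Authorization"] = f"_dremio{token}"
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode(), headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_sql(sql, user, password):
    """Log in with the user registered in the web console and submit the SQL."""
    login = post_json(
        f"{DREMIO_URL}/apiv2/login", {"userName": user, "password": password}
    )
    return post_json(f"{DREMIO_URL}/api/v3/sql", {"sql": sql}, login["token"])


# Hypothetical names: space "analytics", source "pg", table "public"."orders".
sql = create_vds_sql(("analytics", "orders"), ("pg", "public", "orders"))
print(sql)
```

Calling run_sql(sql, user, password) against a running Dremio instance would submit the statement as a job; the sketch only prints the statement so that it can be reviewed first.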

Data Sources That Do Not Provide Metadata

If you use a Data Source that does not accommodate metadata, you must always create Dremio datasets for the data that you want to scan.

Data Source Details

Use the following information when creating a Data Source to use with your Dremio DSM:

  • The following considerations apply when you are configuring the JDBC URL:
  • Basic authentication is supported. Specify the user and password accordingly.
  • You can set enableCaching to true and cachePath to ["$scratch"].
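
To make these settings concrete, here is a minimal Python sketch that assembles a Data Source definition as a JSON document. Only enableCaching, cachePath, and the use of a user and password come from the list above. The remaining attribute names, the DREMIO type value, the JDBC URL form jdbc:dremio:direct=<host>:<port>, and all identifiers are assumptions made for illustration; check them against the GoodData.CN API reference for your version.

```python
import json


def dremio_data_source(ds_id, name, jdbc_url, schema, username, password):
    """Assemble a Data Source definition (attribute names are assumptions)."""
    return {
        "data": {
            "id": ds_id,
            "type": "dataSource",
            "attributes": {
                "name": name,
                "type": "DREMIO",           # assumed type identifier
                "url": jdbc_url,
                "schema": schema,           # assumed: Dremio space or source to scan
                "username": username,       # basic authentication
                "password": password,
                "enableCaching": True,      # from the list above
                "cachePath": ["$scratch"],  # from the list above
            },
        }
    }


# Hypothetical values; "dremio" is the service name in the compose file
# and 31010 is the port the Dremio container exposes for client connections.
payload = dremio_data_source(
    ds_id="dremio-demo",
    name="Dremio demo",
    jdbc_url="jdbc:dremio:direct=dremio:31010",
    schema="analytics",
    username="demo_user",
    password="demo_password",
)
print(json.dumps(payload, indent=2))
```

The printed document is what you would send when registering the Data Source; the sketch deliberately stops short of making the request.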

Performance Tips

If you query large datasets, or join large datasets from different data sources, we recommend that you use the Dremio reflections feature.
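
As a rough sketch of what enabling a reflection can look like, the following Python snippet builds an ALTER DATASET ... CREATE RAW REFLECTION statement that you could run in the Dremio SQL editor. The dataset path, reflection name, and column names are hypothetical, and the statement form should be checked against the SQL reference for your Dremio version.

```python
def quote_path(parts):
    """Join path components into a double-quoted Dremio dataset path."""
    for part in parts:
        if '"' in part:
            raise ValueError(f"unsupported character in identifier: {part!r}")
    return ".".join(f'"{part}"' for part in parts)


def raw_reflection_sql(dataset_path, reflection_name, columns):
    """Build a statement that creates a raw reflection on selected columns."""
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(quote_path((column,)) for column in columns)
    return (
        f"ALTER DATASET {quote_path(dataset_path)} "
        f"CREATE RAW REFLECTION {reflection_name} "
        f"USING DISPLAY ({column_list})"
    )


# Hypothetical dataset "analytics"."orders" with three columns.
reflection_sql = raw_reflection_sql(
    ("analytics", "orders"),
    "orders_raw",
    ["order_id", "customer_id", "amount"],
)
print(reflection_sql)
```

Limiting the reflection to the columns that your reports actually use keeps it smaller and cheaper to refresh.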