Build a Logical Data Model

A logical data model (LDM) is used to develop visual understandings of data entities, attributes, keys, and relationships. It represents the definitions and characteristics of data elements that stay the same throughout technological changes. This makes it independent from any specific database, therefore it is the single source of truth for analytics.

Using a LDM has the following major benefits:

  • Easy changes of data structure. For example, changing the name of a column in a database can cause issues with analytics. If you are using a LDM as a single source of truth, you define the data elements only once and build all visualizations and metrics on top of it. If there are any changes in the connected databases, you only have to update the relevant definitions in the LDM to keep your analytics working.

  • Curated analytics experience. Using a LDM ensures that the end users of your analytics can see only relevant and valid combinations of attributes and facts.

Process of LDM Creation

  • Connect a data source, i.e. a database or data source manager.

    GoodData then scans the database schema and reads its metadata.

  • Build the LDM by following the steps below.

Logical Data Model Overview

To make mapping your data as easy as possible, learn what rules are applied when the LDM is created and make sure that your database follows them.

UI
Python
API (Bash)
API (Powershell)
  1. Open the demo workspace and go to the Data tab.

  2. Click Create model.

New LDM
  1. Click Connect on the left panel to connect your data source to your LDM.
New LDM
  1. Select your tables and click Add selected to add them as data sets to the LDM.
New LDM

The LDM is automatically created.

  1. Click Save.
New LDM

Once you have created the LDM, you can start building dashboards and visualizations.

Once you create a LDM, you can use Python code to work with it. The following examples show some simple use cases.

from gooddata_sdk import GoodDataSdk, CatalogWorkspace

# GoodData host in the form of uri, e.g. "https://*.gooddata.com" (GoodData Cloud), 
# or "http://localhost:3000" (GoodData Cloud Native)
host = "<GOODDATA_URI>"
# GoodData API token
token = "<API_TOKEN>"
workspace_id = "<WORKSPACE_ID>"
another_workspace_id = "<DIFFERENT_WORKSPACE_ID>"
sdk = GoodDataSdk.create(host, token)

# Get LDM from GoodData
declarative_ldm = sdk.catalog_workspace_content.get_declarative_ldm(workspace_id)

# You can put LDM in a different workspace
sdk.catalog_workspace_content.put_declarative_ldm(another_workspace_id, declarative_ldm)

# Save LDM to a file and version in git
sdk.catalog_workspace_content.store_declarative_ldm(workspace_id)

Check the Python SDK documentation for further details.

Once you create a LDM, you can use the GoodData API to work with it:

curl $HOST_URL/api/v1/layout/workspaces/$WORKSPACE_ID/logicalModel \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer <API_TOKEN>" \
  -X GET \
  | jq .

You will get a similar response (with your LDM):

{
  "ldm": {
    "datasets": [
      {
        "attributes": [
          {
            "defaultView": {
              "id": "label_id",
              "type": "label"
            },
            "description": "Customer name including first and last name.",
            "id": "attr.customers.customer_name",
            "labels": [
              {
                "description": "Customer name",
                "id": "label.customer_name",
                "sourceColumn": "customer_name",
                "sourceColumnDataType": "STRING",
                "tags": [
                  "Customers"
                ],
                "title": "Customer name",
                "valueType": "\"TEXT\" | \"HYPERLINK\" | \"GEO\""
              }
            ],
            "sortColumn": "customer_name",
            "sortDirection": "\"ASC\" | \"DESC\"",
            "sourceColumn": "customer_name",
            "sourceColumnDataType": "STRING",
            "tags": [
              "Customers"
            ],
            "title": "Customer Name"
          }
        ],
        "dataSourceTableId": {
          "dataSourceId": "my-postgres",
          "path": ["postgres_demo_3920fbd8f33ec386", "campaign_channels"],
          "id": "customers",
          "type": "DATA_SOURCE"
        },
        "description": "Our customers.",
        "facts": [
          {
            "description": "A number of orders created by the customer - including all orders, even the non-delivered ones.",
            "id": "fact.customer_order_count",
            "sourceColumn": "customer_order_count",
            "sourceColumnDataType": "STRING",
            "tags": [
              "Customers"
            ],
            "title": "Customer order count"
          }
        ],
        "grain": [
          {
            "id": "attr.customers.customer_name",
            "type": "ATTRIBUTE"
          }
        ],
        "id": "customers",
        "references": [
          {
            "identifier": {
              "id": "customers",
              "type": "DATASET"
            },
            "multivalue": false,
            "sourceColumns": [
              "customer_id"
            ]
          }
        ],
        "tags": [
          "Customers"
        ],
        "title": "Customers"
      }
    ],
    "dateInstances": [
      {
        "description": "A customer's order date",
        "granularities": [
          "MINUTE"
        ],
        "granularitiesFormatting": {
          "titleBase": "string",
          "titlePattern": "%titleBase - %granularityTitle"
        },
        "id": "date",
        "tags": [
          "Customer dates"
        ],
        "title": "Date"
      }
    ]
  }
}

Once you create a LDM, you can use GoodData API to work with it:

Invoke-RestMethod -Method Post -Uri '$HOST_URL/api/v1/layout/workspaces/$WORKSPACE_ID/logicalModel' `
   -ContentType 'application/json' `
   -H @{ 
     'Accept' = 'application/json'
     'Authorization' = 'Bearer <API_TOKEN>' 
   } `
   | ConvertTo-Json

You will get a similar response:

{
  "ldm": {
    "datasets": [
      {
        "attributes": [
          {
            "defaultView": {
              "id": "label_id",
              "type": "label"
            },
            "description": "Customer name including first and last name.",
            "id": "attr.customers.customer_name",
            "labels": [
              {
                "description": "Customer name",
                "id": "label.customer_name",
                "sourceColumn": "customer_name",
                "sourceColumnDataType": "STRING",
                "tags": [
                  "Customers"
                ],
                "title": "Customer name",
                "valueType": "\"TEXT\" | \"HYPERLINK\" | \"GEO\""
              }
            ],
            "sortColumn": "customer_name",
            "sortDirection": "\"ASC\" | \"DESC\"",
            "sourceColumn": "customer_name",
            "sourceColumnDataType": "STRING",
            "tags": [
              "Customers"
            ],
            "title": "Customer Name"
          }
        ],
        "dataSourceTableId": {
          "dataSourceId": "my-postgres",
          "path": ["postgres_demo_3920fbd8f33ec386", "campaign_channels"],
          "id": "customers",
          "type": "DATA_SOURCE"
        },
        "description": "The customers of ours.",
        "facts": [
          {
            "description": "A number of orders created by the customer - including all orders, even the non-delivered ones.",
            "id": "fact.customer_order_count",
            "sourceColumn": "customer_order_count",
            "sourceColumnDataType": "STRING",
            "tags": [
              "Customers"
            ],
            "title": "Customer order count"
          }
        ],
        "grain": [
          {
            "id": "attr.customers.customer_name",
            "type": "ATTRIBUTE"
          }
        ],
        "id": "customers",
        "references": [
          {
            "identifier": {
              "id": "customers",
              "type": "DATASET"
            },
            "multivalue": false,
            "sourceColumns": [
              "customer_id"
            ]
          }
        ],
        "tags": [
          "Customers"
        ],
        "title": "Customers"
      }
    ],
    "dateInstances": [
      {
        "description": "A customer order date",
        "granularities": [
          "MINUTE"
        ],
        "granularitiesFormatting": {
          "titleBase": "string",
          "titlePattern": "%titleBase - %granularityTitle"
        },
        "id": "date",
        "tags": [
          "Customer dates"
        ],
        "title": "Date"
      }
    ]
  }
}