Primary Key

A primary key is an attribute contained in a dataset in your logical data model (LDM) that does the following:

  • It uniquely identifies each record in the dataset, which enables the system to distinguish individual records.
  • It enables connecting this dataset to another one, which uses the primary key's values to form a relationship.

Primary keys are important identifiers of uniqueness within a dataset.

To connect two datasets, define a primary key in the first dataset and a reference (foreign key) in the second dataset. Together, they form a relationship: a one-directional mapping between a parent object (the dataset with the primary key) and a child object (the dataset with the reference) that references the data in the parent object.

You can have only one primary key per dataset, but you can have multiple other datasets in the LDM with references to that primary key. Any single reference in a dataset can point back to only one primary key.

Note that the primary key can also be a composite key, which is created by concatenating multiple attributes. A composite key is still treated as a single primary key (grain).
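As a minimal sketch of the idea, the following Python snippet builds a composite key by concatenating two attributes. The column names (`order_id`, `line_number`), the sample rows, and the `composite_key` helper are hypothetical and only illustrate the concept; they are not part of any product API.

```python
# Hypothetical rows; the column names are assumptions made for illustration.
rows = [
    {"order_id": "1001", "line_number": 1, "quantity": 2},
    {"order_id": "1001", "line_number": 2, "quantity": 1},
    {"order_id": "1002", "line_number": 1, "quantity": 5},
]


def composite_key(row, columns, separator="|"):
    """Concatenate the values of the given columns into a single key value."""
    return separator.join(str(row[column]) for column in columns)


keys = [composite_key(row, ["order_id", "line_number"]) for row in rows]

# order_id alone repeats, but the concatenated key is unique per row.
order_ids_are_unique = len({row["order_id"] for row in rows}) == len(rows)
composite_keys_are_unique = len(set(keys)) == len(rows)
```

The separator is a design choice in this sketch: without it, the value pairs ("1", "11") and ("11", "1") would both concatenate to "111" and no longer be distinguishable.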

The direction of the arrow determines which dataset’s data can be analyzed (sliced) by the data from the other dataset. For example, in the following LDM, the relationship between the Customers and Order Lines datasets allows you to slice Quantity by Customer Name.

Diagram with six datasets: Order Lines, Campaigns, Campaigns / Channels, Customers, Products, and Date. The Order Lines dataset acts as a fact dataset with the primary key Order Line ID. It is connected via directional arrows to four attribute datasets: Customers, Products, Campaigns, and Date, and one additional fact dataset: Campaigns / Channels. Each arrow represents a one-directional relationship, originating from a dataset with a primary key and pointing to Order Lines, where the corresponding foreign key is located. These foreign keys allow data in Order Lines to be sliced and filtered by attributes in the connected datasets.
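To illustrate what slicing Quantity by Customer Name means, here is a small Python sketch. It assumes a hypothetical `customer_id` primary key in Customers and a matching reference in Order Lines; the sample data and the `quantity_by_customer_name` function are invented for this example.

```python
from collections import defaultdict

# Parent dataset: hypothetical primary key value -> attributes.
customers = {
    "C1": {"customer_name": "Alice"},
    "C2": {"customer_name": "Bob"},
}

# Child dataset: each row carries a reference (foreign key) to Customers.
order_lines = [
    {"order_line_id": "L1", "customer_id": "C1", "quantity": 2},
    {"order_line_id": "L2", "customer_id": "C1", "quantity": 3},
    {"order_line_id": "L3", "customer_id": "C2", "quantity": 4},
]


def quantity_by_customer_name(customers, order_lines):
    """Sum the quantity of order lines, grouped by the parent's customer name."""
    totals = defaultdict(int)
    for line in order_lines:
        # Follow the reference from the child row to its parent record.
        customer = customers[line["customer_id"]]
        totals[customer["customer_name"]] += line["quantity"]
    return dict(totals)


result = quantity_by_customer_name(customers, order_lines)
```

The lookup works in one direction only: each order line resolves to exactly one customer, which is why the fact in Order Lines can be grouped by an attribute of Customers.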

When you define primary keys and references, keep the following in mind:

  • Make sure that each value of a primary key is unique. If the data contains multiple rows with identical primary key values, visualizations will show incorrect results.

    The values of the reference (foreign key) do not have to be unique, because multiple records can reference the same primary key value.

  • Maintain “referential integrity”: make sure that each value of the foreign key column in the referencing dataset has a corresponding value in the primary key column of the referenced dataset.

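    This is important because the query engine uses dataset relations in the LDM to connect the underlying database tables (technically, inner joins are used in SQL). With an inner join, rows whose foreign key value has no matching primary key value are left out of the result.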
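Both rules can be checked before loading data. The following Python sketch shows one way to do it, under the assumption of hypothetical Customers and Order Lines rows; the helper functions are written for this example and only imitate an inner join in plain Python, they are not how the query engine is implemented.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return primary key values that occur in more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, count in counts.items() if count > 1)


def orphaned_references(child_rows, foreign_key, parent_rows, primary_key):
    """Return foreign key values that have no matching primary key value."""
    parent_keys = {row[primary_key] for row in parent_rows}
    return sorted({row[foreign_key] for row in child_rows} - parent_keys)


def inner_join(child_rows, foreign_key, parent_rows, primary_key):
    """Pair every child row with every parent row that has the same key."""
    return [
        {**parent, **child}
        for child in child_rows
        for parent in parent_rows
        if child[foreign_key] == parent[primary_key]
    ]


# Hypothetical data with one duplicated key (C2) and one orphan (C9).
customers = [
    {"customer_id": "C1", "customer_name": "Alice"},
    {"customer_id": "C2", "customer_name": "Bob"},
    {"customer_id": "C2", "customer_name": "Bob (duplicate)"},
]
order_lines = [
    {"order_line_id": "L1", "customer_id": "C1", "quantity": 2},
    {"order_line_id": "L2", "customer_id": "C2", "quantity": 3},
    {"order_line_id": "L3", "customer_id": "C9", "quantity": 4},
]

duplicates = duplicate_keys(customers, "customer_id")
orphans = orphaned_references(order_lines, "customer_id", customers, "customer_id")

joined = inner_join(order_lines, "customer_id", customers, "customer_id")
actual_total = sum(row["quantity"] for row in order_lines)
joined_total = sum(row["quantity"] for row in joined)
```

In this sample, the duplicated key `C2` makes order line `L2` appear twice in the join, and the orphaned reference `C9` makes order line `L3` disappear, so the joined total (8) differs from the actual total (9).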