Attribute

An attribute is a logical object that represents a non-measurable descriptor. It breaks metrics apart and provides context to the data. For example, if you have data about daily sales amounts, you may want to break it down by the location of the sales departments and find out how much money each sales department brings in. In this case, the department location is the attribute by which you break down your measure, which is the daily sales amount.
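
As a minimal sketch of the idea (the records and the function name `total_by_location` are made up for illustration, and this is not how the platform computes insights), breaking a measure down by an attribute amounts to grouping the measured values by the attribute value and aggregating each group:

```python
from collections import defaultdict

# Hypothetical daily sales records: (date, department location, amount).
daily_sales = [
    ("2024-01-01", "US", 1200.0),
    ("2024-01-01", "EU", 800.0),
    ("2024-01-02", "US", 950.0),
    ("2024-01-02", "EU", 1100.0),
]


def total_by_location(rows):
    """Sum the sales amount (the measure) per location (the attribute)."""
    totals = defaultdict(float)
    for _date, location, amount in rows:
        totals[location] += amount
    return dict(totals)


sales_by_location = total_by_location(daily_sales)
print(sales_by_location)
```

Here the location plays the role of the attribute, and the summed amount is the measure that it breaks down.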

You can also use attributes as measures to count the number of distinct values of the attribute. For example, if you have an attribute describing the location of a sales department, you may want to know how many distinct locations you have in total.
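
A short sketch of this use, with made-up sample values and a hypothetical helper `count_distinct`: counting the distinct values of an attribute means ignoring repeats and counting what remains.

```python
# Hypothetical location values, one per sales department record.
department_locations = ["US", "EU", "US", "APAC", "EU"]


def count_distinct(values):
    """Return the number of distinct attribute values."""
    return len(set(values))


distinct_locations = count_distinct(department_locations)
print(distinct_locations)
```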

Attributes can be text (for example, the location: US, EU; or the sales channel: Facebook, Twitter) or numerical (for example, the table size: 1, 2, 3).

Attributes in datasets are identified by the following icon:

Attribute Icon

Attributes and Attribute Labels

An attribute can also be viewed as a collection of attribute labels logically bound together. For example, the Department attribute can be represented by different labels, such as:

  • Full name (Human Resources, Research and Development, Quality Assurance)
  • Shortened name (HR, RD, QA)
  • Number (1, 2, 3)

When you create an attribute in your logical data model (LDM), it is added with a single label, which has the same name as the attribute itself. This label becomes the primary label for the attribute. Every attribute has at least one label, and you can add more labels to it.

When an attribute with multiple labels is used in an insight, the attribute values belonging to the primary label are populated into the insight. If an attribute has only one label, this label becomes primary by default.

Within an LDM, an attribute belongs to a dataset, and each label of this attribute must define a source column that corresponds to a column from the physical data model (PDM).
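
The relationship between an attribute, its labels, and their source columns can be sketched as a small data structure. This is an illustrative model only, not the platform's API: the class names, the column names, and the choice to treat the first label as the primary one are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    name: str
    source_column: str  # hypothetical column name in the PDM


@dataclass
class Attribute:
    name: str
    source_column: str
    labels: list = field(default_factory=list)

    def __post_init__(self):
        # A new attribute starts with a single label named after itself.
        if not self.labels:
            self.labels.append(Label(self.name, self.source_column))

    @property
    def primary_label(self):
        # Assumption for this sketch: the first label is the primary one.
        return self.labels[0]

    def add_label(self, name, source_column):
        """Attach an additional label backed by its own source column."""
        self.labels.append(Label(name, source_column))


department = Attribute("Department", "department_name")
department.add_label("Shortened name", "department_code")
department.add_label("Number", "department_number")
print(department.primary_label.name, len(department.labels))
```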

Dimension Datasets

A set of related attributes is called a dimension. For example, Address, City, State, and ZIP Code may be related in a dimension called Location. Each attribute in a dimension is a discrete entity, yet they are all related to each other.

Dimension datasets are stored in dimension tables. Dimension tables are typically wide and shallow (they do not have many rows).

Your dimensions should always have consistent definitions and contents. Dimensions that share identical structures are called conformed dimensions. Conformed dimensions make it easier to create insightful analytics because the data is consistent.

For example, the State attribute should not use two-letter abbreviations (CA) along with full state names (California). Queries using this malformed attribute will not be able to match the two versions of the state name.
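
A small sketch with made-up sales rows shows the effect. Because the two spellings are different values, a simple aggregation by state treats California as two separate states:

```python
# Hypothetical sales rows: (state value as stored, amount).
sales = [("CA", 100), ("California", 250), ("NY", 75)]


def total_by_state(rows):
    """Sum amounts per state value, exactly as the values are stored."""
    totals = {}
    for state, amount in rows:
        totals[state] = totals.get(state, 0) + amount
    return totals


totals = total_by_state(sales)
print(totals)
```

The sales for California end up split between the "CA" and "California" groups instead of being reported as one total.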

Whenever possible, share a dimension dataset with other datasets to ensure consistency. Shared dimension datasets are always conformed. Sharing here means that relations exist between multiple datasets and the shared dimension dataset.

  • Avoid placing attributes in fact tables. Fact tables should contain facts and foreign keys to attributes stored in other dimensions.

    The only exception may be the performance of analytics. You can denormalize attributes into fact tables and slice by them. However, the resulting insight can be different because attribute values without any matching fact records are not returned, which is not acceptable for most insights.

  • Create common dimensions that can be reused (shared) when you create additional fact tables. For example, you should have only one dimension table for customers, one for products, one for employees, and so on. These conformed dimensions ensure uniformity of data in the workspace and enable re-use of the associated contextual information.

  • Name attributes consistently so that business users understand them, because these users will interact with the attributes in the workspace that uses your LDM.
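
The denormalization trade-off mentioned in the first guideline above can be illustrated with a minimal sketch. The data and function names are hypothetical, and how an actual analytics engine reports values without facts may differ. The point is only that slicing by a column inside the fact table can never show an attribute value that has no fact rows, while a separate dimension still knows about it:

```python
# Hypothetical dimension: every known department location.
locations = ["US", "EU", "APAC"]

# Hypothetical fact rows with the location denormalized into them.
facts = [("US", 100), ("EU", 50), ("US", 25)]


def totals_from_facts_only(fact_rows):
    """Slice by the denormalized column; only values present in facts appear."""
    totals = {}
    for location, amount in fact_rows:
        totals[location] = totals.get(location, 0) + amount
    return totals


def totals_with_dimension(dimension, fact_rows):
    """Start from the dimension so values without facts are still listed.

    Assumes every location in the fact rows also exists in the dimension.
    """
    totals = {location: 0 for location in dimension}
    for location, amount in fact_rows:
        totals[location] += amount
    return totals


facts_only = totals_from_facts_only(facts)
with_dimension = totals_with_dimension(locations, facts)
print(facts_only)
print(with_dimension)
```

In this sketch, "APAC" is missing from the first result because no fact row mentions it, but it is still listed in the second result because the dimension holds it.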