Fact

A fact is a logical object that represents a numerical piece of data, which in business environments are used to measure business processes. For example, you may have data about the number of sold items, the price of the purchased goods, or the salary of individual employees.

Within a logical data model, a fact belongs to a dataset and must define a source column that corresponds to a column from the database.

Depending on your business goals, you may analyze raw facts (for example, present the salary of each individual employee) or aggregate the facts and analyze the resulting metrics (for example, sum salaries of the employees working for a specific department to see the total salary amount paid in this department).

Facts in datasets are identified by the following icon:

Fact Icon

In your database, facts can be stored as any numerical data type, with or without floating point.

Facts can be of the following types:

  • Additive facts can be used in computations, such as summing them together. Additive facts are sources for aggregation that is accomplished by a metric function.
  • Non-additive facts cannot be added. For example, Unit Prices cannot be added together to produce meaningful information. Non-additive facts can often be derived from other additive facts. For example, Unit Price can be derived from dividing Extended Price by Quantity.
  • Semi-additive facts can be added but only within certain context, therefore they must be managed carefully. For example, Inventory for a month is additive only within a month.

Data modelers are mostly interested in additive facts.

Fact Datasets

A collection of facts that measure the same business process are stored in a single data unit called a fact dataset (also known as a fact table). A fact dataset contains individual fact values and pointers to associated attributes used as context for the fact data. Fact datasets have relatively few columns and many rows.

All facts in a fact dataset are recorded with a certain level of detail. This level of detail is called the grain of a fact. The grain of a fact dataset is largely determined by the attribute references in it. For example, if an inventory fact dataset contains the Quantity fact and the Location and Product attributes, the grain of the dataset would be Location and Product.

GoodData supports only one-attribute grains.

Designing Fact Datasets

Depending on the type of the underlying business process, design fact datasets in one of the following ways:

  • Transactional fact dataset for discrete events

    One row corresponds to a single transaction, usually connected to one date dataset. Rows are added, not updated.

  • Accumulating fact dataset for evolving events

    This method models a process passing through several predefined stages, usually connected to multiple date datasets. One row corresponds to an execution of the process, with individual stages of the process tracked in separate columns. The rows must be updated throughout the stages.

Recommended Practices

  • Include the lowest level of source data in your facts. These facts can always be rolled up.

    Example: If you measure a business process by individual city data, you can always roll up that information to the state level. However, if data is stored at the state level, you cannot decompose that data to the individual city level.

  • Where possible, be consistent in your units of measure before the data is inserted into the system.

    Example: Do not store Fact 1 in inches and Fact 2 in centimeters. Even if you never intend to use these facts together, another user may decide to use both of them in a visualization.

  • Make sure that facts placed in the same fact table are at the same level of detail (grain) and from the same business process. If two events occur at different times, they belong in different fact tables.

  • Verify that you understand the level of detail (grain) in each fact dataset. In general, the complete set of attributes in a dataset indicates the level of detail.

  • Where possible, avoid including non-additive facts, which cannot be used in metric computations.

    • Instead of storing a ratio or percentage, store the facts that can be used to calculate the percentage or ratio as a metric.
    • Instead of storing the price of units (units * unit price), store units and unit price separately. Put the calculation of the price of the units or the average price into metrics.