How to Leverage Metrics in Large-Scale Analytics

November 16, 2018
Miroslav Sova
Sr. Technical Product Marketing Manager
Mirek is Senior Technical Product Marketing Manager at GoodData. He has 12+ years of experience in the tech industry with B2B cloud-based products. He started as a .NET software developer and over time moved into product management and, more recently, product marketing. Throughout his career, he has contributed to launching an array of products: cloud security systems, software development tools, and an enterprise document digitization platform. Mirek holds a BSc in Computer Science from the University of Wollongong, Australia and a Master of IT from the University of Sydney, Australia.

Part Three of Four

If you’ve been following my blog series, then you already know that most product owners are looking to integrate embedded analytics into their data products to help make that product usable by everyone in the organization, and that semantic layers are a great way to accomplish that for large-scale analytics applications. Now let’s dive further into this discussion of semantic layers. We’ve already covered logical data models, so let’s talk about measures and metrics.

What are measures?

Measures are numeric aggregations of business transactions or events that have occurred over a period of time and for which a record is stored. This may sound complicated, so let’s look at an example. A measure can be something as simple as the sum of all sales for last month. Other kinds of measures might be: number of leads generated, outside temperature, miles driven, quantity sold, or total weight. Measures are typically segmented (sliced) or filtered by different dimensions. In the case of sales, we might want to view sales totals by product type, region, color, or brand.
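To make the idea of slicing concrete, here is a minimal Python sketch of one measure (total sales) viewed first as a single number and then segmented by two dimensions. The records, field names, and the sum_by helper are assumptions made up for illustration, not part of any particular product.

```python
from collections import defaultdict

# Hypothetical sales records; the field names and values are invented for illustration.
sales = [
    {"region": "East", "product_type": "Shoes", "amount": 120.0},
    {"region": "East", "product_type": "Hats", "amount": 30.0},
    {"region": "West", "product_type": "Shoes", "amount": 200.0},
    {"region": "West", "product_type": "Shoes", "amount": 50.0},
]


def sum_by(records, dimension, measure="amount"):
    """Sum a numeric field, grouped by the values of one dimension."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record[measure]
    return dict(totals)


# The unsliced measure: the sum of all sales.
total_sales = sum(record["amount"] for record in sales)

# The same measure sliced by two different dimensions.
sales_by_region = sum_by(sales, "region")
sales_by_product_type = sum_by(sales, "product_type")
```

The measure itself never changes here; only the dimension used to segment it does.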

A key aspect of a measure is the grain of the underlying fact table in the logical data model. Did the data engineer store every sales transaction? Or only the sum of transactions each day? Or perhaps we have stored the sum of sales each month broken out by product number and geographic region. The underlying grain will dictate how detailed the answers to our queries can be.   
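The effect of grain can be sketched in a few lines of Python. The example below assumes a hypothetical set of transaction-level rows (date, product number, amount) and rolls them up to one total per day. The data and helper name are illustrative only.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-grain facts: one row per sale (day, product number, amount).
transactions = [
    (date(2018, 10, 1), "A-100", 40.0),
    (date(2018, 10, 1), "B-200", 60.0),
    (date(2018, 10, 2), "A-100", 25.0),
]


def roll_up_to_daily(rows):
    """Aggregate transaction rows to one total per day, discarding product detail."""
    daily = defaultdict(float)
    for day, _product, amount in rows:
        daily[day] += amount
    return dict(daily)


daily_totals = roll_up_to_daily(transactions)

# A coarser question (the monthly total) is still answerable from the daily grain.
october_total = sum(daily_totals.values())

# A per-product question can only be answered from the transaction grain,
# because the daily totals no longer carry the product number.
product_a_total = sum(
    amount for _day, product, amount in transactions if product == "A-100"
)
```

If only the daily totals had been stored, the last query would be impossible: the product number was dropped when the rows were aggregated.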

Another key aspect of measures is that, because they lack context, they don’t necessarily tell us whether we are doing well or not so well. We might have generated 100 qualified leads from the last marketing campaign, but is that good? Or should we expect a phone call from the boss? To answer that question, we need a metric!

What are metrics?

A metric comprises one or more measures that are typically evaluated against a goal or a standard. For example, we might have a metric called “Qualified Leads per Month.” With this metric, we can see if the number of leads is increasing—or decreasing—and we can compare it to a standard, say, 50 qualified leads per month. (Hey, maybe the boss is calling to give us a raise!)
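A minimal Python sketch of this metric might look like the following. The monthly counts are made-up sample values, the target of 50 comes from the example above, and the function name and result layout are assumptions for illustration.

```python
# Hypothetical monthly counts of qualified leads (the underlying measure).
qualified_leads = {"2018-08": 42, "2018-09": 55, "2018-10": 100}

# The standard from the example: 50 qualified leads per month.
TARGET_PER_MONTH = 50


def evaluate_against_target(monthly_counts, target):
    """Per month, report the count, the change from the prior month,
    and whether the target was met."""
    results = {}
    previous = None
    for month in sorted(monthly_counts):
        count = monthly_counts[month]
        results[month] = {
            "count": count,
            "change": None if previous is None else count - previous,
            "met_target": count >= target,
        }
        previous = count
    return results


leads_metric = evaluate_against_target(qualified_leads, TARGET_PER_MONTH)
```

The measure alone is just the count; the metric adds the trend and the comparison against the standard.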

In addition to simple sums and averages, metrics can also be calculated from several measures. Going back to a sales example, let’s say that we want to track online sales as a percentage of retail sales. Our metric definition would divide the sum of online sales by the sum of retail sales and multiply the result by 100.
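As a rough sketch, a ratio metric of this kind could be written in Python as shown below. This is not the syntax of any specific analytics platform; the function name, parameter names, and sample figures are hypothetical.

```python
def online_to_retail_pct(online_sales, retail_sales):
    """Online sales expressed as a percentage of retail sales."""
    if retail_sales == 0:
        # The ratio is undefined when there are no retail sales.
        return None
    return 100.0 * online_sales / retail_sales


# Hypothetical totals for one period.
pct = online_to_retail_pct(online_sales=25_000.0, retail_sales=100_000.0)
```

Returning None for a zero denominator is one possible design choice; a real implementation might prefer to raise an error or report a null value instead.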

Metrics can also include other metrics. For example, in our business we might want to compute total costs by combining the cost of sales (one metric) with our inventory and carrying costs (two more metrics).
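The total-cost example can be sketched as one metric defined purely in terms of three others. In this hedged Python illustration, the cost figures are invented and each component metric is reduced to a simple lookup; in practice each would be computed from fact data.

```python
# Hypothetical cost figures for one period.
costs = {"cost_of_sales": 70_000.0, "inventory": 12_000.0, "carrying": 3_000.0}


def cost_of_sales(data):
    """Component metric: cost of sales."""
    return data["cost_of_sales"]


def inventory_costs(data):
    """Component metric: inventory costs."""
    return data["inventory"]


def carrying_costs(data):
    """Component metric: carrying costs."""
    return data["carrying"]


def total_costs(data):
    """A metric built only from other metrics, never from raw data directly."""
    return cost_of_sales(data) + inventory_costs(data) + carrying_costs(data)


total = total_costs(costs)
```

Because total_costs never touches the raw figures, a change to how any component is computed flows through automatically.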

The real value of metrics is not just that they can break down complexity into simpler components, but that they represent, in one place and for all to see, how our business measures itself. For example, a metric representing “profit” could be as simple as total sales less all our expenses, or something very complex involving tax rates and international currency conversions.

How does all this relate to the semantic layer?

The semantic layer not only contains the definitions for measures and metrics; it is also responsible for abstracting all the lower-level details away from the dashboards, reports, and analytical applications that use them.

For example, a dashboard that the executive team uses to make decisions might contain a metric called “Year-to-Date Profit on Retail Sales.” The semantic layer descends through the layers of metrics and measures that comprise such things as inventory expenses, invoices, and overhead formulas, and will issue the detailed queries needed against the fact tables to render the final number.
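The way a semantic layer descends through layers of definitions can be approximated with a small registry that resolves metrics by name. The Python sketch below is a deliberately simplified stand-in: it omits the year-to-date and retail filters, and the fact rows, metric names, and the evaluate function are all assumptions for illustration, not how any real semantic layer is implemented.

```python
# Hypothetical fact rows, invented for illustration.
facts = [
    {"kind": "invoice", "amount": 500.0},
    {"kind": "invoice", "amount": 300.0},
    {"kind": "inventory_expense", "amount": 200.0},
    {"kind": "overhead", "amount": 100.0},
]


def sum_of(kind):
    """Build a measure that sums the amounts of all fact rows of one kind."""
    return lambda rows, resolve: sum(r["amount"] for r in rows if r["kind"] == kind)


# Each entry maps a name to a function of (fact rows, resolver).
# Measures read the rows; metrics call the resolver for their dependencies.
DEFINITIONS = {
    "sales": sum_of("invoice"),
    "inventory_expenses": sum_of("inventory_expense"),
    "overhead": sum_of("overhead"),
    "expenses": lambda rows, resolve: resolve("inventory_expenses") + resolve("overhead"),
    "profit": lambda rows, resolve: resolve("sales") - resolve("expenses"),
}


def evaluate(name, rows, definitions=DEFINITIONS):
    """Resolve a metric by name, descending through the metrics it depends on."""
    def resolve(dependency):
        return definitions[dependency](rows, resolve)
    return resolve(name)


profit = evaluate("profit", facts)
```

A dashboard only needs to ask for "profit" by name; the chain of dependencies down to the fact rows stays hidden behind evaluate.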

We can begin to see, even in this simple example, the power of the semantic layer, and specifically of metrics, in taming the complexity inherent in almost any business. Metrics are a vast improvement over pages and pages of SQL statements. But who benefits?

Benefits of using metrics

Both data providers and data consumers benefit from using metrics. On the provider side, business or domain analysts can define a metric once and then use it many times over to provide a range of insights. On the consumer side, users of analytic applications can reuse metrics to look at information from different angles, or even deconstruct a metric and recombine the components in a different way to create a customized insight.

Metrics are maintained centrally, such that multiple applications can access them, and they can be replicated and tailored to the needs of individual departments, business units, clients, or customer groups. And, when changes need to be made, they can be rolled out quickly, thanks to clear dependencies and distinct layers of abstraction.

For the organization, metrics create a ‘single source of truth’ for how the business computes its performance measures. In this way, everyone from a data analyst to a mobile application developer to a product manager can leverage the same metrics and obtain the same results. Business managers from multiple departments can finally be on the same page!

Metrics are truly a rare example of something that makes a big difference for the organization and its users, but that doesn’t take a lot of time and effort to manage and deploy after the initial investment.

Best practices for creating metrics in large-scale analytic applications

Here are a few best practices that we have developed over hundreds of analytic implementations:

  • Data engineers and application designers should work together with those who have an in-depth understanding of the core business.
  • Establish clear standards for metric definitions, composition, and layering.
  • Insist on a clear title and description for every metric, so that everyone understands what the metric is—and isn’t.
  • Extend the usefulness of metrics by incorporating variables that automatically filter the data so that it is appropriate to the user viewing the results.
  • Ensure that you create sufficient documentation to account for future development or customization of your product by other teams.
  • If metrics are widely shared across development teams, consider versioning the metrics so that testing and deployment can be done in a controlled and predictable manner.  
  • Achieve consistency by recording the definition in a semantic layer that is shared across the organization, and ensure that all developers are aware of the structure.
  • Be sure to encourage reuse of existing metrics and discourage the building of ad-hoc metrics as new demands crop up. A well-designed semantic layer will facilitate reuse.
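
The practice above of using variables that automatically filter the data for the viewing user can be sketched roughly as follows in Python. The user names, region assignments, sales rows, and function name are all hypothetical; a real platform would typically manage such variables centrally rather than in application code.

```python
# Hypothetical mapping from user to the region that user is allowed to see.
USER_REGION = {"alice": "East", "bob": "West"}

# Hypothetical sales rows.
sales_rows = [
    {"region": "East", "amount": 120.0},
    {"region": "East", "amount": 30.0},
    {"region": "West", "amount": 250.0},
]


def sales_for_user(user, rows, user_region=USER_REGION):
    """Total sales restricted to the region assigned to the viewing user."""
    region = user_region[user]
    return sum(row["amount"] for row in rows if row["region"] == region)


alice_total = sales_for_user("alice", sales_rows)
bob_total = sales_for_user("bob", sales_rows)
```

The metric definition is written once, and the per-user variable decides which slice of the data each viewer sees.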

Metrics and measures, just like semantic layers and logical data models, are vital parts of creating a user experience that actually delivers value to the end user, but there’s one last component to consider.

I’ll be diving into the concept of insights—What are they? Why are they valuable?—in my next blog post.