How to Build Logical Data Models in At-Scale Analytical Applications

October 23, 2019
GoodData Staff, Writer

Part Two of Four

In my last blog post, I talked about the value of using semantic layers, both for users and providers, in a large-scale embedded analytics application. Specifically, the value of a semantic layer comes from its ability to simplify the complexity of data, make it more understandable to end-users, and ensure consistency throughout the organization. Now, to build on that discussion, I’m diving into the design and creation of a logical data model, the first component of a semantic layer.

What Is a Logical Data Model?

Put simply, a logical data model provides an abstract representation, either visual (a diagram) or programmatic (code), of a domain of information as it is understood at a point in time. It also serves as the foundation and support structure for other semantic layer components such as measures and insights, so its design must be solid.

To get a bit more technical, logical data models describe facts, the numerical information around transactions, and dimensions, the aspects of a transaction. For example, a fact could be the price paid for an item, and a dimension might be the product category of the item sold. Logical data models are also optimized for answering questions in a multitude of ways. Because they are independent of underlying data structures, they provide a useful abstraction from database and data warehouse technologies.
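
To make the fact and dimension idea concrete, here is a minimal sketch in Python of a programmatic model. It is an illustration only, not GoodData's actual modeling API: the LogicalDataModel class, its total method, and the field names price_paid and product_category are all hypothetical, and the sample rows are made up.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalDataModel:
    """Hypothetical, minimal model: named facts and named dimensions."""

    facts: frozenset[str]       # numeric values recorded per transaction
    dimensions: frozenset[str]  # descriptive aspects used to slice facts

    def total(self, rows: list[dict], fact: str, by: str) -> dict[str, float]:
        """Sum one fact, grouped by the values of one dimension."""
        if fact not in self.facts:
            raise KeyError(f"unknown fact: {fact}")
        if by not in self.dimensions:
            raise KeyError(f"unknown dimension: {by}")
        totals: dict[str, float] = defaultdict(float)
        for row in rows:
            totals[row[by]] += row[fact]
        return dict(totals)


model = LogicalDataModel(
    facts=frozenset({"price_paid"}),
    dimensions=frozenset({"product_category"}),
)

# Made-up sample transactions, for illustration only.
sales = [
    {"price_paid": 20.0, "product_category": "Books"},
    {"price_paid": 5.0, "product_category": "Toys"},
    {"price_paid": 10.0, "product_category": "Books"},
]

totals = model.total(sales, fact="price_paid", by="product_category")
print(totals)  # {'Books': 30.0, 'Toys': 5.0}
```

The model only names what can be measured and how it can be sliced. It says nothing about tables or storage, which is what keeps it independent of the underlying database.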

Let’s look at three different best practices for creating and managing a logical data model for large-scale analytic applications.

Start with the Business Case

Before you can start building a logical data model, you first need to understand your end-users (including how many users you have and how many of them will interact with analytics) as well as their analytics needs and business processes. Map out those needs and processes, and interview users about the decisions they need to make as they use the application. With a better understanding of their workflow, you can design a logical data model that answers their questions and supports their decisions with accuracy and good performance.

Of course, you can’t ignore the business case either. So take the time to interview key stakeholders and understand their needs and the kinds of decisions they’ve made in the past. How do they inform decisions today? And how can the analytic application help them make better decisions in the future?

Build a Logical Data Model One Step at a Time

Don’t try to build a perfectly comprehensive logical data model. We see this all the time: customers ask for access to all of their data, which is a big task that requires a lot of resources to complete. Not only that, but it doesn’t answer concrete questions or solve any particular problem. It just creates new problems for users who are overwhelmed by the amount of information they have access to.

Starting with a smaller initial model is also beneficial for you, not just the user. With a more manageable selection of data, you can get a better idea of what steps you’ll need to take before launch. Will you need to prepare to teach every user, or is the change small enough that they’ll adjust to it fairly easily? Even if you do wind up needing to teach your users, that time and resource commitment is less taxing if they can start with a manageable set of data elements.

A less ambitious data model also gets data to users more quickly and gives you feedback about their experience early on, which makes it easier to start working on the next iteration. So start with what you know your users need today, and then build on that knowledge as needs and decisions evolve over time. If you don’t, there’s less chance of your model scaling and performing as your customers expect.

Deal Effectively with Change

Even though the model might not change every week, it will still change. As your model evolves, you’ll need to ensure that changes don’t require you to put your data product on hold, which is especially critical for a SaaS business where things change quickly.

To accommodate those changes, you should follow agile methodologies and avoid downtime. Establishing a closed-loop feedback process with your users as you build your data product is critical for deciding what future changes should be.

You’ll not only need tools for the creation and management of models, but you’ll also need automated deployment and data loading tools. After all, you’re building a large-scale analytics application for lots of external users, so you can’t be sending emails to notify your users that there will be downtime caused by deployment or data loads—or worse, failures of these processes!
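
One check an automated deployment step might run, sketched here under assumptions, is a comparison of the current and proposed model versions. The premise, which is mine rather than a rule stated in this post, is that purely additive changes are the safest to roll out without interrupting users, while removals deserve a closer look. The compare_models function and the dictionary layout are hypothetical and do not reflect any particular deployment tool.

```python
def compare_models(current: dict, proposed: dict) -> dict:
    """Report what a proposed model version adds or removes.

    Each model is a dict with "facts" and "dimensions" keys holding
    sets of names (a hypothetical layout chosen for this sketch).
    """
    added, removed = {}, {}
    for kind in ("facts", "dimensions"):
        added[kind] = proposed[kind] - current[kind]
        removed[kind] = current[kind] - proposed[kind]
    return {
        "added": added,
        "removed": removed,
        # Assumption: a change that removes nothing is "additive only".
        "additive_only": not any(removed.values()),
    }


v1 = {"facts": {"price_paid"}, "dimensions": {"product_category"}}
v2 = {"facts": {"price_paid", "quantity"}, "dimensions": {"product_category"}}
v3 = {"facts": {"price_paid"}, "dimensions": set()}

next_iteration = compare_models(v1, v2)  # adds the "quantity" fact
risky_change = compare_models(v1, v3)    # drops "product_category"
```

In this sketch, next_iteration is flagged as additive only, while risky_change is not, so a pipeline could hold the second change for review instead of deploying it automatically.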

In my next blog post, I’ll delve more deeply into additional parts of a semantic layer: measures and insights. Just like logical data models, they’re critical parts of creating a great user experience and ensuring that the application consistently delivers value to the end-user and to the company.
