What Is Analytics as Code?

Written by Lauri Hänninen

Analytics as code is an approach to data analytics in which code takes center stage in creating and managing analytics workflows and solutions. By treating analytics as code, businesses can apply software engineering principles to make their analytics processes more agile, efficient, and scalable.

At its core, analytics as code refers to an approach that enables users to develop and express analytics objects and functionalities through code. Traditionally, analytics has relied on a mix of coding (for example, SQL for data preparation) and pointing and clicking in graphical interfaces to create and manage dashboards. With analytics as code, human- and machine-readable code becomes the primary means of defining, manipulating, and automating analytics processes. This shift allows organizations to manage analytics as they would any other software development project, applying well-established software engineering practices, including version control, automated testing, CI/CD, and collaboration, to their data analytics workflows.

Components of Analytics as Code

To grasp the concept of analytics as code, it is important to understand the key components that facilitate the creation and management of analytics workflows. Organizations can use widely used languages such as Python, or human-readable data-serialization languages like YAML and JSON, to build and manage their data analytics. These languages let analytics engineers write the logic and instructions needed to achieve the desired analytics outcomes.

In addition, analytics as code treats all elements of analytics, such as data connectors, ETL/ELT, logical data models, metrics, visualization, dashboards, user management, and more, as objects that can be defined, manipulated, and customized through code. These analytics objects are serialized into a human-readable and editable textual format. This ‘as code’ approach enables analytics practitioners to apply version control, collaborate effectively, and track changes over time, similar to managing code in software projects. All this helps minimize the risk of breaking analytics when making updates. You’ll always know who did what and when, and you can roll back to the previous stable version if something goes wrong.
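
The idea of serializing an analytics object into editable text can be sketched in a few lines of Python. This is a minimal illustration, not any platform's actual format: the `Metric` class and the `serialize` and `deserialize` helpers are hypothetical names, JSON is chosen only because it ships with Python, and the field values are borrowed from the total spend metric shown later in this article.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Metric:
    """Hypothetical in-memory form of a metric object."""
    id: str
    title: str
    maql: str


def serialize(metric: Metric) -> str:
    # Sorted keys and fixed indentation keep the text stable between runs,
    # so a version-control diff shows only real changes.
    return json.dumps({"type": "metric", **asdict(metric)}, indent=2, sort_keys=True)


def deserialize(text: str) -> Metric:
    # Reject text that does not describe a metric before building the object.
    data = json.loads(text)
    if data.get("type") != "metric":
        raise ValueError("not a metric definition")
    return Metric(id=data["id"], title=data["title"], maql=data["maql"])


total_spend = Metric(id="total-spend", title="Total Spend", maql="SELECT SUM({fact/spend})")
text = serialize(total_spend)
print(text)
```

Because the output is deterministic, committing the file after every change gives a clean history of who changed which field and when.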

A crucial aspect of analytics as code is providing a mature developer experience. This involves offering robust development tools and frameworks specifically designed for analytics workflows. These tools may include integrated development environments (IDEs) with features like syntax highlighting and auto-completion for coding languages, as well as seamless integrations with version control systems. A strong developer experience ensures that analytics engineers can work efficiently, collaborate seamlessly with team members, and test and deploy analytics solutions with confidence.

Examples of Analytics as Code

Analytics as code covers the whole analytics lifecycle, from data pipeline integration to defining the analytical objects as code. Let’s look at some examples of these two items.

Data Pipeline

Tools such as Meltano and dbt let teams apply the analytics as code approach from the very start of the pipeline. Meltano helps build data pipelines by handling data extraction and loading (EL). dbt, in turn, focuses on transforming and modeling data into analytics-ready datasets, as shown in the following example:

-- Convert each raw item to JSON so its fields can be read by key.
with users as (
    select to_json("item") as item_json
    from cicd_input_stage.users
),

-- Extract and type the fields needed downstream.
final as (
    select
        CAST(json_extract_path_text(item_json, 'id') as INT) as user_id,
        CAST(json_extract_path_text(item_json, 'html_url') as TEXT) as url,
        CAST(json_extract_path_text(item_json, 'login') as TEXT) as login
    from users
)

select * from final

Adopting modern data tools such as Meltano and dbt helps organizations establish scalable and reproducible data pipelines, keep analytics artifacts under version control, and give analytics engineers what they need to develop, test, version, and deploy analytics solutions effectively.

Defining Analytical Objects

In analytics as code, analytics components such as logical data models, metrics, visualizations, and dashboards are treated as objects that can be defined, customized, and manipulated through code – as you can see in the following example with the metric definition of total spend:

# A metric is a computational expression of numerical data (facts or other metrics).
# For example, you can have metrics representing the average invoice sum or the number of sold items per country.


type: metric
id: total-spend

title: Total Spend

maql: SELECT SUM({fact/spend})

This approach allows you to treat analytics like any other software, which means using version control and applying testing best practices before deploying analytics to production. As mentioned above, developer experience plays a crucial role here. Analytics engineers do not want to spend their lives writing long YAML files by hand. They need tooling that delivers the benefits of the analytics as code approach while keeping analytics objects easy to write and maintain.
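
One way to picture testing before deployment is a small check that runs against every metric definition in a repository. The sketch below is an assumption-laden illustration: it treats a definition as an already-parsed dictionary with the four fields from the example above, and the `validate_metric` helper and the hyphenated-id convention are invented for this example rather than taken from any platform.

```python
import re

REQUIRED_FIELDS = ("type", "id", "title", "maql")
# Assumed convention for this sketch: lowercase words separated by hyphens.
ID_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def validate_metric(definition: dict) -> list[str]:
    """Return a list of problems; an empty list means the definition passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not definition.get(field):
            problems.append(f"missing field: {field}")
    declared_type = definition.get("type")
    if declared_type and declared_type != "metric":
        problems.append(f"unexpected type: {declared_type}")
    metric_id = definition.get("id", "")
    if metric_id and not ID_PATTERN.match(metric_id):
        problems.append(f"invalid id: {metric_id}")
    return problems


good = {
    "type": "metric",
    "id": "total-spend",
    "title": "Total Spend",
    "maql": "SELECT SUM({fact/spend})",
}
broken = {"type": "metric", "id": "Total Spend", "title": "Total Spend"}

print(validate_metric(good))
print(validate_metric(broken))
```

A CI job could run a check like this on every commit and block the deployment when the returned list is not empty.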

Benefits of Analytics as Code

Analytics as code brings several significant benefits to organizations, especially for technical professionals.

Flexibility and Customization

By using code, organizations can create highly tailored analytics workflows that address specific business needs and extract the most value from their data. Compared with drag-and-drop tools, code is the more expressive abstraction: definitions can be parameterized, combined, and generated rather than limited to what an interface exposes. With the ability to define and manipulate analytics objects through code, teams can adapt their analytics solutions as requirements evolve and business goals change.

Greater Control and Reusability

Representing analytics through code transforms analytical objects into reusable code snippets. This allows for easy reusability and iteration of analytics because the snippets can be shared among teams and reused in different contexts. Utilizing reusable code-based workflows ensures that analytics processes are consistent. This helps to prevent mistakes and differences that can occur when using different tools and interfaces.
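
As a rough sketch of what a reusable snippet might look like, the function below generates a "total" metric for any fact, following the shape of the total spend example earlier in the article. The `sum_metric` helper and the `revenue` and `quantity` fact names are hypothetical, and the definitions are plain dictionaries rather than any platform's real object model.

```python
def sum_metric(fact: str) -> dict:
    """Build a 'total' metric for one fact, mirroring the total-spend example."""
    return {
        "type": "metric",
        "id": f"total-{fact}",
        "title": f"Total {fact.replace('_', ' ').title()}",
        # Doubled braces produce literal braces in the MAQL expression.
        "maql": f"SELECT SUM({{fact/{fact}}})",
    }


# Hypothetical facts; only "spend" appears in the article's example.
metrics = [sum_metric(fact) for fact in ("spend", "revenue", "quantity")]

for metric in metrics:
    print(metric["id"], "->", metric["maql"])
```

Every team that calls the same function gets metrics with the same naming and structure, which is where the consistency benefit comes from.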

Support for Collaboration and Version Control

Coding languages and serialized text representations facilitate collaboration among analytics teams. Multiple team members can work simultaneously on different aspects of an analytics project, leveraging version control systems to manage changes, merge contributions, and track the evolution of analytics solutions over time. This level of collaboration and version control ensures transparency, accountability, and traceability in analytics projects.

Automation and Scaling

With analytics as code, you can automate workflows easily (eliminating the need for manual work) and scale your analytics. Automation enables you to deploy high-quality code more frequently through continuous integration and continuous deployment (CI/CD). Additionally, you can manage multi-tenant analytics architecture, including users, permissions, and the creation of new environments using code. By leveraging code instead of relying on point-and-click interfaces, building and scaling analytics becomes straightforward, as code offers better manageability and scalability.
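
To make the multi-tenant point concrete, here is a minimal sketch of deriving one environment definition per tenant from a shared template. Everything in it is assumed for illustration: the template keys, the `build_tenant` helper, and the tenant names and addresses do not correspond to a real platform schema.

```python
from copy import deepcopy

# Hypothetical template; the keys are illustrative, not a real platform schema.
TEMPLATE = {
    "dashboards": ["spend-overview"],
    "metrics": ["total-spend"],
    "permissions": {"viewer": [], "editor": []},
}


def build_tenant(tenant_id: str, viewers: list[str], editors: list[str]) -> dict:
    """Derive one tenant's environment definition from the shared template."""
    config = deepcopy(TEMPLATE)  # never mutate the shared template
    config["id"] = tenant_id
    config["permissions"]["viewer"] = sorted(viewers)
    config["permissions"]["editor"] = sorted(editors)
    return config


# Hypothetical tenants mapped to (viewers, editors).
tenants = {
    "acme": (["ann@acme.example"], ["bob@acme.example"]),
    "globex": (["cat@globex.example", "al@globex.example"], []),
}
configs = [build_tenant(name, *users) for name, users in sorted(tenants.items())]

for config in configs:
    print(config["id"], config["permissions"])
```

Adding a tenant then becomes a one-line change to the input data, which can be reviewed and deployed like any other commit.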

Accuracy and Reliability of Analytics Outcomes

Automated testing frameworks play a crucial role in analytics as code. Developers can create test suites and scripts to validate the functionality of analytics code, ensuring consistent and trustworthy results. By treating analytics as code, organizations can embrace continuous integration and deployment practices, enabling them to automate integration, testing, and the deployment of analytics workflows. This streamlined development lifecycle enhances efficiency and reliability.

Six Key Aspects of Developer Experience With Analytics as Code

A seamless developer experience is pivotal in enabling analytics engineers to effectively create, maintain, and collaborate on analytics workflows. This section explores six key aspects that define the developer experience with analytics as code. From productivity and testing to change management and collaboration, understanding these aspects is crucial for harnessing the full potential of analytics as code.

Ability To Create and Maintain Analytics as Code

A fundamental aspect of the developer experience with analytics as code is the ability to create and maintain the entire analytics stack through code, starting with ETL/ELT (defined with tools such as Meltano and dbt). This empowers developers to continuously improve and adapt analytics solutions in response to changing business requirements. By leveraging coding languages, developers can also define logical data models, metrics, insights, and dashboards, enabling greater agility and customization.

Productivity of Work

Analytics as code enhances developer productivity by reducing context switching and duplication. Developers can focus on a unified coding environment, eliminating the need to switch between multiple tools or interfaces. Additionally, code-based workflows streamline the development process, allowing developers to leverage existing libraries, frameworks, and best practices to expedite analytics solution creation.

Validation of Results and Automatic Testing

In analytics as code, robust testing capabilities are essential to ensure the accuracy and reliability of analytics outcomes. Developers should have access to comprehensive testing frameworks that enable them to validate results and maintain data integrity. Automatic testing processes and tools play a crucial role in detecting errors, enabling developers to confidently iterate on their code.
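
Result validation often starts with simple data checks. dbt, for instance, ships generic `unique` and `not_null` tests; the sketch below reimplements the same two ideas in plain Python so the logic is visible. The helper functions and the sample rows are made up for this example, with columns named after the output of the dbt model shown earlier.

```python
from collections import Counter


def not_null(rows: list[dict], column: str) -> list[dict]:
    """Return the rows where the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique(rows: list[dict], column: str) -> list:
    """Return the values that appear more than once in the column."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, count in counts.items() if count > 1)


# Made-up sample rows containing one duplicate id and one missing login.
users = [
    {"user_id": 1, "login": "ada"},
    {"user_id": 2, "login": "grace"},
    {"user_id": 2, "login": None},
]

print("rows with null login:", not_null(users, "login"))
print("duplicate user_id values:", unique(users, "user_id"))
```

Both helpers return the offending rows or values rather than a boolean, so a failing check also tells the developer what to fix.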

Ability To Annotate, Approve, Manage, and Revert Changes

Change management is a critical consideration in analytics as code. Developers need to be able to annotate, approve, manage, and revert changes effectively. Incorporating version control systems allows for better governance and control over analytics workflows. By maintaining a detailed history of changes, developers can track the evolution of analytics solutions and easily roll back to previous versions if necessary.
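
Because definitions are plain text, reviewing a change comes down to reading a diff. The sketch below uses Python's standard `difflib` module to show what a reviewer would see; the `describe_change` helper, the file name, and the edited title are hypothetical, and in practice a version control system such as Git produces this view.

```python
import difflib


def describe_change(old: str, new: str, name: str) -> str:
    """Return a unified diff between two versions of a definition file."""
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"{name} (previous)",
        tofile=f"{name} (proposed)",
        lineterm="",
    )
    return "\n".join(diff)


previous = "\n".join([
    "type: metric",
    "id: total-spend",
    "title: Total Spend",
    "maql: SELECT SUM({fact/spend})",
])
# Hypothetical edit: someone renames the metric's title.
proposed = previous.replace("title: Total Spend", "title: Total Spend (USD)")

print(describe_change(previous, proposed, "total-spend.yaml"))
```

Reverting the change means restoring the previous text, which the history keeps available.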

Ability To Collaborate on Analytics

Collaboration is key to the success of analytics projects. Analytics as code should provide effective tools and practices that enable analytics teams to collaborate seamlessly. These include features such as shared repositories, code reviews, and documentation. By fostering collaboration, teams can leverage collective knowledge and expertise, driving innovation and improving the quality of analytics solutions.

Ability To Work Iteratively

The iterative approach is fundamental to analytics as code. Developers should be able to make small changes, gather feedback, and refine their code in short cycles, delivering valuable insights more efficiently. This approach fosters innovation and enables organizations to stay agile in a rapidly evolving data landscape.

Considerations and Challenges of Analytics as Code

While analytics as code offers significant advantages, organizations must also consider the following challenges:

Skills and Expertise

Successfully implementing analytics as code requires skilled analytics engineers or data practitioners who possess both coding proficiency and domain expertise. The transition to analytics as code may necessitate investment in training and skill development for existing team members.

Complexity and Learning Curve

Transitioning from traditional tools and methods to analytics as code involves a learning curve. Teams must adapt to new workflows, coding languages, and development practices. Adequate time and resources should be allocated to support this transition and keep it as smooth as possible.

Maintenance and Scalability

As analytics workflows evolve and expand, maintaining and scaling the codebase can become challenging. Establishing robust practices for documentation, testing, and code management is crucial for long-term success. Organizations should prioritize scalability and ensure that analytics solutions can handle growing data volumes and evolving business requirements.

Ready To Get Started?

Are you ready to dive into the world of analytics as code? Take the first step by signing up for a 30-day free GoodData trial. Our analytics platform lets you treat all analytical objects as code and applies software engineering best practices to analytics, providing a modern developer experience.
