Go back to Blog's hub Blog   |   tags:  

Introducing Last Mile ETL: One Tool for Better Data Transformation

Written by Patrik Braborec  | 

Introducing Last Mile ETL: One Tool for Better Data Transformation

Imagine you connect a database to analytics and the data is not in the shape you would like to have it. For example, coordinates should be separated by latitude and longitude, some values are in a different format or type, and some tables can even have a bad design. Normally, you would need to spend a lot of time context-switching between the ELT/ETL pipeline and analytics, where, in the transformation (T) phase you would need to edit code that transforms data from the application shape to the analytics shape. This experience is sort of normal in our industry but does it have to be? For this reason, we are introducing Last Mile ETL!  With it, you can do everything in one tool which significantly improves the ability to iterate on and the speed of development, customization, and security. You may now ask how? Well, enough words, it’s time for an example. Let’s deep dive into it!

Exploratory Analytics With Last Mile ETL

What do I mean by "exploratory analytics"? Let’s say, we have just three tables with data, and we would like to explore its value or learn some facts based on this data. It means that instead of one concrete goal to achieve, we will primarily try to explore some value in this data! With defined exploratory analytics, here are three tables in a database (Airports, Country list, and GDP — Gross Domestic Product):

Tables in a database
Tables in a database

You can see that the coordinates are in one column called coordinates instead of latitude and longitude, or value in GDP the table is text rather than numerics. We can address these issues with the help of Last Mile ETL inside the analytics.

Let’s connect the database to the analytics (if you are not familiar with GoodData, I encourage you to check the documentation). The result is the following:

The result of connected database to analytics (GoodData)

What you can see in the image above are datasets. We can convert a dataset to a so-called SQL dataset:

Converting dataset to SQL dataset

The SQL dataset gives us the possibility to write SQL queries that are executed directly in the database. Let’s just check what types of airports we have in the database:

SQL query from analytics

It seems that the Airports table contains several types of airports such as heliports, or even closed airports. Let’s say that I am interested just in the medium and large airports — it’s not a problem at all. I don’t need to go to the ELT/ETL pipeline, I simply create a new SQL dataset called Airports transformed directly in the analytics, and I can do that with the following SQL code:

New dataset Airports transformed

You can see that I can do it with quite a simple SQL query and the result is the following: