Prepare like a Maestro

I spent the last week going through the experience of data prep using the recently released beta Project Maestro tool from Tableau. Here are some few lessons I’ve learned while conducting a set of datasets into a Maestro workflow:

1 | Set my final goal at the beginning:

For this Maestro practice I had one main table of daily transactions recorded in a Baby tracking app. This main table was then complemented by others within the relational SQL database of the app. Here is a snippet of the main source table and its headers:

Screen Shot 2018-01-12 at 1.43.08 PM

From this table my main goals were:

  • Replace all Unix dates with regular datetime dates (YYYY-MM-DD HH:MM:SS)
  • Replace all data IDs with their respective human-readable labels from the auxiliary tables
  • Some extra IDs were also present on the Value column (e.g boolean 1 or 0), so clean up those as well
  • Export a tde and use it on Tableau

 

2 | Clean, Extract and Transform the dates

After literally breaking Maestro for the third time I realized that it is way better to spend some time cleaning up the tables before joining them. Once I put that on my mind Maestro worked like a charm. The first field I had to remove was the Blob column which only contained binary text for pictures that were added to the app.

Next, in order to transform the dates I’ve added a “New Step” to my Datasource which opened a “Clean” node in the workflow. In this single step I was able to create a couple of Calculated Fields to Extract and Transform the Unix dates from the Value column:

[Time Extract]
IF CONTAINS([AttributeName],’time’) THEN [Value]
END

Translation for humans: Maestro, whenever you see “time” in the AttributeName column, please bring the data from Value column.

Next I had to use another famous Tableau Calculation to convert the times into the format I wanter:

[Transaction Date]
DATETIME(DATEADD(‘second’,INT([Time extract]),#1970-01-01#))

Translation for humans: Maestro we are going to create a field with a Date and Time format. To build  this field we are going to make the field “Time extract” and convert it to a number, then we will add one second for each digit of the “Time extract”, this all starting on January 1st 1970

This calculation allowed me then to remove the [Time extract] field from the Maestro workflow. And this is really helpful tip in order to keep you sane using the tool. As soon as you don’t need a field, remove it!

At last I built another Calculation to replace the original Unix dates from the Value column, filling it with the new formatted dates:

[Value fixed]
IF CONTAINS([AttributeName],’time’) THEN STR([Transaction Date])
ELSE [Value]
END

Translation for humans: Maestro, whenever you see “time” in the Attribute Name field, pick the Transaction Data value (as a TEXT); for all other fields just pick the data from [Value] column.

I repeated these steps to also transform the “ModificationDateTime” and built the [Recorded Date] field:

[Recorded Date]
DATETIME(DATEADD(‘second’,INT([ModificationDateTime]),#1970-01-01#))

 

3 | Extract IDs from Value column

While exploring the [Value] column I found a type of attribute named “listvalue” that contained values like this:

3D238104-61C7-4AE3-A900-E7659B6E2B7E.37

This ended up being lists that we could build as we were using the app (e.g. list of foods our baby was eating). I then realized that prefix (the string the .) was always the same. I also found a table that was suspiciously named “FieldListItem”, in which I was able to see the labels that we added for each list entry. So, I’ve decided to trim this prefix out so I could deal only with the known code. For this I’ve used the good old Tableau’s Custom Split option:

[CategoryFieldListItemId]
TRIM( SPLIT( [ListValue extract], “.”, -1 ) )

This created the column that I needed with only the IDs that I would later replace with their labels. As a final manipulation on this stage I just needed to replace the original “listname”codes with the extracted ones:

[Value end]
IF [AttributeName] = “listvalue” THEN [CategoryFieldListItemId]
ELSE [Value fixed]
END

Translation to humans: Maestro you will pick the data from [CategoryFieldListItemId] column and place it on a new [Value end] field whenever you see the Attribute “listvalue”. For all the users just pick the [Value fixed]

Note: before I forget again, I’ve deleted the [Value fixed] field at the end of this Clean section.

 

4 | Joining the files

The first join that I made to my master file was the simplest one that I had. I’ve added the CategoryField table so I could bring the Names of the categories into the workflow. The table has these columns:

Screen Shot 2018-01-12 at 10.57.56 PM

As soon as I loaded this to Maestro I’ve added the Clean step and removed the [DataType], [OrderValue] and [ViewInSummary] fields. This kept only the fields I needed for my join to be perfect match:

Screen Shot 2018-01-12 at 10.59.50 PM

Note how the list of options from the two tables are identical. This started to look promising.

After this Join I’ve added a new Clean step where I was able to remove both fields that I used for the Join (seems weird right?) and kept only the [Name] column (which is what I just wanted to have on my final file).

My second Join was for the [Category] file and it followed a pretty similar logic. After adding datasource to workflow, cleaned all unnecessary fields, joined it with my last Clean and kept the column that I needed only (Category Name).

I followed the same rule with all other files that I joined. As you can see on my workflow below, after new files there is always a Clean followed by a Join step:

Screen Shot 2018-01-12 at 11.10.37 PM

And here is my final piece of Maestro art:

Screen Shot 2018-01-12 at 11.13.07 PM

As someone who started my professional career doing data migrations, this is simply an amazing powerful tool. And I can’t wait for my next Maestro trial project which is work related.

I hope you enjoy the reading and go ahead, give Maestro a try today!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s