Where Do SQL Transform Changes Take Place?

Yellow Belt

I am currently building a dataset that requires me to insert a dummy row if a specific date does not exist in a table. I am attempting to do this in a MySQL dataflow. From what I have read, I cannot use an INSERT statement when the transform type is a table; instead, the transform type needs to be SQL. My question is how/where do I access the output data once that SQL transform is run? Is it recursive, only making changes to the original input dataset?

 

I cannot seem to find any content produced by Domo that answers this question. The best I could find is "SQL transform does not create an output dataset." If that is the case, where do the changes take place?

 

 


All Replies
Black Belt

Your INSERT SQL statement should insert into a table that you created in a previous transform. Then you can reference that table in a later transform, and it will contain your inserted data. Here is a sample screenshot.

[Screenshot: mysql.PNG]

The SELECT statement that I perform in transform_data_1 will include the newly inserted row that the SQL statement added to the prep_data table.
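In SQL terms, the pattern described above might look like the following sketch. All of the names here (prep_data, sales_input, report_date, amount) and the date are hypothetical stand-ins for your own schema, not anything from the screenshot:

```sql
-- Transform 1 (Table type): build a working table from the input dataset.
-- "sales_input" is a hypothetical input dataset name.
CREATE TABLE prep_data AS
SELECT report_date, amount
FROM sales_input;

-- Transform 2 (SQL type): insert the dummy row only if the date is missing.
-- MySQL 5.6 needs FROM DUAL when a WHERE clause is used without a table.
INSERT INTO prep_data (report_date, amount)
SELECT '2021-01-01', 0
FROM DUAL
WHERE NOT EXISTS (
    SELECT 1 FROM prep_data WHERE report_date = '2021-01-01'
);

-- Transform 3 (Table type): read prep_data, which now includes the dummy row.
SELECT report_date, amount
FROM prep_data;
```

The key point is that the SQL transform mutates a table created earlier in the dataflow, and any later transform that reads that table sees the mutation.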

Hope this helps.

 




**Make sure to like any users' posts that helped you and accept the ones that solved your issue.**

Yellow Belt

Thanks Mark, 

 

I just tested it out and now everything is up to par. I appreciate the timely and accurate response!

Black Belt

Sorry I'm late to the party, but I wanted to clear up a few points.

 

MySQL dataflows execute in a standard MySQL 5.6+ database environment. The UI hides some automated scripting: when you create an Add Table transform, you are creating the equivalent of

CREATE TABLE tableName AS
SELECT ...
FROM ...;

 

@MarkSnodgrass is 100% correct in his solution.

 

Some nit-pick points.

With a SQL transform (not Add Table), in addition to an INSERT, ALTER, UPDATE, or DELETE statement, you could write CREATE TABLE ... and then access that table in a downstream Add Table transform.

More advanced uses of the SQL transform include creating stored procedures or functions to do advanced data transformation.
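As a hedged sketch of that stored-procedure idea (all names here are hypothetical): one SQL transform can define a procedure, and a later SQL transform can call it. In a plain mysql client you would need DELIMITER handling around the procedure body; how Domo's transform editor handles multi-statement bodies is worth verifying in your own instance.

```sql
-- SQL transform A: define a procedure (fill_missing_date, prep_data,
-- report_date, and amount are made-up names for illustration).
CREATE PROCEDURE fill_missing_date(IN d DATE)
BEGIN
    -- Insert a zero-amount dummy row only when the date is absent.
    INSERT INTO prep_data (report_date, amount)
    SELECT d, 0
    FROM DUAL
    WHERE NOT EXISTS (
        SELECT 1 FROM prep_data WHERE report_date = d
    );
END;

-- SQL transform B: call it for each date you need padded.
CALL fill_missing_date('2021-01-01');
```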

 

The 'recursive' documentation you stumbled across is a red herring; it does not apply to your scenario.

 

When you run a MySQL dataflow, the output is an entirely new dataset (unless you run a recursive dataflow that uses the output dataset as an input). In Domo, data does not get stored long term in a database: it resides in a data lake ("Vault") before moving into the database layer ("Adrenaline") for querying and visualization.

 

The input and output datasets (from the Data Lake's perspective) are completely separate and unrelated files.  

 

"The best I could find is "SQL transform does not create an output dataset"  -- this is not necessarily true.  As I stated earlier, you could write CREATE TABLE.

 


Jae Wilson
Check out my Domo Training YouTube Channel

**Say "Thanks" by clicking the heart in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"
Yellow Belt

Thanks for smoothing things over, Jae!

 

I feel much more comfortable using the SQL Transform. Also, I appreciate the clarification on producing output datasets from an SQL Transform. The CREATE TABLE functionality will be widely used on my end. 

Black Belt

You can use the SQL transform if you choose... but I recommend against it.

 

the "Add Table" transform does a lot (and you can add index via the UI) but unless you're doing large JOINs adding Indexes can cost more (in time) than just SELECT into table.

 

Ideally you want to build dataflows that you can easily hand over to a coworker. The more SQL you write, the more you depend on their ability to read your code when you want to go on holiday.

 

To be honest, having done consulting for over a hundred Domo clients, I can say with confidence that for most ETLs, 

1) Magic ETL will be faster

2) Magic ETL will deliver 80% of transformation requirements.  

 

SQL is fun because it's familiar, but it has consistently been my experience that organizations that leverage Magic can iterate faster.

 

Just my ten cents!  Good luck!!


Jae Wilson
Check out my Domo Training YouTube Channel

**Say "Thanks" by clicking the heart in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"


Yellow Belt

Thanks for making the recommendation to stick primarily with Magic ETL!

 

I had a couple of dataflows that involved three datasets of well over 500k rows and two joins. Using the SQL transform, they took about 35-40 minutes to run. I moved those dataflows over to Magic ETL, and now the average runtime is no longer than a minute.

 

When it comes to lengthy logic, I might bring a dataset into the SQL transform, but for large-scale joins I am going to use Magic ETL.
