Where Do SQL Transform Changes Take Place?

I am currently building a dataset that requires inserting a dummy row when a specific date does not exist in a table, and I am attempting to do this in a MySQL dataflow. From what I have read, I cannot use an INSERT statement when the transform type is "Table"; instead, the transform type needs to be "SQL". My question is: how and where do I access the output data once that SQL transform runs? Is it recursive, making changes only to the original input dataset?

I cannot find any content produced by Domo that answers this question. The best I could find is "SQL transform does not create an output dataset." If that is the case, where do the changes take place?


Best Answers

  • MarkSnodgrass (Portland, Oregon)
    Accepted Answer

    Your INSERT SQL statement should insert into a table that you created in a previous step. Then you can reference that table in a later step, and it will contain your inserted data. Here is a sample screenshot.

    mysql.PNG

    The SELECT statement that I perform in transform_data_1 will include the newly inserted row that the SQL statement added to the prep_data table.

    Hope this helps.
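    The steps Mark describes can be sketched as follows. This is a hedged illustration of the pattern for the original question; prep_data, report_date, and amount are placeholder names, not taken from the screenshot. The SQL transform conditionally inserts a dummy row into a table built by an earlier transform, and a later "Add Table" transform then sees it:

    ```sql
    -- SQL transform (produces no output table of its own): insert a
    -- dummy row only when the target date is missing from prep_data.
    -- prep_data, report_date, and amount are hypothetical names.
    INSERT INTO prep_data (report_date, amount)
    SELECT '2021-01-01', 0
    FROM DUAL
    WHERE NOT EXISTS (
        SELECT 1 FROM prep_data WHERE report_date = '2021-01-01'
    );

    -- Later "Add Table" transform (e.g. transform_data_1): the dummy
    -- row is now visible because it was written into prep_data.
    SELECT * FROM prep_data;
    ```

    The INSERT ... SELECT ... WHERE NOT EXISTS form keeps the step idempotent: re-running the dataflow will not insert the dummy row twice.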

  • jaeW_at_Onyx (Budapest / Portland, OR)
    Accepted Answer

    You can use the SQL transform if you choose... but I recommend against it.

    The "Add Table" transform does a lot (and you can add an index via the UI), but unless you're doing large JOINs, adding indexes can cost more time than simply SELECTing into a table.

    Ideally you want to build dataflows that you can easily hand over to a coworker. The more SQL you write, the more you depend on their ability to read your code when you want to go on holiday.

    To be honest, having done consulting for over a hundred Domo clients, I can say with confidence that for most ETLs:

    1) Magic ETL will be faster.

    2) Magic ETL will deliver 80% of transformation requirements.

    SQL is fun because it's familiar, but it has consistently been my experience that organizations that can leverage Magic can iterate faster.

    Just my ten cents! Good luck!!

Answers

  • Thanks Mark,

    I just tested it out and now everything is up to par. I appreciate the timely and accurate response!

  • jaeW_at_Onyx (Budapest / Portland, OR)

    Sorry I'm late to the party, but I wanted to clear up a few points.

     

    MySQL dataflows execute in a standard MySQL 5.6+ database environment. The UI hides some automated scripting activity. When you create an Add Table transform, you are creating the equivalent of

    CREATE TABLE tableName AS
    SELECT ...
    FROM ...;

    @MarkSnodgrass is 100% correct in his solution.

     

    A few nit-picky points:

    With a SQL transform (not Add Table), in addition to an INSERT, ALTER, UPDATE, or DELETE statement, you could write CREATE TABLE ... and then access that table in a downstream Add Table transform.

    More advanced uses of the SQL transform include creating stored procedures or functions for advanced data transformation.
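    As a hedged sketch of that CREATE TABLE option (all names here are illustrative, not from this thread): a SQL transform can create a helper table, and a downstream Add Table transform can then reference it.

    ```sql
    -- SQL transform: build a small scaffold table of dates.
    -- calendar_scaffold and report_date are placeholder names.
    CREATE TABLE calendar_scaffold AS
    SELECT '2021-01-01' AS report_date
    UNION ALL
    SELECT '2021-01-02';
    ```

    A downstream Add Table transform could then run, for example, SELECT * FROM calendar_scaffold, or LEFT JOIN it against the source data to surface missing dates.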

     

    The "recursive" documentation you stumbled across is a red herring; it does not apply to your scenario.

    When you run a MySQL dataflow, the output is an entirely new dataset (unless you run a recursive dataflow that uses the output dataset as an input). In Domo, data does not get stored long term in a database. It resides in a data lake ("Vault") before moving into the database layer ("Adrenaline") for querying and visualization.

    The input and output datasets are, from the data lake's perspective, completely separate and unrelated files.

    "The best I could find is 'SQL transform does not create an output dataset'" -- this is not necessarily true. As I stated earlier, you could write CREATE TABLE.

  • Thanks for smoothing things over, Jae!

    I feel much more comfortable using the SQL transform now. I also appreciate the clarification on producing output datasets from a SQL transform. The CREATE TABLE functionality will be widely used on my end.

  • Thanks for making the recommendation to primarily stick with Magic ETL!

    I had a couple of dataflows that involved three datasets well over 500k rows and two joins. Using the SQL transform, they were taking about 35-40 minutes to run. I moved those dataflows over to Magic ETL, and now the average runtime is no longer than a minute.

    When it comes to lengthy logic I might bring a dataset into a SQL transform, but for large-scale joins I am going to use Magic ETL.