How to append data to the output dataset?

rishi_patel301
edited November 3 in Dataflows

I have a Workbench job that will run daily. It pulls the data for the current date. I then need to filter this data and create two datasets using the DataFlow ETL tool.

How do I append data to the DataFlow ETL output table?

DataFlow ETL creates a new table every time it runs.


I realize that one way to achieve this is to select Append in the Workbench job itself; however, that takes up unnecessary computing resources since it processes the entire table every time data is appended.


Answers

  • MarkSnodgrass Portland, Oregon 🟤

    It sounds like you want to leverage recursive dataflows. I would recommend looking at this KB article, which will walk you through how to set one up.

    https://domohelp.domo.com/hc/en-us/articles/360057087393-Creating-a-Recursive-Snapshot-DataFlow-in-the-New-Magic-ETL
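
    To make the pattern concrete, here is a minimal pandas sketch of what the recursive logic does (the file and column names, such as load_date, are illustrative assumptions, not Domo-specific names): the historical output feeds back in as an input, rows already stored for today's date are removed, and today's rows are appended.

    ```python
    import pandas as pd

    # Historical output (last run's result) and today's Workbench load,
    # both assumed to be available as flat files for this sketch.
    history = pd.read_csv("historical_output.csv", parse_dates=["load_date"])
    today = pd.read_csv("todays_workbench_load.csv", parse_dates=["load_date"])

    # Drop any rows already stored for today's load date so a rerun
    # of the dataflow doesn't duplicate them.
    history = history[~history["load_date"].isin(today["load_date"].unique())]

    # Append today's rows to the retained history; this combined result
    # becomes the new output, which is next run's "history" input.
    combined = pd.concat([history, today], ignore_index=True)
    combined.to_csv("historical_output.csv", index=False)
    ```

    In the Magic ETL version described in the KB article, those steps roughly map to using the output dataset as an input, a filter that removes the current date, and an Append Rows tile.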

  • GrantSmith Indiana 🔴

    Hi @rishi_patel301

    Have you looked into the DataSet Copy connector to copy your datasets and set the update method to Append? https://domohelp.domo.com/hc/en-us/articles/360043436533-DataSet-Copy-DataSet-Connector

    You can copy the dataset to the same instance and set the update method to Append. That way you'd remove the Append Rows tile and the two old input datasets, output just those two datasets, and then have two DataSet Copy jobs to copy them with the Append update method.
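
    In other words, the dataflow itself only ever touches today's rows; here is a rough sketch of that slimmed-down shape (column names like region and load_date are just placeholders for whatever you filter on):

    ```python
    import pandas as pd

    # Sketch of the slimmed-down dataflow: it only filters today's Workbench
    # load into the two daily outputs. Appending to history is handled
    # downstream by the two DataSet Copy jobs set to the Append update method.
    daily = pd.read_csv("todays_workbench_load.csv", parse_dates=["load_date"])

    dataset_a = daily[daily["region"] == "EMEA"]   # first filtered output
    dataset_b = daily[daily["region"] != "EMEA"]   # second filtered output

    dataset_a.to_csv("daily_output_a.csv", index=False)
    dataset_b.to_csv("daily_output_b.csv", index=False)
    ```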

  • @MarkSnodgrass

    I am trying to do recursion; however, it creates a new dataset every time. I end up with two datasets with the same name.

    I wonder why DOMO did not give an option to append to an existing output dataset in the ETL tool.

  • MarkSnodgrass Portland, Oregon 🟤

    @rishi_patel301 The steps are a little tricky, but you will end up with just one dataset once you're done. I would review the steps again, as you shouldn't end up with two datasets.

  • jaeW_at_Onyx Budapest / Portland, OR 🟤

    If it were me, I would implement the DS pipeline and the recursive dataflow as two separate dataflows.

    Separate interests: if you alter your DS pipeline for whatever reason, or need to test something, you don't necessarily want to commit that to your recursive dataflow. And if you need to add new columns or calculations to your recursive dataflow, you don't want to have to reprocess your DS pipeline.