Append data when columns missing from input data

Hello. 

 

I have a master dataset in domo which was originally extracted from a legacy system, and has 'timestamp' and 'id' columns which were specific to the legacy data set. New data is identical but with the exception of the fact that it doesn't include those columns. It seems if I try to append the data to the master data set, it just removes those columns entirely from the main data, which I do not want. I can see that I could create a new dataset from the new data, and then merge them in domo, but it looks like that will duplicate the number of rows that I have  in the system - my master dataset is already 8m rows so I don't want to duplicate this for allowance reasons. 

Is there anything I can do short of having to manually add null 'timestamp' and 'id' columns in new data before it is appended?

Comments

  • Hello,

     

    What does your source query look like in workbench? You may have to actually name the fields in the source query.

    For example ... 

     

    select null as id
    ,null as timestamp
    ,make
    ,model
    ,year
    from fake_table

    Hope this helps,

     

    Brian

     

    **Please mark "Accept as Solution" if this post solves your problem
    **Say "Thanks" by clicking the "heart" in the post that helped you.

  • I'm uploading CSV not selecting source data like that

  • I don't know of a way to solve for this using CSV. If you do expect your schema to change, I might upload the csv as a replacement file in Domo and do the append using a MySQL Domo dataFlow where you can control your columns.

     

    Brian

     

    **Please mark "Accept as Solution" if this post solves your problem
    **Say "Thanks" by clicking the "heart" in the post that helped you.

  • I thought of that, but I can't seem to see a way to do it where I am not creating a whole new dataset, i.e. massively increasing my row count because I am duplicating the whole original dataset in order just to append some data - do you know if it is possible just to append tom the actual source table?

  • So what I was thinking was you only bring the new records into Domo.

     

    Basically the first step would be to create a dataflow with the current dataset of 8 million records. Now you have two datasets, the original with 8 million and the new history dataset with 8 million. Tomorrow when the data reloads from workbench it will replace the original 8 million with only new records (now that old dataset may have 1,000 records as opposed to 8 million). You now take that dataset with 1,00 records into your dataflow and append it to the new history dataset with 8 million records and then your dataflow just overwrites that existing new history dataset. This is called a recursive history dataflow where you overwrite the input dataset with the output dataset. 

     

    Please see the attached image for an example. I am bringing in PROD_CRITICAL_pipeline_build_history, appending new data to the flow then overwriting the PROD_CRITICAL_pipeline_build_history. Everytime this runs it will complete the same process and I will only have one dataset with complete history. 

     

    Hope this helps,

     

    Brian

     

    **Please mark "Accept as Solution" if this post solves your problem
    **Say "Thanks" by clicking the "heart" in the post that helped you.

  • Thanks I'll give that a try

This discussion has been closed.