How to delete rows automatically based on a date in a dataset?

White Belt

I am inserting new rows into a dataset on a daily basis, and the dataset has become huge. I want to delete rows from this dataset based on a date criterion (the dataset has a column of type "date"). The goal is to keep only the last 30 days of rows. For example, if I update the dataset today (Nov 25), it should only contain rows from Oct 25 to Nov 25. The next day (Nov 26), it should contain rows from Oct 26 to Nov 26, and any rows before Oct 26 should be deleted if they exist.

1. Is there a way I can do this?
2. Is there an API provided by Domo that can be used to do this programmatically?

Black Belt

Hi @user017486

You could utilize a DataFlow to do the following:

1) Run a simple DataFlow that filters your records so only the last 30 days are included in the output dataset.

2) Change your input dataset so it only returns the prior day's data, and set its update method to Replace.

3) Change the DataFlow from step 1 to append the prior day's dataset to your output dataset, which makes it a recursive DataFlow. Since you're trimming the output to only 30 days, there shouldn't be a big performance hit from the size of the data. Have this DataFlow run whenever the single-day dataset from step 2 is updated.
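
To make the three steps above concrete, here's a rough Python sketch of the logic only. It is not Domo code or a Domo API; the function name, the `date`/`value` column names, and the year 2020 are all made up for illustration. Each run appends the new day's rows to the previous output and then drops anything older than the cutoff.

```python
from datetime import date, timedelta


def recursive_update(previous_output, new_rows, today, keep_days=30):
    """Append the latest rows to the prior output, then trim old rows.

    Hypothetical helper: mimics one run of a recursive DataFlow.
    Rows dated exactly `keep_days` before `today` are kept (inclusive cutoff).
    """
    cutoff = today - timedelta(days=keep_days)
    combined = previous_output + new_rows
    return [row for row in combined if row["date"] >= cutoff]


# Simulate 40 daily runs, each delivering one row for that day.
output = []
first_run = date(2020, 10, 17)  # assumed year, for illustration only
for i in range(40):
    run_day = first_run + timedelta(days=i)
    daily_rows = [{"date": run_day, "value": i}]
    output = recursive_update(output, daily_rows, today=run_day)

print(len(output), output[0]["date"], output[-1]["date"])
```

Because the cutoff is inclusive, the output holds today plus the 30 days before it. Change the comparison to `>` if you want exactly 30 dates.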

Here are links to the knowledge base articles on configuring this. I'd recommend Magic ETL 2.0 for its speed improvements.

Magic ETL 2.0: https://knowledge.domo.com/Prepare/DataFlow_Tips_and_Tricks/Creating_a_Recursive%2F%2FSnapshot_DataF...

Magic ETL 1.0: https://knowledge.domo.com/Prepare/DataFlow_Tips_and_Tricks/Creating_a_Recursive%2F%2FSnapshot_ETL_D...

MySQL: https://knowledge.domo.com/Prepare/DataFlow_Tips_and_Tricks/Creating_a_Recursive%2F%2FSnapshot_SQL_D...

 

You'd then need to update any Domo objects that use your old dataset so they point to this new dataset.

Short version:

Configure a recursive DataFlow with a filter at the end of the DataFlow to remove any data more than 30 days old.

Another option would be to change your data pull so it uses the Replace update method and only pulls in the last 30 days. That would be the simplest option, provided you're able to adjust how many days back the pull goes.
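
If you go the Replace route, the only moving part is the date range you ask the source for on each run. Here's a small Python sketch of that arithmetic. `pull_window` is a hypothetical helper, not part of any Domo connector, and the year is assumed.

```python
from datetime import date, timedelta


def pull_window(today, days_back=30):
    """Return the (start, end) dates to request from the source on each run.

    Hypothetical helper: the window slides forward one day per run.
    """
    return today - timedelta(days=days_back), today


start, end = pull_window(date(2020, 11, 25))  # assumed year
print(start, end)
```

Note that a strict 30-day lookback from Nov 25 starts on Oct 26. If you want the window to start on Oct 25 as in the example in the question, use `days_back=31`.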


