Deduplicating a table based on the content of certain columns
Stan_Smith
in Dataflows
I have a dataset with about 20 columns. The first column contains ID numbers, and many of the IDs appear multiple times. The data in all the other columns is duplicated as well, except for one column named last_updated, which holds a date.
I would like to de-dupe this dataset and keep only the row with the most recent last_updated date for each ID. Is there a way to do this?
Comments
The easiest way to do this is with the Group By tile in Magic ETL. Add every column except last_updated to the select (grouping) list. Then add last_updated to the aggregated column list and choose Max. Since the other columns are identical across the duplicates, this gives you one row per ID with its most recent date.
**Make sure to like any users' posts that helped you.
**Please mark as accepted the ones that solved your issue.
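For anyone who wants to see the logic outside of Magic ETL, here is a rough sketch of the same group-by-and-max idea in plain Python. It is not Domo code, just an illustration of what the Group By tile is doing. The function name `dedupe_latest`, the `id` and `name` columns, and the sample rows are all made up for the example; only `last_updated` comes from the question.

```python
from datetime import date


def dedupe_latest(rows, date_col="last_updated"):
    """Group rows by every column except date_col; keep the latest date per group."""
    latest = {}
    for row in rows:
        # The group key is every column other than the date column
        # (the equivalent of the Group By tile's select list).
        key = tuple(sorted((k, v) for k, v in row.items() if k != date_col))
        # Keep the row with the largest date (the equivalent of Max).
        if key not in latest or row[date_col] > latest[key][date_col]:
            latest[key] = row
    return list(latest.values())


# Hypothetical sample data: two IDs, each duplicated with different dates.
rows = [
    {"id": 1, "name": "Widget", "last_updated": date(2023, 1, 5)},
    {"id": 1, "name": "Widget", "last_updated": date(2023, 3, 9)},
    {"id": 2, "name": "Gadget", "last_updated": date(2023, 2, 1)},
    {"id": 2, "name": "Gadget", "last_updated": date(2023, 1, 20)},
]

deduped = dedupe_latest(rows)
for row in deduped:
    print(row["id"], row["name"], row["last_updated"])
```

Note that this only collapses rows whose other columns match exactly. If any non-date column differs between two rows with the same ID, they land in different groups and both are kept, which is also how grouping on all the other columns would behave.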