Can you remove duplicated rows of data in a data set?

We have a data set that is updated from a Google Sheet. I see that back in June we have duplicate entries for June 22 and June 23 in the data set, yet the data in the Google Sheet is not duplicated.

Is there a way to edit the data that is in the data set to remove these duplicates?


Answers

  • GrantSmith
    GrantSmith Indiana 🥷
    edited August 2021

    Hi @AlexF

    Short answer: not easily.

    If possible, you could completely reimport your dataset using the REPLACE update method, provided you still have all of your data.


    Otherwise, you could use a Magic ETL dataflow to remove the duplicates with the Remove Duplicates tile. Alternatively, use a Group By: group on all of your other columns together and take the MIN or MAX value of the metric field you need to pull. The grouping method could also be done in a dataset view.
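To illustrate the grouping idea outside of Domo, here is a minimal Python sketch. It is not Domo code: the column names (`date`, `region`, `sales`) and the values are made up for illustration, and `dedupe_rows` is a hypothetical helper. It groups rows on every column except the metric and keeps the row with the largest metric per group, which collapses exact duplicates into one row.

```python
def dedupe_rows(rows, metric):
    """Collapse rows that match on every column except `metric`,
    keeping the row with the largest metric value in each group."""
    grouped = {}
    for row in rows:
        # The group key is every (column, value) pair other than the metric.
        key = tuple(sorted((k, v) for k, v in row.items() if k != metric))
        if key not in grouped or row[metric] > grouped[key][metric]:
            grouped[key] = row
    # Dicts preserve insertion order, so groups come back in first-seen order.
    return list(grouped.values())


# Hypothetical sample data: June 22 and June 23 were loaded twice.
rows = [
    {"date": "06-22", "region": "East", "sales": 100},
    {"date": "06-22", "region": "East", "sales": 100},
    {"date": "06-23", "region": "East", "sales": 120},
    {"date": "06-23", "region": "East", "sales": 120},
    {"date": "06-24", "region": "East", "sales": 90},
]

deduped = dedupe_rows(rows, "sales")
for row in deduped:
    print(row)
```

Note that taking MIN or MAX only behaves like a true de-duplication when the duplicated rows are identical. If two legitimate rows share every grouping column but differ in the metric, this approach would drop one of them.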

  • Thanks Grant. I am really puzzled why this occurred in the first place.

  • @AlexF - We have gotten in the habit with a lot of datasets of just expecting that at some point something can go wacky and we can get duplicate data.

    So we think about data coming in and wonder if there is a way for duplicate data to come in (a script runs twice ... whatever) and build that into our dataflows.

    Maybe we just have bad luck, but it is common enough for us that we account for it almost every time.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🔴

    @AlexF Domo has two ingest methods when receiving data: a full REPLACE (all rows in the dataset are truncated and then the new data is brought in) or an APPEND (new rows are tacked on to the bottom). That's it.

    There is also UPSERT or PARTITIONing but both of those are based on the APPEND concept with a little data processing, and usually you know if you're using those features.

    Check your connectors. Is the update method APPEND or REPLACE? If it's APPEND, an easy way to get duplicate rows is to run the same connector twice in the same day.
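A toy Python model of the two update methods shows why a repeated run only hurts under APPEND. This is a sketch of the concept, not of Domo's internals: the `ingest` function and the sample rows are hypothetical.

```python
def ingest(dataset, batch, method):
    """Toy model of a connector run: return the dataset after loading `batch`."""
    if method == "REPLACE":
        # Existing rows are discarded, then the new batch is loaded.
        return list(batch)
    if method == "APPEND":
        # The new batch is added below whatever is already there.
        return list(dataset) + list(batch)
    raise ValueError(f"unknown update method: {method}")


# Hypothetical batch pulled from the source sheet.
batch = [("06-22", 100), ("06-23", 120)]

# Run the same connector twice with each method.
appended = ingest(ingest([], batch, "APPEND"), batch, "APPEND")
replaced = ingest(ingest([], batch, "REPLACE"), batch, "REPLACE")

print(len(appended))  # every row appears twice
print(len(replaced))  # second run simply overwrote the first
```

Under APPEND the second run doubles every row even though the source sheet itself has no duplicates, which matches the symptom in the original question.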

    Jae Wilson
  • Thanks