Comparing two dataset outputs

I am trying to find a way to compare the output of two datasets. Both have the same columns and the same timeline, and I want to create a flag whenever any column differs between the two outputs.

Comments

  • amehdad

    Hi there, is there a reason why you would not consolidate those two outputs into one dataset for your needs?

  • User2021

    The data is refreshed weekly and the purpose is to find the differences. I merged both datasets into one, but I am struggling to work out how to compare Col A of the first table against Col A of the second table.

  • GrantSmith (Indiana)

    You can use an ETL with both datasets as inputs, left join them together based on your columns, and see which rows are missing values from the second dataset. That will identify where your records don't exactly line up.
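To make the left-join idea concrete outside the ETL tool, here is a minimal sketch in plain Python. The sample rows, the `id` and `week` join columns, the `right_` prefix, and the `matched` flag are all assumptions made for illustration, not names from the original datasets. It also assumes the join keys are unique in the second dataset.

```python
# Hypothetical sample data; "id" and "week" stand in for the real join columns.
last_week = [
    {"id": 1, "week": "2021-W01", "col_a": 10},
    {"id": 2, "week": "2021-W01", "col_a": 20},
    {"id": 3, "week": "2021-W01", "col_a": 30},
]
this_week = [
    {"id": 1, "week": "2021-W01", "col_a": 10},
    {"id": 2, "week": "2021-W01", "col_a": 25},
]
KEYS = ("id", "week")


def left_join(left, right, keys):
    """Keep every left row; attach right-side values, or None when unmatched."""
    # Index the right side by its key tuple (assumes keys are unique there).
    right_by_key = {tuple(row[k] for k in keys): row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(tuple(row[k] for k in keys))
        out = dict(row)
        for col in row:
            if col not in keys:
                out["right_" + col] = match[col] if match else None
        # Explicit flag, so a genuine None value is not mistaken for "no match".
        out["matched"] = match is not None
        joined.append(out)
    return joined


joined = left_join(last_week, this_week, KEYS)
unmatched = [row for row in joined if not row["matched"]]
```

Rows in `unmatched` are the records from the first dataset that have no counterpart in the second.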

  • User2021

    Left join, then create a formula to get the difference?

  • GrantSmith (Indiana)

    You can use a Group By with the COUNT aggregation to count how many records match, compare that to the number of records in your original dataset, and then use a formula to calculate total - count. The result is the number of records that don't match.
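The total - count arithmetic can be sketched as follows. This is only an illustration in plain Python, not the ETL tool's own syntax. The key tuples and the function name `count_unmatched` are hypothetical.

```python
def count_unmatched(original_keys, other_keys):
    """Return (total, matched, unmatched) for the original dataset's keys."""
    other = set(other_keys)
    total = len(original_keys)
    # COUNT of original records that found a match in the other dataset.
    matched = sum(1 for key in original_keys if key in other)
    return total, matched, total - matched


# Hypothetical (id, week) keys taken from each dataset.
original_keys = [(1, "2021-W01"), (2, "2021-W01"), (3, "2021-W01")]
refreshed_keys = [(1, "2021-W01"), (2, "2021-W01")]

total, matched, unmatched = count_unmatched(original_keys, refreshed_keys)
```

Here `unmatched` is the number of original records with no match. Note that it says nothing about records that exist only in the second dataset.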

  • Depending on the purpose of your comparison, you may want to consider a Full Outer Join instead of a Left Join. That way you can check if records are missing from either dataset instead of just one.
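A full outer join combined with a per-column difference flag might look like the sketch below. It is plain Python under stated assumptions: the sample rows, the `id` key, the `col_a` and `col_b` value columns, and the `status`, `changed_cols`, and `diff_flag` field names are all invented for illustration, and keys are assumed to be unique within each dataset.

```python
def full_outer_compare(left, right, keys, value_cols):
    """Compare two row lists key by key, keeping keys found on either side."""
    left_by_key = {tuple(r[k] for k in keys): r for r in left}
    right_by_key = {tuple(r[k] for k in keys): r for r in right}
    results = []
    # The union of both key sets is what makes this a full outer join.
    for key in sorted(left_by_key.keys() | right_by_key.keys()):
        l_row, r_row = left_by_key.get(key), right_by_key.get(key)
        changed = []
        if l_row is None:
            status = "right_only"
        elif r_row is None:
            status = "left_only"
        else:
            # Col A vs Col A, Col B vs Col B, and so on.
            changed = [c for c in value_cols if l_row[c] != r_row[c]]
            status = "different" if changed else "same"
        results.append({
            "key": key,
            "status": status,
            "changed_cols": changed,
            "diff_flag": status != "same",
        })
    return results


# Hypothetical sample data.
old = [
    {"id": 1, "col_a": 10, "col_b": "x"},
    {"id": 2, "col_a": 20, "col_b": "y"},
    {"id": 3, "col_a": 30, "col_b": "z"},
]
new = [
    {"id": 1, "col_a": 10, "col_b": "x"},
    {"id": 2, "col_a": 25, "col_b": "y"},
    {"id": 4, "col_a": 40, "col_b": "w"},
]

report = full_outer_compare(old, new, keys=("id",), value_cols=("col_a", "col_b"))
```

Each entry in `report` carries a `diff_flag` that is true when the record is missing from either side or when any compared column differs.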