It's easy to remove duplicates. How do I KEEP only the duplicates in a dataset?
I have a large dataset that I am constantly adding to, and the unique identifier for each item is a 17- or 22-character text string. It is critical for me to quickly identify whether I have duplicated a previous item when I add to the dataset.
ETL makes it simple to remove duplicates from a dataset, but is there a way to eliminate everything BUT the duplicates? Ideally, I'd like to either:
1. Create an alert anytime a new duplicate value is added to the dataset, or
2. Create an Output Dataset that consists ONLY of the rows that have a duplicate value in a specific column.
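To make the two options concrete, here is a rough Python sketch of the logic I'm after. It is not ETL, only an illustration: the column name `item_id`, the function names, and the sample values are all made up for the example.

```python
from collections import Counter


def keep_only_duplicates(rows, key):
    """Option 2: return only the rows whose value in `key` occurs more than once."""
    counts = Counter(row[key] for row in rows)
    return [row for row in rows if counts[row[key]] > 1]


def find_new_duplicates(existing_ids, incoming_ids):
    """Option 1: return incoming IDs that already exist or repeat within the batch."""
    seen = set(existing_ids)
    duplicates = []
    for item_id in incoming_ids:
        if item_id in seen:
            duplicates.append(item_id)
        seen.add(item_id)
    return duplicates


# Hypothetical sample data; real identifiers would be 17 or 22 characters long.
rows = [
    {"item_id": "ID-001", "note": "first entry"},
    {"item_id": "ID-002", "note": "only entry"},
    {"item_id": "ID-001", "note": "second entry"},
]

duplicate_rows = keep_only_duplicates(rows, "item_id")
alerts = find_new_duplicates(["ID-001", "ID-002"], ["ID-003", "ID-002", "ID-003"])
```

In this sketch, `duplicate_rows` holds both `ID-001` rows (every copy is kept, not just the extras), and `alerts` lists each incoming ID that collides with something already seen.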
Thanks in advance for any help.