I have two datasets, 1 and 2, each containing a number of columns with values. My end goal is to find all of the rows in dataset 1 that are different from and can't be found in dataset 2.
Dataset 1 (example):
Name Species Age
Donald Dog 3
Petronella Dog 5
Dataset 2 (example):
Name Species Age
Donald Dog 3
Anna Dog 5
In the example above, I would like to find out that the combination of cell values regarding Petronella are unique to the first dataset and can't be found in the second. Donald and Anna are of less interest in this case.
Perhaps an easy option would be to add a fourth column with a value of 1 or 0 depending on wether the range of data is present in the second dataset.
I know how to compare one range directly to another range, but how do I expand this comparison to include all of the rows in dataset 1? The order of the rows should not be a factor when determing if a range of values in dataset 1 can be found in dataset 2.
Answer
You can use MATCH and CONCATENATE in an Array Formula to know the list of Unique Values from Dataset1. Since MATCH is used the comparison is not Case Sensitive though.
Sample Dataset1 is in cells B4:D7 & dataset2 is say in G4:I7. Now in E4 put the following formula and then press CTRL+SHIFT+ENTER from within the formula bar to create an Array Formula. The formula shall get enclosed in Curly Braces to indicate that it's any Array Formula.
=IF(ISERROR(MATCH((B4&C4&D4),CONCATENATE($G$4:$G$7,$H$4:$H$7,$I$4:$I$7),0)),"Unique","Duplicate")
See the below screenshot. It's the basic use of MATCH but the argument is Concatenated List of Rows in an Array.
No comments:
Post a Comment