Monday 18 March 2019

bash - Diff and ignore lines missing in one file


I want to diff two files and ignore lines that are present in one file but missing in the other.


For example


File1:


foo
bar
baz
bat

File2:


foo
ball
bat

I'm currently running the following diff command


diff File1 File2 --changed-group-format='%>' --unchanged-group-format=''

Which in this case would produce


bar
baz

as the output, i.e. only missing or conflicting lines. I would like to print only conflicting lines, i.e. ignore cases where a line is present in File1 but missing from File2 (not the other way around). Is there any way to do this using diff, or do I have to resort to other tools? If so, what would you recommend?



Answer



You might also take a look at comm, if you have it available:


comm [-1] [-2] [-3] file1 file2
    -1  Suppress the output column of lines unique to file1.
    -2  Suppress the output column of lines unique to file2.
    -3  Suppress the output column of lines duplicated in file1 and file2.

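The input files should be sorted. However, you can turn off the sort-order check with the --nocheck-order option, if available (it is a GNU extension). Note that this only disables the check: comm still compares the files as if they were sorted, so results on unsorted input can be misleading.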


In your case you would want comm --nocheck-order -23 file filter_file
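
As a rough sketch of how this plays out on the sample files from the question, the snippet below sorts both inputs first and then asks comm for the lines found only in File1. The scratch directory, the .sorted file names and the only_in_file1 variable are illustrative choices, not part of the original answer, and LC_ALL=C is set only so that sort and comm agree on the ordering.

```shell
#!/usr/bin/env bash
# Use one collation order for both sort and comm (assumption: C locale is acceptable).
export LC_ALL=C

# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' foo bar baz bat > "$workdir/File1"
printf '%s\n' foo ball bat    > "$workdir/File2"

# comm expects sorted input, so sort each file first.
sort "$workdir/File1" > "$workdir/File1.sorted"
sort "$workdir/File2" > "$workdir/File2.sorted"

# -2 hides lines unique to File2, -3 hides lines common to both,
# leaving only the lines that appear in File1 but not in File2.
only_in_file1=$(comm -23 "$workdir/File1.sorted" "$workdir/File2.sorted")
printf '%s\n' "$only_in_file1"   # prints bar and baz, one per line

# Clean up the scratch directory.
rm -rf "$workdir"
```

Sorting changes the order of the output relative to the original files, so this fits best when line order does not matter.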

