Suppose I have two text files.
The first file, "Emails.txt", contains the following data:
00iiiiiiii_l@hotmail.com
00rrrrrrrr@hotmail.com
00zzzzz@gmail.com
00eeeeee@gotmail.com
00gggggg@uor.edu
00uuuuuuuu@yahoo.com
00e21_ss@cmail.com
00gggggggg@cmail.com
00zzzzzzzz48@hotmail.com
00aaaaaaa_2020@gotmail.com
jjjjjjjj@gmail.com
The second file, "Banned.txt", contains the following strings:
@gotmail.com
@cmail.com
@uor.edu
How can I delete every line in the first file, "Emails.txt", that contains any of the strings listed in the second file, "Banned.txt"?
The new file should contain:
00iiiiiiii_l@hotmail.com
00rrrrrrrr@hotmail.com
00zzzzz@gmail.com
00uuuuuuuu@yahoo.com
00zzzzzzzz48@hotmail.com
jjjjjjjj@gmail.com
Can this be done using sed or awk on Linux? Please suggest how to do this.
Answer
grep -v is enough. The -f flag lets you do exactly what you want:
grep -vf Banned.txt Emails.txt
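To try the command above on the sample data from the question, a minimal sketch follows. The scratch directory, the Filtered.txt output name, and the extra -F flag are additions made for illustration and are not part of the original command. -F treats every line of Banned.txt as a fixed string, so the dots are not interpreted as regex wildcards.

```shell
# Scratch directory so that no real files are touched.
workdir=$(mktemp -d)

# Sample data copied from the question.
printf '%s\n' \
  '00iiiiiiii_l@hotmail.com' \
  '00rrrrrrrr@hotmail.com' \
  '00zzzzz@gmail.com' \
  '00eeeeee@gotmail.com' \
  '00gggggg@uor.edu' \
  '00uuuuuuuu@yahoo.com' \
  '00e21_ss@cmail.com' \
  '00gggggggg@cmail.com' \
  '00zzzzzzzz48@hotmail.com' \
  '00aaaaaaa_2020@gotmail.com' \
  'jjjjjjjj@gmail.com' > "$workdir/Emails.txt"
printf '%s\n' '@gotmail.com' '@cmail.com' '@uor.edu' > "$workdir/Banned.txt"

# -v: print only non-matching lines
# -f: read the patterns from a file, one per line
# -F: treat each pattern as a fixed string rather than a regex
grep -vFf "$workdir/Banned.txt" "$workdir/Emails.txt" > "$workdir/Filtered.txt"
cat "$workdir/Filtered.txt"
```

Writing to a separate file rather than redirecting back into Emails.txt matters here: redirecting into the input file would truncate it before grep reads it.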
If you want to do something more complicated with the list of banned addresses, e.g. require that each entry match the whole of the domain, you'll need to generate a regex from your Banned.txt file:
cat Banned.txt | tr "\n" "|" | sed -e 's,|,$\\|,g' | sed -e 's,\\|$,,'
gives the desired pattern:
@gotmail.com$\|@cmail.com$\|@uor.edu$
Then:
cat Banned.txt | tr "\n" "|" | sed -e 's,|,$\\\\|,g' | sed -e 's,\\\\|$,,' | xargs -I '{}' grep -v '{}' Emails.txt
(the number of \ escapes is doubled because xargs evaluates them on the way through). This will match and remove me@uor.edu but not, e.g., me@uor.education.gov.
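A shorter way to get the same end-of-line anchoring, offered as an alternative sketch rather than the method used above, is to keep one pattern per line and let grep -f read them. The file names Anchored.txt and Filtered.txt and the three sample addresses are assumptions made for the demonstration. The first sed expression also escapes the dots, so that @uor.edu cannot match something like @uorXedu.

```shell
# Scratch directory with a small test set (hypothetical addresses).
workdir=$(mktemp -d)
printf '%s\n' 'me@uor.edu' 'me@uor.education.gov' 'you@gmail.com' > "$workdir/Emails.txt"
printf '%s\n' '@gotmail.com' '@cmail.com' '@uor.edu' > "$workdir/Banned.txt"

# Turn every banned string into an anchored regex:
#   1. escape each literal dot
#   2. append $ so the pattern must sit at the end of the line
sed -e 's/[.]/\\./g' -e 's/$/$/' "$workdir/Banned.txt" > "$workdir/Anchored.txt"

# Drop every address whose domain matches one of the anchored patterns.
grep -vf "$workdir/Anchored.txt" "$workdir/Emails.txt" > "$workdir/Filtered.txt"
cat "$workdir/Filtered.txt"
```

Because no alternation is built, there is no need for \| or for extra escaping to survive xargs.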