We have this file:
1 2
1 3
1 2
3 3
52 1
52 300
and 1000 more lines.
I want to count the number of times each value occurs in the first column. For the sample above, the expected output is:
1 3
3 1
52 2
This means the value 1 appears three times in the first column, 3 once, and 52 twice.
How can I do that in Perl, AWK, or Bash?
Answer
If the input is sorted, you can use uniq:
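For example, assuming the columns are separated by a single space, cut can isolate the first column before uniq -c counts adjacent repeats. The temporary file below only stands in for the real input, since the question does not name one:

```shell
# Placeholder input: the sample rows from the question, already sorted by column 1.
infile=$(mktemp)
printf '%s\n' '1 2' '1 3' '1 2' '3 3' '52 1' '52 300' > "$infile"

# Keep only the first column, then count runs of identical adjacent lines.
counts=$(cut -d' ' -f1 "$infile" | uniq -c)
printf '%s\n' "$counts"

rm -f "$infile"
```

Note that uniq -c pads the count with leading spaces, so the raw output is right-aligned.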
If not, sort it first:
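A sketch of the unsorted case, again assuming space-separated columns and using a temporary file as a placeholder for the real input. sort -n groups equal values together so that uniq -c can count them:

```shell
# Placeholder input: the sample rows from the question, deliberately shuffled.
infile=$(mktemp)
printf '%s\n' '52 1' '1 2' '3 3' '1 3' '52 300' '1 2' > "$infile"

# Extract column 1, sort numerically so duplicates are adjacent, then count.
counts=$(cut -d' ' -f1 "$infile" | sort -n | uniq -c)
printf '%s\n' "$counts"

rm -f "$infile"
```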
Output:
3 1
1 3
2 52
The columns are swapped compared to your requirement; pipe the result through awk '{ print $2, $1 }'
to change that.
1 3
3 1
52 2
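Put together, the whole pipeline for unsorted input might look like the sketch below. It assumes space-separated columns, and the temporary file is a placeholder for the real input:

```shell
# Placeholder input: the sample rows from the question, in arbitrary order.
infile=$(mktemp)
printf '%s\n' '3 3' '52 300' '1 2' '52 1' '1 3' '1 2' > "$infile"

# Count first-column values, then swap "count value" into "value count".
swapped=$(cut -d' ' -f1 "$infile" | sort -n | uniq -c | awk '{ print $2, $1 }')
printf '%s\n' "$swapped"

rm -f "$infile"
```

Because awk splits on runs of whitespace and prints with a single space, the padding that uniq -c adds disappears in the final output.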
There's also the awk idiom, which does not require sorted input:
awk '{h[$1]++}; END { for(k in h) print k, h[k] }'
Output:
1 3
52 2
3 1
Because the output here comes from a hash, its order is not guaranteed; pipe it to sort -n
if you need it sorted:
awk '{h[$1]++} END { for(k in h) print k, h[k] }' | sort -n
If you're using GNU awk, you can do the sorting from within awk:
awk '{h[$1]++} END { n = asorti(h, d, "@ind_num_asc"); for(i=1; i<=n; i++) print d[i], h[d[i]] }'
In the last two cases the output is:
1 3
3 1
52 2
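To try the portable awk variant end to end, here is a small sketch that feeds it the sample rows from the question. The temporary file and the variable names are placeholders, not part of the original commands:

```shell
# Placeholder input: the sample rows from the question, in arbitrary order.
infile=$(mktemp)
printf '%s\n' '52 1' '1 2' '3 3' '1 3' '52 300' '1 2' > "$infile"

# h[$1]++ tallies each first-column value; END prints "value count" pairs.
# sort -n then orders the pairs by their leading number.
result=$(awk '{h[$1]++} END { for(k in h) print k, h[k] }' "$infile" | sort -n)
printf '%s\n' "$result"

rm -f "$infile"
```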