Wednesday, 6 March 2019

linux - bash - processing output one line at a time


I read that xargs was good for processing the output of a command one line at a time (and it is). I have the following line in my script.


./gen-data | awk '{printf $2 " "; printf $1=$2=$3=""; gsub (" ", "", $0);if(length($0) == 0){ print "0000"} else{print $0}}' | xargs -t -n2 -P1 bash -c 'datatojson "$@"' _

It produces the right output, there is no question of that. However, gen-data produces something like 1000 lines, and what I would really like is for this command to run after each line, not after all 1000 lines (it clearly pauses at regular intervals to wait for more input).


Here is what gen-data looks like:


candump $interface &
while true; do
    while read p; do
        cansend $interface $(echo $p | awk 'NF>1{print $NF}');
    done < <(shuf $indoc)
done

(cansend sends data to an interface and candump reads from that interface and outputs it onto the screen, but I wager that's not too relevant). In any case candump seems to be continuously streaming output, but when I pipe that into awk and xargs, it becomes chunked. Is it just because I used shuf? I would think that since it's going through the interface, and being read on the other side, it would be less chunked than shuf provides.



Answer



You can try the same command, this time with a few hacks to avoid output buffering:


./gen-data | gawk '{printf $2 " "; printf $1=$2=$3=""; gsub (" ", "", $0);if(length($0) == 0){ print "0000"} else{print $0}; fflush()}' | stdbuf -o0 xargs -t -n2 -P1 bash -c 'datatojson "$@"' _

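Mind the change from awk to gawk and the use of fflush, which pushes each record out as soon as it is printed instead of leaving it in awk's output buffer. You can also try mawk -Winteractive. Also mind that I added stdbuf -o0 before xargs. You can also try the latter at the start of the pipeline, in front of ./gen-data (that is, stdbuf -o0 ./gen-data).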
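To check the flush-per-record idea without a CAN interface, here is a minimal sketch of the same pipeline shape. Everything in it is a stand-in: fake_gen_data is a hypothetical replacement for ./gen-data with made-up sample lines (not real candump output), reformat is my step-by-step rewrite of the awk program above, and the sh -c printf takes the place of datatojson.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for ./gen-data: two made-up lines shaped roughly
# like "<interface> <id> [<len>] <bytes...>". Not real candump output.
fake_gen_data() {
    printf '%s\n' 'can0 123 [2] AB CD' 'can0 456 [0]'
}

# Same idea as the awk one-liner above: keep field 2, blank the first
# three fields, squeeze spaces out of the rest, fall back to "0000" when
# nothing is left. fflush() runs after every record so each line leaves
# awk immediately instead of waiting in a pipe buffer.
reformat() {
    awk '{
        id = $2
        $1 = $2 = $3 = ""
        gsub(/ /, "", $0)
        print id, (length($0) ? $0 : "0000")
        fflush()
    }'
}

# xargs -n2 hands two words (id and data) to each invocation; the printf
# is only a placeholder for datatojson.
run_pipeline() {
    fake_gen_data | reformat |
        xargs -n2 sh -c 'printf "id=%s data=%s\n" "$1" "$2"' _
}

run_pipeline
```

With a generator that trickles lines out slowly, the fflush() call is what should make each line reach xargs as it is produced. If the generator itself buffers its output, it still needs something like stdbuf in front of it.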

