Tuesday, 8 October 2019

command line - What does the linux pipe symbol "|" do?



Here is a command that sorts the files in a folder in reverse order:


ls | sort -r

What does the | symbol in that command do?


What I'm really looking for here is a high-level (easy to understand) explanation of pipes for Linux beginners. I see other questions about pipes here on Superuser, but nothing that elicits an answer that explains in simple terms what they do and how they differ from redirection (the > or < symbol).



Answer



The following is simplified a bit to help new users.


Well, first, it's necessary to understand the concept of standard input and standard output.


In Linux and other UNIX-like operating systems, each process has a standard input (stdin) and a standard output (stdout). The usual situation is that stdin is your keyboard and stdout is your screen or terminal window.


So when you run ls, it will throw its output to stdout. If you do nothing else, it will go to your screen or terminal window, and you will view it.


Now, some Linux commands interact with the user and use stdin to do that; your text editor is one of those. It reads from stdin to accept your keystrokes, does things, and then writes stuff to stdout.


However, there are also "filter" commands that do NOT work interactively, but want a bunch of data. These commands will take everything stdin has, do something to it, and then throw it to stdout.
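As a minimal sketch of a filter at work (assuming a POSIX-style shell; the fruit names are made-up sample data), printf writes three lines to its stdout and sort reads them from its stdin, using the same | as in the question's command:

```shell
# printf writes three lines to stdout; the pipe feeds them to sort's stdin.
# The result is captured in a variable only so it can be shown afterwards.
sorted=$(printf 'banana\napple\ncherry\n' | sort)

# sort never asked for input; it consumed whatever arrived on stdin.
echo "$sorted"
```

Nothing here is typed at the keyboard: sort simply takes whatever stdin supplies and writes the result to stdout.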


Let's look at another command called du, which stands for disk usage. du /usr, for example, will print out (to stdout, like any other Linux command) a list of every directory under /usr and its size:


# du /usr
2312 /usr/games
124 /usr/lib/tc
692 /usr/lib/rygel-1.0
400 /usr/lib/apt/methods
40 /usr/lib/apt/solvers
444 /usr/lib/apt
6772 /usr/lib/gnash

As you can tell right off the bat, it isn't sorted, and you probably want it sorted in order of size.


sort is one of those "filter" commands that will take a bunch of stuff from stdin and sort it.


So, if we do this:


# du /usr | sort -nr


we get this, which is a bit better:


4213348 /usr
2070308 /usr/lib
1747764 /usr/share
583668 /usr/lib/vmware
501700 /usr/share/locale
366476 /usr/lib/x86_64-linux-gnu
318660 /usr/lib/libreoffice
295388 /usr/lib/vmware/modules
290376 /usr/lib/vmware/modules/binary
279056 /usr/lib/libreoffice/program
216980 /usr/share/icons

And you can now see that the "pipe" connects the stdout of one command to the stdin of another. Typically you will use it in situations like this, where you want to filter, sort or otherwise manipulate the output of a command. Pipes can be cascaded if you want to process output through multiple filter-type commands.
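Cascading can be sketched like this, assuming a POSIX-style shell. The sample lines only imitate du's "size path" format; the values and paths are invented for illustration and are not real du output:

```shell
# Invented sample data in du's "size path" layout.
sample='40 /tmp/a
6772 /tmp/b
444 /tmp/c
2312 /tmp/d'

# Stage 1: sort numerically, largest first.
# Stage 2: head keeps only the first two lines it receives.
top_two=$(printf '%s\n' "$sample" | sort -nr | head -n 2)
echo "$top_two"
```

On a real system the same shape would be something like du /usr | sort -nr | head -n 10, which shows only the ten largest entries.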


If you type sort by itself, it will still try to read from stdin. Since stdin is connected to your keyboard, it will wait for you to type, and it will only sort and print what you entered once you press Control-D to signal the end of the input. It won't prompt you since it's not really meant to be used interactively.


It's possible for a program to tell whether stdin is interactive or not, so some programs may act differently depending on whether you run them by themselves or at the end of a pipe.
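One way a shell script can make that check is the test [ -t 0 ], which succeeds when file descriptor 0 (stdin) is attached to a terminal. The function name stdin_kind below is a hypothetical helper made up for this sketch:

```shell
# Hypothetical helper: report what stdin is connected to.
stdin_kind() {
    if [ -t 0 ]; then
        echo "terminal"       # stdin is interactive (a keyboard/terminal)
    else
        echo "pipe or file"   # stdin comes from a pipe or a redirected file
    fi
}

# At the end of a pipe, stdin is the pipe, not the terminal.
piped=$(echo "some data" | stdin_kind)
echo "$piped"
```

Running stdin_kind by itself in a terminal window would take the other branch, since stdin would then be the terminal.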


Also, piping a program that only works interactively, like vi, will result in you having a bad time.


Pipes are different from redirection: redirection (> or <) connects a command's output or input to a file, while a pipe connects it to another command, and the data is shuffled from one command to the next without being stored anywhere. So, in the above example, du's output is not saved anywhere. Most of the time you don't need to keep it, because the reason to use pipes is to process the output of a command in some way. But there is a command, tee, that lets you have your cake and eat it too: it copies what it receives from stdin to both stdout and a file of your choosing. You can also likely do this from bash with some arcane syntax involving ampersands and brackets that I don't know about.
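A small sketch of tee in the middle of a pipe, assuming a shell where mktemp is available (it is common on Linux but not guaranteed everywhere). The numbers are made-up sample data:

```shell
# Temporary file to hold the saved copy; mktemp picks the name.
copy=$(mktemp)

# tee writes its stdin to the file AND passes it along on stdout to sort.
result=$(printf '3\n1\n2\n' | tee "$copy" | sort -n)

# Read back what tee saved, then clean up the temporary file.
saved=$(cat "$copy")
rm -f "$copy"

echo "$result"   # the data after it went through the rest of the pipe
echo "$saved"    # the untouched copy that tee stored in the file
```

The file holds the data as it looked at the point where tee sat in the pipe, before sort rearranged it.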

