Output Redirection • kitras.io

Output Redirection

Introduction

*-nix terminal programs are designed to be very simple. This affords the system a type of modularity not possible if the programs were more complex. The idea of a program is to do one job well with few options for alteration and provide consistently formatted output. These programs are the basic building blocks for any Linux user, and knowing when to use which one is the first step. Knowing how to link these building blocks together can help you accomplish more advanced tasks and create intelligent workflows. The way we link these blocks (or program functionalities) together is by output redirection, or using the output from one program as the input of another.

Devices, Streams, and Buffers (oh my)

Before we go further, it is important to nail down some terminology. In output redirection, you will hear much talk of streams, buffers, and devices. To grossly oversimplify, a device is a physical object that either presents output from a program (e.g. a screen or printer) or supplies input to that program (e.g. a keyboard, mouse, or touchscreen). Devices are often represented by special files on Linux systems. You can see them by looking inside the /dev directory on your system:

❯ ls /dev
acpi_thermal_rel  fuse       loop17        null       shm       tty22  tty42  tty62   ttyS24   vboxdrv     vcsu2
ashmem            hidraw0    loop18        nvidia0    snapshot  tty23  tty43  tty63   ttyS25   vboxdrvu    vcsu3
autofs            hidraw1    loop19        nvidiactl  snd       tty24  tty44  tty7    ttyS26   vboxnetctl  vcsu4
block             hidraw2    loop2         nvme0      stderr    tty25  tty45  tty8    ttyS27   vboxusb     vcsu5
btrfs-control     hidraw3    loop20        nvme0n1    stdin     tty26  tty46  tty9    ttyS28   vcs         vcsu6
bus               hidraw4    loop21        nvme0n1p1  stdout    tty27  tty47  ttyS0   ttyS29   vcs1        vcsu63
cec0              hidraw5    loop22        nvme0n1p2  tty       tty28  tty48  ttyS1   ttyS3    vcs2        vfio
char              hpet       loop23        nvme0n1p3  tty0      tty29  tty49  ttyS10  ttyS30   vcs3        vga_arbiter
console           hugepages  loop3         nvme0n1p4  tty1      tty3   tty5   ttyS11  ttyS31   vcs4        vhci
core              input      loop4         nvme0n1p5  tty10     tty30  tty50  ttyS12  ttyS4    vcs5        vhost-net
cpu               kmsg       loop5         nvme0n1p6  tty11     tty31  tty51  ttyS13  ttyS5    vcs6        vhost-vsock
cpu_dma_latency   kvm        loop6         nvme0n1p8  tty12     tty32  tty52  ttyS14  ttyS6    vcs63       video0
cuse              log        loop7         nvram      tty13     tty33  tty53  ttyS15  ttyS7    vcsa        video1
disk              loop0      loop8         port       tty14     tty34  tty54  ttyS16  ttyS8    vcsa1       watchdog
dma_heap          loop1      loop9         ppp        tty15     tty35  tty55  ttyS17  ttyS9    vcsa2       watchdog0
dri               loop10     loop-control  psaux      tty16     tty36  tty56  ttyS18  udmabuf  vcsa3       wmi
drm_dp_aux0       loop11     mapper        ptmx       tty17     tty37  tty57  ttyS19  uhid     vcsa4       zero
drm_dp_aux1       loop12     media0        pts        tty18     tty38  tty58  ttyS2   uinput   vcsa5
drm_dp_aux2       loop13     mei0          random     tty19     tty39  tty59  ttyS20  urandom  vcsa6
fb0               loop14     mem           rfkill     tty2      tty4   tty6   ttyS21  usb      vcsa63
fd                loop15     mqueue        rtc        tty20     tty40  tty60  ttyS22  userio   vcsu
full              loop16     net           rtc0       tty21     tty41  tty61  ttyS23  v4l      vcsu1

You’ll notice many devices have names like cpu or disk which correspond to actual physical objects inside or connected to the computer. Conversely, there are also devices like random and tty that don’t have a strong physical correlation but rather refer to specific concepts. Those devices represent virtual objects (e.g. the ttys are virtual console screens and the nvme0n1p* entries are disk partitions). Regardless of these discrepancies, the devices of most interest for this article are the stdin, stdout, and stderr buffers.

A stream is a concept used to describe the flow of data through a device. For example, when I run several commands in the terminal:

❯ cat example.txt
banana
❯ echo pumpernickle
pumpernickle

all of the output is displayed on the same terminal screen. That means that the device, regardless of which program wrote the data, still had the output at one point in its stream. The same idea applies when considering input. If I looked at the text that my keyboard typed constantly as one flowing stream, I would be able to see different character strokes regardless of which program I was interacting with.

Buffers are like streams in that they carry information to or from devices. However, instead of the data flowing freely and ephemerally to and from the device, buffers save that info in memory so that chunks of data can be sent or read at the appropriate times. When we deal with the standard buffers, we are normally accessing or transmitting data from cached sources to make input or output from programs more manageable.

The Holy Trinity of Buffers

Before we can understand how to pass output between programs we need to know how the Linux operating system classifies input and output. Anyone who has programmed in C before should be familiar with the holy trinity of buffers: standard out, standard in, and standard error.

stdout

stdout is where all the normal, non-error output of a program should go. Whether you are writing your first hello world program or formatting fancy tables which print to the console, you are most likely printing those to stdout. For those familiar with C, you’ll recall that to write something to a file you first open it, which gives you back a file descriptor: an int that identifies the open file. By default, 0-2 are reserved on *-nix systems because they are assigned to the three default I/O buffers, 1 being the file descriptor number for stdout. To see this explicit association on your Linux box, enter:

ls -la /dev/stdout
lrwxrwxrwx 1 root root 15 Jun 21 09:36 /dev/**stdout** -> /proc/self/fd/**1**

Knowing the file descriptor number will provide some interesting shorthand for some of the advanced redirection commands we’ll do later.

stdin

stdin is the buffer where data is sent to a program. You can think of your keyboard as the ultimate stdin device because it is how you ultimately enter information into the computer. Apart from your physical keyboard providing input to programs, you will see shortly that there are other ways to pass input into a program. The file descriptor for stdin is 0 as seen by executing:

ls -la /dev/stdin
lrwxrwxrwx 1 root root 15 Jun 21 09:36 /dev/**stdin** -> /proc/self/fd/**0**

stderr

stderr is a special buffer where all error messages of a program should go. It is worth mentioning that there are lots of programmers who do not go out of their way to use this buffer when they are printing errors. Like stdout, stderr shows up on the terminal screen by default, so at first glance the two look identical. You have to use different functions/parameters to write to stderr instead of stdout (e.g. in C using fprintf(stderr, ...) or perror instead of printf, or in Python including the file=sys.stderr option). But what is the utility of stderr if everything that I want to communicate to the user can go to stdout? Having distinct buffers for distinct types of output can help suppress unnecessary output in special cases. For example, if I decided to put all non-critical errors in stderr, I could do some stream manipulation (see section below) and have a cleaner output at runtime. The file descriptor for stderr is 2 as seen by:

ls -la /dev/stderr
lrwxrwxrwx 1 root root 15 Jun 21 09:36 /dev/**stderr** -> /proc/self/fd/**2**
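To make the split between the two output buffers concrete, here is a minimal sketch of a shell function that uses both. The function name report and its messages are made up for illustration; the only real machinery is >&2, which points a single echo at file descriptor 2:

```shell
#!/bin/sh
# report is a hypothetical helper, not a standard command.
report() {
    echo "normal output"                 # goes to fd 1 (stdout)
    echo "warning: disk almost full" >&2 # goes to fd 2 (stderr)
}

# $( ... ) captures only stdout, so the warning is not part of the
# captured text; it still shows up on the terminal via stderr.
captured=$(report)
echo "captured: $captured"
```

Run in a terminal, this prints the warning straight away and then the captured line, which shows that the two messages really travelled on different file descriptors.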

Redirection Operators

Up to this point we have assumed that the input for a program comes from your keyboard (e.g. you typing in the values it asks for) and that the output is printed to the terminal screen. By using redirection operators, we can plug programs into each other (i.e. feeding one’s output to another’s input).

Read/Write

The first helpful set of operators to know are < and >. Think of them as arrows that indicate where the data will flow. When we use these operators, we always do it between a command and a file. If you put something other than a file name on the file side (e.g. a command like echo), the shell still treats that word as a file name: > will write a new file with that name, and < will complain that the file to be read in doesn’t exist.

Write (>)

For example, if I wanted to create a file that contained the list of contents of my current directory I could do:

ls > folder_contents.txt

But wait a second, I didn’t have a file called folder_contents.txt before running that command! Why doesn’t this crash and tell me that there is no such file? With the output redirector, if the target file does not exist, then the shell will create one! To verify that the folder_contents.txt actually has the output of ls:

cat folder_contents.txt
bkp_sss
blog
christopolise.github.io
clgui
....

But be careful! If you run another write > operation on this file, it will overwrite the contents:

❯ echo "Ooops :P" > folder_contents.txt
❯ cat folder_contents.txt
Ooops :P # Notice that there are no file directory contents anymore :(

If you want to add more to the existing contents of the file, you will need to use the append operator (see more below).
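One consequence of the overwrite behavior is worth a small sketch: the shell empties the target file before the command even starts, so reading from and writing to the same file in one command loses the data. The file names below are hypothetical, and everything happens in a throwaway directory:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf 'pear\napple\n' > "$tmpdir/fruit.txt"

# Safe: write the result to a second file, then replace the original.
sort "$tmpdir/fruit.txt" > "$tmpdir/fruit.sorted"
mv "$tmpdir/fruit.sorted" "$tmpdir/fruit.txt"
sorted=$(cat "$tmpdir/fruit.txt")

# Unsafe: > truncates fruit.txt before sort gets a chance to read it,
# so sort sees an empty file and writes nothing back.
sort "$tmpdir/fruit.txt" > "$tmpdir/fruit.txt"
size_after=$(wc -c < "$tmpdir/fruit.txt")

echo "bytes left after the unsafe version: $size_after"
rm -r "$tmpdir"
```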

Read (<)

As you would expect, the read operator functions syntactically in the same way as write does. Reading in also requires a command + < + a file:

cat < .bashrc
#
# ~/.bashrc
#

[[ $- != *i* ]] && return

colors() {
	local fgc bgc vals seq0

	printf "Color escapes are %s\n" '\e[${value};...;${value}m'
....

Oftentimes, if you are trying to pass a file into a program that takes a file as an input by default, the < operator is not necessary:

cat .bashrc
#
# ~/.bashrc
#

[[ $- != *i* ]] && return

colors() {
	local fgc bgc vals seq0

	printf "Color escapes are %s\n" '\e[${value};...;${value}m'
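The two forms are not always interchangeable, though. As a small sketch (the file name lines.txt is made up), wc shows the difference: given a file name it opens the file itself and reports the name, while with < the shell opens the file and wc only ever sees stdin:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/lines.txt"

# File name as an argument: wc knows the name and prints it with the count.
wc -l "$tmpdir/lines.txt"

# File on stdin: wc has no name to print, so only the count comes back,
# which is handy when a script wants just the number.
count=$(wc -l < "$tmpdir/lines.txt")
echo "lines: $count"

rm -r "$tmpdir"
```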

Append (>>)

Sometimes you will want to append data to a file instead of overwriting it every time. An example of this could be a script with a for loop that needs to update a log file. Instead of the write operator, you would use the append operator >>:

❯ echo "hello" > example.txt
❯ echo "world" >> example.txt
❯ cat example.txt
hello
world
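The for loop scenario mentioned above could look roughly like this. It is only a sketch: the log file name run.log and the messages are invented for the example.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
log="$tmpdir/run.log"   # hypothetical log file

# > starts the log from scratch once...
echo "run started" > "$log"

# ...and >> adds one line per iteration without touching earlier lines.
for i in 1 2 3; do
    echo "step $i finished" >> "$log"
done

cat "$log"
line_count=$(wc -l < "$log")
rm -r "$tmpdir"
```

Swapping the >> inside the loop for > would leave only the last step in the file.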

heredocs/strings

All of this creation of files with commands can leave your directories a little messy. Creating files for each command you execute can be wasteful and a nuisance. How can we take advantage of this interprocess communication if nothing is formatted right? Especially if the output of the previous command is returned as a string instead of a file!

heredocs (<<)

A here-document, or heredoc for short, is a special way to format a string output to act as a document so that it can be compatible with commands that only accept documents. It is a powerful tool, even if the formatting is a little wonky. To perform a heredoc operation for a command we use the << operator. This will be followed by some sort of text delimiter or a special word or value that will mark that we have reached the end of the intended input. In the example below I use the word DELIMITER to be our delimiter (for redundancy’s sake):

❯ cat "I can't cat a string" # This fails because cat only accepts files
cat: "I can't cat a string": No such file or directory

❯ cat < echo "The input operator won't work either"
zsh: no such file or directory: echo

❯ cat << DELIMITER
heredoc> This should work!
heredoc> It will cat any text that I type
heredoc> Until I type the delimiter word above
heredoc> DELIMITER
This should work!
It will cat any text that I type
Until I type the delimiter word above
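Heredocs are just as useful inside scripts as they are at the prompt. The sketch below, which uses a made-up variable name and writes to a throwaway directory, shows one detail worth knowing: with a plain delimiter the shell expands variables in the body, while a quoted delimiter passes the text through untouched.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
name="world"

# Plain delimiter: $name is expanded before cat sees the text.
cat << EOF > "$tmpdir/expanded.txt"
Hello, $name
EOF

# Quoted delimiter: the body is taken literally, dollar sign and all.
cat << 'EOF' > "$tmpdir/literal.txt"
Hello, $name
EOF

expanded=$(cat "$tmpdir/expanded.txt")
literal=$(cat "$tmpdir/literal.txt")
echo "$expanded"
echo "$literal"
rm -r "$tmpdir"
```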

here strings (<<<)

You’ll notice that heredocs support multiline input like a normal file would. However this becomes inefficient and annoying if all you are trying to do is input a single word or string. Here strings are exactly like heredocs, but with an abbreviated syntax that has no delimiters:

cat <<< "Hello world"
Hello world

For more information about how these operators are used day-to-day, here is a useful article that will help you understand the full impact and utility of heredocs/here strings:

Manipulating Streams

By default the information that is returned by a program to the terminal is the information put into the stdout and stderr streams by the program. Why do I need two streams if both print by default? This is done to keep different event types separate when spitting everything out to the terminal (e.g. separating normal output from debugging statements).

Let’s take the following python script that prints a string to both stdout and stderr:

import sys

print("I'm printing on stdout!") # This string goes to stdout
print("I'm printing on stderr!", file=sys.stderr) # This string goes to stderr

When we run the example with no stream merging modifiers, we get the following output:

❯ python3 buffers.py
I'm printing on stdout!
I'm printing on stderr!

What happens if I only want to see the errors of the program instead of the normal output so I can quickly diagnose errors? This is a perfect opportunity to tell all the output to go to the garbage! In Linux, whenever you want to send something to the garbage (whether it be a file or just output from a program) you can send it to /dev/null. If we only want the info on stderr we can do the following:

❯ python3 buffers.py 1> /dev/null
I'm printing on stderr!

By putting a 1> in front of the garbage location, we forced all the data that would print on file descriptor 1, or stdout, to the garbage location. If we wanted just the normal program output and no errors (wouldn’t that be nice), we can send the info on file descriptor 2 to /dev/null instead:

❯ python3 buffers.py 2> /dev/null
I'm printing on stdout!

But wait a minute, I’ve seen the > operator before. Isn’t that just writing something to a file? What’s with the numbers? Why didn’t I have to put one there before? It turns out that when you don’t include a number, the shell assumes stdout, or 1. This means that 1> and > are the exact same thing!

❯ python3 buffers.py 1> /dev/null
I'm printing on stderr!
❯ python3 buffers.py > /dev/null
I'm printing on stderr!

Likewise, using the I/O buffer file descriptors will work on any normal write that you do, making logging only errors a very easy thing to do:

❯ python3 buffers.py 2> buffer.log
I'm printing on stdout!
❯ cat buffer.log
I'm printing on stderr!

Notice how only the stderr output went to the log file but the info on stdout still printed in the terminal. That is because we did not give 1 any special instructions. What if we wanted both streams to go to the same destination? We can combine them by sending the output to a file like we normally do and then pushing the contents of stderr into stdout like so:

❯ python3 buffers.py > buffer.log 2>&1
❯ cat buffer.log 
I'm printing on stderr!
I'm printing on stdout!

2>&1 writes the contents of stderr into the location of stdout (In this case, think of the & symbol like a pointer reference in C. If you hate C, then think of it as a goofy looking A that stands for “at” i.e. “at stdout”). Order matters here: 2>&1 has to come after > buffer.log so that stdout already points at the file when stderr is attached to it. By pushing stderr into stdout we have combined them into one write buffer that will end up in our buffer.log file, providing an example of possibly the only time it is ever okay to cross streams.
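Because the shell processes redirections from left to right, where 2>&1 sits on the command line changes the result. The following sketch uses a hypothetical function emit as a stand-in for any program that writes to both streams:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)

# emit is a made-up stand-in for a program that uses both streams.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# File first, then 2>&1: stderr follows stdout into the file.
emit > "$tmpdir/both.log" 2>&1

# 2>&1 first: stderr is attached to the current stdout (captured here),
# and only afterwards is stdout moved to the file.
leaked=$(emit 2>&1 > "$tmpdir/only_stdout.log")

both_lines=$(wc -l < "$tmpdir/both.log")
only_stdout=$(cat "$tmpdir/only_stdout.log")
echo "both.log holds $both_lines lines; escaped the file: $leaked"
rm -r "$tmpdir"
```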

Interprogram Operations

The last, but probably most important redirection that we’ll talk about in this article is passing information from one program directly to another.

Pipes (|)

A pipe is an operator which takes the stdout of one program and puts it into the stdin of another program. This type of flow is very handy because instead of needing interim files which store output to be used as input for another program, the input/output lines are connected directly:

❯ # Way 1:
❯ ls > dir_contents.txt # Store contents of current directory in file
❯ cat dir_contents.txt
3DPrinting
Android
AndroidStudioProjects
ans.txt
Arduino
....
❯ rm dir_contents.txt # Remove to keep the folder clean
❯
❯ # Way 2 (with pipe):
❯ ls | cat
3DPrinting
Android
AndroidStudioProjects
ans.txt
Arduino
....

NOTE: Not all commands that take in some sort of text as an input will work with pipes. An example of this would be echo which prints its arguments and not input from stdin:

❯ ls | echo # Returns nothing because echo has no arguments added to it
❯ ls | echo banana # Just prints banana because that was its only argument, ignores stdout from ls
banana
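When a command only looks at its arguments, one common workaround is xargs, a standard utility that reads stdin and passes what it finds to another command as arguments. xargs is not covered elsewhere in this article, so treat this as a side sketch rather than part of the main toolkit:

```shell
#!/bin/sh
# echo never reads stdin, so the piped text is simply dropped.
ignored=$(printf 'apple\nbanana\n' | echo "fixed text")

# xargs reads the piped words and runs: echo apple banana
forwarded=$(printf 'apple\nbanana\n' | xargs echo)

echo "$ignored"
echo "$forwarded"
```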

mkfifo

mkfifo is an interesting addition to the interprocess tool chain. Much like a pipe |, a file created with mkfifo acts as the tangible connection between the output of one process and the input of another. However, it differs from a pipe in the sense that (1) it is named and (2) it can pipe information across terminal sessions.

To make a mkfifo file (a named pipe, or FIFO, as they’re officially known), you need to do the following:

mkfifo my_pipe_file

You can verify that this is a pipe instead of a normal file by checking it with:

ls -l my_pipe_file
prw-r--r-- 1 christopolise christopolise 0 Jun 23 10:43 my_pipe_file

The p at the beginning of the permissions block indicates that this file is a pipe.

To use your brand new named pipe across terminal sessions, you can write to it in one window:

ls > my_pipe_file

Notice that after hitting enter the command doesn’t finish its execution and seems to be hanging. This is because a write to a named pipe waits until something opens the other end to read it. In another terminal window/tab you can retrieve the info from the pipe with a command that will read from it:

cat < my_pipe_file
3DPrinting
Android
AndroidStudioProjects
ans.txt
Arduino
....

Once the information in the pipe is read into the latter program, the program will finish in both terminal windows. This functionality also works in the reverse direction (i.e. reading from the pipe in the first window and writing in the second).
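The same exchange can be sketched in a single script by putting the writer in the background, which plays the role of the first terminal window. The pipe lives in a throwaway directory and the messages are invented for the example:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
fifo="$tmpdir/my_pipe_file"
mkfifo "$fifo"

# The writer waits until a reader opens the pipe, so it runs in the
# background instead of a second terminal.
printf 'first\nsecond\n' > "$fifo" &

# Opening the pipe for reading releases the writer and receives its data.
received=$(cat < "$fifo")
wait   # let the background writer finish before cleaning up

echo "$received"
rm -r "$tmpdir"
```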

Conclusion

A solid knowledge of output redirection is the difference between a novice Linux user and a more competent, advanced user. With this you can now make more complex pieces instead of relying on the utility of just a single program.

As a quick example, using ls allows you to see the contents of your directory, and grep searches for patterns in a file. Combining those two provides you the ability to find files that match a specific pattern in a certain directory:

ls /dev | grep stdin
stdin
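Pipes can be chained as many times as you like. As a slightly longer sketch of the same idea (the file names are hypothetical and created in a throwaway directory), three small programs cooperate to list only the text files in a folder, in sorted order:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
touch "$tmpdir/report.txt" "$tmpdir/notes.txt" "$tmpdir/photo.png"

# ls lists the names, grep keeps those ending in .txt, sort orders them.
matches=$(ls "$tmpdir" | grep '\.txt$' | sort)

echo "$matches"
rm -r "$tmpdir"
```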

With the variety of different programs that can be mixed and matched on a Linux machine, the possibilities of what you can do become virtually limitless.