Scripts

Introduction

The first two points of the Unix philosophy, as documented by Doug McIlroy:

Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new “features”.
Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don’t insist on interactive input.

Downside of the Unix Philosophy

After you have used the terminal for a while, you will start to see that certain tasks are pretty repetitive. You’ve probably noticed, as is evident in the terminal article, that a myriad of programs live on your Linux box. However, you’ve also probably noticed that all of these programs tend to be pretty simple (e.g. cd is pretty good at changing directories, but not much else). This follows the Unix philosophy of having many programs that each do one thing well so they can link together like building blocks to do great things. Unfortunately, all of this modularity leads to a lot of typing.

Scripting to the Rescue

Scripts are programs written in the shell’s language to automate tasks. Think of it like a recipe; apart from your list of ingredients, there is also the procedure, or step by step instructions to make the meal correctly. When writing a script, you are writing the procedure the shell will execute, step by step until the list is completed.

The following is a simple script that prints out Hello world!, then the date, then your user name. You’ll notice that the main components of a script tend to be programs run in sequence.

#!/bin/bash
# Example script (the shebang must be the very first line of the file)

echo "Hello world!"; # Step one
date; # Step two
whoami; # Step three

Basics

Everything in Linux is a file, including shell scripts. By convention, these files end in .sh, but can also end in .zsh or .bash, etc. for specific shells. In this tutorial, we will focus on bash scripting. While sh, bash, zsh, and dash are similar to each other, there are idiosyncrasies that prevent them from being interchangeable. Those differences won’t be addressed in this tutorial.

Skeleton

When running a script, the system needs to know which interpreter should read it. This is done on the first line of the script by putting a shebang (#!) followed by the path to the interpreter. For bash scripts you would put:

#!/bin/bash

Hint: This works for any scripting language, including Python!

#!/usr/bin/env python3

Comments

Scripts can get hairy fairly easily, especially if you are developing one over the course of days or weeks; a certain line that was clear to you before can become nothing short of an alien language. This is where comments come into play. By peppering your script with comments, you will save yourself countless hours of trying to relearn what you were doing before. Comments are pretty straightforward in bash. There are three types: inline, full-line, and block comments.

Inline

Good for a quick follow up clarifier, inline comments are put after a statement:

echo "This will be printed!"; # This comment will not be printed

Full-line

Full-line comments are good for when you want a nice header explanation for a block of code that follows or even as a popping visual separation:

# The following code will print out the contents of the directory and filter for a keyword
ls -al | grep 'keyword';

####################################################################

# This is the second section of the code
echo "This text won't print" > /dev/null;

Block

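Sometimes specific blocks of information are necessary for someone who will view your script contents at a later time. Block comments are useful for function descriptions or even author information. Bash has no dedicated block comment syntax, so the usual workaround is to feed a quoted here-document to the do-nothing command (:):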

: << 'COMMENT'
Author: Christopher Kitras
Date modified: Apr 27, 2022
email: kchristm@byu.edu
COMMENT

echo "This text prints! But the multiline comment won't"

Variable Names

Assigning a value to a variable is as simple as writing the variable name, the equals sign, and the value. This works for any kind of value you wish (e.g. integers, strings, etc.), although bash stores every value as a string and its built-in arithmetic only handles integers. NOTE: There are NO spaces between the variable name, the equals sign, and the value. This is because if the variable name is followed by a space, the shell will try to run it as though it were a program, with the equals sign and the value as its arguments.

foo="bar"
bar=2
baz=-1.75

Whenever we want to reference one of these variables, we will not use just the variable name, but rather $ and the variable name. This $ helps the script distinguish between a program to be executed and the variable:

variable="hello world!"
echo variable # Will output the word "variable"
echo $variable # Will output hello world!

One caveat to note is that you CANNOT assign the output of a program by having the variable name, the equals sign, and then the command. Instead, you must execute the command in-line and assign that value to the variable. This can be done in a few ways:

foo=$(echo "bar") # foo will equal bar
bar=`cat text.txt` # bar will equal the contents of text.txt
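
As a small sketch of the rules above (the variable names here are made up for illustration), the following script combines plain assignment, referencing with $, and command substitution. Wrapping a variable in double quotes when you use it is a common habit that keeps values containing spaces in one piece.

```shell
#!/bin/bash

greeting="hello world!"                       # no spaces around the equals sign
word_count=$(echo "$greeting" | wc -w)        # capture the output of a pipeline
upper=$(echo "$greeting" | tr 'a-z' 'A-Z')    # same idea, different command

echo "greeting:   $greeting"
echo "word count: $word_count"
echo "upper case: $upper"
```

The $( ) form and the backtick form behave the same in simple cases like these; $( ) is usually easier to read when commands are nested.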

Conditional Logic

Following a procedure in a recipe tends to be a linear experience. There is normally one path (steps 1 - n) that will allow you to accomplish a singular outcome. However, scripts and tasks for a computer rarely tend to be that direct. Most times there are forks in the road and the path we take will be determined by the answer to a single question or condition. Take a look at the following script, which checks whether there are 100 or more files in the current folder:

#!/bin/bash

# Checks to see the amount of files and folders in a directory and comments accordingly

if [ "$(ls | wc -l)" -ge 100 ]; then # Note the spaces just inside the brackets
	echo "Woah there bucko, time to clean up"; # Runs when there are 100+ files/folders
else
	echo "I guess you're okay"; # Runs if there are fewer than 100 files/folders
fi

To let the shell know we are starting a conditional statement, we start with the if keyword. This is followed by a condition (covered in more detail in a later section) contained in square brackets [ ]. In our example, the condition runs a command inline that returns the number of files and folders in the directory where the script is being run. Note that the spaces just inside the brackets are required: [ is itself a command, so it must be separated from its arguments. Whenever the condition in the brackets is true, the part following the then keyword is executed; otherwise, the part following else is run. To close the scope of the conditional, the fi keyword (which is just if backwards) must be used, as opposed to a set of braces or specific indentation as found in other languages.

If you want to have a statement that checks multiple conditions sequentially, you will have to use the elif keyword (short for else if). If the first condition is true, the statement following its then keyword will be executed. If not, the script will check each of the remaining conditions in order until one is true or the else keyword is reached:

#!/bin/bash

# Checks to see the amount of files and folders in a directory and comments accordingly

if [ "$(ls | wc -l)" -le 10 ]; then
	echo "You have less than or equal to 10 files/folders"; # Runs when there are 0-10 files/folders
elif [ "$(ls | wc -l)" -le 20 ]; then
	echo "You have less than or equal to 20 files/folders"; # Runs when there are 11-20 files/folders
elif [ "$(ls | wc -l)" -le 30 ]; then
	echo "You have less than or equal to 30 files/folders"; # Runs when there are 21-30 files/folders
elif [ "$(ls | wc -l)" -le 40 ]; then
	echo "You have less than or equal to 40 files/folders"; # Runs when there are 31-40 files/folders
else
	echo "You have more than 40 files/folders"; # Runs when there are 41+ files/folders
fi

For the more programmatically savvy of you who wish to branch out (pun intended), syntax also exists for a switch statement using the case and esac keywords. More information about that can be found here.
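
To give a feel for what case and esac look like, here is a minimal sketch. The describe_extension function and the file names are hypothetical and only serve the illustration; each pattern ends with a closing parenthesis, each branch ends with ;;, and * acts as the catch-all default.

```shell
#!/bin/bash

# describe_extension is a hypothetical helper, not part of the tutorial's scripts.
describe_extension() {
	case "$1" in
		*.sh|*.bash)          # the | lets one branch match several patterns
			echo "shell script"
			;;
		*.py)
			echo "python script"
			;;
		*)                    # default branch, taken when nothing else matches
			echo "unknown"
			;;
	esac
}

describe_extension "backup.sh"    # prints: shell script
describe_extension "plot.py"      # prints: python script
describe_extension "notes.txt"    # prints: unknown
```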

Repeating Logic

Inevitably, you will want a script to repeat the same action several times within the same process. Whether repeating the same command, looping through numeric values, or iterating through the output of a program, there are several ways that looping can maximize the efficiency of your script.

For loops

When you want to loop a set number of times, you will want to use a for loop. In bash you can iterate (or pass through) either a range of numbers or the output of a program. Say we want to print the numbers from one to ten in the console. Instead of using echo 10 times, we could simply do the following:

for i in {1..10}
do
	echo $i
done

The for keyword is used to indicate the type of loop, the i is our iterator (it can be any variable name), and the in word lets us know that we are counting through the values that follow in the braces {}. True to bash form, you’ll notice that the scope of a block is marked with words instead of special whitespace or punctuation. In a for loop, the do and done keywords are what encapsulate the code to be executed in one iteration.

But let’s say we don’t need to iterate through a range of numbers, but rather we want to repeat a command for every file/folder in a directory. For loops can take care of that too! Let’s print out the name of every file in the directory using a loop (yes this can just be done with the ls command, but I’m trying to make a point here 😉):

for file in `echo *`
do
	echo $file
done

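echo * prints all the files and folders in a directory on one line, while our loop prints each of those names on its own line by stepping through the values one at a time. Be aware that the output of a command is split on whitespace, so a name containing a space would be broken into separate pieces; writing for file in * directly avoids that problem.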

for loops are a powerful tool to help minimize the amount of repetitive code you write. From iterating through a range of numbers, to specific lists, or even the output of a command, large repetitive tasks can be condensed to a few lines of code. For a deeper dive into the different types of for loops, this article goes into fairly decent detail.
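
The sketch below shows two further variations, under the assumption that you are running bash itself: looping directly over a glob pattern (which copes with spaces in file names) and a C-style counting loop. The temporary directory and file names are invented for the example so that it does not depend on what is in your own folder.

```shell
#!/bin/bash

# Work in a throwaway directory so the example does not touch your files.
workdir=$(mktemp -d)
touch "$workdir/notes.txt" "$workdir/my report.txt"

count=0
for file in "$workdir"/*; do              # the glob expands to one item per file
	echo "Found: $(basename "$file")"
	count=$((count + 1))
done
echo "Total: $count"

# C-style loop: start at 1, keep going while i <= 3, add 1 after each pass
for ((i = 1; i <= 3; i++)); do
	echo "Pass $i"
done

rm -r "$workdir"                          # clean up the throwaway directory
```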

While loops

But what if the number of times we want to repeat something isn’t certain? Much like the conditional statements above, a while loop will look at a certain condition and act upon it. As long as the condition evaluates to true, the code between the scope keywords will keep executing:

while [ "$(date +%H)" -ne 22 ]
do
	echo "It's not 10pm!"
done

As mentioned earlier, conditions are placed within square brackets, and this condition checks whether the hour is something other than 22 (as in 22:00). While the time is not between 10pm and 11pm, this script will print out It's not 10pm! over and over. Pretty annoying, no? Similar to the for loop, the scope encapsulating keywords for a while loop are also do and done.

NOTE: For the more programmatically savvy among you, it is also worth noting that basic loop keywords such as break and continue and their subsequent logic are still valid in bash.
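
As a rough illustration of break and continue, the following sketch adds up the even numbers from 1 to 10. It assumes nothing beyond what has been covered so far, apart from true (a command that always succeeds, which makes the loop run until break is reached) and $(( )), which evaluates integer arithmetic.

```shell
#!/bin/bash

n=0
total=0
while true; do
	n=$((n + 1))
	if [ "$n" -gt 10 ]; then
		break                 # leave the loop entirely
	fi
	if [ $((n % 2)) -eq 1 ]; then
		continue              # odd number: skip straight to the next pass
	fi
	total=$((total + n))
done

echo "Sum of even numbers from 1 to 10: $total"   # 2 + 4 + 6 + 8 + 10 = 30
```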

Comparison Operators

As you’ve probably noticed in the last few sections, conditional statements are crucial for any logic that makes a script smart. Whether we are checking to see if the output of a program contains a desired string or making sure that a program exits with the right code, knowing the right comparison operator for the right occasion makes all the difference. Admittedly, the syntax for these in bash can get a little confusing, which is why it is important to know whether we are comparing string values or number values.

String comparison

A string is any value assigned to a variable which contains a list of (or string of) characters (letters, symbols, and/or numbers) lumped together as a single object encapsulated in quotes (e.g. "hello" or "b@nAn!123"). The following table shows the available string comparison operators:

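Operator	Description	Example
= or ==	Is Equal To	if [ "$foo" == "$bar" ]
!=	Is Not Equal To	if [ "$foo" != "$bar" ]
\>	Is Greater Than (ASCII comparison)	if [ "$foo" \> "$bar" ]
! ... \<	Is Greater Than Or Equal To	if [ ! "$foo" \< "$bar" ]
\<	Is Less Than (ASCII comparison)	if [ "$foo" \< "$bar" ]
! ... \>	Is Less Than Or Equal To	if [ ! "$foo" \> "$bar" ]
-n	Is Not Null	if [ -n "$foo" ]
-z	Is Null (Zero Length String)	if [ -z "$foo" ]

NOTE: Inside single brackets, > and < must be escaped with a backslash, otherwise the shell treats them as output and input redirection. Bash has no >= or <= operators for strings, so those comparisons are written by negating the opposite test. Use plain straight quotes (") in your scripts, not the curly quotes a word processor may insert.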
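
To see a few of these operators in action, here is a minimal sketch. The compare_strings function is a hypothetical helper written for this illustration only. It escapes < with a backslash because, inside single brackets, an unescaped < would be read as input redirection rather than a comparison.

```shell
#!/bin/bash

# compare_strings is a hypothetical helper used only for this illustration.
compare_strings() {
	if [ -z "$1" ] || [ -z "$2" ]; then
		echo "empty"                      # at least one string has zero length
	elif [ "$1" = "$2" ]; then
		echo "equal"
	elif [ "$1" \< "$2" ]; then
		echo "before"                     # first string sorts before the second
	else
		echo "after"
	fi
}

compare_strings "apple" "apple"    # prints: equal
compare_strings "apple" "banana"   # prints: before
compare_strings "pear" "banana"    # prints: after
compare_strings "" "banana"        # prints: empty
```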

Number comparison

For the math heads out there, you can also compare numbers to one another. Note that these operators only work on integers; bash has no built-in floating point comparison. This tends to be useful when you are comparing the outputs of programs or when you are checking the exit status of a program and want to act accordingly. The following table shows the operators that can be used with numbers:

Operator	Description	Example
-eq	Is Equal To	if [ $foo -eq 200 ]
-ne	Is Not Equal To	if [ $foo -ne 1 ]
-gt	Is Greater Than	if [ $foo -gt 15 ]
-ge	Is Greater Than Or Equal To	if [ $foo -ge 10 ]
-lt	Is Less Than	if [ $foo -lt 5 ]
-le	Is Less Than Or Equal To	if [ $foo -le 0 ]
==	Is Equal To	if (( $foo == $bar )) NOTE: Used within double parentheses
!=	Is Not Equal To	if (( $foo != $bar ))
<	Is Less Than	if (( $foo < $bar ))
<=	Is Less Than Or Equal To	if (( $foo <= $bar ))
>	Is Greater Than	if (( $foo > $bar ))
>=	Is Greater Than Or Equal To	if (( $foo >= $bar ))

Functions

Now that you understand how to carry out basic logical actions in a bash script, you might find yourself greedy with power. As you frantically write down conditional after conditional and loop after loop, you realize that your scripting has become monotonous again. Surveying your prized code, you realize despite all the tricks you’ve learned thus far, your script still looks repetitive. Functions to the rescue!

Assuming that you are familiar with functions from programming in general, bash functions are interesting in the sense of their syntax (just like with everything else in this tutorial). There are two ways to declare a function in bash:

function banana {
	echo "banana";
}

### OR ####

apple() {
	echo "apple";
}

banana # prints out 'banana'
apple # prints out 'apple'

Both are equally valid and neither has an advantage over the other. It is simply a matter of preference.

If you are looking for something a little more intelligent out of your functions (i.e. accepting parameters to calculate), I’m afraid the novel bash syntax strikes again. Although the apple format has parentheses () following the function name, they are purely ornamental. Nothing can be put in them. Does this mean that bash functions don’t support parameters? Of course not! It’s just a little more obscure than that. Say I have a function that adds two numbers together:

add() {
	echo $(( $1 + $2 )) # $(( )) evaluates integer arithmetic
}

$1 and $2 are the first and second parameters that get passed into that function, respectively. So when I run

add 1 2 # That's right, no parentheses, just arguments like any other unix program

I should get an output of 3. The convention of $ and then a number is specific to bash arguments, which means if they are used in the scope of a function, they are the arguments passed into the function. If they are used globally, it means they are the arguments passed into the script!
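
Building on add, the sketch below shows one possible way to accept any number of arguments. The add_all function is a hypothetical extension, not something from the original tutorial. It leans on $# (the number of arguments) and "$@" (all of the arguments), and it also uses local, which keeps a variable private to the function, and return, which sets the function's exit status.

```shell
#!/bin/bash

# add_all is a hypothetical extension of the add function above.
add_all() {
	if [ "$#" -eq 0 ]; then
		echo "usage: add_all NUMBER..." >&2   # send the message to standard error
		return 1                              # non-zero exit status signals failure
	fi
	local sum=0                               # visible only inside this function
	for n in "$@"; do
		sum=$((sum + n))
	done
	echo "$sum"
}

result=$(add_all 1 2 3 4)                     # capture what the function prints
echo "1 + 2 + 3 + 4 = $result"                # prints: 1 + 2 + 3 + 4 = 10
```

Because a function "returns" data by printing it, command substitution is the usual way to store its result in a variable.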

Misc. Tips and Tricks

As you may have noticed, the topics covered in this scripting tutorial are the basics of the bash programming language. For the student who is looking for more information beyond the scope of the basics, a good direction would be to Google how to accomplish a given fundamental programming concept in bash specifically. If it exists, it can be thrown in a script. Below is a list of tutorials that are helpful in common situations.

Output Redirection

A lot of what you need to know about output redirection can be found in the following article. Using redirection can be a wonderful means to suppress output from programs in a script or saving files from the script itself.

Input Output Redirection in Linux/Unix Examples
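
For a quick taste before reading the article, here is a minimal sketch of the most common redirection operators. The temporary file and the /nonexistent-path name are invented for the example; the latter is simply assumed not to exist so that ls produces an error message.

```shell
#!/bin/bash

log=$(mktemp)                        # a throwaway file to write into

echo "first line" > "$log"           # > creates or overwrites the file
echo "second line" >> "$log"         # >> appends to the end of the file

# 2> redirects error messages; sending them to /dev/null discards them
ls /nonexistent-path 2> /dev/null || echo "ls failed, but quietly"

line_count=$(wc -l < "$log")         # < feeds the file to the command's input
echo "The log has $line_count lines"

rm "$log"                            # clean up
```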

Script Arguments

Much like the arguments that go inside of a function, the arguments passed to a script are referenced with $ and then a number. When one of these variables is found in the general scope (i.e. not inside a function), it refers to a script argument. The following article goes more in depth on script arguments and the different types that exist.

Bash Beginner Series #3: Passing Arguments to Bash Scripts
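
As a minimal sketch, imagine saving the following as greet.sh (a hypothetical file name) and running it as ./greet.sh Ada Lovelace. The set -- line is only there so the example still does something when run with no arguments; it replaces the script's argument list with the values that follow it.

```shell
#!/bin/bash
# Hypothetical usage: ./greet.sh Ada Lovelace

# For demonstration only: fall back to sample arguments if none were given.
if [ "$#" -eq 0 ]; then
	set -- "Ada" "Lovelace"
fi

first_name=$1                        # first argument
last_name=$2                         # second argument

echo "Script name: $0"
echo "Number of arguments: $#"
echo "Hello, $first_name $last_name!"
```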

Special Variables

By now you have noticed that bash programming is full of its own idiosyncratic components. As you get deeper into scripting, some of these special shorthand variables will help you give the robustness that your script needs:

Special Variable	Description
$0	The name of the bash script.
$1 , $2… $n	The bash script arguments.
$$	The process id of the current shell.
$#	The total number of arguments passed to the script.
$@	The value of all the arguments passed to the script.
$?	The exit status of the last executed command.
$!	The process id of the most recently started background command.
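
The following sketch exercises a few of these variables. It relies on true and false (commands that always succeed and always fail), on & to start a command in the background, and on wait to pause until that background command finishes. The variable names are made up for the example.

```shell
#!/bin/bash

status_after_false=0
false || status_after_false=$?       # false always fails, so $? is 1 here
true
status_after_true=$?                 # true always succeeds, so $? is 0

sleep 1 &                            # start a background job
background_pid=$!                    # process id of that background job
wait "$background_pid"               # pause until the job has finished

echo "Exit status of false: $status_after_false"
echo "Exit status of true:  $status_after_true"
echo "This shell's pid is $$; the background job's pid was $background_pid"
```

Reading $? immediately matters: every command that runs afterwards, even an echo, overwrites it with its own exit status.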

Conclusion

As a researcher, you will find that writing scripts can be one of the most valuable tools on your belt. Automating passive tasks will help you save the time you need for problem solving. Keep in mind that learning to write scripts is almost the same as learning to program in a new language, so it will likely require the same amount of effort. There is so much more to writing scripts than what could be covered in this tutorial, but with these basics and some practice, you will be an expert sysadmin in no time!

Helpful Resources

Online Bash Shell - online editor

Linux - Bash - Comparison Operators