Bash Guide for Beginners

Machtelt Garrels

Version 1.11, last updated 2008-12-27


Table of Contents
Introduction
1. Why this guide?
2. Who should read this book?
3. New versions, translations and availability
4. Revision History
5. Contributions
6. Feedback
7. Copyright information
8. What do you need?
9. Conventions used in this document
10. Organization of this document
1. Bash and Bash scripts
1.1. Common shell programs
1.2. Advantages of the Bourne Again SHell
1.3. Executing commands
1.4. Building blocks
1.5. Developing good scripts
1.6. Summary
1.7. Exercises
2. Writing and debugging scripts
2.1. Creating and running a script
2.2. Script basics
2.3. Debugging Bash scripts
2.4. Summary
2.5. Exercises
3. The Bash environment
3.1. Shell initialization files
3.2. Variables
3.3. Quoting characters
3.4. Shell expansion
3.5. Aliases
3.6. More Bash options
3.7. Summary
3.8. Exercises
4. Regular expressions
4.1. Regular expressions
4.2. Examples using grep
4.3. Pattern matching using Bash features
4.4. Summary
4.5. Exercises
5. The GNU sed stream editor
5.1. Introduction
5.2. Interactive editing
5.3. Non-interactive editing
5.4. Summary
5.5. Exercises
6. The GNU awk programming language
6.1. Getting started with gawk
6.2. The print program
6.3. Gawk variables
6.4. Summary
6.5. Exercises
7. Conditional statements
7.1. Introduction to if
7.2. More advanced if usage
7.3. Using case statements
7.4. Summary
7.5. Exercises
8. Writing interactive scripts
8.1. Displaying user messages
8.2. Catching user input
8.3. Summary
8.4. Exercises
9. Repetitive tasks
9.1. The for loop
9.2. The while loop
9.3. The until loop
9.4. I/O redirection and loops
9.5. Break and continue
9.6. Making menus with the select built-in
9.7. The shift built-in
9.8. Summary
9.9. Exercises
10. More on variables
10.1. Types of variables
10.2. Array variables
10.3. Operations on variables
10.4. Summary
10.5. Exercises
11. Functions
11.1. Introduction
11.2. Examples of functions in scripts
11.3. Summary
11.4. Exercises
12. Catching signals
12.1. Signals
12.2. Traps
12.3. Summary
12.4. Exercises
A. Shell Features
A.1. Common features
A.2. Differing features
Glossary
Index
List of Tables
1. Typographic and usage conventions
1-1. Overview of programming terms
2-1. Overview of set debugging options
3-1. Reserved Bourne shell variables
3-2. Reserved Bash variables
3-3. Special bash variables
3-4. Arithmetic operators
4-1. Regular expression operators
5-1. Sed editing commands
5-2. Sed options
6-1. Formatting characters for gawk
7-1. Primary expressions
7-2. Combining expressions
8-1. Escape sequences used by the echo command
8-2. Options to the read built-in
10-1. Options to the declare built-in
12-1. Control signals in Bash
12-2. Common kill signals
A-1. Common Shell Features
A-2. Differing Shell Features
List of Figures
1. Bash Guide for Beginners front cover
2-1. script1.sh
3-1. Different prompts for different users
6-1. Fields in awk
7-1. Testing of a command line argument with if
7-2. Example using Boolean operators

Introduction

1. Why this guide?

The primary reason for writing this document is that many readers find the existing HOWTO too short and incomplete, while the Bash Scripting guide is too much of a reference work. There is nothing in between these two extremes. I also wrote this guide on the general principle that not enough free basic courses are available, though they should be.

This is a practical guide which, while not always being too serious, tries to give real-life instead of theoretical examples. I partly wrote it because I don't get excited about stripped-down and oversimplified examples written by people who know what they are talking about, showing some really cool Bash feature so far out of its context that you cannot ever use it in practical circumstances. You can read that sort of stuff after finishing this book, which contains exercises and examples that will help you survive in the real world.

From my experience as a UNIX/Linux user, system administrator and trainer, I know that people can have years of daily interaction with their systems without having the slightest knowledge of task automation. Thus they often think that UNIX is not user-friendly, and even worse, they get the impression that it is slow and old-fashioned. This is another problem that this guide can remedy.


2. Who should read this book?

Everybody working on a UNIX or UNIX-like system who wants to make life easier on themselves, power users and sysadmins alike, can benefit from reading this book. Readers who already have a grasp of working the system using the command line will learn the ins and outs of shell scripting that ease execution of daily tasks. System administration relies a great deal on shell scripting; common tasks are often automated using simple scripts. This document is full of examples that will encourage you to write your own and that will inspire you to improve on existing scripts.

Prerequisites/not in this course:

  • Being an experienced UNIX or Linux user, familiar with basic commands, man pages and documentation

  • Being able to use a text editor

  • Understanding system boot and shutdown processes, init and initscripts

  • Creating users and groups, setting passwords

  • Handling permissions and special modes

  • Understanding naming conventions for devices, partitioning, mounting and unmounting file systems

  • Adding and removing software on your system

See Introduction to Linux (or your local TLDP mirror) if you haven't mastered one or more of these topics. Additional information can be found in your system documentation (man and info pages), or at the Linux Documentation Project.


3. New versions, translations and availability

The most recent edition can be found at http://tille.garrels.be/training/bash/. You should find the same version at http://tldp.org/LDP/Bash-Beginners-Guide/html/index.html.

This guide is available in print from Fultus.com.

Figure 1. Bash Guide for Beginners front cover

Translations: a French translation is in the making and will be linked to as soon as it is finished.


4. Revision History

Revision 1.11       2008-12-27    Revised by: MG
Processed input from readers.
Revision 1.10       2008-06-06    Revised by: MG
Address change.
Revision 1.9        2006-10-10    Revised by: MG
Incorporated reader remarks, added index using DocBook tags.
Revision 1.8        2006-03-15    Revised by: MG
Clarified an example in Chapter 4, corrected the here document in Chapter 9, general checks and correction of typos, added links to the Chinese and Ukrainian translations, added a note and things to know about awk in Chapter 6.
Revision 1.7        2005-09-05    Revised by: MG
Corrected typos in Chapters 3, 6 and 7, incorporated user remarks, added a note in Chapter 7.
Revision 1.6        2005-03-01    Revised by: MG
Minor debugging, added more keywords, info about new Bash 3.0, took out blank image.
Revision 1.0        2004-04-27    Revised by: TM
Initial release for LDP; more exercises, more markup, fewer errors and abuse; added glossary.
Revision 1.0-beta   2003-04-20    Revised by: MG
Pre-release.


5. Contributions

Thanks to all the friends who helped (or tried to) and to my husband; your encouraging words made this work possible. Thanks to all the people who submitted bug reports, examples and remarks - among many, many others:

  • Hans Bol, one of the groupies

  • Mike Sim, remarks on style

  • Dan Richter, for array examples

  • Gerg Ferguson, for ideas on the title

  • Mendel Leo Cooper, for making room

  • #linux.be, for keeping my feet on the ground

  • Frank Wang, for his detailed remarks on all the things I did wrong ;-)

Special thanks to Tabatha Marshall, who volunteered to do a complete review and spell and grammar check. We make a great team: she works when I sleep. And vice versa ;-)


6. Feedback

Missing information, missing links, missing characters, remarks? Mail it to the maintainer of this document.


7. Copyright information


* Copyright (c) 2002-2007, Machtelt Garrels
* All rights reserved.
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
*     * Redistributions of source code must retain the above copyright
*       notice, this list of conditions and the following disclaimer.
*     * Redistributions in binary form must reproduce the above copyright
*       notice, this list of conditions and the following disclaimer in the
*       documentation and/or other materials provided with the distribution.
*     * Neither the name of the author, Machtelt Garrels, nor the
*       names of its contributors may be used to endorse or promote products
*       derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL THE AUTHOR AND CONTRIBUTORS BE LIABLE FOR ANY
* DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
* ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

The author and publisher have made every effort in the preparation of this book to ensure the accuracy of the information. However, the information contained in this book is offered without warranty, either express or implied. Neither the author nor the publisher nor any dealer or distributor will be held liable for any damages caused or alleged to be caused either directly or indirectly by this book.

The logos, trademarks and symbols used in this book are the properties of their respective owners.


8. What do you need?

bash, available from http://www.gnu.org/directory/GNU/. The Bash shell is available on nearly every Linux system, and can these days be found on a wide variety of UNIX systems.

Bash compiles easily if you need to build your own copy; it has been tested on a wide variety of UNIX, Linux, MS Windows and other systems.


9. Conventions used in this document

The following typographic and usage conventions occur in this text:

Table 1. Typographic and usage conventions

Text type                       Meaning
"Quoted text"                   Quotes from people, quoted computer output.
terminal view                   Literal computer input and output captured from the terminal, usually rendered with a light grey background.
command                         Name of a command that can be entered on the command line.
VARIABLE                        Name of a variable or pointer to content of a variable, as in $VARNAME.
option                          Option to a command, as in "the -a option to the ls command".
argument                        Argument to a command, as in "read man ls".
command options arguments       Command synopsis or general usage, on a separated line.
filename                        Name of a file or directory, for example "Change to the /usr/bin directory."
Key                             Keys to hit on the keyboard, such as "type Q to quit".
Button                          Graphical button to click, like the OK button.
Menu->Choice                    Choice to select from a graphical menu, for instance: "Select Help->About Mozilla in your browser."
Terminology                     Important term or concept: "The Linux kernel is the heart of the system."
\                               The backslash in a terminal view or command synopsis indicates an unfinished line. In other words, if you see a long command that is cut into multiple lines, \ means "Don't press Enter yet!"
See Chapter 1                   Link to related subject within this guide.
The author                      Clickable link to an external web resource.

10. Organization of this document

This guide discusses concepts useful in the daily life of the serious Bash user. While a basic knowledge of the usage of the shell is required, we start with a discussion of the basic shell components and practices in the first three chapters.

Chapters four to six are discussions of basic tools that are commonly used in shell scripts.

Chapters seven to twelve discuss the most common constructs in shell scripts.

All chapters come with exercises that will test your preparedness for the next chapter.

  • Chapter 1: Bash basics: why Bash is so good, building blocks, first guidelines on developing good scripts.

  • Chapter 2: Script basics: writing and debugging.

  • Chapter 3: The Bash Environment: initialization files, variables, quoting characters, shell expansion order, aliases, options.

  • Chapter 4: Regular expressions: an introduction.

  • Chapter 5: Sed: an introduction to the sed line editor.

  • Chapter 6: Awk: introduction to the awk programming language.

  • Chapter 7: Conditional statements: constructs used in Bash to test conditions.

  • Chapter 8: Interactive scripts: making scripts user-friendly, catching user input.

  • Chapter 9: Executing commands repetitively: constructs used in Bash to automate command execution.

  • Chapter 10: Advanced variables: specifying variable types, introduction to arrays of variables, operations on variables.

  • Chapter 11: Functions: an introduction.

  • Chapter 12: Catching signals: introduction to process signalling, trapping user-sent signals.


Chapter 1. Bash and Bash scripts

In this introduction module we

  • Describe some common shells

  • Point out GNU Bash advantages and features

  • Describe the shell's building blocks

  • Discuss Bash initialization files

  • See how the shell executes commands

  • Look into some simple script examples


1.1. Common shell programs

1.1.1. General shell functions

The UNIX shell program interprets user commands, which are either directly entered by the user or read from a file called a shell script or shell program. Shell scripts are interpreted, not compiled. The shell reads commands from the script line by line and searches for those commands on the system (see Section 1.2), while a compiler converts a program into machine readable form, an executable file - which may then be used in a shell script.

Apart from passing commands to the kernel, the main task of a shell is providing a user environment, which can be configured individually using shell resource configuration files.


1.1.2. Shell types

Just like people know different languages and dialects, your UNIX system will usually offer a variety of shell types:

  • sh or Bourne Shell: the original shell still used on UNIX systems and in UNIX-related environments. This is the basic shell, a small program with few features. While this is not the standard shell, it is still available on every Linux system for compatibility with UNIX programs.

  • bash or Bourne Again shell: the standard GNU shell, intuitive and flexible. Probably most advisable for beginning users while being at the same time a powerful tool for the advanced and professional user. On Linux, bash is the standard shell for common users. This shell is a so-called superset of the Bourne shell, a set of add-ons and plug-ins. This means that the Bourne Again shell is compatible with the Bourne shell: commands that work in sh, also work in bash. However, the reverse is not always the case. All examples and exercises in this book use bash.

  • csh or C shell: the syntax of this shell resembles that of the C programming language. Sometimes asked for by programmers.

  • tcsh or TENEX C shell: a superset of the common C shell, enhancing user-friendliness and speed. That is why some also call it the Turbo C shell.

  • ksh or the Korn shell: sometimes appreciated by people with a UNIX background. A superset of the Bourne shell; with standard configuration a nightmare for beginning users.

The file /etc/shells gives an overview of known shells on a Linux system:


mia:~> cat /etc/shells
/bin/bash
/bin/sh
/bin/tcsh
/bin/csh

Your default shell is set in the /etc/passwd file, like this line for user mia:


mia:L2NOfqdlPrHwE:504:504:Mia Maya:/home/mia:/bin/bash

To switch from one shell to another, just enter the name of the new shell in the active terminal. The system finds the directory where the executable resides using the PATH setting, and since a shell is an executable file (program), the current shell activates it and it gets executed. A new prompt is usually shown, because each shell has its typical appearance:


mia:~> tcsh
[mia@post21 ~]$

1.2. Advantages of the Bourne Again SHell

1.2.1. Bash is the GNU shell

The GNU project (GNU's Not UNIX) provides tools for UNIX-like system administration which are free software and comply with UNIX standards.

Bash is an sh-compatible shell that incorporates useful features from the Korn shell (ksh) and C shell (csh). It is intended to conform to the IEEE POSIX P1003.2/ISO 9945.2 Shell and Tools standard. It offers functional improvements over sh for both programming and interactive use; these include command line editing, unlimited size command history, job control, shell functions and aliases, indexed arrays of unlimited size, and integer arithmetic in any base from two to sixty-four. Bash can run most sh scripts without modification.

Like the other GNU projects, the bash initiative was started to preserve, protect and promote the freedom to use, study, copy, modify and redistribute software. It is generally known that such conditions stimulate creativity. This was also the case with the bash program, which has a lot of extra features that other shells can't offer.


1.2.2. Features only found in bash

1.2.2.1. Invocation

In addition to the single-character shell command line options which can generally be configured using the set shell built-in command, there are several multi-character options that you can use. We will come across a couple of the more popular options in this and the following chapters; the complete list can be found in the Bash info pages, Bash features->Invoking Bash.


1.2.2.2. Bash startup files

Startup files are scripts that are read and executed by Bash when it starts. The following subsections describe the different ways to start the shell, and the startup files that are read in each case.


1.2.2.2.1. Invoked as an interactive login shell, or with `--login'

Interactive means you can enter commands. The shell is not running because a script has been activated. A login shell means that you got the shell after authenticating to the system, usually by giving your user name and password.

Files read:

  • /etc/profile

  • ~/.bash_profile, ~/.bash_login or ~/.profile: first existing readable file is read

  • ~/.bash_logout upon logout.

Error messages are printed if configuration files exist but are not readable. If a file does not exist, bash searches for the next.


1.2.2.2.2. Invoked as an interactive non-login shell

A non-login shell means that you did not have to authenticate to the system. For instance, when you open a terminal using an icon, or a menu item, that is a non-login shell.

Files read:

  • ~/.bashrc

This file is usually referred to in ~/.bash_profile:

if [ -f ~/.bashrc ]; then . ~/.bashrc; fi

See Chapter 7 for more information on the if construct.


1.2.2.2.3. Invoked non-interactively

All scripts use non-interactive shells. They are programmed to do certain tasks and cannot be instructed to do other jobs than those for which they are programmed.

Files read:

  • defined by BASH_ENV

PATH is not used to search for this file, so if you want to use it, it is best to refer to it by its full path and file name.
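
For example (the file name here is purely illustrative):

export BASH_ENV="$HOME/.bash_env_for_scripts"    # full path, since PATH is not searched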


1.2.2.2.4. Invoked with the sh command

Bash tries to behave as the historical Bourne sh program while conforming to the POSIX standard as well.

Files read:

  • /etc/profile

  • ~/.profile

When invoked interactively, the ENV variable can point to extra startup information.


1.2.2.2.5. POSIX mode

This option is enabled either using the set built-in:

set -o posix

or by calling the bash program with the --posix option. Bash will then try to be as compliant as possible with the POSIX standard for shells. Setting the POSIXLY_CORRECT variable does the same.

Files read:

  • defined by ENV variable.


1.2.2.2.6. Invoked remotely

Files read when invoked by rshd:

  • ~/.bashrc

Warning: Avoid use of r-tools

Be aware of the dangers when using tools such as rlogin, telnet, rsh and rcp. They are intrinsically insecure because confidential data is sent over the network unencrypted. If you need tools for remote execution, file transfer and so on, use an implementation of Secure SHell, generally known as SSH, freely available from http://www.openssh.org. Different client programs are available for non-UNIX systems as well, see your local software mirror.


1.2.2.2.7. Invoked when UID is not equal to EUID

No startup files are read in this case.


1.2.2.3. Interactive shells

1.2.2.3.1. What is an interactive shell?

An interactive shell generally reads from, and writes to, a user's terminal: input and output are connected to a terminal. Bash interactive behavior is started when the bash command is called without non-option arguments, except when the option is a string to read from or when the shell is invoked to read from standard input, which allows for positional parameters to be set (see Chapter 3).


1.2.2.3.2. Is this shell interactive?

Test by looking at the content of the special parameter -; it contains an 'i' when the shell is interactive:


eddy:~> echo $-
himBH

In non-interactive shells, the prompt, PS1, is unset.
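
Inside a script you can use the same test; this is a small sketch (not part of the original guide) using the case construct from Chapter 7:

case "$-" in
  *i*) echo "This shell is interactive." ;;
  *)   echo "This shell is non-interactive." ;;
esac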


1.2.2.3.3. Interactive shell behavior

Differences in interactive mode:

  • Bash reads startup files.

  • Job control enabled by default.

  • Prompts are set; PS2, usually set to ">", is enabled for multi-line commands. This is also the prompt you get when the shell thinks you entered an unfinished command, for instance when you forget closing quotes or leave out part of a command structure.

  • Commands are by default read from the command line using readline.

  • Bash interprets the shell option ignoreeof instead of exiting immediately upon receiving EOF (End Of File).

  • Command history and history expansion are enabled by default. History is saved in the file pointed to by HISTFILE when the shell exits. By default, HISTFILE points to ~/.bash_history.

  • Alias expansion is enabled.

  • In the absence of traps, the SIGTERM signal is ignored.

  • In the absence of traps, SIGINT is caught and handled. Thus, typing Ctrl+C, for example, will not quit your interactive shell.

  • Sending SIGHUP signals to all jobs on exit is configured with the huponexit option.

  • Commands are executed upon read.

  • Bash checks for mail periodically.

  • Bash can be configured to exit when it encounters unreferenced variables. In interactive mode this behavior is disabled.

  • When shell built-in commands encounter redirection errors, this will not cause the shell to exit.

  • Special built-ins returning errors when used in POSIX mode don't cause the shell to exit. The built-in commands are listed in Section 1.3.2.

  • Failure of exec will not exit the shell.

  • Parser syntax errors don't cause the shell to exit.

  • Simple spell check for the arguments to the cd built-in is enabled by default.

  • Automatic exit after the length of time specified in the TMOUT variable has passed is enabled.



1.2.2.4. Conditionals

Conditional expressions are used by the [[ compound command and by the test and [ built-in commands.

Expressions may be unary or binary. Unary expressions are often used to examine the status of a file. You only need one object, for instance a file, to do the operation on.

There are string operators and numeric comparison operators as well; these are binary operators, requiring two objects to do the operation on. If the FILE argument to one of the primaries is in the form /dev/fd/N, then file descriptor N is checked. If the FILE argument to one of the primaries is one of /dev/stdin, /dev/stdout or /dev/stderr, then file descriptor 0, 1 or 2 respectively is checked.
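
A few hedged examples; the file name report.txt is only an illustration:

[ -f report.txt ] && echo "report.txt is a regular file"        # unary: file test
[ "$USER" = "root" ] && echo "you are root"                     # binary: string comparison
[ "$#" -gt 0 ] && echo "at least one argument was given"        # binary: numeric comparison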

Conditionals are discussed in detail in Chapter 7.

More information about the file descriptors in Section 8.2.3.


1.2.2.5. Shell arithmetic

The shell allows arithmetic expressions to be evaluated, as one of the shell expansions or by the let built-in.

Evaluation is done in fixed-width integers with no check for overflow, though division by 0 is trapped and flagged as an error. The operators and their precedence and associativity are the same as in the C language, see Chapter 3.
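
For instance, using the let built-in and the $(( )) arithmetic expansion:

let "RESULT = 7 * 6"      # evaluate with the let built-in
echo $RESULT              # 42

echo $(( 2#1010 + 1 ))    # arithmetic expansion; 2#1010 is the base-two constant 10, so this prints 11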


1.2.2.6. Aliases

Aliases allow a string to be substituted for a word when it is used as the first word of a simple command. The shell maintains a list of aliases that may be set and unset with the alias and unalias commands.
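
For example (the alias name is arbitrary):

alias ll='ls -l'     # define an alias
ll /tmp              # now runs "ls -l /tmp"
unalias ll           # remove the alias again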

Bash always reads at least one complete line of input before executing any of the commands on that line. Aliases are expanded when a command is read, not when it is executed. Therefore, an alias definition appearing on the same line as another command does not take effect until the next line of input is read. The commands following the alias definition on that line are not affected by the new alias.

Aliases are expanded when a function definition is read, not when the function is executed, because a function definition is itself a compound command. As a consequence, aliases defined in a function are not available until after that function is executed.

We will discuss aliases in detail in Section 3.5.


1.2.2.7. Arrays

Bash provides one-dimensional array variables. Any variable may be used as an array; the declare built-in will explicitly declare an array. There is no maximum limit on the size of an array, nor any requirement that members be indexed or assigned contiguously. Arrays are zero-based. See Chapter 10.
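
A minimal sketch of what this looks like (arrays are treated in detail in Chapter 10):

declare -a SEASONS                       # explicitly declare an indexed array
SEASONS=(winter spring summer autumn)    # assign all members at once
echo "${SEASONS[0]}"                     # zero-based: prints "winter"
echo "${#SEASONS[@]}"                    # number of members: prints "4"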


1.2.2.8. Directory stack

The directory stack is a list of recently-visited directories. The pushd built-in adds directories to the stack as it changes the current directory, and the popd built-in removes specified directories from the stack and changes the current directory to the directory removed.

Content can be displayed by issuing the dirs command or by checking the content of the DIRSTACK variable.
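
For example (the directories are arbitrary):

pushd /var/log          # change to /var/log and put it on the stack
pushd /etc              # change to /etc and put it on the stack
dirs                    # display the stack
echo "${DIRSTACK[@]}"   # the same information, kept in an array variable
popd                    # return to /var/log and remove /etc from the stack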

More information about the workings of this mechanism can be found in the Bash info pages.


1.2.2.9. The prompt

Bash makes playing with the prompt even more fun. See the section Controlling the Prompt in the Bash info pages.
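
As a small taste, the escape sequences \u, \h and \w stand for user name, host name and current working directory:

export PS1="[\u@\h \w]\$ "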


1.2.2.10. The restricted shell

When invoked as rbash or with the --restricted or -r option, the following happens:

  • The cd built-in is disabled.

  • Setting or unsetting SHELL, PATH, ENV or BASH_ENV is not possible.

  • Command names can no longer contain slashes.

  • Filenames containing a slash are not allowed with the . (source) built-in command.

  • The hash built-in does not accept slashes with the -p option.

  • Import of functions at startup is disabled.

  • SHELLOPTS is ignored at startup.

  • Output redirection using >, >|, <>, >&, &> and >> is disabled.

  • The exec built-in is disabled.

  • The -f and -d options are disabled for the enable built-in.

  • A default PATH cannot be specified with the command built-in.

  • Turning off restricted mode is not possible.

When a command that is found to be a shell script is executed, rbash turns off any restrictions in the shell spawned to execute the script.
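
A quick, hedged way to see a couple of these restrictions in action:

bash --restricted
cd /tmp                  # fails: the cd built-in is disabled
echo "hello" > outfile   # fails: output redirection is disabled
exit                     # leave the restricted shell again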



1.3. Executing commands

1.3.1. General

Bash determines the type of program that is to be executed. Normal programs are system commands that exist in compiled form on your system. When such a program is executed, a new process is created because Bash makes an exact copy of itself. This child process has the same environment as its parent, only the process ID number is different. This procedure is called forking.

After the forking process, the address space of the child process is overwritten with the new process data. This is done through an exec call to the system.

The fork-and-exec mechanism thus replaces an old command with a new one, while the environment in which the new program is executed remains the same, including the configuration of input and output devices, environment variables and priority. This mechanism is used to create all UNIX processes, so it also applies to the Linux operating system. Even the first process, init, with process ID 1, is forked during the boot procedure in the so-called bootstrapping procedure.
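
You can get a feel for the fork by comparing process IDs:

echo $$              # PID of the current (parent) shell
bash -c 'echo $$'    # the forked child shell reports a different PID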


1.3.2. Shell built-in commands

Built-in commands are contained within the shell itself. When the name of a built-in command is used as the first word of a simple command, the shell executes the command directly, without creating a new process. Built-in commands are necessary to implement functionality impossible or inconvenient to obtain with separate utilities.

Bash supports 3 types of built-in commands:

  • Bourne Shell built-ins:

    :, ., break, cd, continue, eval, exec, exit, export, getopts, hash, pwd, readonly, return, set, shift, test, [, times, trap, umask and unset.

  • Bash built-in commands:

    alias, bind, builtin, command, declare, echo, enable, help, let, local, logout, printf, read, shopt, type, typeset, ulimit and unalias.

  • Special built-in commands:

    When Bash is executing in POSIX mode, the special built-ins differ from other built-in commands in three respects:

    1. Special built-ins are found before shell functions during command lookup.

    2. If a special built-in returns an error status, a non-interactive shell exits.

    3. Assignment statements preceding the command stay in effect in the shell environment after the command completes.

    The POSIX special built-ins are :, ., break, continue, eval, exec, exit, export, readonly, return, set, shift, trap and unset.

Most of these built-ins will be discussed in the next chapters. For those commands for which this is not the case, we refer to the Info pages.
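
The type built-in, listed above, tells you what kind of command a name refers to:

type cd          # reports that cd is a shell builtin
type ls          # typically reports /bin/ls, or an alias, depending on your setup
type -a echo     # lists the built-in as well as any echo executable found in PATH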


1.3.3. Executing programs from a script

When the program being executed is a shell script, bash will create a new bash process using a fork. This subshell reads the lines from the shell script one line at a time. Commands on each line are read, interpreted and executed as if they would have come directly from the keyboard.

While the subshell processes each line of the script, the parent shell waits for its child process to finish. When there are no more lines in the shell script to read, the subshell terminates. The parent shell awakes and displays a new prompt.


1.4. Building blocks

1.4.1. Shell building blocks

1.4.1.1. Shell syntax

If input is not commented, the shell reads it and divides it into words and operators, employing quoting rules to define the meaning of each character of input. Then these words and operators are translated into commands and other constructs, which return an exit status available for inspection or processing. The above fork-and-exec scheme is only applied after the shell has analyzed input in the following way:

  • The shell reads its input from a file, from a string or from the user's terminal.

  • Input is broken up into words and operators, obeying the quoting rules, see Chapter 3. These tokens are separated by metacharacters. Alias expansion is performed.

  • The shell parses (analyzes and substitutes) the tokens into simple and compound commands.

  • Bash performs various shell expansions, breaking the expanded tokens into lists of filenames and commands and arguments.

  • Redirection is performed if necessary, redirection operators and their operands are removed from the argument list.

  • Commands are executed.

  • Optionally the shell waits for the command to complete and collects its exit status.


1.4.1.2. Shell commands

A simple shell command such as touch file1 file2 file3 consists of the command itself followed by arguments, separated by spaces.

More complex shell commands are composed of simple commands arranged together in a variety of ways: in a pipeline in which the output of one command becomes the input of a second, in a loop or conditional construct, or in some other grouping. A couple of examples:

ls | more

gunzip file.tar.gz | tar xvf -


1.4.1.3. Shell functions

Shell functions are a way to group commands for later execution using a single name for the group. They are executed just like a "regular" command. When the name of a shell function is used as a simple command name, the list of commands associated with that function name is executed.

Shell functions are executed in the current shell context; no new process is created to interpret them.
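
A minimal example of defining and calling a function:

hello() {                                       # define the function
  echo "Hello, $USER, today is $(date +%A)."
}

hello                                           # execute the associated list of commands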

Functions are explained in Chapter 11.


1.4.1.4. Shell parameters

A parameter is an entity that stores values. It can be a name, a number or a special value. For the shell's purpose, a variable is a parameter denoted by a name. A variable has a value and zero or more attributes. Variables are created with the declare shell built-in command.

If no value is given, a variable is assigned the null string. Variables can only be removed with the unset built-in.
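
For instance:

declare COLOUR="black"    # create a variable and give it a value
echo "$COLOUR"            # prints "black"
unset COLOUR              # remove the variable again
echo "$COLOUR"            # now prints an empty line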

Assigning variables is discussed in Section 3.2, advanced use of variables in Chapter 10.


1.4.1.5. Shell expansions

Shell expansion is performed after each command line has been split into tokens. These are the expansions performed:

  • Brace expansion

  • Tilde expansion

  • Parameter and variable expansion

  • Command substitution

  • Arithmetic expansion

  • Word splitting

  • Filename expansion

We'll discuss these expansion types in detail in Section 3.4.
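
A few of these expansions in action; the file names are arbitrary examples:

echo file{1,2,3}.txt      # brace expansion: file1.txt file2.txt file3.txt
echo ~                    # tilde expansion: your home directory
echo "$HOME"              # parameter and variable expansion
echo "Today is $(date)"   # command substitution
echo $(( 3 + 4 ))         # arithmetic expansion: 7
ls *.sh                   # filename expansion (globbing)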


1.4.1.6. Redirections

Before a command is executed, its input and output may be redirected using a special notation interpreted by the shell. Redirection may also be used to open and close files for the current shell execution environment.
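
Some common forms, using arbitrary file names:

ls -l /etc > listing.txt         # send standard output to a file (overwriting it)
ls -l /etc >> listing.txt        # append standard output to a file
sort < listing.txt               # take standard input from a file
ls /nonexistent 2> errors.txt    # send standard error to a file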


1.4.1.7. Executing commands

When executing a command, the words that the parser has marked as variable assignments (preceding the command name) and redirections are saved for later reference. Words that are not variable assignments or redirections are expanded; the first remaining word after expansion is taken to be the name of the command and the rest are arguments to that command. Then redirections are performed, then strings assigned to variables are expanded. If no command name results, the variable assignments affect the current shell environment.

An important part of the tasks of the shell is to search for commands. Bash does this as follows:

  • Check whether the command contains slashes. If not, first check with the function list to see if it contains a command by the name we are looking for.

  • If command is not a function, check for it in the built-in list.

  • If command is neither a function nor a built-in, look for it analyzing the directories listed in PATH. Bash uses a hash table (data storage area in memory) to remember the full path names of executables so extensive PATH searches can be avoided.

  • If the search is unsuccessful, bash prints an error message and returns an exit status of 127 (see the example after this list).

  • If the search was successful or if the command contains slashes, the shell executes the command in a separate execution environment.

  • If execution fails because the file is not in an executable format, and the file is not a directory, it is assumed to be a shell script.

  • If the command was not begun asynchronously, the shell waits for the command to complete and collects its exit status.
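
To see the exit status of an unsuccessful search, try a name that does not exist (the name below is deliberately made up):

thiscommanddoesnotexist    # the search fails and an error message is printed
echo $?                    # prints 127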


1.4.1.8. Shell scripts

When a file containing shell commands is used as the first non-option argument when invoking Bash (without the -c or -s option), a non-interactive shell is created. This shell first searches for the script file in the current directory, then looks in PATH if the file cannot be found there.


1.5. Developing good scripts

1.5.1. Properties of good scripts

This guide is mainly about the last shell building block, scripts. Some general considerations before we continue:

  1. A script should run without errors.

  2. It should perform the task for which it is intended.

  3. Program logic is clearly defined and apparent.

  4. A script does not do unnecessary work.

  5. Scripts should be reusable.


1.5.2. Structure

The structure of a shell script is very flexible. Even though in Bash a lot of freedom is granted, you must ensure correct logic, flow control and efficiency so that users executing the script can do so easily and correctly.

When starting on a new script, ask yourself the following questions:

  • Will I be needing any information from the user or from the user's environment?

  • How will I store that information?

  • Are there any files that need to be created? Where and with which permissions and ownerships?

  • What commands will I use? When using the script on different systems, do all these systems have these commands in the required versions?

  • Does the user need any notifications? When and why?


1.5.3. Terminology

The table below gives an overview of programming terms that you need to be familiar with:

Table 1-1. Overview of programming terms

Term                    What is it?
Command control         Testing exit status of a command in order to determine whether a portion of the program should be executed.
Conditional branch      Logical point in the program when a condition determines what happens next.
Logic flow              The overall design of the program. Determines logical sequence of tasks so that the result is successful and controlled.
Loop                    Part of the program that is performed zero or more times.
User input              Information provided by an external source while the program is running, can be stored and recalled when needed.

1.5.4. A word on order and logic

In order to speed up the development process, the logical order of a program should be thought over in advance. This is your first step when developing a script.

A number of methods can be used; one of the most common is working with lists. Itemizing the list of tasks involved in a program allows you to describe each process. Individual tasks can be referenced by their item number.

Using your own spoken language to pin down the tasks to be executed by your program will help you to create an understandable form of your program. Later, you can replace the everyday language statements with shell language words and constructs.

The example below shows such a logic flow design. It describes the rotation of log files. This example shows a possible repetitive loop, controlled by the number of base log files you want to rotate:

  1. Do you want to rotate logs?

    1. If yes:

      1. Enter directory name containing the logs to be rotated.

      2. Enter base name of the log file.

      3. Enter number of days logs should be kept.

      4. Make settings permanent in user's crontab file.

    2. If no, go to step 3.

  2. Do you want to rotate another set of logs?

    1. If yes: repeat step 1.

    2. If no: go to step 3.

  3. Exit

The user should provide information for the program to do something. Input from the user must be obtained and stored. The user should be notified that his crontab will change.
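
A very rough Bash sketch of this design is shown below; it is deliberately incomplete (the crontab step is only hinted at), it uses constructs that are only explained in later chapters, and all names are examples:

#!/bin/bash
# Rough sketch of the log rotation dialogue described above - illustration only.

while true; do
  read -p "Do you want to rotate logs? [y/n] " ANSWER
  if [ "$ANSWER" != "y" ]; then
    break                                            # step 3: exit
  fi
  read -p "Directory containing the logs: " LOGDIR
  read -p "Base name of the log file: " LOGBASE
  read -p "Number of days logs should be kept: " DAYS
  echo "Would rotate $LOGDIR/$LOGBASE, keeping $DAYS days of logs."
  echo "Remember to make these settings permanent in your crontab."
done
echo "Bye."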


1.5.5. An example Bash script: mysystem.sh

The mysystem.sh script below executes some well-known commands (date, w, uname, uptime) to display information about you and your machine.


tom:~> cat -n mysystem.sh
     1  #!/bin/bash
     2  clear
     3  echo "This is information provided by mysystem.sh.  Program starts now."
     4
     5  echo "Hello, $USER"
     6  echo
     7
     8  echo "Today's date is `date`, this is week `date +"%V"`."
     9  echo
    10
    11  echo "These users are currently connected:"
    12  w | cut -d " " -f 1 - | grep -v USER | sort -u
    13  echo
    14
    15  echo "This is `uname -s` running on a `uname -m` processor."
    16  echo
    17
    18  echo "This is the uptime information:"
    19  uptime
    20  echo
    21
    22  echo "That's all folks!"

A script always starts with the same two characters, "#!". After that, the shell that will execute the commands following the first line is defined. This script starts by clearing the screen on line 2. Line 3 makes it print a message, informing the user about what is going to happen. Line 5 greets the user. Lines 6, 9, 13, 16 and 20 are only there for orderly output display purposes. Line 8 prints the current date and the number of the week. Line 11 is again an informative message, like lines 3, 18 and 22. Line 12 formats the output of the w command; line 15 shows operating system and CPU information. Line 19 gives the uptime and load information.

Both echo and printf are Bash built-in commands. The first always exits with a 0 status, and simply prints arguments followed by an end of line character on the standard output, while the latter allows for definition of a formatting string and gives a non-zero exit status code upon failure.

This is the same script using the printf built-in:


tom:~> cat mysystem.sh
#!/bin/bash
clear
printf "This is information provided by mysystem.sh.  Program starts now.\n"

printf "Hello, $USER.\n\n"

printf "Today's date is `date`, this is week `date +"%V"`.\n\n"

printf "These users are currently connected:\n"
w | cut -d " " -f 1 - | grep -v USER | sort -u
printf "\n"

printf "This is `uname -s` running on a `uname -m` processor.\n\n"

printf "This is the uptime information:\n"
uptime
printf "\n"

printf "That's all folks!\n"

Creating user friendly scripts by means of inserting messages is treated in Chapter 8.

Note: Standard location of the Bourne Again shell

This implies that the bash program is installed in /bin.

Warning: If stdout is not available

If you execute a script from cron, supply full path names and redirect output and errors. Since the shell runs in non-interactive mode, any errors will cause the script to exit prematurely if you don't think about this.

The following chapters will discuss the details of the above scripts.


1.5.6. Example init script

An init script starts system services on UNIX and Linux machines. The system log daemon, the power management daemon, the name and mail daemons are common examples. These scripts, also known as startup scripts, are stored in a specific location on your system, such as /etc/rc.d/init.d or /etc/init.d. Init, the initial process, reads its configuration files and decides which services to start or stop in each run level. A run level is a configuration of processes; each system has a single user run level, for instance, for performing administrative tasks, for which the system has to be in an unused state as much as possible, such as recovering a critical file system from a backup. Reboot and shutdown run levels are usually also configured.

The tasks to be executed upon starting a service or stopping it are listed in the startup scripts. It is one of the system administrator's tasks to configure init, so that services are started and stopped at the correct moment. When confronted with this task, you need a good understanding of the startup and shutdown procedures on your system. We therefore advise that you read the man pages for init and inittab before starting on your own initialization scripts.

Here is a very simple example, that will play a sound upon starting and stopping your machine:


#!/bin/bash

# This script is for /etc/rc.d/init.d
# Link in rc3.d/S99audio-greeting and rc0.d/K01audio-greeting

case "$1" in
'start')
  cat /usr/share/audio/at_your_service.au > /dev/audio
  ;;
'stop')
  cat /usr/share/audio/oh_no_not_again.au > /dev/audio
  ;;
esac
exit 0

The case statement often used in this kind of script is described in Section 7.2.5.


1.6. Summary

Bash is the GNU shell, compatible with the Bourne shell and incorporating many useful features from other shells. When the shell is started, it reads its configuration files. The most important are:

  • /etc/profile

  • ~/.bash_profile

  • ~/.bashrc

Bash behaves differently when in interactive mode and also has a POSIX compliant and a restricted mode.

Shell commands can be split up into three groups: the shell functions, shell built-ins and existing commands in a directory on your system. Bash supports additional built-ins not found in the plain Bourne shell.

Shell scripts consist of these commands arranged as shell syntax dictates. Scripts are read and executed line per line and should have a logical structure.


1.7. Exercises

These are some exercises to warm you up for the next chapter:

  1. Where is the bash program located on your system?

  2. Use the --version option to find out which version you are running.

  3. Which shell configuration files are read when you login to your system using the graphical user interface and then opening a terminal window?

  4. Are the following shells interactive shells? Are they login shells?

    • A shell opened by clicking on the background of your graphical desktop, selecting "Terminal" or such from a menu.

    • A shell that you get after issuing the command ssh localhost.

    • A shell that you get when logging in to the console in text mode.

    • A shell obtained by the command xterm &.

    • A shell opened by the mysystem.sh script.

    • A shell that you get on a remote host, for which you didn't have to give the login and/or password because you use SSH and maybe SSH keys.

  5. Can you explain why bash does not exit when you type Ctrl+C on the command line?

  6. Display directory stack content.

  7. If it is not yet the case, set your prompt so that it displays your location in the file system hierarchy, for instance add this line to ~/.bashrc:

    export PS1="\u@\h \w> "

  8. Display hashed commands for your current shell session.

  9. How many processes are currently running on your system? Use ps and wc; the first line of output of ps is not a process!

  10. How do you display the system host name? Only the name, nothing more!


Chapter 2. Writing and debugging scripts

After going through this chapter, you will be able to:

  • Write a simple script

  • Define the shell type that should execute the script

  • Put comments in a script

  • Change permissions on a script

  • Execute and debug a script


2.1. Creating and running a script

2.1.1. Writing and naming

A shell script is a sequence of commands for which you have a repeated use. This sequence is typically executed by entering the name of the script on the command line. Alternatively, you can use scripts to automate tasks using the cron facility. Another use for scripts is in the UNIX boot and shutdown procedure, where the operation of daemons and services is defined in init scripts.

To create a shell script, open a new empty file in your editor. Any text editor will do: vim, emacs, gedit, dtpad et cetera are all valid. You might want to choose a more advanced editor like vim or emacs, however, because these can be configured to recognize shell and Bash syntax and can be a great help in preventing those errors that beginners frequently make, such as forgetting brackets and semi-colons.

Tip: Syntax highlighting in vim

In order to activate syntax highlighting in vim, use the command

:syntax enable

or

:sy enable

or

:syn enable

You can add this setting to your .vimrc file to make it permanent.

Put UNIX commands in the new empty file, like you would enter them on the command line. As discussed in the previous chapter (see Section 1.3), commands can be shell functions, shell built-ins, UNIX commands and other scripts.

Give your script a sensible name that gives a hint about what the script does. Make sure that your script name does not conflict with existing commands. In order to ensure that no confusion can arise, script names often end in .sh; even so, there might be other scripts on your system with the same name as the one you chose. Check using which, whereis and other commands for finding information about programs and files:

which -a script_name

whereis script_name

locate script_name


2.1.2. script1.sh

In this example we use the echo Bash built-in to inform the user about what is going to happen, before the task that will create the output is executed. It is strongly advised to inform users about what a script is doing, in order to prevent them from becoming nervous because the script is not doing anything. We will return to the subject of notifying users in Chapter 8.

Figure 2-1. script1.sh

Write this script for yourself as well. It might be a good idea to create a directory ~/scripts to hold your scripts. Add the directory to the contents of the PATH variable:

export PATH="$PATH:$HOME/scripts"

If you are just getting started with Bash, use a text editor that uses different colours for different shell constructs. Syntax highlighting is supported by vim, gvim, (x)emacs, kwrite and many other editors; check the documentation of your favorite editor.

Note: Different prompts

The prompts throughout this course vary depending on the author's mood. This much more closely resembles real-life situations than the standard educational $ prompt. The only convention we stick to is that the root prompt ends in a hash mark (#).


2.1.3. Executing the script

The script should have execute permissions for the correct owners in order to be runnable. When setting permissions, check that you really obtained the permissions that you want. When this is done, the script can run like any other command:


willy:~/scripts> chmod u+x script1.sh

willy:~/scripts> ls -l script1.sh
-rwxrw-r--    1 willy	willy		456 Dec 24 17:11 script1.sh

willy:~> script1.sh
The script starts now.
Hi, willy!

I will now fetch you a list of connected users:

  3:38pm  up 18 days,  5:37,  4 users,  load average: 0.12, 0.22, 0.15
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
root     tty2     -                Sat 2pm  4:25m  0.24s  0.05s  -bash
willy	 :0       -                Sat 2pm   ?     0.00s   ?     -
willy    pts/3    -                Sat 2pm  3:33m 36.39s 36.39s  BitchX willy ir
willy    pts/2    -                Sat 2pm  3:33m  0.13s  0.06s  /usr/bin/screen

I'm setting two variables now.
This is a string: black
And this is a number: 9

I'm giving you back your prompt now.

willy:~/scripts> echo $COLOUR

willy:~/scripts> echo $VALUE

willy:~/scripts>

This is the most common way to execute a script. It is preferred to execute the script like this in a subshell. The variables, functions and aliases created in this subshell are only known to the particular bash session of that subshell. When that shell exits and the parent regains control, everything is cleaned up and all changes to the state of the shell made by the script, are forgotten.

If you did not put the scripts directory in your PATH, and . (the current directory) is not in the PATH either, you can activate the script like this:

./script_name.sh

A script can also explicitly be executed by a given shell, but generally we only do this if we want to obtain special behavior, such as checking if the script works with another shell or printing traces for debugging:

rbash script_name.sh

sh script_name.sh

bash -x script_name.sh

The specified shell will start as a subshell of your current shell and execute the script. This is done when you want the script to start up with specific options or under specific conditions which are not specified in the script.

If you don't want to start a new shell but execute the script in the current shell, you source it:

source script_name.sh

Tip: source = .

The Bash source built-in is a synonym for the Bourne shell . (dot) command.

The script does not need execute permission in this case. Commands are executed in the current shell context, so any changes made to your environment will be visible when the script finishes execution:


willy:~/scripts> source script1.sh
--output omitted--

willy:~/scripts> echo $VALUE
9

willy:~/scripts>

2.2. Script basics

2.2.1. Which shell will run the script?

When running a script in a subshell, you should define which shell should run the script. The shell type in which you wrote the script might not be the default on your system, so commands you entered might result in errors when executed by the wrong shell.

The first line of the script determines the shell to start. The first two characters of the first line should be #!, then follows the path to the shell that should interpret the commands that follow. Blank lines are also considered to be lines, so don't start your script with an empty line.

For the purpose of this course, all scripts will start with the line

#!/bin/bash

As noted before, this implies that the Bash executable can be found in /bin.


2.2.2. Adding comments

You should be aware of the fact that you might not be the only person reading your code. A lot of users and system administrators run scripts that were written by other people. If they want to see how you did it, comments are useful to enlighten the reader.

Comments also make your own life easier. Say that you had to read a lot of man pages in order to achieve a particular result with some command that you used in your script. You won't remember how it worked if you need to change your script after a few weeks or months, unless you have commented what you did, how you did it and/or why you did it.

Take the script1.sh example and copy it to commented-script1.sh, which we edit so that the comments reflect what the script does. Everything the shell encounters after a hash mark on a line is ignored and only visible upon opening the shell script file:


#!/bin/bash
# This script clears the terminal, displays a greeting and gives information
# about currently connected users.  The two example variables are set and displayed.

clear				# clear terminal window

echo "The script starts now."

echo "Hi, $USER!"		# dollar sign is used to get content of variable
echo

echo "I will now fetch you a list of connected users:"
echo							
w				# show who is logged on and
echo				# what they are doing

echo "I'm setting two variables now."
COLOUR="black"					# set a local shell variable
VALUE="9"					# set a local shell variable
echo "This is a string: $COLOUR"		# display content of variable 
echo "And this is a number: $VALUE"		# display content of variable
echo

echo "I'm giving you back your prompt now."
echo

In a decent script, the first lines are usually comment about what to expect. Then each big chunk of commands will be commented as needed for clarity's sake. Linux init scripts, as an example, in your system's init.d directory, are usually well commented since they have to be readable and editable by everyone running Linux.


2.3. Debugging Bash scripts

2.3.1. Debugging on the entire script

When things don't go according to plan, you need to determine what exactly causes the script to fail. Bash provides extensive debugging features. The most common is to start up the subshell with the -x option, which will run the entire script in debug mode. Traces of each command plus its arguments are printed to standard output after the commands have been expanded but before they are executed.

This is the commented-script1.sh script run in debug mode. Note again that the added comments are not visible in the output of the script.


willy:~/scripts> bash -x script1.sh
+ clear

+ echo 'The script starts now.'
The script starts now.
+ echo 'Hi, willy!'
Hi, willy!
+ echo

+ echo 'I will now fetch you a list of connected users:'
I will now fetch you a list of connected users:
+ echo

+ w
  4:50pm  up 18 days,  6:49,  4 users,  load average: 0.58, 0.62, 0.40
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
root     tty2     -                Sat 2pm  5:36m  0.24s  0.05s  -bash
willy	 :0       -                Sat 2pm   ?     0.00s   ?     -
willy	 pts/3    -                Sat 2pm 43:13  36.82s 36.82s  BitchX willy ir
willy    pts/2    -                Sat 2pm 43:13   0.13s  0.06s  /usr/bin/screen
+ echo

+ echo 'I'\''m setting two variables now.'
I'm setting two variables now.
+ COLOUR=black
+ VALUE=9
+ echo 'This is a string: '
This is a string:
+ echo 'And this is a number: '
And this is a number:
+ echo

+ echo 'I'\''m giving you back your prompt now.'
I'm giving you back your prompt now.
+ echo

There is now a full-fledged debugger for Bash, available at SourceForge. These debugging features are available in most modern versions of Bash, starting from 3.x.


2.3.2. Debugging on part(s) of the script

Using the set Bash built-in you can run the portions of the script that you are sure are free of errors in normal mode, and display debugging information only for the troublesome zones. Say we are not sure what the w command will do in the example commented-script1.sh; then we could enclose it in the script like this:


set -x			# activate debugging from here
w
set +x			# stop debugging from here

Output then looks like this:


willy: ~/scripts> script1.sh
The script starts now.
Hi, willy!

I will now fetch you a list of connected users:

+ w
  5:00pm  up 18 days,  7:00,  4 users,  load average: 0.79, 0.39, 0.33
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
root     tty2     -                Sat 2pm  5:47m  0.24s  0.05s  -bash
willy    :0       -                Sat 2pm   ?     0.00s   ?     -
willy    pts/3    -                Sat 2pm 54:02  36.88s 36.88s  BitchX willyke
willy    pts/2    -                Sat 2pm 54:02   0.13s  0.06s  /usr/bin/screen
+ set +x

I'm setting two variables now.
This is a string:
And this is a number:

I'm giving you back your prompt now.

willy: ~/scripts>

You can switch debugging mode on and off as many times as you want within the same script.
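
A minimal sketch of a script with two separately traced zones might look like this (the file name and the commands in it are just illustrative assumptions):


#!/bin/bash
# debug_zones.sh: illustrative sketch with two separately traced zones

echo "Commands we trust run in normal mode."

set -x			# activate debugging for the first troublesome zone
w
set +x			# stop debugging again

echo "More trusted commands."

set -x			# activate debugging for a second zone
df -h
set +x			# stop debugging again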

The table below gives an overview of other useful Bash options:

Table 2-1. Overview of set debugging options

Short notation    Long notation     Result
set -f            set -o noglob     Disable file name generation using metacharacters (globbing).
set -v            set -o verbose    Prints shell input lines as they are read.
set -x            set -o xtrace     Print command traces before executing command.

The dash is used to activate a shell option and a plus to deactivate it. Don't let this confuse you!

In the example below, we demonstrate these options on the command line:


willy:~/scripts> set -v

willy:~/scripts> ls
ls 
commented-scripts.sh	script1.sh

willy:~/scripts> set +v
set +v

willy:~/scripts> ls *
commented-scripts.sh    script1.sh

willy:~/scripts> set -f

willy:~/scripts> ls *
ls: *: No such file or directory

willy:~/scripts> touch *

willy:~/scripts> ls
*   commented-scripts.sh    script1.sh

willy:~/scripts> rm *

willy:~/scripts> ls
commented-scripts.sh    script1.sh

Alternatively, these modes can be specified in the script itself, by adding the desired options to the shell declaration on the first line. Options can be combined, as is usually the case with UNIX commands:

#!/bin/bash -xv

Once you have found the buggy part of your script, you can add echo statements before each command you are unsure of, so that you will see exactly where and why things don't work. In the example commented-script1.sh script, it could be done like this, still assuming that displaying the users gives us problems:


echo "debug message: now attempting to start w command"; w

In more advanced scripts, echo statements can be inserted to display the content of variables at different stages of the script, so that flaws can be detected:


echo "Variable VARNAME is now set to $VARNAME."

2.4. Summary

A shell script is a reusable series of commands put in an executable text file. Any text editor can be used to write scripts.

Scripts start with #! followed by the path to the shell executing the commands from the script. Comments are added to a script for your own future reference, and also to make it understandable for other users. It is better to have too many explanations than not enough.

Debugging a script can be done using shell options. Shell options can be used for partial debugging or for analyzing the entire script. Inserting echo commands at strategic locations is also a common troubleshooting technique.


2.5. Exercises

This exercise will help you to create your first script.

  1. Write a script using your favorite editor. The script should display the path to your home directory and the terminal type that you are using. Additionally, it should show all the services started up in runlevel 3 on your system. (hint: use HOME, TERM and ls /etc/rc3.d/S*)

  2. Add comments in your script.

  3. Add information for the users of your script.

  4. Change permissions on your script so that you can run it.

  5. Run the script in normal mode and in debug mode. It should run without errors.

  6. Make errors in your script: see what happens if you misspell commands, if you leave out the first line or put something unintelligible there, or if you misspell shell variable names or write them in lower case characters after they have been declared in capitals. Check what the debug comments say about this.


Chapter 3. The Bash environment

In this chapter we will discuss the various ways in which the Bash environment can be influenced:

  • Editing shell initialization files

  • Using variables

  • Using different quote styles

  • Performing arithmetic calculations

  • Assigning aliases

  • Using expansion and substitution


3.1. Shell initialization files

3.1.1. System-wide configuration files

3.1.1.1. /etc/profile

When invoked interactively with the --login option or when invoked as sh, Bash reads the /etc/profile instructions. These usually set the shell variables PATH, USER, MAIL, HOSTNAME and HISTSIZE.

On some systems, the umask value is configured in /etc/profile; on other systems this file holds pointers to other configuration files such as:

  • /etc/inputrc, the system-wide Readline initialization file where you can configure the command line bell-style.

  • the /etc/profile.d directory, which contains files configuring system-wide behavior of specific programs.

All settings that you want to apply to all your users' environments should be in this file. It might look like this:


# /etc/profile

# System wide environment and startup programs, for login setup

PATH=$PATH:/usr/X11R6/bin

# No core files by default
ulimit -S -c 0 > /dev/null 2>&1

USER="`id -un`"
LOGNAME=$USER
MAIL="/var/spool/mail/$USER"

HOSTNAME=`/bin/hostname`
HISTSIZE=1000

# Keyboard, bell, display style: the readline config file:
if [ -z "$INPUTRC" -a ! -f "$HOME/.inputrc" ]; then
    INPUTRC=/etc/inputrc
fi

PS1="\u@\h \W"

export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE INPUTRC PS1

# Source initialization files for specific programs (ls, vim, less, ...)
for i in /etc/profile.d/*.sh ; do
    if [ -r "$i" ]; then
        . $i
    fi
done

# Settings for program initialization
source /etc/java.conf
export NPX_PLUGIN_PATH="$JRE_HOME/plugin/ns4plugin/:/usr/lib/netscape/plugins"

PAGER="/usr/bin/less"

unset i

This configuration file sets some basic shell environment variables as well as some variables required by users running Java and/or Java applications in their web browser. See Section 3.2.

See Chapter 7 for more on the conditional if used in this file; Chapter 9 discusses loops such as the for construct.

The Bash source contains sample profile files for general or individual use. These and the one in the example above need changes in order for them to work in your environment!


3.1.1.2. /etc/bashrc

On systems offering multiple types of shells, it might be better to put Bash-specific configurations in this file, since /etc/profile is also read by other shells, such as the Bourne shell. Errors generated by shells that don't understand the Bash syntax are prevented by splitting the configuration files for the different types of shells. In such cases, the user's ~/.bashrc might point to /etc/bashrc in order to include it in the shell initialization process upon login.

You might also find that /etc/profile on your system only holds shell environment and program startup settings, while /etc/bashrc contains system-wide definitions for shell functions and aliases. The /etc/bashrc file might be referred to in /etc/profile or in individual user shell initialization files.

The source contains sample bashrc files, or you might find a copy in /usr/share/doc/bash-2.05b/startup-files. This is part of the bashrc that comes with the Bash documentation:


alias ll='ls -l'
alias dir='ls -ba'
alias c='clear'
alias ls='ls --color'

alias mroe='more'
alias pdw='pwd'
alias sl='ls --color'

pskill()
{
        local pid

        pid=$(ps -ax | grep $1 | grep -v grep | gawk '{ print $1 }')
        echo -n "killing $1 (process $pid)..."
        kill -9 $pid
        echo "slaughtered."
}

Apart from general aliases, it contains useful aliases which make commands work even if you misspell them. We will discuss aliases in Section 3.5.2. This file contains a function, pskill; functions will be studied in detail in Chapter 11.


3.1.2. Individual user configuration files

Note: I don't have these files?!

These files might not be in your home directory by default; create them if needed.


3.1.2.1. ~/.bash_profile

This is the preferred configuration file for configuring user environments individually. In this file, users can add extra configuration options or change default settings:


franky~> cat .bash_profile
#################################################################
#                                                               #
#   .bash_profile file                                          #
#                                                               #
#   Executed from the bash shell when you log in.               #
#                                                               #
#################################################################

source ~/.bashrc
source ~/.bash_login
case "$OS" in
  IRIX)
    stty sane dec
    stty erase
    ;;
#  SunOS)
#    stty erase
#    ;;
  *)
    stty sane
    ;;
esac

This user configures the backspace character for login on different operating systems. Apart from that, the user's .bashrc and .bash_login are read.


3.1.2.2. ~/.bash_login

This file contains specific settings that are normally only executed when you log in to the system. In the example, we use it to configure the umask value and to show a list of connected users upon login. This user also gets the calendar for the current month:


#######################################################################
#                                                                     #
#   Bash_login file                                                   #
#                                                                     #
#   commands to perform from the bash shell at login time             #
#   (sourced from .bash_profile)                                      #
#                                                                     #
#######################################################################
#   file protection
umask 002       # all to me, read to group and others
#   miscellaneous
w
cal `date +"%m"` `date +"%Y"`

In the absence of ~/.bash_profile, this file will be read.


3.1.2.3. ~/.profile

In the absence of ~/.bash_profile and ~/.bash_login, ~/.profile is read. It can hold the same configurations, which are then also accessible by other shells. Mind that other shells might not understand the Bash syntax.


3.1.2.4. ~/.bashrc

Today, it is more common to use a non-login shell, for instance when logged in graphically using X terminal windows. Upon opening such a window, the user does not have to provide a user name or password; no authentication is done. Bash searches for ~/.bashrc when this happens, so it is referred to in the files read upon login as well, which means you don't have to enter the same settings in multiple files.

In this user's .bashrc a couple of aliases are defined and variables for specific programs are set after the system-wide /etc/bashrc is read:


franky ~> cat .bashrc
# /home/franky/.bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
       . /etc/bashrc

fi

# shell options

set -o noclobber

# my shell variables

export PS1="\[\033[1;44m\]\u \w\[\033[0m\] "
export PATH="$PATH:~/bin:~/scripts"

# my aliases

alias cdrecord='cdrecord -dev 0,0,0 -speed=8'
alias ss='ssh octarine'
alias ll='ls -la'

# mozilla fix

MOZILLA_FIVE_HOME=/usr/lib/mozilla
LD_LIBRARY_PATH=/usr/lib/mozilla:/usr/lib/mozilla/plugins
MOZ_DIST_BIN=/usr/lib/mozilla
MOZ_PROGRAM=/usr/lib/mozilla/mozilla-bin
export MOZILLA_FIVE_HOME LD_LIBRARY_PATH MOZ_DIST_BIN MOZ_PROGRAM

# font fix
alias xt='xterm -bg black -fg white &'

# BitchX settings
export IRCNAME="frnk"

# THE END
franky ~>

More examples can be found in the Bash package. Remember that sample files might need changes in order to work in your environment.

Aliases are discussed in Section 3.5.


3.1.2.5. ~/.bash_logout

This file contains specific instructions for the logout procedure. In the example, the terminal window is cleared upon logout. This is useful for remote connections, which will leave a clean window after closing them.


franky ~> cat .bash_logout
#######################################################################
#                                                                     #
#   Bash_logout file                                                  #
#                                                                     #
#   commands to perform from the bash shell at logout time            #
#                                                                     #
#######################################################################
clear
franky ~>

3.1.3. Changing shell configuration files

When making changes to any of the above files, users have to either reconnect to the system or source the altered file for the changes to take effect. By interpreting the script this way, changes are applied to the current shell session:

Figure 3-1. Different prompts for different users

Most shell scripts execute in a private environment: variables are not inherited by child processes unless they are exported by the parent shell. Sourcing a file containing shell commands is a way of applying changes to your own environment and setting variables in the current shell.

This example also demonstrates the use of different prompt settings by different users. In this case, red means danger. When you have a green prompt, don't worry too much.

Note that source resourcefile is the same as . resourcefile.
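
For instance, after editing ~/.bashrc you could apply the changes to the current session with either of these equivalent commands:


franky ~> source ~/.bashrc

franky ~> . ~/.bashrc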

Should you get lost in all these configuration files, and find yourself confronted with settings whose origin is not clear, use echo statements, just like for debugging scripts; see Section 2.3.2. You might add lines like this:


echo "Now executing .bash_profile.."

or like this:


echo "Now setting PS1 in .bashrc:"
export PS1="[some value]"
echo "PS1 is now set to $PS1"

3.2. Variables

3.2.1. Types of variables

As seen in the examples above, shell variables are in uppercase characters by convention. Bash keeps a list of two types of variables:


3.2.1.1. Global variables

Global variables or environment variables are available in all shells. The env or printenv commands can be used to display environment variables. These programs come with the sh-utils package.

Below is a typical output:


franky ~> printenv
CC=gcc
CDPATH=.:~:/usr/local:/usr:/
CFLAGS=-O2 -fomit-frame-pointer
COLORTERM=gnome-terminal
CXXFLAGS=-O2 -fomit-frame-pointer
DISPLAY=:0
DOMAIN=hq.garrels.be
EDITOR=vi
FCEDIT=vi
FIGNORE=.o:~
G_BROKEN_FILENAMES=1
GDK_USE_XFT=1
GDMSESSION=Default
GNOME_DESKTOP_SESSION_ID=Default
GTK_RC_FILES=/etc/gtk/gtkrc:/nethome/franky/.gtkrc-1.2-gnome2
GWMCOLOR=darkgreen
GWMTERM=xterm
HISTFILESIZE=5000
history_control=ignoredups
HISTSIZE=2000
HOME=/nethome/franky
HOSTNAME=octarine.hq.garrels.be
INPUTRC=/etc/inputrc
IRCNAME=franky
JAVA_HOME=/usr/java/j2sdk1.4.0
LANG=en_US
LDFLAGS=-s
LD_LIBRARY_PATH=/usr/lib/mozilla:/usr/lib/mozilla/plugins
LESSCHARSET=latin1
LESS=-edfMQ
LESSOPEN=|/usr/bin/lesspipe.sh %s
LEX=flex
LOCAL_MACHINE=octarine
LOGNAME=franky
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
MACHINES=octarine
MAILCHECK=60
MAIL=/var/mail/franky
MANPATH=/usr/man:/usr/share/man/:/usr/local/man:/usr/X11R6/man
MEAN_MACHINES=octarine
MOZ_DIST_BIN=/usr/lib/mozilla
MOZILLA_FIVE_HOME=/usr/lib/mozilla
MOZ_PROGRAM=/usr/lib/mozilla/mozilla-bin
MTOOLS_FAT_COMPATIBILITY=1
MYMALLOC=0
NNTPPORT=119
NNTPSERVER=news
NPX_PLUGIN_PATH=/plugin/ns4plugin/:/usr/lib/netscape/plugins
OLDPWD=/nethome/franky
OS=Linux
PAGER=less
PATH=/nethome/franky/bin.Linux:/nethome/franky/bin:/usr/local/bin:/usr/local/sbin:/usr/X11R6/bin:/usr/bin:/usr/sbin:/bin:/sbin:.
PS1=\[\033[1;44m\]franky is in \w\[\033[0m\]
PS2=More input>
PWD=/nethome/franky
SESSION_MANAGER=local/octarine.hq.garrels.be:/tmp/.ICE-unix/22106
SHELL=/bin/bash
SHELL_LOGIN=--login
SHLVL=2
SSH_AGENT_PID=22161
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SSH_AUTH_SOCK=/tmp/ssh-XXmhQ4fC/agent.22106
START_WM=twm
TERM=xterm
TYPE=type
USERNAME=franky
USER=franky
_=/usr/bin/printenv
VISUAL=vi
WINDOWID=20971661
XAPPLRESDIR=/nethome/franky/app-defaults
XAUTHORITY=/nethome/franky/.Xauthority
XENVIRONMENT=/nethome/franky/.Xdefaults
XFILESEARCHPATH=/usr/X11R6/lib/X11/%L/%T/%N%C%S:/usr/X11R6/lib/X11/%l/%T/%N%C%S:/usr/X11R6/lib/X11/%T/%N%C%S:/usr/X11R6/lib/X11/%L/%T/%N%S:/usr/X11R6/lib/X11/%l/%T/%N%S:/usr/X11R6/lib/X11/%T/%N%S
XKEYSYMDB=/usr/X11R6/lib/X11/XKeysymDB
XMODIFIERS=@im=none
XTERMID=
XWINHOME=/usr/X11R6
X=X11R6
YACC=bison -y

3.2.1.2. Local variables

Local variables are only available in the current shell. Using the set built-in command without any options will display a list of all variables (including environment variables) and functions. The output will be sorted according to the current locale and displayed in a reusable format.

Below is a diff file made by comparing printenv and set output, after leaving out the functions which are also displayed by the set command:


franky ~> diff set.sorted printenv.sorted | grep "<" | awk '{ print $2 }'
BASE=/nethome/franky/.Shell/hq.garrels.be/octarine.aliases
BASH=/bin/bash
BASH_VERSINFO=([0]="2"
BASH_VERSION='2.05b.0(1)-release'
COLUMNS=80
DIRSTACK=()
DO_FORTUNE=
EUID=504
GROUPS=()
HERE=/home/franky
HISTFILE=/nethome/franky/.bash_history
HOSTTYPE=i686
IFS=$'
LINES=24
MACHTYPE=i686-pc-linux-gnu
OPTERR=1
OPTIND=1
OSTYPE=linux-gnu
PIPESTATUS=([0]="0")
PPID=10099
PS4='+
PWD_REAL='pwd
SHELLOPTS=braceexpand:emacs:hashall:histexpand:history:interactive-comments:monitor
THERE=/home/franky
UID=504

Note: Awk

The GNU Awk programming language is explained in Chapter 6.


3.2.1.3. Variables by content

Apart from dividing variables into local and global variables, we can also divide them into categories according to the sort of content the variable contains. In this respect, variables come in four types:

  • String variables

  • Integer variables

  • Constant variables

  • Array variables

We'll discuss these types in Chapter 10. For now, we will work with integer and string values for our variables.


3.2.2. Creating variables

Variable names are case sensitive and capitalized by convention. Giving local variables a lowercase name is a convention which is sometimes applied. However, you are free to use the names you want or to mix cases. Variable names can also contain digits, but a name starting with a digit is not allowed:


prompt> export 1number=1
bash: export: `1number=1': not a valid identifier

To set a variable in the shell, use

VARNAME="value"

Putting spaces around the equal sign will cause errors. It is a good habit to quote content strings when assigning values to variables: this will reduce the chance that you make errors.

Some examples using upper and lower cases, numbers and spaces:


franky ~> MYVAR1="2"

franky ~> echo $MYVAR1
2

franky ~> first_name="Franky"

franky ~> echo $first_name
Franky

franky ~> full_name="Franky M. Singh"

franky ~> echo $full_name
Franky M. Singh

franky ~> MYVAR-2="2"
bash: MYVAR-2=2: command not found

franky ~> MYVAR1 ="2"
bash: MYVAR1: command not found

franky ~> MYVAR1= "2"
bash: 2: command not found

franky ~> unset MYVAR1 first_name full_name

franky ~> echo $MYVAR1 $first_name $full_name
<--no output-->

franky ~>

3.2.3. Exporting variables

A variable created like the ones in the example above is only available to the current shell. It is a local variable: child processes of the current shell will not be aware of this variable. In order to pass variables to a subshell, we need to export them using the export built-in command. Variables that are exported are referred to as environment variables. Setting and exporting is usually done in one step:

export VARNAME="value"

A subshell can change variables it inherited from the parent, but the changes made by the child don't affect the parent. This is demonstrated in the example:


franky ~> full_name="Franky M. Singh"

franky ~> bash

franky ~> echo $full_name


franky ~> exit

franky ~> export full_name

franky ~> bash

franky ~> echo $full_name
Franky M. Singh

franky ~> export full_name="Charles the Great"

franky ~> echo $full_name
Charles the Great

franky ~> exit

franky ~> echo $full_name
Franky M. Singh

franky ~>

When first trying to read the value of full_name in a subshell, it is not there (echo shows a null string). The subshell quits, and full_name is exported in the parent - a variable can be exported after it has been assigned a value. Then a new subshell is started, in which the variable exported from the parent is visible. The variable is changed to hold another name, but the value for this variable in the parent stays the same.


3.2.4. Reserved variables

3.2.4.1. Bourne shell reserved variables

Bash uses certain shell variables in the same way as the Bourne shell. In some cases, Bash assigns a default value to the variable. The table below gives an overview of these plain shell variables:

Table 3-1. Reserved Bourne shell variables

Variable name   Definition
CDPATH          A colon-separated list of directories used as a search path for the cd built-in command.
HOME            The current user's home directory; the default for the cd built-in. The value of this variable is also used by tilde expansion.
IFS             A list of characters that separate fields; used when the shell splits words as part of expansion.
MAIL            If this parameter is set to a file name and the MAILPATH variable is not set, Bash informs the user of the arrival of mail in the specified file.
MAILPATH        A colon-separated list of file names which the shell periodically checks for new mail.
OPTARG          The value of the last option argument processed by the getopts built-in.
OPTIND          The index of the last option argument processed by the getopts built-in.
PATH            A colon-separated list of directories in which the shell looks for commands.
PS1             The primary prompt string. The default value is "'\s-\v\$ '".
PS2             The secondary prompt string. The default value is "'> '".

3.2.4.2. Bash reserved variables

These variables are set or used by Bash, but other shells do not normally treat them specially.

Table 3-2. Reserved Bash variables

Variable name     Definition
auto_resume       This variable controls how the shell interacts with the user and job control.
BASH              The full pathname used to execute the current instance of Bash.
BASH_ENV          If this variable is set when Bash is invoked to execute a shell script, its value is expanded and used as the name of a startup file to read before executing the script.
BASH_VERSION      The version number of the current instance of Bash.
BASH_VERSINFO     A read-only array variable whose members hold version information for this instance of Bash.
COLUMNS           Used by the select built-in to determine the terminal width when printing selection lists. Automatically set upon receipt of a SIGWINCH signal.
COMP_CWORD        An index into ${COMP_WORDS} of the word containing the current cursor position.
COMP_LINE         The current command line.
COMP_POINT        The index of the current cursor position relative to the beginning of the current command.
COMP_WORDS        An array variable consisting of the individual words in the current command line.
COMPREPLY         An array variable from which Bash reads the possible completions generated by a shell function invoked by the programmable completion facility.
DIRSTACK          An array variable containing the current contents of the directory stack.
EUID              The numeric effective user ID of the current user.
FCEDIT            The editor used as a default by the -e option to the fc built-in command.
FIGNORE           A colon-separated list of suffixes to ignore when performing file name completion.
FUNCNAME          The name of any currently-executing shell function.
GLOBIGNORE        A colon-separated list of patterns defining the set of file names to be ignored by file name expansion.
GROUPS            An array variable containing the list of groups of which the current user is a member.
histchars         Up to three characters which control history expansion, quick substitution, and tokenization.
HISTCMD           The history number, or index in the history list, of the current command.
HISTCONTROL       Defines whether a command is added to the history file.
HISTFILE          The name of the file to which the command history is saved. The default value is ~/.bash_history.
HISTFILESIZE      The maximum number of lines contained in the history file, defaults to 500.
HISTIGNORE        A colon-separated list of patterns used to decide which command lines should be saved in the history list.
HISTSIZE          The maximum number of commands to remember on the history list, default is 500.
HOSTFILE          Contains the name of a file in the same format as /etc/hosts that should be read when the shell needs to complete a hostname.
HOSTNAME          The name of the current host.
HOSTTYPE          A string describing the machine Bash is running on.
IGNOREEOF         Controls the action of the shell on receipt of an EOF character as the sole input.
INPUTRC           The name of the Readline initialization file, overriding the default /etc/inputrc.
LANG              Used to determine the locale category for any category not specifically selected with a variable starting with LC_.
LC_ALL            This variable overrides the value of LANG and any other LC_ variable specifying a locale category.
LC_COLLATE        This variable determines the collation order used when sorting the results of file name expansion, and determines the behavior of range expressions, equivalence classes, and collating sequences within file name expansion and pattern matching.
LC_CTYPE          This variable determines the interpretation of characters and the behavior of character classes within file name expansion and pattern matching.
LC_MESSAGES       This variable determines the locale used to translate double-quoted strings preceded by a "$" sign.
LC_NUMERIC        This variable determines the locale category used for number formatting.
LINENO            The line number in the script or shell function currently executing.
LINES             Used by the select built-in to determine the column length for printing selection lists.
MACHTYPE          A string that fully describes the system type on which Bash is executing, in the standard GNU CPU-COMPANY-SYSTEM format.
MAILCHECK         How often (in seconds) the shell should check for mail in the files specified in the MAILPATH or MAIL variables.
OLDPWD            The previous working directory as set by the cd built-in.
OPTERR            If set to the value 1, Bash displays error messages generated by the getopts built-in.
OSTYPE            A string describing the operating system Bash is running on.
PIPESTATUS        An array variable containing a list of exit status values from the processes in the most recently executed foreground pipeline (which may contain only a single command).
POSIXLY_CORRECT   If this variable is in the environment when bash starts, the shell enters POSIX mode.
PPID              The process ID of the shell's parent process.
PROMPT_COMMAND    If set, the value is interpreted as a command to execute before the printing of each primary prompt (PS1).
PS3               The value of this variable is used as the prompt for the select command. Defaults to "'#? '".
PS4               The value is the prompt printed before the command line is echoed when the -x option is set; defaults to "'+ '".
PWD               The current working directory as set by the cd built-in command.
RANDOM            Each time this parameter is referenced, a random integer between 0 and 32767 is generated. Assigning a value to this variable seeds the random number generator.
REPLY             The default variable for the read built-in.
SECONDS           This variable expands to the number of seconds since the shell was started.
SHELLOPTS         A colon-separated list of enabled shell options.
SHLVL             Incremented by one each time a new instance of Bash is started.
TIMEFORMAT        The value of this parameter is used as a format string specifying how the timing information for pipelines prefixed with the time reserved word should be displayed.
TMOUT             If set to a value greater than zero, TMOUT is treated as the default timeout for the read built-in. In an interactive shell, the value is interpreted as the number of seconds to wait for input after issuing the primary prompt. Bash terminates after that number of seconds if input does not arrive.
UID               The numeric, real user ID of the current user.

Check the Bash man, info or doc pages for extended information. Some variables are read-only, some are set automatically and some lose their meaning when set to a different value than the default.


3.2.5. Special parameters

The shell treats several parameters specially. These parameters may only be referenced; assignment to them is not allowed.

Table 3-3. Special bash variables

Character   Definition
$*          Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, it expands to a single word with the value of each parameter separated by the first character of the IFS special variable.
$@          Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word.
$#          Expands to the number of positional parameters in decimal.
$?          Expands to the exit status of the most recently executed foreground pipeline.
$-          A hyphen expands to the current option flags as specified upon invocation, by the set built-in command, or those set by the shell itself (such as the -i).
$$          Expands to the process ID of the shell.
$!          Expands to the process ID of the most recently executed background (asynchronous) command.
$0          Expands to the name of the shell or shell script.
$_          The underscore variable is set at shell startup and contains the absolute file name of the shell or script being executed as passed in the argument list. Subsequently, it expands to the last argument to the previous command, after expansion. It is also set to the full pathname of each command executed and placed in the environment exported to that command. When checking mail, this parameter holds the name of the mail file.

Note: $* vs. $@

The implementation of "$*" has always been a problem and realistically should have been replaced with the behavior of "$@". In almost every case where coders use "$*", they mean "$@". "$*" can cause bugs and even security holes in your software.
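
The difference becomes visible as soon as the parameters are quoted and contain spaces. A minimal, hypothetical script, quoted_params.sh, might demonstrate it like this:


#!/bin/bash
# quoted_params.sh: hypothetical illustration of "$*" versus "$@"

echo 'Looping over "$*":'
for arg in "$*"; do		# a single word, joined by the first IFS character
    echo "-> $arg"
done

echo 'Looping over "$@":'
for arg in "$@"; do		# one word per positional parameter
    echo "-> $arg"
done

Called with an argument containing a space, the two loops behave differently:


franky ~> quoted_params.sh one "two three"
Looping over "$*":
-> one two three
Looping over "$@":
-> one
-> two three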

The positional parameters are the words following the name of a shell script. They are put into the variables $1, $2, $3 and so on. As long as needed, variables are added to an internal array. $# holds the total number of parameters, as is demonstrated with this simple script:


#!/bin/bash

# positional.sh
# This script reads 3 positional parameters and prints them out.

POSPAR1="$1"
POSPAR2="$2"
POSPAR3="$3"

echo "$1 is the first positional parameter, \$1."
echo "$2 is the second positional parameter, \$2."
echo "$3 is the third positional parameter, \$3."
echo
echo "The total number of positional parameters is $#."

Upon execution, one could give any number of arguments:


franky ~> positional.sh one two three four five
one is the first positional parameter, $1.
two is the second positional parameter, $2.
three is the third positional parameter, $3.

The total number of positional parameters is 5.

franky ~> positional.sh one two
one is the first positional parameter, $1.
two is the second positional parameter, $2.
 is the third positional parameter, $3.

The total number of positional parameters is 2.

More on evaluating these parameters is in Chapter 7 and Section 9.7.

Some examples on the other special parameters:


franky ~> grep dictionary /usr/share/dict/words
dictionary

franky ~> echo $_
/usr/share/dict/words

franky ~> echo $$
10662

franky ~> mozilla &
[1] 11064

franky ~> echo $!
11064

franky ~> echo $0
bash

franky ~> echo $?
0

franky ~> ls doesnotexist
ls: doesnotexist: No such file or directory

franky ~> echo $?
1

franky ~>

User franky starts entering the grep command, which results in the assignment of the _ variable. The process ID of his shell is 10662. After putting a job in the background, the ! holds the process ID of the backgrounded job. The shell running is bash. When a mistake is made, ? holds an exit code different from 0 (zero).


3.2.6. Script recycling with variables

Apart from making the script more readable, variables will also enable you to apply a script faster in another environment or for another purpose. Consider the following example, a very simple script that makes a backup of franky's home directory to a remote server:


#!/bin/bash

# This script makes a backup of my home directory.

cd /home

# This creates the archive
tar cf /var/tmp/home_franky.tar franky > /dev/null 2>&1

# First remove the old bzip2 file.  Redirect errors because this generates some if the archive
# does not exist.  Then create a new compressed file.
rm /var/tmp/home_franky.tar.bz2 2> /dev/null
bzip2 /var/tmp/home_franky.tar

# Copy the file to another host - we have ssh keys for making this work without intervention.
scp /var/tmp/home_franky.tar.bz2 bordeaux:/opt/backup/franky > /dev/null 2>&1

# Create a timestamp in a logfile.
date >> /home/franky/log/home_backup.log
echo backup succeeded >> /home/franky/log/home_backup.log

First of all, you are more likely to make errors if you name files and directories manually each time you need them. Secondly, suppose franky wants to give this script to carol; then carol will have to do quite some editing before she can use the script to back up her home directory. The same is true if franky wants to use this script for backing up other directories. For easy recycling, make all files, directories, usernames, servernames etcetera variable. Thus, you only need to edit a value once, without having to go through the entire script to check where a parameter occurs. This is an example:


#!/bin/bash
                                                                                                 
# This script makes a backup of my home directory.

# Change the values of the variables to make the script work for you:
BACKUPDIR=/home
BACKUPFILES=franky
TARFILE=/var/tmp/home_franky.tar
BZIPFILE=/var/tmp/home_franky.tar.bz2
SERVER=bordeaux
REMOTEDIR=/opt/backup/franky
LOGFILE=/home/franky/log/home_backup.log

cd $BACKUPDIR

# This creates the archive
tar cf $TARFILE $BACKUPFILES > /dev/null 2>&1
                                                                                                 
# First remove the old bzip2 file.  Redirect errors because this generates some if the archive 
# does not exist.  Then create a new compressed file.
rm $BZIPFILE 2> /dev/null
bzip2 $TARFILE

# Copy the file to another host - we have ssh keys for making this work without intervention.
scp $BZIPFILE $SERVER:$REMOTEDIR > /dev/null 2>&1

# Create a timestamp in a logfile.
date >> $LOGFILE
echo backup succeeded >> $LOGFILE

Note: Large directories and low bandwidth

The above is purely an example that everybody can understand, using a small directory and a host on the same subnet. Depending on your bandwidth, the size of the directory and the location of the remote server, it can take an awful lot of time to make backups using this mechanism. For larger directories and lower bandwidth, use rsync to keep the directories at both ends synchronized.
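
A minimal sketch of such an rsync-based alternative, assuming the same host bordeaux and the same ssh keys as in the example above:


# Keep the remote copy synchronized instead of transferring a full archive every time.
rsync -avz --delete /home/franky/ bordeaux:/opt/backup/franky/ >> /home/franky/log/home_backup.log 2>&1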


3.3. Quoting characters

3.3.1. Why?

A lot of characters have a special meaning in one context or another. Quoting is used to remove that special meaning from characters or words: quotes can disable special treatment for special characters, they can prevent reserved words from being recognized as such and they can disable parameter expansion.


3.3.2. Escape characters

Escape characters are used to remove the special meaning from a single character. A non-quoted backslash, \, is used as an escape character in Bash. It preserves the literal value of the character that follows, with the exception of newline. If a newline character appears immediately after the backslash, it marks the continuation of a line when it is longer than the width of the terminal; the backslash is removed from the input stream and effectively ignored.


franky ~> date=20021226

franky ~> echo $date
20021226

franky ~> echo \$date
$date

In this example, the variable date is created and set to hold a value. The first echo displays the value of the variable, but for the second, the dollar sign is escaped.


3.3.3. Single quotes

Single quotes ('') are used to preserve the literal value of each character enclosed within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.

We continue with the previous example:


franky ~> echo '$date'
$date

3.3.4. Double quotes

Using double quotes, the literal value of all characters enclosed is preserved, except for the dollar sign, the backticks (backward single quotes, ``) and the backslash.

The dollar sign and the backticks retain their special meaning within the double quotes.

The backslash retains its meaning only when followed by dollar, backtick, double quote, backslash or newline. Within double quotes, the backslashes are removed from the input stream when followed by one of these characters. Backslashes preceding characters that don't have a special meaning are left unmodified for processing by the shell interpreter.

A double quote may be quoted within double quotes by preceding it with a backslash.


franky ~> echo "$date"
20021226

franky ~> echo "`date`"
Sun Apr 20 11:22:06 CEST 2003

franky ~> echo "I'd say: \"Go for it!\""
I'd say: "Go for it!"

franky ~> echo "\"
More input>"

franky ~> echo "\\"
\

3.3.5. ANSI-C quoting

Words in the form "$'STRING'" are treated in a special way. The word expands to a string, with backslash-escaped characters replaced as specified by the ANSI-C standard. Backslash escape sequences can be found in the Bash documentation.
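
A short illustration with the common \n (newline) and \t (tab) escapes:


franky ~> echo $'First line.\nIndented:\tsecond line.'
First line.
Indented:	second line.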


3.3.6. Locales

A double-quoted string preceded by a dollar sign will cause the string to be translated according to the current locale. If the current locale is "C" or "POSIX", the dollar sign is ignored. If the string is translated and replaced, the replacement is double-quoted.
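
In the default "C" locale, the translated form therefore behaves just like an ordinary double-quoted string; an actual translation would require a message catalog for the current locale:


franky ~> echo $"Good morning."
Good morning.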


3.4. Shell expansion

3.4.1. General

After the command has been split into tokens (see Section 1.4.1.1), these tokens or words are expanded or resolved. There are eight kinds of expansion performed, which we will discuss in the next sections, in the order that they are expanded.

After all expansions, quote removal is performed.


3.4.2. Brace expansion

Brace expansion is a mechanism by which arbitrary strings may be generated. Patterns to be brace-expanded take the form of an optional PREAMBLE, followed by a series of comma-separated strings between a pair of braces, followed by an optional POSTSCRIPT. The preamble is prefixed to each string contained within the braces, and the postscript is then appended to each resulting string, expanding left to right.

Brace expansions may be nested. The results of each expanded string are not sorted; left to right order is preserved:


franky ~> echo sp{el,il,al}l
spell spill spall

Brace expansion is performed before any other expansions, and any characters special to other expansions are preserved in the result. It is strictly textual. Bash does not apply any syntactic interpretation to the context of the expansion or the text between the braces. To avoid conflicts with parameter expansion, the string "${" is not considered eligible for brace expansion.

A correctly-formed brace expansion must contain unquoted opening and closing braces, and at least one unquoted comma. Any incorrectly formed brace expansion is left unchanged.
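
A couple of extra examples, showing a preamble, a postscript and a nested expansion (the directory names are arbitrary):


franky ~> echo /usr/share/doc/{bash,gawk}/README
/usr/share/doc/bash/README /usr/share/doc/gawk/README

franky ~> echo a{b,c{d,e}}f
abf acdf acef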


3.4.3. Tilde expansion

If a word begins with an unquoted tilde character ("~"), all of the characters up to the first unquoted slash (or all characters, if there is no unquoted slash) are considered a tilde-prefix. If none of the characters in the tilde-prefix are quoted, the characters in the tilde-prefix following the tilde are treated as a possible login name. If this login name is the null string, the tilde is replaced with the value of the HOME shell variable. If HOME is unset, the home directory of the user executing the shell is substituted instead. Otherwise, the tilde-prefix is replaced with the home directory associated with the specified login name.

If the tilde-prefix is "~+", the value of the shell variable PWD replaces the tilde-prefix. If the tilde-prefix is "~-", the value of the shell variable OLDPWD, if it is set, is substituted.

If the characters following the tilde in the tilde-prefix consist of a number N, optionally prefixed by a "+" or a "-", the tilde-prefix is replaced with the corresponding element from the directory stack, as it would be displayed by the dirs built-in invoked with the characters following tilde in the tilde-prefix as an argument. If the tilde-prefix, without the tilde, consists of a number without a leading "+" or "-", "+" is assumed.

If the login name is invalid, or the tilde expansion fails, the word is left unchanged.

Each variable assignment is checked for unquoted tilde-prefixes immediately following a ":" or "=". In these cases, tilde expansion is also performed. Consequently, one may use file names with tildes in assignments to PATH, MAILPATH, and CDPATH, and the shell assigns the expanded value.

Example:


franky ~> export PATH="$PATH:~/testdir"

~/testdir will be expanded to $HOME/testdir, so if $HOME is /var/home/franky, the directory /var/home/franky/testdir will be added to the content of the PATH variable.
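
The "~+" and "~-" forms can be checked directly on the command line; a short sketch, assuming the same home directory /var/home/franky:


franky ~> cd /tmp

franky /tmp> echo ~+ ~-
/tmp /var/home/franky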


3.4.4. Shell parameter and variable expansion

The "$" character introduces parameter expansion, command substitution, or arithmetic expansion. The parameter name or symbol to be expanded may be enclosed in braces, which are optional but serve to protect the variable to be expanded from characters immediately following it which could be interpreted as part of the name.

When braces are used, the matching ending brace is the first "}" not escaped by a backslash or within a quoted string, and not within an embedded arithmetic expansion, command substitution, or parameter expansion.

The basic form of parameter expansion is "${PARAMETER}". The value of "PARAMETER" is substituted. The braces are required when "PARAMETER" is a positional parameter with more than one digit, or when "PARAMETER" is followed by a character that is not to be interpreted as part of its name.

If the first character of "PARAMETER" is an exclamation point, Bash uses the value of the variable formed from the rest of "PARAMETER" as the name of the variable; this variable is then expanded and that value is used in the rest of the substitution, rather than the value of "PARAMETER" itself. This is known as indirect expansion.

You are certainly familiar with straight parameter expansion, since it happens all the time, even in the simplest of cases, such as the one above or the following:


franky ~> echo $SHELL
/bin/bash

The following is an example of indirect expansion:


franky ~> echo ${!N*}
NNTPPORT NNTPSERVER NPX_PLUGIN_PATH

Note that this is not the same as echo $N*.

The following construct allows for creation of the named variable if it does not yet exist:

${VAR:=value}

Example:


franky ~> echo $FRANKY

franky ~> echo ${FRANKY:=Franky}
Franky

Special parameters, among others the positional parameters, may not be assigned this way, however.

We will further discuss the use of the curly braces for treatment of variables in Chapter 10. More information can also be found in the Bash info pages.


3.4.5. Command substitution

Command substitution allows the output of a command to replace the command itself. Command substitution occurs when a command is enclosed like this:

$(command)

or like this using backticks:

`command`

Bash performs the expansion by executing COMMAND and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted, but they may be removed during word splitting.


franky ~> echo `date`
Thu Feb 6 10:06:20 CET 2003

When the old-style backquoted form of substitution is used, backslash retains its literal meaning except when followed by "$", "`", or "\". The first backtick not preceded by a backslash terminates the command substitution. When using the "$(COMMAND)" form, all characters between the parentheses make up the command; none are treated specially.

Command substitutions may be nested. To nest when using the backquoted form, escape the inner backticks with backslashes.

If the substitution appears within double quotes, word splitting and file name expansion are not performed on the results.
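
A sketch of nesting with both forms, using arbitrary commands; note the escaped inner backticks in the second version:


franky ~> echo $(basename $(pwd))
franky

franky ~> echo `basename \`pwd\``
franky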


3.4.6. Arithmetic expansion

Arithmetic expansion allows the evaluation of an arithmetic expression and the substitution of the result. The format for arithmetic expansion is:

$(( EXPRESSION ))

The expression is treated as if it were within double quotes, but a double quote inside the parentheses is not treated specially. All tokens in the expression undergo parameter expansion, command substitution, and quote removal. Arithmetic substitutions may be nested.

Evaluation of arithmetic expressions is done in fixed-width integers with no check for overflow - although division by zero is trapped and recognized as an error. The operators are roughly the same as in the C programming language. In order of decreasing precedence, the list looks like this:

Table 3-4. Arithmetic operators

OperatorMeaning
VAR++ and VAR--variable post-increment and post-decrement
++VAR and --VARvariable pre-increment and pre-decrement
- and +unary minus and plus
! and ~logical and bitwise negation
**exponentiation
*, / and %multiplication, division, remainder
+ and -addition, subtraction
<< and >>left and right bitwise shifts
<=, >=, < and >comparison operators
== and !=equality and inequality
&bitwise AND
^bitwise exclusive OR
|bitwise OR
&&logical AND
||logical OR
expr ? expr : exprconditional evaluation
=, *=, /=, %=, +=, -=, <<=, >>=, &=, ^= and |=assignments
,separator between expressions

Shell variables are allowed as operands; parameter expansion is performed before the expression is evaluated. Within an expression, shell variables may also be referenced by name without using the parameter expansion syntax. The value of a variable is evaluated as an arithmetic expression when it is referenced. A shell variable need not have its integer attribute turned on to be used in an expression.

Constants with a leading 0 (zero) are interpreted as octal numbers. A leading "0x" or "0X" denotes hexadecimal. Otherwise, numbers take the form "[BASE'#']N", where "BASE" is a decimal number between 2 and 64 representing the arithmetic base, and N is a number in that base. If "BASE'#'" is omitted, then base 10 is used. The digits greater than 9 are represented by the lowercase letters, the uppercase letters, "@", and "_", in that order. If "BASE" is less than or equal to 36, lowercase and uppercase letters may be used interchangeably to represent numbers between 10 and 35.

Operators are evaluated in order of precedence. Sub-expressions in parentheses are evaluated first and may override the precedence rules above.
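
A short sketch with a shell variable as operand (no dollar sign is needed inside the expression) and with a number in another base:


franky ~> DAYS=365

franky ~> echo $(( DAYS * 24 ))
8760

franky ~> echo $(( 2#1010 + 1 ))
11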

You may also come across the older syntax with square brackets, which calculates the result of EXPRESSION in the same way:

$[ EXPRESSION ]

This form is deprecated, however; wherever possible, prefer the $(( EXPRESSION )) form. Note that, either way, this only calculates the result of EXPRESSION and performs no test:


franky ~> echo $[365*24]
8760

See Section 7.1.2.2, among others, for practical examples in scripts.


3.4.7. Process substitution

Process substitution is supported on systems that support named pipes (FIFOs) or the /dev/fd method of naming open files. It takes the form of

<(LIST)

or

>(LIST)

The process LIST is run with its input or output connected to a FIFO or some file in /dev/fd. The name of this file is passed as an argument to the current command as the result of the expansion. If the ">(LIST)" form is used, writing to the file will provide input for LIST. If the "<(LIST)" form is used, the file passed as an argument should be read to obtain the output of LIST. Note that no space may appear between the < or > signs and the left parenthesis, otherwise the construct would be interpreted as a redirection.
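
A classic sketch of the "<(LIST)" form is comparing the output of two commands without creating temporary files; the directories compared here are arbitrary and the output depends on your system:


franky ~> diff <(ls /bin) <(ls /usr/bin)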

When available, process substitution is performed simultaneously with parameter and variable expansion, command substitution, and arithmetic expansion.

More information in Section 8.2.3.


3.4.8. Word splitting

The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.

The shell treats each character of $IFS as a delimiter, and splits the results of the other expansions into words on these characters. If IFS is unset, or its value is exactly "'<space><tab><newline>'", the default, then any sequence of IFS characters serves to delimit words. If IFS has a value other than the default, then sequences of the whitespace characters "space" and "Tab" are ignored at the beginning and end of the word, as long as the whitespace character is in the value of IFS (an IFS whitespace character). Any character in IFS that is not IFS whitespace, along with any adjacent IFS whitespace characters, delimits a field. A sequence of IFS whitespace characters is also treated as a delimiter. If the value of IFS is null, no word splitting occurs.

Explicit null arguments ("" or '') are retained. Unquoted implicit null arguments, resulting from the expansion of parameters that have no values, are removed. If a parameter with no value is expanded within double quotes, a null argument results and is retained.
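
The effect of IFS can be demonstrated on the command line; a sketch, run in a separate bash invocation so that the modified IFS does not affect the current session:


franky ~> bash -c 'IFS=:; record="franky:x:504"; for field in $record; do echo "$field"; done'
franky
x
504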

Note: Expansion and word splitting

If no expansion occurs, no splitting is performed.


3.4.9. File name expansion

After word splitting, unless the -f option has been set (see Section 2.3.2), Bash scans each word for the characters "*", "?", and "[". If one of these characters appears, then the word is regarded as a PATTERN, and replaced with an alphabetically sorted list of file names matching the pattern. If no matching file names are found, and the shell option nullglob is disabled, the word is left unchanged. If the nullglob option is set, and no matches are found, the word is removed. If the shell option nocaseglob is enabled, the match is performed without regard to the case of alphabetic characters.

When a pattern is used for file name generation, the character "." at the start of a file name or immediately following a slash must be matched explicitly, unless the shell option dotglob is set. When matching a file name, the slash character must always be matched explicitly. In other cases, the "." character is not treated specially.

The GLOBIGNORE shell variable may be used to restrict the set of file names matching a pattern. If GLOBIGNORE is set, each matching file name that also matches one of the patterns in GLOBIGNORE is removed from the list of matches. The file names . and .. are always ignored, even when GLOBIGNORE is set. However, setting GLOBIGNORE has the effect of enabling the dotglob shell option, so all other file names beginning with a "." will match. To get the old behavior of ignoring file names beginning with a ".", make ".*" one of the patterns in GLOBIGNORE. The dotglob option is disabled when GLOBIGNORE is unset.
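
The nullglob option can be tested in a directory that contains no matching files; a minimal sketch:


franky ~/testdir> echo *.txt
*.txt

franky ~/testdir> shopt -s nullglob

franky ~/testdir> echo *.txt


franky ~/testdir> shopt -u nullglob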


3.5. Aliases

3.5.1. What are aliases?

An alias allows a string to be substituted for a word when it is used as the first word of a simple command. The shell maintains a list of aliases that may be set and unset with the alias and unalias built-in commands. Issue the alias command without options to display a list of aliases known to the current shell.


franky: ~> alias
alias ..='cd ..'
alias ...='cd ../..'
alias ....='cd ../../..'
alias PAGER='less -r'
alias Txterm='export TERM=xterm'
alias XARGS='xargs -r'
alias cdrecord='cdrecord -dev 0,0,0 -speed=8'
alias e='vi'
alias egrep='grep -E'
alias ewformat='fdformat -n /dev/fd0u1743; ewfsck'
alias fgrep='grep -F'
alias ftp='ncftp -d15'
alias h='history 10'
alias fformat='fdformat /dev/fd0H1440'
alias j='jobs -l'
alias ksane='setterm -reset'
alias ls='ls -F --color=auto'
alias m='less'
alias md='mkdir'
alias od='od -Ax -ta -txC'
alias p='pstree -p'
alias ping='ping -vc1'
alias sb='ssh blubber'
alias sl='ls'
alias ss='ssh octarine'
alias tar='gtar'
alias tmp='cd /tmp'
alias unaliasall='unalias -a'
alias vi='eval `resize`;vi'
alias vt100='export TERM=vt100'
alias which='type'
alias xt='xterm -bg black -fg white &'

franky ~>

Aliases are useful for specifying the default version of a command that exists in several versions on your system, or for specifying default options to a command. Another use for aliases is correcting misspelled commands.

The first word of each simple command, if unquoted, is checked to see if it has an alias. If so, that word is replaced by the text of the alias. The alias name and the replacement text may contain any valid shell input, including shell metacharacters, with the exception that the alias name may not contain "=". The first word of the replacement text is tested for aliases, but a word that is identical to an alias being expanded is not expanded a second time. This means that one may alias ls to ls -F, for instance, and Bash will not try to recursively expand the replacement text. If the last character of the alias value is a space or tab character, then the next command word following the alias is also checked for alias expansion.

Aliases are not expanded when the shell is not interactive, unless the expand_aliases option is set using the shopt shell built-in.
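
So if you want to use an alias in a (non-interactive) script, expansion has to be switched on explicitly and the alias has to be defined before the line that uses it; a minimal sketch:


#!/bin/bash
# Aliases are normally not expanded in scripts; enable the option explicitly:
shopt -s expand_aliases

alias ll='ls -l'

ll /tmp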


3.5.2. Creating and removing aliases

Aliases are created using the alias shell built-in. For permanent use, enter the alias in one of your shell initialization files; if you just enter the alias on the command line, it is only recognized within the current shell.


franky ~> alias dh='df -h'

franky ~> dh
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda7             1.3G  272M 1018M  22% /
/dev/hda1             121M  9.4M  105M   9% /boot
/dev/hda2              13G  8.7G  3.7G  70% /home
/dev/hda3              13G  5.3G  7.1G  43% /opt
none                  243M     0  243M   0% /dev/shm
/dev/hda6             3.9G  3.2G  572M  85% /usr
/dev/hda5             5.2G  4.3G  725M  86% /var

franky ~> unalias dh

franky ~> dh
bash: dh: command not found

franky ~>

Bash always reads at least one complete line of input before executing any of the commands on that line. Aliases are expanded when a command is read, not when it is executed. Therefore, an alias definition appearing on the same line as another command does not take effect until the next line of input is read. The commands following the alias definition on that line are not affected by the new alias. This behavior is also an issue when functions are executed. Aliases are expanded when a function definition is read, not when the function is executed, because a function definition is itself a compound command. As a consequence, aliases defined in a function are not available until after that function is executed. To be safe, always put alias definitions on a separate line, and do not use alias in compound commands.

Aliases are not inherited by child processes. Bourne shell (sh) does not recognize aliases.

More about functions is in Chapter 11.

Tip: Functions are faster

Aliases are looked up after functions and thus resolving is slower. While aliases are easier to understand, shell functions are preferred over aliases for almost every purpose.
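
As an illustration, the dh alias from the example above could also be written as a shell function; unlike the alias, the function works in scripts as well and can take arguments in the same way (a hypothetical equivalent):


dh()
{
        df -h "$@"		# pass any extra arguments on to df
}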


3.6. More Bash options

3.6.1. Displaying options

We already discussed a couple of Bash options that are useful for debugging your scripts. In this section, we will take a more in-depth view of the Bash options.

Use the -o option to the set built-in to display all shell options:


willy:~> set -o
allexport		off
braceexpand		on
emacs			on
errexit			off
hashall			on
histexpand		on
history			on
ignoreeof		off
interactive-comments	on
keyword			off
monitor			on
noclobber		off
noexec			off
noglob			off
nolog			off
notify			off
nounset			off
onecmd			off
physical		off
posix			off
privileged		off
verbose			off
vi			off
xtrace			off

See the Bash Info pages, section Shell Built-in Commands->The Set Built-in for a description of each option. A lot of options have one-character shorthands: the xtrace option, for instance, is equal to specifying set -x.


3.6.2. Changing options

Shell options can either be set differently from the default upon calling the shell, or be set during shell operation. They may also be included in the shell resource configuration files.

The following command executes a script in POSIX-compatible mode:


willy:~/scripts> bash --posix script.sh

For changing the current environment temporarily, or for use in a script, we would rather use set. Use - (dash) for enabling an option, + for disabling:


willy:~/test> set -o noclobber

willy:~/test> touch test

willy:~/test> date > test
bash: test: cannot overwrite existing file

willy:~/test> set +o noclobber

willy:~/test> date > test

The above example demonstrates the noclobber option, which prevents existing files from being overwritten by redirection operations. The same goes for one-character options, for instance -u, which treats unset variables as an error and exits a non-interactive shell upon encountering such errors:


willy:~> echo $VAR


willy:~> set -u

willy:~> echo $VAR
bash: VAR: unbound variable

This option is also useful for detecting incorrect content assignment to variables: the same error occurs, for instance, when assigning a character string to a variable that was declared explicitly as one holding only integer values, because the string is then interpreted as the name of an (unset) variable during arithmetic evaluation.
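
A short illustration, assuming the -u option from the example above is still enabled (the variable names are made up):


willy:~> declare -i NUM

willy:~> NUM=fortytwo
bash: fortytwo: unbound variable

willy:~> NUM=42

willy:~> echo $NUM
42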

One last example follows, demonstrating the noglob option, which prevents special characters from being expanded:


willy:~/testdir> set -o noglob

willy:~/testdir> touch *

willy:~/testdir> ls -l *
-rw-rw-r--    1 willy    willy		0 Feb 27 13:37 *

3.7. Summary

The Bash environment can be configured globally and on a per user basis. Various configuration files are used to fine-tune the behavior of the shell.

These files contain shell options, settings for variables, function definitions and various other building blocks for creating ourselves a cosy environment.

Except for the reserved Bourne shell and Bash variables and the special parameters, variable names can be chosen more or less freely.

Because a lot of characters have double or even triple meanings, depending on the environment, Bash uses a system of quoting to take away special meaning from one or multiple characters when special treatment is not wanted.

Bash uses various methods of expanding command line entries in order to determine which commands to execute.


3.8. Exercises

For this exercise, you will need to read the useradd man pages, because we are going to use the /etc/skel directory to hold default shell configuration files, which are copied to the home directory of each newly added user.

First we will do some general exercises on setting and displaying variables.

  1. Create 3 variables, VAR1, VAR2 and VAR3; initialize them to hold the values "thirteen", "13" and "Happy Birthday" respectively.

  2. Display the values of all three variables.

  3. Are these local or global variables?

  4. Remove VAR3.

  5. Can you see the two remaining variables in a new terminal window?

  6. Edit /etc/profile so that all users are greeted upon login (test this).

  7. For the root account, set the prompt to something like "Danger!! root is doing stuff in \w", preferably in a bright color such as red or pink or in reverse video mode.

  8. Make sure that newly created users also get a nice personalized prompt which informs them on which system in which directory they are working. Test your changes by adding a new user and logging in as that user.

  9. Write a script in which you assign two integer values to two variables. The script should calculate the surface of a rectangle which has these proportions. It should be documented with comments and generate elegant output.

Don't forget to chmod your scripts!


Chapter 4. Regular expressions

In this chapter we discuss:

  • Using regular expressions

  • Regular expression metacharacters

  • Finding patterns in files or output

  • Character ranges and classes in Bash


4.1. Regular expressions

4.1.1. What are regular expressions?

A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions by using various operators to combine smaller expressions.

The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash.


4.1.2. Regular expression metacharacters

A regular expression may be followed by one of several repetition operators (metacharacters):

Table 4-1. Regular expression operators

Operator    Effect
.           Matches any single character.
?           The preceding item is optional and will be matched, at most, once.
*           The preceding item will be matched zero or more times.
+           The preceding item will be matched one or more times.
{N}         The preceding item is matched exactly N times.
{N,}        The preceding item is matched N or more times.
{N,M}       The preceding item is matched at least N times, but not more than M times.
-           Represents a character range when placed between two characters in a list; taken literally when it is the first or last character of the list.
^           Matches the empty string at the beginning of a line; also represents the characters not in the range of a list.
$           Matches the empty string at the end of a line.
\b          Matches the empty string at the edge of a word.
\B          Matches the empty string provided it's not at the edge of a word.
\<          Matches the empty string at the beginning of a word.
\>          Matches the empty string at the end of a word.

Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.

Two regular expressions may be joined by the infix operator "|"; the resulting regular expression matches any string matching either subexpression.

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
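
A brief illustration using the extended grep syntax discussed below (the input string is made up). Without parentheses, alternation applies to the whole concatenated expressions "ab" and "cd"; with parentheses, it applies only to the single character:


cathy ~> echo xcd | grep -E 'a(b|c)d'

cathy ~> echo xcd | grep -E 'ab|cd'
xcd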


4.1.3. Basic versus extended regular expressions

In basic regular expressions the metacharacters "?", "+", "{", "|", "(", and ")" lose their special meaning; instead use the backslashed versions "\?", "\+", "\{", "\|", "\(", and "\)".

Check in your system documentation whether commands using regular expressions support extended expressions.
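
As a quick check with GNU grep (the input string is made up): the interval operator must be backslashed in basic syntax, while grep -E accepts the plain form:


cathy ~> echo success | grep 'c\{2\}'
success

cathy ~> echo success | grep -E 'c{2}'
success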


4.2. Examples using grep

4.2.1. What is grep?

grep searches the input files for lines containing a match to a given pattern list. When it finds a match in a line, it copies the line to standard output (by default), or whatever other sort of output you have requested with options.

Though grep expects to do the matching on text, it has no limits on input line length other than available memory, and it can match arbitrary characters within a line. If the final byte of an input file is not a newline, grep silently supplies one. Since newline is also a separator for the list of patterns, there is no way to match newline characters in a text.

Some examples:


cathy ~> grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

cathy ~> grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
12:operator:x:11:0:operator:/root:/sbin/nologin

cathy ~> grep -v bash /etc/passwd | grep -v nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
news:x:9:13:news:/var/spool/news:
mailnull:x:47:47::/var/spool/mqueue:/dev/null
xfs:x:43:43:X Font Server:/etc/X11/fs:/bin/false
rpc:x:32:32:Portmapper RPC user:/:/bin/false
nscd:x:28:28:NSCD Daemon:/:/bin/false
named:x:25:25:Named:/var/named:/bin/false
squid:x:23:23::/var/spool/squid:/dev/null
ldap:x:55:55:LDAP User:/var/lib/ldap:/bin/false
apache:x:48:48:Apache:/var/www:/bin/false

cathy ~> grep -c false /etc/passwd
7

cathy ~> grep -i ps ~/.bash* | grep -v history
/home/cathy/.bashrc:PS1="\[\033[1;44m\]$USER is in \w\[\033[0m\] "

With the first command, user cathy displays the lines from /etc/passwd containing the string root.

Then she displays the line numbers containing this search string.

With the third command she checks which users are not using bash, but accounts with the nologin shell are not displayed.

Then she counts the number of accounts that have /bin/false as the shell.

The last command displays the lines from all the files in her home directory starting with ~/.bash, excluding matches containing the string history, so as to exclude matches from ~/.bash_history which might contain the same string, in upper or lower cases. Note that the search is for the string "ps", and not for the command ps.

Now let's see what else we can do with grep, using regular expressions.


4.2.2. Grep and regular expressions

Note: If you are not on Linux
 

We use GNU grep in these examples, which supports extended regular expressions. GNU grep is the default on Linux systems. If you are working on proprietary systems, check which version you are using with the -V option. GNU grep can be downloaded from http://gnu.org/directory/.


4.2.2.1. Line and word anchors

From the previous example, we now exclusively want to display lines starting with the string "root":


cathy ~> grep ^root /etc/passwd
root:x:0:0:root:/root:/bin/bash

If we want to see which accounts have no shell assigned whatsoever, we search for lines ending in ":":


cathy ~> grep :$ /etc/passwd
news:x:9:13:news:/var/spool/news:

To check that PATH is exported in ~/.bashrc, first select "export" lines and then search for lines starting with the string "PATH", so as not to display MANPATH and other possible paths:


cathy ~> grep export ~/.bashrc | grep '\<PATH'
  export PATH="/bin:/usr/lib/mh:/lib:/usr/bin:/usr/local/bin:/usr/ucb:/usr/dbin:$PATH"

Similarly, \> matches the end of a word.

If you want to find a string that is a separate word (enclosed by spaces), it is better to use the -w option, as in this example where we display information for the root partition:


cathy ~> grep -w / /etc/fstab
LABEL=/                 /                       ext3    defaults        1 1

If this option is not used, all the lines from the file system table will be displayed.


4.2.2.2. Character classes

A bracket expression is a list of characters enclosed by "[" and "]". It matches any single character in that list; if the first character of the list is the caret, "^", then it matches any character NOT in the list. For example, the regular expression "[0123456789]" matches any single digit.

Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, "[a-d]" is equivalent to "[abcd]". Many locales sort characters in dictionary order, and in these locales "[a-d]" is typically not equivalent to "[abcd]"; it might be equivalent to "[aBbCcDd]", for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value "C".

Finally, certain named classes of characters are predefined within bracket expressions. See the grep man or info pages for more information about these predefined expressions.


cathy ~> grep [yf] /etc/group
sys:x:3:root,bin,adm
tty:x:5:
mail:x:12:mail,postfix
ftp:x:50:
nobody:x:99:
floppy:x:19:
xfs:x:43:
nfsnobody:x:65534:
postfix:x:89:

In the example, all the lines containing either a "y" or "f" character are displayed.
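
A named class is used in the same way; this minimal example feeds grep a made-up string:


cathy ~> echo "route 66" | grep '[[:digit:]]'
route 66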


4.2.2.3. Wildcards

Use the "." for a single character match. If you want to get a list of all five-character English dictionary words starting with "c" and ending in "h" (handy for solving crosswords):


cathy ~> grep '\<c...h\>' /usr/share/dict/words
catch
clash
cloth
coach
couch
cough
crash
crush

If you want to display lines containing the literal dot character, use the -F option to grep.

For matching multiple characters, use the dot followed by the asterisk (".*"), which matches any number of arbitrary characters. This example selects all words starting with "c" and ending in "h" from the system's dictionary:


cathy ~> grep '\<c.*h\>' /usr/share/dict/words
caliph
cash
catch
cheesecloth
cheetah
--output omitted--

If you want to find the literal asterisk character in a file or output, use single quotes. Cathy in the example below first tries finding the asterisk character in /etc/profile without using quotes, which does not return any lines. Using quotes, output is generated:


cathy ~> grep * /etc/profile

cathy ~> grep '*' /etc/profile
for i in /etc/profile.d/*.sh ; do

4.3. Pattern matching using Bash features

4.3.1. Character ranges

Apart from grep and regular expressions, there's a good deal of pattern matching that you can do directly in the shell, without having to use an external program.

As you already know, the asterisk (*) and the question mark (?) match any string or any single character, respectively. Quote these special characters to match them literally:


cathy ~> touch "*"

cathy ~> ls "*"
*

But you can also use square brackets to match any enclosed character or range of characters, if pairs of characters are separated by a hyphen. An example:


cathy ~> ls -ld [a-cx-z]*
drwxr-xr-x    2 cathy	 cathy		4096 Jul 20  2002 app-defaults/
drwxrwxr-x    4 cathy    cathy          4096 May 25  2002 arabic/
drwxrwxr-x    2 cathy    cathy          4096 Mar  4 18:30 bin/
drwxr-xr-x    7 cathy    cathy          4096 Sep  2  2001 crossover/
drwxrwxr-x    3 cathy    cathy          4096 Mar 22  2002 xml/

This lists all files in cathy's home directory, starting with "a", "b", "c", "x", "y" or "z".

If the first character within the brackets is "!" or "^", any character not enclosed will be matched. To match the dash ("-"), include it as the first or last character in the set. The sorting depends on the current locale and on the value of the LC_COLLATE variable, if it is set. Mind that other locales might interpret "[a-cx-z]" as "[aBbCcXxYyZz]" if sorting is done in dictionary order. If you want to be sure to have the traditional interpretation of ranges, force this behavior by setting LC_COLLATE or LC_ALL to "C".


4.3.2. Character classes

Character classes can be specified within square brackets, using the syntax [:CLASS:], where CLASS is defined in the POSIX standard and has one of the values

"alnum", "alpha", "ascii", "blank", "cntrl", "digit", "graph", "lower", "print", "punct", "space", "upper", "word" or "xdigit".

Some examples:


cathy ~> ls -ld [[:digit:]]*
drwxrwxr-x    2 cathy	cathy		4096 Apr 20 13:45 2/

cathy ~> ls -ld [[:upper:]]*
drwxrwxr--    3 cathy   cathy           4096 Sep 30  2001 Nautilus/
drwxrwxr-x    4 cathy   cathy           4096 Jul 11  2002 OpenOffice.org1.0/
-rw-rw-r--    1 cathy   cathy         997376 Apr 18 15:39 Schedule.sdc

When the extglob shell option is enabled (using the shopt built-in), several extended pattern matching operators are recognized. Read more in the Bash info pages, section Basic shell features->Shell Expansions->Filename Expansion->Pattern Matching.
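
A minimal sketch in an empty test directory (file names invented for the example): with extglob enabled, ?(b) matches zero or one "b":


cathy ~/testdir> shopt -s extglob

cathy ~/testdir> touch ab.txt abc.txt

cathy ~/testdir> ls a?(b).txt
ab.txt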


4.4. Summary

Regular expressions are powerful tools for selecting particular lines from files or output. A lot of UNIX commands use regular expressions: vim, perl, the PostgreSQL database and so on. They can be made available in any language or application using external libraries, and they even found their way to non-UNIX systems. For instance, regular expressions are used in Excel, the spreadsheet that comes with the Microsoft Office suite. In this chapter we got the feel of the grep command, which is indispensable in any UNIX environment.

Note

The grep command can do much more than the few tasks we discussed here; we only used it as an example for regular expressions. The GNU grep version comes with plenty of documentation, which you are strongly advised to read!

Bash has built-in features for matching patterns and can recognize character classes and ranges.


4.5. Exercises

These exercises will help you master regular expressions.

  1. Display a list of all the users on your system who log in with the Bash shell as a default.

  2. From the /etc/group file, display all lines starting with the string "daemon".

  3. Print all the lines from the same file that don't contain the string.

  4. Display localhost information from the /etc/hosts file, display the line number(s) matching the search string and count the number of occurrences of the string.

  5. Display a list of /usr/share/doc subdirectories containing information about shells.

  6. How many README files do these subdirectories contain? Don't count anything in the form of "README.a_string".

  7. Make a list of files in your home directory that were changed less than 10 hours ago, using grep, but leave out directories.

  8. Put these commands in a shell script that will generate comprehensible output.

  9. Can you find an alternative for wc -l, using grep?

  10. Using the file system table (/etc/fstab for instance), list local disk devices.

  11. Make a script that checks whether a user exists in /etc/passwd. For now, you can specify the user name in the script, you don't have to work with arguments and conditionals at this stage.

  12. Display configuration files in /etc that contain numbers in their names.


Chapter 5. The GNU sed stream editor

At the end of this chapter you will know about the following topics:

  • What is sed?

  • Interactive use of sed

  • Regular expressions and stream editing

  • Using sed commands in scripts

Note: This is an introduction
 

These explanations are far from complete and certainly not meant to be used as the definite user manual for sed. This chapter is only included in order to show some more interesting topics in the next chapters, and because every power user should have a basic knowledge of things that can be done with this editor.

For detailed information, refer to the sed info and man pages.


5.1. Introduction

5.1.1. What is sed?

A Stream EDitor is used to perform basic transformations on text read from a file or a pipe. The result is sent to standard output. The syntax for the sed command has no output file specification, but results can be saved to a file using output redirection. The editor does not modify the original input.

What distinguishes sed from other editors, such as vi and ed, is its ability to filter text that it gets from a pipeline feed. You do not need to interact with the editor while it is running; that is why sed is sometimes called a batch editor. This feature allows use of editing commands in scripts, greatly easing repetitive editing tasks. When facing replacement of text in a large number of files, sed is a great help.


5.1.2. sed commands

The sed program can perform text pattern substitutions and deletions using regular expressions, like the ones used with the grep command; see Section 4.2.

The editing commands are similar to the ones used in the vi editor:

Table 5-1. Sed editing commands

Command    Result
a\         Append text below current line.
c\         Change text in the current line with new text.
d          Delete text.
i\         Insert text above current line.
p          Print text.
r          Read a file.
s          Search and replace text.
w          Write to a file.

Apart from editing commands, you can give options to sed. An overview is in the table below:

Table 5-2. Sed options

Option            Effect
-e SCRIPT         Add the commands in SCRIPT to the set of commands to be run while processing the input.
-f SCRIPT-FILE    Add the commands contained in the file SCRIPT-FILE to the set of commands to be run while processing the input.
-n                Silent mode.
-V                Print version information and exit.

The sed info pages contain more information; we only list the most frequently used commands and options here.


5.2. Interactive editing

5.2.1. Printing lines containing a pattern

This is something you can do with grep, of course, but you can't do a "find and replace" using that command. This is just to get you started.

This is our example text file:


sandy ~> cat -n example
     1  This is the first line of an example text.
     2  It is a text with erors.
     3  Lots of erors.
     4  So much erors, all these erors are making me sick.
     5  This is a line not containing any errors.
     6  This is the last line.

sandy ~>

We want sed to find all the lines containing our search pattern, in this case "erors". We use the p command to obtain this result:


sandy ~> sed  '/erors/p' example
This is the first line of an example text.
It is a text with erors.
It is a text with erors.
Lots of erors.
Lots of erors.
So much erors, all these erors are making me sick.
So much erors, all these erors are making me sick.
This is a line not containing any errors.
This is the last line.

sandy ~>

As you notice, sed prints the entire file, but the lines containing the search string are printed twice. This is not what we want. In order to only print those lines matching our pattern, use the -n option:


sandy ~> sed -n '/erors/p' example
It is a text with erors.
Lots of erors.
So much erors, all these erors are making me sick.

sandy ~>

5.2.2. Deleting lines of input containing a pattern

We use the same example text file. Now we only want to see the lines not containing the search string:


sandy ~> sed '/erors/d' example
This is the first line of an example text.
This is a line not containing any errors.
This is the last line.

sandy ~>

The d command results in excluding lines from being displayed.

Lines starting with a given pattern and ending in a second pattern are shown like this:


sandy ~> sed -n '/^This.*errors.$/p' example
This is a line not containing any errors.

sandy ~>

Note that the last dot should be escaped ("\.") if it is to match only a literal dot. In our example the unescaped dot just matches any character in that position, including the final dot.


5.2.3. Ranges of lines

This time we want to take out the lines containing the errors. In the example these are lines 2 to 4. Specify this range as an address, together with the d command:


sandy ~> sed '2,4d' example
This is the first line of an example text.
This is a line not containing any errors.
This is the last line.

sandy ~>

To print the file starting from a certain line until the end of the file, use a command similar to this:


sandy ~> sed '3,$d' example
This is the first line of an example text.
It is a text with erors.

sandy ~>

This only prints the first two lines of the example file.

The following command prints the lines from the first line containing the pattern "a text", up to and including the next line containing the pattern "This":


sandy ~> sed -n '/a text/,/This/p' example
It is a text with erors.
Lots of erors.
So much erors, all these erors are making me sick.
This is a line not containing any errors.

sandy ~>

5.2.4. Find and replace with sed

In the example file, we will now search and replace the errors instead of only (de)selecting the lines containing the search string.


sandy ~> sed 's/erors/errors/' example
This is the first line of an example text.
It is a text with errors.
Lots of errors.
So much errors, all these erors are making me sick.
This is a line not containing any errors.
This is the last line.

sandy ~>

As you can see, this is not exactly the desired effect: in line 4, only the first occurrence of the search string has been replaced, and there is still an "eror" left. Add the g flag to make sed examine the entire line instead of stopping at the first occurrence of your string:


sandy ~> sed 's/erors/errors/g' example
This is the first line of an example text.
It is a text with errors.
Lots of errors.
So much errors, all these errors are making me sick.
This is a line not containing any errors.
This is the last line.

sandy ~>

To insert a string at the beginning of each line of a file, for instance for quoting:


sandy ~> sed 's/^/> /' example
> This is the first line of an example text.
> It is a text with erors.
> Lots of erors.
> So much erors, all these erors are making me sick.
> This is a line not containing any errors.
> This is the last line.

sandy ~>

Insert some string at the end of each line:


sandy ~> sed 's/$/EOL/' example
This is the first line of an example text.EOL
It is a text with erors.EOL
Lots of erors.EOL
So much erors, all these erors are making me sick.EOL
This is a line not containing any errors.EOL
This is the last line.EOL

sandy ~>

Multiple find-and-replace commands can be combined, each preceded by its own -e option:


sandy ~> sed -e 's/erors/errors/g' -e 's/last/final/g' example
This is the first line of an example text.
It is a text with errors.
Lots of errors.
So much errors, all these errors are making me sick.
This is a line not containing any errors.
This is the final line.

sandy ~>

Keep in mind that by default sed prints its results to the standard output, most likely your terminal window. If you want to save the output to a file, redirect it:

sed option 'some/expression' file_to_process > sed_output_in_a_file

Tip: More examples
 

Plenty of sed examples can be found in the startup scripts for your machine, which are usually in /etc/init.d or /etc/rc.d/init.d. Change into the directory containing the initscripts on your system and issue the following command:

grep sed *


5.3. Non-interactive editing

5.3.1. Reading sed commands from a file

Multiple sed commands can be put in a file and executed using the -f option. When creating such a file, make sure that:

  • No trailing white spaces exist at the end of lines.

  • No quotes are used.

  • When entering text to add or replace, all except the last line end in a backslash.
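
A small command file respecting these rules, applied to the example file from the previous sections (the file name and patterns are only an illustration):


sandy ~> cat fixup.sed
s/erors/errors/g
s/last/final/g

sandy ~> sed -f fixup.sed example
This is the first line of an example text.
It is a text with errors.
Lots of errors.
So much errors, all these errors are making me sick.
This is a line not containing any errors.
This is the final line.

sandy ~>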


5.3.2. Writing output files

Writing output is done using the output redirection operator >. This is an example script used to create very simple HTML files from plain text files.


sandy ~> cat script.sed
1i\
<html>\
<head><title>sed generated html</title></head>\
<body bgcolor="#ffffff">\
<pre>
$a\
</pre>\
</body>\
</html>

sandy ~> cat txt2html.sh
#!/bin/bash

# This is a simple script that you can use for converting text into HTML.
# First we take out all newline characters, so that the appending only happens
# once, then we replace the newlines.

echo "converting $1..."

SCRIPT="/home/sandy/scripts/script.sed"
NAME="$1"
TEMPFILE="/var/tmp/sed.$$.tmp"
sed "s/\n/^M/" "$1" | sed -f "$SCRIPT" | sed "s/^M/\n/" > "$TEMPFILE"
mv "$TEMPFILE" "$NAME"

echo "done."

sandy ~>

$1 holds the first argument to a given command, in this case the name of the file to convert:


sandy ~> cat test
line1
line2
line3

More on positional parameters in Chapter 7.


sandy ~> txt2html.sh test
converting test...
done.

sandy ~> cat test
<html>
<head><title>sed generated html</title></head>
<body bgcolor="#ffffff">
<pre>
line1
line2
line3
</pre>
</body>
</html>

sandy ~>

This is not really how it is done; this example just demonstrates sed capabilities. See Section 6.3 for a more decent solution to this problem, using awk BEGIN and END constructs.

Note: Easy sed
 

Advanced editors, supporting syntax highlighting, can recognize sed syntax. This can be a great help if you tend to forget backslashes and such.


5.4. Summary

The sed stream editor is a powerful command line tool, which can handle streams of data: it can take input lines from a pipe. This makes it fit for non-interactive use. The sed editor uses vi-like commands and accepts regular expressions.

The sed tool can read commands from the command line or from a script. It is often used to perform find-and-replace actions on lines containing a pattern.


5.5. Exercises

These exercises are meant to further demonstrate what sed can do.

  1. Print a list of files in your scripts directory, ending in ".sh". Mind that you might have to unalias ls. Put the result in a temporary file.

  2. Make a list of files in /usr/bin that have the letter "a" as the second character. Put the result in a temporary file.

  3. Delete the first 3 lines of each temporary file.

  4. Print to standard output only the lines containing the pattern "an".

  5. Create a file holding sed commands to perform the previous two tasks. Add an extra command to this file that adds a string like "*** This might have something to do with man and man pages ***" in the line preceding every occurence of the string "man". Check the results.

  6. A long listing of the root directory, /, is used for input. Create a file holding sed commands that check for symbolic links and plain files. If a file is a symbolic link, precede it with a line like "--This is a symlink--". If the file is a plain file, add a string on the same line, adding a comment like "<--- this is a plain file".

  7. Create a script that shows lines containing trailing white spaces from a file. This script should use a sed script and show sensible information to the user.


Chapter 6. The GNU awk programming language

In this chapter we will discuss:

  • What is gawk?

  • Using gawk commands on the command line

  • How to format text with gawk

  • How gawk uses regular expressions

  • Gawk in scripts

  • Gawk and variables

Note: To make it more fun
 

As with sed, entire books have been written about various versions of awk. This introduction is far from complete and is only intended for understanding examples in the following chapters. For more information, best start with the documentation that comes with GNU awk: "GAWK: Effective AWK Programming: A User's Guide for GNU Awk".


6.1. Getting started with gawk

6.1.1. What is gawk?

Gawk is the GNU version of the commonly available UNIX awk program, a pattern scanning and processing language. Since the awk program is often just a link to gawk, we will refer to it as awk.

The basic function of awk is to search files for lines or other text units containing one or more patterns. When a line matches one of the patterns, special actions are performed on that line.

Programs in awk are different from programs in most other languages, because awk programs are "data-driven": you describe the data you want to work with and then what to do when you find it. Most other languages are "procedural." You have to describe, in great detail, every step the program is to take. When working with procedural languages, it is usually much harder to clearly describe the data your program will process. For this reason, awk programs are often refreshingly easy to read and write.

Note: What does it really mean?
 

Back in the 1970s, three programmers got together to create this language. Their names were Aho, Kernighan and Weinberger. They took the first character of each of their names and put them together. So the name of the language might just as well have been "wak".


6.1.2. Gawk commands

When you run awk, you specify an awk program that tells awk what to do. The program consists of a series of rules. (It may also contain function definitions, loops, conditions and other programming constructs, advanced features that we will ignore for now.) Each rule specifies one pattern to search for and one action to perform upon finding the pattern.

There are several ways to run awk. If the program is short, it is easiest to run it on the command line:

awk PROGRAM inputfile(s)

If multiple changes have to be made, possibly regularly and on multiple files, it is easier to put the awk commands in a script. This is read like this:

awk -f PROGRAM-FILE inputfile(s)


6.2. The print program

6.2.1. Printing selected fields

The print command in awk outputs selected data from the input file.

When awk reads a line of a file, it divides the line in fields based on the specified input field separator, FS, which is an awk variable (see Section 6.3.2). This variable is predefined to be one or more spaces or tabs.

The variables $1, $2, $3, ..., $N hold the values of the first, second, third, up to the last field of an input line. The variable $0 (zero) holds the value of the entire line. This is depicted in the image below, where we see six columns in the output of the df command:

Figure 6-1. Fields in awk

In the output of ls -l, there are 9 columns. The print statement uses these fields as follows:


kelly@octarine ~/test> ls -l | awk '{ print $5 $9 }'
160orig
121script.sed
120temp_file
126test
120twolines
441txt2html.sh

kelly@octarine ~/test>

This command printed the fifth column of a long file listing, which contains the file size, and the last column, the name of the file. This output is not very readable unless you use the official way of referring to columns, which is to separate the ones that you want to print with a comma. In that case, the default output separator character, usually a space, will be put in between each output field.
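
Using a comma on the same directory listing as above (only the output separator changes):


kelly@octarine ~/test> ls -l | awk '{ print $5, $9 }'
160 orig
121 script.sed
120 temp_file
126 test
120 twolines
441 txt2html.sh

kelly@octarine ~/test>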

Note: Local configuration
 

Note that the configuration of the output of the ls -l command might be different on your system. Display of time and date is dependent on your locale setting.


6.2.2. Formatting fields

Without formatting, using only the output separator, the output looks rather poor. Inserting a couple of tabs and a string to indicate what output this is will make it look a lot better:


kelly@octarine ~/test> ls -ldh * | grep -v total | \ 
awk '{ print "Size is " $5 " bytes for " $9 }'
Size is 160 bytes for orig
Size is 121 bytes for script.sed
Size is 120 bytes for temp_file
Size is 126 bytes for test
Size is 120 bytes for twolines
Size is 441 bytes for txt2html.sh

kelly@octarine ~/test>

Note the use of the backslash, which makes long input continue on the next line without the shell interpreting it as a separate command. While your command line input can be virtually unlimited in length, your monitor is not that wide, and printed paper certainly isn't. Using the backslash also allows for copying and pasting of the above lines into a terminal window.

The -h option to ls supplies human-readable size formats for bigger files. When a directory is given as the argument, a long listing starts with a line displaying the total amount of blocks in the directory. This line is useless to us, so we list the individual files with an asterisk instead. We also add the -d option for the same reason: should the asterisk expand to a directory name, -d makes ls list the directory itself rather than its contents.

The backslash in this example marks the continuation of a line. See Section 3.3.2.

You can take out any number of columns and even reverse the order. In the example below this is demonstrated for showing the most critical partitions:


kelly@octarine ~> df -h | sort -rnk 5 | head -3 | \ 
awk '{ print "Partition " $6 "\t: " $5 " full!" }'
Partition /var  : 86% full!
Partition /usr  : 85% full!
Partition /home : 70% full!

kelly@octarine ~>

The table below gives an overview of special formatting characters:

Table 6-1. Formatting characters for gawk

Sequence    Meaning
\a          Bell character
\n          Newline character
\t          Tab

Quotes, dollar signs and other meta-characters should be escaped with a backslash.
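
For instance, to print literal double quotes around a field (the input here is just an echoed test string):


kelly@octarine ~/test> echo "16 files" | awk '{ print "Found \"" $1 "\" files" }'
Found "16" files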


6.2.3. The print command and regular expressions

A regular expression can be used as a pattern by enclosing it in slashes. The regular expression is then tested against the entire text of each record. The syntax is as follows:

awk 'EXPRESSION { PROGRAM }' file(s)

The following example displays only local disk device information, networked file systems are not shown:


kelly is in ~> df -h | awk '/dev\/hd/ { print $6 "\t: " $5 }'
/       : 46%
/boot   : 10%
/opt    : 84%
/usr    : 97%
/var    : 73%
/.vol1  : 8%

kelly is in ~>

Slashes need to be escaped, because they have a special meaning to the awk program.

Below another example where we search the /etc directory for files ending in ".conf" and starting with either "a" or "x", using extended regular expressions:


kelly is in /etc> ls -l | awk '/\<(a|x).*\.conf$/ { print $9 }'
amd.conf
antivir.conf
xcdroast.conf
xinetd.conf

kelly is in /etc>

This example illustrates the special meaning of the dot in regular expressions: the first one indicates that we want to search for any character after the first search string, the second is escaped because it is part of a string to find (the end of the file name).


6.2.4. Special patterns

In order to precede output with comments, use the BEGIN statement:


kelly is in /etc> ls -l | \
awk 'BEGIN { print "Files found:\n" } /\<[ax].*\.conf$/ { print $9 }'
Files found:
amd.conf
antivir.conf
xcdroast.conf
xinetd.conf

kelly is in /etc>

The END statement can be added for inserting text after the entire input is processed:


kelly is in /etc> ls -l | \
awk '/\<[ax].*\.conf$/ { print $9 } END { print \
"Can I do anything else for you, mistress?" }'
amd.conf
antivir.conf
xcdroast.conf
xinetd.conf
Can I do anything else for you, mistress?

kelly is in /etc>

6.2.5. Gawk scripts

As commands tend to get a little longer, you might want to put them in a script, so they are reusable. An awk script contains awk statements defining patterns and actions.

As an illustration, we will build a report that displays our most loaded partitions. See Section 6.2.2.


kelly is in ~> cat diskrep.awk
BEGIN { print "*** WARNING WARNING WARNING ***" }
/\<[89][0-9]%/ { print "Partition " $6 "\t: " $5 " full!" }
END { print "*** Give money for new disks URGENTLY! ***" }

kelly is in ~> df -h | awk -f diskrep.awk
*** WARNING WARNING WARNING ***
Partition /usr  : 97% full!
*** Give money for new disks URGENTLY! ***

kelly is in ~>

awk first prints a begin message, then formats all the lines that contain an eight or a nine at the beginning of a word, followed by one other number and a percentage sign. An end message is added.

Note: Syntax highlighting
 

Awk is a programming language. Its syntax is recognized by most editors that can do syntax highlighting for other languages, such as C, Bash, HTML, etc.


6.3. Gawk variables

As awk is processing the input file, it uses several variables. Some are editable, some are read-only.


6.3.1. The input field separator

The field separator, which is either a single character or a regular expression, controls the way awk splits up an input record into fields. The input record is scanned for character sequences that match the separator definition; the fields themselves are the text between the matches.

The field separator is represented by the built-in variable FS. Note that this is something different from the IFS variable used by POSIX-compliant shells.

The value of the field separator variable can be changed in the awk program with the assignment operator =. Often the right time to do this is at the beginning of execution before any input has been processed, so that the very first record is read with the proper separator. To do this, use the special BEGIN pattern.

In the example below, we build a command that displays all the users on your system with a description:


kelly is in ~> awk 'BEGIN { FS=":" } { print $1 "\t" $5 }' /etc/passwd
--output omitted--
kelly	Kelly Smith
franky	Franky B.
eddy	Eddy White
willy	William Black
cathy	Catherine the Great
sandy	Sandy Li Wong

kelly is in ~>

In an awk script, it would look like this:


kelly is in ~> cat printnames.awk
BEGIN { FS=":" }
{ print $1 "\t" $5 }

kelly is in ~> awk -f printnames.awk /etc/passwd
--output omitted--

Choose input field separators carefully to prevent problems. An example to illustrate this: say you get input in the form of lines that look like this:

"Sandy L. Wong, 64 Zoo St., Antwerp, 2000X"

You write a command line or a script, which prints out the name of the person in that record:

awk 'BEGIN { FS="," } { print $1, $2, $3 }' inputfile

But a person might have a PhD, and it might be written like this:

"Sandy L. Wong, PhD, 64 Zoo St., Antwerp, 2000X"

Your awk program will give the wrong output for this line. If needed, use an extra awk or sed command to make the data input format uniform.
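
A possible sketch (the input file name is hypothetical): strip the title with sed before feeding the lines to awk, so that the comma-separated fields line up again:


sed 's/, PhD//' inputfile | awk 'BEGIN { FS="," } { print $1, $2, $3 }'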

The default input field separator is one or more whitespaces or tabs.


6.3.2. The output separators

6.3.2.1. The output field separator

Fields are normally separated by spaces in the output. This becomes apparent when you use the correct syntax for the print command, where arguments are separated by commas:


kelly@octarine ~/test> cat test
record1         data1
record2         data2

kelly@octarine ~/test> awk '{ print $1 $2}' test
record1data1
record2data2

kelly@octarine ~/test> awk '{ print $1, $2}' test
record1 data1
record2 data2

kelly@octarine ~/test>

If you don't put in the commas, print will treat the items to output as one argument, thus omitting the use of the default output separator, OFS.

Any character string may be used as the output field separator by setting this built-in variable.


6.3.2.2. The output record separator

The output from an entire print statement is called an output record. Each print command results in one output record, and then outputs a string called the output record separator, ORS. The default value for this variable is "\n", a newline character. Thus, each print statement generates a separate line.

To change the way output fields and records are separated, assign new values to OFS and ORS:


kelly@octarine ~/test> awk 'BEGIN { OFS=";" ; ORS="\n-->\n" } \
{ print $1,$2}' test
record1;data1
-->
record2;data2
-->

kelly@octarine ~/test>

If the value of ORS does not contain a newline, the program's output is run together on a single line.
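
For instance, with a single space as the output record separator, everything ends up on one line (the trailing echo only adds a final newline):


kelly@octarine ~/test> awk 'BEGIN { ORS=" " } { print $1 }' test; echo
record1 record2

kelly@octarine ~/test>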


6.3.3. The number of records

The built-in NR holds the number of records that are processed. It is incremented after reading a new input line. You can use it at the end to count the total number of records, or in each output record:


kelly@octarine ~/test> cat processed.awk
BEGIN { OFS="-" ; ORS="\n--> done\n" }
{ print "Record number " NR ":\t" $1,$2 }
END { print "Number of records processed: " NR }

kelly@octarine ~/test> awk -f processed.awk test
Record number 1:        record1-data1
--> done
Record number 2:        record2-data2
--> done
Number of records processed: 2
--> done

kelly@octarine ~/test>

6.3.4. User defined variables

Apart from the built-in variables, you can define your own. When awk encounters a reference to a variable which does not exist (which is not predefined), the variable is created and initialized to a null string. For all subsequent references, the value of the variable is whatever value was assigned last. Variables can be a string or a numeric value. Content of input fields can also be assigned to variables.

Values can be assigned directly using the = operator, or you can use the current value of the variable in combination with other operators:


kelly@octarine ~> cat revenues
20021009        20021013        consultancy     BigComp         2500
20021015        20021020        training        EduComp         2000
20021112        20021123        appdev          SmartComp       10000
20021204        20021215        training        EduComp         5000

kelly@octarine ~> cat total.awk
{ total=total + $5 }
{ print "Send bill for " $5 " dollar to " $4 }
END { print "---------------------------------\nTotal revenue: " total }

kelly@octarine ~> awk -f total.awk revenues
Send bill for 2500 dollar to BigComp
Send bill for 2000 dollar to EduComp
Send bill for 10000 dollar to SmartComp
Send bill for 5000 dollar to EduComp
---------------------------------
Total revenue: 19500

kelly@octarine ~>

C-like shorthands like VAR+= value are also accepted.
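
The first line of total.awk could therefore also be written as:


{ total += $5 }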


6.3.5. More examples

The example from Section 5.3.2 becomes much easier when we use an awk script:


kelly@octarine ~/html> cat make-html-from-text.awk
BEGIN { print "<html>\n<head><title>Awk-generated HTML</title></head>\n<body bgcolor=\"#ffffff\">\n<pre>" }
{ print $0 }
END { print "</pre>\n</body>\n</html>" }

And the command to execute is also much more straightforward when using awk instead of sed:


kelly@octarine ~/html> awk -f make-html-from-text.awk testfile > file.html

Tip: Awk examples on your system
 

We refer again to the directory containing the initscripts on your system. Enter a command similar to the following to see more practical examples of the widely spread usage of the awk command:

grep awk /etc/init.d/*


6.3.6. The printf program

For more precise control over the output format than what is normally provided by print, use printf. The printf command can be used to specify the field width to use for each item, as well as various formatting choices for numbers (such as what output base to use, whether to print an exponent, whether to print a sign, and how many digits to print after the decimal point). This is done by supplying a string, called the format string, that controls how and where to print the other arguments.

The syntax is the same as for the C-language printf statement; see your C introduction guide. The gawk info pages contain full explanations.
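
A tiny, self-contained illustration (the input values are made up):


kelly@octarine ~> echo "3.14159 42" | awk '{ printf "pi is about %.2f and the answer is %d\n", $1, $2 }'
pi is about 3.14 and the answer is 42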


6.4. Summary

The gawk utility interprets a special-purpose programming language, handling simple data-reformatting jobs with just a few lines of code. It is the free version of the general UNIX awk command.

This tool reads lines of input data and can easily recognize columned output. The print program is the most common for filtering and formatting defined fields.

On-the-fly variable declaration is straightforward and allows for simple calculation of sums, statistics and other operations on the processed input stream. Variables and commands can be put in awk scripts for background processing.

Other things you should know about awk:

  • The language remains well-known on UNIX and the like, but for executing similar tasks, Perl is now more commonly used. However, awk is quicker to pick up (meaning that you learn a lot in a very short time). In other words, Perl is more difficult to learn.

  • Both Perl and awk share the reputation of being incomprehensible, even to the actual authors of the programs that use these languages. So document your code!


6.5. Exercises

These are some practical examples where awk can be useful.

  1. For the first exercise, your input is lines in the following form:

    Username:Firstname:Lastname:Telephone number

    Make an awk script that will convert such a line to an LDAP record in this format:

    
dn: uid=Username, dc=example, dc=com
    cn: Firstname Lastname
    sn: Lastname
    telephoneNumber: Telephone number
    

    Create a file containing a couple of test records and check.

  2. Create a Bash script using awk and standard UNIX commands that will show the top three users of disk space in the /home file system (if you don't have the directory holding the homes on a separate partition, make the script for the / partition; this is present on every UNIX system). First, execute the commands from the command line. Then put them in a script. The script should create sensible output (sensible as in readable by the boss). If everything proves to work, have the script email its results to you (use for instance mail -s Disk space usage < result).

    If the quota daemon is running, use that information; if not, use find.

  3. Create XML-style output from a Tab-separated list in the following form:

    
Meaning very long line with a lot of description
     
    meaning another long line
     
    othermeaning    more longline
     
    testmeaning     looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong line, but i mean really looooooooooooooooooooooooooooooooooooooooooooooooooong.
     
    

    The output should read:

    
<row>
    <entry>Meaning</entry>
    <entry>
    very long line
    </entry>
    </row>
    <row>
    <entry>meaning</entry>
    <entry>
    long line
    </entry>
    </row>
    <row>
    <entry>othermeaning</entry>
    <entry>
    more longline
    </entry>
    </row>
    <row>
    <entry>testmeaning</entry>
    <entry>
    looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong line, but i mean really looooooooooooooooooooooooooooooooooooooooooooooooooong.
    </entry>
    </row>
    

    Additionally, if you know anything about XML, write a BEGIN and END script to complete the table. Or do it in HTML.


Chapter 7. Conditional statements

In this chapter we will discuss the use of conditionals in Bash scripts. This includes the following topics:

  • The if statement

  • Using the exit status of a command

  • Comparing and testing input and files

  • if/then/else constructs

  • if/then/elif/else constructs

  • Using and testing the positional parameters

  • Nested if statements

  • Boolean expressions

  • Using case statements


7.1. Introduction to if

7.1.1. General

At times you need to specify different courses of action to be taken in a shell script, depending on the success or failure of a command. The if construction allows you to specify such conditions.

The most compact syntax of the if command is:

if TEST-COMMANDS; then CONSEQUENT-COMMANDS; fi

The TEST-COMMAND list is executed, and if its return status is zero, the CONSEQUENT-COMMANDS list is executed. The return status is the exit status of the last command executed, or zero if no condition tested true.

The TEST-COMMAND often involves numerical or string comparison tests, but it can also be any command that returns a status of zero when it succeeds and some other status when it fails. Unary expressions are often used to examine the status of a file. If the FILE argument to one of the primaries is of the form /dev/fd/N, then file descriptor "N" is checked. stdin, stdout and stderr and their respective file descriptors may also be used for tests.


7.1.1.1. Expressions used with if

The table below contains an overview of the so-called "primaries" that make up the TEST-COMMAND command or list of commands. These primaries are put between square brackets to indicate the test of a conditional expression.

Table 7-1. Primary expressions

Primary                        Meaning
[ -a FILE ]                    True if FILE exists.
[ -b FILE ]                    True if FILE exists and is a block-special file.
[ -c FILE ]                    True if FILE exists and is a character-special file.
[ -d FILE ]                    True if FILE exists and is a directory.
[ -e FILE ]                    True if FILE exists.
[ -f FILE ]                    True if FILE exists and is a regular file.
[ -g FILE ]                    True if FILE exists and its SGID bit is set.
[ -h FILE ]                    True if FILE exists and is a symbolic link.
[ -k FILE ]                    True if FILE exists and its sticky bit is set.
[ -p FILE ]                    True if FILE exists and is a named pipe (FIFO).
[ -r FILE ]                    True if FILE exists and is readable.
[ -s FILE ]                    True if FILE exists and has a size greater than zero.
[ -t FD ]                      True if file descriptor FD is open and refers to a terminal.
[ -u FILE ]                    True if FILE exists and its SUID (set user ID) bit is set.
[ -w FILE ]                    True if FILE exists and is writable.
[ -x FILE ]                    True if FILE exists and is executable.
[ -O FILE ]                    True if FILE exists and is owned by the effective user ID.
[ -G FILE ]                    True if FILE exists and is owned by the effective group ID.
[ -L FILE ]                    True if FILE exists and is a symbolic link.
[ -N FILE ]                    True if FILE exists and has been modified since it was last read.
[ -S FILE ]                    True if FILE exists and is a socket.
[ FILE1 -nt FILE2 ]            True if FILE1 has been changed more recently than FILE2, or if FILE1 exists and FILE2 does not.
[ FILE1 -ot FILE2 ]            True if FILE1 is older than FILE2, or if FILE2 exists and FILE1 does not.
[ FILE1 -ef FILE2 ]            True if FILE1 and FILE2 refer to the same device and inode numbers.
[ -o OPTIONNAME ]              True if shell option "OPTIONNAME" is enabled.
[ -z STRING ]                  True if the length of "STRING" is zero.
[ -n STRING ] or [ STRING ]    True if the length of "STRING" is non-zero.
[ STRING1 == STRING2 ]         True if the strings are equal. "=" may be used instead of "==" for strict POSIX compliance.
[ STRING1 != STRING2 ]         True if the strings are not equal.
[ STRING1 < STRING2 ]          True if "STRING1" sorts before "STRING2" lexicographically in the current locale.
[ STRING1 > STRING2 ]          True if "STRING1" sorts after "STRING2" lexicographically in the current locale.
[ ARG1 OP ARG2 ]               "OP" is one of -eq, -ne, -lt, -le, -gt or -ge. These arithmetic binary operators return true if "ARG1" is equal to, not equal to, less than, less than or equal to, greater than, or greater than or equal to "ARG2", respectively. "ARG1" and "ARG2" are integers.

Expressions may be combined using the following operators, listed in decreasing order of precedence:

Table 7-2. Combining expressions

Operation             Effect
[ ! EXPR ]            True if EXPR is false.
[ ( EXPR ) ]          Returns the value of EXPR. This may be used to override the normal precedence of operators.
[ EXPR1 -a EXPR2 ]    True if both EXPR1 and EXPR2 are true.
[ EXPR1 -o EXPR2 ]    True if either EXPR1 or EXPR2 is true.

The [ (or test) built-in evaluates conditional expressions using a set of rules based on the number of arguments. More information about this subject can be found in the Bash documentation. Just like the if is closed with fi, the opening square bracket should be closed after the conditions have been listed.
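
For example, combining two file tests with -a (the result shown assumes /etc/passwd exists and is readable, which is the case on most systems):


anny ~> if [ -f /etc/passwd -a -r /etc/passwd ]; then echo "readable regular file"; fi
readable regular file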


7.1.1.2. Commands following the then statement

The CONSEQUENT-COMMANDS list that follows the then statement can be any valid UNIX command, any executable program, any executable shell script or any shell statement, with the exception of the closing fi. It is important to remember that the then and fi are considered to be separate statements in the shell. Therefore, when issued on the command line, they are separated by a semicolon.

In a script, the different parts of the if statement are usually well-separated. Below, a couple of simple examples.


7.1.1.3. Checking files

The first example checks for the existence of a file:


anny ~> cat msgcheck.sh
#!/bin/bash

echo "This scripts checks the existence of the messages file."
echo "Checking..."
if [ -f /var/log/messages ]
  then
    echo "/var/log/messages exists."
fi
echo
echo "...done."

anny ~> ./msgcheck.sh
This script checks the existence of the messages file.
Checking...
/var/log/messages exists.

...done.

7.1.1.4. Checking shell options

To add in your Bash configuration files:


# These lines will print a message if the noclobber option is set:

if [ -o noclobber ]
  then
	echo "Your files are protected against accidental overwriting using redirection."
fi

Note: The environment
 

The above example will work when entered on the command line:


anny ~> if [ -o noclobber ] ; then echo ; echo "your files are protected
against overwriting." ; echo ; fi

your files are protected against overwriting.

anny ~>

However, if you use testing of conditions that depend on the environment, you might get different results when you enter the same command in a script, because the script will open a new shell, in which expected variables and options might not be set automatically.


7.1.2. Simple applications of if

7.1.2.1. Testing exit status

The $? variable holds the exit status of the previously executed command (the most recently completed foreground process).

The following example shows a simple test:


anny ~> if [ $? -eq 0 ]
More input> then echo 'That was a good job!'
More input> fi
That was a good job!

anny ~>

The following example demonstrates that TEST-COMMANDS might be any UNIX command that returns an exit status, and that if again returns an exit status of zero:


anny ~> if ! grep $USER /etc/passwd
More input> then echo "your user account is not managed locally"; fi
your user account is not managed locally

anny > echo $?
0

anny >

The same result can be obtained as follows:


anny > grep $USER /etc/passwd

anny > if [ $? -ne 0 ] ; then echo "not a local account" ; fi
not a local account

anny >

7.1.2.2. Numeric comparisons

The examples below use numerical comparisons:


anny > num=`wc -l < work.txt`

anny > echo $num
201

anny > if [ "$num" -gt "150" ]
More input> then echo ; echo "you've worked hard enough for today."
More input> echo ; fi

you've worked hard enough for today.


anny >

This script is executed by cron every Sunday. If the week number is even, it reminds you to put out the garbage cans:


#!/bin/bash

# Calculate the week number using the date command:

WEEKOFFSET=$[ $(date +"%V") % 2 ]

# Test if we have a remainder.  If not, this is an even week so send a message.
# Else, do nothing.

if [ $WEEKOFFSET -eq "0" ]; then
  echo "Sunday evening, put out the garbage cans." | mail -s "Garbage cans out" your@your_domain.org
fi

7.1.2.3. String comparisons

An example of comparing strings for testing the user ID:


if [ "$(whoami)" != 'root' ]; then
        echo "You have no permission to run $0 as non-root user."
        exit 1;
fi

With Bash, you can shorten this type of construct. The compact equivalent of the above test is as follows:


[ "$(whoami)" != 'root' ] && ( echo you are using a non-privileged account; exit 1 )

Similar to the "&&" expression which indicates what to do if the test proves true, "||" specifies what to do if the test is false.

Regular expressions may also be used in comparisons:


anny > gender="female"

anny > if [[ "$gender" == f* ]]
More input> then echo "Pleasure to meet you, Madame."; fi
Pleasure to meet you, Madame.

anny >

Note: Real Programmers
 

Most programmers will prefer to use the test built-in command, which is equivalent to using square brackets for comparison, like this:


test "$(whoami)" != 'root' && (echo you are using a non-privileged account; exit 1)

Note: No exit?
 

If you invoke the exit in a subshell, it will not pass variables to the parent. Use { and } instead of ( and ) if you do not want Bash to fork a subshell.
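
The earlier compact test, rewritten with curly braces so that the exit affects the current shell (note the semicolon before the closing brace):


[ "$(whoami)" != 'root' ] && { echo you are using a non-privileged account; exit 1; }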

See the info pages for Bash for more information on pattern matching with the "(( EXPRESSION ))" and "[[ EXPRESSION ]]" constructs.


7.2. More advanced if usage

7.2.1. if/then/else constructs

7.2.1.1. Dummy example

This is the construct to use if you want to take one course of action when the TEST-COMMANDS succeed, and another when they fail. An example:


freddy scripts> gender="male"

freddy scripts> if [[ "$gender" == f* ]]
More input> then echo "Pleasure to meet you, Madame."
More input> else echo "How come the lady hasn't got a drink yet?"
More input> fi
How come the lady hasn't got a drink yet?

freddy scripts>

Important: [ ] vs. [[ ]]

Contrary to [, [[ prevents word splitting of variable values. So, if VAR="var with spaces", you do not need to double quote $VAR in a test - even though using quotes remains a good habit. Also, [[ prevents pathname expansion, so literal strings with wildcards do not try to expand to filenames. Within [[, the == and != operators interpret the string to the right as a shell glob pattern to be matched against the value on the left, for instance: [[ "value" == val* ]].
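
A short sketch of the difference (the variable and its value are only an example):


VAR="var with spaces"

# With [[, the unquoted variable is not split into words, so the test succeeds:
[[ $VAR == "var with spaces" ]] && echo "matched"

# With [, the unquoted expansion becomes three separate words and the test
# aborts with "bash: [: too many arguments"; quoting "$VAR" avoids this:
[ "$VAR" == "var with spaces" ] && echo "matched"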

Like the CONSEQUENT-COMMANDS list following the then statement, the ALTERNATE-CONSEQUENT-COMMANDS list following the else statement can hold any UNIX-style command that returns an exit status.

Another example, extending the one from Section 7.1.2.1:


anny ~> su -
Password:
[root@elegance root]# if ! grep ^$USER /etc/passwd 1> /dev/null
> then echo "your user account is not managed locally"
> else echo "your account is managed from the local /etc/passwd file"
> fi
your account is managed from the local /etc/passwd file
[root@elegance root]#

We switch to the root account to demonstrate the effect of the else statement - root is usually a local account, while your own user account might be managed by a central system, such as an LDAP server.


7.2.1.2. Checking command line arguments

Instead of setting a variable and then executing a script, it is frequently more elegant to put the values for the variables on the command line.

We use the positional parameters $1, $2, ..., $N for this purpose. $# refers to the number of command line arguments. $0 refers to the name of the script.

The following is a simple example:

Figure 7-1. Testing of a command line argument with if
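
A minimal sketch of such a test, assuming the script expects exactly one argument (the name testarg.sh is hypothetical), could look like this:


#!/bin/bash
# testarg.sh: check whether a command line argument was given.

if [ -z "$1" ]; then
  echo "Usage: $0 ARGUMENT"
  exit 1
fi

echo "You supplied the argument: $1"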

Here's another example, using two arguments:


anny ~> cat weight.sh
#!/bin/bash

# This script prints a message about your weight if you give it your
# weight in kilos and height in centimeters.

weight="$1"
height="$2"
idealweight=$[$height - 110]

if [ $weight -le $idealweight ] ; then
  echo "You should eat a bit more fat."
else
  echo "You should eat a bit more fruit."
fi

anny ~> bash -x weight.sh 55 169
+ weight=55
+ height=169
+ idealweight=59
+ '[' 55 -le 59 ']'
+ echo 'You should eat a bit more fat.'
You should eat a bit more fat.

7.2.1.3. Testing the number of arguments

The following example shows how to change the previous script so that it prints a message if more or less than 2 arguments are given:


anny ~> cat weight.sh
#!/bin/bash

# This script prints a message about your weight if you give it your
# weight in kilos and height in centimeters.

if [ ! $# == 2 ]; then
  echo "Usage: $0 weight_in_kilos length_in_centimeters"
  exit
fi

weight="$1"
height="$2"
idealweight=$[$height - 110]

if [ $weight -le $idealweight ] ; then
  echo "You should eat a bit more fat."
else
  echo "You should eat a bit more fruit."
fi

anny ~> weight.sh 70 150
You should eat a bit more fruit.

anny ~> weight.sh 70 150 33
Usage: ./weight.sh weight_in_kilos length_in_centimeters

The first argument is referred to as $1, the second as $2 and so on. The total number of arguments is stored in $#.

Check out Section 7.2.5 for a more elegant way to print usage messages.


7.2.1.4. Testing that a file exists

This test is done in a lot of scripts, because there's no use in starting a lot of programs if you know they're not going to work:


#!/bin/bash

# This script gives information about a file.

FILENAME="$1"

echo "Properties for $FILENAME:"

if [ -f $FILENAME ]; then
  echo "Size is $(ls -lh $FILENAME | awk '{ print $5 }')"
  echo "Type is $(file $FILENAME | cut -d":" -f2 -)"
  echo "Inode number is $(ls -i $FILENAME | cut -d" " -f1 -)"
  echo "$(df -h $FILENAME | grep -v Mounted | awk '{ print "On",$1", \
which is mounted as the",$6,"partition."}')"
else
  echo "File does not exist."
fi

Note that the file is referred to using a variable; in this case it is the first argument to the script. Alternatively, when no arguments are given, file locations are usually stored in variables at the beginning of a script, and their content is referred to using these variables. Thus, when you want to change a file name in a script, you only need to do it once.

Tip: Filenames with spaces

The above example will fail if the value of $1 contains spaces, so that it is parsed as multiple words. In that case, the test can be fixed either by putting double quotes around the filename variable, or by using [[ instead of [.
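
For instance, quoting the variable keeps a name like "my file.txt" (a hypothetical example) together as a single word:


if [ -f "$FILENAME" ]; then       # double quotes prevent word splitting

if [[ -f $FILENAME ]]; then       # [[ does not perform word splitting at all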


7.2.2. if/then/elif/else constructs

7.2.2.1. General

This is the full form of the if statement:

if TEST-COMMANDS; then

CONSEQUENT-COMMANDS;

elif MORE-TEST-COMMANDS; then

MORE-CONSEQUENT-COMMANDS;

else ALTERNATE-CONSEQUENT-COMMANDS;

fi

The TEST-COMMANDS list is executed, and if its return status is zero, the CONSEQUENT-COMMANDS list is executed. If TEST-COMMANDS returns a non-zero status, each elif list is executed in turn, and if its exit status is zero, the corresponding MORE-CONSEQUENT-COMMANDS is executed and the command completes. If else is followed by an ALTERNATE-CONSEQUENT-COMMANDS list, and the final command in the final if or elif clause has a non-zero exit status, then ALTERNATE-CONSEQUENT-COMMANDS is executed. The return status is the exit status of the last command executed, or zero if no condition tested true.


7.2.2.2. Example

This is an example that you can put in your crontab for daily execution:


anny /etc/cron.daily> cat disktest.sh
#!/bin/bash

# This script does a very simple test for checking disk space.

space=`df -h | awk '{print $5}' | grep % | grep -v Use | sort -n | tail -1 | cut -d "%" -f1 -`
alertvalue="80"

if [ "$space" -ge "$alertvalue" ]; then
  echo "At least one of my disks is nearly full!" | mail -s "daily diskcheck" root
else
  echo "Disk space normal" | mail -s "daily diskcheck" root
fi

7.2.3. Nested if statements

Inside the if statement, you can use another if statement. You may use as many levels of nested ifs as you can logically manage.

This is an example testing leap years:


anny ~/testdir> cat testleap.sh
#!/bin/bash
# This script will test if we're in a leap year or not.

year=`date +%Y`

if [ $[$year % 400] -eq "0" ]; then
  echo "This is a leap year.  February has 29 days."
elif [ $[$year % 4] -eq 0 ]; then
        if [ $[$year % 100] -ne 0 ]; then
          echo "This is a leap year, February has 29 days."
        else
          echo "This is not a leap year.  February has 28 days."
        fi
else
  echo "This is not a leap year.  February has 28 days."
fi

anny ~/testdir> date
Tue Jan 14 20:37:55 CET 2003

anny ~/testdir> testleap.sh
This is not a leap year.  February has 28 days.

7.2.4. Boolean operations

The above script can be shortened using the Boolean operators "AND" (&&) and "OR" (||).

Figure 7-2. Example using Boolean operators

We use double parentheses, (( )), for testing an arithmetic expression; see Section 3.4.6. This is equivalent to using the let statement. You will get stuck if you use single square brackets here and try something like $[$year % 400], because there the square brackets don't represent an actual command by themselves.
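
As a sketch (not necessarily identical to the figure), the leap year test might be shortened with Boolean operators and arithmetic evaluation like this:


#!/bin/bash
# Shortened leap year test using && and || with the (( )) construct.

year=$(date +%Y)

if (( year % 400 == 0 )) || (( year % 4 == 0 && year % 100 != 0 )); then
  echo "This is a leap year.  February has 29 days."
else
  echo "This is not a leap year.  February has 28 days."
fi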

Among other editors, gvim is one of those that support syntax highlighting based on the file type; such editors are useful for detecting errors in your code.


7.2.5. Using the exit statement and if

We already briefly met the exit statement in Section 7.2.1.3. It terminates execution of the entire script. It is most often used if the input requested from the user is incorrect, if a statement did not run successfully or if some other error occurred.

The exit statement takes an optional argument. This argument is the integer exit status code, which is passed back to the parent and stored in the $? variable.

A zero argument means that the script ran successfully. Any other value may be used by programmers to pass back different messages to the parent, so that different actions can be taken according to failure or success of the child process. If no argument is given to the exit command, the parent shell uses the current value of the $? variable.

Below is an example with a slightly adapted penguin.sh script, which sends its exit status back to the parent, feed.sh:


anny ~/testdir> cat penguin.sh
#!/bin/bash

# This script lets you present different menus to Tux.  He will only be happy
# when given a fish.  We've also added a dolphin and (presumably) a camel.

if [ "$menu" == "fish" ]; then
  if [ "$animal" == "penguin" ]; then
    echo "Hmmmmmm fish... Tux happy!"
  elif [ "$animal" == "dolphin" ]; then
    echo "Pweetpeettreetppeterdepweet!"
  else
    echo "*prrrrrrrt*"
  fi
else
  if [ "$animal" == "penguin" ]; then
    echo "Tux don't like that.  Tux wants fish!"
    exit 1
  elif [ "$animal" == "dolphin" ]; then
    echo "Pweepwishpeeterdepweet!"
    exit 2
  else
    echo "Will you read this sign?!"
    exit 3
  fi
fi

This script is called by the next one, feed.sh, which therefore exports its menu and animal variables:


anny ~/testdir> cat feed.sh
#!/bin/bash
# This script acts upon the exit status given by penguin.sh

export menu="$1"
export animal="$2"

feed="/nethome/anny/testdir/penguin.sh"

$feed $menu $animal

case $? in

1)
  echo "Guard: You'd better give'm a fish, less they get violent..."
  ;;
2)
  echo "Guard: It's because of people like you that they are leaving earth all the time..."
  ;;
3)
  echo "Guard: Buy the food that the Zoo provides for the animals, you ***, how
do you think we survive?"
  ;;
*)
  echo "Guard: Don't forget the guide!"
  ;;
esac

anny ~/testdir> ./feed.sh apple penguin
Tux don't like that.  Tux wants fish!
Guard: You'd better give'm a fish, less they get violent...

As you can see, exit status codes can be chosen freely. Existing commands usually have a series of defined codes; see the programmer's manual for each command for more information.


7.3. Using case statements

7.3.1. Simplified conditions

Nested if statements might be nice, but as soon as you are confronted with a couple of different possible actions to take, they tend to confuse. For the more complex conditionals, use the case syntax:

case EXPRESSION in
  CASE1) COMMAND-LIST;;
  CASE2) COMMAND-LIST;;
  ...
  CASEN) COMMAND-LIST;;
esac

Each case is an expression matching a pattern. The commands in the COMMAND-LIST for the first match are executed. The "|" symbol is used for separating multiple patterns, and the ")" operator terminates a pattern list. Each case plus its corresponding commands is called a clause. Each clause must be terminated with ";;". Each case statement is ended with the esac