bash-programming-from-scratch/loop-operators.md at 944d705ed5fc00aaf74c4addcf6089acfa4edd3d

mirror of https://github.com/ellysh/bash-programming-from-scratch.git synced 2026-01-12 08:21:52 +00:00

Ilya Shpigor 944d705ed5 Translate the "Loop Operators" section

2021-02-21 15:28:40 +01:00

Loop Constructs

Conditional statements manage the control flow of a program. The control flow is the order in which the operators and commands of a program are executed.

The conditional operator chooses a branch of execution depending on the Boolean expression. Sometimes this operator is not enough, and you need additional features to manage the control flow. Loop constructs solve tasks that the conditional operator cannot handle.

The loop construct repeats the same block of commands multiple times. A single execution of this block is called a loop iteration. At each iteration, the loop checks its condition. The result of the check determines whether to perform the next iteration.

Repetition of Commands

Why do you need to repeat the same block of commands in a program? Several examples will help us to answer this question.

We are already familiar with the find utility. It looks for files and directories on the hard disk. If you add the -exec option to the find call, you can specify an action. The utility performs this action on each object found.

For example, the following command deletes all PDF documents in the ~/Documents directory: {line-numbers: false, format: Bash}

find ~/Documents -name "*.pdf" -exec rm {} \;

In this case, find calls the rm utility several times. It passes the next found file on each call. It means that the find utility executes a loop. The loop ends when all the files found have been processed.

The du utility is another example of the repetition of commands. The utility estimates disk space usage. It has an optional parameter. The parameter sets the path where the estimation starts.

You can call du like this, for example: {line-numbers: false, format: Bash}

du ~/Documents

Here the utility recursively traverses all ~/Documents subdirectories. It adds the size of each file found to the final result. This way, the utility repeats the same addition over and over again.

The du utility has an internal loop. It traverses all files and subdirectories. It performs the same actions on each loop iteration. The only difference is the file system object being checked.

Repetition of operations often happens in mathematical calculations. A canonical example is the calculation of the factorial. The factorial of the number N is the product of natural numbers from 1 to N inclusive.

Here is an example of calculating the factorial of number 4: {line-numbers: false, format: Bash}

4! = 1 * 2 * 3 * 4 = 24

You can calculate the factorial easily by using the loop operator. The loop should pass through the integers from 1 to N in sequence. Multiply the final result by each integer on the loop iteration. This way, you repeat the multiplication operation several times.

Here is the last example of repeated actions in a computer system. Repetition helps to manage some events.

Suppose that you write a program. It downloads files to your computer from the Internet. First, the program establishes a connection to a server. If the server doesn't respond, the program has two options. The first one is to terminate with a non-zero exit status. The second option is to wait for the server's response. This option is preferable. There are many reasons why the server's response is delayed. There is an overload of the network or the server, for example. Waiting for a couple of seconds is often enough to get a response. Then our program can continue to work.

Now the question arises: how can you wait for the event to occur in the program? The easiest way is to use the loop operator. Its condition should check if the event has occurred. When it has, the loop stops.

Let's come back to our example. The loop should stop when the program receives a response from the server. Until this happens, the loop continues. You do not need any actions on each loop iteration. Thus, you can leave this code block empty. This technique is called busy waiting.

You can optimize the busy waiting. Replace the loop's empty code block with a command that stops the program for a short while. This way, the OS gets a chance to execute another task while your program is waiting.

We have considered several examples where the program repeats the same actions. Let's write down the tasks that we have solved in each example. Here is the list:

Processing multiple entities in the same way. The find utility processes the search results this way.
Accumulating the final result from intermediate data. The du utility does it when collecting statistics.
Mathematical calculations. You can calculate the factorial using the loop operator.
Waiting for some event to happen. You can wait for the server's response in a busy waiting loop.

The list is far from being complete. These are just the most common programming tasks that require a loop operator.

While Statement

There are two loop operators in Bash: while and for. First, we will consider the while statement. It is simpler than for.

The while syntax resembles the if statement. It looks like this in general: {line-numbers: true, format: Bash}

while CONDITION
do
  ACTION
done

You can write the while statement in one line: {line-numbers: false, format: Bash}

while CONDITION; do ACTION; done

The CONDITION is a single command or block of commands. The same is true for the ACTION. It resembles the if statement again. The ACTION is called the loop body.

Bash checks the CONDITION of the while statement first. If the CONDITION command returns the zero exit status, it equals "true". In this case, Bash executes the ACTION. Then it checks the CONDITION again. If it is still true, the ACTION is performed again. The loop execution stops when the CONDITION becomes "false".
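The check-then-execute cycle can be illustrated with a short sketch. The count variable and the limit of three iterations are made up for this demonstration. The [[ ... ]] command plays the role of the CONDITION here.

```bash
#!/bin/bash

# Hypothetical counter, used only for this illustration.
count=0

# The [[ ... ]] command returns the zero exit status while count is less than 3.
while [[ "$count" -lt 3 ]]
do
  echo "iteration $count"
  count=$(( count + 1 ))
done

# At this point the condition has returned a non-zero status.
echo "the loop has finished, count = $count"
```

The loop body runs three times. On the fourth check, the condition fails, and Bash continues with the command after done.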

Use the while loop when you do not know the number of iterations beforehand. A typical example is busy waiting for some event.

Let's write a script with the while statement. It should check if the server is available on the Internet. The simplest way for that is by sending a request to the server. When the server replies, the script displays a message and stops.

We can call the ping utility to send a request to the server. The utility uses the ICMP protocol. A protocol is an agreement on the format of messages that computers exchange over the network. The ICMP protocol describes the service messages for maintaining the network. You need them, for example, to check if some computer is available.

The ping utility takes one mandatory input parameter. It is the hostname or IP address of the target host. A host is any computer or device connected to the network.

Here is the command to call the ping utility: {line-numbers: false, format: Bash}

ping google.com

We have specified the Google server as the target host. The utility sends ICMP messages to it. The server replies to each of them. The output of the utility looks like this: {line-numbers: true, format: Bash}

PING google.com (172.217.21.238) 56(84) bytes of data.
64 bytes from fra16s13-in-f14.1e100.net (172.217.21.238): icmp_seq=1 ttl=51 time=17.8 ms
64 bytes from fra16s13-in-f14.1e100.net (172.217.21.238): icmp_seq=2 ttl=51 time=18.5 ms

You see information about each ICMP message sent by the utility. The "time" field means the delay between sending the request and receiving the server's response.

The utility runs in an infinite loop by default. You can stop it by pressing Ctrl+C.

You do not need to send several requests to check the availability of a server. It is sufficient to send a single ICMP message instead. The -c option of the ping utility specifies the number of messages to send. Here is an example of how to use it: {line-numbers: false, format: Bash}

ping -c 1 google.com

If the google.com server is available, the utility returns the zero exit status. Otherwise, it returns non-zero.

The ping utility waits for the server's response until you interrupt it. The -W option limits the waiting time in seconds. We set it to one second and get the following command: {line-numbers: false, format: Bash}

ping -c 1 -W 1 google.com

Now we have the condition for the while statement. Let's write the whole statement like this: {line-numbers: true, format: Bash}

while ! ping -c 1 -W 1 google.com &> /dev/null
do
  sleep 1
done

The output of the ping utility is of no interest in our case. Therefore, we redirect it to the /dev/null file.

We invert the ping result in the while condition. Therefore, Bash executes the loop's body as long as the utility returns a non-zero exit status. It means that the loop continues as long as the server is unavailable.

We call the sleep utility in the loop's body. It stops the script for the specified number of seconds. The stop lasts for one second in our case.

I> You can specify a suffix for the parameter when using the sleep utility. It defines units of time. The suffix "s" matches seconds. It is "m" for minutes, "h" for hours and "d" for days.

Listing 3-18 shows a complete script for checking server availability.

{caption: "Listing 3-18. Script for checking server availability", line-numbers: true, format: Bash}

The while statement has an alternative form called until. Here the ACTION is executed as long as the CONDITION is "false". It means that the loop continues as long as the CONDITION returns a non-zero exit status. Use the until statement when you want to invert the condition of the while loop.

Here is the until statement in general: {line-numbers: true}

until CONDITION
do
  ACTION
done

You can write it in one line the same way as while: {line-numbers: false}

until CONDITION; do ACTION; done
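Here is a small sketch of the until statement. The count variable and the limit are hypothetical. The loop continues as long as the condition returns a non-zero exit status.

```bash
#!/bin/bash

# Hypothetical counter, used only for this illustration.
count=0

# The body runs until the condition becomes "true",
# i.e. until count reaches 3.
until [[ "$count" -ge 3 ]]
do
  echo "iteration $count"
  count=$(( count + 1 ))
done

echo "the loop has finished, count = $count"
```

The same loop written with while would need the inverted condition: while [[ "$count" -lt 3 ]].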

Let's replace the while statement with until in Listing 3-18. You should remove the negation of the ping utility result for that. Listing 3-19 shows the resulting script.

{caption: "Listing 3-19. Script for checking server availability", line-numbers: true, format: Bash}

The scripts in Listing 3-18 and Listing 3-19 behave the same.

Choose the while or until statement depending on the loop condition. Try to avoid negations in conditions. Negations make the code harder to read.

Infinite Loop

The while statement is well suited for implementing infinite loops. This kind of loop continues as long as the program is running.

You can find infinite loops in system software that runs until the computer is powered off. Examples are OS or microcontroller firmware. Computer games and monitor programs for collecting statistics also use such loops.

The while loop becomes infinite if its condition is always true. The easiest way to set such a condition is to call the true command. Here is an example of using it: {line-numbers: true, format: Bash}

while true
do
  sleep 1
done

The true command always returns the "true" value. It means that it returns zero exit status. There is the symmetric command called false. It always returns exit status one that matches the "false" value.
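You can check the exit statuses of both commands yourself. The following sketch stores them in two variables. The variable names are chosen for this illustration only.

```bash
#!/bin/bash

# Default values. They stay unchanged if a command succeeds.
true_status=0
false_status=0

# The right part of || runs only when the command on the left fails.
true  || true_status=$?
false || false_status=$?

echo "true returned $true_status"
echo "false returned $false_status"
```

The true command leaves its variable at zero. The false command fails, so the assignment after || stores its exit status.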

I> Words "true" and "false" are literals in most programming languages. Literals are reserved words for representing fixed values. In our case, they represent the "true" and "false" values.

You can replace the true command in the while condition with a colon. This way, you get the following: {line-numbers: true, format: Bash}

while :
do
  sleep 1
done

The colon is a synonym for the true command. It exists for compatibility with the Bourne shell. This shell does not have true and false commands. Therefore, Bourne shell scripts use colons, and Bash should support them.

The POSIX standard includes all three commands: colon, true, and false. However, avoid using a colon in your scripts. It is an old-fashioned syntax that makes your code harder to understand.

Here is an example of an infinite loop. We want to write a script that displays statistics of disk space usage. The df utility can help us in this case. It prints the following when called without parameters: {line-numbers: true, format: Bash}

$ df
Filesystem     1K-blocks      Used Available Use% Mounted on
C:/msys64       41940988  24666880  17274108  59% /
Z:             195059116 110151748  84907368  57% /z

The utility shows "Used" and "Available" disk space in 1-kilobyte blocks, as the "1K-blocks" column header suggests. We can add the -h option to the utility call. Then it shows sizes in human-readable units: kilobytes, megabytes, gigabytes and terabytes. Also, we add the -T option. It shows the file system type for each disk. This way, we get the following output: {line-numbers: true, format: Bash}

$ df -hT
Filesystem     Type  Size  Used Avail Use% Mounted on
C:/msys64      ntfs   40G   24G   17G  59% /
Z:             hgfs  187G  106G   81G  57% /z

If you want to get information about all mount points, add the -a option.

Now let's write an infinite loop. It calls the df utility on each iteration. This way, we get a simple script to monitor the file system. Listing 3-20 shows the script.

{caption: "Listing 3-20. The script to monitor the file system", line-numbers: true, format: Bash}

The first action of the loop iteration is the clear utility call. It clears the terminal window of text. Thanks to this step, the terminal shows the output of our script only.

Executing a command in a loop is a common task that arises when working with Bash. The watch utility solves this task. The utility is a part of the procps package. The following command installs this package to the MSYS2 environment: {line-numbers: false, format: Bash}

pacman -S procps

Now you can replace the script from Listing 3-20 with a single command. It looks like this: {line-numbers: false, format: Bash}

watch -n 2 "df -hT"

The -n option of the watch utility specifies the interval between command calls. The command to execute follows all utility options.

The -d option of the utility highlights the difference between the command's output at the current iteration and at the previous one. This way, it is easier to keep track of changes that have occurred.

Reading a Standard Input Stream

The while loop fits well for handling an input stream. Here is an example of such a task. We want to write a script that reads a text file. The script makes an associative array from the file's content.

Listing 3-10 shows the script for managing the list of contacts. The script stores contacts in the format of the Bash array declaration. It makes adding a new person to the list inconvenient. The user must know the Bash syntax. Otherwise, they can make a mistake when initializing an array element. It will break the script.

We can solve the problem of editing the contacts list. Let's put the list in a separate text file. Our script should read it at startup. This way, we separate data and code. It is a well-known and good practice in software development.

Listing 3-21 shows a possible format of the file with contacts.

{caption: "Listing 3-21. The file with contacts contacts.txt", line-numbers: true, format: text}

Let's write a script to read this file. It is convenient to read the list of contacts directly into the associative array. This way, we keep the searching mechanism over the list as efficient as before.

When reading the file, we should process its lines in the same manner. It means that we will repeat our actions. Therefore, we need a loop statement. At the beginning of the loop, we don't know the size of the file. Thus, we do not know the number of iterations to do. The while statement fits this case perfectly.

Why do we not know the number of iterations in advance? It happens because the script reads the file line by line. It cannot count the lines before it reads them all. We can make two loops. The first one counts the lines. The second loop processes them. However, this solution is slower and less efficient.

We can use the read built-in command for reading lines of the file. The command receives a string from the standard input stream. Then it writes the string into the specified variable. You can pass the variable's name as a parameter. Here is an example of doing that: {line-numbers: false, format: Bash}

read var

Run this command. Then type the string and press Enter. The read command writes your string into the var variable. You can call read without parameters. It writes the string into the reserved variable REPLY in this case.

When read receives the string, it treats backslashes \ as escape characters and removes them. The -r option disables this behavior. Always use it to avoid losing characters of the input string.
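The effect of the -r option can be seen in a short sketch. The input string and the variable names are made up for this example. The <<< operator (here-string) passes a fixed string to the standard input of the command, so you do not need to type anything.

```bash
#!/bin/bash

# Hypothetical input with backslashes.
input='C:\temp\new'

# Without -r, the command treats each backslash as an escape
# character and removes it.
read escaped <<< "$input"

# With -r, the command keeps the string as is.
read -r raw <<< "$input"

echo "without -r: $escaped"
echo "with -r:    $raw"
```

The escaped variable gets the C:tempnew string, while the raw variable keeps all backslashes.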

You can pass several variable names to the read command. Then it divides the input text into parts. The command uses the delimiters from the reserved variable IFS in this case. Default delimiters are spaces, tabs and line breaks.

Here is an example of multiple variables for the read command. Suppose that we want to store the input string into two variables. They are called path and file. The following command reads them: {line-numbers: false, format: Bash}

read -r path file

The user types the following string for this command: {line-numbers: false, format: text}

~/Documents report.txt

Then the read command writes the ~/Documents path into the path variable. The filename report.txt comes into the file variable.

If the path or filename contains spaces, an error occurs. Suppose the user types the following string: {line-numbers: false, format: text}

~/My Documents report.txt

Then the command writes the ~/My string into the path variable. The file variable stores the rest of the input: Documents report.txt. This is a wrong result. Don't forget about such behavior when using the read command.

We can solve the problem of splitting the input string. This can be done by redefining the IFS variable. Here is an example that specifies the comma as the only delimiter: {line-numbers: false, format: text}

IFS=$',' read -r path file

Here we have applied the Bash-specific type of quotes $'...'. Bash does not perform any expansions inside them. At the same time, you can place some control sequences there: \n (new line), \\ (escaped backslash), \t (tabulation) and \xnn (bytes in hexadecimal).
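The following sketch shows both properties of the $'...' quotes. The variable names and strings are chosen for this illustration only.

```bash
#!/bin/bash

# The \n control sequence becomes a real line break.
two_lines=$'first\nsecond'

# Bash does not expand variables inside $'...',
# so the dollar sign stays as is.
literal=$'$HOME'

printf '%s\n' "$two_lines"
printf '%s\n' "$literal"
```

The first printf call prints two lines. The second one prints the $HOME string itself instead of the home directory path.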

The new IFS declaration allows us to process the following input string properly: {line-numbers: true, format: text}

~/My Documents,report.txt

The comma separates the path and filename. Therefore, the read command writes the ~/My Documents string into the path variable. The report.txt string comes into the file variable.
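You can try this splitting without typing the input manually. The sketch below passes the same string to read via the <<< operator (here-string). Using this operator is an assumption of the example.

```bash
#!/bin/bash

# The IFS assignment before the command affects this read call only.
IFS=',' read -r path file <<< "~/My Documents,report.txt"

echo "path: $path"
echo "file: $file"
```

The space inside the path no longer splits the string because the comma is the only delimiter. Bash keeps the tilde as is because it stands inside double-quotes.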

The read command receives data from the standard input stream. It means that you can redirect the file's content to the command.

Here is an example to read the first line of the contacts.txt file from Listing 3-21. The following command does it: {line-numbers: false, format: Bash}

read -r contact < contacts.txt

This command writes the "Alice=alice@gmail.com" string into the contact variable.

We can write the name and contact information into two different variables. Let's define the equal sign as a delimiter to do that. Then our read command looks like this: {line-numbers: false, format: Bash}

IFS=$'=' read -r name contact < contacts.txt

Now the name variable gets the "Alice" name. The e-mail address comes into the contact variable.

Let's try the following while loop for reading the entire contacts.txt file: {line-numbers: true, format: Bash}

while IFS=$'=' read -r name contact < "contacts.txt"
do
  echo "$name = $contact"
done

Unfortunately, it does not work. Here we got an infinite loop accidentally. The redirection belongs to the read command. Therefore, Bash opens the file anew on each check of the loop condition, and the read command always reads only the first line of the file. Then the command returns the zero exit status. The zero status leads to the loop body execution. It happens over and over again.

We should force the while loop to pass through all lines of the file. The following form of the loop does it: {line-numbers: true, format: Bash}

while CONDITION
do
  ACTION
done < FILE

You can handle the input from the keyboard this way. Specify the /dev/tty file in this case. The loop will read keystrokes until you press Ctrl+D.

Here is the right while loop to read the contacts.txt file: {line-numbers: true, format: Bash}

while IFS=$'=' read -r name contact
do
  echo "$name = $contact"
done < "contacts.txt"

This loop prints the entire contents of the contacts.txt file.

There is the last step left to finish our task. We should write the name and contact variables to the array on each iteration. The name variable is the key and contact is the value.

Listing 3-22 shows the final version of the script for reading the contacts from the file.

{caption: "Listing 3-22. The script for managing the contacts", line-numbers: true, format: Bash}

This script behaves the same way as one in Listing 3-10.
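The reading step can be sketched as follows. This is not the author's listing. A temporary file with made-up entries stands in for contacts.txt here. Only the first entry comes from the text above. The contacts array name is an assumption too.

```bash
#!/bin/bash

# A temporary file replaces contacts.txt in this sketch.
# The second entry is hypothetical.
contacts_file="$(mktemp)"
printf '%s\n' "Alice=alice@gmail.com" "Bob=bob@example.com" > "$contacts_file"

# Associative arrays require Bash version 4 or newer.
declare -A contacts

# The equal sign separates the key and the value in each line.
while IFS='=' read -r name contact
do
  contacts["$name"]="$contact"
done < "$contacts_file"

rm -f "$contacts_file"

echo "Alice: ${contacts["Alice"]}"
echo "Number of contacts: ${#contacts[@]}"
```

After the loop, you can look up a contact by name the same way as in a script with a hard-coded array.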

For Statement

There is another loop statement in Bash called for. Unlike while, use it when you know the number of iterations in advance.

The for statement has two forms. The first one processes words in a string sequentially. The second form applies an arithmetic expression in the loop's condition.

The First Form of For

Let's start with the first form of the for statement. It looks like this in general: {line-numbers: true, format: Bash}

for VARIABLE in STRING
do
  ACTION
done

You can write the same construction in a single line like this: {line-numbers: false, format: Bash}

for VARIABLE in STRING; do ACTION; done

The ACTION in the for statement is a single command or a block of commands. It is the same as in the while statement.

Bash performs all expansions in the for condition before starting the first iteration of the loop. What does it mean? Suppose you specified a command instead of the STRING. Then Bash executes this command and replaces it with its output. Also, you can specify a pattern instead of the STRING. Then Bash expands it before starting the loop.

Bash splits the STRING into words when there are no commands or patterns left in the for condition. It takes separators for splitting from the IFS variable.

Then Bash executes the first iteration of the loop. The first word of the STRING is available via VARIABLE inside the loop body on the first iteration. Then Bash writes the second word of the STRING to the VARIABLE and starts the second iteration. It happens again and again until we pass all words of the STRING.
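Here is a minimal sketch of this mechanism. The string variable and the word_count counter are made up for the illustration. The variable in the for condition has no quotes on purpose, so Bash splits its value into words.

```bash
#!/bin/bash

# Hypothetical input string.
string="this is a string"
word_count=0

# No quotes around $string: Bash splits it by the IFS delimiters.
for word in $string
do
  echo "$word"
  word_count=$(( word_count + 1 ))
done

echo "words: $word_count"
```

The loop does four iterations, one for each word.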

Here is an example of the for loop. We want to write a script to print words in a string one by one. The script receives the string via the first parameter.

Listing 3-23 shows the script.

{caption: "Listing 3-23. The script for printing words of a string", line-numbers: true, format: Bash}

Here you should not enclose the positional parameter $1 in quotes. Quotes prevent word splitting, but we want it in this case. Otherwise, Bash passes the whole string to the first iteration of the for loop. Then the loop finishes. We do not want this behavior. The script should process each word of the string separately.

When you call the script, you should enclose the input string in double-quotes. Then the whole string comes into the $1 parameter. Here is an example of calling the script: {line-numbers: false, format: Bash}

./for-string.sh "this is a string"

There is a way to get rid of the double-quotes when calling the script. Replace the $1 parameter in the for condition with $@. Then the loop statement becomes like this: {line-numbers: true, format: Bash}

for word in $@
do
  echo "$word"
done

Now both of the following script calls work properly: {line-numbers: true, format: Bash}

./for-string.sh this is a string
./for-string.sh "this is a string"

The for loop condition has a short form. Use it when you want to pass through all input parameters of the script. This short form looks like this: {line-numbers: true, format: Bash}

for word
do
  echo "$word"
done

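This form is equivalent to the loop with the "in "$@"" part, where $@ is enclosed in double-quotes. Each input parameter comes to its own iteration unchanged. Unlike the loop with the unquoted $@, Bash does not split a parameter with spaces into separate words here.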
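The short form can be tried without a separate script file. A function receives positional parameters the same way as a script does. The print_params function name is made up for this sketch.

```bash
#!/bin/bash

# Hypothetical helper. It prints each parameter in brackets.
print_params()
{
  # The loop without the "in" part iterates over positional parameters.
  for param
  do
    echo "[$param]"
  done
}

print_params "this is" a string
```

The call prints three lines. The first parameter keeps its space because the short form does not apply word splitting to the parameters.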

Let's make the task a bit more complicated. Suppose the script receives a list of paths on input. Commas separate them. The paths may contain spaces. We should redefine the IFS variable to process such input correctly.

Listing 3-24 shows the for loop to print the list of paths.

{caption: "Listing 3-24. The script for printing the list of paths", line-numbers: true, format: Bash}

We have specified only one allowable delimiter in the IFS variable. The delimiter is the comma. Therefore, the for loop ignores spaces when splitting the input string.
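A possible shape of such a loop is sketched below. It is an illustration, not the author's listing. The input string is hard-coded instead of coming from the $1 parameter. The old_ifs and paths names are assumptions.

```bash
#!/bin/bash

# Hypothetical input. A real script would take it from $1.
input="~/My Documents/file1.pdf,~/My Documents/report2.txt"

# Save the current delimiters to restore them after the loop.
old_ifs="$IFS"
IFS=','

paths=()

# No quotes around $input: Bash splits it by commas only.
for path in $input
do
  echo "$path"
  paths+=("$path")
done

IFS="$old_ifs"
```

Restoring the IFS variable after the loop is a precaution. Otherwise, the new delimiter affects all following word splitting in the script.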

You can call the script this way: {line-numbers: false, format: Bash}

./for-path.sh "~/My Documents/file1.pdf,~/My Documents/report2.txt"

Here double-quotes for the input string are mandatory. You cannot replace the $1 parameter with $@ in the for condition and omit quotes. This will lead to an error. The error happens because Bash does word splitting when calling the script. This word splitting applies spaces as delimiters. It happens before our redeclaration of the IFS variable. Thus, Bash ignores our change of the variable in this case.

If there is a comma in one of the paths, it leads to an error.

The for loop can pass through the elements of an indexed array. It works the same way as processing words in a string. Listing 3-25 shows an example of doing that.

{caption: "Listing 3-25. The script for printing all elements of the array", line-numbers: true, format: Bash}

Suppose you need the first three elements. Then you should expand only the elements you need in the loop condition. Listing 3-26 shows how to do that.

{caption: "Listing 3-26. The script for printing the first three elements of the array", line-numbers: true, format: Bash}
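One way to expand only the first three elements is the "${array[@]:0:3}" expansion. It takes three elements starting from index 0. The sketch below assumes this approach and the same array as in the following examples. The printed array is a helper for the illustration.

```bash
#!/bin/bash

array=(Alice Bob Eve Mallory)
printed=()

# Expand three elements starting from index 0.
for name in "${array[@]:0:3}"
do
  echo "$name"
  printed+=("$name")
done
```

The loop prints Alice, Bob and Eve. The double-quotes keep each element intact even if it contains spaces.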

There is another option to pass through the array. You can iterate over the indexes instead of the array's elements. Write the string with indexes of the elements you need. Spaces should separate them. Put the string into the for condition. Then the loop gives you an index on each iteration. The loop looks like this: {line-numbers: true, format: Bash}

array=(Alice Bob Eve Mallory)

for i in 0 1 2
do
  echo "${array[i]}"
done

This loop passes only through elements with indexes 0, 1 and 2.

You can apply the brace expansion to specify the indexes list. Here is an example: {line-numbers: true, format: Bash}

array=(Alice Bob Eve Mallory)

for i in {0..2}
do
  echo "${array[i]}"
done

The loop behaves the same way. It prints the first three elements of the array.

Do not iterate over the element's indexes when processing arrays with gaps. Expand the array's elements in the loop condition instead. Listing 3-25 and Listing 3-26 show how to do that.
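The following sketch shows what goes wrong with gaps. It removes one element of the array with the unset command and then passes through the array in both ways. The names and empty_count variables are made up for the demonstration.

```bash
#!/bin/bash

array=(Alice Bob Eve Mallory)

# Remove the element with index 1. Now the array has a gap.
unset 'array[1]'

# Iterating over the indexes hits the gap and gets an empty string.
empty_count=0
for i in {0..2}
do
  if [[ -z "${array[i]}" ]]
  then
    empty_count=$(( empty_count + 1 ))
  fi
done
echo "empty elements met: $empty_count"

# Expanding the elements skips the gap.
names=()
for name in "${array[@]}"
do
  names+=("$name")
done
echo "existing elements: ${names[*]}"
```

The first loop meets one empty value at index 1. The second loop gets only the three existing elements.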

Files Processing

The for loop fits well for processing a list of files. When solving this task, you should compose the loop condition correctly. There are several common mistakes here. Let's consider them by examples.

The first example is a script that prints types of files in the current directory. We can do it by calling the file utility for each file.

The most common mistake when composing the for loop condition is neglecting patterns (globbing). Users often call the ls or find utility to get the STRING. It happens this way: {line-numbers: true, format: Bash}

for filename in $(ls)
for filename in $(find . -type f)

This is wrong. Such a solution leads to the following problems:

Word splitting breaks names of files and directories with spaces.
If the filename contains an asterisk, Bash performs globbing before starting the loop. Then it writes the expansion result to the filename variable. This way, you lose the actual filename.
The output of the ls utility depends on the regional settings. Therefore, you can get question marks instead of the national alphabet characters in filenames. Then the for loop cannot process these files.
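The first problem is easy to reproduce. The sketch below creates a temporary directory with one file that has a space in its name. Then it counts the loop iterations for both approaches. The directory, the filename and the counters are made up for this demonstration.

```bash
#!/bin/bash

# Temporary directory with a single file for the experiment.
work_dir="$(mktemp -d)"
touch "$work_dir/my report.txt"

# Wrong: word splitting breaks the filename into two words.
ls_count=0
for filename in $(ls "$work_dir")
do
  ls_count=$(( ls_count + 1 ))
done

# Right: the pattern gives one word per file.
glob_count=0
for filename in "$work_dir"/*
do
  glob_count=$(( glob_count + 1 ))
done

rm -r "$work_dir"

echo "iterations with ls: $ls_count"
echo "iterations with the pattern: $glob_count"
```

The loop over the ls output does two iterations for the single file. The loop with the pattern does one.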

Always use patterns in the for loop to enumerate filenames. It is the only correct solution for this task.

We should write the following for loop condition in our case: {line-numbers: false, format: Bash}

for filename in *

Listing 3-27 shows the complete script.

{caption: "Listing 3-27. The script for printing the file types", line-numbers: true, format: Bash}

Do not forget to use double-quotes when accessing the filename variable. They prevent word splitting of filenames with spaces.

You can still use the pattern in the for loop condition if you want to process files from a specific directory. Here is an example of such a pattern: {line-numbers: false, format: Bash}

for filename in /usr/share/doc/bash/*

A pattern can filter out files with a specific extension or name. It looks like this: {line-numbers: false, format: Bash}

for filename in ~/Documents/*.pdf

There is a new feature for patterns in Bash version 4. You can pass through directories recursively. Here is an example: {line-numbers: true, format: Bash}

shopt -s globstar

for filename in **

This feature is disabled by default. Activate it by enabling the globstar interpreter option with the shopt command.

When Bash meets the ** pattern, it inserts a list of all subdirectories and their files starting from the current directory. You can combine this mechanism with regular patterns.

For example, let's process all files with the PDF extension from the user's home directory. The following for loop condition does that: {line-numbers: true, format: Bash}

shopt -s globstar

for filename in ~/**/*.pdf
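You can check the recursive pattern on a small directory tree. The sketch below creates such a tree in a temporary directory instead of touching your home directory. All names in it are hypothetical.

```bash
#!/bin/bash

# The globstar option requires Bash version 4 or newer.
shopt -s globstar

# Temporary directory tree for the experiment.
work_dir="$(mktemp -d)"
mkdir -p "$work_dir/a/b"
touch "$work_dir/top.pdf" "$work_dir/a/mid.pdf" \
      "$work_dir/a/b/deep.pdf" "$work_dir/a/skip.txt"

pdf_count=0

# The **/ part matches zero or more nested directories.
for filename in "$work_dir"/**/*.pdf
do
  echo "$filename"
  pdf_count=$(( pdf_count + 1 ))
done

rm -r "$work_dir"

echo "PDF files found: $pdf_count"
```

The loop finds three PDF files on all levels of the tree and skips the text file.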

There is another common mistake when using the for loop. Sometimes you just do not need it. For example, you can replace the script in Listing 3-27 with the following find call: {line-numbers: false, format: Bash}

find . -maxdepth 1 -exec file {} \;

This command is more efficient than the for loop. It is compact and works faster because it performs fewer operations.

When should you use the for loop instead of the find utility? Use find when one short command processes found files. If you need a conditional statement or block of commands for this job, use the for loop.

There are cases when patterns are not enough in the for loop condition. You want to do a complex search with checking file types, for example. In this case, use the while loop instead of for.

Let's replace the for loop in Listing 3-27 with while. The find utility will provide us a list of files. But we should call it with the -print0 option. This way, we avoid word splitting issues. Listing 3-28 shows how to combine the find utility and while loop properly.

{caption: "Listing 3-28. The script for printing the file types", line-numbers: true, format: Bash}

There are several tricky solutions in this script. Let's take a closer look at them. The first question is why we need to assign an empty value to the IFS variable. If we keep the variable unchanged, the read command strips the default delimiters (spaces, tabs and line breaks) at the beginning and end of its input. It can damage filenames that start or end with these characters.

The second solution is applying the -d option of the read command. The option defines a delimiter character for splitting the input text. When using it, the filename variable gets the part of the string that comes before the next delimiter.

In our case, the -d option receives an empty string as the delimiter. Bash treats it as a NULL character. You can also specify it explicitly. Do it like this: {line-numbers: false, format: Bash}

while IFS= read -r -d $'\0' filename

Thanks to the -d option, the read command handles the find output correctly. There is the -print0 option in the utility call. It means that find separates found files by a NULL character. This way, we reconcile the read input format and the find output.

Note that you cannot specify a NULL character as a delimiter using the IFS variable. In other words, the following solution does not work: {line-numbers: false, format: Bash}

while IFS=$'\0' read -r filename

The problem comes from the peculiarity when interpreting the IFS variable. If the variable is empty, Bash does not do word splitting at all. When you assign a NULL character to the variable, it means an empty value for Bash.

The last tricky solution in Listing 3-28 is process substitution. We use it to pass the find output to the while loop. Why did we not use command substitution instead? It would look like this: {line-numbers: true, format: Bash}

while IFS= read -r -d '' filename
do
  file "$filename"
done < $(find . -maxdepth 1 -print0)

Unfortunately, this redirection does not work. The < operator expects a filename and connects that file to the input stream of the loop. Command substitution does not provide a file. Bash calls the find utility and inserts its output as text instead of $(...), so the < operator receives the find output where it expects a filename. When you use process substitution, Bash connects the find output to a pipe and replaces <(...) with a name that refers to this pipe. The < operator can open this name. Therefore, the stream redirection works fine.
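The difference between the two substitutions is easy to see if you print what each one expands to. This is only an illustrative sketch: the variable names are assumptions, and the exact path depends on the system (on Linux it typically looks like /dev/fd/63).

```bash
#!/bin/bash

# Process substitution expands to a file name, which is what < needs.
fd_path=$(echo <(true))
echo "Process substitution gives the path: $fd_path"

# Command substitution expands to plain text (here the word "sample").
text=$(echo "sample")
echo "Command substitution gives the text: $text"
```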

There is only one issue with process substitution. It is not part of the POSIX standard. If you need to follow the standard, use a pipeline instead. Listing 3-29 demonstrates how to do it.

{caption: "Listing 3-29. The script for printing the file types", line-numbers: true, format: Bash}

Combine the while loop and find utility only when both of the following conditions hold:

1. You need a conditional statement or code block to process files.
2. You have a complex condition for searching files.

When combining while and find, always use a NULL character as a delimiter. This way, you avoid the word splitting problems.
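Here is a hedged sketch of the pipeline variant that avoids process substitution. The print_file_types_pipe function name and its directory argument are assumptions made for the example.

```bash
#!/bin/bash

# Hypothetical helper: prints the type of every object directly inside a directory.
# The find utility feeds the loop through a pipe, with NULL as the delimiter.
print_file_types_pipe() {
  local dir="${1:-.}"
  local filename

  find "$dir" -maxdepth 1 -print0 |
  while IFS= read -r -d '' filename
  do
    file "$filename"
  done
}
```

Keep in mind that Bash by default runs each part of a pipeline in a subshell. Variables that you change inside the loop body are not visible after the loop finishes.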

The Second Form of For

The second form of the for statement allows you to specify an arithmetic expression as a condition. Let's consider cases when you need it.

Suppose we need a script to calculate the factorial. The solution for this task depends on the way we enter the data. The first option is a predefined integer. Then the first form of the for loop fits well. Listing 3-30 shows this solution.

{caption: "Listing 3-30. The script for calculating the factorial for integer 5", line-numbers: true, format: Bash}

The second option is to receive the integer as an input parameter of the script. We can try the following loop condition to process the $1 parameter: {line-numbers: false, format: Bash}

for i in {1..$1}

We expect that Bash will do brace expansion for integers from one to the $1 value. However, it does not work this way.

According to Table 3-2, the brace expansion happens before the parameter expansion. At that moment, the loop condition contains the string "{1..$1}" instead of "{1..5}". Bash does not recognize the brace expansion because the upper bound of the range is not an integer. Then the parameter expansion replaces $1 with its value. If the value is 5, Bash writes the "{1..5}" string to the i variable instead of iterating over "1 2 3 4 5". Therefore, the following (( operator fails.
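The following sketch shows the difference. It uses a regular variable n in place of the $1 parameter so that you can run it without arguments. The variable name is an assumption made for the example.

```bash
#!/bin/bash

# A literal range expands to five separate words.
for i in {1..5}
do
  echo "literal range: $i"
done

# With a variable bound, brace expansion is skipped and the loop runs once.
n=5
for i in {1..$n}
do
  echo "variable range: $i"
done
```

The second loop does a single iteration and prints "variable range: {1..5}".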

The seq utility can solve our problem. It generates a sequence of integers or fractions.

Table 3-21 shows the ways to call the seq utility.

{caption: "Table 3-21. The ways to call the seq utility", width: "100%"}

| Number of parameters | Description | Example | Result |
| --- | --- | --- | --- |
| 1 | The parameter defines the last number in the generated sequence. The sequence starts with one. | `seq 5` | `1 2 3 4 5` |
| 2 | The parameters are the first and last numbers of the sequence. | `seq -3 3` | `-3 -2 -1 0 1 2 3` |
| 3 | The parameters are the first number, the step and the last number of the sequence. | `seq 1 2 5` | `1 3 5` |

The seq utility separates the integers in its output with line breaks. The -s option allows you to specify another delimiter. The IFS variable contains the line break symbol by default, so Bash splits the seq output into words correctly. Therefore, you do not need the -s option in our case.

The "Result" column of Table 3-21 shows spaces instead of line breaks. This is done for convenience.
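You can reproduce the three kinds of calls in the terminal. In this sketch, the unquoted command substitution lets Bash split the seq output at line breaks, and echo joins the words with spaces. It assumes that the seq utility is installed.

```bash
#!/bin/bash

# One parameter: the sequence from one to the given number.
echo $(seq 5)        # 1 2 3 4 5

# Two parameters: the first and last numbers.
echo $(seq -3 3)     # -3 -2 -1 0 1 2 3

# Three parameters: the first number, the step and the last number.
echo $(seq 1 2 5)    # 1 3 5
```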

Let's apply the seq utility and write the script to calculate a factorial for any integer. Listing 3-31 shows this script.

{caption: "Listing 3-31. The script for calculating a factorial", line-numbers: true, format: Bash}

This solution works properly. However, it is inefficient. The performance overhead comes from calling the external seq utility. Bash has to launch it as a separate process, the same way the OS launches any application (for example, Windows Calculator). The OS kernel performs several complicated operations whenever Bash creates a new process. They take significant time on the processor's scale. Therefore, apply the built-in Bash commands whenever possible.

We need the second form of the for statement to solve the task efficiently. This form looks like this in general: {line-numbers: true, format: Bash}
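For reference, here is a minimal sketch of the seq-based approach. It wraps the loop in a function so that the example runs without script parameters. The factorial_seq name is an assumption, not a name from the original listing.

```bash
#!/bin/bash

# Hypothetical helper: factorial of the first argument via the external seq utility.
factorial_seq() {
  local result=1
  local i

  # Unquoted $(seq ...) is split into words at the line breaks.
  for i in $(seq "$1")
  do
    ((result *= i))
  done

  echo "$result"
}

factorial_seq 5
```

The last line prints 120. Every call of the function starts one extra process for seq.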

for (( EXPRESSION_1; EXPRESSION_2; EXPRESSION_3 ))
do
  ACTION
done

You can write it in one line this way: {line-numbers: false, format: Bash}

for (( EXPRESSION_1; EXPRESSION_2; EXPRESSION_3 )); do ACTION; done

Bash executes the for loop with an arithmetic condition this way:

1. Bash calculates the EXPRESSION_1 once before the first iteration of the loop.
2. The loop continues as long as EXPRESSION_2 remains true. The loop stops when the expression evaluates to "false".
3. Bash calculates the EXPRESSION_3 at the end of each iteration.
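The order of these steps becomes clearer if you write the same loop with the while statement. The sketch below counts from one to three both ways. The i and j counter names are chosen for the illustration only.

```bash
#!/bin/bash

# Arithmetic for loop: counts from 1 to 3.
for (( i = 1; i <= 3; ++i ))
do
  echo "for: $i"
done

# The same steps written out by hand.
(( j = 1 ))              # EXPRESSION_1 runs once
while (( j <= 3 ))       # EXPRESSION_2 is checked before every iteration
do
  echo "while: $j"
  (( ++j ))              # EXPRESSION_3 runs after the body
done
```

Both loops print the numbers 1, 2 and 3. The for form keeps all three expressions in one place, which makes the counter logic easier to read.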

Let's replace the seq utility call with the arithmetic expression in Listing 3-31. Listing 3-32 shows the result.

{caption: "Listing 3-32. The script for calculating a factorial", line-numbers: true, format: Bash}

This script works faster. It uses Bash built-in commands only. There is no need to create new processes here.

The for statement in the script follows this algorithm:

1. Declare the i variable before the first iteration of the loop. Assign it integer 1. The variable is a loop counter.
2. Compare the loop counter with the input parameter $1.
3. If the counter is less than or equal to the $1 parameter, do the loop iteration.
4. If the counter is greater than the parameter, stop the loop.
5. Calculate the arithmetic expression "result *= i" in the loop's body. It multiplies the result variable by i.
6. When the loop iteration is done, calculate the "++i" expression of the for condition. It increments the i variable by one.
7. Go to the 2nd step of the algorithm.
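The algorithm above can be sketched as a function. This is an assumption about the shape of the code, not a copy of the listing: the factorial name is hypothetical, and $1 here is the first argument of the function rather than of the script.

```bash
#!/bin/bash

# Hypothetical helper: factorial of the first argument using only Bash built-ins.
factorial() {
  local result=1
  local i

  # Steps 1-4: initialize the counter, then compare it with $1 before each iteration.
  for (( i = 1; i <= $1; ++i ))
  do
    # Step 5: multiply the accumulated result by the counter.
    ((result *= i))
  done

  echo "$result"
}

factorial 5
```

The call in the last line prints 120.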

I> In the general case, you do not need the dollar sign for variable names in the (( operator and let command. However, it is necessary for the code in Listing 3-32. Without the sign, Bash confuses the $1 parameter and literal 1.
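A short sketch of this rule follows. The set command simulates calling the script with the parameter 5, and the variable names are assumptions made for the example.

```bash
#!/bin/bash

set -- 5          # simulate calling the script with 5 as the first parameter
count=3

(( with_name = count + 1 ))    # plain 1 is the literal number one
(( with_param = count + $1 ))  # $1 is the first positional parameter

echo "$with_name $with_param"
```

The script prints "4 8". The count variable works without the dollar sign, but the positional parameter needs it.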

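We use the prefix increment form in the loop. The postfix form i++ gives the same result here because the loop does not use the value of the expression itself. Any speed difference between the two forms is negligible in Bash.

Use the second form of the for statement whenever you need to calculate the loop counter. There are no other efficient solutions in this case.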
