Linux Sed Branching: Conditional Text Processing
Branching in sed allows you to create conditional workflows within your sed scripts.
Depending on the input or a specific condition, sed can choose to execute one set of commands over another. It’s similar to the “if-then-else” structures in programming.
Unconditional Branching
The b
command in sed
allows for unconditional branching. It’s a way to direct the flow of execution to another location in the script without evaluating any conditions.
echo "apple banana cherry" | sed '2b; s/apple/orange/'
Output:
orange banana cherry
2b;
: The b
command in sed
causes a branch to the end of the script, essentially skipping any subsequent commands.
Here, the command 2b
is prefixed with the line number 2
, so the branch will only be taken on the second line.
This means that if sed
is processing the second line (i.e., “banana”), it will skip any subsequent commands and move to the next line.
s/apple/orange/
: This is a substitution command. It tries to replace the first occurrence of “apple” with “orange”.
Conditional Branching After a Successful Substitution
The t
command is a way for sed
to execute a branch, but only after a successful substitution has been made.
If no substitution has taken place since the last input line was read or since the last t
command was executed, then no branching occurs.
This is useful when you want to perform additional operations only if a particular substitution happens.
echo "apple pie apple tart banana split" | sed '/apple/ { s/apple/peach/; t; s/pie/cobbler/; }'
Output:
peach cobbler peach tart banana split
Walkthrough:
- First, we target lines containing “apple” with the
/apple/
address. - Inside the curly braces
{}
, we make a series of commands to execute. - The
s/apple/peach/
command replaces “apple” with “peach”. - The
t
command checks if the above substitution was successful. If it was, it branches to the end of the commands inside the curly braces, skipping the next command. If no substitution was done, it continues executing the subsequent commands. - The
s/pie/cobbler/
command is only executed if the previouss/apple/peach/
substitution wasn’t done.
So what happened is:
- For “apple pie”:
s/apple/peach/
replaces “apple” with “peach”, resulting in “peach pie”.- Since the last substitution was successful, the
t
command causes a branch to the end of the block, skipping the next command. - Thus, the line remains “peach pie”.
- For “apple tart”:
s/apple/peach/
replaces “apple” with “peach”, resulting in “peach tart”.- Again, the
t
command causes a branch to the end of the block due to the successful substitution. - The line remains “peach tart”.
- For “banana split”:
- This line does not contain “apple”, so none of the commands inside the
/apple/ { ... }
block are executed. - The line remains “banana split”.
- This line does not contain “apple”, so none of the commands inside the
Labels in sed
Labels in sed
are essential when you’re incorporating branching mechanisms in your scripts.
Think of them as markers or destinations you can jump to with the b
, t
, and T
commands.
Defining Labels
Labels are defined using the :
command followed by the name of the label. For instance, :myLabel
defines a label named “myLabel”.
echo "dummy" | sed ':myLabel'
Output:
dummy
Here, the label does nothing by itself. It’s merely a marker waiting to be branched to.
Conditional Branching After an Unsuccessful Substitution
In contrast to the t
command, the T
command in sed
does branching when a substitution attempt is unsuccessful.
echo "apple pie apple tart banana split" | sed '/apple/ { s/apple/mango/; T end; s/pie/crisp/; :end }'
Output:
mango crisp mango tart banana split
Walkthrough:
- We initially target lines containing “apple” using the
/apple/
address. - Inside the
{}
, we have a sequence of commands. - The
s/apple/mango/
command aims to replace “apple” with “mango”. - The
T end
command checks if the previous substitution was unsuccessful. If the substitution didn’t take place, it will branch to the:end
label, skipping the next command. If a substitution did occur, it continues to the next command in sequence. - The
s/pie/crisp/
command is only executed if the precedings/apple/mango/
substitution was successful. :end
is a label that we can jump to using branching commands likeT
.
In our output, “apple pie” gets transformed to “mango crisp” because the “apple” to “mango” substitution was successful, so the execution proceeded to the next command, changing “pie” to “crisp”.
Branching by Line Number
You can direct the flow of execution based on both the line number and content-based conditions.
echo "apple cherry banana apple tart" | sed '2b branch; s/apple/peach/; t branch; s/cherry/berry/; :branch'
Output:
peach cherry banana peach tart
Walkthrough:
- We’ve defined a label called
branch
using:branch
. 2b branch
tellssed
to jump to thebranch
label directly when it’s processing the second line, skipping all commands that follow until the label.- For all other lines, the script attempts to replace “apple” with “peach”.
- If the replacement of “apple” with “peach” is successful, the
t branch
command will branch to the:branch
label, bypassing the next substitution. - The command
s/cherry/berry/
will only execute if the preceding substitution of “apple” to “peach” didn’t happen. However, it doesn’t have an effect here as the second line was already branched to:branch
directly, and no other line contains “cherry” after the first substitution.
From the output:
- “apple” gets converted to “peach”.
- “cherry” remains unchanged because it’s on the second line, which gets branched before any other command can be executed on it.
- “banana” remains unchanged as it doesn’t match any conditions.
- “apple tart” becomes “peach tart” due to the “apple” to “peach” substitution.
Creating Loops in sed
sed
doesn’t provide conventional loop structures like for
or while
, but with the clever use of branching commands and labels, you can emulate looping behavior.
Simple Looping with the b
Command
The b
command can create a straightforward infinite loop if not handled carefully.
echo "apple" | sed ':loopStart; s/apple/peach/; b loopStart'
Here, sed
would continuously attempt to replace “apple” with “peach”, looping infinitely because it keeps branching back to loopStart
.
Caution: This example will run indefinitely. It’s for illustrative purposes only.
Looping with a Condition using the t
Command
Loop until a condition fails.
echo "apple apple apple" | sed ':loopStart; s/apple/peach/; t loopStart'
Output:
peach peach peach
Walkthrough:
- The
s/apple/peach/
command replaces the first instance of “apple” with “peach”. - If the substitution is successful, the
t
command branches back toloopStart
, and the next “apple” is processed. - This loop continues until there are no more instances of “apple” left to replace, at which point the
t
command won’t branch, and the script ends.
Looping with a Negative Condition using the T
Command
Loop while a condition fails.
echo "apple apple banana apple" | sed ':loopStart; s/banana/orange/; T loopStart; s/apple/peach/'
Output:
peach apple orange apple
Walkthrough:
:loopStart;
: This defines a label namedloopStart
.s/banana/orange/;
: This attempts to substitute the first occurrence of “banana” with “orange”.- If the previous substitution (
s/banana/orange/
) was not successful, then it will branch (jump) to theloopStart
label. If the substitution was successful, it will proceed to the next command. s/apple/peach/;
: This substitutes the first occurrence of “apple” with “peach”.
Multi-line Scripting with Branching
In sed
, you can work with more than one line of input using commands like N
, D
, and P
.
When combined with branching, this allows for multi-line manipulations based on specific conditions.
Using the D
and P
Commands with Branching
The D
command deletes the first part of the pattern space up to the newline, while the P
command prints up to the first new line of the pattern space.
echo "apple pie banana tart" | sed ':start; N; s/apple\npie/peach tart/; T end; P; D; b start; :end'
Output:
peach tart banana tart
Walkthrough:
:start
: This defines a label namedstart
.N
: This appends the next line of input into the pattern space.s/apple\npie/peach tart/
: This tries to substitute the sequence of “apple” followed by a newline and then “pie” with the string “peach tart”.T end
: If the previous substitution was not successful, then it will branch (jump) to theend
label.P
: Prints the first line of the pattern space.D
: Deletes the first line of the pattern space and starts a new cycle with the remaining pattern space, without reading a new line of input.b start
: This branches (jumps) back to thestart
label.:end
: This defines a label namedend
.
Manipulating Multiple Lines with Conditional Branching
With the above commands, you can perform conditional operations across multiple lines.
echo "apple banana banana apple" | sed ':loop; N; s/banana\napple/orange\npeach/; t endLoop; P; D; b loop; :endLoop'
Output:
apple banana orange peach
Walkthrough:
:loop
: This defines a label namedloop
.N
: This appends the next line of input into the pattern space.s/banana\napple/orange\npeach/
: This attempts to substitute the sequence of “banana” followed by a newline and then “apple” with the string “orange” followed by a newline and then “peach”.t endLoop
: If the substitution in the previous step was successful, then it will branch (jump) to theendLoop
label.P
: This prints the first line of the pattern space.D
: Deletes the first line of the pattern space and starts a new cycle with the remaining pattern space, without reading a new line of input.b loop
: This always branches (jumps) back to theloop
label.:endLoop
: This defines a label namedendLoop
.
Real-world Examples
Let’s see some with real-world examples that a Linux system administrator or programmer might encounter.
Cleaning up a Configuration File
Imagine a configuration file where commented-out lines (#
prefixed) need to be removed. But, if a commented line contains the word “IMPORTANT”, it should be preserved.
echo -e "# This is a comment\n# IMPORTANT: Keep this\nvalue=42" | sed '/^#/!b keep; /IMPORTANT/b keep; d; :keep'
Output:
# IMPORTANT: Keep this value=42
Walkthrough:
- Lines not starting with
#
immediately branch to thekeep
label. - If a line contains “IMPORTANT”, it also branches to
keep
. - All other lines (i.e., non-important comments) are deleted.
Converting a Log File’s Timestamps
Let’s assume you have a log file with timestamps that need to be converted from “YYYY-MM-DD” to “DD/MM/YYYY”.
However, if a line does not contain a timestamp, you wish to append “(NO TIMESTAMP)” to it.
echo -e "Error at 2023-09-06\nWarning at 2023-09-07\nGeneral Error" | sed 's/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)/\3\/\2\/\1/; t; s/$/ (NO TIMESTAMP)/'
Output:
Error at 06/09/2023 Warning at 07/09/2023 General Error (NO TIMESTAMP)
Walkthrough:
- The script first attempts to substitute the date format.
- If successful, the
t
command causes it to skip the next command. - If unsuccessful, the script appends “(NO TIMESTAMP)” to the line.
Editing an XML File
Consider an XML file where you want to replace the content inside specific tags. If a <value>
tag contains the number “42”, you want to change it to “84”.
However, if the replacement happens, you also want to add a comment after the closing tag.
echo -e "<value>42</value>\n<value>23</value>" | sed '/<value>42<\/value>/ { s/42/84/; t addComment; b; :addComment; s/$/ <!-- Modified -->/; }'
Output:
<value>84</value> <!-- Modified --> <value>23</value>
Walkthrough:
- The script checks if a line matches
<value>42</value>
. - If it does, it replaces “42” with “84”.
- The
t
command then branches toaddComment
if the substitution was successful. - In
addComment
, the script appends a comment to the line.
Resource
https://www.gnu.org/software/sed/manual/html_node/Branching-and-flow-control.html
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.