Replace Text by Line and Column Using awk in Linux
awk is a scripting language designed for text processing, available as a command on Linux.
In this tutorial, you’ll learn to use awk for replacing text in Linux.
We’ll cover how to replace a specific line, the first, last, or every nth occurrence in a line, characters in a specific column, multiple characters, and the ending character of each line.
Replace By Line Number
Imagine you have a file named customer_data.txt with the following content:
1234,active
5678,inactive
9101,active
1213,inactive
1415,active
To replace the 3rd line of this file, you can use the following awk command:
awk 'NR==3 {$0="9101,updated"} 1' customer_data.txt
Output:
1234,active
5678,inactive
9101,updated
1213,inactive
1415,active
This command specifies that for the 3rd line (NR==3), the entire line ($0) should be replaced with “9101,updated”.
The 1 at the end of the command is a pattern that is always true, so every line is printed, modified or not.
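The line number and replacement text do not have to be hard-coded. The following is a minimal sketch, assuming you want to pass them in with awk's -v option; the variable names n and new and the temporary directory are illustrative, not part of the original command. Because awk writes to standard output and leaves the input file untouched, the sketch also shows one common way to save the change: write to a temporary file, then rename it.

```shell
# Recreate the sample file in a temporary directory (illustrative setup).
tmpdir=$(mktemp -d)
file="$tmpdir/customer_data.txt"
printf '%s\n' '1234,active' '5678,inactive' '9101,active' '1213,inactive' '1415,active' > "$file"

# n and new are hypothetical variable names passed in with -v.
result=$(awk -v n=3 -v new='9101,updated' 'NR==n {$0=new} 1' "$file")
printf '%s\n' "$result"

# awk does not edit the file itself, so write to a temporary file and rename it.
awk -v n=3 -v new='9101,updated' 'NR==n {$0=new} 1' "$file" > "$file.tmp" &&
  mv "$file.tmp" "$file"
```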
Replace First Occurrence in Each Line
Consider our customer_data.txt file:
101,David,Gold,Platinum,Gold
102,Sarah,Gold,Gold,Bronze
103,Alex,Silver,Gold,Silver
104,Morgan,Bronze,Gold,Gold
105,Rachel,Gold,Silver,Gold
To replace only the first occurrence of Gold with Diamond in each line, use the following awk command:
awk '{sub(/Gold/, "Diamond"); print}' customer_data.txt
Output:
101,David,Diamond,Platinum,Gold
102,Sarah,Diamond,Gold,Bronze
103,Alex,Silver,Diamond,Silver
104,Morgan,Bronze,Diamond,Gold
105,Rachel,Diamond,Silver,Gold
This command uses awk’s sub function, which replaces the first match of the pattern /Gold/ with Diamond in each line.
The print statement then outputs each line, whether or not it was modified.
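sub returns the number of substitutions it made (0 or 1), which makes it possible to stop after the first match in the whole file rather than in each line. The following is a small sketch of that variation; the flag name replaced and the shortened sample file are assumptions made for illustration.

```shell
# Recreate part of the sample file in a temporary directory (illustrative setup).
tmpdir=$(mktemp -d)
file="$tmpdir/customer_data.txt"
printf '%s\n' '101,David,Gold,Platinum,Gold' '102,Sarah,Gold,Gold,Bronze' '103,Alex,Silver,Gold,Silver' > "$file"

# replaced is a hypothetical flag: once sub() succeeds, later lines skip the call.
result=$(awk '!replaced && sub(/Gold/, "Diamond") { replaced = 1 } 1' "$file")
printf '%s\n' "$result"
```

Only the first Gold on line 101 changes; all later lines are printed as they are.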
Replace Last Occurrence in Line
The content of customer_data.txt is:
101,David,Platinum,Platinum
102,Sarah,Gold,Gold
103,Alex,Silver,Silver
104,Morgan,Bronze,Bronze
105,Rachel,Gold,Gold
To replace the last occurrence of Gold with Diamond in each line, use the following awk command:
awk 'BEGIN{FS=OFS=","} {for (i=NF; i>0; i--) if ($i=="Gold") {$i="Diamond"; break}}1' customer_data.txt
Output:
101,David,Platinum,Platinum
102,Sarah,Gold,Diamond
103,Alex,Silver,Silver
104,Morgan,Bronze,Bronze
105,Rachel,Gold,Diamond
This awk script sets the input and output field separators (FS and OFS) to a comma and iterates backward through the fields of each line.
When it finds a field that is exactly Gold, it replaces it with Diamond and stops the loop.
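The field loop only matches fields that are exactly Gold. If the text can appear anywhere in the line, a different approach is to let a greedy regular expression find the last match. The following is a sketch under that assumption, using the standard match and substr functions; the variable names old and new are illustrative, and old is treated as a regular expression.

```shell
# Recreate part of the sample file in a temporary directory (illustrative setup).
tmpdir=$(mktemp -d)
file="$tmpdir/customer_data.txt"
printf '%s\n' '101,David,Platinum,Platinum' '102,Sarah,Gold,Gold' '103,Alex,Silver,Silver' > "$file"

# ".*" is greedy, so the match ends at the last occurrence of old.
# RLENGTH is then the position where that last occurrence ends.
result=$(awk -v old='Gold' -v new='Diamond' '
  match($0, ".*" old) {
    $0 = substr($0, 1, RLENGTH - length(old)) new substr($0, RLENGTH + 1)
  }
  1' "$file")
printf '%s\n' "$result"
```

Unlike the field comparison, this also matches Gold inside a longer word, so choose whichever behavior fits your data.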
Replace Every nth Occurrence
Consider the following file:
101,David,Platinum,Platinum,Platinum
102,Sarah,Gold,Gold,Gold
103,Alex,Silver,Silver,Silver
104,Morgan,Bronze,Bronze,Bronze
105,Rachel,Gold,Gold,Gold
To replace every second occurrence of Gold with Diamond in each line, use this awk command:
awk 'BEGIN{FS=OFS=","} {cnt=0; for (i=1; i<=NF; i++) if ($i=="Gold" && ++cnt%2==0) $i="Diamond"} 1' customer_data.txt
Output:
101,David,Platinum,Platinum,Platinum
102,Sarah,Gold,Diamond,Gold
103,Alex,Silver,Silver,Silver
104,Morgan,Bronze,Bronze,Bronze
105,Rachel,Gold,Diamond,Gold
This command sets the field separator to a comma and iterates through each field.
It uses a counter, reset to 0 at the start of every line, to keep track of occurrences of Gold.
When the counter is even (every second occurrence), it replaces Gold with Diamond.
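The same loop works for any interval if the 2 is replaced by a variable. The following is a sketch assuming the interval and the two strings are passed in with -v; the names n, old, and new are illustrative.

```shell
# Recreate part of the sample file in a temporary directory (illustrative setup).
tmpdir=$(mktemp -d)
file="$tmpdir/customer_data.txt"
printf '%s\n' '101,David,Platinum,Platinum,Platinum' '102,Sarah,Gold,Gold,Gold' '105,Rachel,Gold,Gold,Gold' > "$file"

# n, old, and new are hypothetical variables; here every 3rd Gold is replaced.
result=$(awk -v n=3 -v old='Gold' -v new='Diamond' '
  BEGIN { FS = OFS = "," }
  {
    cnt = 0                      # restart the count on every line
    for (i = 1; i <= NF; i++)
      if ($i == old && (++cnt % n) == 0)
        $i = new
  }
  1' "$file")
printf '%s\n' "$result"
```

Removing the cnt = 0 reset would make the counter carry over from line to line, so occurrences would be counted across the whole file instead.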
Replace Character in Column
Imagine our customer_data.txt file looks like this:
101,David,Platinum
102,Sarah,Gold
103,Alex,Silver
104,Morgan,Bronze
105,Rachel,Gold
Suppose you want to replace the character ‘o’ with ‘0’ in the third column only.
The awk command for this will be:
awk 'BEGIN{FS=OFS=","} {$3=gensub(/o/, "0", "g", $3); print}' customer_data.txt
Output:
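101,David,Platinum
102,Sarah,G0ld
103,Alex,Silver
104,Morgan,Br0nze
105,Rachel,G0ld
This awk script sets the field separator (FS) and output field separator (OFS) to a comma.
It then uses the gensub function, a GNU awk (gawk) extension, to globally replace ‘o’ with ‘0’ in the third field ($3) of each line.
Platinum and Silver contain no ‘o’, so they are unchanged, and the ‘o’ in Morgan is left alone because it is in the second column.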
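gensub is only available in GNU awk. On systems where awk is mawk or BusyBox awk, the standard gsub function can do the same job, because it accepts the target field as an optional third argument. The following is a sketch of that portable variant; the shortened sample file is an assumption made for illustration.

```shell
# Recreate part of the sample file in a temporary directory (illustrative setup).
tmpdir=$(mktemp -d)
file="$tmpdir/customer_data.txt"
printf '%s\n' '101,David,Platinum' '102,Sarah,Gold' '104,Morgan,Bronze' > "$file"

# The third argument restricts gsub() to field 3, so Morgan in field 2 is untouched.
result=$(awk 'BEGIN { FS = OFS = "," } { gsub(/o/, "0", $3) } 1' "$file")
printf '%s\n' "$result"
```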
Replace Multiple Characters
Suppose our customer_data.txt contains:
101,David,Platinum
102,Sarah,Gold
103,Alex,Silver
104,Morgan,Bronze
105,Rachel,Gold
To replace both ‘a’ with ‘@’ and ‘o’ with ‘0’ in the entire file, use the following awk command:
awk '{gsub(/a/, "@"); gsub(/o/, "0"); print}' customer_data.txt
Output:
101,D@vid,Pl@tinum
102,S@r@h,G0ld
103,Alex,Silver
104,M0rg@n,Br0nze
105,R@chel,G0ld
In this command, awk uses the gsub function twice: first to globally replace all occurrences of ‘a’ with ‘@’, and then to replace ‘o’ with ‘0’.
Matching is case-sensitive, so the capital ‘A’ in Alex is not replaced.
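The patterns /a/ and /o/ match lowercase letters only. If uppercase letters should be replaced as well, one option is a bracket expression that lists both cases. The following is a sketch under that assumption; the shortened sample file is illustrative.

```shell
# Recreate part of the sample file in a temporary directory (illustrative setup).
tmpdir=$(mktemp -d)
file="$tmpdir/customer_data.txt"
printf '%s\n' '101,David,Platinum' '103,Alex,Silver' '104,Morgan,Bronze' > "$file"

# [Aa] matches either case, so the capital A in Alex is replaced too.
result=$(awk '{ gsub(/[Aa]/, "@"); gsub(/[Oo]/, "0"); print }' "$file")
printf '%s\n' "$result"
```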
Replace Ending Character in Each Line
Let’s use our customer_data.txt file:
101,David,Platinum;
102,Sarah,Gold;
103,Alex,Silver;
104,Morgan,Bronze;
105,Rachel,Gold;
Suppose you want to replace the semicolon (;) at the end of each line with a period (.).
The awk command for this will be:
awk '{sub(/;$/, "."); print}' customer_data.txt
Output:
101,David,Platinum.
102,Sarah,Gold.
103,Alex,Silver.
104,Morgan,Bronze.
105,Rachel,Gold.
This command uses awk’s sub function with the pattern /;$/, where $ anchors the match to the end of the line, so only a trailing semicolon is replaced with a period.
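The pattern /;$/ only matches when the semicolon is the very last character. Files with trailing spaces or Windows line endings (a carriage return before the newline) would be left unchanged. The following is a sketch that tolerates such trailing characters; the sample lines with a trailing space and a carriage return are assumptions added for illustration.

```shell
# Illustrative file: a clean line, one with a trailing space, one with a CR.
tmpdir=$(mktemp -d)
file="$tmpdir/customer_data.txt"
printf '101,David,Platinum;\n102,Sarah,Gold; \n103,Alex,Silver;\r\n' > "$file"

# Match the semicolon plus any trailing spaces, tabs, or carriage returns.
result=$(awk '{ sub(/;[ \t\r]*$/, "."); print }' "$file")
printf '%s\n' "$result"
```

Note that this version also removes the trailing whitespace it matched, since the whole match is replaced by the period.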
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.