Replace Text by Line and Column Using awk in Linux

awk is a scripting language designed for text processing, available as a command on Linux.

In this tutorial, you’ll learn to use awk for replacing text in Linux.

We’ll cover replacing a whole line by its number, the first, last, or every nth occurrence of a value, characters within a specific column, multiple characters at once, and the character at the end of each line.

 

 

Replace By Line Number

Imagine you have a file named customer_data.txt with the following content:

1234,active
5678,inactive
9101,active
1213,inactive
1415,active

To replace the 3rd line of this file, you can use the following awk command:

awk 'NR==3 {$0="9101,updated"} 1' customer_data.txt

Output:

1234,active
5678,inactive
9101,updated
1213,inactive
1415,active

This command specifies that for the 3rd line (NR==3), the entire line ($0) should be replaced with “9101,updated”.

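The 1 at the end of the command is a pattern that is always true. Because it has no action attached, awk applies its default action, print, so every line is printed whether or not it was modified.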
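The line number and the replacement text can also be passed in as variables instead of being hard-coded. The following is a minimal sketch of that idea; the variable names n and text, the temporary file, and the shortened sample data are illustrative assumptions rather than part of the original command.

```shell
# Build a small sample file (illustrative data based on the example above).
file=$(mktemp)
printf '%s\n' '1234,active' '5678,inactive' '9101,active' > "$file"

# n and text are hypothetical variable names passed in with -v.
# The trailing 1 is an always-true pattern, so every line is printed.
result=$(awk -v n=3 -v text='9101,updated' 'NR == n { $0 = text } 1' "$file")
printf '%s\n' "$result"

rm -f "$file"
```

awk writes the result to standard output and leaves the input file unchanged. To keep the change, redirect the output to a new file and then replace the original with it.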

 

Replace First Occurrence in Each Line

Suppose customer_data.txt now contains:

101,David,Gold,Platinum,Gold
102,Sarah,Gold,Gold,Bronze
103,Alex,Silver,Gold,Silver
104,Morgan,Bronze,Gold,Gold
105,Rachel,Gold,Silver,Gold

To replace only the first occurrence of Gold with Diamond in each line, use the following awk command:

awk '{sub(/Gold/, "Diamond"); print}' customer_data.txt

Output:

101,David,Diamond,Platinum,Gold
102,Sarah,Diamond,Gold,Bronze
103,Alex,Silver,Diamond,Silver
104,Morgan,Bronze,Diamond,Gold
105,Rachel,Diamond,Silver,Gold

This command uses awk's sub function, which replaces the first match of the pattern /Gold/ with Diamond in each line.

The print statement then outputs every line, whether or not it was modified.
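If only the very first occurrence in the whole file should change, rather than the first one on every line, a flag can stop further substitutions. This is a rough sketch under that assumption; the flag name replaced and the sample data are hypothetical.

```shell
# Sample data (illustrative, taken from the example above).
file=$(mktemp)
printf '%s\n' '103,Alex,Silver,Gold,Silver' \
              '104,Morgan,Bronze,Gold,Gold' \
              '105,Rachel,Gold,Silver,Gold' > "$file"

# sub() returns 1 when it made a replacement. Once that happens the
# hypothetical flag "replaced" is set, and the && short-circuit means
# sub() is not called again on later lines.
result=$(awk '!replaced && sub(/Gold/, "Diamond") { replaced = 1 } 1' "$file")
printf '%s\n' "$result"

rm -f "$file"
```

Only the first line that contains Gold is modified; the remaining lines pass through untouched.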

 

Replace Last Occurrence in Line

The content of customer_data.txt is:

101,David,Platinum,Platinum
102,Sarah,Gold,Gold
103,Alex,Silver,Silver
104,Morgan,Bronze,Bronze
105,Rachel,Gold,Gold

To replace the last occurrence of Gold with Diamond in each line, use the following awk command:

awk 'BEGIN{FS=OFS=","} {for (i=NF; i>0; i--) if ($i=="Gold") {$i="Diamond"; break}}1' customer_data.txt

Output:

101,David,Platinum,Platinum
102,Sarah,Gold,Diamond
103,Alex,Silver,Silver
104,Morgan,Bronze,Bronze
105,Rachel,Gold,Diamond

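This awk script sets both the input field separator (FS) and the output field separator (OFS) to a comma and iterates backward through the fields of each line.

When it finds a field that is exactly Gold, it replaces it with Diamond and stops the loop. Lines that contain no such field are printed unchanged.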
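The field-based loop only matches when Gold is the entire field. When the text may appear inside a longer field, one option is to locate the last match by position with index() and rebuild the line with substr(). The sketch below assumes that situation; the variable names old, new, rest, pos, and offset, and the Gold-Plus value, are made up for illustration.

```shell
# Sample data; "Gold-Plus" is a hypothetical value used to show a partial match.
file=$(mktemp)
printf '%s\n' '102,Sarah,Gold,Gold-Plus' '103,Alex,Silver,Silver' > "$file"

result=$(awk -v old='Gold' -v new='Diamond' '{
    rest = $0; pos = 0; offset = 0
    # Walk forward with index(), remembering where the latest match started.
    while ((i = index(rest, old)) > 0) {
        pos = offset + i
        offset += i
        rest = substr(rest, i + 1)
    }
    # pos stays 0 when the line has no match, so such lines are left alone.
    if (pos > 0)
        $0 = substr($0, 1, pos - 1) new substr($0, pos + length(old))
    print
}' "$file")
printf '%s\n' "$result"

rm -f "$file"
```

Because index() does a plain string search, old is treated literally and not as a regular expression.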

 

Replace Every nth Occurrence

Consider the following file:

101,David,Platinum,Platinum,Platinum
102,Sarah,Gold,Gold,Gold
103,Alex,Silver,Silver,Silver
104,Morgan,Bronze,Bronze,Bronze
105,Rachel,Gold,Gold,Gold

To replace every second occurrence of Gold with Diamond in each line, use this awk command:

awk 'BEGIN{FS=OFS=","} {cnt=0; for (i=1; i<=NF; i++) if ($i=="Gold" && ++cnt%2==0) $i="Diamond"} 1' customer_data.txt

Output:

101,David,Platinum,Platinum,Platinum
102,Sarah,Gold,Diamond,Gold
103,Alex,Silver,Silver,Silver
104,Morgan,Bronze,Bronze,Bronze
105,Rachel,Gold,Diamond,Gold

This command sets the input and output field separators to a comma and iterates through each field.

It uses a counter, reset to zero at the start of every line, to keep track of the fields that are exactly Gold.

When the counter is even (every second occurrence), it replaces Gold with Diamond.
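The same loop generalizes to any n by passing the interval in as a variable. This is a small sketch of that variation, not a different algorithm; the variable names n, old, and new and the sample rows are assumptions made for the example.

```shell
# Sample data (illustrative).
file=$(mktemp)
printf '%s\n' '102,Sarah,Gold,Gold,Gold' '105,Rachel,Gold,Silver,Gold' > "$file"

# n, old, and new are hypothetical variable names passed in with -v.
# cnt is reset on every line, so occurrences are counted per line.
result=$(awk -v n=3 -v old='Gold' -v new='Diamond' 'BEGIN { FS = OFS = "," }
{
    cnt = 0
    for (i = 1; i <= NF; i++)
        if ($i == old && ++cnt % n == 0)
            $i = new
}
1' "$file")
printf '%s\n' "$result"

rm -f "$file"
```

With n=3, only the third matching field on a line changes, so the second sample row, which has just two Gold fields, is printed as is.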

 

Replace Character in Column

Imagine our customer_data.txt file looks like this:

101,David,Platinum
102,Sarah,Gold
103,Alex,Silver
104,Morgan,Bronze
105,Rachel,Gold

Suppose you want to replace the character ‘o’ with ‘0’ in the third column only.

The awk command for this will be:

awk 'BEGIN{FS=OFS=","} {$3=gensub(/o/, "0", "g", $3); print}' customer_data.txt

Output:

101,David,Platinum
102,Sarah,G0ld
103,Alex,Silver
104,Morgan,Br0nze
105,Rachel,G0ld

This awk script sets the field separator (FS) and output field separator (OFS) to a comma.

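It then uses the gensub function to globally replace ‘o’ with ‘0’ in the third field ($3) of each line. gensub is a GNU awk (gawk) extension, so this command may not work in other awk implementations such as mawk.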
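Where gawk is not available, the standard gsub function accepts an optional third argument naming the target, which gives the same result for this case. The following is a minimal sketch of that alternative, using a shortened copy of the sample data.

```shell
# Sample data (illustrative, taken from the example above).
file=$(mktemp)
printf '%s\n' '101,David,Platinum' '104,Morgan,Bronze' '105,Rachel,Gold' > "$file"

# gsub() with $3 as its target edits only the third field, in place.
# The "o" in Morgan (second field) is therefore left untouched.
result=$(awk 'BEGIN { FS = OFS = "," } { gsub(/o/, "0", $3) } 1' "$file")
printf '%s\n' "$result"

rm -f "$file"
```

Unlike gensub, which returns the modified string, gsub modifies its target directly and returns the number of replacements made.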

 

Replace Multiple Characters

Suppose our customer_data.txt contains:

101,David,Platinum
102,Sarah,Gold
103,Alex,Silver
104,Morgan,Bronze
105,Rachel,Gold

To replace both ‘a’ with ‘@’ and ‘o’ with ‘0’ in the entire file, use the following awk command:

awk '{gsub(/a/, "@"); gsub(/o/, "0"); print}' customer_data.txt

Output:

101,D@vid,Pl@tinum
102,S@r@h,G0ld
103,Alex,Silver
104,M0rg@n,Br0nze
105,R@chel,G0ld

In this command, awk calls the gsub function twice: first to replace every ‘a’ with ‘@’, and then to replace every ‘o’ with ‘0’. The patterns are case-sensitive, so the uppercase ‘A’ in Alex is left unchanged.
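awk regular expressions are case-sensitive, so /a/ does not match an uppercase A. If both cases should be replaced, a bracket expression is one portable way to do it. This is a sketch under that assumption, using two rows of the sample data.

```shell
# Sample data (illustrative, taken from the example above).
file=$(mktemp)
printf '%s\n' '103,Alex,Silver' '104,Morgan,Bronze' > "$file"

# [aA] matches either case of the letter; likewise [oO].
result=$(awk '{ gsub(/[aA]/, "@"); gsub(/[oO]/, "0"); print }' "$file")
printf '%s\n' "$result"

rm -f "$file"
```

The order of the two gsub calls does not matter here because neither replacement produces a character that the other pattern matches.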

 

Replace Ending Character in Each Line

Let’s use our customer_data.txt file:

101,David,Platinum;
102,Sarah,Gold;
103,Alex,Silver;
104,Morgan,Bronze;
105,Rachel,Gold;

Suppose you want to replace the semicolon (;) at the end of each line with a period (.).

The awk command for this will be:

awk '{sub(/;$/, "."); print}' customer_data.txt

Output:

101,David,Platinum.
102,Sarah,Gold.
103,Alex,Silver.
104,Morgan,Bronze.
105,Rachel,Gold.

This command uses awk's sub function to replace the semicolon at the end of each line with a period. In the pattern /;$/, the $ anchors the match to the end of the line, so semicolons elsewhere in the line are left alone.
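If some lines carry trailing spaces or tabs after the semicolon, the pattern /;$/ will not match them. One possible adjustment is to allow optional whitespace before the end of the line, as sketched below; the sample rows, including the one with a semicolon in the middle, are hypothetical.

```shell
# Sample data: the first row has a trailing space, the third has no
# semicolon at the end of the line (both are made-up test cases).
file=$(mktemp)
printf '%s\n' '101,David,Platinum; ' '102,Sarah,Gold;' '103,Alex;Silver' > "$file"

# [ \t]* consumes any spaces or tabs between the semicolon and end of line,
# so the trailing whitespace is removed along with the semicolon.
result=$(awk '{ sub(/;[ \t]*$/, "."); print }' "$file")
printf '%s\n' "$result"

rm -f "$file"
```

The third row is printed unchanged because its semicolon is not at the end of the line.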
