Text Replacement with sed: Guide to Substitute Command
Text replacement is one of the most frequently used capabilities of the sed command. This feature is valuable when you need to replace instances of text patterns across large files or streams of input without manually editing each occurrence.
The basic syntax for text replacement with sed is:
sed 's/search_pattern/replacement_text/g' filename
In this structure:
- s signals that we are performing a substitution.
- search_pattern identifies the sequence of characters you wish to replace.
- replacement_text is the new content you’d like in place of the search pattern.
- g ensures a global replacement, meaning every occurrence in each line is replaced. Without g, only the first instance on each line is replaced.
- filename denotes the target file you’re working with.
By default, sed sends the modified content to standard output (your terminal) without changing the original file.
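To see this default behaviour for yourself, here is a minimal sketch. The scratch directory and the file name demo.txt are made up for illustration and are not part of the examples in this guide.

```shell
# Hypothetical scratch file; the name and contents are illustrative only.
tmpdir=$(mktemp -d)
printf 'Hello, world!\n' > "$tmpdir/demo.txt"

# sed prints the edited text to standard output...
edited=$(sed 's/Hello/Hi/g' "$tmpdir/demo.txt")
echo "$edited"                      # Hi, world!

# ...while the file on disk still holds the original text.
cat "$tmpdir/demo.txt"              # Hello, world!

# With no file name, sed reads from standard input instead.
printf 'Hello, stream!\n' | sed 's/Hello/Hi/g'
```

The last command shows why sed is called a stream editor: it works the same way on piped input as on a file.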
Replacing Every Occurrence on Each Line
Assume you have a file named sample.txt with the following content:
Hello, world! Hello, user! Hello, admin!
Now, suppose you want to replace every occurrence of “Hello” with “Hi”. Here’s how you’d do it:
$ sed 's/Hello/Hi/g' sample.txt
Output:
Hi, world! Hi, user! Hi, admin!
Here’s the breakdown of what we just did:
- s: Signals a substitution operation.
- Hello: This is the search pattern.
- Hi: This is the replacement text.
- g: Instructs sed to replace all occurrences in a line. Without it, only the first “Hello” in each line would be replaced.
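If you want to save the modified content back to the file, use sed’s in-place editing option, -i. Redirecting the output to the same file with > does not work, because the shell empties the file before sed reads it; redirect to a new file instead. The form below is GNU sed syntax; BSD and macOS sed expect a backup suffix after -i, which may be empty (-i ''):
$ sed -i 's/Hello/Hi/g' sample.txt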
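Because in-place editing overwrites the file, it can be worth keeping a backup. The sketch below attaches a suffix directly to -i; the .bak suffix and the scratch directory are arbitrary choices made for this illustration.

```shell
# Work on a throwaway copy so nothing real is overwritten.
tmpdir=$(mktemp -d)
printf 'Hello, world!\n' > "$tmpdir/sample.txt"

# -i.bak edits sample.txt in place and saves the original as sample.txt.bak.
sed -i.bak 's/Hello/Hi/g' "$tmpdir/sample.txt"

cat "$tmpdir/sample.txt"        # Hi, world!
cat "$tmpdir/sample.txt.bak"    # Hello, world!
```

Writing the suffix without a space after -i is the form that both GNU and BSD sed accept.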
Replacing Only the First Occurrence on Each Line
You can restrict text replacement to only the first occurrence on each line by omitting the g flag in the sed command.
Consider a longer version of the sample.txt file:
Hello, world! Hello again.
Hello, user! Hello once more.
Hello, admin! Hello for the last time.
Let’s replace only the first instance of “Hello” with “Hi” on each line:
$ sed 's/Hello/Hi/' sample.txt
Output:
Hi, world! Hello again.
Hi, user! Hello once more.
Hi, admin! Hello for the last time.
Without the g flag, sed targets only the first “Hello” on each line, leaving subsequent occurrences untouched.
Using Delimiters
In the basic syntax s/search_pattern/replacement_text/g, the character / is the delimiter.
It separates the command, the search pattern, and the replacement. In essence, it tells sed where one section ends and another begins.
Changing the Default Delimiter
If your pattern or replacement text contains a lot of forward slashes (common with file paths or URLs), constantly escaping them with backslashes can make your command hard to read.
In such cases, you can use a different delimiter.
Let’s consider a real-world scenario where you want to replace the path /home/user/old_dir with /home/user/new_dir. Using the default delimiter would look like this:
$ sed 's/\/home\/user\/old_dir/\/home\/user\/new_dir/g' filename.txt
That’s quite cluttered, isn’t it? Let’s change the delimiter to #:
$ sed 's#/home/user/old_dir#/home/user/new_dir#g' filename.txt
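This command is more readable. You can use almost any single character as a delimiter (not a backslash or a newline), as long as it does not appear unescaped in the pattern or the replacement.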
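URLs are another common case where an alternative delimiter helps. The following sketch uses | as the delimiter; the example.com addresses are placeholders invented for this illustration.

```shell
# The URLs are made-up examples.
line='Docs: http://example.com/docs/v1'

# With | as the delimiter, the slashes in the URLs need no escaping.
updated=$(printf '%s\n' "$line" | sed 's|http://example.com/docs/v1|https://example.com/docs/v2|')
echo "$updated"    # Docs: https://example.com/docs/v2
```

Pick a delimiter that does not occur in either the pattern or the replacement, so nothing needs a backslash.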
Case-insensitive Replacement
The I flag with sed enables you to perform case-insensitive searches and replacements, ensuring that you catch all variations of a particular pattern.
Let’s work with a sample file, cases.txt, containing:
Linux is great. LINUX is powerful. linux is open-source.
If you want to replace every instance of “linux” with “UNIX”, regardless of case, here’s how you’d do it:
$ sed 's/linux/UNIX/Ig' cases.txt
Output:
UNIX is great. UNIX is powerful. UNIX is open-source.
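A quick note for portability: the I flag is a GNU sed extension (GNU sed is common on Linux) and is not part of POSIX. GNU sed also accepts a lowercase i. Support in BSD and macOS sed depends on the version, so check the man page on your system before relying on either flag.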
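If a case-insensitive flag is not available in your sed, one fallback is to spell out both cases with bracket expressions, which are part of standard regular expressions. This is a sketch of that workaround, recreating cases.txt in a scratch directory chosen for the illustration.

```shell
# Recreate the sample file in a throwaway directory.
tmpdir=$(mktemp -d)
printf 'Linux is great.\nLINUX is powerful.\nlinux is open-source.\n' > "$tmpdir/cases.txt"

# [Ll] matches either "L" or "l", and so on for each letter,
# so no case-insensitive flag is needed.
result=$(sed 's/[Ll][Ii][Nn][Uu][Xx]/UNIX/g' "$tmpdir/cases.txt")
echo "$result"
```

The pattern is longer, but it also matches mixed-case spellings such as “LiNuX”.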
Limiting the Number of Replacements
With sed, you can target one specific occurrence on each line by appending a number after the substitute command. The number selects which occurrence is replaced; it does not cap how many replacements are made.
Replace a Specific Occurrence
Given a file, repeats.txt, that reads:
apple apple apple banana banana banana cherry cherry cherry
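Suppose you want to replace only the second occurrence of each fruit with “fruit”. A single s command matches only one pattern, so use one expression per fruit:
$ sed -e 's/apple/fruit/2' -e 's/banana/fruit/2' -e 's/cherry/fruit/2' repeats.txt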
Output:
apple fruit apple banana fruit banana cherry fruit cherry
If you wanted to target the third occurrence instead, you’d simply change the number to 3.
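The number can also be combined with g. In GNU sed this replaces the chosen occurrence and every later one on the line; POSIX leaves the combination unspecified, so treat the first command below as a GNU-specific sketch rather than portable syntax.

```shell
# GNU sed: 2g means "from the second occurrence onward".
# Other sed implementations may reject or interpret this differently.
from_second=$(printf 'apple apple apple\n' | sed 's/apple/fruit/2g')
echo "$from_second"    # apple fruit fruit

# A plain number replaces exactly that one occurrence.
only_third=$(printf 'apple apple apple\n' | sed 's/apple/fruit/3')
echo "$only_third"     # apple apple fruit
```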
Escaping Special Characters
Special characters, often termed metacharacters, have specific meanings in regular expressions.
To use them as literal characters, or to avoid their special meanings, you must “escape” them using a backslash (\).
Common Special Characters
In sed and regular expressions, several characters have unique roles:
- .: Matches any single character.
- *: Matches zero or more of the preceding character or group.
- ^: Anchors the pattern to the start of a line.
- $: Anchors the pattern to the end of a line.
- [...]: Matches any one of the characters inside the brackets.
- ( and ): Group patterns when written as \( and \) in sed’s default basic syntax, or unescaped in extended syntax (sed -E).
Escaping Special Characters
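To use one of these characters literally in a sed command, prepend it with a backslash. Parentheses are the exception: in basic syntax they are already literal, and the backslash is what turns them into a group.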
For instance, if you have a file named special.txt with content:
end...end start*start start.end
And you want to replace ... with ---:
$ sed 's/\.\.\./---/g' special.txt
Output:
end---end start*start start.end
Here, you’re escaping each period (.) with a backslash to ensure sed interprets them as literal dots and not as a wildcard matching any character.
Similarly, to replace * with +, you’d use:
$ sed 's/\*/+/g' special.txt
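The same idea applies to the other metacharacters. The sketch below runs the asterisk replacement on a single line and then escapes a dollar sign and an opening bracket; the “Price” line is invented for this illustration.

```shell
# \* matches a literal asterisk instead of acting as a repetition operator.
star=$(printf 'start*start\n' | sed 's/\*/+/g')
echo "$star"     # start+start

# \$ matches a literal dollar sign and \[ a literal opening bracket.
# A closing bracket needs no escape outside a bracket expression.
price=$(printf 'Price: $5 [sale]\n' | sed -e 's/\$5/$10/' -e 's/\[sale]/(sale)/')
echo "$price"    # Price: $10 (sale)
```

Keeping the sed script in single quotes stops the shell from interpreting the backslashes and dollar signs before sed sees them.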
Replacing Text in Multiple Files (sed with find)
The find command allows you to search for files within a directory hierarchy. Pairing it with sed lets you recursively replace text across numerous files.
Imagine you have a project with various text files, and you want to replace all instances of “old_project” with “new_project”. Execute the following:
$ find /path/to/directory -type f -name "*.txt" -exec sed -i 's/old_project/new_project/g' {} +
Breaking down the components:
- find /path/to/directory: Searches within the specified directory.
- -type f: Targets only regular files.
- -name "*.txt": Filters the search to .txt files.
- -exec: Runs a command on the files found.
- sed -i 's/old_project/new_project/g': The sed command you’re familiar with. The -i flag tells sed to edit files in place.
- {} +: find replaces {} with the file names it found, and the + makes it pass as many names as possible to each sed invocation instead of running sed once per file.
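Since a recursive in-place edit is hard to undo, you may want to preview which files would change first. This is one possible sketch: it lists matching files with grep -l and then runs sed with a backup suffix. The scratch directory, the file names a.txt and b.txt, and the .bak suffix are all assumptions made for the illustration.

```shell
# Build a small throwaway tree to experiment on.
tmpdir=$(mktemp -d)
printf 'uses old_project\n' > "$tmpdir/a.txt"
printf 'nothing to change\n' > "$tmpdir/b.txt"

# Preview: list only the files that actually contain the pattern.
matches=$(find "$tmpdir" -type f -name '*.txt' -exec grep -l 'old_project' {} +)
echo "$matches"

# Apply the change, keeping a .bak copy of each file sed processes.
find "$tmpdir" -type f -name '*.txt' -exec sed -i.bak 's/old_project/new_project/g' {} +
cat "$tmpdir/a.txt"    # uses new_project
```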
Using sed with xargs for Enhanced Performance
In some cases where you’re dealing with multiple files, piping the file list to xargs improves performance over running one sed process per file (as -exec with \; does), because it reduces the number of individual sed processes spawned:
$ find /path/to/directory -type f -name "*.txt" | xargs sed -i 's/old_project/new_project/g'
Here, xargs takes the list of files from find and feeds them to sed in larger batches, minimizing process overhead.
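One caveat with a plain pipe into xargs is that file names containing spaces are split into separate arguments. A common remedy, sketched below, is to separate names with NUL bytes using find’s -print0 and xargs’s -0; both options are widely available in GNU and BSD tools. The file name “my notes.txt” is invented for the illustration.

```shell
# Create a file whose name contains a space.
tmpdir=$(mktemp -d)
printf 'old_project\n' > "$tmpdir/my notes.txt"

# -print0 and -0 delimit names with NUL bytes, so the space is preserved.
find "$tmpdir" -type f -name '*.txt' -print0 |
  xargs -0 sed -i.bak 's/old_project/new_project/g'

cat "$tmpdir/my notes.txt"    # new_project
```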
Benchmark exec vs. xargs (xargs is faster)
To benchmark the speed of file content replacement using sed and find with exec and xargs, we use a directory filled with 1000 files containing a certain pattern.
Let’s replace that pattern with another string.
Here’s a simple step-by-step guide:
- Create a directory with a large number of files containing a specific pattern.
- Measure the time taken to replace the pattern using find with exec.
- Measure the time taken to replace the pattern using find with xargs.
Create a Directory with Sample Files:
mkdir benchmark_dir
cd benchmark_dir
# Create 1000 files with the content "replace_me"
for i in {1..1000}; do
  echo "replace_me" > "file_$i.txt"
done
Benchmark using find with exec:
time find . -type f -name '*.txt' -exec sed -i 's/replace_me/replaced/g' {} \;
Output:
real 0m2.187s
user 0m1.258s
sys 0m0.825s
Reset the Files for testing xargs:
for i in {1..1000}; do
  echo "replace_me" > "file_$i.txt"
done
Benchmark using find with xargs:
time find . -type f -name '*.txt' | xargs sed -i 's/replace_me/replaced/g'
Output:
real 0m0.271s
user 0m0.016s
sys 0m0.249s
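As you can see from comparing the real times from both methods, xargs is faster than -exec terminated with \; because far fewer sed processes are created. Note that -exec terminated with + batches file names in a similar way, so it should perform much closer to xargs.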
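The difference in process creation can be made visible without timing anything. The sketch below swaps sed for a tiny sh -c command that prints one line per invocation, then counts the lines for each approach. The 50-file scratch directory is an arbitrary choice for the illustration, and the counts assume all the names fit in a single batch.

```shell
# Create 50 small files in a throwaway directory.
tmpdir=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
  printf 'replace_me\n' > "$tmpdir/file_$i.txt"
  i=$((i + 1))
done

# Each invocation of the stand-in command prints "run" once,
# so the line count equals the number of processes started.
per_file=$(find "$tmpdir" -type f -name '*.txt' -exec sh -c 'echo run' sh {} \; | wc -l)
batched=$(find "$tmpdir" -type f -name '*.txt' -exec sh -c 'echo run' sh {} + | wc -l)
via_xargs=$(find "$tmpdir" -type f -name '*.txt' | xargs sh -c 'echo run' sh | wc -l)

printf 'exec per file: %s, exec batched: %s, xargs: %s\n' "$per_file" "$batched" "$via_xargs"
```

With -exec terminated by \; the command runs once per file, while the + form and xargs each start a single process for this small set.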
Mokhtar is the founder of LikeGeeks.com.