Linux AWK match Function: Search Strings Using Patterns
The match function in awk searches a string for a regular expression and reports where the match was found.
In this tutorial, you’ll learn how to use the awk match function, perform conditional processing based on matches, and iterate over multiple matches within a string.
Syntax and Usage
The match function takes a string and a pattern, match(string, pattern). A typical one-liner looks like this:
awk '{ if (match($0, pattern)) print $0; }' filename
Here, $0 represents the entire line of input, and pattern is the regular expression you are searching for in each line of the file named filename.
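To see what the function actually returns, here is a minimal sketch run against three made-up input lines (the strings and the variable name positions are illustrative and not part of the sample file). match() returns the 1-based position where the match starts, or 0 when nothing matches, which is why it can be used directly as an if condition.

```shell
# Illustrative input lines; not part of sample_data.txt.
positions=$(printf '%s\n' 'Payment Received' 'Late Payment' 'No charge' |
  awk '{ print match($0, /Payment/) }')

# One number per input line: the start of the match, or 0 for no match.
printf '%s\n' "$positions"
```

The first line matches at position 1, the second at position 6, and the third prints 0 because it contains no match.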
Consider a sample data file, sample_data.txt, that contains various log entries:
2024-03-10 10:15:00, Data Plan Activated, User 45678
2024-03-10 10:17:00, Data Plan Deactivated, User 12345
2024-03-10 10:19:00, Payment Received, User 45678
To find all entries related to Data Plan Activated, use the following command:
awk '{ if (match($0, "Data Plan Activated")) print $0; }' sample_data.txt
Output:
2024-03-10 10:15:00, Data Plan Activated, User 45678
This command searches each line for the phrase “Data Plan Activated” and prints the line where the match is found.
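For a plain filter like this one, awk's pattern-only rule should give the same result, since the default action is to print the matching line. The sketch below pipes the three sample entries through both forms and compares them. The shell variable names are illustrative.

```shell
log_lines='2024-03-10 10:15:00, Data Plan Activated, User 45678
2024-03-10 10:17:00, Data Plan Deactivated, User 12345
2024-03-10 10:19:00, Payment Received, User 45678'

# Explicit match() call, as in the command above.
with_match=$(printf '%s\n' "$log_lines" |
  awk '{ if (match($0, "Data Plan Activated")) print $0 }')

# Pattern-only rule: the default action prints the matching line.
with_pattern=$(printf '%s\n' "$log_lines" | awk '/Data Plan Activated/')

printf '%s\n' "$with_match"
[ "$with_match" = "$with_pattern" ] && echo "same output"
```

match() becomes the better choice once you need to know where the match occurred, not only whether it occurred.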
Using the RSTART and RLENGTH variables
The RSTART and RLENGTH variables allow you to capture the position and length of the matched substring.
When a match is found, RSTART will contain the index of the first character of the matched substring, and RLENGTH will contain the length of the matched substring.
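As a small illustration, the sketch below prints both variables for one line that matches and one that does not. The input strings and the variable name report are made up for this example. Positions count from 1, and when nothing matches awk sets RSTART to 0 and RLENGTH to -1.

```shell
# Illustrative input; the second line has no digits, so match() fails there.
report=$(printf '%s\n' 'User 45678' 'no digits here' |
  awk '{ match($0, /[0-9]+/); print RSTART, RLENGTH }')

# Prints "6 5" for the first line and "0 -1" for the second.
printf '%s\n' "$report"
```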
Consider the same data file, sample_data.txt.
Suppose you want to extract the timestamp from each line:
awk '{ if (match($0, /[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/)) print substr($0, RSTART, RLENGTH) }' sample_data.txt
Output:
2024-03-10 10:15:00
2024-03-10 10:17:00
2024-03-10 10:19:00
In this command, the regular expression [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} matches the timestamp format.
Once a match is found, substr($0, RSTART, RLENGTH) extracts the substring starting from RSTART with the length of RLENGTH.
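If you only want timestamps from lines that mention Data Plan Activated, one option is to combine a pattern test with match() in the rule's condition. This is a sketch under two assumptions: the sample entries are piped in from a shell variable instead of read from the file, and the regular expression uses [0-9]+ in place of the {4} and {2} intervals, because some older awk builds do not enable interval expressions by default.

```shell
log_lines='2024-03-10 10:15:00, Data Plan Activated, User 45678
2024-03-10 10:17:00, Data Plan Deactivated, User 12345
2024-03-10 10:19:00, Payment Received, User 45678'

# The rule fires only when the line mentions the phrase AND a timestamp is found.
activated_times=$(printf '%s\n' "$log_lines" |
  awk '/Data Plan Activated/ && match($0, /[0-9]+-[0-9]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+/) {
    print substr($0, RSTART, RLENGTH)
  }')

printf '%s\n' "$activated_times"
```

Because match() is the last test evaluated in the condition, RSTART and RLENGTH still describe the timestamp when the action runs.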
Conditional Processing Based On Matches
Suppose you want to take different actions based on whether the line contains ‘Data Plan Activated’ or ‘Data Plan Deactivated’.
awk '{ if (match($0, "Data Plan Activated")) { print "Activation:", $0 } else if (match($0, "Data Plan Deactivated")) { print "Deactivation:", $0 } }' sample_data.txt
Output:
Activation: 2024-03-10 10:15:00, Data Plan Activated, User 45678
Deactivation: 2024-03-10 10:17:00, Data Plan Deactivated, User 12345
This script uses if and else if conditions to check for matches and perform different print actions.
When ‘Data Plan Activated’ is matched, it prints the line prefixed with “Activation: ”, and when ‘Data Plan Deactivated’ is matched, it prefixes the line with “Deactivation: ”.
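The branches can also do more than print the whole line. The sketch below reports only the user ID in each case, using a second match() call to locate it. The helper function user_id() and the shell variable names are hypothetical, introduced only for this illustration.

```shell
log_lines='2024-03-10 10:15:00, Data Plan Activated, User 45678
2024-03-10 10:17:00, Data Plan Deactivated, User 12345
2024-03-10 10:19:00, Payment Received, User 45678'

summary=$(printf '%s\n' "$log_lines" | awk '
  # user_id() is a hypothetical helper: it returns the digits after "User ".
  function user_id(line) {
    if (match(line, /User [0-9]+/)) {
      # Skip the 5 characters of "User " to keep only the number.
      return substr(line, RSTART + 5, RLENGTH - 5)
    }
    return "unknown"
  }
  {
    if (match($0, "Data Plan Activated")) {
      print "Activation: user " user_id($0)
    } else if (match($0, "Data Plan Deactivated")) {
      print "Deactivation: user " user_id($0)
    }
  }')

printf '%s\n' "$summary"
```

Lines that match neither phrase, such as the Payment Received entry, fall through both branches and produce no output.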
Find & Process Multiple Matches
For this section, assume sample_data.txt contains the following single line:
User 12345, Data Plan Activated, Payment Pending; User 67890, Data Plan Deactivated, Payment Complete
Imagine you need to extract and process each user’s details separately from this line.
Here’s how you can iterate over multiple matches using awk:
awk '{ n = split($0, segments, "; "); for (i = 1; i <= n; i++) { if (match(segments[i], /User [0-9]+, Data Plan (Activated|Deactivated), Payment (Pending|Complete)/)) { print segments[i] } } }' sample_data.txt
Output:
User 12345, Data Plan Activated, Payment Pending
User 67890, Data Plan Deactivated, Payment Complete
In this example, split($0, segments, "; ") splits the line into segments based on the semicolon and space delimiter.
The for loop iterates through each segment, and match() is used to check if the segment contains the desired pattern. If a match is found, that segment is printed.
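When the line has no convenient delimiter to split on, another common approach is to call match() repeatedly and trim the string after each hit. The sketch below applies that idea to the same record to pull out every user ID. The variable names record, user_ids, and rest are illustrative.

```shell
record='User 12345, Data Plan Activated, Payment Pending; User 67890, Data Plan Deactivated, Payment Complete'

user_ids=$(printf '%s\n' "$record" | awk '{
  rest = $0
  # Each pass finds the leftmost match in what is left of the line.
  while (match(rest, /User [0-9]+/)) {
    # Skip the 5 characters of "User " to keep only the number.
    print substr(rest, RSTART + 5, RLENGTH - 5)
    # Drop everything up to the end of this match before searching again.
    rest = substr(rest, RSTART + RLENGTH)
  }
}')

printf '%s\n' "$user_ids"
```

The loop ends when match() returns 0, that is, once the remaining text no longer contains the pattern.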
Mokhtar is the founder of LikeGeeks.com.