Linux AWK asort() function: Guide to Array Sorting

The asort() function in GNU AWK (gawk) sorts the values of an array.

In this tutorial, you will learn various methods for sorting arrays using the awk asort() function, from basic sorting of numeric and string values to more advanced concepts like custom sorting orders, sorting locale-sensitive data, and sorting multidimensional arrays.

 

 

Syntax and Parameters

The basic syntax of the asort() function in awk is:

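asort(source_array [, destination_array [, how]])
  • source_array: The array whose values you want to sort.
  • destination_array (optional): The array that receives the sorted values, indexed from 1 to n. If omitted, source_array is sorted in place and its original indices are replaced by 1 to n.
  • how (optional): A string that controls the comparison. It is either the name of a user-defined comparison function or one of gawk’s predefined orderings, such as "@val_num_asc", "@val_str_desc", or "@ind_str_asc" (the leading @ is required). By default, asort() sorts values in ascending order.

asort() returns the number of elements in the array.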

Here’s an example:

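echo -e "3\n2\n5\n1" | awk 'BEGIN {i=1} {a[i++]=$0} END {n=asort(a); for (i=1; i<=n; i++) print a[i]}'

Output:

1
2
3
5

This command reads a list of numbers into an array, sorts it with asort(), and prints the values. asort() returns the number of elements, and the loop runs from 1 to that count because for (i in a) does not guarantee any particular order.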

 

Sort an Array

You can sort an array using asort the same way:

awk 'BEGIN {
    speed["line1"] = 5.2
    speed["line2"] = 3.8
    speed["line3"] = 6.5
    n = asort(speed)    # asort() returns the number of elements
    for (i = 1; i <= n; i++)
        print i, speed[i]
}'

Output:

1 3.8
2 5.2
3 6.5

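In this output, the array speed is sorted in ascending order. Note that the original string indices (line1, line2, line3) are discarded and replaced by 1, 2, and 3.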

 

Custom Sorting Order

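You can pass the name of a custom comparison function as the how argument to sort the array in a custom order. The function receives the index and value of the two elements being compared and returns a negative number, zero, or a positive number:

awk '
function compare_speed(i1, v1, i2, v2,   l1, l2, s1, s2) {
    split(i1, l1, "_")
    split(i2, l2, "_")
    s1 = l1[2] + v1/10
    s2 = l2[2] + v2/10
    return (s1 < s2) ? -1 : ((s1 == s2) ? 0 : 1)
}
BEGIN {
    speed["line1_30"] = 5.2
    speed["line2_20"] = 3.8
    speed["line3_25"] = 6.5
    n = asort(speed, sorted_speed, "compare_speed")
    for (i = 1; i <= n; i++)
        print sorted_speed[i]
}'

Output:

3.8
6.5
5.2

In this output, the custom function compare_speed orders the elements by a combined metric: the number after the underscore in the index plus one tenth of the value (20.38, 25.65, and 30.52 here). The destination array sorted_speed holds only the values, indexed from 1 to n, so it cannot be used to look up the original keys.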

 

Sort String Values

The asort() function allows you to sort arrays alphabetically in a case-sensitive manner by default:

awk 'BEGIN {
    plans["a"] = "Unlimited"
    plans["b"] = "Basic"
    plans["c"] = "Premium"
    n = asort(plans)
    for (i = 1; i <= n; i++)
        print i, plans[i]
}'

Output:

1 Basic
2 Premium
3 Unlimited

By default, gawk compares strings character by character using their character values, so uppercase letters sort before lowercase ones.

 

Locale-Sensitive Sorting

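gawk takes its locale from the environment (LC_ALL, LC_COLLATE, LANG) when it starts. Assigning LANG inside the program only creates an ordinary awk variable and does not change the locale. In addition, in its default (non-POSIX) mode, gawk compares strings character by character rather than with the locale’s collation rules, so accented letters are ordered by their character values:

awk 'BEGIN {
    service_categories["a"] = "Élite"
    service_categories["b"] = "économique"
    service_categories["c"] = "Affaires"
    n = asort(service_categories)
    for (i = 1; i <= n; i++)
        print i, service_categories[i]
}'

Output:

1 Affaires
2 Élite
3 économique

In this output, “Élite” comes before “économique” because É (U+00C9) has a lower character value than é (U+00E9). French dictionary order would place “économique” first.

If you need true locale-aware ordering, one option is to pipe the values through the external sort command with LC_ALL set to the locale you want.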

 

Sort Subarrays (Multidimensional Array)

asort() works on a single array at a time. To sort a multidimensional array (an array of arrays, which is a gawk extension), call asort() on each subarray.

Here’s how you can sort a multidimensional array:

awk '
function sort_subarray(array, n,    i) {
    for (i = 1; i <= n; i++) {
        asort(array[i])
    }
}
BEGIN {
    plans[1]["feature1"] = "Unlimited Calls"
    plans[1]["feature2"] = "10GB Data"
    plans[2]["feature1"] = "500MB Data"
    plans[2]["feature2"] = "100 Min Calls"
    plans[3]["feature1"] = "2GB Data"
    plans[3]["feature2"] = "Unlimited Texts"

    sort_subarray(plans, 3)

    for (plan = 1; plan <= 3; plan++) {
        print "Plan " plan ":"
        n = length(plans[plan])
        for (f = 1; f <= n; f++)
            print "\t" plans[plan][f]
    }
}'

Output:

Plan 1:
	10GB Data
	Unlimited Calls
Plan 2:
	100 Min Calls
	500MB Data
Plan 3:
	2GB Data
	Unlimited Texts

In this output, each plan’s features are sorted within their respective subarrays.

The sort_subarray function iterates through each subarray and applies asort() to sort the features of each plan.

 

awk function asort never defined

If you’re encountering the error “awk function asort never defined”, it likely means that your version of AWK does not support the asort function.

The asort function is available in GNU AWK (gawk) but not present in other versions of AWK, such as nawk or mawk.

To resolve this issue, you need to install gawk:

Debian-based systems like Ubuntu:

sudo apt-get update
sudo apt-get install gawk

RPM-based systems like Red Hat:

sudo dnf install gawk