Python NumPy arange() Tutorial

The NumPy arange() function is used to generate a sequence of values within a given interval. You can think of it as a numerical range maker. It allows you to create a NumPy array with evenly spaced values within a specified range.

Compared with Python’s built-in range(), it offers more functionality, such as the ability to work with floating point numbers and to return the values directly as a NumPy array.

Throughout this tutorial, we’ll explore the syntax, parameters, and various use cases of the np.arange() function, showing how it helps with data creation and manipulation. We’ll also look at the differences between arange() and other functions such as linspace() and Python’s range().



Syntax and Parameters

The np.arange() function in NumPy has a straightforward syntax:

numpy.arange([start, ]stop, [step, ]dtype=None)

The function takes the following arguments:

  • start: This is an optional argument, representing the start of the interval. If not provided, the function assumes start = 0.
  • stop: This is a mandatory argument that defines the end of the interval. The stop value is not included in the generated sequence, except in some floating point cases where rounding affects the length of the output.
  • step: This is also an optional argument, defining the spacing between values. The default step size is 1.
  • dtype: This is an optional argument where you can specify the desired data type of the resulting array. If not provided, the function will infer the data type from the other input arguments.

Here’s how to use the np.arange() function with the dtype parameter:

import numpy as np
arr = np.arange(0, 20, 2, dtype=int)


array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In this example, we create an array with the dtype parameter explicitly set to int. We define the interval [0,20) with a step value of 2.

The function returns an array with evenly spaced integer values within the given interval.
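
As a quick illustration of the parameters described above, the sketch below calls np.arange() with one, two, and three positional arguments and then checks the inferred data type. The variable names are only for illustration.

```python
import numpy as np

# Only stop: start defaults to 0 and step defaults to 1
only_stop = np.arange(5)            # values 0, 1, 2, 3, 4

# start and stop: step defaults to 1
start_stop = np.arange(2, 5)        # values 2, 3, 4

# start, stop, and step
with_step = np.arange(2, 10, 3)     # values 2, 5, 8

# Without dtype, the type is inferred from the arguments
print(only_stop.dtype.kind)         # 'i' (an integer type)
print(np.arange(5.0).dtype)         # float64
```

The exact integer width chosen for integer arguments depends on the platform, which is why the example checks only the kind of the type.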


What does np.arange function do?

The np.arange function, part of the NumPy library, generates a one-dimensional array of numeric values within a defined interval.

The elements in the array are evenly spaced as per the interval and step size provided to the function.
Here is another demonstration of the function:

import numpy as np
arr = np.arange(5, 15)


array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In this example, we use the NumPy arange function to create an array with values between 5 and 15.

Notice how the resulting array begins with the start value, 5, and includes every subsequent integer (due to the default step size of 1), up to but not including the end value, 15.


Generating an Array of Evenly Spaced Values

The NumPy arange function generates an array with evenly spaced values based on a specified interval and step value. Let’s consider a more detailed example:

import numpy as np

# create an array with a specified step value
arr = np.arange(0, 50, 5)


array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])

In the above code, we use the NumPy arange function to create an array starting from 0, ending at 50, with a step size of 5. The array returned by the function includes evenly spaced values within the defined interval.
The step value dictates the difference between consecutive numbers in the array.

In this example, every number is 5 greater than the preceding number, indicating an evenly spaced sequence as per the interval and step size.
This kind of array with evenly spaced elements is helpful in scenarios where you need data in a particular range with a specific difference between consecutive values.
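
To make the spacing concrete, here is a small sketch that checks the gap between neighbouring elements with np.diff() and compares the array length with the ceil((stop - start) / step) rule. The stop value of 52 is chosen only to show what happens when the interval is not a multiple of the step.

```python
import math
import numpy as np

start, stop, step = 0, 52, 5
arr = np.arange(start, stop, step)

# Every gap between neighbouring elements equals the step
gaps = np.diff(arr)
print(gaps)

# For integer arguments the length is ceil((stop - start) / step)
expected_len = math.ceil((stop - start) / step)
print(len(arr), expected_len)   # 11 11
```

The last element is 50, the largest value of the form start + k * step that is still below stop.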


Using arange() with Floating Point Numbers

The np.arange() function is versatile and can also be used to generate an array of floating point numbers.

This is especially useful when you need a more granular sequence of decimal values.
Here’s how you can create an array with floating point numbers:

import numpy as np

# Create an array with floating point numbers
arr = np.arange(0.0, 1.0, 0.1)


array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

In this example, we use the NumPy arange function to create an array with evenly spaced decimal values between 0.0 and 1.0.

The function takes the start, stop, and step values as arguments, generating an array that starts at 0.0 (inclusive) and ends at 1.0 (exclusive), with a step value of 0.1.
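As a result, the function returns an array of floating-point numbers with evenly spaced elements as per the interval specified. This is useful when you need a sequence of fractional values, but keep in mind that floating point rounding can make the length of the result hard to predict for non-integer steps. When the exact number of elements matters, np.linspace() is usually the safer choice.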
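
Because the length of the result is computed from floating point values, a non-integer step can occasionally produce one element more than expected. The sketch below shows two possible ways around this, assuming the number of points is known in advance: build the sequence from integers and scale it, or use np.linspace(). The example values are chosen only for illustration.

```python
import numpy as np

# A float step: the length depends on floating point rounding,
# so on some inputs the stop value can end up in the result
risky = np.arange(1, 1.3, 0.1)
print(len(risky))

# Alternative 1: generate integers, then scale them
scaled = np.arange(10, 13) / 10          # 3 elements: 1.0, 1.1, 1.2
print(scaled)

# Alternative 2: ask linspace for an exact number of points
exact = np.linspace(1.0, 1.2, num=3)     # 3 elements, endpoint included
print(exact)
```

Both alternatives fix the number of elements up front, so rounding can only affect the last digits of the values, not the size of the array.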


Negative Step Size in arange() (Reverse Array)

The np.arange() function also supports a negative step size, allowing us to create a descending sequence of numbers.
Let’s illustrate this with an example:

import numpy as np

# Create an array with a negative step size
arr = np.arange(10, 0, -1)


array([10,  9,  8,  7,  6,  5,  4,  3,  2,  1])

In this code snippet, we use the NumPy arange function to create a descending array starting from 10 and ending at 0 (exclusive), with a step size of -1.
When a negative step size is used, the start value should be greater than the stop value to get a non-empty array.

The resulting array comprises a sequence of values, with each value being 1 less than the preceding value due to the step value of -1.

This feature is particularly useful when you need to generate a reverse array.
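
The following sketch illustrates the two points above: a negative step combined with a start value below the stop value yields an empty array, and counting down gives the same values as reversing an ascending array with slicing. The variable names are illustrative.

```python
import numpy as np

# A negative step with start < stop gives an empty array
empty = np.arange(0, 10, -1)
print(empty.size)                     # 0

# Counting down is equivalent to reversing an ascending array
descending = np.arange(10, 0, -1)
reversed_view = np.arange(1, 11)[::-1]
print(np.array_equal(descending, reversed_view))   # True
```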


Handling Memory Efficiency with dtype Parameter

One of the advantages of using NumPy functions for array creation is their efficient memory management.

The np.arange() function offers a dtype parameter, which allows you to specify the data type of the elements in the array.

By selecting an appropriate data type, you can control the memory allocation for the array.
Here’s an example of how to do it:

import numpy as np

# Create an array with dtype parameter
arr = np.arange(0, 10, dtype=np.int8)


array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int8)

In this example, we create an array using the NumPy arange function and explicitly specify the data type of the elements as np.int8.

This is an 8-bit integer type, which can hold values from -128 to 127, thus using less memory compared to higher bit integers.
This approach can be particularly helpful when you are working with large arrays and want to optimize memory usage.
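
To see the effect on memory, the sketch below builds the same sequence with two integer types and compares their sizes through the itemsize and nbytes attributes. The array length of 100 is arbitrary.

```python
import numpy as np

small = np.arange(0, 100, dtype=np.int8)
large = np.arange(0, 100, dtype=np.int64)

# nbytes = number of elements * bytes per element
print(small.itemsize, small.nbytes)   # 1 100
print(large.itemsize, large.nbytes)   # 8 800

# np.iinfo reports the range a given integer type can hold
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)   # -128 127
```

Values outside that range cannot be represented in int8, so a wider type is needed for longer sequences.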


Differences between arange() and linspace()

While both np.arange() and np.linspace() are used to create arrays with evenly spaced values, they differ in how they define the spacing and the endpoint.
To clarify the differences, let’s create similar arrays using both functions.
With np.arange():

import numpy as np
arr = np.arange(0, 10, 2)


array([0, 2, 4, 6, 8])

With np.linspace():

import numpy as np
arr = np.linspace(0, 8, 5)


array([0., 2., 4., 6., 8.])

The primary differences are:

  1. Endpoint Inclusion: np.arange() excludes the stop value, while np.linspace() includes the endpoint by default (this can be turned off with endpoint=False).
  2. Return Type: Both functions return a numpy.ndarray, but np.linspace() returns floating point numbers by default, even when the inputs are integers, whereas np.arange() infers the data type from its arguments.
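
The differences listed above can be checked directly. This sketch uses the endpoint and retstep parameters of np.linspace() to reproduce the arange() result and to show the spacing that linspace() computed. The variable names are illustrative.

```python
import numpy as np

by_step = np.arange(0, 10, 2)

# endpoint=False makes linspace stop short of 10, like arange
by_count = np.linspace(0, 10, num=5, endpoint=False)
print(np.array_equal(by_step, by_count))   # True

# retstep=True also returns the spacing linspace computed
values, spacing = np.linspace(0, 8, num=5, retstep=True)
print(spacing)                             # 2.0

# arange keeps integers, linspace returns floats by default
print(by_step.dtype.kind, by_count.dtype)  # i float64
```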


Pros and Cons of Using arange() Over linspace()

np.arange() and np.linspace() are both powerful functions in NumPy that generate arrays with evenly spaced values.

However, depending on your needs, one function may be more suitable than the other. Here’s a comparison of their advantages and disadvantages.
Pros of Using arange():

  1. Simplicity: np.arange() is similar to Python’s built-in range function, making it intuitive for those transitioning from vanilla Python to NumPy.
  2. Step-Based Spacing: np.arange() generates sequences based on a defined ‘step’ size, which is helpful when you want specific increments between values.

Cons of Using arange():

  1. Excluding the Stop Value: np.arange() doesn’t include the stop value in the array, which can be a source of confusion.
  2. Precision with Floating Point: When using np.arange() with floating-point step values, the output size may not be as predictable due to the finite precision of floating point arithmetic.

Pros of Using linspace():

  1. Including the Endpoint: np.linspace() includes the endpoint by default, which can be desirable in many cases.
  2. Control Over Number of Elements: With np.linspace(), you can specify the exact number of elements you want in your array, which provides finer control when you need a specific size of the array.

Cons of Using linspace():

  1. Complexity: np.linspace() requires understanding of the ‘num’ parameter, which may not be as straightforward as the ‘step’ parameter in np.arange().

In summary, your choice between np.arange() and np.linspace() will depend on your specific needs, whether you want control over the step size or the total number of points, and whether you want the endpoint included or not.


Differences Between arange() and Python range()

The Python built-in range() function and NumPy’s arange() function both generate sequences of numbers, but they have significant differences in their use, flexibility, and functionality.
Here’s a comparison using examples from both:
With Python’s range():

arr = list(range(0, 10, 2))


[0, 2, 4, 6, 8]

With NumPy’s arange():

import numpy as np
arr = np.arange(0, 10, 2)


array([0, 2, 4, 6, 8])

Here are the key differences:

  1. Return Type: Python’s range() function returns a range object that needs to be converted to a list to display its values. On the other hand, np.arange() directly returns a NumPy array (numpy.ndarray).
  2. Usage with Floats: Python’s range() only works with integer start, stop, and step values, while np.arange() can accept floating point numbers.
  3. Performance: np.arange() allocates the whole array in memory at once, which makes it efficient for vectorized numerical operations on large ranges. In contrast, Python’s range() is lazy: it produces elements on the fly and uses a small, constant amount of memory, which makes it the better choice for plain loops.

These differences highlight how np.arange() can be more versatile and efficient for numerical computations involving arrays, particularly with large datasets or complex numerical tasks.
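
Here is a short sketch of these differences in practice. It assumes the standard CPython interpreter for the sys.getsizeof() comparison. The variable names are illustrative.

```python
import sys
import numpy as np

# range() rejects float arguments
try:
    range(0, 1, 0.1)
    float_ok = True
except TypeError:
    float_ok = False
print(float_ok)                       # False

# arange() returns an array that supports vectorized arithmetic
squares = np.arange(5) ** 2           # values 0, 1, 4, 9, 16

# range() is lazy: its object size does not grow with its length
same_size = sys.getsizeof(range(10)) == sys.getsizeof(range(10**6))
print(same_size)                      # True

# arange() stores every element
big = np.arange(10**6)
print(big.nbytes == big.size * big.itemsize)   # True
```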


NumPy arange() in the Real World

I was hired by a client to help them understand and visualize their large dataset, which consisted of temperature readings taken every hour over the past year.

The raw dataset was just a plain sequence of 24 * 365 = 8760 temperature values, with no corresponding timestamps.

The client wanted to see how the temperature trend evolved over time, and the first thing I realized was that I needed to align the temperature values with their respective timestamps.

Since the temperature data was collected hourly, I needed to create an array representing each hour of the year as a way to model time.

This is where the NumPy arange() function comes in.

I wrote the following piece of code to generate the required time data:

import numpy as np

# Creating an array with each hour of the year
hours = np.arange(0, 8760)

This code efficiently created a NumPy array from 0 to 8759 (representing each hour of the year), which I used as the X-axis values for the plot.

Next, I plotted the temperature data against these timestamps using matplotlib.

The temperature readings were stored in the temperature_values array. I used the following code to create the plot:

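import matplotlib.pyplot as plt
# temperature_values holds the 8760 hourly readings described above
plt.plot(hours, temperature_values)
plt.xlabel('Hour of the Year')
plt.ylabel('Temperature')
plt.show()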

The NumPy arange() function made it incredibly easy and efficient to create the necessary time data.
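
If calendar timestamps are preferred over plain hour numbers, one possible extension is to add the hour offsets to a starting date using NumPy’s datetime64 and timedelta64 types. This is only a sketch: the start date below is a hypothetical placeholder, since the actual year of the readings is not stated here.

```python
import numpy as np

hours = np.arange(0, 8760)

# Hypothetical start of the measurement year (not given in the text)
first_reading = np.datetime64('2023-01-01T00')

# Interpret each integer as an offset of that many hours
timestamps = first_reading + hours.astype('timedelta64[h]')

print(timestamps[0])     # 2023-01-01T00
print(timestamps[-1])    # 2023-12-31T23
```

The resulting array has one entry per reading, so it could be passed to plt.plot() in place of the hours array.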

