NumPy random seed (generate predictable random numbers)

In computational science, a random seed is a starting point for the sequence of pseudorandom numbers that are generated.

These numbers appear random, but they follow a deterministic sequence. The seed determines the initial state of this sequence.
In Python’s NumPy library, you can set the random seed using the numpy.random.seed() function, which seeds NumPy’s global (legacy) random number generator. This makes the output of random number generation predictable and reproducible.

 

 

Pseudorandom vs. True Random Numbers

Random numbers can be broadly classified into two categories: pseudorandom numbers and true random numbers.

Pseudorandom Numbers

Pseudorandom numbers are generated using deterministic algorithms. Given the same initial seed, they produce the same sequence of numbers every time.

Pseudorandom numbers are efficient to generate and suitable for most applications, including simulations and statistical sampling.
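As a small illustration of this determinism, the sketch below creates two separate generators from the same seed and compares their output. The seed value 123 and the variable names are arbitrary choices made for this example.

```python
import numpy as np

# Two independent generators seeded with the same (arbitrary) value.
gen_a = np.random.RandomState(123)
gen_b = np.random.RandomState(123)

seq_a = gen_a.rand(5)
seq_b = gen_b.rand(5)

# Same seed and same algorithm: the sequences match element for element.
same_sequence = np.array_equal(seq_a, seq_b)
print(same_sequence)  # True
```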

True Random Numbers

True random numbers are generated from fundamentally random physical processes.

They are not predictable and do not follow an algorithm. True random numbers are typically used in cryptographic applications where unpredictability is crucial.
In Python, true random numbers can be obtained using specialized hardware or online services, but they are outside the scope of this tutorial.

 

How to set a random seed in NumPy?

You use the numpy.random.seed() function and provide an integer that will be used as the seed.
Here’s an example:

import numpy as np
np.random.seed(5)
print(np.random.rand())

Output:

0.22199317108973948

In this code, the random seed is set to 5. Every time you run this code, the random float generated will be the same.

You can change the seed to any integer to generate a different sequence of random numbers, but the sequence corresponding to a specific seed will always be the same.
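To see both halves of that statement at once, here is a minimal sketch that seeds the generator twice with 5 and once with 6 (the second seed is an arbitrary choice for illustration) and compares the draws.

```python
import numpy as np

np.random.seed(5)
first_run = np.random.rand(3)

np.random.seed(5)   # re-seeding with the same value restarts the same sequence
second_run = np.random.rand(3)

np.random.seed(6)   # a different seed starts a different sequence
other_seed = np.random.rand(3)

print(np.array_equal(first_run, second_run))  # True
print(np.array_equal(first_run, other_seed))  # False
```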

 

Why use a random seed?

When working with random numbers, consistency and reproducibility can be crucial, especially in scientific computations, simulations, or machine learning tasks.

By using a random seed, you can ensure that the random numbers generated are the same every time the code is run.
Here’s a simple demonstration:
Without seeding:

import numpy as np
random_numbers_without_seed = [np.random.rand() for _ in range(5)]
print(random_numbers_without_seed)

Output (your values will differ on each run):

[0.9507143064099162, 0.7319939418114051, 0.5986584841970366, 0.15601864044243652, 0.15599452033620265]

With seeding:

import numpy as np
np.random.seed(42)
random_numbers_with_seed = [np.random.rand() for _ in range(5)]
print(random_numbers_with_seed)

Output:

[0.3745401188473625, 0.9507143064099162, 0.7319939418114051, 0.5986584841970366, 0.15601864044243652]

In the first code snippet, without setting a seed, the random numbers will be different each time you run the code.

In the second snippet, where we set the seed to 42, the numbers will be identical each time you run it.

This allows for testing, validation, and sharing of your code in a manner that others can replicate exactly.
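One place this matters in practice is splitting a dataset. The sketch below is only an illustration: split_indices is a hypothetical helper written for this example, not a NumPy function. It shows that seeding before a shuffle gives the same train/test split on every call.

```python
import numpy as np

def split_indices(n_items, test_fraction, seed):
    """Hypothetical helper: shuffle item indices and split them into train/test."""
    np.random.seed(seed)
    indices = np.random.permutation(n_items)
    n_test = round(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]

train_a, test_a = split_indices(10, 0.3, seed=42)
train_b, test_b = split_indices(10, 0.3, seed=42)

# Both calls used the same seed, so the splits are identical.
same_split = np.array_equal(train_a, train_b) and np.array_equal(test_a, test_b)
print(same_split)  # True
```

Anyone who runs the function with the same seed gets the same split, which makes results that depend on it easier to compare.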

 

How to set the global random seed?

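Setting the global random seed with np.random.seed() affects every function called directly on np.random (such as np.random.rand or np.random.randint), because they all share one global generator state. Generator objects created with np.random.default_rng() keep their own state and are not affected. Seeding is a crucial tool for making code involving random processes reproducible.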
Here’s an example:

import numpy as np

np.random.seed(42)
print(np.random.rand())
print(np.random.randint(10, 20))

Output:

0.3745401188473625
17

By setting the seed to 42, both the random float and the random integer generated will be the same each time the code is run.

This demonstrates how setting the global seed affects the np.random functions that draw from the global state.
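The global seed applies to functions called directly on np.random. A Generator created with np.random.default_rng() carries its own state, so it is not tied to the global sequence. The following sketch, with arbitrary seed values, checks that drawing from such a generator does not disturb the global one.

```python
import numpy as np

# A Generator has its own state, separate from the global legacy state.
rng = np.random.default_rng(7)   # seed 7 is an arbitrary choice
np.random.seed(42)               # seeds only the global legacy state

global_draw = np.random.rand()   # comes from the global state
_ = rng.random(100)              # draws from rng; global state is untouched
next_global = np.random.rand()

# Replaying the global sequence without using rng gives the same two values.
np.random.seed(42)
replay = [np.random.rand(), np.random.rand()]
unaffected = replay == [global_draw, next_global]
print(unaffected)  # True
```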

Examples of functions affected

The global random seed in NumPy affects a wide range of functions that generate random numbers or perform random operations.

Here are examples of some of these functions.

rand

Generates random floats in the half-open interval [0, 1):

import numpy as np
np.random.seed(0)
print(np.random.rand(3))

Output:

[0.5488135  0.71518937 0.60276338]

randint

Generates random integers from a lower bound (inclusive) up to an upper bound (exclusive):

import numpy as np
np.random.seed(0)
print(np.random.randint(1, 10, 3))

Output:

[6 1 4]

shuffle

Shuffles the elements of a sequence in place:

import numpy as np
np.random.seed(0)
arr = [1, 2, 3, 4, 5]
np.random.shuffle(arr)  # modifies arr in place and returns None
print(arr)

Output:

[3, 1, 2, 4, 5]

Each of these functions is affected by the global seed, and setting the seed ensures that the results are consistent across different runs of the code.
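The same holds for other np.random functions. As a quick check, the sketch below calls choice, normal, and permutation after seeding and confirms that a second seeded pass returns identical results. The wrapper draw_all is a hypothetical helper used only to repeat the calls.

```python
import numpy as np

def draw_all():
    """Hypothetical helper: seed, then call several np.random functions."""
    np.random.seed(0)
    return (
        np.random.choice([10, 20, 30, 40], size=3),
        np.random.normal(loc=0.0, scale=1.0, size=3),
        np.random.permutation(5),
    )

first = draw_all()
second = draw_all()

all_match = all(np.array_equal(a, b) for a, b in zip(first, second))
print(all_match)  # True
```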

 

Best practices for seeding

Setting the seed early in your code

Set the seed at the beginning of your script, or at the start of a function that requires reproducible random numbers, so that every random draw that follows comes from the seeded sequence.
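For a function that needs its own reproducible numbers, one possible pattern is to accept the seed as a parameter and build a local generator from it with np.random.default_rng(). This is a sketch under that assumption; simulate_dice is a hypothetical name invented for the example.

```python
import numpy as np

def simulate_dice(n_rolls, seed=None):
    """Hypothetical helper: roll a six-sided die n_rolls times."""
    rng = np.random.default_rng(seed)        # local generator, seeded first
    return rng.integers(1, 7, size=n_rolls)  # upper bound 7 is exclusive

rolls_a = simulate_dice(10, seed=2024)
rolls_b = simulate_dice(10, seed=2024)

print(np.array_equal(rolls_a, rolls_b))  # True
```

Because the generator is local to the function, other code that uses random numbers cannot shift its sequence.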

Choosing fixed seed values vs time-based seeds

A fixed seed value, chosen arbitrarily (such as 42), always leads to the same sequence of random numbers.

Seeds derived from something that changes, like the current date or time, can also be used, but they won’t ensure reproducibility across different runs or machines unless the seed value is recorded.
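If you want a different sequence on each run but still need to replay a particular run later, one option is to derive the seed from the clock and record it. The sketch below assumes this approach; the modulo keeps the value inside the range that np.random.seed() accepts (0 to 2**32 - 1).

```python
import time
import numpy as np

# Derive a seed from the clock, but keep it so the run can be replayed.
seed_value = int(time.time()) % (2**32)
np.random.seed(seed_value)
sample = np.random.rand(3)
print(f"Seed used for this run: {seed_value}")

# Seeding again with the recorded value reproduces the same sample.
np.random.seed(seed_value)
replayed = np.random.rand(3)
print(np.array_equal(sample, replayed))  # True
```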

Managing seeds for reproducibility across code executions

It’s essential to document the seed values used in your code to ensure that others can reproduce the exact results.

Here’s a code snippet that shows best practices:

import numpy as np

# Set the seed early
seed_value = 42
np.random.seed(seed_value)
random_numbers = np.random.rand(3)
print(f"Seed: {seed_value}")
print(f"Random Numbers: {random_numbers}")

Output:

Seed: 42
Random Numbers: [0.37454012 0.95071431 0.73199394]

These practices ensure that your code’s random processes are transparent, controlled, and reproducible, both for you and others who might use your code.

 

Implementing simulations with reproducible results

When implementing simulations that require random number generation, it is often crucial to reproduce the results.

Using a fixed seed is the key to achieving this.
Here’s an example of a simple Monte Carlo simulation to estimate the value of π:

import numpy as np
np.random.seed(0)
num_points = 10000
inside_circle = 0
for _ in range(num_points):
    x, y = np.random.rand(2)
    if x**2 + y**2 <= 1:
        inside_circle += 1
estimated_pi = (inside_circle / num_points) * 4
print("Estimated π:", estimated_pi)

Output:

Estimated π: 3.1428

By setting the seed at the beginning of the simulation, you can ensure that the results are consistent every time you run it.

This allows you to compare changes, validate your simulation, and share it with confidence that others will obtain the same results.
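The same simulation can also be written without a Python loop. The version below is a sketch, not a replacement for the code above: estimate_pi is a hypothetical function name, and it uses a local generator from np.random.default_rng(), so its estimate will not match the loop version digit for digit even with the same seed.

```python
import numpy as np

def estimate_pi(num_points, seed):
    """Hypothetical helper: Monte Carlo estimate of pi from seeded points."""
    rng = np.random.default_rng(seed)
    points = rng.random((num_points, 2))           # x, y pairs in [0, 1)
    inside = (points ** 2).sum(axis=1) <= 1.0      # inside the quarter circle
    return 4.0 * inside.mean()

estimate_a = estimate_pi(100_000, seed=0)
estimate_b = estimate_pi(100_000, seed=0)

print(estimate_a == estimate_b)  # True
```

Passing the seed as an argument makes it easy to rerun the simulation with several seeds and see how much the estimate varies.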

 

Resources

https://numpy.org/doc/stable/reference/random/generated/numpy.random.seed.html
