Most Python programmers have used NumPy's random seed at some point. We often reuse a line of code without knowing exactly what purpose it serves.
NumPy's random seed is a common example. This article explains what the random seed is, why it is needed, and how to use it.
What is NumPy Random Seed?
As the name signifies, the purpose of random seed is related to random numbers. The syntax mostly used is:
import numpy as np

np.random.seed()
random is the module that the NumPy library offers for working with random numbers. Its functions do not generate ‘truly’ random numbers; they generate pseudo-random numbers.
By pseudo-random numbers we mean numbers that are computed deterministically rather than generated by a truly random process. We will explain pseudo-random numbers in detail in the next section.
The functions in the random module generate pseudo-random numbers based on a seed value.
What is the Pseudo-Random Number?
As the name signifies, a pseudo-random number is not a ‘truly’ random number. Pseudo-random numbers are computer-generated numbers that look random but are actually pre-determined.
Our computer system works on algorithms. If we give the same input to an algorithm, the output remains the same.
Computer scientists have created a set of algorithms that generate pseudo-random numbers approximating the properties of random numbers. These algorithms are called “pseudo-random number generators.”
The NumPy random functions generate their numbers with such a pseudo-random number generator, and the seed sets its starting point.
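To make the idea concrete, here is a minimal sketch of a toy pseudo-random number generator (a linear congruential generator). The function name `lcg` and its constants are illustrative assumptions, not what NumPy uses; NumPy's legacy generator is based on the much more elaborate Mersenne Twister. The sketch only shows that the whole sequence is determined by the seed.

```python
def lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Return `count` pseudo-random integers from a toy linear congruential generator."""
    state = seed
    values = []
    for _ in range(count):
        # The next state depends only on the previous state and fixed constants.
        state = (a * state + c) % m
        values.append(state)
    return values


first_run = lcg(seed=42, count=5)
second_run = lcg(seed=42, count=5)
other_seed = lcg(seed=43, count=5)

print(first_run == second_run)  # same seed, same sequence
print(first_run == other_seed)  # different seed, different sequence
```

The numbers look arbitrary, but rerunning with the same seed reproduces them exactly.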
Random Seed Importance
NumPy random() function generates pseudo-random numbers based on some value. This value is called a seed value.
The numpy.random.seed() function initializes the state of NumPy's global random number generator. Every time this function is called, the generator is re-seeded.
A given seed value always puts the generator in the same state. So, a particular seed value will produce the same random numbers even across multiple executions.
The same seed value leads to the same random numbers even on different machines, provided the environment remains the same.
import numpy as np

np.random.seed(101)  # Here, 101 is the seed value
np.random.randint(low=1, high=10, size=10)
With seed value 101, the above random function generates the same output every time.
We can also use a different seed value. For example, seed value 100 generates the output below every time for the same random function.
import numpy as np

np.random.seed(100)  # Here, 100 is the seed value
np.random.randint(low=1, high=10, size=10)
NumPy.random has no Seed Number
Now the question arises: what happens if we don’t give any seed number? Let’s execute the code with no seed number.
import numpy as np

np.random.seed()
np.random.randint(low=1, high=10, size=10)
Output on two executions:
We executed the code twice and the output was different both times. With no seed number, NumPy picks a fresh seed by itself, so different random numbers are generated every time.
When we don’t assign a seed number, NumPy seeds the generator from the operating system’s source of randomness if one is available and falls back to the system clock otherwise.
numpy.random.seed(0) sets the random seed to ‘0’. The pseudo-random numbers generated with seed value 0 will start from the same point every time. Fixing the seed like this is widely used when debugging, because it makes a run repeatable.
import numpy as np

np.random.seed(0)
np.random.randint(low=1, high=10, size=10)
Output on two executions:
From the above example, in both executions, we got the same set of random numbers with the same seed value ‘0’.
numpy.random.seed(101) sets the random seed to ‘101’. The pseudo-random numbers generated with seed value ‘101’ will start from the same point every time.
import numpy as np

np.random.seed(101)
np.random.randint(low=1, high=10, size=10)
Output on two executions:
From the above example, in both executions, we got the same set of random numbers with seed value 101.
random seed scope
What will happen if we change the random seed scope? Let’s try with an example.
import numpy as np

np.random.seed(242)
print("random 1: ", np.random.randint(0, 10, 5))
print("random 2: ", np.random.randint(0, 10, 5))

np.random.seed(242)
print("random 3: ", np.random.randint(0, 10, 5))
From the above code we see that the outputs of ‘random 1’ and ‘random 2’ are different. The seed value ‘242’ fixes the starting point of the sequence, and ‘random 1’ takes the first five numbers from it.
The ‘random 2’ array is not re-seeded; it simply takes the next five numbers from the same sequence. When we set the seed value to ‘242’ again for ‘random 3’, the generator restarts and the same values as ‘random 1’ come out.
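A short check can confirm that the second draw after a seed is just as reproducible as the first. This is a sketch; the helper name `two_draws` is an assumption introduced for illustration.

```python
import numpy as np


def two_draws(seed):
    """Seed once, then draw two arrays from the same stream."""
    np.random.seed(seed)
    first = np.random.randint(0, 10, 5)
    second = np.random.randint(0, 10, 5)  # continues the stream, no re-seeding
    return first, second


run_a = two_draws(242)
run_b = two_draws(242)

print(np.array_equal(run_a[0], run_b[0]))  # True: first draws match
print(np.array_equal(run_a[1], run_b[1]))  # True: second draws match too
```

So one seed call at the top of a script is enough to make every later draw repeatable, as long as the draws happen in the same order.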
Seed to the Time
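Time never stops. It keeps on moving, which makes the current time a convenient seed. Every time we execute the code, the current time has changed, so the seed value changes and we get different random numbers on every execution. Note that int(time.time()) only changes once per second, so two executions within the same second get the same seed.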
import numpy as np
import time

np.random.seed(int(time.time()))
np.random.randint(low=1, high=10, size=10)
Output on two executions:
As we can see from the above example, both executions generate different random numbers with the current time as the seed value.
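One caveat with a time-based seed is its resolution. The sketch below, with the hypothetical helper `draws_for`, shows that two runs sharing the same whole-second timestamp produce identical numbers, and shows one possible way to get a finer-grained seed from the nanosecond clock. The reduction modulo 2**32 is there because np.random.seed only accepts integers from 0 to 2**32 - 1.

```python
import time

import numpy as np


def draws_for(seed):
    """Seed the global generator and return ten integers."""
    np.random.seed(seed)
    return np.random.randint(low=1, high=10, size=10)


# Two "runs" that start within the same second share the same seed.
second_resolution_seed = int(time.time())
same_second = np.array_equal(
    draws_for(second_resolution_seed),
    draws_for(second_resolution_seed),
)
print(same_second)  # True: identical seed, identical numbers

# A finer-grained alternative: the nanosecond clock, reduced to the valid range.
fine_seed = time.time_ns() % (2**32)
print(draws_for(fine_seed))
```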
Random Seed Multiprocessing
Multiprocessing is implemented to improve the performance of the system. Each process executes independently of the others.
Imagine we run several processes with the same seed value: the output will be the same for every process. Then what is the use of running multiple processes? It would defeat the purpose of multiprocessing.
Let’s implement two processes with the same seed value:
import numpy as np
from multiprocessing import Process

def square_num():
    """ function to print square of random number """
    np.random.seed(101)
    num = np.random.random()
    print("Square of " + str(num) + " is: " + str(num*num))

if __name__ == '__main__':
    p1 = Process(target=square_num)  # Process 1
    p2 = Process(target=square_num)  # Process 2
    # Start processes
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    # Both processes finished
    print("Done")
From the above example, we can see that both processes generated the same random number from the same seed value, so both give the same output.
So, giving each process a different seed is the key. You can do this by explicitly setting a different seed number in every process, or by calling np.random.seed() with no argument so that each process picks a fresh seed by itself.
import numpy as np
from multiprocessing import Process

def square_num():
    """ function to print square of random number """
    np.random.seed()  # no argument: each process picks a fresh seed
    num = np.random.random()
    print("Square of " + str(num) + " is: " + str(num*num))

if __name__ == '__main__':
    p1 = Process(target=square_num)  # Process 1
    p2 = Process(target=square_num)  # Process 2
    # Start processes
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    # Both processes finished
    print("Done")
With multiprocessing, letting each process pick its own seed works very well. Processes p1 and p2 generate different random numbers, so the output of the two processes varies.
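If the workers should get different numbers and the whole run should still be reproducible, one option is to derive a separate seed for each worker from a single root seed. The sketch below uses np.random.SeedSequence and np.random.default_rng, which belong to NumPy's newer Generator API (NumPy 1.17 and later) and are not used elsewhere in this article. For brevity it calls the worker function in the same process; handing each child seed to its own Process is the assumed real-world use.

```python
import numpy as np


def square_num(rng):
    """Return a random number drawn from `rng` together with its square."""
    num = rng.random()
    return num, num * num


def run_workers(root_seed, n_workers):
    """Give every worker its own generator, all derived from one root seed."""
    child_seeds = np.random.SeedSequence(root_seed).spawn(n_workers)
    return [square_num(np.random.default_rng(s)) for s in child_seeds]


results = run_workers(root_seed=101, n_workers=2)
for num, squared in results:
    print("Square of " + str(num) + " is: " + str(squared))

# Running again with the same root seed reproduces both workers' numbers.
print(results == run_workers(root_seed=101, n_workers=2))
```

Each worker sees a different stream, yet the full set of results can be regenerated from the single root seed.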
Seed the same across computers
NumPy random seed with the same value works the same way across computers. With the same Python version and the same operating system, numpy.random.seed() generates the same values on different computers if it is given the same seed value.
Random seed after 1000 times
What happens when we run the same seed more than 1000 times?
import numpy as np

for i in range(1100):
    np.random.seed(100)  # re-seed with the same value on every iteration
    print(np.random.randint(low=1, high=10, size=10))
I ran numpy.random.seed with seed value ‘100’ more than 1000 times, and the pseudo-random values were the same every time.
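Reading more than a thousand printed rows by eye is error-prone, so a small check can do the comparison instead. This sketch collects every result in a set; if all runs agree, the set ends up with exactly one element.

```python
import numpy as np

outputs = set()
for _ in range(1100):
    np.random.seed(100)  # same seed on every iteration
    draw = np.random.randint(low=1, high=10, size=10)
    outputs.add(tuple(draw.tolist()))  # tuples are hashable, arrays are not

print(len(outputs))  # 1: every iteration produced the same ten numbers
```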
Random seed 2d array
The NumPy random functions can also generate a 2D array. With the same seed, the same 2D array with the same random numbers will be generated.
import numpy as np

np.random.seed(24)
np.random.random((3, 3))
In the above example, we created a 2D array of size 3×3. After multiple executions with the same seed, the same array is generated.
How to change random seed?
There are three ways to change the random seed.
- The first method is not to pass any seed value. NumPy then picks a seed value by itself, as described in detail in the section above.
- The second way is to pass the current time as the seed number. Time is always changing, so a new seed number is used on each execution.
- The third way is to generate the seed number randomly using np.random.randint(). See the example below.
import numpy as np

seed_value = np.random.randint(0, 100)
print("seed value: ", seed_value)
np.random.seed(seed_value)
np.random.randint(low=1, high=10, size=10)
Output on two executions:
Every execution generates a new seed value, which in turn generates a different set of pseudo-random numbers.
NumPy random seed shuffle
You can shuffle a sequence of numbers using NumPy random.shuffle(). Without a seed, it shuffles the sequence differently every time we execute the command.
With the same seed value, it shuffles the sequence into the same order every time we execute the command.
import numpy as np

arr = np.arange(10)
print("array: ", arr)

np.random.seed(99)
np.random.shuffle(arr)
print("array 1: ", arr)

np.random.seed(199)
np.random.shuffle(arr)
print("array 2: ", arr)
In the above code, the seed() function is called with a fixed value before each shuffle, so every execution results in the same arrays as shown above.
Without the seed() function, the arrays are shuffled differently on every execution.
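Because shuffle() works in place, a seeded shuffle is only repeatable when it starts from the same input order. The sketch below, using a hypothetical helper `shuffled`, builds a fresh ordered array each time and confirms that the same seed gives the same permutation.

```python
import numpy as np


def shuffled(seed):
    """Shuffle a fresh 0..9 array after seeding the global generator."""
    arr = np.arange(10)     # always start from the same ordered array
    np.random.seed(seed)
    np.random.shuffle(arr)  # shuffles in place and returns None
    return arr


print(np.array_equal(shuffled(99), shuffled(99)))        # True: same order both times
print(sorted(shuffled(99).tolist()) == list(range(10)))  # True: same elements, new order
```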
NumPy random seed vs Python random seed
There are two ways to initialize seed. One is using the Python pseudo-random generator random.seed() like this:
# Python pseudo-random generator at a fixed value
import random

random.seed(101)
for i in range(10):
    print(random.randint(1, 10))
The second method is using NumPy pseudo-random generator np.random.seed() like this:
# NumPy pseudo-random generator at a fixed value
import numpy as np

np.random.seed(101)
np.random.randint(low=1, high=10, size=10)
Both functions work on pseudo-random generator algorithms internally. But the two generators are separate, so with the same seed value they give different outputs.
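The two generators also keep separate state, so seeding one does not seed the other. The following sketch re-seeds NumPy in the middle of a Python random sequence, and the other way round, and checks that neither stream is disturbed. The variable names are illustrative.

```python
import random

import numpy as np

# Python's generator is not affected by seeding NumPy.
random.seed(101)
python_before = [random.randint(1, 10) for _ in range(5)]
random.seed(101)
np.random.seed(999)  # re-seed NumPy in between
python_after = [random.randint(1, 10) for _ in range(5)]
print(python_before == python_after)  # True

# NumPy's generator is not affected by seeding Python's random module.
np.random.seed(101)
numpy_before = np.random.randint(low=1, high=10, size=5)
np.random.seed(101)
random.seed(999)  # re-seed Python's generator in between
numpy_after = np.random.randint(low=1, high=10, size=5)
print(np.array_equal(numpy_before, numpy_after))  # True
```

In practice this means a script that uses both modules has to seed both if it wants fully repeatable results.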
Random number generation is crucial in various fields like probability, statistics, machine learning, and deep learning applications. We have discussed the major uses and scenarios of the random.seed() function.
Practice is the key to a deep understanding of any topic. Keep experimenting with the code snippets shared in this article. The more you practice, the clearer the topic will be.