Ezra Chu

Almost Random Numbers and Distributions with NumPy

  • Dec 5, 2019
  • 5 Min read
  • 9,228 Views
Python
NumPy

Introduction

Random numbers are considered a sort of "Holy Grail" in computing. Randomness, it seems, is rather elusive; computers are deterministic machines and have a very difficult time producing truly random numbers on their own. Yet randomness is critical in a variety of real-world applications. Cryptography in particular depends on random numbers that cannot be predicted (e.g., How a Bunch of Lava Lamps Protect Us From Hackers).

For those of us who don't need the true randomness of natural phenomena, there's a good-enough alternative: pseudorandom numbers (PRNs), produced by a pseudorandom number generator (PRNG). That's a fancy way of saying numbers that look random but can be regenerated exactly given the same "seed". Let's take a look at how we would generate pseudorandom numbers using NumPy. (Note: You can accomplish many of the tasks described here using Python's standard library, but it produces native Python numbers and lists, not NumPy arrays, which are better suited to numerical work.)
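As a rough illustration of that difference, the sketch below draws five values with each library. The aliases `py_random` and `np_random` are names chosen here only to keep the two modules apart; they are not part of this guide's setup.

```python
import random as py_random        # Python's standard library (alias chosen for this sketch)
import numpy.random as np_random  # NumPy's random module (alias chosen for this sketch)

py_random.seed(100)
np_random.seed(100)

# The standard library returns one native float per call,
# so a list has to be built by hand.
py_values = [py_random.random() for _ in range(5)]

# NumPy fills a whole array in a single call.
np_values = np_random.rand(5)

print(type(py_values), type(py_values[0]))
print(type(np_values), np_values.dtype)
```

The two generators are seeded separately, so the same seed does not give the same numbers in both libraries.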

All of the following samples require these lines at the top:

```python
import numpy.random as random

# The following is what we call "seeding" the PRNG.
random.seed(100)
```

When the PRNG is seeded with the same value, 100 in this case, it will always generate the same sequence of random numbers, so the values generated on your machine should match the ones listed in this guide. This reproducibility is also why a general-purpose PRNG like this one is considered "cryptographically insecure": anybody who has the seed can regenerate the exact sequence of random numbers that you have.
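A minimal sketch of that behavior: seeding twice with the same value replays the same sequence, while a different seed (101 is an arbitrary choice for this example) produces another one.

```python
import numpy.random as random

random.seed(100)
first_run = random.rand(3)

# Re-seeding with the same value resets the generator to the same state.
random.seed(100)
second_run = random.rand(3)

# A different seed starts a different sequence.
random.seed(101)
other_run = random.rand(3)

print((first_run == second_run).all())  # True: identical sequences
print((first_run == other_run).all())   # False: the sequences differ
```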

Generate PRNs

Let's begin by generating a couple of PRNs and logging them to the console.

```python
print(random.rand(1))
print(random.rand(1))
```

rand() selects random numbers from a uniform distribution over the half-open interval [0, 1): 0 is a possible value, 1 is not. Because we are using a seed, no matter where or when this is run, it will always generate the following random numbers:

```
[ 0.54340494]
[ 0.27836939]
```

Notice that the rand(1) calls have an argument. rand() returns an array when given arguments, and the arguments denote the shape of the array. Called with no arguments, it returns a single float instead.

```python
# We're going to print an array with dimensions 10x2.
print(random.rand(10, 2))
```

The above will output this two-dimensional array of random numbers selected from a uniform distribution:

```
[[ 0.54340494  0.27836939]
 [ 0.42451759  0.84477613]
 [ 0.00471886  0.12156912]
 [ 0.67074908  0.82585276]
 [ 0.13670659  0.57509333]
 [ 0.89132195  0.20920212]
 [ 0.18532822  0.10837689]
 [ 0.21969749  0.97862378]
 [ 0.81168315  0.17194101]
 [ 0.81622475  0.27407375]]
```
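To make the link between the arguments of rand() and the shape of the result explicit, here is a small sketch that inspects the return value for zero, one, and two arguments. The variable names are chosen for illustration only.

```python
import numpy.random as random

random.seed(100)

single = random.rand()       # no arguments: a plain Python float
vector = random.rand(3)      # one argument: 1-D array of length 3
matrix = random.rand(10, 2)  # two arguments: 10 rows, 2 columns

print(type(single))  # <class 'float'>
print(vector.shape)  # (3,)
print(matrix.shape)  # (10, 2)
```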

Random integers are generated using randint():

```python
print(random.randint(0, 100, 10))
```

This will output the following array. Notice that the random numbers are between 0 (inclusive) and 100 (exclusive), and the length of the array is 10.

```
[ 8 24 67 87 79 48 10 94 52 98]
```
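The upper bound of randint() is exclusive, which is easy to overlook. The following sketch draws a large sample (the size of 100,000 is an arbitrary choice) and prints the smallest and largest values seen, first with the bounds used above and then with the upper bound raised by one so that 100 becomes a possible result.

```python
import numpy.random as random

random.seed(100)

# randint(low, high, size) draws from low (inclusive) up to high (exclusive).
samples = random.randint(0, 100, 100000)
print(samples.min(), samples.max())

# To allow 100 as a result, raise the upper bound by one.
inclusive = random.randint(0, 101, 100000)
print(inclusive.min(), inclusive.max())
```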

Generate PRNs from Distributions

Normal Distributions

To generate an array of Gaussian values, we will use the normal() function.

```python
mu, sigma = 10, 2  # mean and standard deviation
print(random.normal(mu, sigma, 10))
```

This outputs the following array:

```
[  6.50046905  10.68536081  12.30607161   9.49512793  11.96264157
  11.02843768  10.44235934   7.85991334   9.62100834  10.51000289]
```
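Ten values are too few to see the shape of the distribution. As a quick sanity check, the sketch below draws a much larger sample (100,000 is an arbitrary size) and prints its mean and standard deviation, which should land close to the requested mu and sigma.

```python
import numpy.random as random

random.seed(100)

mu, sigma = 10, 2  # mean and standard deviation
samples = random.normal(mu, sigma, 100000)

# With this many samples, both statistics should be close to mu and sigma.
print(samples.mean())
print(samples.std())
```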

Binomial Distributions

NumPy provides functionality to generate values of various distributions, including binomial, beta, Pareto, Poisson, etc. Let's take a look at how we would generate some random numbers from a binomial distribution.

Let's say we wanted to simulate the result of 10 coin flips.

```python
n, p = 10, .5  # number of flips per trial, probability of heads
s = random.binomial(n, p, 5)
print(s)
```

This runs 5 different trials of the 10 coin flips and returns the number of times the coin lands on heads (or tails, your call) for each of those trials:

```
[5 4 5 7 1]
```
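Each value is a count between 0 and n. Over many trials the average count should approach n * p, which is 5 heads here. A short sketch, with the number of trials (100,000) chosen arbitrarily:

```python
import numpy.random as random

random.seed(100)

n, p = 10, 0.5  # 10 flips per trial, fair coin
heads = random.binomial(n, p, 100000)

# The mean number of heads per trial should be close to n * p = 5.
print(heads.mean())

# Every count lies between 0 and n.
print(heads.min(), heads.max())
```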

Choice Distributions

Sometimes, you want to be able to pick random items from a list. For example, say that you wanted to choose randomly between red (p=0.25), green (p=0.5), and blue (p=0.25). That would look something like this:

```python
print(random.choice(['red', 'green', 'blue'], 5, p=[0.25, 0.5, 0.25]))
```

The resulting array looks like this:

```
['green' 'green' 'green' 'blue' 'red']
```

You can also generate random choices from a range of integers.

```python
print(random.choice(5, 10, p=[0.2, 0, 0.2, 0.5, 0.1]))
```

Here, 0 has a 20% probability of occurring, 1 has 0%, 2 has 20%, and so forth. This is the result:

```
[3 2 3 3 0 2 3 3 2 3]
```

Using the choice() function, you can create random numbers from arbitrary distributions using frequency data.
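One way this might look in practice: choice() expects probabilities that sum to 1, so raw frequency counts need to be normalized first. The counts below are made up for this sketch, and the extra `import numpy as np` is an addition to this guide's setup.

```python
import numpy as np
import numpy.random as random

random.seed(100)

# Hypothetical observed counts, invented for this example.
colors = ['red', 'green', 'blue']
counts = np.array([30, 50, 20])

# Normalize the counts so they sum to 1, as choice() requires.
probabilities = counts / counts.sum()

samples = random.choice(colors, 10000, p=probabilities)

# Compare the sampled frequencies with the input probabilities.
values, sampled_counts = np.unique(samples, return_counts=True)
for value, count in zip(values, sampled_counts):
    print(value, count / len(samples))
```

The printed frequencies should sit near 0.3, 0.5, and 0.2 for red, green, and blue respectively, though not exactly on them.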

Conclusion

In this guide, we covered how to use NumPy's random module to generate PRNs and briefly discussed the difference between pseudo-randomness and true randomness. If you still have questions about how NumPy's PRNG functions work, check out the documentation. It provides readable examples of how to use each function that the random module provides.