Almost Random Numbers and Distributions with NumPy

Randomness from computers is elusive. Learn how to generate pseudorandom numbers and distributions with NumPy.

By Ezra Chu

Dec 5, 2019 • 5 Minute Read

Introduction

Random numbers are considered a sort of "Holy Grail" in computing. Randomness, it seems, is rather elusive; contemporary processors have a very difficult time producing truly random numbers. Yet, randomness is critical in a variety of real-world applications. Most modern cryptography methods depend on the generation of truly random numbers (e.g., How a Bunch of Lava Lamps Protect Us From Hackers).

For all of us who don't need the true randomness of natural phenomena, there's a good-enough alternative: pseudorandom numbers (PRNs). That's a fancy way of saying numbers that look random but can be regenerated exactly given a "seed". Let's take a look at how we would generate pseudorandom numbers using NumPy. (Note: You can accomplish many of the tasks described here using Python's standard library, but its random module returns native Python numbers and lists, not NumPy arrays.)

All of the following samples require these lines at the top:

      import numpy.random as random

      # The following is what we call "seeding" the PRNG.
      random.seed(100)
    

When the PRNG is seeded with the same value, 100 in this case, it will always generate the same sequence of random numbers. With this, the values generated on your machine should be the same as the ones listed in this guide, although the exact print formatting may differ between NumPy versions. This is also why general-purpose PRNGs like this one are considered "cryptographically insecure": anybody who has the seed can regenerate the exact sequence of random numbers that you have.
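
As a quick check of the reproducibility described above, here is a minimal sketch that seeds the generator, draws a few values, then re-seeds and draws again. The variable names and the second seed value (101) are arbitrary choices for illustration.

```python
import numpy.random as random

# Seed, draw three values, then re-seed with the same value and draw again.
random.seed(100)
first_run = random.rand(3)

random.seed(100)
second_run = random.rand(3)

# Same seed, same sequence.
print((first_run == second_run).all())

# A different seed (101 is an arbitrary choice) starts a different sequence.
random.seed(101)
other_run = random.rand(3)
print((first_run == other_run).all())
```

The first comparison should print True and the second False, which is the whole point of a seed: it pins down the sequence.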

Generate PRNs

Let's begin by generating a couple of PRNs and logging them to the console.

      print(random.rand(1))
      print(random.rand(1))
    

rand() selects random numbers from a uniform distribution over the half-open interval [0, 1), so 0 is a possible value but 1 is not. Because we are using a seed, no matter where or when this is run, it will always generate the following random numbers:

      [ 0.54340494]
      [ 0.27836939]
    

Notice that the rand(1) calls have an argument. rand() returns an array when given arguments, and the arguments denote the shape of the array. Called with no arguments, it returns a single float instead.

      # We're going to print an array with dimensions 10x2.
      print(random.rand(10, 2))
    

The above will output this two-dimensional array of random numbers selected from a uniform distribution:

      [[ 0.54340494  0.27836939]
       [ 0.42451759  0.84477613]
       [ 0.00471886  0.12156912]
       [ 0.67074908  0.82585276]
       [ 0.13670659  0.57509333]
       [ 0.89132195  0.20920212]
       [ 0.18532822  0.10837689]
       [ 0.21969749  0.97862378]
       [ 0.81168315  0.17194101]
       [ 0.81622475  0.27407375]]
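
To make the link between the arguments and the resulting shape concrete, the sketch below calls rand() with zero, one, and two arguments and inspects what comes back. The variable names are only illustrative.

```python
import numpy.random as random

random.seed(100)

single = random.rand()        # no arguments: a single Python float
vector = random.rand(3)       # one argument: 1-D array of length 3
matrix = random.rand(10, 2)   # two arguments: 10 rows, 2 columns

print(type(single).__name__)
print(vector.shape)
print(matrix.shape)

# Every value falls in the half-open interval [0, 1).
print(bool(((matrix >= 0) & (matrix < 1)).all()))
```

Checking .shape like this is a cheap way to confirm an array has the dimensions you intended before passing it on.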
    

Random integers are generated using randint():

      print(random.randint(0, 100, 10))

This will output the following array. Notice that the random numbers fall between 0 (inclusive) and 100 (exclusive), and the length of the array is 10.

      [ 8 24 67 87 79 48 10 94 52 98]
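
The upper bound passed to randint() is exclusive, which is easy to overlook. The following sketch draws a large sample (10,000 is an arbitrary size chosen so the extremes are likely to appear) and prints the smallest and largest values seen.

```python
import numpy.random as random

random.seed(100)

# Draw many integers so the extremes are very likely to show up.
samples = random.randint(0, 100, 10000)
print(samples.min(), samples.max())  # the maximum can be at most 99

# To make 100 a possible value, raise the exclusive upper bound to 101.
inclusive = random.randint(0, 101, 10000)
print(inclusive.min(), inclusive.max())  # the maximum can be at most 100
```

If you need both endpoints included, adding 1 to the upper bound as shown is one simple way to get there.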

Generate PRNG Distributions

Normal Distributions

To generate an array of Gaussian values, we will use the normal() function.

      mu, sigma = 10, 2 # mean and standard deviation
      print(random.normal(mu, sigma, 10))
    

      [  6.50046905  10.68536081  12.30607161   9.49512793  11.96264157
        11.02843768  10.44235934   7.85991334   9.62100834  10.51000289]
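
Ten values are too few to see the shape of the distribution. As a rough sanity check, the sketch below draws a much larger sample (100,000 is an arbitrary size) and compares its mean and standard deviation with mu and sigma. It also shows that, with the same seed, normal(mu, sigma) should match standard normal draws from randn() that have been scaled by sigma and shifted by mu.

```python
import numpy.random as random

random.seed(100)

mu, sigma = 10, 2  # mean and standard deviation, as above
samples = random.normal(mu, sigma, 100000)

# With many samples, the sample statistics should be close to mu and sigma.
print(samples.mean())
print(samples.std())

# Scaling and shifting standard normal draws gives the same distribution.
random.seed(100)
scaled = mu + sigma * random.randn(5)
random.seed(100)
direct = random.normal(mu, sigma, 5)
print(scaled)
print(direct)
```

The printed mean and standard deviation will not be exactly 10 and 2, but they should land close to those values.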
    

Binomial Distributions

NumPy provides functionality to generate values of various distributions, including binomial, beta, Pareto, Poisson, etc. Let's take a look at how we would generate some random numbers from a binomial distribution.

Let's say we wanted to simulate the result of 10 coin flips.

      n, p = 10, .5 # flips per trial and probability of heads
      print(random.binomial(n, p, 5))
    

This runs 5 different trials of the 10 coin flips and returns the number of times the coin lands on heads (or tails, your call) for each of those trials:

      [5 4 5 7 1]
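
To see what binomial() is counting, here is a sketch that simulates the individual coin flips with randint() and sums the heads in each trial. It illustrates the idea rather than how NumPy implements binomial(), and it will not reproduce the exact numbers above because it consumes the random stream differently.

```python
import numpy.random as random

random.seed(100)

n, trials = 10, 5

# Each row is one trial of n fair coin flips: 1 = heads, 0 = tails.
flips = random.randint(0, 2, (trials, n))

# Counting the heads in each row gives one binomial(n, 0.5) draw per trial.
heads_per_trial = flips.sum(axis=1)
print(flips)
print(heads_per_trial)
```

Each entry of heads_per_trial is a count between 0 and 10, just like the values returned by binomial(n, p, 5).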

Choice Distributions

Sometimes, you want to be able to pick random items from a list. For example, say that you wanted to choose randomly between red (p=0.25), green (p=0.5), and blue (p=0.25). That would look something like this:

      print(random.choice(['red', 'green', 'blue'], 5, p=[0.25, 0.5, 0.25]))

The resulting array looks like this:

      ['green' 'green' 'green' 'blue' 'red']

You can also generate random choices from a range of integers. Passing an integer such as 5 as the first argument chooses from 0 through 4.

      print(random.choice(5, 10, p=[0.2, 0, 0.2, 0.5, 0.1]))

Here, 0 has 20% probability of occurring, 1 has 0%, 2 has 20%, and so forth. This is the result:

      [3 2 3 3 0 2 3 3 2 3]

Using the choice() function, you can create random numbers from arbitrary distributions using frequency data.
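
Frequency data usually arrives as raw counts rather than probabilities. The sketch below assumes a made-up set of counts (the numbers are hypothetical, not from a real dataset), normalizes them so they sum to 1, and passes the result to choice() as p.

```python
import numpy as np
import numpy.random as random

random.seed(100)

# Hypothetical observed counts, made up for illustration.
colors = ['red', 'green', 'blue']
counts = np.array([12, 30, 18])

# Normalize the counts so they sum to 1, as choice() requires for p.
probabilities = counts / counts.sum()

samples = random.choice(colors, 1000, p=probabilities)

# Compare how often each color was drawn with the original frequencies.
values, drawn = np.unique(samples, return_counts=True)
print(dict(zip(values.tolist(), (drawn / drawn.sum()).tolist())))
print(dict(zip(colors, probabilities.tolist())))
```

With 1,000 draws, the observed proportions should sit close to the normalized frequencies, though not exactly on them.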

Conclusion

In this guide, we covered how you would leverage NumPy's random module to generate PRNs and briefly discussed the difference between pseudorandomness and true randomness. If you still have questions about how NumPy PRNG functions work, check out the documentation. It provides readable examples of how to use each function that the random module provides.
