What is a Monte Carlo simulation?

1. What is a Monte Carlo simulation?

Welcome to this course on Monte Carlo simulations!

2. Simulations and Monte Carlo simulations

Simulations are experiments done to imitate reality. Computer programs are often used to create these simulations; we'll use Python. A Monte Carlo simulation is a model used to predict the probability of different outcomes when random variables are involved. Because Monte Carlo simulations rely on repeated random sampling, their numerical results are stochastic. A stochastic model contains randomness, so results will be different each time the model is run, even given the same inputs. In contrast, a deterministic model has no randomness and will arrive at the same results each time it is run with a given set of inputs.
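To make the contrast concrete, here is a minimal sketch of a deterministic model next to a stochastic one. Both function names and the formula inside them are hypothetical, chosen only for illustration; they are not part of the course code.

```python
import random


def deterministic_model(x):
    # Hypothetical model with no randomness: the same input always gives the same output.
    return 2 * x + 1


def stochastic_model(x):
    # Hypothetical model with a random noise term: repeated calls with the
    # same input generally give different outputs (no seed is set here).
    return 2 * x + 1 + random.gauss(0, 1)


print(deterministic_model(3), deterministic_model(3))  # identical values
print(stochastic_model(3), stochastic_model(3))        # values usually differ
```

Calling deterministic_model twice with the same input always prints the same number, while stochastic_model will usually print two different numbers unless a seed is fixed first.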

3. Simulation example

Consider the example of Tom, who rolls a fair six-sided die n times. After each roll, he records the outcome: a number from one to six. Then, Tom puts the rolled die in a bag and selects a new die for the next roll. We are interested in two questions: How many dice will Tom collect after n rolls? And what will be the mean outcome after n rolls? We can use simulations to answer these questions.

4. Simulating Tom's outcomes

We first import the random and NumPy modules. The random module has many useful functions for drawing random numbers, while NumPy is great for numerical calculations. Defining a function called roll_dice, we use random-dot-seed to set a seed for the random number generation so that the program is reproducible. We define total_dice to tally the number of dice in Tom's bag and a point_dice list to keep track of the roll outcomes. We use random-dot-randint to randomly sample integers from one through six, each of which will be recorded in the point_dice list. After each roll, we increase the number of total dice rolled by one. After n rolls, we can calculate the mean_point_dice by using np-dot-mean. Our function returns the total number of dice collected in the bag as well as the mean outcome.
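The function described above might look roughly like the following sketch. The names roll_dice, total_dice, point_dice, and mean_point_dice come from the description; the parameter names n and seed, the use of a for loop, and the choice to return a list are assumptions.

```python
import random

import numpy as np


def roll_dice(n, seed):
    # Parameter names n and seed are assumptions; the description does not give the signature.
    random.seed(seed)  # fix the seed so the run is reproducible

    total_dice = 0     # number of dice in Tom's bag
    point_dice = []    # outcome of each roll

    for _ in range(n):
        # random.randint includes both endpoints, so this samples 1 through 6
        point_dice.append(random.randint(1, 6))
        total_dice += 1  # the rolled die goes into the bag

    mean_point_dice = np.mean(point_dice)
    return [total_dice, mean_point_dice]
```

With this sketch, roll_dice(10, 1231) returns a two-item list: the number of dice collected and the mean of the ten outcomes.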

5. Simulation results

Let's roll the dice! We'll use a random seed of 1231. We roll 10, 100, 1,000, and 10,000 times. The simulation for the total number of dice collected in Tom's bag is deterministic. Each time a die is rolled, it goes in the bag; therefore, the number of dice Tom collects is equal to the number he rolled. We see this in the first item of each returned list: 10, 100, 1,000, and 10,000. On the other hand, the simulation for the mean outcome of all rolls is a Monte Carlo simulation that is stochastic. The theoretical mean outcome of rolling a fair six-sided die is 3-point-5, the average of the numbers one through six. We can see that the mean outcome of our simulations tends to be close to 3-point-5, but is not always exactly 3-point-5. Let's look at another simulation using a different seed. The mean outcomes of rolling the same number of dice are different due to the stochastic nature of Monte Carlo simulations. If we look at the last line of results for both sets of simulations, the average of rolling 10,000 dice is 3-point-503 using one seed but 3-point-5508 with the other.
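As a rough illustration of these two points, the sketch below computes the theoretical mean and compares the simulated mean under two seeds. The helper name mean_of_rolls is hypothetical, and the second seed, 3124, is an arbitrary choice for this example, since the transcript does not say which second seed was used. No printed values are shown here because they depend on the Python version's random number generator.

```python
import random

import numpy as np


def mean_of_rolls(n, seed):
    # Hypothetical helper: mean outcome of n rolls of a fair six-sided die.
    random.seed(seed)
    return float(np.mean([random.randint(1, 6) for _ in range(n)]))


# Theoretical mean: the average of the six equally likely faces, which is 3.5.
theoretical_mean = float(np.mean(np.arange(1, 7)))
print("theoretical mean:", theoretical_mean)

# Seed 1231 is from the transcript; 3124 is an arbitrary second seed.
for seed in (1231, 3124):
    for n in (10, 100, 1_000, 10_000):
        print(f"seed={seed}, n={n}: mean={mean_of_rolls(n, seed):.4f}")
```

Running the same seed twice reproduces the same means, while switching seeds generally changes them.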

6. The Law of Large Numbers

If Monte Carlo simulations are stochastic, does that mean the simulation results will be all over the place and therefore not trustworthy? Fortunately, that's not the case with large numbers of simulations because of the Law of Large Numbers. As the number of independent, identically distributed random variables increases, their sample mean approaches the theoretical mean. In this third set of simulations, we roll dice 100,000, 500,000, and one million times. The results are all very close to the theoretical average of 3-point-5, meaning these are reliable estimates. Additionally, as the number of rolls increases, the returned mean outcome tends to get closer to 3-point-5.
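One way to see this behavior is with the sketch below. It is not the course's loop-based function: it swaps in NumPy's vectorized random generator so that a million rolls run quickly, which also means its numbers will not match the results quoted in the transcript. The name vectorized_mean_of_rolls is hypothetical.

```python
import numpy as np


def vectorized_mean_of_rolls(n, seed):
    # Hypothetical helper using NumPy's Generator instead of the random module.
    rng = np.random.default_rng(seed)
    rolls = rng.integers(1, 7, size=n)  # upper bound is exclusive, so values are 1 through 6
    return float(rolls.mean())


theoretical_mean = 3.5

for n in (100, 10_000, 100_000, 500_000, 1_000_000):
    estimate = vectorized_mean_of_rolls(n, seed=1231)
    error = abs(estimate - theoretical_mean)
    print(f"{n:>9,} rolls: mean = {estimate:.4f}, distance from 3.5 = {error:.4f}")
```

The distance from 3-point-5 does not have to shrink at every single step, but it tends to shrink as the number of rolls grows, which is what the Law of Large Numbers describes.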

7. Let's practice!

Let's practice!