A Monte Carlo Simulation (MCS) is a sampling experiment whose aim is to estimate the distribution of a quantity of interest that depends on one or more stochastic input variables.

The development of an MCS involves three basic steps:

1. Set up a predictive model clearly identifying the dependent variable (outcome, unknown quantity of interest, function to be evaluated) and the independent variables (random input variables).

2. Identify the probability distributions of the input variables. The standard procedure involves analyzing empirical or historical data and fitting the data to a particular distribution. When such data is not available, select an appropriate probability distribution and choose wisely its parameters.

3. Repeat the simulation a significant number of times obtaining random values of the output or dependent variable. Use statistical methods to calculate some descriptive statistical measures and draw charts such as histograms or density plots of the outcome variable.
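As a minimal, self-contained sketch of these three steps, consider estimating the distribution of a total cost that is the sum of two uncertain inputs. The model, the input distributions, and their parameters below are illustrative assumptions, not taken from the case study that follows:

```python
import numpy as np

rng = np.random.default_rng(42)

# Step 1: the predictive model -- total cost is the sum of two
# independent random inputs (an assumed toy model)
def total_cost(material, labor):
    return material + labor

# Step 2: assumed input distributions (parameters chosen for illustration)
n = 10_000
material = rng.normal(loc=100.0, scale=10.0, size=n)   # normally distributed material cost
labor = rng.uniform(low=40.0, high=60.0, size=n)       # uniformly distributed labor cost

# Step 3: replicate many times and summarize the outcome
outcome = total_cost(material, labor)
print(round(outcome.mean(), 1))   # close to 100 + 50 = 150
print(round(outcome.std(), 1))
```

From the `outcome` array one would then draw a histogram or density plot, exactly as the case study below does for its cost variable.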

As indicated in a previous article, we can use a Monte Carlo simulation to quantitatively account for risk in business or operations management tasks.

Machine Breakdown & Repair

A practical application of a Monte Carlo Simulation arises in manufacturing operations, where the time between machine failures and the corresponding repair times are random variables.

Assume a big factory where a processing unit runs 24 hours per day, 335 days per year (30 days are reserved for personnel vacations and yearly maintenance). Management wants to buy another processing unit to manufacture a new final product. They know that such processing units incur random mechanical failures that require a highly skilled repair team, so every breakdown represents a significant cost.

The management decided to conduct a Monte Carlo Simulation to numerically assess the risk associated with the new project.

Historical data of the present processing unit shows that the time between machine failures has the following empirical distribution:

Hours between failures (h):   200    400    600    800    1000   1200   1400
Probability:                  0.10   0.20   0.10   0.20   0.30   0.05   0.05

The time required to repair such a processing unit has the following empirical distribution:

Time to repair (h):   24     48     72     96     120
Probability:          0.10   0.20   0.25   0.25   0.20
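Before plugging empirical distributions like these into a simulation, it is worth checking that each set of probabilities sums to one and that sampling reproduces the weighted means. A small sketch of that sanity check (the variable names here are mine, not those of the article's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Empirical distributions from the historical data above
hbf_vals = [200, 400, 600, 800, 1000, 1200, 1400]
hbf_p = [0.1, 0.2, 0.1, 0.2, 0.3, 0.05, 0.05]
ttr_vals = [24, 48, 72, 96, 120]
ttr_p = [0.10, 0.20, 0.25, 0.25, 0.20]

# A valid discrete distribution must have probabilities summing to 1
for p in (hbf_p, ttr_p):
    assert abs(sum(p) - 1.0) < 1e-9

# Weighted means: expected hours between failures and expected repair hours
mean_hbf = float(np.dot(hbf_vals, hbf_p))   # 750.0 h between failures
mean_ttr = float(np.dot(ttr_vals, ttr_p))   # 78.0 h per repair

# A large sample drawn with np.random.choice should match those means
sample = rng.choice(hbf_vals, size=100_000, p=hbf_p)
print(mean_hbf, mean_ttr, round(sample.mean(), 1))
```

These expected values (roughly 750 hours of running per failure, 78 hours per repair) also give a useful back-of-envelope check on the simulation results later on.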

Python code for our Monte Carlo Simulation

The following Python code allows us to develop an MCS to numerically assess the risk involved in the new investment.

First, we imported several Python libraries:

NumPy: a library for the Python programming language that provides support for creating large multidimensional vectors and matrices, along with a large collection of high-level mathematical functions to operate on them.

SciPy: a free and open-source library for Python. Based on NumPy, it contains modules for statistics, linear algebra, optimization, interpolation, signal processing, and other methods commonly used in science and engineering.

Matplotlib and Seaborn are traditional Python libraries for charting.

PrettyTable is a Python library for easily displaying tabular data in a visually appealing ASCII table format.

# Import Modules
import numpy as np
from scipy import stats
from scipy.stats import sem
import matplotlib.pyplot as plt
import seaborn as sns
from prettytable import PrettyTable

your_path = 'your path'

Now we indicate the operating hours per year, the hourly cost incurred by the repair team, the empirical distribution of the hours of operation between failures, and the empirical distribution of the repair times.

# initialization module

Operating_hours_year = 335 * 24
Cost_LostTime_hour = 55.50
# .........................................
# historical data of hours of operation between failures (HBF)
Hours_between_Failures = [200, 400, 600, 800, 1000, 1200, 1400]
# discrete probabilities for hours of operation between failures
probabilities_HBF = [0.1, 0.2, 0.1, 0.2, 0.3, 0.05, 0.05]
# historical data of hours needed to repair (TTR)
Time_to_Repair = [24, 48, 72, 96, 120]
probabilities_TTR = [0.10, 0.20, 0.25, 0.25, 0.20]

We set a large number of replications so that the sample mean will be close to the expected value. We also defined the confidence level for our Confidence Interval (CI).

Number_of_Replications = 2000
confidence = 0.95                      ## selected by the analyst
# Lists to be filled during calculations
list_of_HBF,    list_of_TTR = [], []
list_Lost_Time, list_cost_TTR = [], []
list_Hours_Prod = []

The logic behind the MCS is described in the following lines of code. There are two loops: the outer loop (for run in range(Number_of_Replications)) controls the number of runs or replications; the inner loop (while acum_time <= Operating_hours_year) simulates the yearly production and breakdowns. We used np.random.choice twice to generate a random sample of size one from each empirical distribution according to its corresponding probabilities.

for run in range(Number_of_Replications):
    acum_time, acum_hbf, acum_ttr = 0, 0, 0
    while acum_time <= Operating_hours_year:
        # draw one value from each empirical distribution
        HBF = np.random.choice(Hours_between_Failures, p=probabilities_HBF)
        TTR = np.random.choice(Time_to_Repair, p=probabilities_TTR)
        acum_time += (HBF + TTR)
        acum_hbf  += HBF
        acum_ttr  += TTR

    Total_HBF = round(float(acum_hbf), 4)
    Total_TTR = round(float(acum_ttr), 4)
    Perc_lost_Time = round(float(Total_TTR / Total_HBF * 100), 2)
    Hours_Prod = Operating_hours_year - Total_TTR
    Cost_TTR   = Total_TTR * Cost_LostTime_hour

    list_of_HBF.append(Total_HBF)
    list_of_TTR.append(Total_TTR)
    list_Lost_Time.append(Perc_lost_Time)
    list_cost_TTR.append(Cost_TTR)
    list_Hours_Prod.append(Hours_Prod)

We used NumPy and SciPy to calculate key descriptive statistical measures, particularly a percentile table for our principal outcome, Cost_TTR = Total_TTR * Cost_LostTime_hour. This output variable represents the cost associated with machine failures and the corresponding repair times; the variable media holds its expected value.

media  = round(np.mean(list_cost_TTR), 2)
median = round(np.median(list_cost_TTR), 2)
var    = round(np.var(list_cost_TTR), 2)
stand  = round(np.std(list_cost_TTR), 2)
std_error = round(sem(list_cost_TTR), 2)
skew = round(stats.skew(list_cost_TTR), 2)
kurt = round(stats.kurtosis(list_cost_TTR), 2)
minimum = round(np.min(list_cost_TTR), 2)
maximum = round(np.max(list_cost_TTR), 2)
dof = Number_of_Replications - 1
t_crit = np.abs(stats.t.ppf((1 - confidence) / 2, dof))
half_width = round(stand * t_crit / np.sqrt(Number_of_Replications), 2)
inf = round(media - half_width, 2)
sup = round(media + half_width, 2)

We used PrettyTable for the statistic report and the percentile report:

t = PrettyTable(['Statistic', 'Value'])
t.add_row(['Trials', Number_of_Replications])
t.add_row(['Mean', media])
t.add_row(['Median', median])
t.add_row(['Variance', var])
t.add_row(['Stand Dev', stand])
t.add_row(['Skewness', skew])
t.add_row(['Kurtosis', kurt])
t.add_row(['Half Width', half_width])
t.add_row(['CI inf', inf])
t.add_row(['CI sup', sup])
t.add_row(['Minimum', minimum])
t.add_row(['Maximum', maximum])
print(t)
percents = np.percentile(list_cost_TTR, [10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
p = PrettyTable(['Percentile (%)', 'Lost Time Cost'])
p.add_row(['  10  ', percents[0]])
p.add_row(['  20  ', percents[1]])
p.add_row(['  30  ', percents[2]])
p.add_row(['  40  ', percents[3]])
p.add_row(['  50  ', percents[4]])
p.add_row(['  60  ', percents[5]])
p.add_row(['  70  ', percents[6]])
p.add_row(['  80  ', percents[7]])
p.add_row(['  90  ', percents[8]])
p.add_row([' 100  ', percents[9]])
print(p)

Finally, we used Matplotlib and Seaborn for charting: a histogram that shows the underlying frequency distribution of the expected lost time cost per year and a density plot that shows the probability density function of the percent lost time due to breakdowns and repairs.

# Lost Time Cost Histogram
n_bins = 20
fig, ax = plt.subplots(figsize=(8, 6))
ax.hist(list_cost_TTR, histtype='bar', bins=n_bins, color='c',
        edgecolor='k', alpha=0.65, density=False)  # density=False shows counts
ax.axvline(media,  color='g', linestyle='dashed', linewidth=3)
ax.axvline(median, color='r', linestyle='dashed', linewidth=3)
 
ax.set_title('Frequency Chart')
ax.set_ylabel('Counts')
ax.set_xlabel('U$S')
ax.grid(axis = 'y')
plt.savefig(your_path +'Expected Lost Time Cost per Year',
            bbox_inches='tight', dpi=150)
plt.show()
# Percent Lost Time Density Plot
# sns.distplot was deprecated in seaborn 0.11 and later removed;
# kdeplot with fill=True draws the same shaded density curve
sns.kdeplot(list_Lost_Time, fill=True, linewidth=3, color='lightblue')
plt.xlabel('Percent Lost Time (%)')
plt.savefig(your_path + 'Percent Lost Time' , 
            bbox_inches='tight', dpi=150)          
plt.show()

Output Analysis

We made 2000 replications, appending the results of each run to several lists. Then we calculated key descriptive statistical measures for the cost associated with machine failures and the corresponding repair times. The following table shows the Statistics Report:

Table 1, made by the author with PrettyTable.

Table 1 shows little difference between the mean and the median, and the half-width of the confidence interval is almost negligible. The outcome distribution is positively skewed with a negligible degree of kurtosis.

Table 2 shows the Percentile Report. We can see that the chance that the lost time cost will be greater than U$S 50,000 is more than 20%.

Table 2, made by the author with PrettyTable.
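A tail probability like this can also be read directly off the replication results rather than interpolated from the percentile table. Here is a compact, self-contained sketch of the same simulation (condensed into a function of my own naming, with a fixed seed for reproducibility) that estimates the probability of exceeding U$S 50,000:

```python
import numpy as np

rng = np.random.default_rng(7)

# Empirical inputs and cost rate from the article
hbf_vals, hbf_p = [200, 400, 600, 800, 1000, 1200, 1400], [0.1, 0.2, 0.1, 0.2, 0.3, 0.05, 0.05]
ttr_vals, ttr_p = [24, 48, 72, 96, 120], [0.10, 0.20, 0.25, 0.25, 0.20]
hours_year, cost_per_hour = 335 * 24, 55.50

def yearly_cost():
    # accumulate failure/repair cycles until the operating year is filled
    t, downtime = 0.0, 0.0
    while t <= hours_year:
        hbf = rng.choice(hbf_vals, p=hbf_p)
        ttr = rng.choice(ttr_vals, p=ttr_p)
        t += hbf + ttr
        downtime += ttr
    return downtime * cost_per_hour

costs = np.array([yearly_cost() for _ in range(2000)])

# empirical tail probability: fraction of replications above U$S 50,000
p_tail = float(np.mean(costs > 50_000))
print(round(p_tail, 3))
```

The fraction of replications above the threshold is an unbiased estimate of the exceedance probability, and it avoids any interpolation between percentile rows.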

Figure 1 shows the distribution of our outcome (Lost Time Cost) as a histogram with 20 bins. The mean and the median are indicated with green and red vertical lines respectively. As usual, the histogram provides a visual representation of the distribution of our quantity of interest: its location, spread, skewness, and kurtosis. We claim that the expected value (at a 95% confidence level) is around U$S 44,000.

Fig. 1, made by the author with Matplotlib.

Figure 2 displays a density plot of the output variable Percent Lost Time. Remember that the key idea in density plots is to eliminate the jaggedness that characterizes histograms, allowing better visualization of the shape of the distribution.

Clearly, the distribution is unimodal. We claim that, on average, 10% of the operating hours will be lost due to machine breakdowns.
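That figure of roughly 10% can be cross-checked analytically: the simulation defines percent lost time as total repair hours over total running hours, whose long-run value is E[TTR] / E[HBF]. A quick check using the empirical distributions from the initialization module (variable names here are mine):

```python
import numpy as np

# Empirical distributions from the article's initialization module
hbf_vals, hbf_p = [200, 400, 600, 800, 1000, 1200, 1400], [0.1, 0.2, 0.1, 0.2, 0.3, 0.05, 0.05]
ttr_vals, ttr_p = [24, 48, 72, 96, 120], [0.10, 0.20, 0.25, 0.25, 0.20]

mean_hbf = float(np.dot(hbf_vals, hbf_p))   # expected running hours between failures
mean_ttr = float(np.dot(ttr_vals, ttr_p))   # expected hours per repair

# long-run percent lost time: repair hours per running hour
pct_lost = mean_ttr / mean_hbf * 100
print(round(pct_lost, 1))  # → 10.4
```

The analytical value of about 10.4% agrees with the mode of the simulated density plot, which is a reassuring consistency check on the simulation logic.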

Fig. 2, made by the author with Seaborn.

A Monte Carlo Simulation is a computational technique used by decision-makers and project managers to quantitatively assess the impact of risk. An essential step of any simulation study is an appropriate statistical analysis of the output data. Statistical reports and distribution charts (histograms and density plots) are mandatory for assessing the level of exposure to risk.
