Monte Carlo Simulation using PowerShell

After reading the quite interesting book Fooled by Randomness by Nassim Nicholas Taleb, and the also interesting book Randomness by Deborah J. Bennett that it mentions, I wanted to try out some Monte Carlo Simulation. The latter book also mentions the 'Three doors problem', which I had come across a few times before. On one of those occasions I created a small C# WinForms application that ran multiple simulations of the situation where the show host knows which door to open, and then reported the percentage of wins when switching, for a number of runs that I could enter. The allocation of the prize behind the doors and the initial selection were both done at random. That simulation gave the answer I was already familiar with (switching is beneficial in this situation). Remembering this, I realized that I had already done a simple Monte Carlo Simulation. Unfortunately that small piece of code was no longer available, so I created a new version in PowerShell.

I used the following PowerShell to check the 'Three doors problem':

# limit the runs
$numRounds = 10000

function GetDoorNumber () {
    return Get-Random -InputObject @(1,2,3)
}

function GetDifferentDoorNumber ($carDoor, $choiceDoor) {
    $options = @()
    1..3 | foreach {
        if ($_ -ne $carDoor -and $_ -ne $choiceDoor) {
            $options += $_
        }
    }
    return Get-Random -InputObject $options
}

function CheckThreeDoorProblem ($showHostKnows = $true) {
    $noSwitchWins = 0
    $switchWins = 0

    1..$numRounds | foreach {
        $carDoor = GetDoorNumber
        $choiceDoor = GetDoorNumber
        if ($showHostKnows) {
            # the show host knows where the car is: he opens neither the car door nor the chosen door
            $openDoor = GetDifferentDoorNumber $carDoor $choiceDoor
        }
        else {
            # the show host does not know where the car is: only the chosen door is excluded
            # (the second parameter stays $null, which never matches a door number)
            $openDoor = GetDifferentDoorNumber $choiceDoor
        }
        $switchDoor = GetDifferentDoorNumber $choiceDoor $openDoor

        # Uncomment to check logic
        #Write-Host "Car is in door $carDoor, Choice door is $choiceDoor, Opened door is $openDoor, Switched door is $switchDoor"

        if ($choiceDoor -eq $carDoor) {
            $noSwitchWins++
        }
        if ($switchDoor -eq $carDoor) {
            $switchWins++
        }
        if ($openDoor -eq $carDoor) {
            # ignore as we are not interested in this, as this isn't our situation
        }

        # update status
        $percentage = $_ / $numRounds * 100
        Write-Progress -Activity "Monte Carlo Simulation in progress" -Status "Complete:" -PercentComplete $percentage
    }

    Write-Host "Situation show host $(if($showHostKnows) { "knows what door the car is behind (never opens a door with the car behind it)" } else { "does not know what door the car is behind" }) and $numRounds rounds of play:"
    Write-Host "Not switching wins $noSwitchWins out of $numRounds, this is $($noSwitchWins / $numRounds * 100)% of the time"
    Write-Host "Switching wins $switchWins out of $numRounds, this is $($switchWins / $numRounds * 100)% of the time"
    if (-not $showHostKnows) {
        $roundsWhereShowHostDoesNotSelectTheCar = $noSwitchWins + $switchWins
        $selectsCar = $numRounds - $roundsWhereShowHostDoesNotSelectTheCar
        Write-Host "Show host selects the car $selectsCar out of $numRounds, this is $($selectsCar / $numRounds * 100)% of the time"
        Write-Host ""
        Write-Host "Percentage based on the situation that no car was selected by the show host."
        Write-Host "Not switching wins $noSwitchWins out of $roundsWhereShowHostDoesNotSelectTheCar, this is $($noSwitchWins / $roundsWhereShowHostDoesNotSelectTheCar * 100)% of the time"
        Write-Host "Switching wins $switchWins out of $roundsWhereShowHostDoesNotSelectTheCar, this is $($switchWins / $roundsWhereShowHostDoesNotSelectTheCar * 100)% of the time"
    }
    Write-Host ""
}

Write-Host ""

CheckThreeDoorProblem $true
CheckThreeDoorProblem $false

Write-Host "Note: These are all approximations of the probabilities of winning/losing."
Write-Host "The higher the number of trial runs, the better the approximations."
Write-Host "The validity of the approximations depends on a solid implementation of the rules that are used."

In this PowerShell script I also added an extended situation: the one where the show host doesn't know the location of the car and could therefore open the door with the car behind it. This small detail changes the probabilities. Neither situation is hard to calculate by hand, but when things get a bit more complex, a Monte Carlo Simulation can be much easier than doing the calculations, although the Monte Carlo method only gives an approximation. The more simulation runs, the more accurate the output tends to be.

Here is a sample output of this script:

Situation show host knows what door the car is behind (never opens a door with the car behind it) and 10000 rounds of play:
Not switching wins 3369 out of 10000, this is 33.69% of the time
Switching wins 6631 out of 10000, this is 66.31% of the time

Situation show host does not know what door the car is behind and 10000 rounds of play:
Not switching wins 3234 out of 10000, this is 32.34% of the time
Switching wins 3369 out of 10000, this is 33.69% of the time
Show host selects the car 3397 out of 10000, this is 33.97% of the time

Percentage based on the situation that no car was selected by the show host.
Not switching wins 3234 out of 6603, this is 48.9777373920945% of the time
Switching wins 3369 out of 6603, this is 51.0222626079055% of the time

Note: These are all approximations of the probabilities of winning/losing.
The higher the number of trial runs, the better the approximations.
The validity of the approximations depends on a solid implementation of the rules that are used.

As these are approximations, and the math is easy, we should be able to see that the first situation is a 1/3 vs 2/3 chance, the second is a 1/3 chance for each outcome (not switching wins, switching wins, the show host reveals the car), and the third is a 1/2 chance for each of the two options left. Note that due to randomness, the percentages in these simulations can differ somewhat from these values.
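These fractions can also be checked by counting instead of sampling. The following is a minimal sketch and not part of the script above; all variable names are made up for this illustration. It relies on the assumption that every combination of car door and chosen door is equally likely, and, for the show host who does not know where the car is, that he opens either of the two remaining doors with equal chance.

```powershell
# Exact check of the 'Three doors problem' by counting all equally likely cases.
# Illustration only; the variable names are hypothetical and not taken from the script above.

# Show host knows: he always opens a door without the car,
# so switching wins exactly when the first choice was wrong.
$knowsStayWins = 0
$knowsSwitchWins = 0
foreach ($carDoor in 1..3) {
    foreach ($choiceDoor in 1..3) {
        if ($choiceDoor -eq $carDoor) { $knowsStayWins++ } else { $knowsSwitchWins++ }
    }
}

# Show host does not know: he opens one of the two doors that were not chosen.
$stayWins = 0
$switchWins = 0
$hostRevealsCar = 0
foreach ($carDoor in 1..3) {
    foreach ($choiceDoor in 1..3) {
        foreach ($openDoor in (1..3 | Where-Object { $_ -ne $choiceDoor })) {
            if ($openDoor -eq $carDoor) { $hostRevealsCar++ }
            elseif ($choiceDoor -eq $carDoor) { $stayWins++ }
            else { $switchWins++ }
        }
    }
}
$carNotRevealed = $stayWins + $switchWins

Write-Host "Host knows: not switching wins $knowsStayWins of 9, switching wins $knowsSwitchWins of 9"
Write-Host "Host does not know: not switching $stayWins, switching $switchWins, car revealed $hostRevealsCar (of 18)"
Write-Host "Car not revealed: not switching $stayWins of $carNotRevealed, switching $switchWins of $carNotRevealed"
```

The counts come out as 3 and 6 out of 9 for the show host who knows, 6, 6 and 6 out of 18 for the show host who does not know, and 6 out of 12 each once the rounds where the car was revealed are left out.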

Actually, I created the 'Three doors problem' PowerShell script after another Monte Carlo Simulation, one that checks the probabilities of dice throws in the game Risk. This is a more complex situation, and the math seemed much harder; I didn't think I could do it (I will be looking into this area a bit more, as it is quite interesting). I do play Risk sometimes, so I thought it would be nice to get an approximation of the probabilities when throwing the dice.

This is the PowerShell I created to check dice throwing in a game of Risk:

# limit the runs
$numRounds = 10000

function ThrowDice () {
    return Get-Random -InputObject @(1,2,3,4,5,6)
}

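# returns 1 attacker wins, 2 defender wins, 0 draw (each side wins one comparison)
function CheckWinings($diceAttacker, $diceDefender) {
    # Sort-Object rather than the 'sort' alias, which is not defined in PowerShell on Linux/macOS;
    # @() keeps the result an array, also when only one die was thrown
    $diceAttacker = @($diceAttacker | Sort-Object -Descending)
    $diceDefender = @($diceDefender | Sort-Object -Descending)

    # compare only as many pairs as the side with the fewest dice has
    $checkLength = [math]::Min($diceAttacker.Length, $diceDefender.Length)

    $winningOutcomes = @()

    for ($i = 0; $i -lt $checkLength; $i++) {
        $attackerDice = $diceAttacker[$i]
        $defenderDice = $diceDefender[$i]
        # highest die against highest die; a tie counts as a win for the defender
        $winningOutcomes += ($attackerDice -gt $defenderDice)
    }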


    $wins = ($winningOutcomes | where { $_ -eq $true }).Count
    $loses = ($winningOutcomes | where { $_ -eq $false }).Count


    if ($wins -gt $loses) {
        return 1
    }
    if ($loses -gt $wins) {
        return 2
    }
    return 0
}

function CheckRisk($numAttackDice, $numDefenseDice) {
    $winAttack = 0
    $winDefense = 0
    $draws = 0

    1..$numRounds | foreach {
        # initialize dice variables
        $diceAttack = @()
        $diceDefense = @()

        # throw number of dice supplied
        1..$numAttackDice | foreach { $diceAttack += ThrowDice }
        1..$numDefenseDice | foreach { $diceDefense += ThrowDice }

        # check winnings
        $winnings = CheckWinings $diceAttack $diceDefense
        if ($winnings -eq 1) {
            $winAttack++
        }
        elseif ($winnings -eq 2) {
            $winDefense++
        } else {
            $draws++
        }

        # Uncomment to check and verify logic
        #Write-Host "$diceAttack => $diceDefense outcome => $winnings"

        # update status
        $percentage = $_ / $numRounds * 100
        Write-Progress -Activity "Monte Carlo Simulation in progress" -Status "Complete:" -PercentComplete $percentage
    }

    Write-Host "Situation $numAttackDice attack dice vs $numDefenseDice defense dice:"
    Write-Host "Attacker wins $winAttack out of $numRounds, this is $($winAttack / $numRounds * 100)% of the time"
    Write-Host "Defender wins $winDefense out of $numRounds, this is $($winDefense / $numRounds * 100)% of the time"
    Write-Host "Draws $draws out of $numRounds, this is $($draws / $numRounds * 100)% of the time"
    Write-Host ""
}

Write-Host ""

# check all options
CheckRisk 3 2
CheckRisk 3 1
CheckRisk 2 2
CheckRisk 2 1
CheckRisk 1 2
CheckRisk 1 1

Write-Host "Note: These are all approximations of the probabilities of winning/losing."
Write-Host "The higher the number of trial runs, the better the approximations."
Write-Host "The validity of the approximations depends on a solid implementation of the rules that are used."

I did add a warning at the end that these are approximations, and that their validity depends heavily on the implementation of the rules/logic. I am known to be good at, as I say it in Dutch, "Bug's maken": creating/fixing bugs. The Dutch word 'maken' can mean both 'create' and 'fix' in this context. As I am aware of that, I try to be cautious about claiming perfect code, although I hope this logic is correct.

If we run this simulation, with the number of runs set to 10000 here, we get an approximation of the probabilities of winning, losing or drawing. Here is an example output:

Situation 3 attack dice vs 2 defense dice:
Attacker wins 3686 out of 10000, this is 36.86% of the time
Defender wins 2910 out of 10000, this is 29.1% of the time
Draws 3404 out of 10000, this is 34.04% of the time

Situation 3 attack dice vs 1 defense dice:
Attacker wins 6601 out of 10000, this is 66.01% of the time
Defender wins 3399 out of 10000, this is 33.99% of the time
Draws 0 out of 10000, this is 0% of the time

Situation 2 attack dice vs 2 defense dice:
Attacker wins 2230 out of 10000, this is 22.3% of the time
Defender wins 4491 out of 10000, this is 44.91% of the time
Draws 3279 out of 10000, this is 32.79% of the time

Situation 2 attack dice vs 1 defense dice:
Attacker wins 5753 out of 10000, this is 57.53% of the time
Defender wins 4247 out of 10000, this is 42.47% of the time
Draws 0 out of 10000, this is 0% of the time

Situation 1 attack dice vs 2 defense dice:
Attacker wins 2528 out of 10000, this is 25.28% of the time
Defender wins 7472 out of 10000, this is 74.72% of the time
Draws 0 out of 10000, this is 0% of the time

Situation 1 attack dice vs 1 defense dice:
Attacker wins 4138 out of 10000, this is 41.38% of the time
Defender wins 5862 out of 10000, this is 58.62% of the time
Draws 0 out of 10000, this is 0% of the time

Note: These are all approximations of the probabilities of winning/losing.
The higher the number of trial runs, the better the approximations.
The validity of the approximations depends on a solid implementation of the rules that are used.

One thing I would like to mention: although PowerShell is easy to use, it might not be the most performant option, especially for complex simulations or a large number of simulation runs. That is something to take into account if you need good performance. I hope you liked this, and maybe you have some ideas yourself on things you would like to figure out this way.
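As a rough illustration of where time can probably be saved, the sketch below simulates the 1 attack die vs 1 defense die situation with a plain for loop, a single System.Random object and a progress update only once per 1000 rounds. This is an assumption about what is likely to help (updating the progress bar on every round tends to be costly), not a measured result, and the variable names are made up for this example.

```powershell
# Hypothetical leaner variant of the simulation loop (1 attack die vs 1 defense die).
# Assumption: fewer pipeline calls and fewer progress updates make the loop cheaper.
$numRounds = 10000
$random = New-Object System.Random
$attackerWins = 0

for ($round = 1; $round -le $numRounds; $round++) {
    # Next(1, 7) returns a whole number from 1 up to and including 6 (the upper bound is exclusive)
    $attackDie = $random.Next(1, 7)
    $defenseDie = $random.Next(1, 7)

    # a tie counts as a win for the defender
    if ($attackDie -gt $defenseDie) { $attackerWins++ }

    # update status only once every 1000 rounds
    if ($round % 1000 -eq 0) {
        Write-Progress -Activity "Monte Carlo Simulation in progress" -Status "Complete:" -PercentComplete ($round / $numRounds * 100)
    }
}
Write-Progress -Activity "Monte Carlo Simulation in progress" -Completed

Write-Host "Attacker wins $attackerWins out of $numRounds, this is $($attackerWins / $numRounds * 100)% of the time"
```

Wrapping both versions in Measure-Command is a simple way to check whether this actually makes a difference on your machine.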

Update: So I was wrong, as I was able to do a brute force calculation

After some additional research on the topic of probability, I thought about how I could calculate the probabilities, and realized that a brute force loop over all possibilities would also do. Sometimes it just requires a little bit of thinking, although I would have liked to do it with formulas, using R or some other method, if possible. Below you will find the PowerShell script I created for doing the brute force calculation for Risk dice throwing:

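# returns 1 attacker wins, 2 defender wins, 0 draw (each side wins one comparison)
function CheckWinings($diceAttacker, $diceDefender) {
    # Sort-Object rather than the 'sort' alias, which is not defined in PowerShell on Linux/macOS;
    # @() keeps the result an array, also when only one die was thrown
    $diceAttacker = @($diceAttacker | Sort-Object -Descending)
    $diceDefender = @($diceDefender | Sort-Object -Descending)

    # compare only as many pairs as the side with the fewest dice has
    $checkLength = [math]::Min($diceAttacker.Length, $diceDefender.Length)

    $winningOutcomes = @()

    for ($i = 0; $i -lt $checkLength; $i++) {
        $attackerDice = $diceAttacker[$i]
        $defenderDice = $diceDefender[$i]
        # highest die against highest die; a tie counts as a win for the defender
        $winningOutcomes += ($attackerDice -gt $defenderDice)
    }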


    $wins = ($winningOutcomes | where { $_ -eq $true }).Count
    $loses = ($winningOutcomes | where { $_ -eq $false }).Count


    if ($wins -gt $loses) {
        return 1
    }
    if ($loses -gt $wins) {
        return 2
    }
    return 0
}

function GetDicePossibilities($numberOfDice) {
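    if ($numberOfDice -lt 1 -or $numberOfDice -gt 3) {
        # stop here; otherwise the loops below would silently return the combinations for a different number of dice
        throw "$numberOfDice dice are not supported, please use 1, 2 or 3 dice."
    }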
    $options = @()
    $diceSides = 6

    1..$diceSides | ForEach-Object {
        $dice1 = $_
        $dice = @($dice1)
        if ($numberOfDice -ge 2) {        
            1..$diceSides | ForEach-Object {
                $dice2 = $_
                $dice = @($dice1,$dice2)
                if ($numberOfDice -ge 3) {
                    1..$diceSides | ForEach-Object {
                        $dice = @($dice1,$dice2,$_)
                        
                        $options += , $dice
                    }
                }
                else {
                    $options += , $dice
                }
            }
        }
        else {
            $options += , $dice
        }
    }
    
    return $options
}

function CheckRisk($numAttackDice, $numDefenseDice) {
    $winAttack = 0
    $winDefense = 0
    $draws = 0

    $attackPosibilities = GetDicePossibilities $numAttackDice
    $defensePosibilities = GetDicePossibilities $numDefenseDice

    $total = $attackPosibilities.Length * $defensePosibilities.Length
    $count = 0
    $attackPosibilities | foreach {
        $diceAttack = $_
        $defensePosibilities | foreach {
            $diceDefense = $_

            # check winnings
            $winnings = CheckWinings $diceAttack $diceDefense
            if ($winnings -eq 1) {
                $winAttack++
            }
            elseif ($winnings -eq 2) {
                $winDefense++
            } else {
                $draws++
            }

            # Uncomment to check and verify logic
            #Write-Host "$diceAttack => $diceDefense outcome => $winnings"

            # update status
            $count++
            $percentage = $count / $total * 100
            Write-Progress -Activity "Calculation in progress" -Status "Complete:" -PercentComplete $percentage
        }
    }

    Write-Host "Situation $numAttackDice attack dice vs $numDefenseDice defense dice:"
    Write-Host "Attacker wins $winAttack out of $total, this is $($winAttack / $total * 100)% of the time"
    Write-Host "Defender wins $winDefense out of $total, this is $($winDefense / $total * 100)% of the time"
    Write-Host "Draws $draws out of $total, this is $($draws / $total * 100)% of the time"
    Write-Host ""
}

Write-Host ""

# check all options
CheckRisk 3 2
CheckRisk 3 1
CheckRisk 2 2
CheckRisk 2 1
CheckRisk 1 2
CheckRisk 1 1

I have used the same logic to check the winnings. I added logic to get all dice options for 1, 2 or 3 dice, and then loop through all combinations, count the number of wins, losses and draws, and compare those against the total number of possibilities. This seems to be a much faster method than a Monte Carlo Simulation (with the 10000 rounds), and it is exact (provided that the logic is valid). The output is displayed below:

Situation 3 attack dice vs 2 defense dice:
Attacker wins 2890 out of 7776, this is 37.1656378600823% of the time
Defender wins 2275 out of 7776, this is 29.2566872427983% of the time
Draws 2611 out of 7776, this is 33.5776748971193% of the time

Situation 3 attack dice vs 1 defense dice:
Attacker wins 855 out of 1296, this is 65.9722222222222% of the time
Defender wins 441 out of 1296, this is 34.0277777777778% of the time
Draws 0 out of 1296, this is 0% of the time

Situation 2 attack dice vs 2 defense dice:
Attacker wins 295 out of 1296, this is 22.7623456790123% of the time
Defender wins 581 out of 1296, this is 44.8302469135802% of the time
Draws 420 out of 1296, this is 32.4074074074074% of the time

Situation 2 attack dice vs 1 defense dice:
Attacker wins 125 out of 216, this is 57.8703703703704% of the time
Defender wins 91 out of 216, this is 42.1296296296296% of the time
Draws 0 out of 216, this is 0% of the time

Situation 1 attack dice vs 2 defense dice:
Attacker wins 55 out of 216, this is 25.462962962963% of the time
Defender wins 161 out of 216, this is 74.537037037037% of the time
Draws 0 out of 216, this is 0% of the time

Situation 1 attack dice vs 1 defense dice:
Attacker wins 15 out of 36, this is 41.6666666666667% of the time
Defender wins 21 out of 36, this is 58.3333333333333% of the time
Draws 0 out of 36, this is 0% of the time

As you can see, these results are quite similar to the ones that were given by the Monte Carlo Simulation method. So the Monte Carlo method works, although it gives an approximate answer; it is better to just do the calculations if they are not that hard to do (and they require less processing power). One thing to add: this calculation assumes fair dice.
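To put a number on "quite similar", the sketch below compares the simulated and the exact share of attacker wins for 3 attack dice vs 2 defense dice, using the figures from the two outputs in this post. It assumes the usual standard error of a proportion, sqrt(p * (1 - p) / n), as a yardstick for how far a simulation with n rounds is expected to stray; the variable names are made up for this example.

```powershell
# Compare the Monte Carlo estimate with the exact result for 3 attack dice vs 2 defense dice.
# The counts are copied from the sample outputs in this post; the variable names are hypothetical.
$numRounds = 10000
$simulatedWins = 3686        # attacker wins in the Monte Carlo output
$exactShare = 2890 / 7776    # attacker wins in the brute force output

$simulatedShare = $simulatedWins / $numRounds

# standard error of a share estimated from $numRounds independent rounds
$standardError = [math]::Sqrt($exactShare * (1 - $exactShare) / $numRounds)
$difference = [math]::Abs($simulatedShare - $exactShare)
$differenceInStandardErrors = $difference / $standardError

Write-Host "Difference: $([math]::Round($difference * 100, 2)) percentage points"
Write-Host "Standard error: $([math]::Round($standardError * 100, 2)) percentage points"
Write-Host "Difference expressed in standard errors: $([math]::Round($differenceInStandardErrors, 2))"
```

The difference is roughly 0.31 percentage points against a standard error of roughly 0.48, so the simulated value is well within the spread you would expect from 10000 rounds. Because the standard error shrinks with the square root of the number of rounds, about 100 times as many rounds are needed for one extra digit of accuracy.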