Monte Carlo Powerbases - kusimanse - November 6, 2018
Note: this is a guest article. If you have questions about the back-end work for this article, please contact kusimanse. If you would like to contribute a guest article to A+Space, you can learn more here!
Hello, my name is Dan, kusimanse on Eternal, and today I’m going to be talking about mulligan statistics using simulations instead of hypergeometric calculations. I’m a graduate student who’s played Magic casually for a long time. I switched to Eternal after moving away from my play group, and mostly draft in Eternal. I’m not the greatest player by any stretch of the imagination, so I’ll stick more to numbers than lines of play.
I got this idea when watching a stream (Kaelos, I think?), where the streamer mentioned that because of the mulligan system, using a hypergeometric distribution to calculate opening hand odds isn't always correct. I was curious how big of an impact that would have on draws. This is also inspired by Frank Karsten's amazing series, where he uses statistics and modelling to estimate probabilities related to Magic.
Eternal Warcry has a simple version, but it doesn't estimate joint probabilities (for example, you can calculate the odds of drawing FF on turn four, but not FFSS), it doesn't model cards like Seek Power, where you have to choose which power you want, and it doesn't attempt to take mulligan decisions into account. I had most of this made when I came across this discussion about how to build the power base for Haunted Highway, so I also made a few changes to get Diplomatic Seal to work, under a lot of assumptions.
For opening hands before mulligans, hypergeometric distributions work quite well. For background, Shiftstoned has a very nice tool and great explanation. For mulligans, it becomes more complicated. As far as I know, the Eternal mulligan system works by giving you a 1/3 chance each of drawing 2, 3, or 4 power. To calculate probabilities of power distributions from here, you'd need to calculate probabilities of drawing power and influence distributions for 2, 3, and 4 power, then use the hypergeometric calculator over your remaining deck.
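As a rough illustration of the two calculations described above, here is a minimal Python sketch. It uses the 25 power and 18 Fire sources quoted later for Haunted Highway. The 75-card deck size, the function names, and the idea that the power in a redrawn hand is a uniform sample of the deck's power cards are all assumptions made for this example, not something taken from the actual tools.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k 'success' cards among `draws` cards drawn without replacement)."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)

def prob_at_least(k_min, population, successes, draws):
    """P(at least k_min 'success' cards among `draws` cards)."""
    top = min(successes, draws)
    return sum(hypergeom_pmf(k, population, successes, draws) for k in range(k_min, top + 1))

DECK_SIZE = 75      # assumption: 75-card deck, not stated in the article
POWER = 25          # power cards in the deck
FIRE_SOURCES = 18   # power cards that give Fire influence
HAND_SIZE = 7

# Blind seven-card hand: chance of at least one Fire source.
p_fire_blind = prob_at_least(1, DECK_SIZE, FIRE_SOURCES, HAND_SIZE)

# Redrawn hand: average over 2, 3, or 4 power, each with probability 1/3.
# Assumption: the power in the redraw is a uniform sample of the deck's power cards.
p_fire_redraw = sum(prob_at_least(1, POWER, FIRE_SOURCES, n) for n in (2, 3, 4)) / 3

print(f"blind seven: {p_fire_blind:.3f}")
print(f"redraw:      {p_fire_redraw:.3f}")
```

This only covers a single faction in the opening hand. Joint requirements such as FFSSP, and cards that fetch power, are where the bookkeeping starts to get painful.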
This gets fairly complicated quickly, especially once cards like Seek Power come into play (as best I can tell, Shiftstoned treats it as one of each sigil in your deck), so Monte Carlo simulations become a natural choice instead. For background on the type of simulations I'm going to run, and an explanation of how to make your own, Pro Tour champion Allen Wu has a great explanation here. The basic idea is pretty simple – to quote Allen, 'Monte Carlo simulation sounds fancy, but it essentially means just doing something over and over and tracking what happens'. You then use what happens to estimate the probabilities that you want to calculate.
To do this, I wrote a basic solitaire Eternal simulator. Given a power base, it fills the rest of the deck out with blanks (and Seek Powers, if required), then draws opening hands, mulligans, and plays the first ten turns of the game repeatedly. Each time, the results are recorded, and the relevant numbers are aggregated and then divided by the number of times it was run. For example, to calculate the odds of drawing at least one Fire influence, it takes a given deck, draws the opening hand and ten turns of draws, counts how many of them had one or more Fire influence, then aggregates that across multiple runs. In this case, I ran the simulation 100,000 times, which was a large enough number to give relatively stable results* without taking more than a minute to run. I verified the results by comparing the initial hands against the Shiftstoned calculator and mulliganed hands against the Eternal Warcry tool – the mulligan system is pretty simple, so I hope this is working properly, but there's always a chance of bugs, especially with the Diplomatic Seal stuff. If anyone notices any fishy numbers, let me know or feel free to look at the code.
*(within +/- 1% from what I saw, but this is super unscientific)
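To make the idea concrete, here is a stripped-down Python sketch of that kind of loop. It is not the simulator used for this article. It ignores mulligans and Seek Power, assumes a 75-card deck, assumes you draw one card every turn, and treats "turn 0" as the opening hand. All function names are made up for the example.

```python
import random

def build_deck(fire_sources, other_power, deck_size=75):
    """Deck as labels: 'F' = Fire source, 'X' = other power, '-' = blank.
    Assumption: 75 cards, everything that is not power is a blank."""
    blanks = deck_size - fire_sources - other_power
    return ["F"] * fire_sources + ["X"] * other_power + ["-"] * blanks

def saw_fire(deck, turn, rng, hand_size=7):
    """One game: draw the opening hand plus one card per turn (assumed),
    and report whether at least one Fire source showed up."""
    cards_seen = rng.sample(deck, hand_size + turn)
    return "F" in cards_seen

def estimate_fire_odds(deck, turn, runs, seed=0):
    """Repeat the game `runs` times and return the fraction of hits."""
    rng = random.Random(seed)
    hits = sum(saw_fire(deck, turn, rng) for _ in range(runs))
    return hits / runs

def standard_error(p, runs):
    """Standard error of an estimated probability p after `runs` independent runs."""
    return (p * (1 - p) / runs) ** 0.5

deck = build_deck(fire_sources=18, other_power=7)
for turn in (0, 2, 4):
    p = estimate_fire_odds(deck, turn, runs=100_000)
    print(f"turn {turn}: {p:.3f} +/- {standard_error(p, 100_000):.4f}")
```

With 100,000 runs, the standard error of any estimated probability is at most about 0.0016 (the worst case is p = 0.5), so run-to-run wobble well inside 1% is what you would expect.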
I’ll do a few things here:
Using NotoriousGHP’s Haunted Highway, compare odds to draw power and influence under a blind draw 7 and a mulligan.
Using Almost’s West-Wind Combo, compare odds of having Seek Power vs. extra power in your deck on the mulligan
A very basic comparison of ManuS's Diplomatic Seal power base vs. the standard for Haunted Highway decks. This is going to be extremely oversimplified and preliminary, though.
Finally, I’ll have some fun with the most fun card to come out of the latest campaign – Cauldron Cookbook – and ask how much can you cook before your soup boils over?
Mulligans vs. opening hands
For the most basic case, take NotoriousGHP's Haunted Highway deck from the Reunion tournament. I chose it mostly because it’s a cool deck that wants to hit high influence requirements early. It has a total of 25 power, with 18 Fire sources, 15 Shadow sources, and 11 Primal sources. There are a few caveats here. I ignore the influence of Nightfall and crests, which both potentially shift the power curve one turn faster. I also ignore whether power enters depleted/undepleted – this is important, but tricky because of the importance of Banners.
With these caveats in mind, here’s what the odds of drawing various influence turns 1-10 are:
There is a fairly large difference here between drawing seven random cards and mulliganing. You average ~0.6 more power per turn with a mulligan, and there are fairly large improvements in drawing each power, as well as in hitting your deck's influence requirements – you've gone from a one third chance to a half chance of drawing your deck's needed influence on turn 4! Your actual odds are even better – this doesn't consider mulligan decisions! To take those into account, I ran a simulation with a simple mulligan strategy – only keep hands with 2-4 power and at least one of each source of influence (this is ‘strategy’ in the table). This is obviously way too simple – it keeps some awful hands without any synergy, and throws away some usable ones (say, with Nightfall), but it should give an idea at least of what happens under smarter mulligan decisions.
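The keep rule above is simple enough to sketch in a few lines of Python. This is an illustration rather than the actual simulator code. The deck is a made-up split of dual and single sources that happens to match the 18 Fire / 15 Shadow / 11 Primal counts (it is not the real decklist), the 75-card size is assumed, only one mulligan is modelled, and the redraw follows the "1/3 chance each of 2, 3, or 4 power" description from earlier.

```python
import random

REQUIRED = "FSP"  # factions the keep rule wants represented in hand

def make_deck():
    """Hypothetical 75-card deck: each power card is a string of the influence
    it provides, and None is a blank (non-power) card."""
    power = ["FS"] * 10 + ["FP"] * 5 + ["SP"] * 4 + ["F"] * 3 + ["S"] * 1 + ["P"] * 2
    return power + [None] * (75 - len(power))

def power_cards(cards):
    return [c for c in cards if c is not None]

def keep(hand):
    """Keep only hands with 2-4 power and at least one source of each faction."""
    power = power_cards(hand)
    if not 2 <= len(power) <= 4:
        return False
    factions = set("".join(power))
    return all(f in factions for f in REQUIRED)

def redraw(deck, rng, hand_size=7):
    """Mulligan: 2, 3, or 4 power with equal probability, blanks for the rest."""
    n_power = rng.choice((2, 3, 4))
    blanks = [c for c in deck if c is None]
    return rng.sample(power_cards(deck), n_power) + rng.sample(blanks, hand_size - n_power)

def opening_hand(deck, rng, hand_size=7):
    """Draw seven; if the keep rule rejects the hand, take the redraw as-is."""
    hand = rng.sample(deck, hand_size)
    return hand if keep(hand) else redraw(deck, rng, hand_size)

rng = random.Random(2018)
deck = make_deck()
hands = [opening_hand(deck, rng) for _ in range(10_000)]
share_ok = sum(keep(h) for h in hands) / len(hands)
print(f"final hands that satisfy the keep rule: {share_ok:.3f}")
```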
One interesting thing to note is your overall odds of hitting the deck’s influence requirements haven't changed much between drawing seven and mulliganing. The first turn actually drops, as the best way to get FFSSP on turn one is by having a ton of power, but it's fairly similar throughout – this mulliganing strategy encourages keeping hands that look like post-mulligan hands, so the odds of drawing into your full influence later on are probably similar.
The odds of hitting early power are even better than with this simple strategy, though! Crests and Nightfall would improve on these results – with one Nightfall card and one crest, you can essentially shift the curve over two turns and be at 75% to hit your influence requirements by turn 4.
The influence of Seek Power on redrawn hands:
Many decks play Seek Power not only for finding sigils to fix influence or for spell synergies, but also to improve the consistency of the deck on mulliganed hands. The 2-4 power drawn in these hands doesn't include effective sources of power like Seek Power or Vara's Favor, which increases the odds of drawing power in your mulliganed hands. For this example, I used Almost's awesome winds combo deck. This deck has power requirements of 6SSPPF. For the non-Seek Power version, I swapped 4 Seeks for 2 Shadow, 1 Primal, and 1 Fire (these changes are fairly arbitrary). To implement Seek Power, I had it compare the influence that was currently drawn to the deck's influence requirements, then fetch the influence that it was missing the most. For example, if you drew FFS it would see it’s missing -1 F, 1 S, and 2 P, so it would choose to fetch a Primal. Ties were broken in an arbitrary fashion, but changing the order of the tiebreakers didn’t really change the outcome.
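The fetch rule is easy to express in code. Here is a small Python sketch of it. The function names and the default tiebreak order are hypothetical choices for the example, not necessarily what the simulator does.

```python
from collections import Counter

def missing_influence(requirements, drawn):
    """How much of each required faction is still missing (negative = surplus)."""
    need = Counter(requirements)
    have = Counter(drawn)
    return {faction: need[faction] - have[faction] for faction in need}

def choose_sigil(requirements, drawn, tiebreak="SPF"):
    """Fetch the faction that is missing the most.
    Ties go to whichever faction comes first in `tiebreak` (an arbitrary order)."""
    missing = missing_influence(requirements, drawn)
    candidates = [f for f in tiebreak if f in missing]
    # max() returns the first maximal item, so the order of `candidates` breaks ties.
    return max(candidates, key=lambda f: missing[f])

print(missing_influence("SSPPF", "FFS"))  # {'S': 1, 'P': 2, 'F': -1}
print(choose_sigil("SSPPF", "FFS"))       # P
```

Swapping the `tiebreak` string is a quick way to check how much the tiebreak order matters for a given deck.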
A simple switch of 4 power to 4 Seek Power has a fairly large impact on outcomes! You draw ~0.3 more power on average and improve your influence distribution quite a bit. This was pretty much already known, but it’s cool to see the impact.
How does Diplomatic Seal compare in later turns vs. a non Diplomatic Seal power base?
I'm going to make the same assumption here I did with Seek Power – if you can play a Seal and gain influence, gain influence towards your influence requirements, favoring S>F>P (just going by the faction breakdown). The decisions on how to play Seal are also simplified: if you have it in your hand and have less than 3 influence, you play it; otherwise you play a random power. (When played, the influence from Seal is added to the total influence you've drawn, which is what is displayed here. Played power is only kept track of to determine if the Seal produces influence.)
I’d probably slightly lean towards ManuS’s version here, although the differences are quite small: You have a lower chance of having Fire influence in your opening hand, but there is a higher chance you can play undepleted Fire turn one (which isn't captured here). The flexibility Seal gives early game provides the deck a higher chance midgame of hitting all of their power, but it's surpassed in a few turns, which Nightfall speeds up. This could easily be wrong – either way, the power is quite robust for a deck that wants to hit FFSSP early. The tradeoff looks to be two extra undepleted power for the early game and slightly better mid game influence for Diplomatic Seal, versus slightly better late game influence otherwise.
Note: Turn zero numbers aren't all that meaningful, as the Diplomatic Seals don't count for influence until they are played, which is turn 1 at the earliest.
How much cook could a Cookbook cook if a book cook didn’t blow up?
If you cook every turn, how much damage do you take? How long until your recipe goes soupernova?
After about 20 turns you have a 50/50 chance of killing yourself, and after 25 turns your goose is cooked. Damage goes up quickly after that too, but if you haven’t been killed or killed them after drawing 23 extra cards, you have other problems brewing.
If instead, you started drawing on turn 5 because you grabbed cookbook with a merchant, and drew on a random 50% of turns, what happens?
That looks much better. We pushed off a gristly demise by another twenty turns, and more than halved the damage we take. One thing to note, though, is that the variance is very high. I ran this playing Cookbook turn five, with a 90% draw probability, calculated the standard deviation, and plotted that as well:
In this graph the dark line is the expected damage, the blue band all values within one standard deviation, and the pale band two standard deviations. What that effectively means is there’s approximately a 68% chance of taking damage within the blue band and a 95% chance of taking damage within the pale band on any given turn. So on turn 15, there is an expected damage of 9.76 and a standard deviation of 5.93. This gives an approximate 68% chance of taking damage in the range [3.83, 15.69], and a 95% chance of taking damage in the range [0, 21.62] (the lower end is cut off at zero, since you can't take negative damage). This is a huge range: you’d have to be pretty unlucky to take 25 damage, but taking 15 or even 20 by then isn’t out of the realm of possibility.
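For anyone who wants to redo the band arithmetic for another turn, here is a tiny Python sketch. It uses the turn 15 mean and standard deviation quoted above. The function name is made up for the example, and the 68% / 95% reading assumes damage is roughly normally distributed, which is only an approximation for something that can't go below zero.

```python
def damage_band(mean, sd, k):
    """Range within k standard deviations of the mean, with the low end
    clamped at zero because damage taken cannot be negative."""
    low = max(0.0, mean - k * sd)
    high = mean + k * sd
    return round(low, 2), round(high, 2)

mean_t15, sd_t15 = 9.76, 5.93        # turn 15 values quoted in the article

print(damage_band(mean_t15, sd_t15, 1))   # (3.83, 15.69)
print(damage_band(mean_t15, sd_t15, 2))   # (0.0, 21.62)
```

The mean minus two standard deviations would be about -2.1, which is why the wide band starts at zero.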
Cauldron Cookbook offers the possibility of burying your opponent, but you should have life gain or a fast clock to pair with it, otherwise your opponent can sit back and let you stew in your own juice.
Note: These simulations don't take into account that Firebomb draws you an extra card. That doesn't change the results all that much: for example, the mean and standard deviation on turn 15 change by less than 0.2.
I'll probably tinker with this a bit – add support for tapped and untapped power sources and such, along with various fun gimmicks as I find things that interest me. If there's enough interest in it, I'll try to turn it into a more usable interface/tool (it'll probably still be in the terminal, though), with better support for deck import and fancier tables and graphs. If anyone has anything else they’d like supported, or any other interesting ideas for simulations, let me know!
For all of you who made it through the whole thing, or even just skipped to the end, thank you for reading this! I’d love to hear any feedback - I lurk on Reddit and Discord, and will look at issues on GitHub as well.