Tuesday, 7 June 2016

Don’t Loot Unless You’re Bad (#Clickbait)

LSV posed the following hypothetical:

The Looter Problem

This is my answer.

1) Looting in this situation is functionally equivalent to milling yourself 1 card.

The only difference is whether we would keep the card we draw off the loot, which we would only do if the top card of the library is better than Lightning Blast.

But if we knew the top card was better than Lightning Blast, in this situation we would prefer not to loot anyway. So let’s consider another, less controversial, equivalent question:

2) In the middle of a game of limited, if you were given the opportunity to mill the top card of your library, would you?

2a) Decking
There is a small chance that you deck earlier and this ends up being relevant.

Verdict: Very Small Negative.

2b) Information
Harder to evaluate but I think generally symmetrical information is bad for player 1. The unknowns in your hand are the hardest part for an opponent to play around optimally. Knowing that player 1 has milled their bomb/sweeper/combat trick, is almost certainly going to be more valuable for their opponent than it is for Player 1. In some sense, the most valuable information for you is when some # of outs have been eliminated. But again we have a case where in order for the information to be valuable our deck composition has become worse.

Verdict: Net Small Negative.

2c) Variance

This is the most important factor for the decision. If you choose to mill you are increasing the variance of your draw. At the moment you mill, each card has some value to you, and your draw step has the expected value of those cards. But if you remove a card at random you are changing the EV of your draw step. If you remove a good card you are decreasing it, and if you mill a bad card you are increasing it.

However, averaged over all possible mills, the change in EV is net zero. This should hopefully be intuitive.

So milling the card increases the variance of your draw step (it is now higher or lower EV than before), but that change was net neutral in EV gained. So you have just generated variance with no gain in EV.
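A quick way to check the "net zero" claim is to enumerate every possible mill. The sketch below uses a made-up ten-card library with invented card values (both are assumptions for illustration, not anything from LSV's problem) and treats the unknown top card as equally likely to be any card.

```python
from statistics import mean, pvariance

# Hypothetical card values for a small library (illustrative assumption).
library = [5, 4, 3, 3, 2, 2, 1, 1, 0, 0]


def draw_ev(cards):
    """Expected value of drawing one card uniformly at random."""
    return mean(cards)


ev_before = draw_ev(library)

# EV of the next draw after milling each possible card in turn.
ev_after_each_mill = [
    draw_ev(library[:i] + library[i + 1:]) for i in range(len(library))
]

# Each card is equally likely to be the one milled, so a plain average applies.
avg_ev_after = mean(ev_after_each_mill)

# How much the draw-step EV moves around depending on what was milled.
spread_of_ev = pvariance(ev_after_each_mill)

print(f"EV before mill:        {ev_before:.3f}")
print(f"Average EV after mill: {avg_ev_after:.3f}")
print(f"Spread of EV (var):    {spread_of_ev:.4f}")
```

The average EV after the mill matches the EV before it, while the spread is strictly positive: the mill moves your draw-step EV up or down without changing it on average.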

Verdict: Negative if adding variance is bad.

3) So when do we want more variance?

As usual we want more variance in the situations where we are behind or if we think our opponent is better than us.

In this particular example, there is no evidence we are currently behind. And I would hope for most people that they think they are better than their random opponent.

Verdict: In the middle of a limited game you normally wouldn’t want to mill your top card.

Final: Thus you don’t loot.

Monday, 18 August 2014

4th at the Toronto WMCQ with BW Midrange.

Elspeth, Sun's Champion
Desecration Demon
Pack Rat
Nightveil Specter
Obzedat, Ghost Council of Orzhova

Banishing Light
Bile Blight
Devour Flesh
Sign in Blood
Hero's Downfall
Whip of Erebos
Godless Shrine
Urborg, Tomb of Yawgmoth
Caves of Koilos
BW Temple

Lifebane Zombie
Sin Collector
Underworld Connections
Drown in Sorrow
Last Breath
Banishing Light
Doom Blade

I have been experimenting with a lot of versions of Bx in the past few months. Before the PT I was interested in exploring versions which were similar to standard jund decks of old. In other words:

-Lots of Removal
-A few insane finishers
-A little bit of card advantage
-A couple of anti-control cards

Key to these builds was moving away from the standard MonoB synergy power and towards individual card power. Channelfireball had already moved in this direction, but I had enough experience with the deck to recognize a lot of things I didn’t like in their PT build.

After grinding for a week (including winning a Bye for WMCQ), I came across BBD’s build at 8am on Sunday morning. His biggest innovation was 3x Obzedat and 0 Blood Barons main. It’s also where I saw 4 Sign in Bloods.

I did see his article on SCG, but mostly adopted the changes based on a couple of things that really clicked for me after playing around with something closer to CFB’s list.

No BBoV: This card is good against MonoB and GW. But MonoB had already adopted 4 devour fleshes and 4 lifebanes main.  Rabble Red and Jund Planeswalkers had also adopted a million ways to answer it.
Meanwhile Obzedat is a faster clock (though worse at stabilizing) and more resilient. Basically demons 5-7. The clock part is highly relevant, because even if you tear their hand/board to shreds you can’t control the top of their deck (as a true control deck can). So you need to kill them.

Sign in Blood: Underworld Connections becomes a debatable inclusion when not playing with Gray Merchant. It’s definitely great against UW and decent against the mirror. But Sign in Blood has a couple of key things going on. First, it lets you play 25 lands. Second, it makes your deck better on the draw. There is a significant portion of the field where Connections is a mulligan on the draw (MonoU, GW, Rabble Red, even Planeswalkers sometimes). Your deck is already bad on the draw because Pack Rat loses so much power. Against MonoB, Sign in Bloods are great because you want to start drawing cards rapidly in the midgame after you have been Thoughtseized and LBZ’ed. They also make your curve much smoother since:

Sign -> 3 Drop or Removal ->Sign-> demon is much less awkward than when you have a Connections.

2 NVS, 1 LBZ: Lifebane Zombie is slowly becoming awful. Big green idiot decks are pretty poorly positioned and I am tired of getting Chandra’ed. The card is good against GW, but NVS is reasonable there also. Meanwhile NVS is much better vs MonoB, MonoU, UW and Rabble Red, the 4 most important decks in the format. In general I am not high on 3 drops and almost cut them all from my deck. If you noticed my analogy to Jund, then the 3s don’t really fit anywhere in it. You could definitely save some sideboard slots and run 2 LBZ (cutting the one from the board).

4 Demon: Playing fewer than 4 Demons is completely idiotic. It’s your anti-bullshit card. People obsess over Pack Rat without realizing that Demon kills them just as quickly.

Duress Main: Very good against jund planeswalkers. Decent against rabble, GW and the mirror. Pretty bad against MonoU.

SB: The sideboard really only has one interesting decision. Either you respect Rabble Red (and play 3 Drowns) or you use those slots to beat up on Midrange Green decks or GW. On Modo I saw barely any GW all week and the dealers were sold out of Legion Loyalists. So I decided on 3 Drowns.

Erebos is an awful card. Only good against UW and then often just turns on Deicide.
The second last breath, LBZ and BBoV are the cards you want to beat GW.
You don’t really need to sideboard in the mirror.

The only mistake in deck construction was the Whip of Erebos. It should probably just be an interactive spell or a bloodbaron. I have never liked Whip in MonoB or Bg. But because this was my first time playing 3 obzdaddy, I decided to try it. Since BBD and Ben Friedman were clearly on board, I figured it might be better in BW than the previous MonoB versions I was more familiar with. I only drew it g1 twice and cast it once (vs Hayne and I lost).  I included it to try and be better against UW (since I was cutting connections), but am not sure it does enough.

I side it out versus basically every deck since I am not really interested in racing in any matchup except the mirror.

W - MonoU
W - Jund PW
W - UW

W - UW
W - Br
W - Jund PW
Draw (I couldn’t play for seed because I got paired against a friend).  This would come back to bite me because I would be on the draw vs GW in the semis (which probably swings matchup by ~10-20%).

QF: Beat GW on the play.
SF: Lost to GW on the draw.

Tuesday, 29 July 2014

NEW Organized Play Changes

Doing the Math.
sPTQ=Sub-PTQ = Feeder events.
rPTQ = Regional PTQ

Toronto has a population of about 3.5M people. Those people are served by 30 stores according to site location on Wizards (was actually easy to find with google!). I assume that 90% of the stores can and will host the sPTQs. An rPTQ has to cover about 20.6M in NA assuming it covers an equal fraction (16 rPTQs for 340M people in NA).

So that means we expect the rPTQ to average about 160 people in attendance (triggering the 128 cap). If the cap is not triggered (variance in attendance or less than 90% of stores hold sPTQs) then things are worse for most people.
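The attendance figure can be reproduced by scaling Toronto's store density up to a full rPTQ region. This is a rough sketch of that arithmetic; it assumes (my assumption, not stated above) that each hosting store runs one sPTQ per season and sends exactly one winner to the rPTQ. Note that 340M split 16 ways is 21.25M rather than 20.6M, but either figure lands near 160.

```python
# Figures taken from the paragraph above.
toronto_population = 3.5e6
toronto_stores = 30
hosting_fraction = 0.90       # share of stores assumed to host an sPTQ
rptq_population = 20.6e6      # population covered by one rPTQ

# Stores per person in Toronto that are expected to host an sPTQ.
hosting_stores = toronto_stores * hosting_fraction
hosting_stores_per_person = hosting_stores / toronto_population

# Assumption: one sPTQ per hosting store, one qualifier per sPTQ.
expected_rptq_attendance = hosting_stores_per_person * rptq_population

print(f"Hosting stores in Toronto: {hosting_stores:.0f}")
print(f"Expected rPTQ attendance:  {expected_rptq_attendance:.0f}")
```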

Consider a “grinder”. This player will attend up to 10 sPTQs and has a 10% chance to win each of them. He also has roughly an 8% chance (1.5x an 8/160 chance, i.e. 7.5%) to top 8 the rPTQ (reflecting his skill relative to increased difficulty).

Expected Cost of Qualifying:
6.5 Local sPTQs (sub 1.5 hr travel) + 1 rPTQ (65% of the time).
5% net probability of qualifying (you can increase to 7.1% chance if you attend all 27 sPTQs).

For comparison, under the existing PTQ system the grinder is willing to travel to 6 PTQs a season (everything <4 hours away) and has a 2% probability of winning each PTQ.

Expected Cost of Qualifying:
6 PTQs (since low prob of winning).
11% Chance of Qualifying (this falls to 5.9% if you are only willing to attend 3 PTQs).

Under the new system you will have less travel time per event, but spend more time on Magic. You are about ½ as likely to qualify as when you attended 6 PTQs a season, but only about 1% less likely than when you attended 3 PTQs (though your Magic time is now 3x).
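For anyone who wants to check the percentages, here is a small sketch of the arithmetic. It assumes each event is an independent trial, that the grinder stops entering sPTQs after the first win, and that the rPTQ top 8 chance is 1.5 x 8/160 = 7.5%; the function names are mine.

```python
def p_at_least_one_win(p_win, events):
    """Chance of winning at least one of `events` independent tournaments."""
    return 1 - (1 - p_win) ** events


def expected_events_played(p_win, max_events):
    """Expected number of events entered when you stop after your first win."""
    # You enter event k+1 only if you lost the first k events.
    return sum((1 - p_win) ** k for k in range(max_events))


RPTQ_TOP8 = 1.5 * 8 / 160  # 7.5%

# New system: win an sPTQ (10% each), then top 8 the rPTQ.
new_10 = p_at_least_one_win(0.10, 10) * RPTQ_TOP8
new_27 = p_at_least_one_win(0.10, 27) * RPTQ_TOP8
sptqs_played = expected_events_played(0.10, 10)

# Existing system: win a PTQ outright (2% each).
old_6 = p_at_least_one_win(0.02, 6)
old_3 = p_at_least_one_win(0.02, 3)

print(f"New, up to 10 sPTQs: {new_10:.1%} (about {sptqs_played:.1f} sPTQs played)")
print(f"New, up to 27 sPTQs: {new_27:.1%}")
print(f"Old, 6 PTQs:         {old_6:.1%}")
print(f"Old, 3 PTQs:         {old_3:.1%}")
```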

Friday, 14 March 2014

Random Thoughts on BTT Limited

Biggest Disagreements with CFB articles on limited (Limited to 2 cards per color):

Siren of the Fanged Coast (MindControl Tribute): Seems to be rated as one of the better uncommons. I think it is mediocre. Essentially equivalent in power to a Prescient Chimera. And since we get two packs of Chimeras/Horizon Scholars I am not interested.

It’s often going to be a 4/4 flier, which is obviously good but not a ton better than Prescient Chimera. However on an empty board or versus a 2/1 (or 1/1, 2/1, 2/3 etc) it becomes a 1/1 + shitty creature. Imagine a board of three 2/1s. Mind control can be worse than a 3/4.

Archetype of Imagination: This is much better than mind control tribute. It can single handedly win a game. It represents a fast clock with any other board presence. Of course you can get blown out by removal but that isn’t much different than many other game changing pants in history of magic. At least in this format instant speed removal is much harder to come by. Perfect curve topper in any aggressive blue deck, though if it could only stop the unbeatable Nessian Asp it would be perfect.

Hero of Iroas: Frank Karsten has it as the third best rare, which mostly doesn’t make sense in the context of how he rates Akroan Skyguard. I have had the Hero twice; it makes about 2 mana per game (when you draw it). Usually only one of that mana is meaningful (in terms of improving your curve). That upside does not make it significantly better than Skyguard/Wingsteed/Favored Hoplite etc… White is the best color, but Hero is much closer to Skyguard than Endless Legions.

Akroan Phalanx: White/Red is the best deck (or at least tied for it). Phalanx is insane in that deck while serviceable if you can’t activate it. However it’s fairly easy to get a Shimmering Grotto or Nylea’s Presence. I am definitely first picking it over Vanguard of Brimaz.

Everflame Eidolon > Fall of the Hammer >> Bolt of Keranos and Searing Blood.

I think cheap removal is overrated since I only really want it vs ordeals (and edict hero). You also end up low on space in your heroic decks since combat tricks are also required. Everflame Eidolon is cheap, trades with everything and is a huge threat on an empty board.

Fearsome Temper: Second best common in the set. Can’t fathom a world I take searing blood over this. Post PT I started taking it over Bolt (which I didn’t have the balls for even though it felt right initially). Do not regret.

Nessian Wilds Ravager: Personal bias definitely plays a part here, but I think the card is very overrated again. Worse than Nessian Asp. If you are losing to a huge flying creature (or Aqueous Form etc.) the ability isn’t saving you. It comes down slow (acceleration is now less common as well). Basically too interchangeable with other shitty green boom booms to be a first pick.

Mortal’s Resolve: Fantastic card. This is green’s wannabe Gods Willing. Should be in discussion for best green common in the set (at least your first copy). On that note, stop passing Boon of Erebos.

For commons/uncommons:
1. Bile Blight
2. Asphyxiate
3. Servant of Tymaret

Don’t like Shrike Harpy in aggressive decks, and it’s more replaceable than the Servant (though maybe slightly more powerful).

Thursday, 27 February 2014

Welcome to the Jungle.

I played Zoo at the PT. It didn’t go well. Limited didn’t go well either.

I still think this is the best Zoo deck. But given that 3 people played the deck and the other two hated it, maybe it’s time to let go.

The List:
4 Wild Nacatl
4 Kird Ape
4 Loam Lion
4 Experiment One
4 Tarmogoyf
2 Flinthoof Boar
3 Mutagenic Growth
4 Ghor-Clan Rampager
4 Lightning Bolt
4 Path to Exile
4 Tribal Flames

4 Verdant Catacombs
4 Misty Rainforest
4 Arid Mesa
1 Forest
1 Hallowed Fountain
1 Steam Vents
1 Blood Crypt
1 Temple Garden
1 Sacred Foundry
1 Stomping Ground

3 Pyroclasm
3 Scavenging Ooze
1 Destructive Revelry
1 Ancient Grudge
1 Ray of Revelation
1 Combust
1 Torpor Orb
1 Thrun, the Last Troll
1 Sword of WaP
1 Harm's Way
1 Stony Silence

On this Zoo deck:
16 one drop zoo makes you a 50/50 deck where 15% of the time you draw 3 one drops and they are just dead. You also get a bunch of free wins against people who just didn’t really respect zoo. The main question is why this is better than a bigger version of zoo.

Going large is a lot less effective in the mirror these days. It used to be that a 4/4 (or 4/5) blocking a 2/3 or 3/3 was always a 2 for 1. Mutagenic Growth and Rampager changed that. All of a sudden your opponent often has to block on turn 3 or 4 with his 5/5 knight and you are going to blow him out.

People were also focused on fighting Zoo with permanents so I wanted to have the best reach possible. Once the board gums up you often need to be able to deal 5-8 damage off of two cards. Enter Rampager/Tribal Flames. We tested the mirror a bunch and felt that the added instability/damage from lands was basically never the deciding factor in the mirror. Against most big decks you are the aggressor (and life is not relevant) and against the other smaller decks flood/screw and helixes were the most important things. Taking an additional 2 from your lands wasn’t a big deal.

3 drops in general suck because you end up having to build a manabase which wants to get to 3. This means you can’t operate on 1 or 2 effectively and when you flood you draw 5 or 6 lands instead of 4 or 5. I think 3’s also generally are a trade off between resilience and speed. For an open format I generally want speed.

Maindeck I wanted to be immune to 2 power creatures (electrolyze/grim lavamancer/magma spray) being relevant by themselves. E.g. I didn’t want the front half of Voice, back half of Finks or other random 2-power dudes to do anything against me by themselves. Thus there are no Goblin Guides or Burning-Tree Emissaries. In most matchups you are fast enough without them.

The biggest sideboard innovation came from Todd. His suggestion of Pyroclasm was excellent in a variety of very close matchups: Affinity, Pod, BW Tokens, UR Pyromancer. It was also randomly decent against things like Boggles. In those matchups they are often relying on chumping while stabilizing or racing. Being able to clear multiple blockers (for two mana) is often enough to swing the game.

You can’t really bring out more than 5 cards in any matchup with this deck, so the rest of the sideboard is to provide you with some disruptive cards that cover almost any and every matchup. You can have insane 4 drops for the UWR matchup because they path you.

I had concerns with 3 matchups:
1) Burn.
2) UWR.
3) Twin decks built like our team’s. Tempo twin decks (such as ones that top 8ed) are much easier to beat because Spellskite is the real problem card. Didn’t think most people would play the more combo-ish version.

The manabase is a work of art, avoiding the chronic Steam Vents + Scalding Tarn combo that seems perennial in almost every Tribal Flames deck.

On Deck Selection:
The reason the top pros didn’t do well (in my opinion) is because even if you found the best archetypes with 3-5 days to go it didn’t matter. You weren’t going to be able to play/build them proficiently. Not having experts in the "hard" decks also made some of the fringe strategies seem better than they probably were. For example I know that going into the week of testing I thought that Scapeshift crushed Pod. I still think it’s favorable but much closer than I previously thought.

2-3 days before the PT I thought Pod was the best deck. But, after watching Josh McClain, talking to Sam Pardee and trying a few games for myself, I realized I couldn’t play it. I just don’t have the intuition to be able to pilot the deck with anything close to optimality.

In addition to Pod, I was confident we also had the best Affinity list (courtesy of Alex Majlaton), the best Burn deck (courtesy of Glenn McIelwain and team refinements), the best Zoo deck (me and Todd) as well as the best Twin list (Glenn again).

Burn: This was our best anti Zoo deck. But it was a bit of a glass cannon and I (pretty much alone on the team) thought there was a good chance that Zoo would be a small part of the metagame (~10%) or not at the top tables.

Affinity: Early in testing we were all looking at cutting affinity hate or playing more generic cards (e.g. Destructive Revelry over Stony Silence). We didn’t think affinity would be a big player. When everyone starts believing that, it becomes the perfect time to robot some people. I am not good at affinity so it wasn’t really an option for me. It also turned out that other people kept their affinity hate for the most part.

Twin: For the metagame Face to Face was testing I thought Glenn had broken it. It was a better version of Burn as far as I was concerned. The all in Twin version had much better game vs Zoo and sacrificed against U-Control decks and Thoughtseize decks. Neither of which I thought would be big players (but I was open to being wrong here). However after testing a few games I was miserable. I think in 10 games I won 1. It was a weird headspace where I couldn’t beat the deck and I couldn’t win with it. I was drawing 3 Splinter Twins in every game or missing 4th land drops. My mind said the deck was great, my practice said it sucked. On the other hand I was confident in every card choice for Zoo.

Zoo: From our testing even the decks built to beat Zoo only won 40% of the time. And doing so contorted your deck to be worse against everyone who wasn’t trying so hard. Thus I thought the top tables would just be the versions of decks where they didn’t try so hard to beat Zoo (and that is essentially what Sam/Jacob/Josh did). Unfortunately most people decided to just jam 3 Anger of the Gods main and it didn’t hurt them because everyone was doing it.

Going forward I would be okay playing this deck again. The only matchup I wouldn’t want to play against for sure in the top 8 was Sean’s deck. Storm might also be bad, but I assume it will be unplayable in the near future as people go back to having some hate for it. I have no idea how Blue Moon plays out, but I know they have a lot of cards I normally don’t mind seeing across from me. Maybe their LD is good enough.

Tuesday, 19 November 2013

Facts of MODO

There are a lot of conversations about why MODO (Magic Online) is broken. This post is designed to collect information to inform that conversation. People correctly criticize me for opinions without facts. So I am going to try and go the other way.

I do not have any access to truly insider information from Wizards of the Coast or Hasbro.

What I do have access to:
Hasbro Financials (this includes transcripts from Investor Day, Accounting and Financial Statements, Annual Investor Report).
Facebook Comments (assorted from friends and acquaintances)
LinkedIn Profiles for various WotC employees
Hipstersofthecoast.com historical review of the MODO program (highly suggested reading).
Various salary and employment websites (glassdoor, salarylist, careerbliss)
Reddit, Wikipedia, Google etc..

The TL;DR estimates
Magic Revenues: $360M
Magic Players Worldwide: 3.3M - 12M
MODO Revenues: $140M
MODO Employees: 50-150
MODO Players: 500,000 - 700,000
MODO Developer Salaries: $60,000 (Industry Median is $75,000)
MODO Costs: $60-120M

The biggest problem with accuracy is that my statements will aggregate data from 2011-2013. And the timeline isn’t exactly clear even to me. So you might feel that the following answers are an unfair characterization of the situation.

What is the Problem with MODO?
            Hipstersofthecoast explains the problem with version 2 (note current client is v3 and Beta is v4) as said by Randy Buehler:

            “You might think that we could add more servers to deal with this problem, but that’s just not how Magic Online works. We can add more game servers to handle as many games as people want to play, but there is only one master server that handles everything else that goes on (chat, trading, ratings, etc.). Every time any user does anything outside of a “duel,” Magic Online has to spend some time thinking about that user. As we add more cool new features to the game, the amount of memory that needs to be allocated to each user keeps going up. At some point, when enough users are logged in doing enough things, the whole master server comes crashing down.”

I would highly suggest reading the full article here:

From what I can tell they attempted to go from one server organizing all non-game activity to multiple, but that clearly has not worked. It is unclear if we currently operate on one server still or multiple servers which do not scale well. Further discussion on Reddit suggests that MODO is built on antiquated language/framework (.NET /non-scalable etc…). I am in no way qualified to tell if this is true (EDIT 11/20: Reddit has also since commented that I am wrong).

How is Magic Doing?
Very good. Based on my readings of financial statements, there has been between 150%-300% growth in revenues during the 4 years since 2008. Additionally there is an expected growth of 35% in 2013.

Based on those numbers we would have revenues of 250-500M (Million) dollars this year.

Alternatively Hasbro reports 12M active Magic players (including digital). Assuming each only spends $30 per year on average, revenues are $360M. Hasbro’s revenues from “Games” are $1.2 Billion. It has stated that Magic is the biggest brand in the portfolio. Thus the estimates seem reasonable.

*EDIT 11/20: The Hasbro 2012 report states there are 3.3M players currently, despite there being an NBC article which quotes 12M as of 2013. The only official Hasbro source which uses the 12M number is a few years old, so I will be editing the range of players.

How is Hasbro doing? Any reason to think they are pinching pennies?
            As a company Hasbro had a rough 2012 (in terms of stock price). It has since rebounded during 2013 (though that was simply consistent with US Large Cap market in general). During 2012 things were bad enough that Hasbro was engaging in layoffs and restructuring as a cost saving measure.

            However there is little evidence that Hasbro makes many decisions re: the Magic the Gathering brand. I have read (in a statement by Aaron Forsythe I believe) that Hasbro has little input on the decisions to manage Wizards of the Coast properties. Sean McGowan (an analyst at Needham & Co.) says “The Best thing [Hasbro] did was leave [Magic] alone for several years.” when discussing the explosion in Magic’s popularity.

            In all financial statements Hasbro touts Magic as its model product. They reference digital (though in most cases Duels of the Planeswalkers) and paper growth. Many people associated with MODO since 2008 (when Buehler et al. were fired) have been promoted. This includes Worth, Aaron and Elaine Chase.
It is unclear whether MODO growth has mimicked paper.

How many people play MODO?
Note I will use multiple methodologies to arrive at different estimates and see how much they align.

According to the LinkedIn profile of the Vice President of Digital Technology (see description below), there is a “Direct Brand revenue impact of 150M+”.

I remember seeing somewhere that MODO is equivalent to North American revenues for Magic, which was also equal to about 30-50% of overall revenues. Assuming this is true (it was according to Worth in 2007), and using the base number of 360M revenues overall, we estimate MODO has a revenue of 144M (taking 40%, the midpoint of that range).

Assuming the average player spends $100 a year (which seems reasonable), then there are 1.4M people who touch MODO in a given year.

Based on personal observation there are only ~5000 people on MODO at peak times. Assuming each person plays 24 hours a year on average there would be 730,000 players.

According to Reddit other sources place MODO playerbase at 500,000.

So the True Number is likely somewhere between 500,000-1,400,000.

Note if we use the low end, then the average player is spending ~$300 a year, making MODO the highest revenue game per player that I could find.
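The three player-count methods above boil down to a few lines of arithmetic, sketched here. The concurrency formula (player-hours per year divided by hours per player) is my reading of the method, not something stated outright. Taken literally, 5000 concurrent users around the clock and 24 hours per player would give roughly 1.8M players; the 730,000 figure corresponds to 60 hours per player per year (or, equivalently, a lower average concurrency than the peak).

```python
HOURS_PER_YEAR = 24 * 365

# Revenue method: MODO as roughly 40% of $360M overall Magic revenue.
magic_revenue = 360e6
modo_share = 0.40
modo_revenue = magic_revenue * modo_share

# Spend method: divide revenue by an assumed $100 per player per year.
players_by_spend = modo_revenue / 100


def players_from_concurrency(avg_concurrent, hours_per_player_per_year):
    """Total players implied by concurrent users and yearly hours per player."""
    return avg_concurrent * HOURS_PER_YEAR / hours_per_player_per_year


players_literal = players_from_concurrency(5000, 24)   # figures as written
players_matching = players_from_concurrency(5000, 60)  # reproduces 730,000

# Spend per player if the low-end estimate of 500,000 players is right.
spend_at_low_end = modo_revenue / 500_000

print(f"MODO revenue:           ${modo_revenue / 1e6:.0f}M")
print(f"Players (spend method): {players_by_spend:,.0f}")
print(f"Players (24 h/yr):      {players_literal:,.0f}")
print(f"Players (60 h/yr):      {players_matching:,.0f}")
print(f"Spend at 500k players:  ${spend_at_low_end:.0f}/yr")
```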

To keep scale in perspective - a year old estimate puts League of Legends players at 32M active per month. Their revenues are in the $200M estimate range (Wikipedia).

How many resources are thrown at MODO?
According to the LinkedIn profile of the Vice President of Digital Technology he is “Responsible for managing the technology development and operations for the Magic Online, free-to-play digital objects business, Duels of the Planeswalkers game title (XBLA, Steam, iTunes, Android Market) and the subscription D&Di digital experience.”

He states he has a Budget of 40M+. Note the budget presumably wouldn't cover the fixed costs of developing MODO that he has no control over (office space, legal etc). I would assume MODO costs at least 150-300% of that budget: $60-120M.

He also mentions having 200 employees (150 of whom work onsite). Wizards of the Coast has 1000-5000 employees total according to glassdoor. There are 550 on LinkedIn. Given that Hasbro has 6000 employees total (based on company documents), 600-1000 overall seems about right for WotC.

Assuming MODO comprises 2/3s of the Digital Team at WotC there are 100 people working on it.

To put this in perspective, Riot Games, which makes League of Legends, has 2013 estimates of 200M in revenues and 1000 employees. MODO should require fewer employees (because actual game design is not part of the product). Blizzard (with Revenues in the area of $2B) had 7061 employees in 2012.

Are WotC software developers underpaid?
            According to Facebook, WotC software interns earn $4000 less for a summer than at other major software firms in Seattle. Getting an accurate measure of compensation for senior developers is much more complicated since few people actually report salaries.

Salary List Reports the following for Developers:
                       “Wizards of the Coast Software Developer average salary is $59,000, median salary is $59,000 with a salary range from $59,000 to $59,000. Wizards of the Coast Software Developer salaries are collected from government agencies and companies. Each salary is associated with a real job position. Wizards of the Coast Software Developer salary statistics is not exclusive and is for reference only. They are presented "as is" and updated regularly.”

The median for Software Developers overall is $75,000 (average is slightly higher).

            If you look at the last 10 employee reviews on glassdoor.com for WotC, 7 out of the 10 rated Wizards 2/5 or worse on compensation. The 3 people who rated them higher worked in Graphics, Art and Game Design. A couple of people who interviewed for software positions complained that interviews were conducted by recruiters and not people working directly with the product. Those complaints cited that recruiters had a lack of knowledge (both technical and regarding Magic). Take this with a grain of salt since I assume most complainants did not receive job offers. I am unsure how common the use of recruiters is in the software industry. Blizzard had similar complaints surrounding it.

Also note that Hasbro is routinely voted one of the best places to work in the United States (via Fortune Magazine). However most of the benefits are perks and not direct compensation.

Are 3rd Party Developers a realistic option?
            Hasbro already pays EA studios to develop games for 8 of its various brands. It has also acquired a majority stake in Backflip Studios (a mobile game developer). Duels of the Planeswalkers is developed by a third party and the original version of MODO was developed by a professional studio. Current versions are made inhouse.

Does Wizards have a track record re: inhouse development? Are they planning to move it out of house?
            I would refer you to the Hipsters’ article linked above. Wizards has repeatedly made comments similar to the 11/2013 blog post. They have also removed premiere events before. The last time they were down for about 4-6 months. The 3.0 version was delayed by about 18 months (on top of the 18 month schedule).

Wizards is currently hiring Senior Magic Developers/Testers/Technology Project managers to work on MODO (as of Nov 5/2013).

Monday, 29 July 2013

Final Thoughts on the HoF and the Skill Paradox

A final thought on the HoF.

PT Top 8s in the modern era are worth less than PT Top 8s from earlier in magic’s history. This is in spite of the average modern player being much more skilled.

**What I am writing about is an adaptation of well known theory in investing, Sabermetrics and Poker. For those interested in a more detailed analysis I suggest Mauboussin and googling the ‘Skill Paradox’.

The Paradox of Skill

We start with the fairly simple assumption that

Performance = Skill + Luck

We also assume each person’s luck is drawn each tournament from some distribution that is the same for all players. E.g. LSV might have been luckier in PT Kyoto than Nassif because he opened Nicol Bolas at that specific tournament, however they had equal chances to open it.

Because a person’s skill and luck are uncorrelated, we arrive at the Paradox of Skill:

Variance of Population Performance =
Variance of Skill in population + Variance of Luck in Population

As the Variance in skill gets smaller, the variance associated with luck starts to dominate in determining the overall outcome of tournaments.

Consider the following example:
A)    A PT in 1999, where Jon Finkel is far and away the best player. The 100th best player barely knows how to draft and the rest of population is somewhere in between.

B)     A PT in 2013 where the top 100 players are all equal in skill to Jon Finkel at his 1999 skill level.

It should be clear that someone’s final position in PT A is strongly correlated with skill. In other words we can be confident that the person in 8th was better than the person in 16th.

In PT B, the opposite would be true. The only difference between someone who gets 8th and 15th was the amount of luck they had in that specific tournament.
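The two PTs can be mimicked with a toy simulation of Performance = Skill + Luck. Everything numeric here is an illustrative assumption: skill and luck are drawn from normal distributions, luck has the same spread in both eras, and only the spread of skill changes (wide for "1999", narrow for "2013").

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def skill_performance_correlation(skill_sd, luck_sd, players=2000, seed=0):
    """Simulate one tournament and return corr(skill, performance)."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, skill_sd) for _ in range(players)]
    # Luck is independent of skill and identically distributed for everyone.
    performance = [s + rng.gauss(0, luck_sd) for s in skill]
    return pearson(skill, performance)


# PT A: wide skill gap. PT B: everyone is close in skill. Same luck in both.
corr_1999 = skill_performance_correlation(skill_sd=3.0, luck_sd=1.0)
corr_2013 = skill_performance_correlation(skill_sd=0.3, luck_sd=1.0)

print(f"corr(skill, finish), wide skill spread:   {corr_1999:.2f}")
print(f"corr(skill, finish), narrow skill spread: {corr_2013:.2f}")
```

With the wide spread, finish tracks skill closely; with the narrow spread, most of the variation in finish is luck even though nothing about luck itself changed.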

Assumptions I am using:
1)      The skill dispersion (especially at the top of the game) is much lower today than it was historically. In other words, the top 50 players in the game are much closer today (even if they are all much better) than they were historically.

And that’s it. Everything I have read from Kai, Finkel and Kibler on the topic would seem to support the view, but I haven’t bothered to try and prove the above assumption.

Just to reinforce that this situation isn’t completely impossible. In a world where the top 50 players attend 3 PTs a year and each have 10% chance to top8 a PT: we would still expect one person to top 8 two PTs a year. In other words the fact that some players do consistently well isn't enough to disprove assumption 1. If you have ever heard or read about the birthday paradox, the same principles apply.
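The "one person with two top 8s" figure is a short binomial calculation, sketched below. It assumes each player's three PTs are independent trials with the same 10% chance, which is exactly the equal-skill world being described.

```python
from math import comb


def p_at_least(k, n, p):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


players = 50
pts_per_year = 3
p_top8 = 0.10

# Chance that one given player makes two or more top 8s in a year.
p_two_plus = p_at_least(2, pts_per_year, p_top8)

# Expected number of such players among the top 50.
expected_repeaters = players * p_two_plus

print(f"P(2+ top 8s in a year): {p_two_plus:.3f}")
print(f"Expected repeaters:     {expected_repeaters:.1f}")
```

That works out to a bit over one player per year with multiple top 8s, purely from luck.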

Practical Implications
We are seriously overweighting T8s and wins in the modern era competitors. Instead we should focus on a looser metric (e.g. 32s/64s etc…). Rate metrics and consistency become much more important. For older players, top 8s are more likely to imply that they were one of the best 8 players in the tournament. And a top 32 is more likely to imply that they were NOT one of the top 8.

Recently in his SCG article Reid Duke made the point that we shouldn’t punish anyone for having a few bad initial years on the PT. And I really wanted this to be true (Because me obv). But if we now know that luck is the major determinant in people’s short term success rates, things like 3 Yr medians should mean less for modern competitors. Forgiving a few “bad years” makes it more likely you select someone whose results are variance driven (as opposed to skill).

Putting this together, I think the HoF implications should go as follows. Suppose someone has 2 PT Top 8s and 6 PT top 16s. In the Modern Age: I think “He’s unlucky”. If he is old school: I think “he probably wasn’t that good”.

Focusing on results through this lens I think we could argue:
Underrated (in no particular order):
1)      Shouta Yasooka
2)      Hoaen (do we consider him “modern”?)
3)      Osyp

1)      Edel
2)      Saito (if “Modern”)
3)      Ikeda
4)      Gary

Final Unrelated HoF Thoughts:
Stats I used in my previous formula driven HoF Ballot:
Longevity = # of PTs, # of Pts
Consistency = PT Median, 3 Yr Median, Difference in Medians, T16s, GP Top 8s
Best in World = 3 Yr Median, POYs
Place in History = These are indicator variables (e.g. are you in the top 20%). In other words having 4 PT Top 8s is the same as zero because 80% of ballotees had 4 or less.

Top 8s, Money List, GP Top 8s, Pro Points.

Skill = T16s per PT, Median Finish, POYs per years played.

My Ballot (which I don’t have):

1. LSV
2. Edel + Ikeda
These are the only two who are not top 5 stat wise. I think pioneers in a field deserve credit. I am willing to go beyond the stats if there is proof they did something truly unique. I feel the case for Ikeda is weaker than Edel (he has more similar analogues in Fujita, Oishi etc..). I could be convinced to vote for Osyp (easily the most underrated candidate on the ballot) instead.

3. Shouta Yasooka.
Stats + Skill Paradox already implied he was one of the best players skill wise on the ballot. Juza’s interview on cfb was a nice (if unnecessary) confirmation.

4. Saito.

1.      He was (or at least top 3) the best deckbuilder in the world for a long period of time. Still seems like he might be.

2.      He is one of the best players I have seen play. I can sometimes remember individual matches where I was blown away by the play I saw. Saito in TSP block is amongst those. Ditto San Juan. Most players I have talked to feel that he was easily amongst the best when he played.

3.      He was an angle shooter, but a lot of people on the PT are. Stalling in particular seems like one of the most hypocritical things for many players to call someone out on (based on my PT experience). So while he might be the scummiest of successful players (which I doubt), other players are close to that level. This might be too much apologizing for someone who is arguably a cheater (I differentiate between rules lawyers/cheaters/angle shooters), but I don’t believe (based on 1 and 2) that his results were significantly impacted by his angle shooting.

The honourable mentions: BenS, Efro, Gary.