17 March 2014

When I was playing Dungeons and Dragons, I noticed something: I roll badly. A lot. How badly? I wanted to find out, so I started taking notes on my die results and entered them into a spreadsheet. I fully expected it to show me that I was just imagining things: you obviously tend to remember extremely bad results more than boring ones, so it would make sense that I felt like I was rolling badly even if I was in fact rolling perfectly normally.

That suspicion, as it turned out, was totally wrong.

The first time I reviewed the stats, one third, \(\frac{1}{3}\), of the rolls I made were a \(1\), the worst result! If we did something simple, like taking a p-value, my results would be publishable in social science research: The probability of a fair die rolling that low was smaller than \(5\%\).

I'm not going to tell you how to calculate p-values, however, because they are the most misused concept in statistics. A p-value only tells you how unlikely your data would be under the null hypothesis (a fair die); it gives you no hint of the strength of the effect, its origin, or how well a slightly different null hypothesis would explain your data. Instead I will show you how to build a Bayesian model of my die rolls, which arguably gives you all of the above.

For this, I will be using SciPy and the modelling library PyMC.

  %pylab inline
  %config InlineBackend.figure_format = 'svg'
  import pymc                                        # Bayesian modelling and MCMC sampling
  from collections import Counter
  from dendropy.mathlib import statistics as stats   # mean and sample variance helpers
  import gdata.spreadsheets.client as s              # Google Spreadsheets API
  import gdata.client as sc
  import gdata

Interface with Google Docs

In order to interface with Google Docs, we need to generate an application-specific password at accounts.google.com and copy the spreadsheet ID (the long string in the URL).

  client = s.SpreadsheetsClient()
  # log in with the application-specific password generated at accounts.google.com
  token = client.ClientLogin('<email>', '<password>', 'spreadsheet-cruncher')

  entry = client.GetCells(u'<spreadsheet id>', '<sheet name>', auth_token=token)

We can then simply count the occurrences of each value:

  throws_c = Counter()
  # keep only actual cell entries and count how often each value occurs
  for cell in entry.GetElements():
      if isinstance(cell, gdata.spreadsheets.data.CellEntry):
          throws_c[int(float(cell.cell.numeric_value))] += 1

Here are the results:

  throws = [throws_c[i] for i in range(1,21)]
  nthrows = sum(throws)
  print throws
  print len(throws)
  print nthrows
[9, 4, 3, 2, 3, 0, 1, 3, 4, 3, 2, 3, 2, 1, 2, 3, 1, 3, 4, 3]
20
56

But this could still come from a fair die, right? That would actually be our default expectation - after all, what are the odds of being genuinely unlucky?

However, this post would be boring if we just kept that assumption (and we would be quite unscientific if we didn't check our theory), so instead we start out assuming a fair die and update our belief according to the data we see.

So we instead model our die as a multinomial distribution: every result \(i\) has its own probability \(p_i\) of coming up, so the probability of observing each result exactly \(n_i\) times over all \(\mathtt{nthrows}\) rolls is

\[ P(X) = \frac{\mathtt{nthrows}!}{\prod_i n_i!} \prod_i p_i^{n_i} \]
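
Just to make the formula concrete (this check is not part of the original analysis, it merely evaluates the expression above), we can compute the probability of exactly the counts we observed under the fair-die assumption \(p_i = \frac{1}{20}\):

  # Illustrative only: evaluate the multinomial formula for the observed
  # counts, assuming every p_i is the fair-die value of 1/20.
  from math import factorial

  def multinomial_probability(counts, probs):
      coefficient = factorial(sum(counts))
      for c in counts:
          coefficient //= factorial(c)
      probability = 1.0
      for c, p in zip(counts, probs):
          probability *= p ** c
      return coefficient * probability

  print multinomial_probability(throws, [1.0 / 20] * 20)

The number itself is vanishingly small, as the probability of any one specific combination of 56 rolls is, which is exactly why we will compare parameter values against each other instead of staring at a single probability.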

But we also need to make some assumption about how likely we consider each possible value of \(p_i\) before seeing any data. A good choice is a Dirichlet distribution: the probabilities sum up to 1, and we can bias it towards all of them being equal:

  strength_of_prior = 10
  # a symmetric prior: roughly like having already seen 10 rolls of each result
  dp = pymc.Dirichlet(name="d", theta=[strength_of_prior] * 20)

Now we can set up the multinomial, with the Dirichlet prior as its probability vector and our observed counts as its value:

  draws = pymc.Multinomial(
                      name="dice",
                      p=dp, n=nthrows,
                      observed=True,
                      value=throws
  )

There is a closed-form solution to a simple model like this: you could just add your real data to the pseudo-data from the prior to update your belief (we will sanity-check the sampler against exactly that result further down). The pymc library, however, lets you define arbitrarily complex models, which can only be evaluated numerically. To do that, we set up a Markov Chain Monte Carlo simulation of our model, which is basically just an elaborate and efficient way to draw random samples of the parameters in proportion to how well they explain the observed data.

  mcmc = pymc.MCMC([dp, draws])

  # draw 100000 samples, discard the first 10000 as burn-in, keep every 10th
  mcmc.sample(iter=100000, burn=10000, thin=10)
[-----------------100%-----------------] 100000 of 100000 complete in 15.7 sec

We can now estimate the parameters (the probabilities \(p_i\)) by averaging the sampled values over the run of the algorithm.

  results = mcmc.trace('d')[:]
  # PyMC's Dirichlet only stores the first 19 probabilities (the last one is
  # implied, since they sum to 1), so we reconstruct it for every sample.
  results = [list(r) + [1.0 - sum(r)] for r in results]
  means = []
  stds = []
  for r in zip(*results):
      m, v = stats.mean_and_sample_variance(r)
      means.append(m)
      stds.append(v ** 0.5)  # posterior standard deviation, used as error bars below
  print
  print "result\tprobability"
  for i, m in enumerate(means, 1):
      print "%-5s\t%05f" % (i, m)
result  probability
1       0.074841
2       0.054479
3       0.051117
4       0.046901
5       0.051254
6       0.039374
7       0.042444
8       0.050789
9       0.054757
10      0.050486
11      0.047042
12      0.051116
13      0.047223
14      0.042702
15      0.046054
16      0.050521
17      0.042745
18      0.050555
19      0.055060
20      0.050540
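
As mentioned earlier, a model this simple also has a closed-form answer, so we can sanity-check the sampler against it. The following snippet is not part of the original analysis; it is a minimal sketch of the conjugate Dirichlet-multinomial update, which just adds the observed counts to the prior pseudo-counts and normalises:

  # Illustrative only: the closed-form (conjugate) posterior.
  # A Dirichlet(theta) prior combined with multinomial counts n gives a
  # Dirichlet(theta + n) posterior with mean (theta_i + n_i) / (sum(theta) + nthrows).
  pseudo_counts = [strength_of_prior] * 20
  posterior_alpha = [a + n for a, n in zip(pseudo_counts, throws)]
  total = float(sum(posterior_alpha))
  analytic_means = [a / total for a in posterior_alpha]
  print ["%.4f" % m for m in analytic_means]

For the ones this gives \((10 + 9) / (200 + 56) \approx 0.074\), agreeing nicely with the sampled estimate above.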

We can also produce a nice graph that shows our initial assumption (a fair die) alongside the actual results of our simulation:

  f = figure()
  xlim((1, 20))
  ylim((0, 0.2))
  # error bars show the posterior standard deviation of each p_i
  real = bar(arange(1, 21), means, .7, yerr=stds, color=(1, 0, 0, .5))
  theory = plot(arange(1, 21), [0.05] * 20, color=(0, 0, .75, 1))

  legend((real[0], theory[0]), ('Real', 'Fair die'))

(Figure: bar chart of the estimated probability for each result, with error bars, against the flat 0.05 line of a fair die.)

What does that mean? Well, even if we assume fairly strongly (as determined by strength_of_prior) that my rolls are evenly distributed, the data still tells us that I'm far more likely to roll ones than any other result. The most likely explanation is plotted right there, so basically even statistics says I'm really unlucky.
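
If you prefer a single number over a plot, here is one more check you could run on the same trace (again, a sketch that is not part of the original post): the share of posterior samples in which rolling a \(1\) is more likely than the fair-die value of \(0.05\).

  # Illustrative only: posterior probability that p_1 exceeds the fair value of 1/20.
  ones_probability = [sample[0] for sample in mcmc.trace('d')[:]]
  print mean([p > 0.05 for p in ones_probability])

The closer that fraction is to 1, the more certain the posterior is that ones really do come up more often than they should.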


