Monday, March 9, 2015

Why is my stats class so focused on bell curves?

I would wager that many a student has taken a statistics class and been introduced to bell curves without having the slightest idea why they're used. That was me, to some extent, when I took my first statistics class. I was told bell curves were useful, and was willing at the time to take that on trust. But it was not until a second statistics class, many years later, that I learned why they are used.

Suppose for a minute that you roll a dice - or a "die," if you want to get technical. What are your chances of rolling a one? If the die is fair, it's 1 in 6, because the six sides of the die are equally likely. Your chances of rolling a two are the same, as are your chances of rolling a three, a four, a five, or a six.


But suppose for a minute that you roll two dice. If you do this (as when playing Monopoly), things change. There's only one way to roll snake-eyes (a.k.a. the number two), which is a one and a one. And there's likewise only one way to roll a twelve, which is a six and a six. But the number seven is quite different. You could have a 1 and a 6, a 2 and a 5, or a 3 and a 4 - not to mention having the dice in these combinations switched around, as with a 4 and a 3, a 5 and a 2, and a 6 and a 1.

[Figure: color-coded chart of the results of rolling two dice]

If you count these up, there are six ways to roll a seven, out of 36 possible rolls in all (6 × 6). But there's only one way to roll snake-eyes, and only one way to roll a twelve. If you were to tally up the number of ways you could roll each number in between, you would find that the outcomes in the middle are more likely than those at either extreme. A graph of these counts rises in a straight line from 2 up to a peak at 7, then falls in a straight line back down to 12.
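If you'd rather not count by hand, here is a minimal Python sketch of the same tally. The names `FACES` and `ways` are just ones I picked for illustration; the script lists all 36 ordered rolls of two fair six-sided dice and prints a rough text graph of how many of them give each total.

```python
from collections import Counter
from itertools import product

# The six faces of one fair die.
FACES = range(1, 7)

# Enumerate all 36 ordered rolls (first die, second die) and tally the totals.
ways = Counter(a + b for a, b in product(FACES, repeat=2))

for total in sorted(ways):
    # One '#' per way of rolling this total.
    print(f"{total:2d} {'#' * ways[total]} ({ways[total]}/36)")
```

The rows for 2 and 12 get a single mark each, and the row for 7 is the longest.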

Now let's ratchet up the complexity a notch. Suppose for a minute that you roll three dice. The possible outcomes range from a 3 (three ones) to an 18 (three sixes). Tallying up the number of ways you could roll each one is very time-consuming, because there are so many different combinations for rolling three dice with six possible outcomes each. (6 × 6 × 6 = 216 combinations.) But if you do it (and I did it once), the graph is no longer a triangle: it rises to a rounded peak at 10 and 11 (27 ways each) and tapers off toward 3 and 18 (one way each).
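The tally for three dice is tedious by hand but quick for a computer. The sketch below is one way to do it, not necessarily the way anyone would do it on paper; `tally_sums` is a hypothetical helper name, and it simply walks through every ordered roll.

```python
from collections import Counter
from itertools import product

def tally_sums(num_dice, sides=6):
    """Count how many ordered rolls of `num_dice` fair dice give each total."""
    faces = range(1, sides + 1)
    return Counter(sum(roll) for roll in product(faces, repeat=num_dice))

three = tally_sums(3)

for total in sorted(three):
    # One '#' per way of rolling this total with three dice.
    print(f"{total:2d} {'#' * three[total]}")
```

Calling `tally_sums(2)` reproduces the two-dice counts, so the same function covers both cases.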

This is already close to a bell curve. From rolling a mere 3 dice with a mere 6 outcomes each, you get the bell curve shape. And the more dice you roll, the more closely the graph matches a true bell curve. (Giving each die more sides makes the graph smoother, but it's the number of dice that produces the bell shape.)
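One rough way to check the "more dice, more bell-shaped" claim is to compare the exact dice probabilities with a bell curve that has the same mean and spread. The sketch below does that with Python's standard `statistics.NormalDist`. The function names are my own, and the "largest gap" it prints is a crude measure I chose for illustration, not a standard statistical test.

```python
from statistics import NormalDist

def dice_sum_probs(num_dice, sides=6):
    """Exact probability of each total, built up one die at a time."""
    probs = {0: 1.0}
    for _ in range(num_dice):
        nxt = {}
        for total, p in probs.items():
            for face in range(1, sides + 1):
                nxt[total + face] = nxt.get(total + face, 0.0) + p / sides
        probs = nxt
    return probs

def max_gap_from_bell(num_dice, sides=6):
    """Largest gap between the dice probabilities and a matching bell curve."""
    # Mean and variance of one fair die, scaled up by the number of dice.
    mean = num_dice * (sides + 1) / 2
    variance = num_dice * (sides ** 2 - 1) / 12
    bell = NormalDist(mean, variance ** 0.5)
    probs = dice_sum_probs(num_dice, sides)
    return max(abs(p - bell.pdf(total)) for total, p in probs.items())

for n in (1, 2, 3, 5, 10):
    print(n, "dice: largest gap", round(max_gap_from_bell(n), 4))
```

The gap shrinks as dice are added, though part of that is simply because each individual probability gets smaller when there are more possible totals.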

The principle here is that anytime something is decided by the sum of many independent factors (in this case, dice rolls), you get something close to a bell curve when the number of factors is high enough. Statisticians call this the central limit theorem. A lot of things work roughly this way - human height, for example; or IQ scores. Both height and intelligence are affected by many genes in our DNA, and so you're rolling a lot of dice (figuratively speaking) with multiple outcomes each. Thus, height and intelligence can be modeled by bell curves. This means, in practical terms, that you're much more likely to find someone of average height than someone very tall or very short; and you're more likely to find someone of average intelligence than someone very smart or very stupid.

And height and intelligence aren't the only things that work this way - anything decided by the sum of multiple factors can be modeled by a bell curve. You find these in biology, psychology, economics, and business. You find these in nature and in humanity. And you find them in the most unexpected places.

Of course, not everything can be modeled that way - a single die roll couldn't be, nor could the product of several factors (the result of multiplying numbers rather than adding them). But a lot of things can be; so statistics classes spend a lot of time teaching you how to use bell curves. Bell curves can teach you that 60% of humanity is above a certain height, or that only 10% of humanity has higher than a certain IQ.
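To make that last point concrete, here is a small sketch using Python's standard `statistics.NormalDist`. The height figures (average 170 cm, standard deviation 10 cm) are made up purely for illustration and are not real population data; the IQ figures use the conventional scaling of mean 100 and standard deviation 15.

```python
from statistics import NormalDist

# ASSUMPTION: hypothetical height figures, for illustration only.
height = NormalDist(mu=170, sigma=10)  # centimetres

# IQ tests are conventionally scaled to mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# The height that 60% of people exceed is the 40th percentile.
height_60_above = height.inv_cdf(0.40)

# The IQ that only 10% of people exceed is the 90th percentile.
iq_top_10 = iq.inv_cdf(0.90)

print(f"60% are taller than about {height_60_above:.1f} cm")
print(f"Only 10% score above about {iq_top_10:.1f}")
```

Swap in a real average and standard deviation and the same two calls answer the question for any quantity that follows a bell curve.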

In short, bell curves can teach you a lot about the frequency of various things; and thus, the probabilities of encountering them in everyday life.

So if they seem difficult and arcane, just remind yourself of what they can be used for; and hopefully it'll be easier to persevere through them.
