Probably Not

John Wiley and Sons, New Jersey, 2008


From The Preface

For as long as I can remember, I have been interested in how well we know what we say we know, how we acquire data about things, and how we react to and use this data. I was surprised when I first realized that many people are not only not interested in these things, but are actually averse to learning about them. It wasn’t until fairly recently that I concluded that we seem to be genetically programmed, on one hand, to intelligently learn how to acquire data and use it to our advantage but also, on the other hand, to stubbornly refuse to believe what some simple calculations and/or observations tell us.

My first conclusion is supported by our march through history, learning about agriculture and all the various forms of engineering and using this knowledge to make life better and easier. My latter conclusion comes from seeing all the people sitting on stools in front of slot machines at gambling casinos, many of whom are there that day because the astrology page in the newspaper told them that this was “their day.”

This is a book about probability and statistics. It's mostly about probability, with just one chapter dedicated to an introduction to the vast field of statistical inference.

There are many excellent books on this topic available today. I find that these books fall into two general categories. One category is textbooks. Textbooks are heavily mathematical with derivations, proofs and problem sets, and an agenda to get you through a term's course work. This is just what you need if you are taking a course.

The other category is books that are meant for a more casual audience — an audience that's interested in the topic but isn't interested enough to take a course. We're told today that people have “mathephobia,” and the books that appeal to these people try very hard to talk around the mathematics without actually presenting any of it. Probability and statistics are mathematical topics. A book on these subjects without math is sort of like a book on French grammar without any French words in it. It's not impossible, but it sure is doing things the hard way.

If you thumb through the book, you'll see a few “fancy” formulas. These are either simply shorthand notations for things like repeated additions, which I discuss in great detail to get you comfortable with them, or in a few cases some formulas that I'm quoting just for completeness but that you don't need to understand if you don't want to.

As I discuss in the first chapter, probability is all about patterns of things such as what happens when I roll a pair of dice a thousand times, or what the life expectancies of the population of the United States look like, or how a string of traffic lights slows you down in traffic. Just as a course in music with some discussions of rhythm and harmony helps you to “feel” the beauty of the music, a little insight into the mathematics of the patterns of things in our life can help you to feel the beauty of these patterns as well as to plan, as well as possible, for things that are specifically unpredictable (when will the next bus come along and how long will I have to stand in the rain to meet it?).

Most popular science and math books include a lot of biographical information about the people who developed these particular fields. This can often be interesting reading, though quite honestly I'm not sure that knowing how Einstein treated his first wife helps me to understand special relativity.

I have, therefore, decided not to include biographical information. I often quote a name associated with a particular topic (Gaussian curves, Simpson's Paradox, Poisson distribution) because that's how it's known.

Probabilistic considerations show up in several areas of our lives. Some we get explicitly from nature, such as daily rainfall or distances to the stars. Some we get from human activities, including everything from gambling games to manufacturing tolerances. Some come from nature, but we don't see them until we “look behind the green curtain.” This includes properties of gases (e.g., the air around us) and the basic atomic and subatomic nature of matter.

Probability and statistics deals a lot with examining sets of data and drawing a conclusion — for example, “the average daily temperature in Coyoteville is 75 degrees Fahrenheit.” This sounds like a great place to live until you learn that the temperature during the day peaks at 115 degrees while at night it drops to 35 degrees. In some cases we will be adding insight by summarizing a data set, but in some cases we will be losing insight.
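The Coyoteville arithmetic can be checked with a minimal sketch. Only the 115-degree peak, the 35-degree low, and the 75-degree average come from the text; the intermediate readings are hypothetical values chosen for illustration.

```python
from statistics import mean

# Hypothetical temperature readings in degrees Fahrenheit.
# Only the extremes (35 and 115) come from the text; the rest are assumed.
readings = [35, 55, 75, 95, 115]

average = mean(readings)                  # the one-number summary
spread = max(readings) - min(readings)    # what that summary hides

print(f"average: {average} F")
print(f"range: {min(readings)} F to {max(readings)} F, a spread of {spread} F")
```

The average alone matches the pleasant-sounding summary, while the spread shows the insight that the summary loses.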

My brother-in-law Jonathan sent me the following quote, attributing it to his father. He said that I could use it if I acknowledge my source: Thanks, Jonathan.

“The average of an elephant and a mouse is a cow, but you won't learn much about either elephants or mice by studying cows.” I'm not sure exactly what the arithmetic in this calculation would look like, but I think it's a memorable way of making a very good point.

I could write a long treatise on how bad conclusions have been reached because the people who had to draw the conclusions just weren't looking at all the data. Two examples that come to mind are (1) the Dow silicone breast implant lawsuit where a company was put out of business because the plaintiffs “demonstrated” that the data showed a link between the implants and certain serious diseases and (2) the crash of the space shuttle Challenger where existing data that the rubber O-rings sealing the solid rocket booster joints get brittle below a certain temperature somehow never made it to the table.

The field of probability and statistics has a very bad reputation (“Lies, Damned Lies, and Statistics*”). It is so easy to manipulate conclusions by simply omitting some of the data, or to perform the wrong calculations correctly, or to misstate the results — any and all of these possibly innocently — because some problems are very complicated and subtle. I hope the materials to follow show what information is needed to draw a conclusion and what conclusion(s) can and can't be drawn from certain information. Also I'll show how to reasonably expect that sometimes, sometimes even inevitably, as the bumper stickers say, stuff happens.

I spend a lot of time on simple gambling games because, even if you're not a gambler, there's a lot to be learned from the simplest of random events — for example, the result of coin flips. I've also tried to choose many examples that you don't usually see in probability books. I look at traffic lights, waiting for a bus, life insurance, scheduling appointments, and so on. What I hope to convey is that we live in a world where so many of our daily activities involve random processes and the statistics involved with them.


*This quote is usually attributed to Benjamin Disraeli, but there seems to be some uncertainty here. I guess that, considering the book you’re now holding, I should say that “There is a high probability that this quote should be attributed to Benjamin Disraeli.”