Probably Not

John Wiley and Sons, New Jersey, 2019

From Chapter 1, An Introduction to Probability

Predicting the Future

The term “Predicting the Future” conjures up images of veiled women staring into hazy crystal balls, or bearded men with darting eyes passing their hands over cups of tea leaves, or something else equally humorously mysterious. We call these people fortune tellers and relegate their professions to the realm of carnival side-show entertainment, along with snake charmers and the like. For party entertainment, we bring out a Ouija board; everyone sits around the board in a circle and watches the board extract its mysterious energy from our hands while it answers questions about things to come.

On the one hand, we all seem to have firm ideas about the future based on consistent patterns of events that we have observed. We are pretty sure that there will be a tomorrow, and that our clocks will all run at the same rate tomorrow as they did today. If we look in the newspaper (or these days, on the Internet), we can find out what time the sun will rise and set tomorrow – and it would be difficult to find someone willing to place a bet that this information is not accurate. On the other hand, whether or not you will meet the love of your life tomorrow is not something you expect to see accurately predicted in the newspaper.

We seem willing to classify predictions of future events into categories of the knowable and the unknowable. The latter category is left to carnival fortune tellers to illuminate. The former category includes predictions of when you’ll next need a haircut, how much weight you’ll gain if you keep eating so much pizza, etc.

But there does seem to be an intermediate area of knowledge of the future. Nobody knows for certain when you’re going to die. An insurance company, however, seems able to consult its mystical Actuarial Tables and decide how much to charge you for a life insurance policy. How can an insurance company do this if nobody knows when you’re going to die? The answer seems to lie in the fact that if you study thousands of people similar in age, health, lifestyle, etc., you can calculate an average life span – and that if the insurance company sells enough insurance policies with rates based upon this average, in a financial sense this is as good as if the insurance company knew exactly when you are going to die. There is, therefore, a way to describe life expectancies in terms of the expected behavior of large groups of people of similar circumstances.
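The averaging idea can be tried out with a small Python simulation. This is only a sketch under loud assumptions: lifespans are drawn from a bell-shaped curve with a made-up mean of 78 years and a made-up spread of 12 years, which is not how real actuarial tables work, and the function name spread_of_group_average is invented for this illustration. The point is only to watch how much the average lifespan of a group wanders as the group gets bigger.

```python
import random
import statistics

def spread_of_group_average(group_size, n_groups=200,
                            mean_age=78.0, sd_age=12.0, seed=0):
    """Simulate many groups of policyholders and report how much the
    group-average lifespan varies from one group to the next.

    mean_age and sd_age are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = []
    for _ in range(n_groups):
        lifespans = [rng.gauss(mean_age, sd_age) for _ in range(group_size)]
        averages.append(statistics.mean(lifespans))
    # Standard deviation of the group averages: smaller means more predictable
    return statistics.stdev(averages)

# Compare a single person, a group of 100, and a group of 2,500
spreads = {size: spread_of_group_average(size) for size in (1, 100, 2_500)}
for size, spread in spreads.items():
    print(f"group of {size:>5}: average lifespan varies by about {spread:.2f} years")
```

Under these assumptions the spread for one person is on the order of the full 12 years, while the spread for the largest group is a small fraction of a year. The individual remains unpredictable; the group average does not.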

When predicting future events, you often find yourself in these situations. You know something about future trends but you do not know exactly what is going to happen. If you flip a coin, you know you’ll get either heads or tails, but you don’t know which. If you flip 100 coins, or equivalently flip one coin 100 times, however, you’d expect to get approximately 50 heads and 50 tails.
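To put a number on “approximately,” here is a minimal Python sketch that counts outcomes exactly, assuming a fair coin and independent flips. The function name prob_heads_between is made up for this illustration.

```python
from math import comb

def prob_heads_between(n, low, high):
    """Exact probability of getting between low and high heads (inclusive)
    in n flips of a fair coin, found by counting equally likely sequences."""
    total_sequences = 2 ** n
    favorable = sum(comb(n, k) for k in range(low, high + 1))
    return favorable / total_sequences

p_exactly_50 = prob_heads_between(100, 50, 50)
p_40_to_60 = prob_heads_between(100, 40, 60)

print(f"exactly 50 heads:  {p_exactly_50:.4f}")
print(f"40 to 60 heads:    {p_40_to_60:.4f}")
```

Under these assumptions, exactly 50 heads turns up only about 8% of the time, but some count between 40 and 60 turns up about 96% of the time. You cannot call the exact result, yet you can be quite confident about the neighborhood it will land in.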

If you roll a pair of dice you know that you’ll get some number between two and twelve, but you don’t know which number you’ll get. You do know that it’s more likely that you’ll get six than that you’ll get two.
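The claim about six versus two can be checked by listing every way a pair of dice can land. The short Python sketch below assumes two fair six-sided dice; the function name two_dice_distribution is invented for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution():
    """Return {total: probability} for the sum of two fair six-sided dice."""
    # All 36 equally likely (first die, second die) combinations
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(ways, 36) for total, ways in sorted(counts.items())}

dist = two_dice_distribution()
for total, prob in dist.items():
    print(f"{total:>2}: {prob}")
```

There is only one way to roll a two (1 and 1) but five ways to roll a six (1 and 5, 2 and 4, 3 and 3, 4 and 2, 5 and 1), so a six is five times as likely: 5 chances in 36 against 1 chance in 36.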

When you buy a new light bulb, you may see written on the package “estimated lifetime 10,000 hours.” You know that this light bulb might last 10,346 hours, 11,211 hours, 9,587 hours, 12,094 hours, or any other number of hours. If the bulb turns out to last 11,434 hours you won’t be surprised, but if it only lasts 1,000 hours you’d probably switch to a different brand of light bulbs.

There is a hint in each of these examples that even though you can’t accurately predict the future, you can find some kind of pattern that teaches you something about the nature of the future. Finding these patterns, working with them, and learning what knowledge can and cannot be inferred from them is the subject matter of the study of probability and statistics.

We can separate our study into two classes of problems. The first of these classes is understanding the likelihood that something might occur. We’ll need a rigorous definition of likelihood so that we can be consistent in our evaluations. With this definition in hand, we can look at problems such as “How likely is it that you can make money in a simple coin flipping game?” or “How likely is it that a certain medicine will do you more good than harm in alleviating some specific ailment?” We’ll have to define and discuss random events and the patterns that these events fall into, called Probability Distribution Functions (PDFs). This study is the study of Probability.

The second class of problems involves understanding how well you really know something. We will only discuss quantifiable issues, not “does she really love me?” or “is this sculpture a fine work of art?”

The uncertainties in how well we know something can come from various sources. Let’s return to the example of light bulbs. Suppose you’re the manufacturer of these light bulbs. Due to variations in materials and manufacturing processes, no two light bulbs are identical. There are variations in the lifetime of your product that you need to understand. The easiest way to study the variations in lifetime would be to run all your light bulbs until they burn out and then look at the data, but for obvious reasons this is not a good idea. If you could find the pattern by just burning out some (hopefully a small percentage) of the light bulbs, then you have the information you need both to truthfully advertise your product and to work on improving your manufacturing process.
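A minimal sketch of this sampling idea, in Python, might look like the following. Everything numerical here is an assumption made for illustration: the factory output is simulated with lifetimes drawn from a bell-shaped curve centered on 10,000 hours with a spread of 1,000 hours, and the names simulate_factory and estimate_from_sample are invented. A real manufacturer would use measured burn-out times and might find a differently shaped pattern.

```python
import random
import statistics

def simulate_factory(n_bulbs, mean_hours=10_000.0, sd_hours=1_000.0, seed=1):
    """Stand-in for a production run: one simulated lifetime per bulb.
    mean_hours and sd_hours are hypothetical."""
    rng = random.Random(seed)
    return [rng.gauss(mean_hours, sd_hours) for _ in range(n_bulbs)]

def estimate_from_sample(lifetimes, sample_size, seed=2):
    """Burn out only sample_size bulbs and summarize what they tell us."""
    rng = random.Random(seed)
    sample = rng.sample(lifetimes, sample_size)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # bulb-to-bulb variation
    std_error = sd / sample_size ** 0.5  # uncertainty in the estimated mean
    return mean, sd, std_error

production = simulate_factory(50_000)
mean, sd, std_error = estimate_from_sample(production, 200)

print(f"estimated mean lifetime: {mean:.0f} hours (give or take about {std_error:.0f})")
print(f"estimated bulb-to-bulb spread: {sd:.0f} hours")
```

Here only 200 of 50,000 bulbs, less than half of one percent, are sacrificed. The sketch reports two different kinds of uncertainty: how much individual bulbs differ from one another, and how well the average is pinned down by a limited sample.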

Learning how to do this is the study of Statistics. We will assume that we are dealing with a stationary random process. In a stationary random process, if nothing causal changes, we can expect that the nature of the pattern of the data already in hand will be the same as the nature of the pattern of future events of this same situation, and we use statistical inference to predict the future. In the practical terms of the example of our light bulb manufacturer, we are saying that as long as we don’t change anything, the factory will turn out bulbs with the same distribution of lifetimes next week as it did last week. Making this kind of assumption reflects one of the most important characteristics of animal intelligence, namely the ability to discern patterns and to predict based upon them.

The light bulb problem also exemplifies another issue that we will want to examine. We want to know how long the light bulb we’re about to buy will last. We know that no two light bulbs are identical. We also realize that our knowledge is limited by the fact that we haven’t measured every light bulb made. We must learn to quantify how much of our ignorance comes from each of these factors and develop ways to express both our knowledge and our lack of knowledge.

Rule Making

As the human species evolved, we took command of our environment because of our ability to learn. We learn from experience. Learning from experience is the art/science of recognizing patterns and then generalizing these patterns into a rule. In other words, the pattern is the relevant raw data that we’ve collected. A rule is what we create from our analysis of the pattern that we then use to predict the future. Part of the rule is one (or more) preferred extrapolations and responses. Successful pattern recognition is, for example, seeing that seeds from certain plants, when planted at the right time of the year and given the right amount of water, will yield food; and that the seed from a given plant will always yield that same food. Dark, ominous-looking clouds usually precede a fierce storm, and it’s prudent to take cover when such clouds are seen. Also, leaves turning color and falling off the trees mean that winter is coming and preparations must be made so as to survive until the following spring.

If we noticed that every time it doesn’t rain for more than a week our vegetable plants die, we would generate a rule that if there is no rain for a week, we need to irrigate or otherwise somehow water the vegetable garden. Implicit in this is that somewhere a hypothesis or model is created. In this case our model is that plants need regular watering. When the data is fit to this model, we quantify the case that vegetable plants need water at least once a week, and then the appropriate watering rule may be created.

An interesting conjecture is that much, if not all, of what we call the arts came about because our brains are so interested in seeing patterns that we take delight and often find beauty in well-designed original patterns. Our eyes look at paintings and sculptures, our ears listen to music, our brains process the language constructs of poetry and prose, etc. In every case we are finding pleasure in studying patterns. Sometimes the patterns are clear, as in a Bach fugue. Sometimes the patterns are harder to recognize, as in a surrealistic Picasso painting. Sometimes we are playing a game looking for patterns that may or may not be there – as in a Pollock painting. Perhaps this way of looking at things is sheer nonsense, but then how can you explain why a good book or a good symphony (or rap song if that’s your style) or a good painting can grab your attention, and in some sense, please you? The arts don’t seem to be necessary for our basic survival; why do we have them at all?

A subtle rustling in the brush near the water hole at dusk sometimes – but not always – means that a man-eating tiger is stalking you. It would be to your advantage to make a decision and take action. Even if you’re not certain that there’s really a tiger present, you should err on the cautious side and beat a hasty retreat; you won’t get a second chance. This survival skill is a good example of our evolutionary tendency to look for patterns and to react as if these patterns are there, even when we are not really sure that they indeed are there. In formal terms, you don’t have a lot of data, but you do have anecdotal information.

Our prehistoric ancestors lived a very provincial existence. Life spans were short; most people did not live more than about 30 years. They didn’t get to see more than about 10,000 sunrises. People outside their own tribe (and possibly some nearby tribes) were hardly ever encountered, so that the average person never saw more than a few hundred people over the course of a lifetime. Also, very few people (other than members of nomadic tribes) ever traveled more than about 50 miles from where they were born. There are clearly many more items that could be added to this list, but the point has probably been adequately made: People’s brains never needed to cope with situations where there were hundreds of thousands or millions of data points to reconcile.

However, in today’s world things are very different: A state lottery could sell a hundred million tickets every few months. There are about 7 billion (that’s seven thousand million) people on the earth. Many of us (at least in North America and Western Europe) have traveled thousands of miles from the place of our birth many times; even more of us have seen movies and TV shows that depict places and people all over the world. Due to the ease with which people move around, a disease epidemic is no longer a local issue. Also, because we are aware of the lives of so many people in so many places, we know about diseases that attack only one person in a hundred thousand and tragedies that occur just about anywhere. If there’s a vicious murderer killing teenage girls in Boston, then parents in California, Saskatoon, and London hear about it on the evening news and worry about the safety of their daughters.
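The change of scale can be made concrete with a little arithmetic, sketched here in Python. The one-in-a-hundred-thousand rate and the 7 billion population come from the text above; the tribe size of 300 is an assumed stand-in for “a few hundred people,” the calculation assumes each person is affected independently of the others, and the function names are invented for this illustration.

```python
import math

def expected_count(p, n):
    """Average number of cases when each of n people has chance p."""
    return p * n

def prob_at_least_one(p, n):
    """Chance that at least one of n independent people is affected.

    Equivalent to 1 - (1 - p)**n, written with log1p/expm1 so that it
    stays accurate when p is tiny."""
    return -math.expm1(n * math.log1p(-p))

rare = 1 / 100_000  # one person in a hundred thousand

cases_worldwide = expected_count(rare, 7_000_000_000)
chance_in_small_tribe = prob_at_least_one(rare, 300)  # 300 is an assumed tribe size

print(f"expected cases among 7 billion people: {cases_worldwide:,.0f}")
print(f"chance of even one case among 300 people: {chance_in_small_tribe:.4f}")
```

Under these assumptions, a disease this rare would almost never appear within the few hundred people a prehistoric person ever met (about a 0.3% chance), yet it would be expected to strike roughly 70,000 people worldwide today, more than enough to fill the evening news.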

When dealing with unlikely events spread over large numbers of opportunities, your intuition can and often does lead you astray. Since you cannot easily comprehend millions of occurrences, or lack of occurrences, of some event, you tend to see patterns from a small number of examples; again the anecdotal approach. Even when patterns don’t exist, you tend to invent them; you are using your “better safe than sorry” prehistoric evolved response. This could lead to the inability to correctly make many important decisions in your life: What medicines or treatments stand the best chance of curing your ailments? Which proffered medicines have been correctly shown to be useful, and which are simply quackery? Which environmental concerns are potentially real and which are simple coincidence? Which environmental concerns are no doubt real but probably so insignificant that we can reasonably ignore them? Are sure bets on investments or gambling choices really worth anything? We need an organized methodology for examining a situation and coping with information, correctly extracting the pattern and the likelihood of an event happening or not happening to us, and also correctly processing a large set of data and concluding, when appropriate, whether or not a pattern is really present.

We want to understand how to cope with a barrage of information. We need a way of measuring how sure we are of what we know, and when or if what we know is adequate to make some predictions about what’s to come.