Probability and Research Design

What Is Probability? Definition

Definition(s) of probability

We could choose one of several technical definitions for probability, but for our purposes it refers to an assessment of the likelihood of the various possible outcomes in an experiment or some other situation with a “random” outcome. Note that in probability theory the term “outcome” is used in a more general sense than the outcome vs. explanatory variable terminology that is used in the rest of this book.

In probability theory the term “outcome” applies not only to the “outcome variables” of experiments but also to “explanatory variables” if their values are not fixed. For example, the dose of a drug is normally fixed by the experimenter, so it is not an outcome in probability theory, but the age of a randomly chosen subject, even if it serves as an explanatory variable in an experiment, is not “fixed” by the experimenter, and thus can be an “outcome” under probability theory.

The collection of all possible outcomes of a particular random experiment (or other well-defined random situation) is called the sample space, usually abbreviated as S or Ω (omega). The outcomes in this set (list) must be exhaustive (cover all possible outcomes) and mutually exclusive (non-overlapping), and should be as simple as possible.

We use the term event to represent any subset of the sample space. One way to think about events is that they can be defined before the experiment is carried out, and they either occur or do not occur when the experiment is carried out. In probability theory we learn to compute the chance that events like “odd side up” will occur based on assumptions about things like the probabilities of the elementary outcomes in the sample space.
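As a minimal sketch of these ideas, assuming a fair six-sided die (the fairness is an assumption made here, not something the text fixes), the sample space and an event can be written as Python sets. The names sample_space, elementary_prob, odd_side_up and prob are illustrative only.

```python
from fractions import Fraction

# Sample space for one roll of a die: exhaustive, mutually exclusive outcomes.
sample_space = {1, 2, 3, 4, 5, 6}

# Assumption: the die is fair, so each elementary outcome has probability 1/6.
elementary_prob = {outcome: Fraction(1, 6) for outcome in sample_space}

# An event is any subset of the sample space, defined before the roll.
odd_side_up = {1, 3, 5}

def prob(event):
    """Probability of an event: the sum of its elementary outcome probabilities."""
    return sum((elementary_prob[o] for o in event), Fraction(0))

print(prob(odd_side_up))   # 1/2
print(prob(sample_space))  # 1
```

With different assumed elementary probabilities (for example a loaded die), only elementary_prob would change; the event and the summation stay the same.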

Often the outcomes in the sample space are mapped to numbers. Technically, this mapping is called a random variable, but more commonly and informally we refer to the unknown numeric outcome itself (before the experiment is run) as a “random variable”. Random variables commonly are represented as upper case English letters towards the end of the alphabet, such as Z, Y or X. Sometimes the lower case equivalents are used to represent the actual outcomes after the experiment is run.

Random variables are maps from the sample space to the real numbers, but they need not be one-to-one maps. For example, in the die experiment (where, eg, 1du denotes the outcome with one dot facing up) we could map all of the outcomes in the set {1du, 3du, 5du} to the number 0 and all of the outcomes in the set {2du, 4du, 6du} to the number 1, and call this random variable Y.

If we use X for the random variable that maps the die outcomes to the numbers 1 through 6, then random variable Y could also be thought of as a map from X to Y where the odd numbers of X map to 0 in Y and the even numbers to 1. Often the term transformation is used when we create a new random variable out of an old one in this way. It should now be obvious that many, many different random variables can be defined/invented for a given experiment.

A few more basic definitions are worth learning at this point. A random variable that takes on only the numbers 0 and 1 is commonly referred to as an indicator (random) variable. It is usually named to match the set that corresponds to the number 1. So in the previous example, random variable Y is an indicator for even outcomes.
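The maps described above can be sketched with plain Python dictionaries. This is only an illustration: the outcome labels follow the text, while the dictionary representation and the helper name even_indicator are assumptions made for the example.

```python
# Physical outcomes of the die experiment, labeled as in the text.
sample_space = ["1du", "2du", "3du", "4du", "5du", "6du"]

# X maps each physical outcome to the number of dots showing (one-to-one).
X = {outcome: int(outcome[0]) for outcome in sample_space}

# Y maps odd outcomes to 0 and even outcomes to 1 (not one-to-one).
Y = {outcome: 0 if X[outcome] % 2 else 1 for outcome in sample_space}

def even_indicator(x):
    """Transformation from a value of X to the corresponding value of Y."""
    return 1 if x % 2 == 0 else 0

# Y can be obtained directly from the outcome or by transforming X.
for outcome in sample_space:
    assert Y[outcome] == even_indicator(X[outcome])

print(Y)
```

Because Y takes only the values 0 and 1, and 1 corresponds to the even outcomes, Y is an indicator variable for “even”.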

For any random variable, the term support is used to refer to the set of possible real numbers defined by the mapping from the physical experimental outcomes to the numbers. Therefore, for random variables we use the term “event” to represent any subset of the support.

Ignoring certain technical issues, probability theory is used to take a basic set of assigned (or assumed) probabilities and use those probabilities (possibly with additional assumptions about something called independence) to compute the probabilities of various more complex events.

The core of probability theory is making predictions about the chances of occurrence of events based on a set of assumptions about the underlying probability processes.

One way to think about probability is that it quantifies how much we can know when we cannot know something exactly. Probability theory is deductive, in the sense that it involves making assumptions about a random (not completely predictable) process, and then deriving valid statements about what is likely to happen based on mathematical principles.

For this course a fairly small number of probability definitions, concepts, and skills will suffice.

For those who are unsatisfied with the loose definition of probability above, here is a brief description of three different approaches to probability, although it is not necessary to understand this material to continue through the chapter. If you want even more detail, I recommend Comparative Statistical Inference by Vic Barnett.

Valid probability statements do not claim what events will happen, but rather which are likely to happen. In the first approach, the starting point is a judgment that certain events are a priori equally likely.

Then using only the additional assumption that the occurrence of one event has no bearing on the occurrence of another separate event (called the assumption of independence), the likelihood of various complex combinations of events can be worked out through logic and mathematics. This approach has logical consistency, but cannot be applied to situations where it is unreasonable to assume equally likely outcomes and independence.

A second approach to probability is to define the probability of an outcome as the limit of the long-term fraction of times that outcome occurs in an ever-larger number of independent trials. This allows us to work with basic events that are not equally likely, but has the disadvantage that probabilities can only be assigned through observation.

Nevertheless this approach is sufficient for our purposes, which are mostly to figure out what would happen if certain probabilities are assigned to some events.
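The long-run-fraction idea can be illustrated with a small simulation. This is a sketch under stated assumptions: the trials are simulated coin tosses from Python's pseudo-random generator, and the function name long_run_fraction and its parameters are hypothetical.

```python
import random

def long_run_fraction(n_trials, p_heads=0.5, seed=0):
    """Fraction of heads in n_trials independent simulated coin tosses."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    heads = sum(1 for _ in range(n_trials) if rng.random() < p_heads)
    return heads / n_trials

# The observed fraction tends to settle near p_heads as the trials increase.
for n in (10, 1_000, 100_000):
    print(n, long_run_fraction(n))
```

Setting p_heads to a value other than 0.5 shows the same behavior for a basic event that is not one of two equally likely outcomes.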

A third approach is subjective probability, where the probabilities of various events are our subjective (but consistent) assignments of probability. This has the advantage that events that only occur once, such as the next presidential election, can be studied probabilistically.

Despite the seemingly bizarre premise, this is a valid and useful approach which may give different answers for different people who have different beliefs, but still helps calculate your rational but personal probability of future uncertain events, given your prior beliefs.

Regardless of which definition of probability you use, the calculations we need are basically the same. First we need to note that probability applies to some well-defined unknown or future situation in which some outcome will occur, the list of possible outcomes is well defined, and the exact outcome is unknown.

If the outcome is categorical or discrete quantitative, then each possible outcome gets a probability in the form of a number between 0 and 1 such that the sum of all of the probabilities is 1.

This implies that impossible outcomes are assigned probability zero, but assigning a probability of zero to an event does not necessarily mean that the outcome is impossible (see below). (Note that a probability is technically written as a number from 0 to 1, but is often converted to a percent from 0% to 100%. In case you have forgotten, to convert to a percent multiply by 100, eg, 0.25 is 25%, 0.5 is 50%, and 0.975 is 97.5%.)

“Every valid probability must be a number between 0 and 1 (or a percent between 0% and 100%).”
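The two requirements for a discrete outcome (each probability between 0 and 1, and a total of 1) can be checked mechanically. The following is a minimal sketch; the function names and the example probabilities for outcomes A, B and C are made up for illustration.

```python
import math

def is_valid_distribution(probs):
    """Check that each probability is in [0, 1] and that they sum to 1."""
    values = list(probs.values())
    in_range = all(0 <= p <= 1 for p in values)
    # isclose allows for floating-point rounding in the sum.
    return in_range and math.isclose(sum(values), 1.0)

def to_percent(p):
    """Convert a probability on the 0-1 scale to a percent."""
    return 100 * p

example = {"A": 0.25, "B": 0.5, "C": 0.25}   # hypothetical probabilities
print(is_valid_distribution(example))         # True
print(to_percent(0.25))                       # 25.0
```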

We will need to distinguish two types of random variables. Discrete random variables correspond to the categorical variables plus the discrete quantitative variables. Their support is a (finite or infinite) list of numeric outcomes, each of which has a non-zero probability. (Here we will loosely use the term “support” not only for the numeric outcomes of the random variable mapping, but also for the sample space when we do not explicitly map an outcome to a number.)

Examples of discrete random variables include the result of a coin toss (the support using curly brace set notation is {H, T}), the number of tosses out of 5 that are heads ({0, 1, 2, 3, 4, 5}), the color of a random person's eyes ({blue, brown, green, other}), and the number of coin tosses until a head is obtained ({1, 2, 3, 4, 5, ...}). Note that the last example has an infinitely large support.
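Two of these supports can be made concrete with a short calculation. The sketch below assumes a fair coin and independent tosses, and uses the standard binomial and geometric probability formulas, which are not given in the text above; the names heads_in_5 and first_head_prob are illustrative.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 2)  # assumption: fair coin, independent tosses

# Number of heads in 5 tosses: finite support {0, 1, 2, 3, 4, 5}.
heads_in_5 = {k: comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(6)}
print(sum(heads_in_5.values()))  # 1

# Number of tosses until the first head: infinite support {1, 2, 3, ...}.
def first_head_prob(k):
    """Probability that the first head appears on toss k."""
    return (1 - p)**(k - 1) * p

# Every value in the support has non-zero probability, and the partial
# sums approach 1 without ever reaching it.
partial = sum(first_head_prob(k) for k in range(1, 21))
print(partial)  # 1048575/1048576
```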

Continuous random variables correspond to the continuous quantitative variables. Their support is a continuous range of real numbers (or rarely several disconnected ranges) with no gaps. When working with continuous random variables in probability theory we think as if there is no rounding, and each value has an infinite number of decimal places.

In practice we can only measure things to a certain number of decimal places, so actual measurements of the continuous variable “length” might be 3.14, 3.15, etc., which do have gaps. But we approximate this with a continuous random variable rather than a discrete random variable because more precise measurement is possible in theory.

A strange aspect of working with continuous random variables is that each particular outcome in the support has probability zero, while none is actually impossible. The reason each outcome value has probability zero is that otherwise the probabilities of all of the events would add up to more than 1.

So for continuous random variables we usually work with intervals of outcomes to say, eg, that the probability that an outcome is between 3.14 and 3.15 might be 0.02 while each real number in that range, eg, π (exactly), has zero probability. Examples of continuous random variables include ages, times, weights, lengths, etc. All of these can theoretically be measured to an infinite number of decimal places.
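To see how an interval can have positive probability while every single point has probability zero, here is a small sketch. It assumes, purely for illustration, a random variable that is uniformly distributed between 3.0 and 3.5; that distribution and the function name uniform_interval_prob are hypothetical choices, picked so that the interval from 3.14 to 3.15 gets probability about 0.02.

```python
import math

def uniform_interval_prob(a, b, low=3.0, high=3.5):
    """P(a <= X <= b) for X uniform on [low, high] (hypothetical distribution)."""
    a, b = max(a, low), min(b, high)   # clip the interval to the support
    return max(b - a, 0.0) / (high - low)

print(uniform_interval_prob(3.14, 3.15))        # about 0.02
print(uniform_interval_prob(math.pi, math.pi))  # 0.0
```

The second call asks for the probability of the single point π, which is an interval of zero width and so has probability zero even though π lies inside the support.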

It is also possible for a random variable to be a mixture of discrete and continuous random variables. For example, if an experiment is to flip a coin and report 0 if it is heads and the time it was in the air if it is tails, then this variable is a mixture of the discrete and continuous types: the outcome “0” has a non-zero (positive) probability, while each individual positive number has zero probability (though intervals between two positive numbers would have probability greater than zero).
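The coin-and-air-time experiment can be simulated to show the mixed behavior. This is a sketch under loudly stated assumptions: the coin is taken to be fair, and the time in the air is drawn uniformly between 0.3 and 0.6 seconds, a made-up distribution chosen only for illustration. The function name flip_and_time is likewise hypothetical.

```python
import random

def flip_and_time(rng):
    """Return 0 for heads, or a simulated time in the air (seconds) for tails."""
    if rng.random() < 0.5:            # assumption: fair coin
        return 0.0
    return rng.uniform(0.3, 0.6)      # hypothetical air-time distribution

rng = random.Random(1)                # seeded so that a run is repeatable
draws = [flip_and_time(rng) for _ in range(100_000)]

# Discrete part: the single value 0 occurs in a sizable fraction of draws.
frac_zero = sum(d == 0.0 for d in draws) / len(draws)
# Continuous part: an interval of times occurs with positive frequency...
frac_interval = sum(0.4 <= d <= 0.5 for d in draws) / len(draws)
# ...but any one exact time essentially never occurs.
frac_exact = sum(d == 0.45 for d in draws) / len(draws)

print(frac_zero, frac_interval, frac_exact)
```

Under these assumptions the fraction of zeros should be near 0.5 and the fraction of draws in the interval from 0.4 to 0.5 near 1/6, while the fraction equal to exactly 0.45 stays at zero.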
