Probability

  • The probability of an event is always between $0$ and $1$ and can also be written as a percentage.

  • The probability of event $A$ is often written as $P(A)$.

  • If $P(A) > P(B)$, then event $A$ has a higher chance of occurring than event $B$.

  • If $P(A) = P(B)$, then events $A$ and $B$ are equally likely to occur.

$$
X = \{3, 12, 5, 13\} \newline
Y = \{14, 15, 6, 3\} \newline
X \cap Y = \{3\} \quad \text{("and")} \newline
X \cup Y = \{3, 12, 5, 13, 14, 15, 6\} \quad \text{("or")}
$$

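AND ($\cap$, intersection): the elements that are in both groups.

OR ($\cup$, union): the elements that are in at least one group, with the shared (AND) elements counted only once.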
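As a rough illustration, the "and" and "or" operations map onto Python's built-in set operators. This sketch assumes $X = \{3, 12, 5, 13\}$, which is the reading consistent with the union listed above; the variable names `both` and `either` are only illustrative.

```python
# Sets from the example above; X is assumed to be {3, 12, 5, 13}.
X = {3, 12, 5, 13}
Y = {14, 15, 6, 3}

both = X & Y      # "and": intersection
either = X | Y    # "or": union

print(both)            # {3}
print(sorted(either))  # [3, 5, 6, 12, 13, 14, 15]

# The union counts the shared element once, so the sizes satisfy
# |X or Y| = |X| + |Y| - |X and Y|.
assert len(either) == len(X) + len(Y) - len(both)
```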

$$
A = \{5, 3, 17, 12, 19\} \newline
B = \{17, 19, 6\} \newline
A - B = \{5, 3, 12\} \newline
A \backslash B \Longrightarrow \text{all the things in } A \text{ that are not in } B \newline
B \backslash A \Longrightarrow B - A = \{6\}
$$

$$
A' \text{ (set of all things in } B \text{ that are not in } A \text{)} = B - A = B \backslash A
$$

$$
C = \{-5, 0, 7\} \newline
7 \in C \Longrightarrow \text{is a member of } C \newline
9 \notin C \Longrightarrow \text{is not a member of } C
$$
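Set difference and membership can be sketched the same way in Python, using the sets $A$, $B$, and $C$ from the notes. The names `a_minus_b` and `b_minus_a` are illustrative only.

```python
A = {5, 3, 17, 12, 19}
B = {17, 19, 6}
C = {-5, 0, 7}

a_minus_b = A - B   # elements of A that are not in B
b_minus_a = B - A   # elements of B that are not in A

print(sorted(a_minus_b))  # [3, 5, 12]
print(b_minus_a)          # {6}
print(7 in C)             # True
print(9 not in C)         # True
```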

If there is overlap:

$$
P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)
$$

Mutually exclusive means that there is no overlap: the events cannot happen at the same time. Then $P(A \text{ and } B) = 0$, and the rule simplifies to:

$$
P(A \text{ or } B) = P(A) + P(B)
$$
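Both versions of the addition rule can be checked by counting outcomes. The sketch below uses a hypothetical example that is not in the notes (one roll of a fair six-sided die), and `prob` is a helper defined here only for illustration.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of equally likely outcomes)."""
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}
odd = {1, 3, 5}

# Overlapping events: subtract the overlap so it is not counted twice.
p_or = prob(even) + prob(greater_than_3) - prob(even & greater_than_3)
assert p_or == prob(even | greater_than_3)
print(p_or)  # 2/3

# Mutually exclusive events: the overlap is empty, so nothing is subtracted.
assert even & odd == set()
assert prob(even) + prob(odd) == prob(even | odd)
```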

Sample Space

Coin Flip = {H, T}

Compound sample space: all possible outcomes of a compound event, such as several coin flips in a row.
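One way to list a compound sample space is to take every combination of single-flip outcomes. This is a minimal sketch using `itertools.product`; the function name `compound_sample_space` is made up for this example.

```python
from itertools import product

coin = ["H", "T"]

def compound_sample_space(flips):
    """All possible outcome sequences for the given number of coin flips."""
    return ["".join(outcome) for outcome in product(coin, repeat=flips)]

print(compound_sample_space(1))       # ['H', 'T']
print(compound_sample_space(2))       # ['HH', 'HT', 'TH', 'TT']
print(len(compound_sample_space(3)))  # 8
```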

Compound probability of independent events

$$
P(H) = \frac{1}{2} \newline
P(HH) = \frac{1}{4} \Longrightarrow [HH, HT, TH, TT] \newline
P(THT) = \frac{1}{8} \Longrightarrow P(T_1) \cdot P(H_2) \cdot P(T_3) \newline
\Longrightarrow \text{(each factor is a probability of } \tfrac{1}{2} \text{)}
$$

$$
P(\text{at least 1 H in 3 flips}) = P(\text{not getting all tails in 3 flips}) \newline
= 1 - P(TTT) \newline
= 1 - \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = 1 - \frac{1}{8} = \frac{7}{8}
$$

$$
P(\text{at least one head in 10 flips}) \newline
= P(\text{not all tails in 10 flips}) \newline
= 1 - P(\text{10 tails in a row}) \newline
= 1 - \frac{1}{2^{10}} \newline
= 1 - \frac{1}{1024} \newline
\approx 99.9\%
$$
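The complement trick ("at least one head" equals "not all tails") can be checked against a brute-force count. This is a sketch for a fair coin; both function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def p_at_least_one_head(flips):
    """Complement rule: 1 - P(all tails) for a fair coin."""
    return 1 - Fraction(1, 2) ** flips

def p_at_least_one_head_by_counting(flips):
    """Same probability, found by listing every outcome sequence."""
    outcomes = list(product("HT", repeat=flips))
    favorable = [o for o in outcomes if "H" in o]
    return Fraction(len(favorable), len(outcomes))

print(p_at_least_one_head(3))          # 7/8
print(p_at_least_one_head(10))         # 1023/1024
print(float(p_at_least_one_head(10)))  # 0.9990234375
assert p_at_least_one_head(3) == p_at_least_one_head_by_counting(3)
```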

If you have a 33% chance of making a 3-pointer, what is your chance of making 3 in a row? Treating the shots as independent:

$$
\left(\frac{33}{100}\right)^3 = 0.33^3 \approx 0.036
$$
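The same arithmetic in Python, as a quick check. It assumes that each shot is independent of the others, which real shooting may not satisfy; the variable names are illustrative.

```python
p_make = 0.33  # chance of making a single 3-pointer

# Assumption: each shot is independent of the others.
p_three_in_a_row = p_make ** 3

print(round(p_three_in_a_row, 4))  # 0.0359
```

So the chance of three in a row is only about 3.6%.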

If you flip three fair coins, what is the probability that you'll get two tails and one head in any order? If we flip three coins, there are $2$ possible outcomes for each individual flip, so there are $2\times2\times2=8$ total possible outcomes. Each outcome is equally likely. Three of them (TTH, THT, and HTT) have two tails and one head, so the probability is $\frac{3}{8}$.
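The count of favorable outcomes can be confirmed by listing all eight sequences. This is a small sketch; `outcomes` and `favorable` are illustrative names.

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(o) for o in product("HT", repeat=3)]
favorable = [o for o in outcomes if o.count("T") == 2]

print(favorable)  # ['HTT', 'THT', 'TTH']
print(Fraction(len(favorable), len(outcomes)))  # 3/8
```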

The general multiplication rule

When we calculate probabilities involving one event AND another event occurring, we multiply their probabilities.

In some cases, the first event happening impacts the probability of the second event. We call these dependent events.

In other cases, the first event happening does not impact the probability of the second. We call these independent events.

Independent events: Flipping a coin twice

What is the probability of flipping a fair coin and getting "heads" twice in a row? That is, what is the probability of getting heads on the first flip AND heads on the second flip?

Imagine we had 100 people simulate this and flip a coin twice. On average, 50 people would get heads on the first flip, and then 25 of them would get heads again. So 25 out of the original 100 people (or 1/4 of them) would get heads twice in a row.

The number of people we start with doesn't really matter. Theoretically, 1/2 of the original group will get heads, and 1/2 of that group will get heads again. To find a fraction of a fraction, we multiply.

We can represent this concept with a tree diagram.

We multiply the probabilities along the branches to find the overall probability of one event AND the next event occurring. For example, the probability of getting two "tails" in a row would be:

$$
P(\text{T and T})=\dfrac12 \cdot \dfrac12=\dfrac14
$$

When two events are independent, we can say that

$$
P(\text{A and B})=P(\text{A}) \cdot P(\text{B})
$$
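The "many people each flip twice" thought experiment can be approximated with a seeded simulation. This is only a sketch: the function name, the seed, and the group size of 100,000 are arbitrary choices, and the printed value will vary slightly with the seed.

```python
import random

def simulate_two_heads(people, seed=0):
    """Fraction of simulated people who flip heads twice in a row."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    twice = 0
    for _ in range(people):
        first = rng.random() < 0.5   # heads on the first flip
        second = rng.random() < 0.5  # heads on the second flip
        if first and second:
            twice += 1
    return twice / people

# Should land close to 1/2 * 1/2 = 0.25.
print(simulate_two_heads(100_000))
```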

Dependent events: Drawing cards

We can use a similar strategy even when we are dealing with dependent events.

Consider drawing two cards, without replacement, from a standard deck of 52 cards. That means we are drawing the first card, leaving it out, and then drawing the second card.

What is the probability that both cards selected are black?

Half of the 52 cards are black, so the probability that the first card is black is 26/52. But the probability of getting a black card changes on the next draw, since the number of black cards and the total number of cards have both been decreased by 1.

Here's what the probabilities would look like in a tree diagram:

So the probability that both cards are black is:

$$
P(\text{both black})=\dfrac{26}{52}\cdot\dfrac{25}{51}\approx0.245
$$
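The card calculation can be reproduced with exact fractions and cross-checked by counting pairs of cards. This is a sketch with illustrative variable names; the counting check is an alternative method, not one used in the text.

```python
from fractions import Fraction
from math import comb

# First draw: 26 black cards out of 52.
p_first_black = Fraction(26, 52)
# Second draw, given the first was black: 25 black cards out of 51.
p_second_black_given_first = Fraction(25, 51)

p_both_black = p_first_black * p_second_black_given_first
print(p_both_black)                   # 25/102
print(round(float(p_both_black), 3))  # 0.245

# Cross-check by counting unordered pairs of cards.
assert p_both_black == Fraction(comb(26, 2), comb(52, 2))
```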

The general multiplication rule

For any two events, we can say that

$$
P(\text{A and B})=P(\text{A}) \cdot P(\text{B}|\text{A})
$$

The vertical bar in $P(\text{B}|\text{A})$ means "given," so this could also be read as "the probability that B occurs given that A has occurred."

This formula says that we can multiply the probabilities of two events, but we need to take the first event into account when considering the probability of the second event.

If the events are independent, one happening doesn't impact the probability of the other, and in that case, $P(\text{B}|\text{A})=P(\text{B})$.

Tree diagrams and conditional probability

Example: Bags at an airport

An airport screens bags for forbidden items, and an alarm is supposed to be triggered when a forbidden item is detected.

  • Suppose that 5% of bags contain forbidden items.

  • If a bag contains a forbidden item, there is a 98% chance that it triggers the alarm.

  • If a bag doesn't contain a forbidden item, there is an 8% chance that it triggers the alarm.

Given a randomly chosen bag triggers the alarm, what is the probability that it contains a forbidden item?

Let's break up this problem into smaller parts and solve it step-by-step.

Starting a tree diagram

The chance that the alarm is triggered depends on whether or not the bag contains a forbidden item, so we should first distinguish between bags that contain a forbidden item and those that don't.

"Suppose that 5% of bags contain forbidden items."

Filling in the tree diagram

"If a bag contains a forbidden item, there is a 98% chance that it triggers the alarm."

"If a bag doesn't contain a forbidden item, there is an 8% chance that it triggers the alarm."

We can use these facts to fill in the next branches in the tree diagram like this:

Completing the tree diagram

We multiply the probabilities along the branches to complete the tree diagram. Here's the completed diagram:
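The four branch products can be worked out with a short Python sketch. The three input probabilities are the ones stated in the problem; the variable names and the dictionary layout are illustrative assumptions.

```python
# Probabilities given in the problem statement.
p_forbidden = 0.05
p_alarm_given_forbidden = 0.98
p_alarm_given_clean = 0.08

# Multiply along each branch of the tree.
branches = {
    ("forbidden", "alarm"): p_forbidden * p_alarm_given_forbidden,
    ("forbidden", "no alarm"): p_forbidden * (1 - p_alarm_given_forbidden),
    ("not forbidden", "alarm"): (1 - p_forbidden) * p_alarm_given_clean,
    ("not forbidden", "no alarm"): (1 - p_forbidden) * (1 - p_alarm_given_clean),
}

for path, p in branches.items():
    print(path, round(p, 3))
# ('forbidden', 'alarm') 0.049
# ('forbidden', 'no alarm') 0.001
# ('not forbidden', 'alarm') 0.076
# ('not forbidden', 'no alarm') 0.874

# The four end points cover every bag, so they sum to 1.
assert abs(sum(branches.values()) - 1) < 1e-12
```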

Solving the original problem

"Given a randomly chosen bag triggers the alarm, what is the probability that it contains a forbidden item?"

Use the probabilities from the tree diagram and the conditional probability formula:

$$
P(\text{forbidden } | \text{ alarm})=\dfrac{P(\text{F} \cap \text{A})}{P(\text{A})}
$$

The vertical bar means "given."

So $P(\text{forbidden } | \text{ alarm})$ can be read as "the probability that the bag contains a forbidden item GIVEN the alarm is triggered."

The intersection symbol means "and."

So $P(\text{F} \cap \text{A})$ can be read as "the probability that a bag contains a forbidden item AND triggers the alarm."
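Plugging the problem's numbers into the conditional probability formula gives the answer to the original question. The sketch below assumes that the alarm can only be triggered along the two branches of the tree (forbidden or not forbidden); the variable names are illustrative.

```python
p_forbidden = 0.05
p_alarm_given_forbidden = 0.98
p_alarm_given_clean = 0.08

# P(F and A): the bag has a forbidden item AND triggers the alarm.
p_f_and_a = p_forbidden * p_alarm_given_forbidden  # 0.049

# P(A): the alarm can be triggered along either branch of the tree.
p_alarm = p_f_and_a + (1 - p_forbidden) * p_alarm_given_clean  # 0.049 + 0.076

p_forbidden_given_alarm = p_f_and_a / p_alarm
print(round(p_forbidden_given_alarm, 3))  # 0.392
```

With these numbers, only about 39% of the bags that trigger the alarm contain a forbidden item, because bags without forbidden items are far more common.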

Conditional probability and independence

In probability, we say two events are independent if knowing one event occurred doesn't change the probability of the other event.

For example, the probability that a fair coin shows "heads" after being flipped is 1/2. What if we knew the day was Tuesday? Does this change the probability of getting "heads?" Of course not. The probability of getting "heads," given that it's a Tuesday, is still 1/2. So the result of a coin flip and the day being Tuesday are independent events; knowing it was a Tuesday didn't change the probability of getting "heads."

Not every situation is this obvious. What about gender and handedness (left handed vs. right handed)? It may seem like a person's gender and whether or not they are left-handed are totally independent events. When we look at probabilities though, we see that about 10% of all people are left-handed, but about 12% of males are left-handed. So these events are not independent, since knowing a random person is a male increases the probability that they are left-handed.

The big idea is that we check for independence with probabilities.

Two events, $A$ and $B$, are independent if $P(\text{A } | \text{ B})=P(\text{A})$ and $P(\text{B } | \text{ A})=P(\text{B})$.
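The independence check can be sketched by counting outcomes. The example below is hypothetical and not from the text: two fair dice, with events chosen so that one pair is independent and the other is not. The helpers `prob` and `independent` are defined here only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: two fair dice, all 36 outcomes equally likely.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), computed by counting outcomes."""
    condition = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in condition if event(o)), len(condition))

def independent(a, b):
    """True if P(A | B) = P(A) and P(B | A) = P(B)."""
    return prob(a, given=b) == prob(a) and prob(b, given=a) == prob(b)

def first_even(o):
    return o[0] % 2 == 0

def sum_is_7(o):
    return sum(o) == 7

def sum_at_least_10(o):
    return sum(o) >= 10

print(independent(first_even, sum_is_7))         # True
print(independent(first_even, sum_at_least_10))  # False
```

Knowing the first die is even leaves the chance of a sum of 7 at 1/6, but it raises the chance of a sum of at least 10 from 1/6 to 2/9.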

What if the probabilities are close?

When we check for independence in real-world data sets, it's rare to get perfectly equal probabilities. Just about all real events that don't involve games of chance are dependent to some degree.

In practice, we often assume that events are independent and test that assumption on sample data. If the probabilities are significantly different, then we conclude the events are not independent. We'll learn more about this process in inferential statistics.
