Interesting probability puzzle I have been pondering:

You roll an unfair N-sided die M times, where M >= N. What is the probability that you will see every side at least once?

The probabilities of the N sides coming up are, respectively, [p_1, p_2, ..., p_N]

Mfw pure math brainlets will argue over this answer when you can simply use a Monte Carlo algorithm to find the answer in minutes
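For what it's worth, a minimal Monte Carlo sketch in Python; the weights used below are made up for illustration, and any list summing to 1 works:

```python
import random

def estimate_all_sides(p, m, trials=200_000, seed=0):
    """Estimate P(every side appears at least once in m rolls)
    for a die whose side probabilities are given by p."""
    rng = random.Random(seed)
    n = len(p)
    hits = 0
    for _ in range(trials):
        rolls = rng.choices(range(n), weights=p, k=m)
        if len(set(rolls)) == n:  # every side seen at least once
            hits += 1
    return hits / trials

# Biased coin (N=2), 3 rolls; exact answer is
# 1 - 0.25^3 - 0.75^3 = 0.5625, so the estimate lands near that.
print(estimate_all_sides([0.25, 0.75], 3))
```

With 200,000 trials the standard error is around 0.001, so the estimate agrees with the exact value to about two decimal places.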

I thought it would be a cool exercise and I tried but what are p_1, p_2, etc. supposed to represent? You can't simulate anything if you don't know the probabilities of each side

brainlet
it doesn't matter what the probabilities are specifically. [math]\sum_{i=1}^N p_i = 1[/math] is all you need to know.

Anyway, the probability of not seeing the ith side after M rolls is [math](1-p_i)^M[/math]

The probability of not NOT seeing the ith side after M rolls is [math]1-(1-p_i)^M[/math]

The probability of not not seeing all the sides in M rolls is [math]\sum_{i=1}^N\left(1-\left(1-p_i\right)^M\right)[/math], which can be rewritten as [math]N-\sum_{i=1}^N\left(1-p_i\right)^M[/math]
In other words, that's the probability that you'll see every side after M rolls.

I think it's (M choose N) * product of all N probabilities

M = 3 and N = 3, (3 choose 3) p1 p2 p3 = p1 p2 p3

M = 4 and N = 3: say the sides are numbered 1, 2, and 3. You want some arrangement of 123_, and there are (4 choose 3) = 4 of them, where the underscore is arbitrary and happens with probability 1 (i.e. the outcome is just 1, 2, or 3 showing)

not sure...

[eqn] \sum_{k=0}^{N-1} (-1)^{k} \sum_{\underset{|I| = k}{I \subset \{1,2,\ldots,N \}}} \left(\sum_{i \not \in I} p_i \right)^M [/eqn]

>re-indexing

Thanks that makes sense but from a practical standpoint it seems pretty useless.

well of course it's useless, it's just an algorithm to calculate the probability
N, M and all the Ps are variables. Did you expect some sort of beautiful simplification?

No it's fine I'm not complaining, I think it's just boring without an actual formula for the probabilities of each side. That way I could run a sim and see how it behaves with more sides and such
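That kind of sweep is only a few lines on top of a Monte Carlo loop; the 3-sided weights here are arbitrary placeholders:

```python
import random

def p_all_sides_mc(p, m, trials=50_000, seed=42):
    """Monte Carlo estimate of P(all sides appear in m rolls)."""
    rng = random.Random(seed)
    n = len(p)
    hits = sum(
        len(set(rng.choices(range(n), weights=p, k=m))) == n
        for _ in range(trials)
    )
    return hits / trials

weights = [0.5, 0.3, 0.2]  # arbitrary unfair 3-sided die
for m in (3, 5, 10, 20, 30):
    print(m, round(p_all_sides_mc(weights, m), 3))
```

The estimates climb toward 1 as M grows (for these weights the exact value at M = 3 is 0.18, and by M = 30 it is within about 0.002 of 1), which is the limiting behavior you'd expect.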

This is wrong.

> that makes sense
If that makes sense to you then it's a bad sign, because it's incorrect.

What's the right answer?

[math] 1 - \sum_{i=1}^{N}(1-p_i)^M [/math]

that can't possibly be right; the probability would be negative

>the probability would be negative
No it wouldn't. (You're retarded.)

So you're steadfast in your assertion that the probability of seeing every side of a 4-sided die with side weights of 1/12, 7/12, 2/12, and 2/12 in 8 rolls is approximately 3%?

Upon further consideration I realize my answer must be wrong (because in the same situation my probability is 303%) so I am certainly retarded, unless we're dealing in amplitudes. But I'm pretty sure we're both retarded.
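For the record, both quoted figures check out numerically:

```python
# Weights and roll count from the 4-sided example above.
p = [1/12, 7/12, 2/12, 2/12]
M = 8

s = sum((1 - pi) ** M for pi in p)

claim_one_minus_sum = 1 - s   # the "1 - sum" answer: about 3%
claim_n_minus_sum = len(p) - s  # the "n - sum" answer: about 303%

print(round(claim_one_minus_sum, 4))  # 0.0354
print(round(claim_n_minus_sum, 4))    # 3.0354
```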

You calculated the probability of seeing at least one of its sides after m rolls. Think about it, [math] 1 - (1-p_i)^m [/math] is the probability that the i-th side will come up at least once after m rolls. If you sum these probabilities you're calculating the probability of a disjunction of events, that is, the probability that either of the sides will come up after m rolls.
The probability of seeing *all* sides after m rolls is
[math] \prod_{i=1}^{n}(1-(1-p_i)^m) [/math]. It's a conjunction of events: the probability that the 1st side will come up at least once AND the 2nd side will come up at least once AND ... AND the n-th side will come up at least once

This is even more incorrect.

>mfw finding the answer the "math brainlet" way is faster than simulating it
>mfw you're a brainlet who has no idea why and how Monte Carlo methods work
>mfw you probably can't even implement one if given step-by-step instructions

The answer is already given here. It's a direct application of the inclusion-exclusion principle to the sample space.

That answer is wrong.

What does the inclusion-exclusion principle have to do with this?

There are N^M possible outcomes for the M consecutive rolls, and we can formally define the sample space [math]\Omega[/math] as the set of all length-M strings drawn from the alphabet {1,2,...,N}.
Next we define the series of events [math]A_1, \cdots, A_N[/math] where event [math]A_i[/math] is the subset of strings containing the letter i at least once, and we want to calculate [math]P(\bigcup_{i=1}^N A_i)[/math].

Recall that for finite sample spaces, an event is simply a subset of [math]\Omega[/math] and the probability of an event is simply the sum of the probabilities of its individual elements. For example, P(rolling M copies of the same number) = [math]P(\{11\ldots 1, 22 \ldots 2, \cdots, NN\ldots N\})= \sum_{i=1}^N p_i^M[/math].

Now apply the form of the inclusion-exclusion principle given in en.wikipedia.org/wiki/Inclusion–exclusion_principle#In_probability with the substitution [math]P(A_I)[/math] = P(none of the letters in I appear on the M rolls) = P(not rolling a letter in I in a single roll)^M which yields the required expression.

See above.

Interesting. So what's wrong with this

>the probability that the 1st side will come up at least once AND the 2nd side will come up at least once AND ... AND the n-th side will come up at least once
The multiplicative rule for conjunction probabilities -- P(A and B) = P(A)*P(B), or [math]P(\prod_{i=1}^n A_i) = \prod_{i=1}^nP(A_i)[/math] in this case -- need not apply if the events are not independent.
For a trivial example, P(A and not-A) = 0 but this doesn't imply P(A)*P(not A) = 0.

In OP's actual problem, we can already see this happening in the simple case N = M = 2, p_1 = p_2 = 0.5 (i.e. flipping a fair coin twice). If the multiplicative rule worked, we would have P(heads and tails each appear once) = P(heads appears at least once) * P(tails appears at least once). But the LHS is 2/4 and the RHS is (3/4)^2, which are not equal.
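A brute-force enumeration of the four outcomes makes the dependence explicit:

```python
from itertools import product

# All 2^2 outcomes of two fair coin flips, each with probability 1/4.
outcomes = list(product("HT", repeat=2))

p_both = sum(1 for o in outcomes if "H" in o and "T" in o) / 4  # HT, TH -> 2/4
p_heads = sum(1 for o in outcomes if "H" in o) / 4              # HH, HT, TH -> 3/4
p_tails = sum(1 for o in outcomes if "T" in o) / 4              # HT, TH, TT -> 3/4

print(p_both)             # 0.5
print(p_heads * p_tails)  # 0.5625 -- not equal, so the events are dependent
```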

Errata: [math]P(\prod_{i=1}^n A_i)[/math] should have been [math]P(\bigcap_{i=1}^n A_i)[/math]. [math]\prod_{i=1}^nA_i[/math] is notation for the Cartesian product, not the conjunction.

>need not apply if the events are not independent.
But the events are independent. In the example you give every coin flip is independent.
>flipping a fair coin twice ... But the LHS is 2/4 and the right hand side is (3/4)^2 which are not equal.
Why is the left-hand side of the equation 1/2?

Intuitively/anecdotally, 9/16 (~0.56) seems more correct than 1/2 (= 0.5): when flipping coins, you get different results on consecutive flips slightly more often than identical results.

Also, I suspect you applied the inclusion-exclusion principle incorrectly. Why are you calculating the union of "none of the letters in I appear on the M rolls" events to find out the probability of "every letter appears at least once"?
Makes no sense.

>Makes no sense
In the fair coin flip example, you want P(heads appears at least once AND tails appears at least once) but your formula does P(heads appears at least once OR tails appears at least once) = P(heads appears at least once) + P(tails appears at least once) - P(heads appears at least once AND tails appears at least once).

So yeah, you applied the principle wrong.

The more fair coin flips you do, the more likely it is that each side will appear at least once; in the limit, you expect the probability to be 1 as M goes to infinity. The product formula passes this test.
With this formula the limit is 0.

Another failed test.

The coin flips are independent, but the events "Heads appears at least once" and "Tails appears at least once" are not, and it is the probabilities of these events that you're forbidden to multiply.

>Why is the left-hand side of the equation 1/2?
When you flip 2 coins there are 2^2=4 possible outcomes, leading to a sample space of {HH, HT, TH, TT}.
Of these four outcomes, exactly two of them satisfy the criterion of seeing each side at least once: namely, HT and TH.
And in this case all four outcomes occur with equal probability 1/4, so:
P(seeing each side at least once, when a fair coin is flipped twice) = P(getting a head in one flip and a tail in the other) = 2/4.

You're right; I had also just realized that what I should have done is take the de Morgan dual: define [math]A_i[/math] to be the complementary event that letter i does not appear, compute their union (= the probability that at least one letter does not appear), and take the complement of that.
Then the event that none of the letters in I appear in the M rolls would be [math]\bigcap_{i\in I}A_i[/math] as required.

Actually, that formula doesn't even carry out inclusion-exclusion correctly; it's completely wrong. Using that formula it's also possible to get negative probabilities. Let's do N = M = 2, p1 = p2 = 0.5; you get

1*(0 + 0) - 1*(0.5^2 + 0.5^2) = -0.5 .... like, huh???

Using the corrected method, the new formula becomes

[eqn]1 + \sum_{k=1}^N (-1)^k \sum_{\underset{|I| = k}{I \subset \{1,2,\ldots,N \}}} \left(\sum_{i \not \in I} p_i \right)^M[/eqn]

which, for the case of N=2 and p1=p2=0.5, evaluates to 1 - (0.5^M + 0.5^M) + 0 = [math]1 - \frac{1}{ 2^{M-1} }[/math].
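The corrected formula is also straightforward to evaluate directly by summing over all subsets I:

```python
from itertools import combinations

def p_all_sides(p, m):
    """P(every side appears at least once in m rolls), by
    inclusion-exclusion over the set I of sides that never appear."""
    n = len(p)
    total = 0.0
    for k in range(n + 1):
        for I in combinations(range(n), k):
            missing = set(I)
            # Probability that no side in I appears in a single roll, to the m-th power.
            rest = sum(pi for i, pi in enumerate(p) if i not in missing)
            total += (-1) ** k * rest ** m
    return total

# Fair coin: matches 1 - 1/2^(M-1).
print(p_all_sides([0.5, 0.5], 2))  # 0.5
print(p_all_sides([0.5, 0.5], 3))  # 0.75
# The 4-sided example from earlier in the thread (comes out around 0.26, not 3%):
print(round(p_all_sides([1/12, 7/12, 2/12, 2/12], 8), 4))
```

This is exponential in N, which is fine for dice-sized problems; the k = N term is always 0 for M >= 1 since the inner sum is empty.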

>the events "Heads appears at least once" and "Tails appears at least once"
>are not independent
How?

This one seems correct. It also gives probability 1 in the limit as M goes to infinity.
It's pretty interesting how close the results this formula gives are to the ones given by the product formula though:

(1 - (1/2)^m)^2 = 1 - (1/2)^(m-1) + (1/2)^(2m); the last term goes to 0 exponentially.
Do you have any idea what that residue represents?

You know that if [math]I[/math] is the empty set then
[eqn] \left(\sum_{i \not \in I} p_i \right)^M = 1 [/eqn]
rather than 0?

Why?

Because you will sum over all p_i and they add up to 1.

Actually, scratch that question: you're wrong. An (empty sum)^power is not an empty product. The empty sum is 0, and 0^M = 0.

nvm, I didn't notice that was the not-included sign. Small font and no glasses.