Taking One-In-N Chances

If you take a one-in-$n$ chance $n$ times (e.g. taking 10 one-in-10 chances), what’s the probability that at least one of them will come off? Somewhat satisfyingly, the answer, for any reasonably large $n$, turns out to be “around 63%”. Here’s why.

First, some equations.

  • If you take a one-in-$n$ chance, the probability of it coming off is $\frac{1}{n}$. For example, if you roll a six-sided die, the probability of rolling a 6 is $\frac{1}{6}$.

  • The probability of the event not occurring is one minus the probability that it does occur,
    $1 - \frac{1}{n}$.

  • The probability of trying twice and it not happening (assuming the two attempts are independent) is the probability of it not happening the first time, times the probability of it not happening the second time:
    $(1 - \frac{1}{n}) \times (1 - \frac{1}{n})$, or $(1 - \frac{1}{n})^2$.

  • Similarly, the probability of it not happening in $n$ attempts is
    $(1 - \frac{1}{n})^n$.

  • So the probability that it does happen at least once is the probability that it doesn’t not happen,
    $1 - (1 - \frac{1}{n})^n$.

For a one-in-two chance, this works out as
$1 - (1 - \frac{1}{2})^2 = 1 - \frac{1}{4} = 75\%$.

For one-in-three, it’s around $70.4\%$. For one-in-four, $68.4\%$. As $n$ increases, the answer gets closer and closer to $1 - \frac{1}{e} \approx 63.2\%$, where $e$ is Euler’s number.
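To see those numbers come out of the formula, here is a minimal sketch in base R. The helper name `at_least_once` and the particular values of `n` are just illustrative choices, not anything special; the function is the same $1 - (1 - \frac{1}{n})^n$ expression as above.

```r
# Hypothetical helper: probability of at least one success
# when a one-in-n chance is taken n times
at_least_once = function(n) 1 - (1 - 1/n)^n

# A few illustrative values of n, including the ten one-in-10 chances
n_values = c(2, 3, 4, 10, 100, 1000)
result = data.frame(
  n = n_values,
  prob = round(at_least_once(n_values), 4)
)
print(result)

# Gap between each probability and the limiting value 1 - 1/e
limit = 1 - exp(-1)
gaps = at_least_once(n_values) - limit
print(round(gaps, 4))
```

The gaps should all be positive and shrink as `n` grows, which is the "closer and closer" behaviour described above.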

Why $1 - \frac{1}{e}$? To be honest, you would have to ask someone better at maths than me, but I think it’s a pretty cool result.
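For what it’s worth, one standard way to see it (a sketch rather than a proof) is to take the usual limit characterisation of the exponential function, $e^x = \lim_{n \to \infty} (1 + \frac{x}{n})^n$, as given and set $x = -1$:

```latex
\begin{aligned}
\lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^n
  &= \lim_{n \to \infty} \left(1 + \frac{-1}{n}\right)^n
   = e^{-1} = \frac{1}{e} \approx 0.368, \\
\lim_{n \to \infty} \left[\, 1 - \left(1 - \frac{1}{n}\right)^n \right]
  &= 1 - \frac{1}{e} \approx 0.632.
\end{aligned}
```

So the chance of every attempt failing tends to $\frac{1}{e}$, and the chance of at least one success tends to what is left over.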

Obligatory XKCD

xkcd.com/882: “Significant”

Code

library(tidyverse)
theme_set(theme_minimal(base_size = 16))
ns = 2:12 # Values of n

# Probability of one or more successes
prob_of_success = function(n){
  1 - (1 - 1/n)^n
}
probs = prob_of_success(ns)
limit_val = 1 - (1 / exp(1))

df = data.frame(
  n = ns,
  prob = probs
)

# Plot the formula for n = 2 to 12 against the limiting value
ggplot(df, aes(n, prob)) +
  geom_path() +
  geom_point() +
  coord_cartesian(ylim=c(.6, .8)) +
  scale_x_continuous(breaks=ns) +
  scale_y_continuous(labels=scales::percent) +
  geom_hline(yintercept=limit_val, linetype='dashed') +
  annotate('label', x=3, y=limit_val, label=expression(1  -  frac(1,'e'))) +
  labs(x = "Value of n", y="Probability of at\nleast one success",
       caption = "You take a one-in-n chance, n times.\nWhat's the probability you're successful at least once?")

# Same formula drawn as a continuous curve for n up to 100, to show the limit
ggplot(data.frame(n=2:100), aes(x=n)) +
  geom_function(fun=prob_of_success) +
  coord_cartesian(ylim=c(.6, .8)) +
  scale_y_continuous(labels=scales::percent) +
  geom_hline(yintercept=limit_val, linetype='dashed') +
  annotate('label', x=3, y=limit_val, label=expression(1  -  frac(1,'e'))) +
  labs(x = "Value of n", y="Probability of at\nleast one success",
       caption = "Limit as n goes to infinity")

Don’t trust my formula? I don’t blame you; I never trust anything I figure out myself. That’s why I double-check it against a simulation.

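set.seed(1) # Arbitrary seed, so the simulated values are reproducible

# Simulate nsims rounds of n attempts, each with success probability 1/n,
# and return the proportion of rounds with at least one success
sim_func = function(n, nsims=100000){
  outcomes = rbinom(n=nsims, size=n, prob=1/n)
  mean(outcomes > 0)
}
df$sim_p = map_dbl(ns, sim_func)

# Plot the formula and simulation results together
df %>%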
  pivot_longer(-n) %>%
  mutate(name = ifelse(name=='prob', 'Formula', 'Simulation')) %>%
  ggplot(aes(n, value, color=name)) +
  geom_path() + 
  geom_point() +
  coord_cartesian(ylim=c(.6, .8)) +
  scale_x_continuous(breaks=ns) +
  scale_y_continuous(labels=scales::percent) +
  geom_hline(yintercept=limit_val, linetype='dashed') +
  labs(x = "Value of n", y="Probability of at\nleast one success", color = "Method",
       caption = "Formula agrees with simulation")