Visualizing Binomial Distribution in R

In this tutorial, we will learn how to visualize the binomial distribution in R. The binomial distribution is a discrete probability distribution that comes in handy for modelling a wide range of problems.

The classic example of a binomial distribution is tossing a coin n times and counting the number of heads (successes), whether the coin is fair or biased. The binomial distribution gives us the probability of getting exactly k successes in those n tosses. Visualizing the distribution helps us understand its shape.

library(tidyverse)
theme_set(theme_bw(16))

In R, we can compute the binomial probability mass function (PMF) with the dbinom() function. We need to specify the number of trials (size) and the probability of success in each trial (prob). For an experiment that tosses a fair coin 10 times, size = 10 and prob = 0.5. We can compute the probability of getting exactly 5 successes as shown below: it is about 25%.

dbinom(5, size=10, prob=0.5)

## [1] 0.2460938

The probability is negligible, about 0.1%, if we are interested in 10 successes out of 10 tosses.

dbinom(10, size=10, prob=0.5)

## [1] 0.0009765625
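The values returned by dbinom() follow the binomial PMF, P(X = k) = choose(n, k) * p^k * (1 - p)^(n - k). As a minimal sketch, we can write that formula out by hand and compare it with dbinom(). The variable names k and manual_prob are only illustrative and are not used elsewhere in this tutorial.

```r
# Binomial PMF written out by hand: choose(n, k) * p^k * (1 - p)^(n - k)
n <- 10   # number of tosses
k <- 5    # number of successes we are asking about (illustrative name)
p <- 0.5  # probability of success in a single toss

manual_prob <- choose(n, k) * p^k * (1 - p)^(n - k)
manual_prob

# Compare with the built-in PMF; all.equal() tolerates floating-point error
all.equal(manual_prob, dbinom(k, size = n, prob = p))
```

Here choose(10, 5) is 252, so the result is 252 / 1024, which matches the value of about 0.246 shown earlier.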

Visualizing Binomial Distribution as a line plot

One way to visualize the binomial distribution is to make a line plot of the probability of each number of successes. Let us compute the probabilities using the dbinom() function for every possible number of successes in tossing a fair coin 10 times.

# number of trials
n <- 10
# probability of success in a trial
p <- 0.5
# the number of successes can be anything from 0 to n
binom_prob_df1 <- tibble(n_success = 0:n) %>%
  mutate(prob = dbinom(n_success, size=n, prob=p))

And this is what our probabilities look like.

binom_prob_df1

## # A tibble: 11 × 2
##    n_success     prob
##        <int>    <dbl>
##  1         0 0.000977
##  2         1 0.00977 
##  3         2 0.0439  
##  4         3 0.117   
##  5         4 0.205   
##  6         5 0.246   
##  7         6 0.205   
##  8         7 0.117   
##  9         8 0.0439  
## 10         9 0.00977 
## 11        10 0.000977

Here we visualize these binomial probabilities as a line plot using ggplot2's geom_line() function, with the number of successes on the x-axis and the probability on the y-axis.

binom_prob_df1 %>%
  ggplot(aes(x=n_success, y=prob))+
  geom_line()+
  geom_point(size=2)+
  # one tick per possible number of successes, including 0
  scale_x_continuous(breaks=0:n)+
  scale_y_continuous(breaks = scales::pretty_breaks(n = 5))+
  labs(x= "Number of Successes",
       y= "Probability",
       title=paste0("Binomial Distribution: n = ",n,", p = ",p))

We can quickly see that when we toss a fair coin 10 times, getting 5 successes is the most likely outcome, with a probability of about 0.25.

Binomial Probability Distribution: PMF vs Number of Successes as Line plot
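What the plot shows can also be checked numerically. The sketch below uses base R only and assumes we evaluate the PMF over every possible outcome, 0 through n. The names k_values and probs are illustrative. A complete PMF should sum to 1, and which.max() picks out the most likely number of successes.

```r
n <- 10   # number of tosses
p <- 0.5  # fair coin

k_values <- 0:n                                # every possible number of successes
probs <- dbinom(k_values, size = n, prob = p)  # PMF at each value

# A PMF over all outcomes sums to 1 (up to floating-point error)
all.equal(sum(probs), 1)

# Most likely number of successes and its probability
k_values[which.max(probs)]
max(probs)
```

For the fair coin this returns 5 successes with a probability of about 0.246, in line with the peak of the line plot.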

Visualizing Binomial Distribution as a bar plot

Another way to visualize the binomial distribution is a bar plot with the number of successes on the x-axis and the probability on the y-axis. Because the distribution is discrete, a bar plot is the more common choice. Here we use the geom_col() function to make the bar plot.

binom_prob_df1 %>%
  ggplot(aes(x=n_success,y=prob))+
  geom_col(width=0.25)+
  # one tick per possible number of successes, including 0
  scale_x_continuous(breaks=0:n)+
  scale_y_continuous(breaks = scales::pretty_breaks(n = 5))+
  labs(x= "Number of Successes",
       y= "Probability",
       title=paste0("Binomial Distribution: n = ",n,", p = ",p))
# save the most recent plot to a file
ggsave("binomial_probability_distribution_barplot_n10_p5.png")

When the coin is fair, i.e. the probability of success is 50%, we can see that the binomial distribution is symmetric.

Binomial Probability Distribution as Barplot
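The symmetry can be verified directly. With p = 0.5, the PMF reduces to choose(n, k) / 2^n, and choose(n, k) equals choose(n, n - k), so k successes and n - k successes are equally likely. A small sketch, with probs_fair as an illustrative name:

```r
n <- 10
probs_fair <- dbinom(0:n, size = n, prob = 0.5)

# rev() reverses the vector, pairing k successes with n - k successes
all.equal(probs_fair, rev(probs_fair))
```

If the distribution is symmetric, the reversed vector matches the original and all.equal() returns TRUE.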

Binomial Distribution: A biased coin example

Let us quickly consider a scenario where our coin is not fair: with p = 0.7, it is biased to land heads more often than tails.

# probability of head/success
p <- 0.7
# binomial distribution with dbinom(), for 0 to n successes
binom_prob_df3 <- tibble(n_success=0:n) %>%
  mutate(prob=dbinom(n_success, size=n, prob=p))

# visualize as a barplot
binom_prob_df3 %>%
  ggplot(aes(x=n_success,y=prob))+
  geom_col(width=0.25)+
  scale_x_continuous(breaks=0:n)+
  scale_y_continuous(breaks = scales::pretty_breaks(n = 5))+
  labs(x= "Number of Successes",
       y= "Probability",
       title=paste0("Binomial Distribution: n = ",n,", p = ",p))
# save the most recent plot to a file
ggsave("binomial_probability_distribution_barplot_n10_p7p.png")

Since the coin is biased towards heads, and we count heads as successes, the binomial distribution is now skewed to the left, with a longer tail towards fewer successes. Getting 7 heads/successes is the most likely outcome.

Visualizing Binomial Distribution: A biased coin toss example
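To back up the reading of the biased-coin plot with numbers, here is a minimal base R sketch, assuming the PMF is evaluated over all outcomes from 0 to n. The names p_biased, probs_biased, mode_k, mean_k and is_symmetric are illustrative. It finds the most likely number of successes, computes the expected number of successes (which equals n * p for a binomial distribution), and checks that the distribution is no longer symmetric.

```r
n <- 10
p_biased <- 0.7

k_values <- 0:n
probs_biased <- dbinom(k_values, size = n, prob = p_biased)

# Most likely number of successes (the mode)
mode_k <- k_values[which.max(probs_biased)]
mode_k  # 7

# Expected number of successes; for a binomial this equals n * p
mean_k <- sum(k_values * probs_biased)
all.equal(mean_k, n * p_biased)

# Symmetry check: the reversed PMF no longer matches the original
is_symmetric <- isTRUE(all.equal(probs_biased, rev(probs_biased)))
is_symmetric  # FALSE
```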
