How to make random jittered points reproducible

datavizpyr · December 22, 2021

In this post we will learn how to make plots with random jitter, made with ggplot2, reproducible. We have multiple posts on the importance of showing the actual data points while making boxplots and violin plots. One way to avoid overplotting is to add random jitter along the x-axis, so that the data points do not all overlap each other. Because the jitter is random, the points land in slightly different places each time the plot is generated. The best way to add random jitter to a boxplot is to make it reproducible, so the points look exactly the same no matter when you run the code and generate the plot.

H/T to Tom Mock, who shared the really useful #rstats #ggplot2 tip on how to make a plot with random jitter reproducible. You guessed it, we need to set a seed for the random jitter. The way to set the seed is to use position_jitter() and specify seed inside it as an argument.

Let us get started with loading the packages we will be using.

library(tidyverse)
library(palmerpenguins)
library(patchwork)
theme_set(theme_bw(16))

Let us first make a boxplot with jittered data points using geom_jitter(), without setting a seed.

penguins %>%
  ggplot(aes(x = species,
             y = bill_length_mm,
             color = species))+
  geom_boxplot(width=0.25,
               outlier.shape = NA)+
  geom_jitter(width=0.1)+
  theme(legend.position="none")
ggsave("boxplot_with_jittered_points_ggplot2.png")
Boxplot with jittered data points
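The plot above uses geom_jitter() with no seed, so its horizontal offsets are drawn afresh whenever the layer is created. As a rough check, the sketch below constructs the same jittered layer twice and compares the x positions that ggplot2 computes, read back with layer_data(). The helper name jittered_x() is made up for this illustration and is not part of ggplot2.

```r
library(ggplot2)
library(palmerpenguins)

# Hypothetical helper: build the jittered point layer from scratch and
# return the x positions ggplot2 computed for it.
jittered_x <- function() {
  p <- ggplot(penguins, aes(x = species, y = bill_length_mm)) +
    geom_jitter(width = 0.1, na.rm = TRUE)
  layer_data(p, 1)$x
}

x_first  <- jittered_x()
x_second <- jittered_x()

# No seed was given, so the two sets of offsets are not expected to match.
identical(x_first, x_second)
```

Every point stays within 0.1 of its species position, but the exact offsets change from one construction to the next.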

How to make Boxplots with reproducible jittered points

First, let us make two variations of the same plot with position_jitter(). In the first plot we do not add any color, and we set a seed inside position_jitter() so that we can reuse the plot later.

p1 <- penguins %>%
  ggplot(aes(x=species,y=bill_length_mm))+
  geom_boxplot(outlier.shape = NA)+
  geom_point(position = position_jitter(seed = 42,width=0.15))+
  theme(legend.position = "none")

In the second plot, we color the data points by species and do not set a seed. Otherwise the two plots are the same.

p2 <- penguins %>%
  ggplot(aes(x=species,y=bill_length_mm, color=species))+
  geom_boxplot(outlier.shape = NA)+
  geom_point(position = position_jitter(width=0.15))+
  theme(legend.position = "none")
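
# Place the two plots side by side with patchwork.
# Note: p1 sets seed = 42, while p2 sets no seed.
p1 + p2 + plot_annotation(
  title = 'Boxplots with jittered points',
  subtitle = "without using seed in position_jitter()")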

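If we look carefully at the data points in these two plots, we can see that the jittered positions differ. The second plot does not set a seed, so it is not reproducible: its points land in new positions each time the plot is made.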

Boxplot with jittered data points - irreproducible

To make a boxplot with reproducible jittered data points, we use the geom_point() function on top of the geom_boxplot() function as before. Inside geom_point(), we set the position argument to position_jitter(seed = 42) to make the jittered points reproducible. Any fixed integer works as the seed; what matters is that the same value is used every time.
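To check the effect of the seed without comparing plots by eye, one option is to read the jittered coordinates back with layer_data(). The sketch below is only an illustration: the helper jittered_x() and the second seed value 2021 are made up for it. It builds the point layer from scratch for a given seed and compares the resulting x positions.

```r
library(ggplot2)
library(palmerpenguins)

# Hypothetical helper: jittered x positions of the point layer for a given seed.
jittered_x <- function(seed) {
  p <- ggplot(penguins, aes(x = species, y = bill_length_mm)) +
    geom_point(position = position_jitter(seed = seed, width = 0.15),
               na.rm = TRUE)
  layer_data(p, 1)$x
}

# Same seed, built twice: the positions should be identical.
same_seed_matches <- identical(jittered_x(42), jittered_x(42))

# A different seed (2021 is arbitrary) should give different positions.
other_seed_differs <- !identical(jittered_x(42), jittered_x(2021))

same_seed_matches
other_seed_differs
```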

Let us reproduce the same plot with slightly different color options, but with the same seed inside the position_jitter() function.

p3 <- penguins %>%
  ggplot(aes(x=species,y=bill_length_mm, color=species))+
  geom_boxplot(outlier.shape = NA)+
  geom_point(position = position_jitter(seed = 42,width=0.15))+
  theme(legend.position = "none")

We can combine the first plot (p1) and this new plot (p3) to make the comparison.

# Both plots use seed = 42 inside position_jitter().
p1 + p3 + plot_annotation(
  title = 'Boxplots with reproducible jittered points',
  subtitle = 'using position_jitter(seed=42)')

We can clearly see that the jittered data points are in exactly the same positions in both plots.

Boxplots with reproducible jittered points in ggplot2
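The comparison of the uncolored and colored plots can also be made numerically. The following sketch, which uses a made-up helper called jitter_layer(), builds only the point layer with and without a color mapping, both with seed = 42, and compares the computed positions. It assumes, as the figure suggests, that adding a color aesthetic does not change the order in which the points are jittered.

```r
library(ggplot2)
library(palmerpenguins)

# Hypothetical helper: computed layer data for a seeded jitter layer
# under a given aesthetic mapping.
jitter_layer <- function(mapping) {
  p <- ggplot(penguins, mapping) +
    geom_point(position = position_jitter(seed = 42, width = 0.15),
               na.rm = TRUE)
  layer_data(p, 1)
}

plain   <- jitter_layer(aes(x = species, y = bill_length_mm))
colored <- jitter_layer(aes(x = species, y = bill_length_mm, color = species))

# The positions should agree even though the colors differ.
positions_match <-
  isTRUE(all.equal(as.numeric(plain$x), as.numeric(colored$x))) &&
  isTRUE(all.equal(as.numeric(plain$y), as.numeric(colored$y)))

positions_match
length(unique(colored$colour))  # one color per species
```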

If we make the boxplot with jittered points without the seed argument, we will get a plot similar to the one above, but it will not be reproducible.
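One detail the examples in this post leave at its default: position_jitter() takes a height argument as well as width. According to the ggplot2 documentation, when height is omitted a small vertical jitter (40% of the resolution of the data) is applied too. If you want the jitter to be strictly along the x-axis, a minimal sketch is to set height = 0, as below. The object names here are illustrative only.

```r
library(ggplot2)
library(palmerpenguins)

# Drop rows with a missing bill length so the comparisons below are simple.
peng <- penguins[!is.na(penguins$bill_length_mm), ]

# Jitter horizontally only: height = 0 leaves the y values untouched.
p_flat <- ggplot(peng, aes(x = species, y = bill_length_mm)) +
  geom_point(position = position_jitter(seed = 42, width = 0.15, height = 0))

# Same layer with the default height, for comparison.
p_default <- ggplot(peng, aes(x = species, y = bill_length_mm)) +
  geom_point(position = position_jitter(seed = 42, width = 0.15))

y_flat    <- as.numeric(layer_data(p_flat, 1)$y)
y_default <- as.numeric(layer_data(p_default, 1)$y)

# With height = 0 the plotted y values equal the recorded measurements.
y_unchanged <- isTRUE(all.equal(y_flat, peng$bill_length_mm))

# With the default height the y values are nudged slightly.
default_y_changed <- !isTRUE(all.equal(y_default, peng$bill_length_mm,
                                       tolerance = 1e-12))

y_unchanged
default_y_changed
```

For these data the default vertical jitter is tiny compared with the range of bill lengths, so it is hard to see in the plots, but height = 0 keeps the plotted values exact.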
