Scatter Plot with Transparent Points Using ggplot2 in R

Scatter Plot with Transparent Data Points

While scatter plots are ideal for visualizing the relationship between two quantitative variables, their effectiveness diminishes with large datasets due to overplotting. This issue occurs when numerous data points are plotted on top of each other, hiding the true density of the data. To solve this, we can make the points transparent. In this tutorial, we’ll use the alpha argument in ggplot2 to control the transparency and create a more insightful plot.

Let us load ggplot2 to make the scatter plot with transparent data points.
library(ggplot2)
Let us make a data frame with two quantitative variables. We generate these variables using random numbers from a normal (Gaussian) distribution.
set.seed(42)  # make the random draws reproducible
x <- rnorm(2000, mean=15, sd=20)
# y is x plus Gaussian noise, so the two variables are correlated
y <- x + rnorm(2000, mean=2, sd=30)
df <- data.frame(x=x, y=y)
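As a quick sanity check on the simulated data, here is a minimal sketch (not part of the original post) that rebuilds the same data frame and compares the sample correlation with the value the construction implies, 20 / sqrt(20^2 + 30^2), or about 0.55. The names expected_cor and observed_cor are made up for illustration.

```r
set.seed(42)
x <- rnorm(2000, mean = 15, sd = 20)
y <- x + rnorm(2000, mean = 2, sd = 30)
df <- data.frame(x = x, y = y)

# Dimensions and a quick look at the two columns
dim(df)
summary(df)

# Correlation implied by y = x + noise: sd(x) / sqrt(sd(x)^2 + sd(noise)^2)
expected_cor <- 20 / sqrt(20^2 + 30^2)
observed_cor <- cor(df$x, df$y)  # sample estimate, should land near expected_cor
c(expected = expected_cor, observed = observed_cor)
```

With 2000 points and a moderate correlation, most of the data sits in one dense cloud, which is the situation where overplotting becomes a problem.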
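Let us make a simple scatter plot to illustrate the problem of overplotting.
# Pass the data frame directly to ggplot(); the %>% pipe is not
# available when only ggplot2 is loaded
ggplot(df, aes(x=x, y=y)) +
  geom_point()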
Scatter plot with overplotting of data points
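We can see that many points overlap and hide one another, which makes it hard to judge how dense the data are in the middle of the plot. Let us set the transparency level with the alpha argument of geom_point() to reduce overplotting. alpha ranges from 0 (fully transparent) to 1 (fully opaque).
# alpha=0.3 makes each point 30% opaque, so dense regions appear darker
ggplot(df, aes(x=x, y=y)) +
  geom_point(alpha=0.3)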
Scatter Plot with Transparent Data Points
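The right alpha value depends on how many points overlap, so it can help to try a few levels side by side. The sketch below is one way to do that, assuming the same simulated data as above; the alphas vector and the plots list are hypothetical names chosen for illustration, and the specific levels are not from the original post.

```r
library(ggplot2)

set.seed(42)
x <- rnorm(2000, mean = 15, sd = 20)
y <- x + rnorm(2000, mean = 2, sd = 30)
df <- data.frame(x = x, y = y)

# Hypothetical set of alpha levels to compare (1 = fully opaque)
alphas <- c(1, 0.5, 0.3, 0.1)

# Build one scatter plot per alpha level, labelled with its value
plots <- lapply(alphas, function(a) {
  ggplot(df, aes(x = x, y = y)) +
    geom_point(alpha = a) +
    ggtitle(paste("alpha =", a))
})
names(plots) <- paste0("alpha_", alphas)

# Show one plot at a time, here the most transparent version
print(plots[["alpha_0.1"]])
```

Lower alpha values reveal dense regions more clearly but make isolated points at the edges fainter, so a middle value is often a reasonable starting point.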
