Scatter Plot with Transparent Points Using ggplot2 in R

Scatter Plot with Transparent Data Points
A scatter plot is a great way to visualize the relationship between two quantitative variables. However, with a lot of data, scatter plots can suffer from overplotting: many data points are drawn on top of each other, which hides how densely the points are packed. One way to reduce the problem is to make the points partially transparent with the alpha argument in ggplot2. Let us load ggplot2 to make the scatter plot with transparent data points. We also load magrittr, which provides the %>% pipe used in the plotting code.
library(ggplot2)
library(magrittr)
Let us make a data frame with two quantitative variables. We generate these variables using random numbers from a normal (Gaussian) distribution.
set.seed(42)
x <- rnorm(2000, mean=15, sd=20)
y <- x+ rnorm(2000, mean=2, sd=30)
df <- data.frame(x=x, y=y)
Let us make a simple scatter plot to illustrate the problem of overplotting.
df %>% 
  ggplot(aes(x=x, y=y)) + geom_point()
Scatter plot with overplotting of data points
We can see that many points overlap, making it difficult to see most of the data points. Let us set the transparency level with alpha to reduce overplotting. The alpha value ranges from 0 (fully transparent) to 1 (fully opaque); here we use 0.3.
df %>%
  ggplot(aes(x=x, y=y)) + geom_point(alpha=0.3)
Scatter plot with transparency of data points with alpha
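To see why transparency reveals dense regions, here is a minimal sketch of how opacity builds up when points stack. It assumes standard "over" alpha blending, so n overlapping points with the same alpha have a combined opacity of roughly 1 - (1 - alpha)^n; the exact rendering may differ between graphics devices. The helper stacked_opacity is a hypothetical name used only for this illustration and is not part of ggplot2.

```r
# Approximate combined opacity of n overlapping points drawn with the same alpha.
# Assumption: standard "over" alpha blending. stacked_opacity is a hypothetical
# helper for illustration, not a ggplot2 function.
stacked_opacity <- function(alpha, n) {
  1 - (1 - alpha)^n
}

# Combined opacity for 1, 2, 5 and 10 overlapping points at alpha = 0.3
overlaps <- c(1, 2, 5, 10)
opacity <- stacked_opacity(alpha = 0.3, n = overlaps)
print(data.frame(overlaps = overlaps, opacity = round(opacity, 3)))
```

With alpha = 0.3, a single point stays faint, two overlapping points reach about 0.51, and five reach about 0.83, so darker areas of the plot correspond to regions with more data. For larger data sets, a smaller alpha keeps dense regions from saturating too quickly.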