How to Turn off “missing values have been dropped” warning message in ggplot2

In this post, we will learn how to turn off the “missing values” warning message from ggplot2 when making a scatterplot with data containing missing values. geom_point() in ggplot2 gives a warning when it drops missing values from the dataset it is plotting. Here is an example of the warning when geom_point() drops 2 data points while plotting.

Removed 2 rows containing missing values (`geom_point()`)

We will see two examples of how to turn off the warning message. Let us get started by loading tidyverse and the Palmer penguins dataset (from the palmerpenguins package) for making plots.

library(tidyverse)
library(palmerpenguins)
theme_set(theme_bw(16))

The Palmer penguins dataset has missing values. Let us make a scatter plot of body mass against flipper length, colored by species.

penguins %>%
  ggplot(aes(x=body_mass_g, 
             y = flipper_length_mm,
             color=species))+
  geom_point()+
  scale_color_brewer(palette ="Dark2" )
ggsave("remove_missing_values_dropped_warning_ggplot.png")

We get the following warning

## Warning: Removed 2 rows containing missing values (`geom_point()`).
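To see where these 2 rows come from, we can count the rows that have an NA in either of the plotted variables. This is a minimal sketch in base R; the name n_missing is only illustrative.

```r
library(palmerpenguins)

# Count rows where at least one of the two plotted variables is NA.
# These are the rows geom_point() cannot draw.
n_missing <- sum(is.na(penguins$body_mass_g) | is.na(penguins$flipper_length_mm))
n_missing
```

The count should match the number reported in the warning.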

Drop NAs from data to avoid the warning message

One approach to get around the warning message “Removed 2 rows containing missing values” is to drop rows containing missing values before plotting, using the drop_na() function from tidyr.

By default, drop_na() removes a row if any of its columns contains an NA, so no missing values reach geom_point() and we will not see the warning message. Note that this also drops rows whose only missing values are in columns the plot does not use.

penguins %>%
  drop_na() %>%
  ggplot(aes(x=body_mass_g, 
             y = flipper_length_mm,
             color=species))+
  geom_point()+
  scale_color_brewer(palette ="Dark2" )
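If we only want to drop the rows that would not be plotted anyway, drop_na() also accepts column names. The sketch below compares how many rows each variant keeps and then plots with the narrower version; the names n_any_column and n_plotted are only illustrative.

```r
library(tidyverse)
library(palmerpenguins)

# Rows kept when dropping on every column vs. only on the plotted columns
n_any_column <- nrow(drop_na(penguins))
n_plotted    <- nrow(drop_na(penguins, body_mass_g, flipper_length_mm))
c(all = nrow(penguins), any_column = n_any_column, plotted = n_plotted)

# Drop only rows with an NA in the variables mapped to x and y
penguins %>%
  drop_na(body_mass_g, flipper_length_mm) %>%
  ggplot(aes(x=body_mass_g, 
             y = flipper_length_mm,
             color=species))+
  geom_point()+
  scale_color_brewer(palette ="Dark2" )
```

Restricting drop_na() to the plotted columns keeps penguins that are only missing values in other columns, such as sex.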

Use na.rm in geom_point() to avoid the warning message

We can turn off the warning message that “rows containing missing values have been dropped” by specifying na.rm=TRUE as an argument to the geom_point() function.

penguins %>%
  ggplot(aes(x=body_mass_g, 
             y = flipper_length_mm,
             color=species))+
  geom_point(na.rm=TRUE)+
  scale_color_brewer(palette ="Dark2" )

When using na.rm=TRUE within geom_point(), ggplot2 silently drops the rows with missing values in the plotted variables, instead of us dropping the rows with missing values from the whole dataframe.
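One way to check this is to render the plot with and without na.rm=TRUE and collect any warnings that are raised. This is a rough sketch: plot_warnings() is a hypothetical helper written for this post, not part of ggplot2, and it relies on ggplotGrob() to build the plot without displaying it.

```r
library(ggplot2)
library(palmerpenguins)

# Hypothetical helper: build the plot as a grob and return the text of
# every warning raised along the way (the warnings themselves are muffled).
plot_warnings <- function(p) {
  msgs <- character(0)
  withCallingHandlers(
    ggplotGrob(p),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  msgs
}

base_plot <- ggplot(penguins,
                    aes(x = body_mass_g,
                        y = flipper_length_mm,
                        color = species))

# Default behaviour: the "Removed 2 rows ..." message should be collected
warn_default <- plot_warnings(base_plot + geom_point())
warn_default

# With na.rm = TRUE: no warnings are expected
warn_na_rm <- plot_warnings(base_plot + geom_point(na.rm = TRUE))
warn_na_rm
```

The exact wording of the message may differ between ggplot2 versions, so the sketch only collects the text rather than matching it.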
