Multiple Line Plots or Time Series Plots with ggplot2 in R

Multiple Line Plots with ggplot2

Line plots, or time series plots, are helpful for understanding trends over time. In this post we will learn how to make multiple line plots (or time series plots) in R using ggplot2.

Let us load tidyverse, the suite of R packages that includes ggplot2, to make the line plots.

 
library(tidyverse) 
theme_set(theme_bw(base_size=16)) 

We will use US crime data over time from the Marshall Project. We load the data from the Marshall Project’s GitHub page.

 
data_url <- "https://raw.githubusercontent.com/themarshallproject/city-crime/master/data/ucr_crime_1975_2015.csv"
crime_data <- read_csv(data_url)

We will be using three of the variables from the data: year, department name, and the number of violent crimes per 100k of population. The city/town information is in the “department_name” variable.

 
crime_data %>% 
  select(year, department_name, violent_per_100k) %>% 
  head()

## # A tibble: 6 x 3
##    year department_name   violent_per_100k
##   <dbl> <chr>                        <dbl>
## 1  1975 Albuquerque, N.M.             833.
## 2  1975 Arlington, Texas              247.
## 3  1975 Atlanta                      1637.
## 4  1975 Aurora, Colo.                 524.
## 5  1975 Austin, Texas                 404.
## 6  1975 Baltimore                    1862.

Let us make line plots of the violent crime rate over the years for each city, i.e. each department name. We can make a line plot using the geom_line() geom in ggplot2. In our example, we want year on the x-axis and violent_per_100k on the y-axis for every region (department_name).

Our first instinct for making such a line plot is to add the geom_line() layer after specifying the x and y variables.

 
crime_data %>%
  ggplot(aes(x=year, y=violent_per_100k)) +
  geom_line()

The resulting plot is not what we intended.

Multiple Line Plots with ggplot2

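In our effort to make multiple line plots, we used just two variables: year and violent_per_100k. We did not specify the grouping variable, i.e. the region/department_name information in our data, so geom_line() connected the observations from all departments as a single line. Let us specify the grouping variable inside geom_line().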

 
crime_data %>%
  ggplot(aes(x=year, y=violent_per_100k)) +
  geom_line(aes(group=department_name))

After we specify the grouping variable with aes(group=department_name) inside geom_line(), we get a nice multiple line plot, with each line showing the crime rate over time for one region.

Multiple Line Plots with ggplot2
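
To see the effect of the group aesthetic without downloading the crime data, here is a minimal sketch on a small made-up data frame. The data frame toy_crime, its three departments, and all of its values are hypothetical and only serve as an illustration. The sketch builds the same two plots as above and uses layer_data() from ggplot2 to count how many groups each plot ends up with.

```r
library(ggplot2)

# Hypothetical toy data: three made-up departments observed over four years.
toy_crime <- data.frame(
  year = rep(2000:2003, times = 3),
  department_name = rep(c("City A", "City B", "City C"), each = 4),
  violent_per_100k = c(500, 520, 480, 450,
                       900, 870, 910, 880,
                       300, 310, 290, 305)
)

# No grouping variable: geom_line() treats all rows as one series.
p_ungrouped <- ggplot(toy_crime, aes(x = year, y = violent_per_100k)) +
  geom_line()

# Grouping by department: one line is drawn per department.
p_grouped <- ggplot(toy_crime, aes(x = year, y = violent_per_100k)) +
  geom_line(aes(group = department_name))

# layer_data() returns the data ggplot2 computed for a layer,
# including the group each row was assigned to.
n_groups_ungrouped <- length(unique(layer_data(p_ungrouped, 1)$group))
n_groups_grouped <- length(unique(layer_data(p_grouped, 1)$group))
```

Printing p_ungrouped and p_grouped draws the two plots. Comparing n_groups_ungrouped with n_groups_grouped shows that the grouped plot has one group per department, while the ungrouped plot has a single group for all rows.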