How to show a legend on a dual y-axis ggplot





It's not possible in ggplot2 because I believe plots with separate y scales (not y-scales that are transformations of each other) are fundamentally flawed. Some problems:

  • They are not invertible: given a point on the plot space, you cannot uniquely map it back to a point in the data space.

  • They are relatively hard to read correctly compared to other options. See A Study on Dual-Scale Data Charts by Petra Isenberg, Anastasia Bezerianos, Pierre Dragicevic, and Jean-Daniel Fekete for details.

  • They are easily manipulated to mislead: there is no unique way to specify the relative scales of the axes, leaving them open to manipulation. Two examples from the Junkcharts blog: one, two

  • They are arbitrary: why have only 2 scales, not 3, 4 or ten?
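The points about invertibility and manipulation can be made concrete with a little arithmetic. The following is a minimal sketch in base R; the series `a` and `b` and the factors `k1` and `k2` are made-up illustrative values, not data from any of the cited sources. It shows that whether two series appear to cross depends only on the arbitrary factor chosen to align the two axes.

```r
# Two made-up series measured in different units (illustrative values only)
a <- c(10, 20, 30)
b <- c(100, 150, 400)

# Two equally defensible factors for drawing b against a's axis
k1 <- 0.10
k2 <- 0.05

# Does the rescaled b sit above a at some points and below it at others?
crosses <- function(k) any(b * k > a) && any(b * k < a)

crosses(k1)  # TRUE: the lines appear to cross
crosses(k2)  # FALSE: b appears to stay below a throughout

# The same plot height maps back to different data values per axis
height <- 20
c(on_a_axis = height, on_b_axis_k1 = height / k1, on_b_axis_k2 = height / k2)
```

Nothing in the data favours `k1` over `k2`, which is the sense in which the relative scale is open to manipulation.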

You also might want to read Stephen Few's lengthy discussion on the topic, "Dual-Scaled Axes in Graphs: Are They Ever the Best Solution?".

I am trying to compose a dual y-axis chart using ggplot. Firstly, let me say that I am not looking for a discussion of whether or not it is good practice to do so. I find such charts particularly useful when looking at time-based data to identify trends in two discrete variables. Further discussion of this is better suited to Cross Validated, in my opinion.

Kohske provides a very good example of how to do it, which I have used to great effect so far. However, I am stuck on including a legend for both y-axes. I have also seen similar questions here and here, but none seem to address the issue of including a legend.

I've got a reproducible example using the diamonds dataset from ggplot2.

Data

library(ggplot2)
library(gtable)
library(grid)
library(data.table)
library(scales)

grid.newpage()

dt.diamonds <- as.data.table(diamonds)

d1 <- dt.diamonds[,list(revenue = sum(price),
                        stones = length(price)),
                  by=clarity]

setkey(d1, clarity)

Charts

p1 <- ggplot(d1, aes(x=clarity,y=revenue, fill="#4B92DB")) +
  geom_bar(stat="identity") +
  labs(x="clarity", y="revenue") +
  scale_fill_identity(name="", guide="legend", labels=c("Revenue")) +
  scale_y_continuous(labels=dollar, expand=c(0,0)) + 
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        axis.text.y = element_text(colour="#4B92DB"), 
        legend.position="bottom")

p2 <- ggplot(d1, aes(x=clarity, y=stones, colour="red")) +
  geom_point(size=6) + 
  labs(x="", y="number of stones") + expand_limits(y=0) +
  scale_y_continuous(labels=comma, expand=c(0,0)) +
  scale_colour_manual(name = '',values =c("red","green"), labels = c("Number of Stones"))+
  theme(axis.text.y = element_text(colour = "red")) +
  theme(panel.background = element_rect(fill = NA),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_rect(fill=NA,colour="grey50"),
        legend.position="bottom")

# extract gtable
g1 <- ggplot_gtable(ggplot_build(p1))
g2 <- ggplot_gtable(ggplot_build(p2))


# locate the panel in g1 (select = spelled out rather than partially matched)
pp <- c(subset(g1$layout, name == "panel", select = t:r))
g <- gtable_add_grob(g1, g2$grobs[[which(g2$layout$name == "panel")]], pp$t,
                     pp$l, pp$b, pp$l)
# axis tweaks
ia <- which(g2$layout$name == "axis-l")
ga <- g2$grobs[[ia]]
ax <- ga$children[[2]]
ax$widths <- rev(ax$widths)
ax$grobs <- rev(ax$grobs)
ax$grobs[[1]]$x <- ax$grobs[[1]]$x - unit(1, "npc") + unit(0.15, "cm")
g <- gtable_add_cols(g, g2$widths[g2$layout[ia, ]$l], length(g$widths) - 1)
g <- gtable_add_grob(g, ax, pp$t, length(g$widths) - 1, pp$b)
# draw it
grid.draw(g)

QUESTION: Does anyone have some tips on how to get the 2nd part of the legend to show?

The charts produced are, in order: p1, p2, and the combined p1 & p2. Note that the legend for p2 doesn't show in the combined chart.

[Figures: p1; p2; combined p1 & p2]


It seems that the legend capture approach is the most generalizable in similar situations, though in this specific case @jennybryan's is simpler and probably what most people would want. I document the legend capture approach here as well. I first learned this approach from @Sandy Muspratt HERE.

dat <- data.frame(
    y = rnorm(200),
    x = sample(c("A", "B"), 200, TRUE),
    z = sample(100:200, 200, TRUE), 
    a = sample(c("male", "female"), 200, TRUE),
    b = factor(sample(1:2, 200, TRUE))
)

if (!require("pacman")) install.packages("pacman")
pacman::p_load(ggplot2, grid, gridExtra, gtable)

coldot <- ggplot(dat, aes(y = y, x = x)) +
    geom_point(aes(color = a, size = z)) + 
    #geom_boxplot(fill = NA, size=.75, aes(color=b)) +
    scale_color_manual(values = c("#F8766D", "#00BFC4"))

colbox <- ggplot(dat, aes(y = y, x = x)) +
    #geom_point(aes(color = a, size = z)) + 
    geom_boxplot(fill = NA, size=.75, aes(color=b)) +
    scale_color_manual(values = c("orange", "purple"))



leg1 <- gtable_filter(ggplot_gtable(ggplot_build(coldot)), "guide-box") 
leg1Grob <- grobTree(leg1)

leg2 <- gtable_filter(ggplot_gtable(ggplot_build(colbox)), "guide-box") 
leg2Grob <- grobTree(leg2)


noleg <- ggplot(dat, aes(y = y, x = x)) +
    geom_point(aes(color = a, size = z)) + 
    geom_boxplot(fill = NA, size=.75, aes(color=b), position=position_dodge(1)) +
    scale_color_manual(values = c("orange", "purple", "#F8766D", "#00BFC4")) +
    theme(
        plot.margin = unit(c(5.1, 4.1, 4.1, 2.1), "pt"),
        legend.position=c(1.3, 0.87)
    ) +
    guides(color = "none")  # "none" replaces the deprecated FALSE

legs <- ggplot(data = data.frame(x=1, y=1)) +
    geom_blank(aes(x=x, y=y)) + 
    theme_minimal() + 
    ylab(NULL) + xlab(NULL) +
    theme(
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank() 
    ) +
    annotation_custom(leg1Grob, xmin=1, xmax=1, ymin=.95, ymax=1.3) +
    annotation_custom(leg2Grob, xmin=.6, xmax=.8, ymin=.75, ymax=1) 

out <- arrangeGrob(noleg, legs, ncol=2, widths=c(.85, .15))
# arrangeGrob() returns a gtable in current gridExtra, so draw it with grid
grid.newpage()
grid.draw(out)

ggplot2: Assign color to 2 different geoms and get 2 different legends

If you use a filled plotting symbol, you can map one factor to fill and the other to colour, which then separates them into two scales and, therefore, legends.

ggplot(dat, aes(y = y, x = x)) +
  geom_point(aes(fill = a, size = z), pch = 21) + 
  geom_boxplot(fill = NA, size=.75, aes(color=b)) +
  scale_color_manual(values = c("orange", "purple")) +
  scale_fill_manual(values = c("#F8766D", "#00BFC4"))


How to use facets with a dual y-axis ggplot

EDIT: UPDATED TO GGPLOT 2.2.0
But ggplot2 now supports secondary y axes, so there is no need for grob manipulation. See @Axeman's solution.

facet_grid and facet_wrap plots generate different sets of names for plot panels and left axes. You can check the names using g1$layout where g1 <- ggplotGrob(p1), and p1 is drawn first with facet_grid(), then second with facet_wrap(). In particular, with facet_grid() the plot panels are all named "panel", whereas with facet_wrap() they have different names: "panel-1", "panel-2", and so forth. So commands like these:

pp <- c(subset(g1$layout, name == "panel", select = t:r))
g <- gtable_add_grob(g1, g2$grobs[which(g2$layout$name == "panel")], pp$t,
                     pp$l, pp$b, pp$l)

will fail with plots generated using facet_wrap. I would use regular expressions to select all names beginning with "panel". There are similar problems with "axis-l".
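A quick way to see which names a given plot produces is to inspect the layout table directly. The following is a minimal sketch using the mpg dataset that ships with ggplot2; the plot itself is only an illustration and the exact names vary between ggplot2 versions, which is the reason for matching on a prefix rather than a fixed string.

```r
library(ggplot2)

# An illustrative facet_wrap plot with three facets (drv has three levels)
p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~ drv, nrow = 1)

g <- ggplotGrob(p)

# Match every layout entry whose name starts with "panel" or "axis-l",
# whatever suffix this ggplot2 version appends
panel_idx <- grep("^panel", g$layout$name)
axis_idx  <- grep("^axis-l", g$layout$name)

# Names and positions of the matched panels
g$layout[panel_idx, c("name", "t", "l", "b", "r")]
```

The same `grep()` indices can then be used wherever the original code relied on `name == "panel"`.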

Also, your axis-tweaking commands worked for older versions of ggplot, but from version 2.1.0, the tick marks don't quite meet the right edge of the plot, and the tick marks and the tick mark labels are too close together.

Here is what I would do (drawing on code from here, which in turn draws on code from here and from the cowplot package).

# Packages
library(ggplot2)
library(gtable)
library(grid)
library(data.table)
library(scales)

# Data 
dt.diamonds <- as.data.table(diamonds)
d1 <- dt.diamonds[,list(revenue = sum(price),
                        stones = length(price)),
                  by=c("clarity", "cut")]
setkey(d1, clarity, cut)

# The facet_wrap plots
p1 <- ggplot(d1, aes(x = clarity, y = revenue, fill = cut)) +
  geom_bar(stat = "identity") +
  labs(x = "clarity", y = "revenue") +
  facet_wrap( ~ cut, nrow = 1) +
  scale_y_continuous(labels = dollar, expand = c(0, 0)) + 
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        axis.text.y = element_text(colour = "#4B92DB"), 
        legend.position = "bottom")

p2 <- ggplot(d1, aes(x = clarity, y = stones, colour = "red")) +
  geom_point(size = 4) + 
  labs(x = "", y = "number of stones") + expand_limits(y = 0) +
  scale_y_continuous(labels = comma, expand = c(0, 0)) +
  scale_colour_manual(name = '', values = c("red", "green"), labels = c("Number of Stones"))+
  facet_wrap( ~ cut, nrow = 1) +
  theme(axis.text.y = element_text(colour = "red")) +
  theme(panel.background = element_rect(fill = NA),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_rect(fill = NA, colour = "grey50"),
        legend.position = "bottom")


# Get the ggplot grobs
g1 <- ggplotGrob(p1)
g2 <- ggplotGrob(p2)

# Get the locations of the plot panels in g1.
pp <- c(subset(g1$layout, grepl("panel", g1$layout$name), select = t:r))

# Overlap panels for second plot on those of the first plot
# (index g2's grobs by g2's own layout names)
g <- gtable_add_grob(g1, g2$grobs[grepl("panel", g2$layout$name)], 
      pp$t, pp$l, pp$b, pp$l)


# ggplot contains many labels that are themselves complex grob; 
# usually a text grob surrounded by margins.
# When moving the grobs from, say, the left to the right of a plot,
# Make sure the margins and the justifications are swapped around.
# The function below does the swapping.
# Taken from the cowplot package:
# https://github.com/wilkelab/cowplot/blob/master/R/switch_axis.R 

hinvert_title_grob <- function(grob){

  # Swap the widths
  widths <- grob$widths
  grob$widths[1] <- widths[3]
  grob$widths[3] <- widths[1]
  grob$vp[[1]]$layout$widths[1] <- widths[3]
  grob$vp[[1]]$layout$widths[3] <- widths[1]

  # Fix the justification
  grob$children[[1]]$hjust <- 1 - grob$children[[1]]$hjust 
  grob$children[[1]]$vjust <- 1 - grob$children[[1]]$vjust 
  grob$children[[1]]$x <- unit(1, "npc") - grob$children[[1]]$x
  grob
}

# Get the y axis title from g2
index <- which(g2$layout$name == "ylab-l") # Which grob contains the y axis title?   EDIT HERE
ylab <- g2$grobs[[index]]                # Extract that grob
ylab <- hinvert_title_grob(ylab)         # Swap margins and fix justifications

# Put the transformed label on the right side of g1
g <- gtable_add_cols(g, g2$widths[g2$layout[index, ]$l], max(pp$r))
g <- gtable_add_grob(g, ylab, max(pp$t), max(pp$r) + 1, max(pp$b), max(pp$r) + 1, clip = "off", name = "ylab-r")

# Get the y axis from g2 (axis line, tick marks, and tick mark labels)
index <- which(g2$layout$name == "axis-l-1-1")  # Which grob.    EDIT HERE
yaxis <- g2$grobs[[index]]                    # Extract the grob

# yaxis is a complex of grobs containing the axis line, the tick marks, and the tick mark labels.
# The relevant grobs are contained in yaxis$children:
#   yaxis$children[[1]] contains the axis line;
#   yaxis$children[[2]] contains the tick marks and tick mark labels.

# First, move the axis line to the left
# But not needed here
# yaxis$children[[1]]$x <- unit.c(unit(0, "npc"), unit(0, "npc"))

# Second, swap tick marks and tick mark labels
ticks <- yaxis$children[[2]]
ticks$widths <- rev(ticks$widths)
ticks$grobs <- rev(ticks$grobs)

# Third, move the tick marks
# Tick mark lengths can change. 
# A function to get the original tick mark length
# Taken from the cowplot package:
# https://github.com/wilkelab/cowplot/blob/master/R/switch_axis.R 
plot_theme <- function(p) {
  plyr::defaults(p$theme, theme_get())
}

tml <- plot_theme(p1)$axis.ticks.length   # Tick mark length
ticks$grobs[[1]]$x <- ticks$grobs[[1]]$x - unit(1, "npc") + tml

# Fourth, swap margins and fix justifications for the tick mark labels
ticks$grobs[[2]] <- hinvert_title_grob(ticks$grobs[[2]])

# Fifth, put ticks back into yaxis
yaxis$children[[2]] <- ticks

# Put the transformed yaxis on the right side of g1
g <- gtable_add_cols(g, g2$widths[g2$layout[index, ]$l], max(pp$r))
g <- gtable_add_grob(g, yaxis, max(pp$t), max(pp$r) + 1, max(pp$b), max(pp$r) + 1, 
   clip = "off", name = "axis-r")

# Get the legends
leg1 <- g1$grobs[[which(g1$layout$name == "guide-box")]]
leg2 <- g2$grobs[[which(g2$layout$name == "guide-box")]]

# Combine the legends
g$grobs[[which(g$layout$name == "guide-box")]] <-
    gtable:::cbind_gtable(leg1, leg2, "first")

# Draw it
grid.newpage()
grid.draw(g)


Similar to the technique you use above you can extract the legends, bind them and then overwrite the plot legend with them.

So, starting from the "# draw it" step in your code:

# extract legend
leg1 <- g1$grobs[[which(g1$layout$name == "guide-box")]]
leg2 <- g2$grobs[[which(g2$layout$name == "guide-box")]]

g$grobs[[which(g$layout$name == "guide-box")]] <- 
                                  gtable:::cbind_gtable(leg1, leg2, "first")
grid.draw(g)
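Two details of this snippet are worth hedging. `gtable:::cbind_gtable` is an internal function, and recent ggplot2 releases name the legend slots "guide-box-right", "guide-box-bottom", and so on, so an exact match on "guide-box" can come back empty. The following is a sketch of a more defensive variant, assuming the exported `cbind()` method for gtables; `get_legend` is a hypothetical helper written for this example, and the two mpg plots are illustrative only.

```r
library(ggplot2)
library(gtable)

# Hypothetical helper: return the first non-empty legend of a plot.
# Unused legend slots are empty placeholder grobs, so keep only gtables.
get_legend <- function(p) {
  g <- ggplotGrob(p)
  boxes <- g$grobs[grep("guide-box", g$layout$name)]
  boxes <- Filter(function(x) inherits(x, "gtable"), boxes)
  boxes[[1]]
}

# Two illustrative plots, each with a single legend
p_fill <- ggplot(mpg, aes(class, fill = drv)) + geom_bar()
p_col  <- ggplot(mpg, aes(displ, hwy, colour = factor(year))) + geom_point()

leg1 <- get_legend(p_fill)
leg2 <- get_legend(p_col)

# Exported equivalent of gtable:::cbind_gtable(leg1, leg2, "first")
legends <- cbind(leg1, leg2, size = "first")

grid::grid.newpage()
grid::grid.draw(legends)
```

The combined gtable can be assigned into the plot's legend slot in the same way as above.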


Now that ggplot2 has secondary axis support this has become much much easier in many (but not all) cases. No grob manipulation needed.

Even though it is supposed to only allow for simple linear transformations of the same data, such as different measurement scales, we can manually rescale one of the variables first so that both fit on the primary axis, and then let the secondary axis undo that rescaling.

library(tidyverse)

max_stones <- max(d1$stones)
max_revenue <- max(d1$revenue)

d2 <- gather(d1, 'var', 'val', stones:revenue) %>% 
  mutate(val = if_else(var == 'revenue', as.double(val), val / (max_stones / max_revenue)))

ggplot(mapping = aes(clarity, val)) +
  geom_bar(aes(fill = cut), filter(d2, var == 'revenue'), stat = 'identity') +
  geom_point(data = filter(d2, var == 'stones'), col = 'red') +
  facet_grid(~cut) +
  scale_y_continuous(sec.axis = sec_axis(trans = ~ . * (max_stones / max_revenue),
                                         name = 'number of stones'),
                     labels = dollar) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        axis.text.y = element_text(color = "#4B92DB"),
        axis.text.y.right = element_text(color = "red"),
        legend.position="bottom") +
  ylab('revenue')

It also works nicely with facet_wrap:

Other complications, such as scales = 'free' and space = 'free', are also handled easily. The only restriction is that the relationship between the two axes must be the same for all facets.
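The secondary-axis approach can also give the two-part legend that the original question asks for. The following is a minimal sketch, assuming ggplot2 2.2.0 or later: mapping a constant string to fill and to colour inside aes() creates one legend entry per layer. The names `d` and `k` and the legend labels are illustrative choices, not taken from the answers above.

```r
library(ggplot2)

# Revenue and stone counts per clarity level, as in the question
d <- data.frame(
  clarity = factor(levels(diamonds$clarity), levels = levels(diamonds$clarity)),
  revenue = as.vector(tapply(diamonds$price, diamonds$clarity, sum)),
  stones  = as.vector(table(diamonds$clarity))
)

# Factor that stretches the stones range onto the revenue range
k <- max(d$revenue) / max(d$stones)

p <- ggplot(d, aes(x = clarity)) +
  # Constant strings inside aes() become legend entries
  geom_col(aes(y = revenue, fill = "Revenue")) +
  geom_point(aes(y = stones * k, colour = "Number of stones"), size = 4) +
  scale_y_continuous(
    name = "revenue",
    # The secondary axis undoes the rescaling applied to stones
    sec.axis = sec_axis(~ . / k, name = "number of stones")
  ) +
  scale_fill_manual(name = NULL, values = c("Revenue" = "#4B92DB")) +
  scale_colour_manual(name = NULL, values = c("Number of stones" = "red")) +
  theme(legend.position = "bottom")

print(p)
```

Because fill and colour are separate scales, each layer gets its own legend without any grob manipulation.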


Starting with ggplot2 2.2.0 you can add a secondary axis like this (taken from the ggplot2 2.2.0 announcement):

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() + 
  scale_y_continuous(
    "mpg (US)", 
    sec.axis = sec_axis(~ . * 1.20, name = "mpg (UK)")
  )



