厚缊

诹图——条形图(四)

厚缊 / 2019-04-02


绘制图形时,有时需要根据某个因子(factor)进行分面(facet),即根据因子水平拆分原数据,再分别绘制图形。ggplot2包可以通过facet_wrap()facet_grid()函数简单解决分面问题,但高级的绘图函数往往以损失灵活性为代价,自动分面难以控制每个分面图形的绘图细节。例如,ggplot2通过labeller()控制每个分面标签,当需要扩展时也会显得比较麻烦,更为严重的是,当向每个分面添加不同文本时,需要借助grid包从底层上来寻找解决方案。因此,绘制高度定制化的分面图时形基础绘图系统并不显得劣势。

由于原作者没有提供此图的调查数据,这里利用iris数据集手动生成了虚拟的ggplot2测试数据,说明利用高级绘图系统生成高度定制化的分面图形并不一定更简洁。

library(ggplot2)
library(ggthemes)
set.seed('20190402')
df <- dplyr::sample_n(iris, size = 500, replace = TRUE)
df[['group']] <- sample(letters[1:11], 500, replace = TRUE)
df[['cat']] <- sample(LETTERS[1:4], 500, replace = TRUE)
cols <- c('A' = "cornsilk4", 'B' = "cornsilk1", 
          'C' = "lightpink", 'D' = "palevioletred4")
ggplot(df, aes(Species, fill = cat) ) + 
  geom_bar( position = position_fill(), 
            color = 'black', size = 0.2) + 
  scale_fill_manual( values = cols) +
  facet_grid(group~.) + 
  coord_flip(ylim = c(0, 1), expand = FALSE) +
  labs(title = 'Reading attitude',
       subtitle = 'How much do you disagree or agree with these statements about reading?') +
  theme( plot.title = element_text(family = 'Arial Black', 
                              size = 16, lineheight = 10),
         plot.subtitle = element_text(size = 12, 
                                      lineheight = 8, 
                                      colour = 'grey40'),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        plot.background = element_rect(fill = 'grey95'),
        legend.position = 'bottom',
        legend.direction = 'horizontal',
        legend.title = element_blank(),
        legend.background = element_rect(fill = 'grey95'),
        legend.spacing.x = grid::unit(0.8, units = 'cm')
        )

分面条形图

老规矩,先上效果图。图中是一份针对阅读态度的调查,共11个问题,每个问题是一个分面,其中每个分面中包含美国、墨西哥和加拿大三个国家的调查结果。每个分面的左上角标注调查问题,左侧标注国家名称、持Strongly disagree和Disagree态度的人员所占比例,右侧标注持Agree和Strongly agree态度的人员所占比例。

绘图代码

整幅图的实现思路比较简单,大概可以分为以下五个步骤:

  1. 设置par()参数mfcol = c(12, 1),将绘图区等分为12行、1列,行数比主图11个分面多一个,主要是预留一个绘图空间添加坐标轴和图例。

  2. 通过for循环函数分别绘制11个分面主图,其中每次循环提取data_source中的三行数据,然后调用barplot()函数绘制条形图,然后用paste0()函数生成主图的左(llab)右(rlab)标签,并通过text()手动添加标签,注意,添加标签时会溢出绘图区,需要设置xpd = TRUE。最后通过mtext()函数添加每个分面的调查问题名称。

  3. 在第十二个预留绘图区内添加坐标轴和坐标轴标签,这里使用了最原始的方法,先用barplot()函数绘制了一个空图,注意X轴范围须设置的与主图相同,然后用segments()text()函数分别添加坐标轴和坐标轴标签(当然,可以直接调用axis()函数添加坐标轴及标签)。

  4. 图例仍然使用points()函数手动添加,主要是方便控制摆放位置。图例摆放位置的坐标根据全图的位置大致判断,出图后位置有偏移或溢出,可以进行适当修正,完全是凭经验判断。

  5. 最后是使用mtext()函数添加标题、副标题、数据来源等信息,当这些辅助信息绘制在外边距(outer margin)上时,要设置outer = TRUE

barchart09 <- 'barchart09.pdf'
Cairo::CairoPDF(barchart09, bg = 'grey95', width = 12, height = 19)
data_source <- matrix(
  c(18, 32, 34, 16, 22, 37, 33, 8, 25, 36, 24, 15,
    32, 38, 21, 9, 17, 35, 37, 11, 26, 36, 26, 12,
    25, 34, 33, 8, 20, 33, 38, 9, 26, 31, 33, 10,
    26, 43, 23, 8, 22, 39, 31, 8, 29, 43, 19, 9,
    27, 35, 31, 7, 14, 29, 43, 14, 24, 26, 37, 13,
    35, 39, 14, 12, 45, 43, 7, 5, 39, 37, 13, 11,
    19, 28, 35, 18, 17, 37, 38, 8, 21, 27, 33, 19,
    17, 36, 34, 13, 12, 32, 40, 16, 22, 39, 27, 12,
    32, 39, 16, 13, 31, 46, 18, 5, 39, 37, 13, 11,
    21, 28, 40, 11, 10, 27, 46, 17, 21, 26, 41, 12,
    27, 39, 25, 9, 21, 31, 35, 13, 28, 32, 29, 11),
  ncol = 4, byrow = TRUE
)
question <- c("I read only if I have to.",
              "Reading is one of my favorite hobbies.",
              "I like talking about books with other people.",
              "I find it hard to finish books.",
              "I feel happy if I receive a book as a present.",
              "For me, reading is a waste of time.",
              "I enjoy going to a bookstore or a library.",
              "I read only to get information that I need.",
              "I cannot sit still and read for more than a few minutes.",
              "I like to express my opinions about books I have read.",
              "I like to exchange books with my friends.") 
opar <- par(no.readonly = TRUE)
par(
  omi = c(1, 0.5, 1.8, 0.5),
  mai = c(0.1, 1.45, 0.35, 0.6),
  las = 1,
  family = 'Arial',
  mfcol = c(12, 1)
)
col_name <- c("palevioletred4", "lightpink", "cornsilk1", "cornsilk4")
for (i in seq_len(11)) {
    idx <- 1:3 + 3 * (i - 1)
    data <- t(data_source[idx, ])
    llab <- paste0(c('USA', 'Mexico', 'Canada'), ' - ', 
                   paste0(data[1, ] + data[2, ], '%'))
    rlab <- paste0(data[3, ] + data[4, ], '%')
    y <- barplot(data, horiz = TRUE, cex.names = 2, xlim = c(0, 100),
                 col = col_name, axes = FALSE)
    text(rep(-2, 3), y, llab, adj = 1, cex = 1.6, 
         col = 'grey40', xpd = TRUE)
    text(rep(102, 3), y, rlab, adj = 0, cex = 1.6, 
         col = 'grey40', xpd = TRUE)
    mtext(question[i], side = 3, adj = 0.01, line = 0.05, 
          cex = 1.6, col = 'grey40')
}
    par(mai = c(0.1, 1.45, 0.1, 0.6))
    y <- barplot(data, horiz = TRUE, border = NA, cex.names = 2,
                 col = '#00000000', xlim = c(0, 100), axes = FALSE)
    segments(0, 3.6, 100, 3.6, col = 'grey40', lty = 1.2, xpd = TRUE)
    segments(seq(0, 100, length.out = 5), rep(3.6, 5),
             seq(0, 100, length.out = 5), rep(3.45, 5), 
             col = 'grey40', lty = 1.2, xpd = TRUE)
    text(seq(0, 100, length.out = 5), rep(3.2, 5), 
         labels = seq(0, 100, length.out = 5),
         cex = 1.6, col = 'grey40', xpd = TRUE)
    points(seq(0, 75, length.out = 4) + 1, rep(1.6, 4), pch = 15,
           col = col_name, cex = 4)
    legend_lab <- c("Strongly disagree","Disagree","Agree","Strongly agree")
    text(seq(3, 79, length.out = 4), rep(1.6, 4), labels = legend_lab, 
         adj = c(0, 0.5), cex = 1.6, col = 'grey40')


mtext('Reading attitude', side = 3, line = 7, adj = 0, cex = 3, 
      family = 'Arial Black', outer = TRUE)
mtext('How much do you disagree or agree with these statements about reading?',
      side = 3, line = 3.5, adj = 0, cex = 1.8, 
      col = 'grey40', outer = TRUE)
mtext('Source: PISA 2009 Assessment Framework – Key Competencies in Reading, Mathematics, and Science',
      side = 1, line = 1, adj = 1, cex = 1.1, col = 'grey40', 
      outer = TRUE)
mtext('© OECD 2009, Data: bryer.org', side = 1, line = 3.5, adj = 1,
      cex = 1.1, col = 'grey40', outer = TRUE)
dev.off()
par(opar)