Correlation heatmaps can be made easily using the
ggcorrhm() function. ggcorrhm() is a wrapper
for (and shares most of its arguments with) gghm() and
calculates the correlation matrix of the input to make a heatmap with
gghm(). If one matrix is provided the column-column
correlations are plotted. If two matrices are provided, the correlations
between the columns in the two matrices are plotted. By using
cor_method and cor_use, different
method and use arguments can be passed to
stats::cor().
library(ggcorrheatmap)
ggcorrhm(mtcars)
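To make the two input modes concrete, the sketch below computes the corresponding matrices with stats::cor() directly in base R. It only illustrates what ends up being plotted and is not ggcorrhm()'s internal code; the split of mtcars into two column sets is arbitrary, and the method and use values are example choices.

```r
# One matrix: column-column correlations (11 x 11 for mtcars)
cor_one <- stats::cor(mtcars, method = "pearson", use = "everything")

# Two matrices: correlations between the columns of x and the columns of y
x <- mtcars[, c("mpg", "hp", "wt")]
y <- mtcars[, c("disp", "qsec", "drat", "cyl")]
cor_two <- stats::cor(x, y, method = "spearman", use = "pairwise.complete.obs")

dim(cor_one)
dim(cor_two)
```

With ggcorrhm(), the same choices would be made through the cor_method and cor_use arguments.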

Changing the colours
ggcorrhm() sets the fill scale to use
ggplot2::scale_fill_gradient2() with limits set to [-1, 1].
Unique to ggcorrhm() are the arguments high,
mid and low to control the legend colours.
Like with gghm(), the bins and
limits arguments create a binned scale and change the
limits, respectively.
ggcorrhm(mtcars, high = "green", low = "magenta", mid = "yellow", bins = 4L)
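For readers who want to see what such a scale amounts to, here is a rough hand-built equivalent in plain ggplot2. It is a sketch under assumptions rather than the package's implementation: the long-format reshaping, the column names Var1, Var2 and Freq (as produced by as.table()), and the midpoint of 0 are choices made for this illustration.

```r
library(ggplot2)

cor_mat <- cor(mtcars)
# Long format: one row per cell, with columns Var1, Var2 and Freq
cor_long <- as.data.frame(as.table(cor_mat))

p <- ggplot(cor_long, aes(x = Var2, y = Var1, fill = Freq)) +
  geom_tile() +
  # Diverging scale centred on 0 and fixed to the range of a correlation
  scale_fill_gradient2(low = "magenta", mid = "yellow", high = "green",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Correlation")
p
```

Fixing the limits to [-1, 1] keeps the colours comparable between plots, even when the observed correlations cover only part of that range.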
If complete control is needed, the col_scale argument
can be used to overwrite the default scale. This can be a string
specifying a Brewer or Viridis scale, or a ggplot2 scale
object.
# patchwork is needed to combine plots with `+`
library(patchwork)
plt1 <- ggcorrhm(mtcars, col_scale = "RdYlGn")
# Arguments like bins and limits still work with Brewer and Viridis scales
plt2 <- ggcorrhm(mtcars, col_scale = "PuOr", bins = 5L)
plt1 + plt2
Changing the layout
For square heatmaps with identical dimnames (from here on just
referred to as square heatmaps), the layout argument can be
used to get triangular layouts. The possible layouts are full matrix,
top left, top right, bottom left, and bottom right. Note that while the
top right and bottom left layouts are just the top right and bottom left
triangles of the square matrix, top left and bottom right layouts are
actually the bottom left and top right layouts with a flipped y axis to
retain the diagonal. This means that some layouts may have the y axis
order reversed.
plt1 <- ggcorrhm(mtcars, layout = "topleft")
# The diagonal is displayed by default but can be hidden using the `include_diag` argument
# Layouts can be specified by the first letters e.g. br, tl, f (for full), w (whole, same as full)
plt2 <- ggcorrhm(mtcars, layout = "br", include_diag = FALSE)
plt1 + plt2
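The relationship between the layouts can be mimicked on a plain matrix. The base R sketch below only illustrates the idea described above (mask one triangle, then reverse the row order) and is not ggcorrhm()'s actual code; a four-column subset of mtcars is used to keep the printed output small.

```r
cor_mat <- cor(mtcars[, 1:4])

# Bottom-left layout: keep the lower triangle (and diagonal), blank the rest
bottom_left <- cor_mat
bottom_left[upper.tri(bottom_left)] <- NA

# Top-left layout: the same cells, drawn with the row (y axis) order reversed
top_left <- bottom_left[nrow(bottom_left):1, ]

round(bottom_left, 2)
round(top_left, 2)
```

Both matrices contain the same correlations; only the row order differs, which is why the y axis of a top left or bottom right plot can appear reversed.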
Label customisation
For square matrices, the axis names are drawn on the diagonal. The
show_names_diag, show_names_rows and
show_names_cols arguments control where names should be
drawn.
The appearance of the names displayed on the diagonal of a square matrix
can be adjusted with the names_diag_params argument, which
takes a named list of parameters to pass to
ggplot2::geom_text(). For names displayed on the x and y
axes, use ggplot2::theme() on the generated plot object to
change the appearance.
plt1 <- ggcorrhm(mtcars, layout = "bl", legend_order = NA, include_diag = FALSE,
                 names_diag_params = list(
                   angle = 45, vjust = 0.5, hjust = 0.5, colour = "magenta"
                 ))
# names_diag_params also takes vectorised input (e.g. one value per name)
set.seed(123)
plt2 <- ggcorrhm(mtcars, layout = "br", include_diag = FALSE,
                 names_diag_params = list(
                   angle = c(0, rep(-45, 9), -90),
                   hjust = 0.7,
                   colour = sample(c("red", "green", "blue"), 11, TRUE)
                 ))
plt1 + plt2
Legend position
The legend can be moved around by adding to the theme of the output
plot using ggplot2::theme(). Use the
legend.position.inside argument to move the legend to the
plotting area to make use of the empty space of triangular layouts.
# theme() and element_text() come from ggplot2
library(ggplot2)
plt1 <- ggcorrhm(mtcars) +
  theme(legend.position = "top",
        legend.title = element_text(vjust = 0.8))
plt2 <- ggcorrhm(mtcars, layout = "br") +
  theme(legend.position = "inside",
        legend.position.inside = c(0.3, 0.75))
plt1 + plt2
Clustering and annotation
Just like with gghm(), the heatmap can be clustered and
annotated. Triangular layouts limit the positions of annotations and
dendrograms to the non-empty sides of the heatmap.
set.seed(123)
# Make a correlation heatmap with a triangular layout, annotations and clustering
row_annot <- data.frame(.names = colnames(mtcars),
                        annot1 = sample(letters[1:3], ncol(mtcars), TRUE),
                        annot2 = rnorm(ncol(mtcars)))
ggcorrhm(mtcars, layout = "tl",
         cluster_rows = TRUE, cluster_cols = TRUE,
         show_dend_rows = FALSE,
         annot_rows_df = row_annot,
         annot_rows_names_side = "top")
Cell text
The cell_labels argument labels cells with their values
or some user-supplied values, explained more in the heatmap article. If
cell_labels is a matrix/data frame it should be in the
shape of the matrix being plotted (using the same row and column names),
which for ggcorrhm() is the correlation matrix. Keep in
mind that in triangular layouts the top left and bottom right
triangles display the bottom left and top right triangles with the y
axis order flipped, and this is reflected in the cell labels too.
lab_mat <- matrix(1:121, nrow = 11, ncol = 11, byrow = TRUE,
                  dimnames = list(colnames(mtcars), colnames(mtcars)))
plt_list <- lapply(c("tl", "tr", "bl", "br"), function(lt) {
  # Play a bit with the legend for the last plot
  legend_order <- if (lt == "br") 1 else NA
  ggcorrhm(mtcars, layout = lt, legend_order = legend_order,
           cell_labels = lab_mat, show_names_diag = FALSE) +
    theme(legend.position = "inside",
          legend.position.inside = c(0.025, 1.075))
})
wrap_plots(plt_list)
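A label matrix with the required shape is easiest to build from the correlation matrix itself, since the row and column names then match automatically. The sketch below uses squared correlations as labels; that choice is purely illustrative and not something the package prescribes.

```r
cor_mat <- cor(mtcars)

# Arithmetic on the correlation matrix keeps its dimensions and dimnames,
# so the result lines up cell by cell with what is plotted
lab_mat <- round(cor_mat^2, 2)

identical(dimnames(lab_mat), dimnames(cor_mat))
```

A matrix built this way could then be supplied as cell_labels in the same way as lab_mat in the example above.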
Cell shape
Correlation heatmaps are sometimes plotted with circles. Passing a
value from 1 to 25 to the mode argument allows for
different cell shapes. Shapes 21-25 support filling and use fill scales
while 1-20 are not filled and use colour scales (relevant if a scale
object is passed). The shape size is set to vary between the values in
the size_range argument (not found in gghm()),
scaling with the absolute value of the correlation.
size_scale can be provided to overwrite this behaviour. The
size legend is hidden by default when using ggcorrhm() as
it would only show the absolute values of the correlations (but the
legend can be shown with the legend_order argument if
necessary).
Setting mode to ‘text’ writes the values in empty cells,
with the text colour scaling with the correlation values.
# If mode is a number, the R pch symbols are used, meaning 21 produces filled circles.
plt1 <- ggcorrhm(mtcars, mode = 21) + labs(title = "mode = 21")
# It is also possible to use shapes other than circles. Only 21-25 support filling.
# Change the size range using the size_range argument
plt2 <- ggcorrhm(mtcars, mode = 23, size_range = c(2, 7)) +
  # Can add a grid in the background with ggplot2::theme()
  theme(panel.grid.major = element_line(colour = "grey90")) +
  labs(title = "mode = 23")
plt1 + plt2
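The size mapping can be pictured as a linear rescaling of the absolute correlation onto size_range. The helper below, size_from_cor(), is hypothetical and written only for this illustration; the exact mapping used by the package's size scale may differ, for example in the domain it rescales from.

```r
# Hypothetical helper: map |r| in [0, 1] linearly onto a size range
size_from_cor <- function(r, size_range = c(2, 7)) {
  size_range[1] + abs(r) * (size_range[2] - size_range[1])
}

sizes <- size_from_cor(c(-1, -0.5, 0, 0.5, 1))
sizes
```

Because the absolute value is used, correlations of -0.5 and 0.5 get the same size, which is why a size legend would only show absolute values.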
It’s also possible to draw mixed layouts by providing two values to
layout and mode. See the mixed layouts article for more details.

P-values
ggcorrhm() adds support for correlation p-values which
can be calculated by setting the p_values argument to
TRUE. The values can be adjusted for multiple testing by
passing a string specifying the adjustment method to the
p_adjust argument. If the correlation matrix is square, the
diagonal is excluded and only half of the off-diagonal values are used
for the adjustment. The calculated p-values will be included in the data
if return_data is TRUE. P-values and adjusted
p-values for the correlations of the diagonal in a square matrix are set
to 0 as the correlations are always 1.
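A base R sketch of this procedure follows, for illustration only: stats::cor.test() is used as a stand-in for however the package computes its p-values, and the variable names are made up. It shows why only one triangle enters the adjustment: in a square correlation matrix every pair appears twice, so adjusting all off-diagonal cells would double the number of tests.

```r
vars <- mtcars[, c("mpg", "hp", "wt", "qsec")]
n <- ncol(vars)

# Pairwise p-values; the diagonal stays at 0 (the correlation is always 1)
p_mat <- matrix(0, n, n, dimnames = list(names(vars), names(vars)))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    p_mat[i, j] <- p_mat[j, i] <- stats::cor.test(vars[[i]], vars[[j]])$p.value
  }
}

# Adjust using one triangle only, so each pair counts as a single test
p_adj <- p_mat
upper <- upper.tri(p_mat)
p_adj[upper] <- stats::p.adjust(p_mat[upper], method = "bonferroni")
# Mirror the adjusted values into the other triangle
p_adj[lower.tri(p_adj)] <- t(p_adj)[lower.tri(p_adj)]

p_adj
```

With four variables there are six unique pairs, so the Bonferroni adjustment multiplies each p-value by 6 (capped at 1) rather than by 12.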
By default, cells are marked with asterisks if p_values
is TRUE. The p_thresholds argument controls
the p-value thresholds. p_thresholds must be NULL (for no
thresholds or symbols) or a named numeric vector where the values
specify the thresholds (in ascending order) and the names are the
symbols to use when an adjusted p-value is below each threshold.
stats::symnum() is used to convert p-values and the last
value of p_thresholds must be 1 (or any higher number) to
set the upper bound. By default, p_thresholds is
c("***" = 0.001, "**" = 0.01, "*" = 0.05, 1). The final 1 is
left unnamed so that p-values between 0.05 and 1 are
unmarked.
# P-value symbols added to the plot
plt1 <- ggcorrhm(mtcars, p_values = TRUE, p_adjust = "bonferroni")
# Changing the symbols
plt2 <- ggcorrhm(mtcars, p_values = TRUE, p_adjust = "bonferroni",
                 p_thresholds = c("+" = 0.01, "*" = 0.05, "ns" = 1))
plt1 + plt2
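To see how a p_thresholds vector relates to stats::symnum(), the sketch below translates the default thresholds into symnum() arguments by hand. Prepending 0 as the lower bound and using the vector's names as symbols is an assumption about what happens internally, and the p-values are made-up example numbers.

```r
p_thresholds <- c("***" = 0.001, "**" = 0.01, "*" = 0.05, 1)
p_adj <- c(0.0004, 0.004, 0.03, 0.4)

sym <- stats::symnum(p_adj,
                     cutpoints = c(0, unname(p_thresholds)),
                     # The unnamed threshold gets "" as its symbol (no mark)
                     symbols = names(p_thresholds),
                     corr = FALSE, legend = FALSE, na = FALSE)

# Drop the noquote class and attributes to get a plain character vector
symbols_out <- as.vector(unclass(sym))
symbols_out
```

symnum() needs one more cutpoint than it has symbols, which is why the last threshold has to be 1 (or higher) to close the final interval.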
In a mixed layout it is possible to display p-values in only half of
the plot by passing a vector or list of length two to the
p_values argument. If cell_labels is
TRUE (so that correlation values are written in the cells),
the symbols from p_thresholds are added at the end of the
correlation values.
# P-values only in the bottom right, mode 'none' for labels with empty background
ggcorrhm(mtcars, layout = c("tl", "br"), mode = c("hm", "none"), p_adjust = "bonferroni",
         p_values = c(FALSE, TRUE), cell_labels = c(FALSE, TRUE))
It is also possible to write out the p-values instead of correlation
values by setting both cell_labels and
cell_label_p to TRUE.
# P-values and labels only in one half, correlation labels in the other
ggcorrhm(mtcars, layout = c("tl", "br"), mode = c("hm", "none"), p_adjust = "bonferroni",
         p_values = c(FALSE, TRUE), cell_labels = TRUE,
         cell_label_p = c(FALSE, TRUE))
This is also carried over to ‘text’ mode.
# Same as previous but text mode
ggcorrhm(mtcars, layout = c("tl", "br"), mode = c("hm", "text"), p_adjust = "bonferroni",
         p_values = c(FALSE, TRUE),
         # In text mode the labels are displayed regardless so cell_labels can be FALSE
         cell_labels = c(TRUE, FALSE),
         cell_label_p = c(FALSE, TRUE))
By returning the data and using ggplot2 functions it is
possible to be more creative with the p-value marking than
ggcorrhm() allows out of the box.
# mutate(), case_when(), filter() and %>% come from dplyr
library(dplyr)
p <- ggcorrhm(mtcars, p_values = TRUE, p_thresholds = NULL,
              p_adjust = "bonferroni", return_data = TRUE,
              layout = "br", include_diag = FALSE)
# Derive border and point aesthetics from the adjusted p-values
p$plot_data <- mutate(p$plot_data,
                      p_lty = case_when(p_adj < 0.05 ~ 1, TRUE ~ 3),
                      p_lwd = case_when(p_adj < 0.05 ~ 1, TRUE ~ 0.5),
                      p_size = -log10(p_adj))
ggcorrhm(mtcars, border_lty = p$plot_data$p_lty, border_lwd = p$plot_data$p_lwd,
         layout = "br", include_diag = FALSE) +
  geom_point(aes(x = col, y = row, size = p_size),
             data = p$plot_data %>%
               filter(p_adj < 0.05, as.character(row) != as.character(col)),
             shape = "*") +
  scale_size(range = c(4, 8)) +
  labs(size = "-log10(p_adj)")
# Alternatively, cross out the cells that are not significant
p <- ggcorrhm(mtcars, p_values = TRUE, p_thresholds = NULL,
              p_adjust = "bonferroni", return_data = TRUE,
              layout = "tl", include_diag = FALSE)
ggcorrhm(mtcars, layout = "tl", include_diag = FALSE) +
  geom_point(data = p$plot_data %>%
               # Mark cells at or above the threshold
               filter(p_adj >= 0.05, as.character(row) != as.character(col)),
             mapping = aes(x = col, y = row),
             shape = 4, size = 8)
Some extra plots
Triangular layouts require square matrices, but not symmetric ones (even if correlation heatmaps typically are symmetric). This means that the different triangles could show different information, such as the results of different correlation methods.
# Use different correlation methods
pear_cor <- cor(mtcars, method = "pearson")
kend_cor <- cor(mtcars, method = "kendall")
# Combine
cor_df <- pear_cor
cor_df[lower.tri(cor_df)] <- kend_cor[lower.tri(kend_cor)]
plt1 <- ggcorrhm(cor_df, cor_in = TRUE, cell_labels = TRUE,
                 col_name = "Correlation") +
  # Add axis titles
  labs(x = "Kendall tau", y = "Pearson r") +
  theme(axis.title = element_text())
# With different colour scales, although that makes it hard to compare
plt2 <- gghm(cor_df, layout = c("bl", "tr"), mode = c("hm", "hm"),
             col_scale = c("RdBu_rev", "PRGn_rev"),
             col_name = c("Pearson r", "Kendall tau"),
             split_diag = TRUE, limits = c(-1, 1))
plt1 + plt2
Other packages
A benefit of making the plot with ggplot2 is the
possibility of modifying and arranging the plot using
ggplot2 and its many extensions. ggcorrheatmap
is convenient for making simple correlation heatmaps, with options for
extensive customisation of the appearance and layout, and built-in
support for clustering (with dendrograms) and annotation. Another
package for making correlation heatmaps with ggplot2 is
ggcorrplot, which is faster and has more ways of visualising
p-values. If having the plot made with ggplot2 is not
necessary, the corrplot package can make flexible
correlation plots with many alternative visualisations.