There are so many ways to define colour scales within ggplot2. After just loading ggplot2 I count 22 functions beginning with scale_color_* (or scale_colour_*) and the same number beginning with scale_fill_*. Is it possible to briefly name the purpose of the functions below? In particular, I struggle with the differences between some of the functions and when to use each.
- scale_*_binned()
- scale_*_brewer()
- scale_*_continuous()
- scale_*_date()
- scale_*_datetime()
- scale_*_discrete()
- scale_*_distiller()
- scale_*_fermenter()
- scale_*_gradient()
- scale_*_gradient2()
- scale_*_gradientn()
- scale_*_grey()
- scale_*_hue()
- scale_*_identity()
- scale_*_manual()
- scale_*_ordinal()
- scale_*_steps()
- scale_*_steps2()
- scale_*_stepsn()
- scale_*_viridis_b()
- scale_*_viridis_c()
- scale_*_viridis_d()
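For reference, the list above can be reproduced from an R session. This is a minimal sketch that assumes only that ggplot2 is installed; the exact set of names (and therefore the count) depends on the installed version, and the variable names are illustrative.

```r
library(ggplot2)

# All functions exported by ggplot2, filtered by prefix
exports <- getNamespaceExports("ggplot2")
colour_scales <- sort(grep("^scale_colou?r_", exports, value = TRUE))
fill_scales   <- sort(grep("^scale_fill_", exports, value = TRUE))

# Strip the prefix so both spellings (color/colour) collapse to one suffix
unique(sub("^scale_colou?r_", "", colour_scales))
```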
What I tried
I've done some research on the web, but the more I read the more confused I get. To drop some random example: 
The binned example (color steps legend):
cont + scale_color_binned()
The following are the continuous scale functions:
| Name of Function | Colorspace | Legend | What it does |
|---|---|---|---|
| scale_color_continuous() | default | Colorbar | basic scale (as if you did nothing) |
| scale_color_gradient() | user-defined | Colorbar | define low and high values |
| scale_color_gradient2() | user-defined | Colorbar | define low, mid, and high values |
| scale_color_gradientn() | user-defined | Colorbar | define any number of incremental values |
| scale_color_binned() | default | Colorsteps | basic scale, but binned |
| scale_color_steps() | user-defined | Colorsteps | define low and high values |
| scale_color_steps2() | user-defined | Colorsteps | define low, mid, and high values |
| scale_color_stepsn() | user-defined | Colorsteps | define any number of incremental values |
| scale_color_viridis_c() | Viridis | Colorbar | viridis color scale. Change palette via option=. |
| scale_color_viridis_b() | Viridis | Colorsteps | Viridis color scale, binned. Change palette via option=. |
| scale_color_distiller() | Brewer | Colorbar | Brewer color scales. Change palette via palette=. |
| scale_color_fermenter() | Brewer | Colorsteps | Brewer color scale, binned. Change palette via palette=. |
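To make the "user-defined" rows concrete, here is a small sketch using mtcars. The base plot `p`, the chosen colors, and the midpoint of 150 are illustrative assumptions, not values from the original answer.

```r
library(ggplot2)

# Hypothetical base plot: hp is continuous, so a continuous scale applies
p <- ggplot(mtcars, aes(mpg, disp, color = hp)) +
  geom_point(size = 4)

# Two-color gradient: define low and high
p_grad <- p + scale_color_gradient(low = "grey80", high = "darkblue")

# Diverging gradient: define low, mid, high, and where the midpoint sits
p_grad2 <- p + scale_color_gradient2(low = "blue", mid = "white",
                                     high = "red", midpoint = 150)

# Any number of colors, interpolated in order
p_gradn <- p + scale_color_gradientn(colours = c("navy", "gold", "firebrick"))

# Same idea as scale_color_gradient(), but binned (colorsteps legend)
p_steps <- p + scale_color_steps(low = "grey80", high = "darkblue")
```

Printing any of these objects draws the plot; the only difference between `p_grad` and `p_steps` should be that the latter cuts the gradient into discrete bins.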
Discrete Scales
These discrete scales apply only when the data mapped is discrete (see above). Since the nature and colors of discrete scales are more disjointed by definition, these tend to be more manually-defined. We can use the same mtcars example and "force" a discrete scale onto the color by mapping it to cyl wrapped in as.factor():
discrete <- ggplot(mtcars, aes(mpg, disp, color=as.factor(cyl))) + geom_point(size=4)
discrete
The following are discrete scale functions:
| Name of Function | What it does |
|---|---|
| scale_color_discrete() | The basic default. Evenly-spaced hues |
| scale_color_hue() | Same as scale_color_discrete(), but you can define the range of hues and colors used |
| scale_color_grey() | Uses a greyscale. Can define the range. |
| scale_color_manual() | Must define specifically every color used. You can apply to your mapping by supplying a named vector for values=. |
| scale_color_identity() | A special-case function where your data is made up of names of colors - not names of factor levels |
| scale_color_brewer() | The discrete version of the Brewer colorspaces. Change palette via palette=. |
| scale_color_viridis_d() | The discrete version of the viridis colorspaces. Can change palette via option=. |
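The two rows that tend to cause the most confusion are scale_color_manual() and scale_color_identity(), so here is a short sketch of each. The specific colors and the small data frame `df_col` are made up for illustration.

```r
library(ggplot2)

# Manual: a named vector maps each factor level to a color
p_manual <- ggplot(mtcars, aes(mpg, disp, color = as.factor(cyl))) +
  geom_point(size = 4) +
  scale_color_manual(values = c("4" = "steelblue",
                                "6" = "orange",
                                "8" = "firebrick"))

# Identity: the column already contains color names, so use them as-is
df_col <- data.frame(x = 1:3, y = 1:3,
                     col = c("red", "darkgreen", "blue"))
p_identity <- ggplot(df_col, aes(x, y, color = col)) +
  geom_point(size = 4) +
  scale_color_identity()
```

Without scale_color_identity(), the values in `col` would be treated as ordinary factor levels and given the default hue colors.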
Viridis and Brewer Scales
A final note: the tables above list the functions for the Brewer and Viridis palettes. Each family contains a number of color palettes chosen, on the basis of color theory, to better represent ordered and non-ordered data. It's useful to do a little research of your own into color theory as applied to data visualization. There are discrete, continuous, and binned versions in each of the two families, and each has a slightly different way to change the specific palette. You'll have to Google around a bit for pictures of each palette to get a feel for them, but as a quick reference:
| Colorspace | Discrete version | Continuous version | Binned version |
|---|---|---|---|
| Brewer | scale_color_brewer() | scale_color_distiller() | scale_color_fermenter() |
| Viridis | scale_color_viridis_d() | scale_color_viridis_c() | scale_color_viridis_b() |
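As a rough sketch of how the palette is switched in each of the six functions in this table: Brewer functions take palette=, Viridis functions take option=. The palette choices and object names here are arbitrary, and the lookup via RColorBrewer::brewer.pal.info assumes the RColorBrewer package is installed (it normally comes along with ggplot2's dependencies).

```r
library(ggplot2)

# Brewer palette names with their category (seq, div, qual)
brewer_info <- RColorBrewer::brewer.pal.info
head(brewer_info)

base <- ggplot(mtcars, aes(mpg, disp)) + geom_point(size = 4)

# Brewer family: discrete, continuous, binned
p_brewer    <- base + aes(color = as.factor(cyl)) + scale_color_brewer(palette = "Set1")
p_distiller <- base + aes(color = hp) + scale_color_distiller(palette = "Spectral")
p_fermenter <- base + aes(color = hp) + scale_color_fermenter(palette = "Blues")

# Viridis family: discrete, continuous, binned
p_viridis_d <- base + aes(color = as.factor(cyl)) + scale_color_viridis_d(option = "plasma")
p_viridis_c <- base + aes(color = hp) + scale_color_viridis_c(option = "magma")
p_viridis_b <- base + aes(color = hp) + scale_color_viridis_b(option = "cividis")
```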
One final note here: scale_color_ordinal() is the scale ggplot2 applies by default to ordered factors, and with default settings it gives the same result as scale_color_viridis_d(). In practice I don't see a difference between the two.
Date Scales
The final two more esoteric functions are the ones related to date and datetime. These functions are scale_color_date() and scale_color_datetime(), respectively. They are basically the same as the scale_color_continuous() function, but with some convenience wrappers for labeling dates. This is the same relationship that scale_x_date() has with scale_x_continuous().
ggplot(economics, aes(x=date, y=unemploy, fill=date)) + geom_col() + scale_fill_date()
The colors are the same as with scale_fill_continuous(), but note the benefit of scale_fill_date() in formatting the legend labels correctly as dates. For comparison, the continuous version:
ggplot(economics, aes(x=date, y=unemploy, fill=date)) + geom_col() + scale_fill_continuous()
It all makes sense...
Given all of the above, the following error messages, which you have probably seen before, should now make sense:
> discrete + scale_color_continuous()
Error: Discrete value supplied to continuous scale
> cont + scale_color_discrete()
Error: Continuous value supplied to discrete scale
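A small sketch of where this error actually comes from: adding the wrong scale to a plot object succeeds, and the mismatch is only detected when the plot is built (which is what printing does). The object names are hypothetical, and the exact wording of the message varies between ggplot2 versions, so it is captured rather than assumed.

```r
library(ggplot2)

discrete_plot <- ggplot(mtcars, aes(mpg, disp, color = as.factor(cyl))) +
  geom_point(size = 4)

# This line alone does not fail
mismatched <- discrete_plot + scale_color_continuous()

# Building the plot triggers the error; capture its message
result <- tryCatch(
  {
    ggplot_build(mismatched)
    "built without error"
  },
  error = function(e) conditionMessage(e)
)
result
```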
CodePudding user response:
@chemdork123 provided an outstanding answer. I want to add to it, since my question focused on the differences between the numerous color scale functions or, to rephrase it: why do we need them all? Can't we wrap them in fewer functions?
For the moment I made a simple general function that includes scale_*_distiller, scale_*_viridis_c and scale_*_continuous in one function named scale_col_fill_cont. This is how the function works:
(1) The provided palette name is compared to the predefined palette names of the three functions. For example, one of the RColorBrewer palettes is called "Spectral", so if the user provides this palette name the function calls scale_color_distiller(palette= "Spectral", ...). This approach is possible because there is no overlap in the color palette names of the three functions. With this first step we no longer have to choose between scale_*_distiller, scale_*_viridis_c and scale_*_continuous.
(2) I define the aesthetics argument as c("colour", "fill"). This way we do not need to choose whether the aesthetic is a color or a fill, i.e. whether to use scale_color_viridis_c() or scale_fill_viridis_c(), scale_color_continuous() or scale_fill_continuous(), and scale_color_distiller() or scale_fill_distiller().
At the moment the function covers just 6 of the 2*22 functions. But if we apply the logic of step (2) to all of them, we can halve the 44 functions to 22. I will work on a function that includes more color scale functions and update this answer if it works out.
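Step (2) can be illustrated on its own, independent of the wrapper function. In this sketch a single scale object is created with both aesthetics and then added to a plot that uses col and to one that uses fill; the names `both` and `df_demo` and the data values are made up.

```r
library(ggplot2)

# One scale object that applies to colour and fill alike
both <- scale_color_viridis_c(option = "plasma",
                              aesthetics = c("colour", "fill"))
both$aesthetics

df_demo <- data.frame(x = 1:5, y = 1:5, z = c(0.1, 0.4, 0.2, 0.9, 0.6))

# The same object works regardless of which aesthetic the plot maps
p_col  <- ggplot(df_demo, aes(x, y, col = z)) + geom_point() + both
p_fill <- ggplot(df_demo, aes(x, y, fill = z)) + geom_col() + both
```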
Examples
Here is the function in action. First we generate some data and two ggplot2 objects, one which is using col and another that is using fill:
library(ggplot2)
n <- 10
df <- data.frame(a= 1:n, x= rnorm(n), y= rnorm(n), z= rnorm(n))
# ggplot objects
gg_col <- ggplot(data= df, aes(x, y, col= z)) +
  geom_point()
gg_fill <- ggplot(data= df, aes(a, group= a, fill= z)) +
  geom_bar()
Function
Here is the simple function:
scale_col_fill_cont <- function(palette){
  # RColorBrewer palette names (sequential, qualitative, diverging)
  rcb <- c("Blues", "BuGn", "BuPu", "GnBu", "Greens", "Greys", "Oranges", "OrRd", "PuBu",
           "PuBuGn", "PuRd", "Purples", "RdPu", "Reds", "YlGn", "YlGnBu", "YlOrBr", "YlOrRd",
           "Accent", "Dark2", "Paired", "Pastel1", "Pastel2", "Set1", "Set2", "Set3",
           "BrBG", "PiYG", "PRGn", "PuOr", "RdBu", "RdGy", "RdYlBu", "RdYlGn", "Spectral")
  # Brewer palettes can also be selected by number
  rcb_num <- 1:18
  # viridis options, by name or by letter code
  v <- c("magma", "inferno", "plasma", "viridis", "cividis", "rocket", "mako", "turbo",
         "A", "B", "C", "D", "E", "F", "G", "H")
  # Dispatch on the palette name; every branch targets both colour and fill
  if(any(palette == rcb) || any(palette == rcb_num)){
    scale_color_distiller(palette= palette, aesthetics= c("colour", "fill"))
  } else if(any(palette == v)){
    scale_color_viridis_c(option= palette, aesthetics= c("colour", "fill"))
  } else{
    scale_color_continuous(type= palette, aesthetics= c("colour", "fill"))
  }
}
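One possible direction for extending this, sketched here as a hypothetical variant named scale_col_fill_cont2 (not part of the answer above): forward extra arguments through `...` so that options such as direction= still reach the underlying scale, and read the Brewer names from RColorBrewer instead of hard-coding them. It assumes RColorBrewer is installed and, unlike the original, does not handle numeric palette selection.

```r
library(ggplot2)

scale_col_fill_cont2 <- function(palette, ...) {
  both    <- c("colour", "fill")
  brewer  <- rownames(RColorBrewer::brewer.pal.info)
  viridis <- c("magma", "inferno", "plasma", "viridis", "cividis",
               "rocket", "mako", "turbo", LETTERS[1:8])

  if (palette %in% brewer) {
    scale_color_distiller(palette = palette, aesthetics = both, ...)
  } else if (palette %in% viridis) {
    scale_color_viridis_c(option = palette, aesthetics = both, ...)
  } else {
    # Anything else is passed on as the type of the default continuous scale
    scale_color_continuous(type = palette, aesthetics = both, ...)
  }
}

# Extra arguments reach the underlying scale, e.g. reversing the direction
s_brewer  <- scale_col_fill_cont2("Spectral", direction = 1)
s_viridis <- scale_col_fill_cont2("rocket", direction = -1)
```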
Viridis
The following code shows that my function scale_col_fill_cont gives the same output as using scale_color_viridis_c() or scale_fill_viridis_c() directly.
# Applying the functions to a col aesthetic
gg_col +
  scale_col_fill_cont("rocket")
gg_col +
  scale_color_viridis_c(option= "rocket")
# Applying the functions to a fill aesthetic
gg_fill +
  scale_col_fill_cont("plasma")
gg_fill +
  scale_fill_viridis_c(option= "plasma")
RColorBrewer
The following code shows that my function scale_col_fill_cont gives the same output as using scale_color_distiller() or scale_fill_distiller() directly.
# Applying the functions to a col aesthetic
gg_col +
  scale_col_fill_cont("Spectral")
gg_col +
  scale_color_distiller(palette= "Spectral")
# Applying the functions to a fill aesthetic
gg_fill +
  scale_col_fill_cont(3)
gg_fill +
  scale_fill_distiller(palette= 3)
scale_*_continuous()
The following code shows that my function scale_col_fill_cont gives the same output as using scale_color_continuous() or scale_fill_continuous() directly.
# Applying the functions to a col aesthetic
gg_col +
  scale_col_fill_cont("gradient")
gg_col +
  scale_color_continuous(type= "gradient")
# Applying the functions to a fill aesthetic
gg_fill +
  scale_col_fill_cont("gradient")
gg_fill +
  scale_fill_continuous(type= "gradient")




