What exactly is ggplot doing with the `+` operator?

I started using R recently, and have been confused by ggplot, which my class is using. I'm used to the + operator just adding two values, but I see that in ggplot you can do things such as:

ggplot(data = bechdel, aes(x = domgross_2013)) +
  geom_histogram(bins = 10, color="purple", fill="white") +
  labs(title = "Domestic Growth of Movies", x = " Domestic Growth")

Here we are adding two function calls together. What exactly is happening here? Is ggplot "overriding" the + operator (maybe like how you can override the == operator in Dart?) in order to do something different? Or is it that the '+' operator means something different in R than I am used to with other programming languages?

A second, unrelated question: how come we are able to pass the column we would like to map to the x value (domgross_2013)? It seems to be passed like a variable rather than a string.

CodePudding user response：

I'll answer the first question. You should ask the second question in a separate posting.

R lets you override most operators. The easiest way to do it is using the "S3" object system. This is a very simple system where you attach an attribute named "class" to the object, and that affects how R processes some functions. (The ones this applies to are called "generic functions". There are other functions that don't pay any attention to the class.)
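
As a minimal sketch of the S3 idea in base R (the "temperature" class and its fields are made up for illustration, not taken from any package), attaching a class attribute is enough for a generic function such as format() or print() to pick a matching method:

```r
# Hypothetical example class; "temperature" is not part of any package.
temp <- structure(list(celsius = 21.5), class = "temperature")

# format() is a generic function, so R looks for a method named
# format.<class> based on the object's "class" attribute.
format.temperature <- function(x, ...) {
  paste0(x$celsius, " degrees C")
}

# print() is also generic; this method reuses the format() method above.
print.temperature <- function(x, ...) {
  cat(format(x), "\n")
  invisible(x)
}

print(temp)                    # dispatches to print.temperature
inherits(temp, "temperature")  # checks the class attribute
```

Removing the class attribute (for example with unclass(temp)) leaves an ordinary list, and the same calls fall back to the default methods.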

Each ggplot2 function returns an object with a class, which you can inspect with the class() function. For example, class(ggplot(data = mtcars)) is the character vector c("gg", "ggplot"), and class(geom_histogram(bins = 10, color="purple", fill="white")) is the vector c("LayerInstance","Layer","ggproto","gg"). (The exact class vectors can differ between ggplot2 versions.)

If you ask for methods("+") you'll see all the classes with methods defined for addition, and that includes "gg", so R will call that method to process the addition in the expression you used.
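
To see the same mechanism without ggplot2, here is a rough sketch in base R. The names toy_plot, toy_layer and toy_gg are hypothetical, and the real ggplot2 method is considerably more involved; the point is only that a shared class plus one `+` method is enough to make "plot + layer" work:

```r
# Hypothetical classes: both share "toy_gg" so one `+` method serves both,
# similar in spirit to how ggplot2 objects share the "gg" class.
toy_plot <- function() {
  structure(list(layers = list()), class = c("toy_plot", "toy_gg"))
}
toy_layer <- function(name) {
  structure(list(name = name), class = c("toy_layer", "toy_gg"))
}

# R dispatches `+` to this method because the operands have class "toy_gg".
# Here "adding" means appending the right-hand object to the plot's layers.
`+.toy_gg` <- function(e1, e2) {
  e1$layers <- c(e1$layers, list(e2))
  e1
}

p <- toy_plot() + toy_layer("histogram") + toy_layer("labels")
layer_names <- vapply(p$layers, function(l) l$name, character(1))
print(layer_names)
```

Nothing is added numerically: the method simply returns a modified copy of the left-hand object, which is why a chain of `+` calls accumulates layers.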

CodePudding user response：

The + operator is part of the philosophy of ggplot2. It's inspired by The Grammar of Graphics, which is worth reading. Essentially, you build a plot by adding one layer on top of another.

Try taking this one step at a time in your code and it should make sense!

library(ggplot2)

one <- ggplot2::ggplot(data = mtcars) +
  labs(title = "Mtcars", subtitle = "Blank Canvas")

two <- ggplot2::ggplot(data = mtcars, aes(x = mpg)) +
  labs(title = "Mtcars", subtitle = "+ Aesthetic Mapping")

three <- ggplot2::ggplot(data = mtcars, aes(x = mpg, y = after_stat(count))) +
  geom_histogram()

library(patchwork)
one + two + three

CodePudding user response：

For the second question, the reason you can pass x like a variable rather than a string is non-standard evaluation (NSE). Effectively, the function arguments are captured as unevaluated expressions rather than being evaluated immediately, and are then evaluated later in a scope where the names exist. For example, quote() captures its input as-is instead of looking up a value for mpg. We can then evaluate the captured expression in another environment, such as the mtcars data frame.

> var <- quote(mpg)
> var
mpg

> eval(var, mtcars)
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4

We can make a similar use of NSE within functions:

f <- function(x) {
  input <- substitute(x)
  print(input)
  eval(input, mtcars)
}

Here, we capture whatever was passed to the argument, and then execute it in the scope of the mtcars data frame.

> f(cyl)
cyl
 [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
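
To connect this back to aes(x = domgross_2013), here is a rough sketch of an aes-like function in base R. toy_aes() is a hypothetical stand-in: ggplot2's real aes() is implemented differently (it builds on rlang), but the idea of capturing expressions first and evaluating them against the data later is the same:

```r
# toy_aes() is a hypothetical stand-in for aes(); it captures, never evaluates.
toy_aes <- function(...) {
  # substitute(list(...)) yields the unevaluated call list(x = mpg, ...);
  # dropping its first element leaves only the captured argument expressions.
  as.list(substitute(list(...)))[-1]
}

mapping <- toy_aes(x = mpg, y = wt * 1000)
print(mapping$x)  # the symbol mpg, not the column's values

# Later, evaluate each captured expression with the data frame's columns in scope.
values <- lapply(mapping, function(expr) eval(expr, mtcars))
head(values$x, 3)
```

Because the expressions are stored unevaluated, no variable called mpg has to exist when toy_aes() is called; the lookup only happens once a data frame is supplied.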

You can read more about this at the above link and here.