I am trying to learn about Andrews plots by doing. I know R has the andrews package that uses the base plotting system, but I wanted to use ggplot2. I followed the 
group=Row produces one line in the plot for each row of the data frame. colour=Species colours each line according to the corresponding value of Species.
[There's probably a neater way than uncount() %>% add_column() of replicating each row of iris for each value of t using expand(), but I couldn't get it to work.]
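For what it's worth, one way to get every row paired with every value of t is a plain cross join. This is only a sketch, and it swaps in base R's merge() with by = NULL rather than expand(); the names iris_rows, t_grid and long are made up for illustration, and the Row column is assumed to be a simple row id.

```r
# Assumed setup: iris plus a Row id column (hypothetical name, matching group=Row)
iris_rows <- cbind(Row = seq_len(nrow(iris)), iris)

# A small grid of t values; 5 points keeps the result easy to inspect
t_grid <- data.frame(t = seq(-pi, pi, length.out = 5))

# by = NULL asks merge() for the Cartesian product:
# every row of iris_rows is paired with every value of t
long <- merge(iris_rows, t_grid, by = NULL)

dim(long)
```

Each Row then appears once per value of t, which is the shape needed before computing the curve values.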
CodePudding user response:
It seems to me that the oversight is that matrices in R are column-major, whereas in Python they are row-major. In addition, you had hard-coded the iris dataset in your df_out specification, even though iris was not a formal argument of the function.
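To see the column-major point in isolation, here is a tiny sketch on a toy matrix (m and m_byrow are just illustrative names). Flattening a matrix with as.vector() walks down the columns, and row() / col() give the matching index for each flattened element.

```r
# R fills and flattens matrices column by column
m <- matrix(1:6, nrow = 2)
as.vector(m)        # 1 2 3 4 5 6

# Filling by row gives a different layout, but flattening still goes by column
m_byrow <- matrix(1:6, nrow = 2, byrow = TRUE)
as.vector(m_byrow)  # 1 4 2 5 3 6

# Row and column index of each element, in the same flattened order
as.vector(row(m))   # 1 2 1 2 1 2
as.vector(col(m))   # 1 1 2 2 3 3
```

So when a matrix of curves is flattened into a long data frame, the row index changes fastest.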
library(ggplot2)

andrews <- function(df, class_column, samples = 200) {
  # Evaluation points on [-pi, pi]
  t <- seq(-pi, pi, length.out = samples)
  # Transpose so that each row of vals holds one numeric variable
  vals <- t(
    data.matrix(
      df[, -which(names(df) %in% class_column)]
    )
  )
  # curves has one row per observation and one column per value of t
  curves <- outer(vals[1, ], rep(1, length(t)))
  for (i in 2:nrow(vals)) {
    ft <- (i %/% 2) * t
    if (i %% 2 == 0) {
      curves <- curves + outer(vals[i, ], sin(ft))
    } else {
      curves <- curves + outer(vals[i, ], cos(ft))
    }
  }
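  # as.vector() reads curves column by column, so these index vectors
  # line up element by element with as.vector(curves)
  row <- as.vector(row(curves))
  col <- as.vector(col(curves))
  df_out <- data.frame(
    t = t[col],  # the actual t value rather than the column index
    sample = row,
    values = as.vector(curves),
    class_column = df[[class_column]][row]
  )
  df_out
}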
df <- andrews(iris, "Species")
ggplot(df, aes(x = t, y = values, color = class_column,
               group = sample)) +
  geom_line()

Created on 2022-01-26 by the reprex package (v2.0.1)
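
As a sanity check on what the loop in andrews() computes, here is a rough sketch for a single observation. andrews_one is a hypothetical helper written for this check only; it uses the same term order as the loop (constant, sin(t), cos(t), sin(2t), ...) and, like the code above, does not scale the first term.

```r
# Hypothetical helper: curve for one observation x evaluated at points t
andrews_one <- function(x, t) {
  out <- rep(x[1], length(t))
  for (i in seq_along(x)[-1]) {
    ft <- (i %/% 2) * t
    # even-numbered terms use sin, odd-numbered terms use cos
    basis <- if (i %% 2 == 0) sin(ft) else cos(ft)
    out <- out + x[i] * basis
  }
  out
}

# First row of iris, numeric columns only, with names dropped
x <- unname(unlist(iris[1, 1:4]))
t <- seq(-pi, pi, length.out = 5)

# The same curve written out term by term for four variables
by_hand <- x[1] + x[2] * sin(t) + x[3] * cos(t) + x[4] * sin(2 * t)

all.equal(andrews_one(x, t), by_hand)
```

If the two agree, the loop is producing x1 + x2 sin(t) + x3 cos(t) + x4 sin(2t) for each row.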
