I have a data that contains 10 lines (samples) that are either Class1 or Class2 and columns.
I want to calculate the mean of rows Class1 and make the result in line 11 and calculate the mean of rows Class2 and make the result in a line 12.
The data:
Name ClassType Col1 Col2 Col3
A Class1 10 50 12
B Class2 7 20 12
C Class1 8 12 8
D Class1 9 14 17
E Class2 3 15 14
F Class2 10 15 16
G Class2 12 22 15
H Class1 10 28 10
The result I want:
Name ClassType Col1 Col2 Col3
A Class1 10 50 12
B Class2 7 20 12
C Class1 8 12 8
D Class1 9 14 17
E Class2 3 15 14
F Class2 10 15 16
G Class2 12 22 15
H Class1 10 28 10
Mean class1 NA 9.25 26 11.75
Mean class2 NA 8 18 14.25
CodePudding user response:
Try this
library(dplyr)
calc_cols <- your_dataframe %>% group_by(ClassType) %>%
summarise( Col1 = mean(Col1),
Col2 = mean(Col2),
Col3 = mean(Col3)) %>%
mutate( ClassType = NA,
Name = c("Mean class1", "Mean class2"))
your_new_dataframe <- rbind(your_dataframe, calc_cols)
CodePudding user response:
You can summarize across columns. To build on @geometricfreedom's answer, here's how (I created a mini-version of your data set, using just the first four rows):
your_data <- data.frame(
Name = c("A", "B", "C", "D"),
ClassType = c("Class1", "Class2", "Class1", "Class1"),
Col1 = c(10, 7, 8, 9),
Col2 = c(50, 20, 12, 14),
Col3 = c(12, 12, 8, 17)
)
calc_cols <- your_data %>%
group_by(ClassType) %>%
summarize(
across(Col1:Col3, mean)
) %>%
mutate(ClassType = NA, Name = c("Mean Class 1", "Mean Class 2"))
your_data <- rbind(your_data, calc_cols)
your_data
CodePudding user response:
Here is another tidyverse option, but doing everything in one pipe. We can first group by ClassType, then summarize over any columns that start with Col. Also, in the summarize statement, we can create the additional names that you want for the bottom rows (i.e., Mean Class 1 and Mean Class 2) and just return NA for ClassType. Then, we can use bind_rows to bind the original input dataframe to the new one (denoted by the .) from the previous step.
df %>%
group_by(ClassType) %>%
summarize(Name = paste0("Mean ", unique(ClassType)),
across(starts_with("Col"), mean),
ClassType = NA) %>%
bind_rows(df, .)
Output
Name ClassType Col1 Col2 Col3
1 A Class1 10.00 50 12.00
2 B Class2 7.00 20 12.00
3 C Class1 8.00 12 8.00
4 D Class1 9.00 14 17.00
5 E Class2 3.00 15 14.00
6 F Class2 10.00 15 16.00
7 G Class2 12.00 22 15.00
8 H Class1 10.00 28 10.00
9 Mean Class1 <NA> 9.25 26 11.75
10 Mean Class2 <NA> 8.00 18 14.25
Data
df <- structure(
list(
Name = c("A", "B", "C", "D", "E", "F", "G", "H"),
ClassType = c(
"Class1",
"Class2",
"Class1",
"Class1",
"Class2",
"Class2",
"Class2",
"Class1"
),
Col1 = c(10L, 7L, 8L, 9L, 3L,
10L, 12L, 10L),
Col2 = c(50L, 20L, 12L, 14L, 15L, 15L, 22L, 28L),
Col3 = c(12L, 12L, 8L, 17L, 14L, 16L, 15L, 10L)
),
class = "data.frame",
row.names = c(NA,-8L)
)
