I'm trying to randomly select the rows of one group value within each study in my data. How can I do this?
The idea is to first group_by(study), then decide which group's rows to pick in each study based on:
group_row <- sapply(1:length(unique(data$study)),
function(i)sample(0:2, 1, replace = TRUE))
For each study in group_by(study):
if group_row is 1, select the group == 1 rows of that study.
if group_row is 2, select the group == 2 rows of that study.
if group_row is 0, select ALL rows of that study.
I have tried the following without success:
library(tidyverse)
(data <- expand_grid(study=1:3,group=1:2,outcome=c("A","B"), time=0:1) %>%
as.data.frame())
lapply(1:2, function(i){
data %>% dplyr::group_by(group) %>%
filter(group == if(group_row[i] ==0) unique(data$group) else group_row[i]) %>%
dplyr::ungroup() %>% arrange(study,group,outcome,time)
})
CodePudding user response:
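You can write a function that randomly decides which rows to keep, and apply it within each study. Because filter() evaluates the function once per study, each study gets its own random draw.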
library(dplyr)
return_rows <- function(x) {
n <- sample(0:2, 1)
# If n == 0, keep all rows of the study (a single TRUE is recycled);
# otherwise keep only the rows whose group equals n
if(n == 0) TRUE else x == n
}
data %>%
group_by(study) %>%
filter(return_rows(group)) %>%
ungroup()
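The function above draws inside filter(), so the draw for each study is not stored anywhere. If you want to keep the draws in a group_row vector as described in the question, a minimal sketch could look like the following. It assumes one draw per unique study, in order of first appearance; the names studies, pick and result, and the set.seed() call, are only illustrative and not part of the original code.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed, only so the draws can be repeated

data <- expand_grid(study = 1:3, group = 1:2, outcome = c("A", "B"), time = 0:1)

# One draw per study: 0 = keep all rows, 1 or 2 = keep only that group
studies   <- unique(data$study)
group_row <- sample(0:2, length(studies), replace = TRUE)

result <- data %>%
  # look up each row's draw by the position of its study in `studies`
  mutate(pick = group_row[match(study, studies)]) %>%
  filter(pick == 0 | group == pick) %>%
  select(-pick)
```

Here group_by() is not needed, because the lookup with match() already assigns the same draw to every row of a study. You can inspect group_row afterwards to see which choice was made for each study.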
