I have two data frames from an experiment. The first records a (roughly) continuous signal over 40 minutes. It has 5 columns: columns 1 to 3 are binary and indicate whether a button was pushed, column 4 is a binary flag for whether either column 2 or column 3 was pushed, and column 5 is the approximate time in seconds. Example rows from this data frame:
| initiate | left | right | l or r | time |
|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 2.8225 |
| 0 | 0 | 1 | 1 | 2.82375 |
| 0 | 0 | 1 | 1 | 2.82500 |
| 0 | 0 | 1 | 1 | 2.82625 |
| 1 | 0 | 0 | 0 | 16.82000 |
| 1 | 0 | 0 | 0 | 16.82125 |
etc.
The 2nd data frame is session info where each row is a trial, usually 100-150 rows depending on the day. I have a column that marks trial start time and another column that marks trial end time in seconds. Example from df below (I omitted several irrelevant columns):
| trial | success | t start | t end |
|---|---|---|---|
| 1 | 0 | 16.64709 | 35.49431 |
| 2 | 1 | 41.81843 | 57.74304 |
| 3 | 0 | 65.54510 | 71.16612 |
| 4 | 0 | 82.65743 | 87.30914 |
etc.
For the 1st data frame, I want to create a column that indicates whether or not the button was pushed within a trial. This is based on those start and end times in the 2nd df. I would like it to look something like this (iti = inter-trial, wt = within trial):
| initiate | left | right | l or r | time | trial |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 2.8225 | iti |
| 0 | 0 | 1 | 1 | 2.82375 | iti |
| 0 | 0 | 1 | 1 | 2.82500 | iti |
| 0 | 0 | 1 | 1 | 2.82625 | iti |
| 1 | 0 | 0 | 0 | 16.82000 | wt |
| 1 | 0 | 0 | 0 | 16.82125 | wt |
etc.
I had the idea to do something like this, but I don't have a grouping variable between the 2 data frames so it doesn't work:
df2 %>%
  full_join(df1, by = "trial") %>%
  mutate(in_iti = case_when(time < tstart & time > tend ~ "iti",
                            time > tstart & time < tend ~ "within_trial"))
Any ideas on how to label the rows in df1 based on the time condition from df2?
Thank you!
Answer:
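Maybe try the following with dplyr, if your data is relatively small. This assumes the data frames are named df and df2, and that the start and end columns are named t_start and t_end. rowwise() makes mutate work one row at a time, so ifelse compares each time in the first data frame against every t_start and t_end in the second, and labels the row "wt" if it falls strictly inside any trial and "iti" otherwise.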
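library(dplyr)

df %>%
  # evaluate one row at a time so `time` is a single value inside any()
  rowwise() %>%
  # "wt" if the time falls strictly inside any trial window, otherwise "iti"
  mutate(trial = ifelse(any(time > df2$t_start & time < df2$t_end), "wt", "iti"))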
Output
  initiate  left right l_or_r  time trial
     <int> <int> <int>  <int> <dbl> <chr>
1        0     0     1      1  2.82 iti
2        0     0     1      1  2.82 iti
3        0     0     1      1  2.82 iti
4        0     0     1      1  2.83 iti
5        1     0     0      0 16.8  wt
6        1     0     0      0 16.8  wt
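
If the row-by-row approach gets slow on a longer recording, one vectorised alternative (not part of the answer above) is base R's findInterval(). Below is a minimal sketch. It assumes the trials in df2 are sorted by t_start and do not overlap, and that the columns are named t_start and t_end as above. The toy data just copies the example rows from the question, and trial_id is a hypothetical extra column added for illustration.

```r
# Toy data copied from the example rows in the question
df <- data.frame(
  initiate = c(0, 0, 0, 0, 1, 1),
  left     = c(0, 0, 0, 0, 0, 0),
  right    = c(1, 1, 1, 1, 0, 0),
  l_or_r   = c(1, 1, 1, 1, 0, 0),
  time     = c(2.8225, 2.82375, 2.82500, 2.82625, 16.82000, 16.82125)
)
df2 <- data.frame(
  trial   = 1:4,
  success = c(0, 1, 0, 0),
  t_start = c(16.64709, 41.81843, 65.54510, 82.65743),
  t_end   = c(35.49431, 57.74304, 71.16612, 87.30914)
)

# Assumption: df2 is sorted by t_start and trials do not overlap.
# For each time, count how many trial starts lie strictly before it;
# that count is the row of the only trial that could contain the time.
idx <- findInterval(df$time, df2$t_start, left.open = TRUE)
idx[idx == 0L] <- NA  # time is not after any trial start

# Within a trial only if the time is also before that trial's end
in_trial <- !is.na(idx) & df$time < df2$t_end[idx]

df$trial    <- ifelse(in_trial, "wt", "iti")
df$trial_id <- ifelse(in_trial, df2$trial[idx], NA)  # hypothetical extra column
```

This keeps the same strict comparisons as the rowwise version (time > t_start and time < t_end), but does one lookup per row instead of comparing every row against every trial.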
