I am currently developing a data processing/analysis pipeline in R.
My data is in long format (sample rate = 1000 Hz). I have added a trialNum variable that labels each trial throughout the data frame, but I am having trouble reshaping the data to wide format.
What I am trying to do, and I think it should be possible with a for loop or two, is get the average value of x at each index 1:100 across trials, based on trialNum.
Here is a simplified version:
| Pupil Size | TrialNum |
|---|---|
| 500 | 1 |
| 502 | 1 |
| 504 | 1 |
| 506 | 1 |
| 508 | 1 |
| 507 | 2 |
| 508 | 2 |
| 510 | 2 |
| 511 | 2 |
| 512 | 2 |
| 513 | 3 |
| 515 | 3 |
| 514 | 3 |
| 512 | 3 |
| 515 | 3 |
Stated simply: I would take the first sample of Pupil Size from each TrialNum, average them together, and store the result in a new variable (average_pupil_size), then do the same for the second sample, and so on.
In this example, each trial has 5 samples, so I would end up with an output of length 5 (values rounded):
average_pupil_size <- c(507, 508, 509, 510, 512)
I could then plot this signal across all my trials. I hope I have explained myself clearly. Apologies for the chaos that is my mind.
Does anyone know how to do this? It is a bit beyond me.
Thanks in advance!
CodePudding user response:
We could add an index within each TrialNum using row_number(), then group by that index and summarize across trials.
library(dplyr)
df %>%
  group_by(TrialNum) %>%
  mutate(index = row_number()) %>%     # position of each sample within its trial
  group_by(index) %>%
  summarize(avg = mean(Pupil.Size))    # mean across trials at each position
Result
# A tibble: 5 × 2
  index   avg
  <int> <dbl>
1     1  507.
2     2  508.
3     3  509.
4     4  510.
5     5  512.
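For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example table and applies the same approach. The column name `Pupil.Size` is assumed to match the code above, and the extra columns `n_trials` and `time_ms` are illustrative additions rather than part of the question: `n_trials` shows how many trials contributed to each average, and `time_ms` converts the sample index to milliseconds using the stated 1000 Hz sample rate.

```r
library(dplyr)

# Example data from the question; the column name Pupil.Size is an assumption
# chosen to match the code above.
df <- data.frame(
  Pupil.Size = c(500, 502, 504, 506, 508,
                 507, 508, 510, 511, 512,
                 513, 515, 514, 512, 515),
  TrialNum   = rep(1:3, each = 5)
)

sample_rate_hz <- 1000  # sample rate stated in the question

avg_by_index <- df %>%
  group_by(TrialNum) %>%
  mutate(index = row_number()) %>%   # position of each sample within its trial
  group_by(index) %>%
  summarize(
    avg      = mean(Pupil.Size),     # mean across trials at this position
    n_trials = n()                   # how many trials contributed (illustrative)
  ) %>%
  mutate(time_ms = (index - 1) * 1000 / sample_rate_hz)  # illustrative time axis

avg_by_index
```

If some trials are shorter than others, `n_trials` will drop below the total number of trials for the later indices, which is worth checking before plotting `avg` against `time_ms`.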
CodePudding user response:
In base R, if every trial has the same number of samples (5 in this case) and the values are in the first column with TrialNum in the second, we can do:
rowMeans(unstack(df))
[1] 506.6667 508.3333 509.3333 509.6667 511.6667
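If the trials might differ in length, `unstack()` returns a list rather than a data frame and `rowMeans()` no longer applies. A possible base R alternative, sketched here under the assumption that the columns are named `Pupil.Size` and `TrialNum` as above, is to number the samples within each trial with `ave()` and then average by that position with `tapply()`. The `index` column is a helper introduced for illustration.

```r
# Example data from the question; column names are assumed to match the answers.
df <- data.frame(
  Pupil.Size = c(500, 502, 504, 506, 508,
                 507, 508, 510, 511, 512,
                 513, 515, 514, 512, 515),
  TrialNum   = rep(1:3, each = 5)
)

# Position of each sample within its trial (1, 2, ... restarting per TrialNum).
df$index <- ave(seq_along(df$TrialNum), df$TrialNum, FUN = seq_along)

# Mean pupil size at each within-trial position, taken across trials.
average_pupil_size <- tapply(df$Pupil.Size, df$index, mean)

average_pupil_size
```

The result is a named vector with one entry per sample position, so something like `plot(as.numeric(average_pupil_size), type = "l")` could be used to draw the averaged signal. Positions that exist only in longer trials would be averaged over fewer trials.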
