I essentially have two columns (vectors) with speed and accel in a data.frame as such:
speed acceleration
1 3.2694444 2.6539535522
2 3.3388889 2.5096979141
3 3.3888889 2.2722134590
4 3.4388889 1.9815256596
5 3.5000000 1.6777544022
6 3.5555556 1.3933215141
7 3.6055556 1.1439051628
8 3.6527778 0.9334115982
9 3.6722222 0.7561602592
For each speed interval on the x axis (for example 3.2-3.4, 3.4-3.6, and so on), I need to find the top 10% of the acceleration values on the y axis. Can you please show me what a for loop would look like in this situation?
CodePudding user response:
As @alistaire already pointed out, you have provided a very limited amount of data, so we first have to simulate a bit more data on which to test our code.
set.seed(1)
# your data
speed <- c(3.2694444, 3.3388889, 3.3388889, 3.4388889, 3.5,
3.5555556, 3.6055556, 3.6527778, 3.6722222)
acceleration <- c(2.6539535522, 2.5096979141, 2.2722134590,
1.9815256596, 1.6777544022, 1.3933215141,
1.1439051628, 0.9334115982, 0.7561602592)
df <- data.frame(speed, acceleration)
# expand data.frame and add a little bit of noise to all values
# to make them 'unique'
df <- as.data.frame(do.call(
rbind,
replicate(15L, apply(df, 2, \(x) x + runif(length(x), -1e-1, 1e-1)),
simplify = FALSE)
))
The following code does the 'heavy lifting' and stores the desired result in out.
# function to cut speed into equal intervals
my_groups <- \(n_groups) {
step <- with(df, c(max(speed) - min(speed))/n_groups)
intervals <- array(0L, dim = n_groups)
for(i in seq_len(n_groups)) {
intervals[i] <- min(df$speed) + i * step
}
return(intervals)
}
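# break points for three intervals of equal width; min(speed) itself is not
# included, so the lapply() below forms only the upper two intervals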
my_intervals <- my_groups(n_groups = 3)
# Compute values of speed when acceleration is greater than
# or equal to its 90th percentile in each interval
out <- lapply(1:(length(my_intervals)-1L), \(i) {
x <- subset(df, speed >= my_intervals[i] & speed <= my_intervals[i + 1L])
x[x$acceleration >= quantile(x$acceleration, 0.9), ]
})
# function to round values to two decimal places
r <- \(x) round(x, 2)
# assign names to each element of out
for(i in seq_along(out)) {
names(out)[i] <- paste0(r(my_intervals[i]), '-', r(my_intervals[i + 1L]))
}
Output
> out
$`3.38-3.57`
speed acceleration
11 3.394378 2.583636
21 3.383631 2.267659
57 3.434123 2.300234
83 3.394886 2.580924
101 3.395459 2.460971
$`3.57-3.76`
speed acceleration
6 3.635234 1.447290
41 3.572868 1.618293
51 3.615017 1.420020
95 3.575412 1.763215
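Since the question asked for a for loop over fixed intervals such as 3.2-3.4 and 3.4-3.6, here is a minimal sketch of that variant. It is only an illustration under assumptions: the data are simulated with runif() rather than taken from your measurements, the bin width of 0.2 and the range 3.2 to 3.8 are picked to match the example in the question, and the names breaks and top10 are made up for this sketch.

```r
# Simulated stand-in data (assumption: not the real measurements).
set.seed(1)
df <- data.frame(
  speed        = runif(200, min = 3.2, max = 3.8),
  acceleration = runif(200, min = 0.5, max = 3.0)
)

# Fixed break points 3.2, 3.4, 3.6, 3.8 (assumed bin width of 0.2).
breaks <- seq(3.2, 3.8, by = 0.2)
n_bins <- length(breaks) - 1L

# One list element per speed interval.
top10 <- vector("list", n_bins)

for (i in seq_len(n_bins)) {
  # Rows whose speed falls in [breaks[i], breaks[i + 1]).
  in_bin <- df$speed >= breaks[i] & df$speed < breaks[i + 1L]
  x <- df[in_bin, ]
  # Keep rows at or above the 90th percentile of acceleration in this bin.
  top10[[i]] <- x[x$acceleration >= quantile(x$acceleration, 0.9), ]
}

# Label each element with its interval, e.g. "3.2-3.4".
names(top10) <- sprintf("%.1f-%.1f", breaks[-length(breaks)], breaks[-1L])

top10
```

Unlike my_groups() above, the break points here start at the lower bound, so the lowest interval is kept as well. The half-open test (>= lower, < upper) avoids counting a row in two neighbouring intervals; a speed exactly equal to the last break would need to be handled separately.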
CodePudding user response:
There was a previous question, asked 1 yr 9 mon ago, which may help: "dplyr select top ten values for each category". You can probably find it by searching; I did.
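For what it's worth, a dplyr version of the idea might look roughly like the sketch below. It is not taken from the linked question; it assumes the dplyr package is installed, uses simulated data in place of the real measurements, and the column name speed_bin and the 0.2-wide breaks are assumptions chosen to match the intervals in the question.

```r
library(dplyr)

# Simulated stand-in data (assumption: not the real measurements).
set.seed(1)
df <- data.frame(
  speed        = runif(200, min = 3.2, max = 3.8),
  acceleration = runif(200, min = 0.5, max = 3.0)
)

top10_dplyr <- df |>
  # Label each row with its speed interval, e.g. "[3.2,3.4)".
  mutate(speed_bin = cut(speed, breaks = seq(3.2, 3.8, by = 0.2),
                         right = FALSE)) |>
  group_by(speed_bin) |>
  # Within each interval, keep rows at or above the 90th percentile.
  filter(acceleration >= quantile(acceleration, 0.9)) |>
  ungroup()

top10_dplyr
```

Here cut() does the binning and group_by() replaces the explicit loop, so the result is a single data frame with a speed_bin column instead of a list with one element per interval.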
