I have a dataset as below:
stockCode date Closeprice
A 2022-01-24 100
A 2022-01-25 101
A 2022-01-26 103
A 2022-01-27 104
A 2022-01-28 103
B 2022-01-24 200
B 2022-01-25 180
B 2022-01-26 177
B 2022-01-27 192
B 2022-01-28 202
C 2022-01-24 304
C 2022-01-25 333
C 2022-01-26 324
C 2022-01-27 360
C 2022-01-28 335
and then, I wish to add some return columns as below:
I tried to create a new column and calculate the return from it, but I always get an error.
> data$newclose <- data$Closeprice[2:length(data$Closeprice)-2]
Error in `$<-.data.frame`(`*tmp*`, newclose, value = c(8900, 9090, 9200, :
replacement has 126626 rows, data has 126628
CodePudding user response:
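The assignment must have the same length on the left-hand and right-hand sides: the new column needs one value per row of data. Perhaps what you need is lead, which shifts the values up and pads the end with NA: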
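library(dplyr)
# lead() returns a vector as long as its input, so the lengths match.
# Add group_by(stockCode) first if the shift should not cross from one stock to the next.
data1 <- data %>%
  mutate(newclose = lead(Closeprice, n = 1))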
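As for why the original line fails: in R the colon operator binds more tightly than binary minus, so 2:length(x)-2 is read as (2:length(x)) - 2, which is the sequence 0, 1, ..., length(x) - 2. Index 0 is silently dropped, so the result has two fewer elements than the data has rows, which matches the 126626 versus 126628 in the error message. A minimal base R sketch on a toy vector (x here is just stock A's prices, used for illustration):

```r
# Toy vector standing in for data$Closeprice (illustrative values only)
x <- c(100, 101, 103, 104, 103)

# ':' is evaluated before '-', so this is (2:5) - 2, i.e. 0 1 2 3
idx <- 2:length(x) - 2

# Index 0 selects nothing, so only elements 1 to 3 come back
short <- x[idx]
length(short)      # 3, two fewer than length(x)

# Shifting by one and padding with NA keeps the original length,
# which is what a column assignment requires
shifted <- c(x[-1], NA)
length(shifted)    # 5, same as length(x)
```

Writing the index with explicit parentheses, for example 2:(length(x) - 2), avoids the precedence surprise, but the result is still shorter than the column and would need NA padding before it can be assigned.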
CodePudding user response:
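I first create new columns holding the closing price 1 to 4 days ahead using lead, grouped by stock so that the shift never crosses from one stock into the next. Then I convert each of those columns into the percentage change relative to the current day's close.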
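library(tidyverse)
df %>%
  group_by(stockCode) %>%
  # closing price 1 to 4 rows (trading days) ahead within each stock
  mutate(day1 = lead(Closeprice, n = 1),
         day2 = lead(Closeprice, n = 2),
         day3 = lead(Closeprice, n = 3),
         day4 = lead(Closeprice, n = 4)) %>%
  # turn each lead price into a percentage return against today's close
  mutate(across(starts_with("day"), ~((. - Closeprice)/Closeprice)*100)) %>%
  # keep only the columns shown in the output
  select(stockCode, starts_with("day"))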
Output
# A tibble: 15 × 5
# Groups:   stockCode [3]
   stockCode    day1    day2    day3   day4
   <chr>       <dbl>   <dbl>   <dbl>  <dbl>
 1 A           1       3       4       3
 2 A           1.98    2.97    1.98   NA
 3 A           0.971   0      NA      NA
 4 A          -0.962  NA      NA      NA
 5 A          NA      NA      NA      NA
 6 B         -10     -11.5    -4       1
 7 B          -1.67    6.67   12.2    NA
 8 B           8.47   14.1    NA      NA
 9 B           5.21   NA      NA      NA
10 B          NA      NA      NA      NA
11 C           9.54    6.58   18.4    10.2
12 C          -2.70    8.11    0.601  NA
13 C          11.1     3.40   NA      NA
14 C          -6.94   NA      NA      NA
15 C          NA      NA      NA      NA
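
For comparison, the same forward returns can be sketched in base R without dplyr. This is only an illustration under some assumptions: the data frame is rebuilt by hand from the sample in the question, fwd_return is a hypothetical helper name (not from any package), and the rows are assumed to be sorted by date within each stock.

```r
# Sample data rebuilt from the question
df <- data.frame(
  stockCode  = rep(c("A", "B", "C"), each = 5),
  date       = rep(as.Date("2022-01-24") + 0:4, times = 3),
  Closeprice = c(100, 101, 103, 104, 103,
                 200, 180, 177, 192, 202,
                 304, 333, 324, 360, 335)
)

# Hypothetical helper: percentage return n rows ahead, NA where no future row exists
fwd_return <- function(price, n) {
  ahead <- c(price, rep(NA, n))[seq_along(price) + n]  # price shifted up by n, NA-padded
  (ahead - price) / price * 100
}

# ave() applies the helper separately within each stock and returns
# a vector in the original row order, so it can be assigned as a column
for (n in 1:4) {
  df[[paste0("day", n)]] <- ave(df$Closeprice, df$stockCode,
                                FUN = function(p) fwd_return(p, n))
}

df[df$stockCode == "B", c("stockCode", "day1", "day2", "day3", "day4")]
```

Because the helper always returns a vector of the same length as its input, the column assignment never runs into the length mismatch from the question.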
