There are some mistakes in the Yahoo finance historical price for the Brazilian stock "PFRM3.SA". I was trying to conditionally change a value using dplyr. I managed to do it using case_when() but I couldn't make it work using replace(), which I think should be a more efficient solution.
library(tidyverse)
library(tidyquant)
price <- tq_get("PFRM3.SA", from = "2020-01-01", get = "stock.prices") #Data
#visual inspection
price %>%
ggplot(aes(x = date, y = close))
geom_line()
#the decimal digit was misplaced a few times
#this works fine
price %>%
mutate(close = case_when(close > 100 ~ close/100, TRUE ~ close))
#However I don't understand why this do not work
price %>%
mutate(close = replace(close, close > 100, close/100))
The last line of code generates this error:
Warning messages:
1: Problem with `mutate()` input `close`.
ℹ number of items to replace is not a multiple of replacement length
ℹ Input `close` is `replace(close, close > 100, close/100)`.
2: In x[list] <- values :
number of items to replace is not a multiple of replacement length
I don't understand why replace() doesn't return the original close value when the condition isn't met. Also, I don't know how to make it work. Thanks
CodePudding user response:
- The
replace()function in R syntax is very simple and easy to implement. - It includes the vector, index vector, and the replacement values as well as shown below.
replace(x, list, values)
x = vector having some values -> close
list = this can be an index vector -> close > 100
Values = the replacement values -> close[close>100]/100
- Explanation:
You are conditionally indexing the close column, therefore in the Values argument of replace (= the replacement values) you have to conditionally index the replacement value; here close/100 with close[close>100]/100:
price %>%
mutate(close = replace(close, close > 100, close[close>100]/100)) %>%
data.frame()
symbol date open high low close volume adjusted
1 PFRM3.SA 2020-01-02 6.59 6.67 6.49 6.60 549100 6.263065
2 PFRM3.SA 2020-01-03 6.50 6.99 6.40 6.96 1202000 6.604687
3 PFRM3.SA 2020-01-06 7.07 7.71 7.07 7.69 2692600 7.297420
4 PFRM3.SA 2020-01-07 7.42 7.50 7.05 7.05 2325900 6.690092
5 PFRM3.SA 2020-01-08 7.01 7.28 6.80 7.13 1588900 6.766008
6 PFRM3.SA 2020-01-09 7.21 7.22 6.98 7.12 638800 6.756518
7 PFRM3.SA 2020-01-10 7.12 7.23 6.96 6.96 658000 6.604687
8 PFRM3.SA 2020-01-13 7.01 7.17 6.96 7.10 586700 6.737539
9 PFRM3.SA 2020-01-14 7.12 7.15 6.95 7.00 663700 6.642644
10 PFRM3.SA 2020-01-15 7.02 7.39 7.02 7.27 1173000 6.898861
11 PFRM3.SA 2020-01-16 7.35 7.45 7.09 7.16 816300 6.794476
12 PFRM3.SA 2020-01-17 7.24 7.29 7.07 7.17 669900 6.803966
13 PFRM3.SA 2020-01-20 7.20 7.29 7.10 7.23 548300 6.860903
14 PFRM3.SA 2020-01-21 7.23 7.40 7.13 7.17 722500 6.803966
15 PFRM3.SA 2020-01-22 7.23 7.29 6.88 7.09 697400 6.728050
16 PFRM3.SA 2020-01-23 7.09 7.15 6.95 7.04 670500 6.680603
17 PFRM3.SA 2020-01-24 7.07 7.29 7.02 7.15 552000 6.784987
18 PFRM3.SA 2020-01-27 7.08 7.08 6.70 6.95 681300 6.595197
19 PFRM3.SA 2020-01-28 6.94 7.06 6.61 6.90 703200 6.547750
20 PFRM3.SA 2020-01-29 6.92 7.03 6.72 6.72 406900 6.376938
21 PFRM3.SA 2020-01-30 6.60 6.65 6.40 6.64 786600 6.301023
22 PFRM3.SA 2020-01-31 6.51 6.83 6.49 6.49 379200 6.158680
23 PFRM3.SA 2020-02-03 6.45 6.69 6.37 6.66 491100 6.320002
24 PFRM3.SA 2020-02-04 6.71 6.90 6.68 6.90 388800 6.547750
25 PFRM3.SA 2020-02-05 6.85 6.94 6.64 6.81 499100 6.462344
26 PFRM3.SA 2020-02-06 6.89 6.89 6.72 6.83 302400 6.481323
27 PFRM3.SA 2020-02-07 6.66 6.82 6.57 6.60 280100 6.263065
28 PFRM3.SA 2020-02-10 6.60 6.60 6.23 6.30 537700 5.978380
29 PFRM3.SA 2020-02-11 6.35 6.52 6.26 6.50 455100 6.168170
30 PFRM3.SA 2020-02-12 6.51 6.57 6.37 6.45 361200 6.120722
31 PFRM3.SA 2020-02-13 6.35 6.38 6.18 6.30 338800 5.978380
32 PFRM3.SA 2020-02-14 6.29 6.35 6.12 6.26 366100 5.940423
33 PFRM3.SA 2020-02-17 6.32 6.44 6.20 6.20 582900 5.883485
34 PFRM3.SA 2020-02-18 6.20 6.50 6.15 6.48 341200 6.149191
35 PFRM3.SA 2020-02-19 6.50 6.83 6.40 6.67 722800 6.329491
36 PFRM3.SA 2020-02-20 6.70 6.78 6.62 6.66 206200 6.320002
37 PFRM3.SA 2020-02-21 6.61 6.62 6.41 6.52 261500 6.187149
38 PFRM3.SA 2020-02-26 6.24 6.24 5.89 5.99 514500 5.684206
39 PFRM3.SA 2020-02-27 5.85 6.04 5.61 5.75 654500 5.456458
40 PFRM3.SA 2020-02-28 5.70 5.76 5.51 5.76 538900 5.465948
41 PFRM3.SA 2020-03-02 5.82 6.10 5.71 6.05 294000 5.741143
42 PFRM3.SA 2020-03-03 6.16 6.25 5.92 6.05 415000 5.741143
43 PFRM3.SA 2020-03-04 6.09 6.16 5.90 6.05 465100 5.741143
44 PFRM3.SA 2020-03-05 6.05 6.06 5.60 5.75 415300 5.456458
45 PFRM3.SA 2020-03-06 5.46 5.60 5.29 5.60 609400 5.314116
46 PFRM3.SA 2020-03-09 4.89 5.16 4.63 4.75 920400 4.507509
47 PFRM3.SA 2020-03-10 5.12 5.32 4.77 5.32 1060000 5.048410
48 PFRM3.SA 2020-03-11 5.28 5.41 4.69 5.21 870500 4.944026
49 PFRM3.SA 2020-03-12 4.50 4.52 4.02 4.02 515900 3.814776
50 PFRM3.SA 2020-03-13 4.50 4.68 4.10 4.49 706000 4.260782
51 PFRM3.SA 2020-03-16 4.01 4.65 3.90 4.35 564900 4.127929
52 PFRM3.SA 2020-03-17 4.50 4.52 3.96 4.24 672800 4.023544
53 PFRM3.SA 2020-03-18 3.80 3.95 3.29 3.50 870200 3.321322
54 PFRM3.SA 2020-03-19 3.20 3.58 2.89 3.58 1378800 3.397238
55 PFRM3.SA 2020-03-20 3.67 3.96 3.22 3.22 1539200 3.055617
56 PFRM3.SA 2020-03-23 3.23 3.34 2.87 3.10 1183900 2.941742
57 PFRM3.SA 2020-03-24 3.22 3.42 3.13 3.13 1064000 2.970211
58 PFRM3.SA 2020-03-25 3.21 3.66 3.15 3.52 890600 3.340301
59 PFRM3.SA 2020-03-26 3.59 4.30 3.50 4.18 1004100 3.966608
60 PFRM3.SA 2020-03-27 3.90 4.33 3.90 4.25 572100 4.033034
61 PFRM3.SA 2020-03-30 4.26 4.68 4.17 4.27 771200 4.052013
62 PFRM3.SA 2020-03-31 4.20 4.38 4.07 4.30 371300 4.080482
63 PFRM3.SA 2020-04-01 4.11 4.25 3.98 4.11 411100 3.900181
64 PFRM3.SA 2020-04-02 415.00 425.00 402.00 4.18 282600 396.660767
65 PFRM3.SA 2020-04-03 410.00 420.00 383.00 4.00 351100 379.579681
66 PFRM3.SA 2020-04-06 421.00 423.00 395.00 3.99 639500 378.630737
67 PFRM3.SA 2020-04-07 418.00 434.00 397.00 4.05 455800 384.324432
68 PFRM3.SA 2020-04-08 410.00 430.00 400.00 4.28 443600 406.150269
69 PFRM3.SA 2020-04-09 431.00 447.00 424.00 4.29 417900 407.099213
70 PFRM3.SA 2020-04-13 429.00 445.00 423.00 4.39 287500 416.588715
71 PFRM3.SA 2020-04-14 450.00 455.00 442.00 4.43 273200 420.384491
72 PFRM3.SA 2020-04-15 432.00 452.00 421.00 4.44 278800 421.333466
73 PFRM3.SA 2020-04-16 4.50 4.53 4.26 4.34 530800 4.118440
74 PFRM3.SA 2020-04-17 4.54 4.78 4.44 4.64 908400 4.403124
75 PFRM3.SA 2020-04-20 4.46 4.76 4.43 4.60 447200 4.365166
76 PFRM3.SA 2020-04-22 4.60 5.04 4.57 4.89 788600 4.640362
77 PFRM3.SA 2020-04-23 4.97 5.17 4.63 4.77 576500 4.526488
78 PFRM3.SA 2020-04-24 4.62 4.72 4.22 4.49 703200 4.260782
79 PFRM3.SA 2020-04-27 4.61 4.65 4.35 4.63 296400 4.393635
80 PFRM3.SA 2020-04-28 4.65 4.82 4.41 4.58 583300 4.346187
81 PFRM3.SA 2020-04-29 4.62 4.75 4.50 4.74 391800 4.498019
82 PFRM3.SA 2020-04-30 4.75 4.79 4.47 4.59 288300 4.355677
83 PFRM3.SA 2020-05-04 4.41 4.63 4.33 4.63 288400 4.393635
84 PFRM3.SA 2020-05-05 4.63 4.76 4.54 4.70 389700 4.460061
85 PFRM3.SA 2020-05-06 4.69 5.26 4.57 5.22 1389800 4.953515
86 PFRM3.SA 2020-05-07 5.35 5.37 4.61 4.82 1877800 4.573936
87 PFRM3.SA 2020-05-08 4.93 4.98 4.50 4.72 1040300 4.479040
88 PFRM3.SA 2020-05-11 4.70 4.74 4.40 4.49 532600 4.260782
89 PFRM3.SA 2020-05-12 4.50 4.58 4.22 4.36 585100 4.137419
90 PFRM3.SA 2020-05-13 4.40 4.40 4.13 4.19 456100 3.976097
91 PFRM3.SA 2020-05-14 4.13 4.27 3.97 4.27 504600 4.052013
92 PFRM3.SA 2020-05-15 4.24 4.36 4.13 4.19 255500 3.976097
93 PFRM3.SA 2020-05-18 4.30 4.32 4.12 4.19 351000 3.976097
94 PFRM3.SA 2020-05-19 4.19 4.30 4.14 4.18 335200 3.966608
95 PFRM3.SA 2020-05-20 4.21 4.25 4.10 4.20 347500 3.985587
96 PFRM3.SA 2020-05-21 4.16 4.23 4.10 4.20 234300 3.985587
97 PFRM3.SA 2020-05-22 4.20 4.39 4.11 4.39 368500 4.165887
98 PFRM3.SA 2020-05-25 4.42 4.66 4.42 4.64 680200 4.403124
99 PFRM3.SA 2020-05-26 4.74 4.84 4.49 4.60 631300 4.365166
100 PFRM3.SA 2020-05-27 4.60 4.74 4.55 4.66 422400 4.422103
101 PFRM3.SA 2020-05-28 4.66 4.78 4.57 4.60 451200 4.365166
102 PFRM3.SA 2020-05-29 4.65 4.65 4.44 4.56 387400 4.327209
103 PFRM3.SA 2020-06-01 4.50 4.93 4.46 4.74 962300 4.498019
104 PFRM3.SA 2020-06-02 4.77 5.05 4.77 5.00 1063000 4.744746
105 PFRM3.SA 2020-06-03 5.05 5.31 4.86 4.86 924300 4.611893
106 PFRM3.SA 2020-06-04 4.85 5.17 4.71 4.86 768600 4.611893
107 PFRM3.SA 2020-06-05 4.94 5.12 4.85 4.85 661700 4.602404
108 PFRM3.SA 2020-06-08 4.95 5.86 4.95 5.79 2498400 5.494416
109 PFRM3.SA 2020-06-09 5.64 5.93 5.50 5.66 1288500 5.371052
110 PFRM3.SA 2020-06-10 5.76 5.81 5.35 5.35 1140600 5.076878
111 PFRM3.SA 2020-06-12 5.25 5.36 4.92 5.27 894600 5.000962
112 PFRM3.SA 2020-06-15 5.06 5.30 5.00 5.22 618700 4.953515
113 PFRM3.SA 2020-06-16 5.31 5.55 5.25 5.41 656000 5.133815
114 PFRM3.SA 2020-06-17 5.41 5.61 5.37 5.59 597600 5.304626
115 PFRM3.SA 2020-06-18 5.61 5.74 5.48 5.65 440600 5.361563
116 PFRM3.SA 2020-06-19 5.71 5.74 5.47 5.50 558500 5.219221
117 PFRM3.SA 2020-06-22 5.52 5.75 5.50 5.72 554600 5.427989
118 PFRM3.SA 2020-06-23 5.75 6.25 5.74 6.00 1652300 5.693695
119 PFRM3.SA 2020-06-24 6.00 6.15 5.77 5.99 830800 5.684206
120 PFRM3.SA 2020-06-25 5.99 6.04 5.79 5.94 827100 5.636758
121 PFRM3.SA 2020-06-26 5.99 6.03 5.59 5.59 767000 5.304626
122 PFRM3.SA 2020-06-29 5.69 5.74 5.33 5.58 739600 5.295136
123 PFRM3.SA 2020-06-30 5.60 5.70 5.45 5.65 457900 5.361563
124 PFRM3.SA 2020-07-01 5.70 5.75 5.49 5.53 634200 5.247689
125 PFRM3.SA 2020-07-02 5.61 5.64 5.41 5.42 448700 5.143305
[ reached 'max' / getOption("max.print") -- omitted 369 rows ]
benchmark:
CodePudding user response:
You just need to add in rowwise before replace. Essentially, in your replace statement, close/100 is returning a new calculated value for every row. When in reality, you only want to replace the values greater than 100. So, with using rowwise, then it will only calculate per row, rather than for the whole cost column for the values parameter.
library(tidyverse)
price %>%
rowwise %>%
mutate(close = replace(close, close > 100, close/100))
Output
# A tibble: 493 × 8
# Rowwise:
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 PFRM3.SA 2020-01-02 6.59 6.67 6.49 6.6 549100 6.26
2 PFRM3.SA 2020-01-03 6.5 6.99 6.4 6.96 1202000 6.60
3 PFRM3.SA 2020-01-06 7.07 7.71 7.07 7.69 2692600 7.30
4 PFRM3.SA 2020-01-07 7.42 7.5 7.05 7.05 2325900 6.69
5 PFRM3.SA 2020-01-08 7.01 7.28 6.8 7.13 1588900 6.77
6 PFRM3.SA 2020-01-09 7.21 7.22 6.98 7.12 638800 6.76
7 PFRM3.SA 2020-01-10 7.12 7.23 6.96 6.96 658000 6.60
8 PFRM3.SA 2020-01-13 7.01 7.17 6.96 7.1 586700 6.74
9 PFRM3.SA 2020-01-14 7.12 7.15 6.95 7 663700 6.64
10 PFRM3.SA 2020-01-15 7.02 7.39 7.02 7.27 1173000 6.90
# … with 483 more rows
Another option (provided by @DonaldSeinen) is to use transform, which I've added to the Benchmark below.
transform(price, close = ifelse(close > 100, close / 100, close))
Benchmark
So, it looks like transform (provided by @DonaldSeinen) is by far the fastest. @TarJae provides a nice visual of this!
library(microbenchmark)
microbenchmark(
replace = price %>%
rowwise %>%
mutate(close = replace(close, close > 100, close / 100)),
case_when = price %>%
mutate(close = case_when(close > 100 ~ close / 100, TRUE ~ close)),
times = 1000L,
control = list(order = 'block')
)
# Unit: microseconds
# expr min lq mean median uq max neval
# replace 4149.300 4997.8325 5785.2692 5469.816 5993.926 28077.15 1000
# case_when 1410.959 1514.8980 1906.9804 1662.030 2138.092 11863.49 1000
# DonaldSeinen 247.421 271.7735 410.3955 315.499 472.813 11546.28 1000
# TarJae 1536.112 1755.8475 2254.4749 2086.729 2560.107 12802.07 1000

