In this SO post the accepted answer shows how to remove a prefix from a subset of column names. I will reproduce the toy data and solution and get to my issue. Note that I have altered the toy data by adding a suffix (_end) to two of the variables.
library(dplyr)
library(stringr)
df <- data.frame(ATH_V1 = rnorm(10), ATH_V2_end = rnorm(10), ATH_V3_end = rnorm(10), ATH_V4 = rnorm(10), ATH_V5 = rnorm(10), ATH_V6 = rnorm(10), ATH_V7 = rnorm(10))
df
# ATH_V1 ATH_V2_end ATH_V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7
# 1 -1.5520380 1.16782520 -0.3628090 1.5238728 -1.1660806 -1.01416226 -0.95163564
# 2 0.6270134 1.63810443 0.2199733 -0.6175186 -1.8909463 -0.23913125 -0.70650296
# 3 -0.7462879 0.08504734 0.6506818 -0.5436457 1.3369322 1.69883194 -1.07623124
# 4 0.3196569 0.95782069 -0.3454795 -1.7485607 2.3896003 1.24958489 -0.73316675
# 5 -0.8820414 -2.01739089 -0.5881156 1.2725712 1.4251221 0.56213069 -0.47188011
# 6 -0.5534390 1.48974625 -0.2532402 -1.2333677 1.6690452 -0.48178503 0.30727117
# 7 -0.4637729 -1.13762829 1.3072153 1.0082090 -1.7958189 -1.37604307 -0.08900913
# 8 -0.3878013 -1.09693619 -0.9022672 0.1809460 -1.0303186 0.54576930 -0.64634653
# 9 -0.9553941 0.91495814 -0.2993733 -0.5860527 -0.5623538 -0.24521585 0.21297231
# 10 2.2891475 0.05568124 -0.1718192 0.4249103 2.6009601 0.06357305 0.47794076
I would like to remove the ATH_ prefix ONLY from the columns that end with _end.
The solution in the original post lists the columns to operate on in a character vector passed to rename_at, then removes the ATH_ prefix with the str_remove function:
df %>% rename_at(c("ATH_V2_end", "ATH_V3_end"), ~ .x %>% str_remove("^ATH_"))
# ATH_V1 V2_end V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7
# 1 1.14822123 -0.6285561 0.52458507 -0.63906454 1.1401342 -1.6559726 0.41732258
# 2 0.07519307 2.0090135 0.13440368 1.24337727 -0.2906335 -0.1349698 1.45647898
# 3 -0.87465492 -1.8766134 -0.17119197 -1.22701678 -0.7603659 0.1015543 -1.06211069
# 4 1.01402581 -0.4744169 0.78326842 -0.02910686 0.1548202 1.0042147 -0.23739832
# 5 1.00613252 -1.5701097 1.64415870 0.86733910 0.1558727 0.3011537 0.05700506
# 6 -1.01416351 -1.7687648 -0.13999833 -1.01482747 -0.5732621 -0.2504362 2.20762232
# 7 1.00861721 0.7494679 0.08853307 1.46402775 -0.1153655 0.8427913 -1.16114455
# 8 0.28117809 -0.6669487 -0.50816389 -0.12875270 0.7798111 -0.3937148 -1.30894602
# 9 -0.23092640 2.8516271 -1.36959691 -0.39303227 1.9862182 1.2378769 -1.66039502
# 10 0.65034202 0.9009923 0.58264859 0.50931251 1.7284268 1.8420746 -0.71894637
However, the help for the current version of dplyr states that rename_at has been superseded by rename_with, and that you can use the powerful select helper functions to choose a subset of columns.
So I would like to remove the ATH_ prefix ONLY from the columns that end with _end using the ends_with() function within rename_with() using tidyverse grammar.
I tried
df %>%
select(ends_with("_end")) %>%
rename_with(str_remove(string = ~.x,
pattern = "^ATH_"))
and
df %>%
rename_with(cols = ends_with("_end"),
.fn = str_remove(string = ~.x,
pattern = "^ATH_"))
Both attempts gave the same error:
Error in `rename_with()`:
! Can't convert `.fn`, a character vector, to a function.
Any help much appreciated
CodePudding user response:
If you use select to filter the columns, those columns will no longer be a part of the data frame. You're on the right track, though.
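To illustrate the point about select(), here is a minimal sketch comparing which columns survive each call. It assumes dplyr is attached; the small data frame toy is made up for this example and is not part of the question's data.

```r
library(dplyr)

# Hypothetical two-row data frame, used only for this illustration
toy <- data.frame(ATH_V1 = 1:2, ATH_V2_end = 3:4, ATH_V3_end = 5:6)

# select() keeps only the matching columns, so ATH_V1 is dropped
kept <- toy %>% select(ends_with("_end"))
names(kept)

# rename_with() keeps every column and renames only the matching ones
renamed <- toy %>% rename_with(~ gsub("^ATH_", "", .x), .cols = ends_with("_end"))
names(renamed)
```

The selection helper does the same job in both calls; the difference is only whether the unselected columns are kept.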
If you don't use the tilde with .x to stand in for the column names, you have to write out function() explicitly.
For example, you can use the tilde, like this:
rename_with(df, .cols = ends_with("_end"),
~ gsub("^ATH_", "", .x))
Or you can designate a variable name of your choice, instead of .x, and use function(), like this:
rename_with(df, .cols = ends_with("_end"),
.fn = function(frenchFries) {
gsub("^ATH_", "", frenchFries)
})
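As a quick check that these spellings are interchangeable, the sketch below compares their results. It assumes dplyr is attached and uses a made-up data frame toy; the argument name nm is arbitrary, and the \(nm) shorthand on the last call assumes R 4.1 or later.

```r
library(dplyr)

# Hypothetical data frame, used only for this comparison
toy <- data.frame(ATH_V1 = 1:2, ATH_V2_end = 3:4)

# Formula (tilde) shorthand: .x stands for the selected column names
with_formula <- rename_with(toy, ~ gsub("^ATH_", "", .x),
                            .cols = ends_with("_end"))

# Explicit anonymous function with an argument name of your choice
with_function <- rename_with(toy, function(nm) gsub("^ATH_", "", nm),
                             .cols = ends_with("_end"))

# Base R lambda shorthand (assumes R >= 4.1)
with_shorthand <- rename_with(toy, \(nm) gsub("^ATH_", "", nm),
                              .cols = ends_with("_end"))

identical(with_formula, with_function)
identical(with_function, with_shorthand)
```

Which form to use is largely a matter of taste; the explicit function() version tends to be easier to read once the body grows beyond one line.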
You can use names() to check your work before you change the object. Piping straight into names() also works; the braces below just make the call on the piped value explicit.
rename_with(df, .cols = ends_with("_end"),
.fn = function(frenchFries) {
gsub("^ATH_", "", frenchFries)
}) %>% {names(.)}
# [1] "ATH_V1" "V2_end" "V3_end" "ATH_V4" "ATH_V5" "ATH_V6" "ATH_V7"
In R, very few packages modify objects in place, so you have to assign the result back to an object to actually change it.
df <- rename_with(df, .cols = ends_with("_end"),
~ gsub("^ATH_", "", .x))
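A minimal sketch of that behaviour, assuming dplyr is attached and using a made-up data frame toy rather than the question's df:

```r
library(dplyr)

# Hypothetical data frame, used only for this illustration
toy <- data.frame(ATH_V1 = 1:2, ATH_V2_end = 3:4)

# Without assignment the renamed copy is returned (and printed at the
# console), but toy itself is left untouched
rename_with(toy, ~ gsub("^ATH_", "", .x), .cols = ends_with("_end"))
names_before <- names(toy)

# Assigning the result back is what actually changes toy
toy <- rename_with(toy, ~ gsub("^ATH_", "", .x), .cols = ends_with("_end"))
names_after <- names(toy)

names_before
names_after
```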
CodePudding user response:
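You put the ~ symbol in the wrong place. Also note that the argument is .cols, not cols: a misspelled cols is not matched to .cols, so the function is applied to every column. It should be
df %>%
rename_with(.cols = ends_with("_end"),
.fn = ~ str_remove(string = .x, pattern = "^ATH_"))
# ATH_V1 V2_end V3_end ATH_V4 ATH_V5 ATH_V6 ATH_V7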
# 1 -0.7211939 -0.8369699 0.8317321 -0.05233632 0.05711023 -1.1028795 -0.44261881
# 2 -1.2497923 -0.9062427 1.6472891 -0.77403163 -0.37941031 -0.8270005 1.14721669
# 3 -0.1343481 -1.2049003 0.5347915 0.16202132 -0.38939422 -1.6720070 -1.55429956
# 4 0.1664160 1.9248057 -0.1133589 -0.48717961 0.89363994 1.0983927 0.82700398
# 5 -1.0916865 -0.8093323 -1.3128583 -0.68529918 -0.22614257 0.3307024 -2.45071083
# 6 0.4191887 1.6177852 1.7017075 1.40316160 -1.30115133 -0.6129785 1.28648456
# 7 0.8725919 -0.2706190 1.3131828 -2.99366849 1.28976332 -0.2348865 1.09045642
# 8 -0.5935664 -0.2918142 0.7699294 -1.30566644 -1.53736071 -0.2689142 0.10605338
# 9 1.4284704 -0.3578967 -0.8106887 1.04486145 -0.32881870 0.2486389 0.08226489
# 10 1.2323733 -0.2241655 0.2167915 -0.31868072 -0.74497243 -1.7778882 -0.70894820
A more concise expression is
df %>%
rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))
and even
df %>%
rename_with(str_remove, ends_with("_end"), "^ATH_")
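The last form works because rename_with() forwards any extra arguments to .fn, so "^ATH_" ends up as the second argument (pattern) of str_remove(). The sketch below checks that the named, positional, and bare-function forms agree. It assumes dplyr and stringr are attached; the data frame toy and the result names are made up for this check.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame, used only for this comparison
toy <- data.frame(ATH_V1 = 1:2, ATH_V2_end = 3:4, ATH_V3_end = 5:6)

# Fully named arguments
named <- toy %>%
  rename_with(.fn = ~ str_remove(string = .x, pattern = "^ATH_"),
              .cols = ends_with("_end"))

# Positional arguments: .fn first, then .cols
positional <- toy %>%
  rename_with(~ str_remove(.x, "^ATH_"), ends_with("_end"))

# Bare function: "^ATH_" falls into `...` and is forwarded to str_remove(),
# where it becomes the pattern argument
forwarded <- toy %>%
  rename_with(str_remove, ends_with("_end"), "^ATH_")

identical(named, positional)
identical(positional, forwarded)
names(forwarded)
```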
