I'd like to select data from my data.frame simply and elegantly, but I'm new to R.
This worked:
SchIndxRead %>% select(,.DormList) %>% filter(SchIndxRead$.College.Lookup=="MIAD")
I tried using this:
SchIndxRead[SchIndxRead$.College.Lookup=='MIAD',".DormList"]
I expected just "Two50Two", but got this result:
> [1] "Two50Two" NA NA NA NA
> [6] NA NA NA NA NA
> [11] NA NA NA NA NA
> [16] NA NA NA NA NA
> [21] NA NA NA NA NA
CodePudding user response:
Your column .College.Lookup probably contains NA values, so the expression SchIndxRead$.College.Lookup=="MIAD" returns not only TRUE and FALSE but also NA.
When you subset with a logical vector that contains NAs, every NA in that vector produces an NA in the result:
library(tibble)
set.seed(10)
df = tibble(a = 1:10, b = sample(c(0, 1, NA), 10, TRUE))
> df
# A tibble: 10 × 2
       a     b
   <int> <dbl>
 1     1    NA
 2     2     0
 3     3     1
 4     4    NA
 5     5     1
 6     6    NA
 7     7    NA
 8     8    NA
 9     9    NA
10    10    NA
> df$b == 1
[1] NA FALSE TRUE NA TRUE NA NA NA NA NA
> df[df$b == 1, "a"]
# A tibble: 9 × 1
      a
  <int>
1    NA
2     3
3    NA
4     5
5    NA
6    NA
7    NA
8    NA
9    NA
That's why there were NAs in your second attempt.
But dplyr::filter "ignores" NAs: it keeps only the rows where the condition is TRUE and drops those where it evaluates to FALSE or NA. That's why there were no NAs in your first attempt.
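If you would rather stay with base R indexing, here is a minimal sketch of two common ways to drop the NAs. The data frame toy and its rows are invented for illustration and only stand in for SchIndxRead; it assumes a plain data.frame, as your vector output suggests.

```r
# Toy stand-in for SchIndxRead; the rows are invented for illustration.
toy <- data.frame(
  .College.Lookup = c("MIAD", NA, "MSOE", NA),
  .DormList       = c("Two50Two", "A", "B", "C"),
  stringsAsFactors = FALSE
)

is_miad <- toy$.College.Lookup == "MIAD"   # TRUE NA FALSE NA

# Indexing with the raw comparison yields one NA per NA in is_miad.
with_na <- toy[is_miad, ".DormList"]

# which() returns only the positions that are TRUE, so the NAs drop out.
via_which <- toy[which(is_miad), ".DormList"]

# %in% never returns NA, so it can be used as the row index directly.
via_in <- toy[toy$.College.Lookup %in% "MIAD", ".DormList"]

print(with_na)    # "Two50Two" NA NA
print(via_which)  # "Two50Two"
print(via_in)     # "Two50Two"
```

Either form should give just "Two50Two" on your data, provided "MIAD" matches a single row.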
Two hints to improve your code:
- It would have been better to swap the order of select and filter:
SchIndxRead %>% filter(.College.Lookup == "MIAD") %>% select(.DormList)
This way .College.Lookup is still present when filter runs, so you don't need the SchIndxRead$ prefix.
- You might prefer pull(), which returns the column as a plain vector instead of a one-column data frame:
SchIndxRead %>% filter(.College.Lookup == "MIAD") %>% pull(.DormList)
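To see how the two hints differ in what they return, here is a small sketch on made-up data. The data frame dorms and its rows are hypothetical, standing in for SchIndxRead.

```r
library(dplyr)

# Toy stand-in for SchIndxRead; the rows are invented for illustration.
dorms <- data.frame(
  .College.Lookup = c("MIAD", NA, "MSOE"),
  .DormList       = c("Two50Two", "A", "B"),
  stringsAsFactors = FALSE
)

# select() keeps the result as a one-column data frame.
as_df <- dorms %>% filter(.College.Lookup == "MIAD") %>% select(.DormList)

# pull() extracts the column as a plain character vector.
as_vec <- dorms %>% filter(.College.Lookup == "MIAD") %>% pull(.DormList)

print(is.data.frame(as_df))  # TRUE
print(as_vec)                # "Two50Two"
```

Use select when the next step expects a data frame, and pull when you want the bare values.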
