How to get the name of a variable using its index in R - CodePudding

I have data that looks like this:

Name      A    B    C    D    E
r1        1    5    12  21    15
r2        2    4     7  10     9
r3        5   15     6   9     6
r4        7    8     0   7    18

My question is: how can I get the name of a variable using its index?

For example, if I want the name at index 1, the name returned should be "A".

Thank you.

CodePudding user response：

Use the colnames() function, then index the vector it returns.

colnames(mtcars)[1]

That returns the name of the first variable in mtcars. Just change the name of the data.frame to match yours and the number to the position of the variable you want. E.g., the third variable in iris is:

 colnames(iris)[3]
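Applied to the table in the question, this could look like the sketch below. The data frame name `dat` and the decision to rebuild the table by hand are assumptions. Note that if Name is stored as an ordinary column, it takes position 1 and "A" is at position 2; "A" is only at position 1 if Name is moved to the row names.

```r
# Rebuild the question's table; treating Name as a regular column is an assumption.
dat <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1, 2, 5, 7),
  B = c(5, 4, 15, 8),
  C = c(12, 7, 6, 0),
  D = c(21, 10, 9, 7),
  E = c(15, 9, 6, 18)
)

with_name_col <- colnames(dat)[1:2]   # "Name" "A": Name occupies position 1

# If Name only labels the rows, move it to the row names so A becomes column 1.
dat2 <- dat[, -1]
rownames(dat2) <- dat$Name

first_name <- colnames(dat2)[1]       # "A"
several <- colnames(dat2)[c(1, 3)]    # "A" "C": an index vector returns several names
```

Indexing with a vector such as `c(1, 3)` returns several names at once, which avoids looping over positions.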

CodePudding user response：

I think you want to know which column (name) includes the value 1.

tmp <- colSums(dat == 1) > 0
names(tmp[tmp])
# [1] "A"

Walk-through:

The == returns a matrix with per-position matches:

dat == 1
#       Name     A     B     C     D     E
# [1,] FALSE  TRUE FALSE FALSE FALSE FALSE
# [2,] FALSE FALSE FALSE FALSE FALSE FALSE
# [3,] FALSE FALSE FALSE FALSE FALSE FALSE
# [4,] FALSE FALSE FALSE FALSE FALSE FALSE

colSums(.) > 0 tells us which column has at least one TRUE:

colSums(dat == 1) > 0
#  Name     A     B     C     D     E 
# FALSE  TRUE FALSE FALSE FALSE FALSE

... and then we take the names of the columns that matched. If none are found, it returns an empty vector:
```
names(tmp[tmp])
# character(0)
```
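The steps above can be wrapped in a small function. This is only a sketch: `cols_with_value` is a hypothetical helper name, and `dat` is rebuilt here from the question's table so the example runs on its own.

```r
# Rebuilt from the question's table (assumption: Name is an ordinary column).
dat <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1, 2, 5, 7),
  B = c(5, 4, 15, 8),
  C = c(12, 7, 6, 0),
  D = c(21, 10, 9, 7),
  E = c(15, 9, 6, 18)
)

# Hypothetical helper: names of the columns that contain `value` at least once.
cols_with_value <- function(df, value) {
  hits <- colSums(df == value, na.rm = TRUE) > 0  # one TRUE/FALSE per column
  names(hits)[hits]                               # keep only the matching names
}

has_one <- cols_with_value(dat, 1)    # "A"
has_seven <- cols_with_value(dat, 7)  # "A" "C" "D"
has_three <- cols_with_value(dat, 3)  # character(0), since no column contains 3
```

The `na.rm = TRUE` argument is an addition not in the original answer; without it, a column containing NA and no match would yield NA instead of FALSE.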

The only gotcha I can think of here is if you are doing high-precision floating-point comparison, in which case IEEE-754 comes into play (see "Why are these numbers not equal?", "Is floating point math broken?", and https://en.wikipedia.org/wiki/IEEE_754). For that, consider a test of inequality with tolerance instead of a strict test of equality.

This requires that we only look at numeric columns.

isnum <- sapply(dat, is.numeric)
isnum
#  Name     A     B     C     D     E 
# FALSE  TRUE  TRUE  TRUE  TRUE  TRUE 

tmp <- colSums(abs(dat[,isnum] - 1) < 1e-5) > 0
# dat[,isnum] subsets the data to its numeric columns;
# abs(. - 1) < 1e-5 is the test of inequality within tolerance
names(tmp[tmp])
# [1] "A"
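To see why the tolerance matters, here is a minimal sketch using the classic 0.1 + 0.2 case. The helper name `cols_near_value`, the small data frame `df`, and the default tolerance of 1e-5 are illustrative assumptions, not part of the original answer.

```r
x <- 0.1 + 0.2
strict <- x == 0.3                  # FALSE: x is not exactly 0.3 in binary floating point
tolerant <- abs(x - 0.3) < 1e-5     # TRUE: equal within the chosen tolerance

# Hypothetical helper: numeric columns with a value within `tol` of `value`.
cols_near_value <- function(df, value, tol = 1e-5) {
  isnum <- vapply(df, is.numeric, logical(1))   # skip character/factor columns
  close <- abs(df[, isnum, drop = FALSE] - value) < tol
  hits <- colSums(close, na.rm = TRUE) > 0
  names(hits)[hits]
}

# Illustrative data: column P holds a computed value that is "almost" 0.3.
df <- data.frame(Name = c("a", "b"), P = c(0.1 + 0.2, 1), Q = c(2, 3))

exact_match <- names(df)[colSums(df == 0.3) > 0]  # character(0): strict test misses P
near_match <- cols_near_value(df, 0.3)            # "P"
```

`drop = FALSE` keeps the subset a data frame even when only one numeric column remains, so `colSums()` still receives a two-dimensional object.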