Check column type if it exists

Time:01-19

I'm trying to create a validation stage for data frames. I'm using the validate library and have added rules as described in its documentation.

However, I can't see the right way to check a property of a column only when that column exists.

Following the cars example from the tutorial:

library(validate)
data(cars)
rules <- validator(speed >= 0, dist >= 0)
confront(cars, rules)

So that works fine. What I would like to do is add a rule so that, if the cars data has a name column, that column must be of type character. However, when I add the rule as in the following, it raises an error because the name column doesn't exist when the rule is run.

library(validate)
data(cars)
rules <- validator(speed >= 0, dist >= 0, is.character(name))
confront(cars, rules)

Update: I don't know if the following attempt better represents what I'm aiming for. This fails on syntax.

 rules <- validator(speed >= 0, dist >= 0, speed/dist <= 1.5, cor(speed, dist)>=0.2, ifelse(exists("name"), is.character(name),T))

CodePudding user response:

Comparisons: All basic comparisons, including >, >=, ==, !=, <=, < and %in%, are validating statements. When executing a validating statement, the %in% operator is replaced with %vin%.

Logical operations: The unary logical operators '!', all() and any() define validating statements. Binary logical operations, including &, &&, | and ||, are validating when P and Q in e.g. P & Q are validating (note that the short-circuit operators && and || only return the first logical value in cases where, for P && Q, P and/or Q are vectors). Binary logical implication P ⇒ Q (P implies Q) is implemented as if ( P ) Q. The latter is interpreted as !(P) | Q.
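To make the implication rule concrete, here is a minimal base R sketch (it does not use validate) that tabulates !(P) | Q for every combination of P and Q. The helper name implies is made up for illustration and is not part of the package.

```r
# Material implication P => Q written as !(P) | Q, in plain base R.
# `implies` is a hypothetical helper name, not a validate function.
implies <- function(p, q) (!p) | q

# All four combinations of P and Q (P varies fastest).
truth <- expand.grid(P = c(TRUE, FALSE), Q = c(TRUE, FALSE))
truth$P_implies_Q <- implies(truth$P, truth$Q)

# The only FALSE row is P = TRUE, Q = FALSE.
print(truth)
```

So a rule written as if (P) Q can only fail when the condition P holds and Q does not; when P is FALSE the rule passes vacuously.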

Also note that the dot in names(.) refers to the data frame you confront.

https://cran.r-project.org/web/packages/validate/validate.pdf

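library(validate)
data(cars)

# `if (P) Q` is read as the implication !(P) | Q;
# `.` stands for the data frame being confronted.
rules <- validator(
  speed >= 0,
  dist >= 0,
  if("names" %in% names(.)) is.character(names),  # cars has no such column
  if("speed" %in% names(.)) is.character(speed),  # fails: speed is numeric
  if("speed" %in% names(.)) is.numeric(speed)     # passes
)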

results <- confront(cars, rules)

summary(results)

  name items passes fails nNA error warning                                        expression
1   V1    50     50     0   0 FALSE   FALSE                        (speed - 0) >= -0.00000001
2   V2    50     50     0   0 FALSE   FALSE                         (dist - 0) >= -0.00000001
3   V3     1      1     0   0 FALSE   FALSE !("names" %vin% names(.)) | (is.character(names))
4   V4     1      0     1   0 FALSE   FALSE !("speed" %vin% names(.)) | (is.character(speed))
5   V5     1      1     0   0 FALSE   FALSE   !("speed" %vin% names(.)) | (is.numeric(speed))
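As a cross-check outside validate, the same "only if the column exists" logic can be sketched in base R. This is an alternative under stated assumptions, not the package's own mechanism: col_ok_if_present is a hypothetical helper, and it returns early when the column is missing so the predicate is never applied to a non-existent column. One thing to keep in mind when adapting the rules above: in base R the | operator evaluates both of its operands, and names also happens to be a base function, so it may be worth testing a conditional rule with the exact column name you care about.

```r
# Hypothetical helper: TRUE when the column is absent (vacuously valid),
# otherwise the result of applying `predicate` to that column.
col_ok_if_present <- function(df, col, predicate) {
  if (!(col %in% names(df))) return(TRUE)
  isTRUE(predicate(df[[col]]))
}

data(cars)

# No `name` column in cars, so the check passes without touching it.
absent_ok <- col_ok_if_present(cars, "name", is.character)

# `speed` exists and is numeric.
speed_numeric <- col_ok_if_present(cars, "speed", is.numeric)
speed_character <- col_ok_if_present(cars, "speed", is.character)

# Add a character `name` column (made-up values) and check again.
cars_named <- cars
cars_named$name <- paste0("car", seq_len(nrow(cars)))
present_ok <- col_ok_if_present(cars_named, "name", is.character)

print(c(absent_ok, speed_numeric, speed_character, present_ok))
```

The early return is the design choice that matters here: the type check only runs once the column is known to exist.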