How to add new values to an existing dataset so that only the range changes but the mean remains the same in R

Time:02-05

Hi, I'm a student studying statistics. My textbook covers the basic calculations but not much R coding, so I would like to ask: is there a way in R to add numbers to an existing generated set so that the result has a specific mean and range?

1(a) Apply R to simulate a set of 100 numbers, with mean value of 20 and standard deviation of 2. List out the set of numbers.

> x <- rnorm(100,20,2) 
> print(x)
  [1] 20.59256 20.66069 12.68841 21.13575 24.09587 21.69535 20.18661 21.71236 20.92864 19.63182 22.12583 19.06238
 [13] 18.73813 22.59813 17.30012 16.98957 20.74050 21.28319 19.75426 20.62065 20.20814 18.16406 22.24261 22.05673
 [25] 21.27086 18.78538 21.86479 18.03242 21.00538 20.27731 22.59440 23.24389 20.20846 19.73281 19.50040 20.51712
 [37] 20.16493 23.56715 21.25884 18.37542 19.84470 19.81911 16.94701 19.06637 17.74580 18.03151 19.57144 16.45314
 [49] 20.89975 21.86249 17.42996 23.52514 21.17759 20.20160 18.11839 21.69716 16.93685 20.62335 20.37935 22.46131
 [61] 17.78489 19.90424 17.67674 20.20571 21.60567 20.41897 20.25134 22.44366 19.06513 20.62692 24.04101 24.03634
 [73] 20.15566 20.33157 20.22881 20.54014 19.49401 17.34388 19.94099 18.71450 19.24386 19.91813 18.71863 20.94027
 [85] 17.55676 17.18079 24.96868 24.09565 19.87488 20.06114 19.21374 18.39874 21.01435 18.38329 20.91788 21.45158
 [97] 20.43168 21.80438 20.50405 23.07149

(b) Add another 2 numbers to the set simulated in Question 1(a), such that the new set now has (same) mean of 20, but range becomes 200. List out the set of numbers.

CodePudding user response:

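As you need a range of 200, the two added values must be 200 apart, i.e. offset by desired_range/2 = 100 in opposite directions from a common centre. Note that the code below centres them on the current range value (about 12.26) rather than on the mean, so the mean shifts slightly, from 19.99774 to 19.846.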

Solution in code:

> x <- rnorm(100,20,2) 
> 
> x
  [1] 17.84671 19.02797 23.83426 21.28975 20.35738 19.35365 22.57753 15.09991 18.18989 21.61537 20.97786 20.74412 20.95964
 [14] 20.00677 13.79552 16.65435 23.48840 19.50842 25.10979 21.10134 19.15891 22.58312 23.65634 17.89358 17.98529 22.33547
 [27] 20.84291 21.28044 22.37447 16.89740 19.95510 17.67625 19.64634 18.07762 21.50655 18.62182 18.59671 15.53542 12.85074
 [40] 19.06638 19.90743 18.64610 20.71322 22.78706 22.33449 22.30899 17.09384 21.57055 19.88208 18.85795 18.52198 23.70028
 [53] 22.91794 20.24993 20.63627 19.01672 19.34706 17.42375 21.88536 20.91214 21.16099 23.54738 21.40821 21.06485 23.95725
 [66] 21.09893 16.15641 21.28983 19.27113 17.89774 23.24801 23.23136 22.67976 23.21619 20.17257 21.09512 16.83565 22.17975
 [79] 20.50282 23.86079 14.97483 16.91109 18.66540 21.79649 21.01789 18.81188 19.77038 25.04698 17.69211 20.04085 17.29910
 [92] 18.98335 16.37297 19.78979 18.83341 16.60093 19.41327 17.85721 22.55003 16.67850
> 
> mean(x)
[1] 19.99774
> 
> sd(x)
[1] 2.494173
> 
> range <- range(x)[2]-range(x)[1]
> 
> range
[1] 12.25905
> 
> x <- c(x, range + 100, range - 100)
> 
> mean(x)
[1] 19.846
> 
> sd(x)
[1] 14.3276
> 
> range <- range(x)[2]-range(x)[1]
> 
> range
[1] 200
> 
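In the transcript above the mean moves from 19.99774 to 19.846 after the two values are appended. A minimal sketch of why, assuming the goal is to leave the mean untouched: the pair centre ± 100 only preserves the mean when the centre is the sample mean itself. The seed and the names r, y_range and y_mean are illustrative choices, not part of the original answer.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(100, 20, 2)
r <- diff(range(x))              # current range as a single number

# Pair centred on the range value, as in the transcript: the mean moves
y_range <- c(x, r + 100, r - 100)

# Pair centred on the sample mean: the mean is preserved
y_mean <- c(x, mean(x) + 100, mean(x) - 100)

mean(x)
mean(y_range)                    # differs from mean(x)
mean(y_mean)                     # equals mean(x) up to floating-point error
diff(range(y_range))             # 200
diff(range(y_mean))              # 200
```

Both variants reach a range of 200 because the two new values are 200 apart and lie outside the simulated data; only the second keeps the mean where it was.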

CodePudding user response:

First create reproducible data:

set.seed(42)
x <- rnorm(100,20,2)
mean(x)
# [1] 20.06503
range(x)
# [1] 14.01382 24.57329
(x2 <- mean(x) + c(-100, 100))
# [1] -79.93497 120.06503

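To keep the mean the same we need to add one point 100 above the mean and one point 100 below it, so that their deviations cancel. Fortunately both points lie beyond the original range (14.01 to 24.57), so they become the new minimum and maximum and the new range is exactly 200.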

mean(c(x, x2))
# [1] 20.06503
diff(range(c(x, x2)))
# [1] 200

The mean is the same and the range is now 200.
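The same idea can be wrapped in a small function for any target range. This is only a sketch: the name add_range_points and its argument target_range are hypothetical, and the check assumes the two new points must enclose all existing data, since otherwise the resulting range would not equal the target.

```r
# Hypothetical helper: append two values so that the mean is unchanged
# and the range (max - min) equals target_range.
add_range_points <- function(x, target_range) {
  m <- mean(x)
  half <- target_range / 2
  # The new points must become the new minimum and maximum;
  # if they do not enclose the data, the range would not match the target.
  if (m - half > min(x) || m + half < max(x)) {
    stop("target_range is too small for the new points to enclose the data")
  }
  c(x, m - half, m + half)
}

set.seed(42)
x <- rnorm(100, 20, 2)
y <- add_range_points(x, 200)

length(y)                        # 102
all.equal(mean(y), mean(x))      # TRUE
diff(range(y))                   # 200
```

Note that the sample mean from rnorm(100, 20, 2) is only approximately 20, so this keeps the mean of the simulated set rather than forcing it to exactly 20.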
