I want a script to put column values in groups with a maximum range of 10. So that all values in the group are <10 different from each other. I'm trying to get a script that defines the groups, counts how many values are in each group, and find the mean.
So if I had this df:
cat1 = c(85, 60, 60, 55, 55, 15, 0, 35, 35 )
cat2 = c("a","a","a","a","a","a","a","a","a")
df <- data.frame(cat1, cat2)
cat1 cat2
1 85 a
2 60 a
3 60 a
4 55 a
5 55 a
6 15 a
7 0 a
8 35 a
9 35 a
the output would be:
numValues cat1avg
1 85
4 57.5
1 15
1 0
2 35
I followed the top-rated answer 
We could see how much range is in each cluster produced in each case, and find the widest range cluster for each k. We see that dividing into four groups would have at least 15 range (see the 4 facet in the chart above), but 5 groups would be adequate to keep the within-cluster range under 5.
assignments %>%
group_by(k, .cluster) %>%
summarize(range = max(cat1) - min(cat1)) %>%
summarize(max_range = max(range))
# A tibble: 6 × 2
k max_range
<int> <dbl>
1 1 85
2 2 35
3 3 30
4 4 15
5 5 5
6 6 0
And finally:
assignments %>%
filter(k == 5) %>%
group_by(.cluster) %>%
summarize(numValues = n(),
cat1avg = mean(cat1))
.cluster numValues cat1avg
<fct> <int> <dbl>
1 1 1 85
2 2 1 15
3 3 2 35
4 4 4 57.5
5 5 1 0
