I am using the package "naturalsort" (https://github.com/kos59125/naturalsort). As far as I know, natural sorting is not implemented well anywhere else in R, so I was happy to find this package.
I use the function naturalsort to sort file names the way Windows Explorer does, which works great locally.
But when I use it in my production environment, deployed with Docker on Google Cloud Run, the sort order changes. I don't know whether this is due to a different locale (I am from Denmark) or to OS differences between my Windows PC and the Docker/Google Cloud Run deployment.
I have created an example ready to be run in R:
######## Code start ###########
library(plumber)
library(naturalsort) # for natural sorting of file names
#* Retrieve sorted string list
#* @get /sortstrings
#* @param nothing Unused placeholder parameter
function(nothing) {
  print(nothing)
  test <- c("0.jpg", "file (4_5_1).jpeg", "1 tall thin image.jpeg",
            "8.jpeg", "8.jpg", "file (2.1.2).jpeg", "file (0).jpeg", "3.jpeg",
            "file (1).jpeg", "file (2.1.1).jpeg", "file (0) (3).jpeg", "file (2).jpeg",
            "file (2.1).jpeg", "file (4_5).jpeg", "file (4).jpeg", "file (39).jpeg")
  print("Direct sort")
  # Sort once, log the result, and return it from the endpoint
  sorted_strings <- naturalsort(text = test)
  print(sorted_strings)
  return(sorted_strings)
}
######## Code end ###########
I would expect it to sort the file names as shown below, which it does locally, both when run directly in a script and when called through plumber's Run API:
c("0.jpg",
"1 tall thin image.jpeg",
"3.jpeg",
"8.jpeg",
"8.jpg",
"file (0) (3).jpeg",
"file (0).jpeg",
"file (1).jpeg",
"file (2).jpeg",
"file (2.1).jpeg",
"file (2.1.1).jpeg",
"file (2.1.2).jpeg",
"file (4).jpeg",
"file (4_5).jpeg",
"file (4_5_1).jpeg",
"file (39).jpeg"
)
But in production it sorts them like this instead:
c("0.jpg",
"1 tall thin image.jpeg",
"3.jpeg",
"8.jpeg",
"8.jpg",
"file (0) (3).jpeg",
"file (0).jpeg",
"file (1).jpeg",
"file (2.1.1).jpeg",
"file (2.1.2).jpeg",
"file (2.1).jpeg",
"file (2).jpeg",
"file (4_5_1).jpeg",
"file (4_5).jpeg",
"file (4).jpeg",
"file (39).jpeg")
This is not how Windows Explorer sorts them.
CodePudding user response:
Try fixing the collating sequence prior to the naturalsort call. It varies by locale and can affect how strings are compared (and therefore sorted).
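To see why the collating sequence matters here, it may help to look at the characters at which your two orderings diverge: ")", "." and "_" directly after the digit. The sketch below uses only base R and does not call naturalsort. It assumes, without having checked your container, that the production locale orders punctuation differently from plain code-point order.

```r
# Characters that follow the digit where the two orderings diverge
chars <- c(")", ".", "_")
codes <- vapply(chars, utf8ToInt, integer(1))
print(codes)  # code points: 41, 46, 95

# Under "C" collation, strings are compared by these byte values
old <- Sys.getlocale("LC_COLLATE")
Sys.setlocale("LC_COLLATE", "C")
c_order <- sort(c("_", ".", ")"))
Sys.setlocale("LC_COLLATE", old)
print(c_order)

# Under the session's own locale the result may differ
print(sort(c("_", ".", ")")))
```

Under "C" collation, ")" sorts before "." and "_", which matches your expected order ("file (2).jpeg" before "file (2.1).jpeg", "file (4).jpeg" before "file (4_5).jpeg").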
## Get initial value
lcc <- Sys.getlocale("LC_COLLATE")
## Use fixed value
Sys.setlocale("LC_COLLATE", "C")
sorted_strings <- naturalsort(text = test)
## Restore initial value
Sys.setlocale("LC_COLLATE", lcc)
You can find some details in ?sort, ?Comparison, and ?locales.
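If you need this in more than one place, the save/set/restore steps above could be wrapped in a small helper so the locale is restored even when the sort throws an error. This is only a sketch: with_c_collation is a hypothetical name, not a function from naturalsort or base R, and the demonstration uses base sort() so that it runs without extra packages.

```r
# Hypothetical helper (not part of naturalsort or base R): evaluate `expr`
# with LC_COLLATE temporarily set to "C", then restore the previous value.
with_c_collation <- function(expr) {
  old <- Sys.getlocale("LC_COLLATE")
  # on.exit() restores the locale even if evaluating `expr` fails
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  force(expr)  # lazy argument is evaluated here, under "C" collation
}

# Demonstration with base sort(), which also depends on LC_COLLATE
x <- c("b", "A", "a", "B")
c_sorted <- with_c_collation(sort(x))
print(c_sorted)
```

With naturalsort loaded, the call inside the plumber handler would then be with_c_collation(naturalsort(text = test)).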
