Sort output of find based on regex

Time:02-03

I have a find command that lists all the directories whose names contain a matching substring. The matching directories are all named like this:

a_bunch_of_stuff_20210110_101945_more_stuff/
a_bunch_of_stuff_20201225_101934_more_stuff/
a_bunch_of_stuff_20210106_101933_more_stuff/

As you can see, each directory name has arbitrary text before and after a datetime of the form YYYYMMDD_HHMMSS.

I want to sort the output of the find command by these datetime strings, from oldest to newest. I can't simply pipe it through | sort, because the "a_bunch_of_stuff_" preceding the datetime can be anything.

Is there a way to sort based on a regex, similar to how I can do so in the find command? Note: performance here is not of concern.

CodePudding user response:

If you can use Perl (with the Sort::Key module from CPAN) and the data fits in memory:

find ... | perl -e 'use Sort::Key qw(keysort); map { print; } keysort { /(\d{8}_\d{6})/; $1 } (<>)'

You might need to tune the regex.

CodePudding user response:

A possible solution (if your filenames don't contain newlines):

find . -type d -name 'somefilter' |
sed -nE 's|.*/.*_([0-9]{8}_[0-9]{6})_.*|\1 &|p' |
sort -k1,1 |
sed -E 's/[^ ]* //'
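
To see how the pipeline above behaves, here is a minimal sketch that wraps the same decorate, sort, undecorate steps in a shell function and feeds it the sample names from the question instead of real find output. The function name sort_by_stamp and the variable sorted are made up for illustration, and the sketch assumes one path per line with no embedded newlines.

```shell
# Hypothetical helper (name is illustrative): reads one path per line on
# stdin and prints the paths ordered by their YYYYMMDD_HHMMSS stamp.
sort_by_stamp() {
  # 1. Decorate: prefix each line with its stamp and a space.
  #    -n plus the p flag drops lines that contain no stamp.
  sed -nE 's|.*_([0-9]{8}_[0-9]{6})_.*|\1 &|p' |
    # 2. Sort on the stamp only; plain text order is chronological here
    #    because the stamp is fixed-width and most-significant-first.
    sort -k1,1 |
    # 3. Undecorate: strip the stamp and the space that follows it.
    sed -E 's/^[^ ]* //'
}

# Sample input taken from the question, in the shape find would print it.
sorted=$(printf '%s\n' \
  './a_bunch_of_stuff_20210110_101945_more_stuff' \
  './a_bunch_of_stuff_20201225_101934_more_stuff' \
  './a_bunch_of_stuff_20210106_101933_more_stuff' | sort_by_stamp)

printf '%s\n' "$sorted"
```

With real data you would pipe find into the function, e.g. find . -type d -name 'somefilter' | sort_by_stamp. Note that any path without a stamp is silently dropped by the first sed, which may or may not be what you want.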