This question pertains to the situation where
- An image was uploaded, say mypicture.jpg
- WordPress created multiple copies of it with different resolutions like mypicture-300x500.jpg and mypicture-600x1000.jpg
- You delete the original image only
In this scenario, the remaining photos on the filesystem are mypicture-300x500.jpg and mypicture-600x1000.jpg.
How can you script finding these "dangling" images, whose original is missing, and deleting them?
CodePudding user response:
You could use find to find all lower resolution pictures with the -regex test:
find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg'
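To see what this test selects, here is a small sketch run in a scratch directory. It assumes GNU find, whose default regex flavour treats + as "one or more" and matches the expression against the whole path. The scratch directory and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Scratch directory with made-up file names (assumption: GNU find).
scratch=$(mktemp -d)
mkdir -p "$scratch/2023/05"
touch "$scratch/mypicture.jpg" \
      "$scratch/mypicture-300x500.jpg" \
      "$scratch/2023/05/holiday-600x1000.jpg" \
      "$scratch/notes-2023.txt"

# -regex has to match the whole path, hence the leading '.*'.
matches=$(cd "$scratch" &&
          find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg' | LC_ALL=C sort)
printf '%s\n' "$matches"

rm -rf "$scratch"
```

Only the two resized copies are listed; mypicture.jpg and notes-2023.txt are left out. On a find whose default regex flavour differs (BSD find, for instance), the expression would probably need adapting.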
Using find this way is much better than trying to parse the output of ls, which is meant for humans, not for automation. A safer (and simpler) bash script could thus be:
#!/usr/bin/env bash
while IFS= read -r -d '' f; do
  [[ "$f" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ] &&
  echo rm -f "$f"
done < <(find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg' -print0)
(Remove the echo once you are convinced that it works as expected.)
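The loop relies on the capture group of the =~ operator: whatever (.*) matched ends up in BASH_REMATCH[1], and appending .jpg gives the name of the presumed original. A minimal sketch of just that step follows; original_of is a hypothetical helper name, and the regex is anchored with ^ and $ here, which the script above does not do.

```shell
#!/usr/bin/env bash
# original_of is a hypothetical helper, used only for illustration.
original_of() {
    local re='^(.*)-[0-9]+x[0-9]+\.jpg$'
    [[ "$1" =~ $re ]] || return 1            # not a resized copy
    printf '%s.jpg\n' "${BASH_REMATCH[1]}"   # text captured by (.*) plus .jpg
}

original_of ./mypicture-300x500.jpg     # ./mypicture.jpg
original_of ./2023/a-b-600x1000.jpg     # ./2023/a-b.jpg
original_of ./mypicture.jpg || echo "no resolution suffix"
```

Because (.*) is greedy, only the last -WIDTHxHEIGHT group is stripped, so names that contain other dashes are handled as expected.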
Note: we use the -print0 action and the empty read delimiter (-d '') to separate the file names with the NUL character instead of the newline character. This is preferable because it works as expected even if you have unusual file names (e.g., with spaces).
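The difference shows up as soon as a name contains a newline. The sketch below uses two made-up file names in a scratch directory and compares a naive line count with the NUL-delimited loop.

```shell
#!/usr/bin/env bash
# Made-up names: one contains a space, the other a newline.
scratch=$(mktemp -d)
touch "$scratch/my picture-300x500.jpg" "$scratch/odd"$'\n'"name-600x1000.jpg"

# Newline-separated output: the embedded newline looks like an extra entry.
lines=$(( $(find "$scratch" -type f | wc -l) ))

# NUL-separated output: each iteration receives exactly one complete path.
count=0
while IFS= read -r -d '' f; do
    count=$((count + 1))
    printf 'got: %q\n' "$f"
done < <(find "$scratch" -type f -print0)

echo "lines: $lines, files: $count"   # lines: 3, files: 2
rm -rf "$scratch"
```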
Note: as we test the file name inside the loop, we could simply search for all files (find . -type f -print0). But I suspect that with a large number of files the performance would suffer, so keeping the -regex test is probably better.
Bash loops are OK but they tend to become really slow when the number of iterations increases. So, let's incorporate our simple bash script in a single find command with the -exec action:
find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
Note: bash -c takes a script to execute as first argument, then the positional parameters to pass to the script, starting with $0. This is why we pass _ (my favourite for "don't care"), followed by {} (the current file path).
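A short sketch of how the words after the script are assigned; the paths are placeholders.

```shell
#!/usr/bin/env bash
# The first word after the script becomes $0, the following ones $1, $2, ...
out=$(bash -c 'printf "%s|%s|%s\n" "$0" "$1" "$2"' _ ./a.jpg ./b.jpg)
echo "$out"     # _|./a.jpg|./b.jpg

# Without the placeholder the path lands in $0 and $1 stays empty.
out2=$(bash -c 'printf "%s|%s\n" "$0" "$1"' ./a.jpg)
echo "$out2"    # ./a.jpg|
```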
Note: -print is normally the default find action, but here it is needed because -exec is one of the find actions that inhibit the default behaviour.
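The effect can be checked in a scratch directory. The sketch below uses true and false as stand-ins for the bash test, which also shows that -exec doubles as a test whose result decides whether the following -print runs.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
touch "$scratch/a.jpg"

# No explicit action: find prints matching paths by default.
default_out=$(cd "$scratch" && find . -type f)
# -exec is an action, so the implicit -print is switched off.
exec_only=$(cd "$scratch" && find . -type f -exec true \;)
# -print runs only when the executed command succeeded.
exec_true=$(cd "$scratch" && find . -type f -exec true \; -print)
exec_false=$(cd "$scratch" && find . -type f -exec false \; -print)

printf 'default=[%s] exec only=[%s] true=[%s] false=[%s]\n' \
       "$default_out" "$exec_only" "$exec_true" "$exec_false"
rm -rf "$scratch"
```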
The find command prints the list of candidate files. Check that it is correct and, once you are satisfied, add the -delete action:
find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -delete -print
See man find and man bash for more explanations.
Demo:
$ touch mypicture.jpg mypicture-300x500.jpg mypicture-600x1000.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
$ rm -f mypicture.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
./mypicture-300x500.jpg
./mypicture-600x1000.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -delete -print
./mypicture-300x500.jpg
./mypicture-600x1000.jpg
$ ls *.jpg
ls: cannot access '*.jpg': No such file or directory
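Starting one bash per file can itself become costly on large trees. One possible variant, offered as a sketch rather than as part of the answer above, hands many paths to a single bash with -exec ... {} + and loops over them inside the script. Since -exec ... {} + always evaluates to true, -print and -delete can no longer be chained after it, so the script reports (or removes) the orphans itself. The scratch directory and file names are made up, and GNU find is assumed for the -regex test.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
touch "$scratch/kept.jpg" "$scratch/kept-300x500.jpg" "$scratch/gone-300x500.jpg"

orphans=$(cd "$scratch" &&
    find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg' -exec bash -c '
        for f; do    # iterates over the positional parameters
            if [[ "$f" =~ ^(.*)-[0-9]+x[0-9]+\.jpg$ ]] &&
               ! [ -f "${BASH_REMATCH[1]}.jpg" ]; then
                printf "%s\n" "$f"    # dry run; swap for: rm -f -- "$f"
            fi
        done' _ {} +)
echo "$orphans"    # ./gone-300x500.jpg

rm -rf "$scratch"
```

kept-300x500.jpg is not reported because kept.jpg still exists next to it.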
One last note: if, by accident, one of your full resolution pictures matches the regular expression for lower resolution pictures (e.g., if you have a balloon-1x1.jpg full resolution picture), it will be deleted. This is unfortunate, but according to your specifications there is no easy way to distinguish it from an orphan lower resolution picture. Be careful...
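A dry run in a scratch directory illustrates the caveat. The file name comes from the example above; the scratch directory is made up.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
# A full resolution picture whose own name ends in -<digits>x<digits>.jpg
touch "$scratch/balloon-1x1.jpg"

flagged=$(cd "$scratch" && find . -type f -exec bash -c '
    [[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
    ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print)
echo "$flagged"    # ./balloon-1x1.jpg

rm -rf "$scratch"
```

The picture is reported as an orphan because balloon.jpg does not exist, so reviewing the -print output before adding -delete is the practical safeguard.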
CodePudding user response:
I've written a Bash script that tries to derive the original filename (e.g. mypicture.jpg) by stripping the WordPress resolution suffix from a resized copy (e.g. mypicture-300x500.jpg). If the original is not found, it deletes the "dangling" image (e.g. rm -f mypicture-300x500.jpg).
#!/bin/bash
# Remove resized copies (name-WIDTHxHEIGHT.ext) whose original (name.ext) is gone.
pattern='^[0-9]+x[0-9]+$'
while IFS= read -r -d '' directory
do
    for path in "$directory"/*
    do
        [[ -f "$path" ]] || continue          # skip subdirectories and empty globs
        image=${path##*/}
        echo "The current filename is $image"
        resolution=${image##*-}               # e.g. 300x500.jpg
        echo "The resolution is $resolution"
        extension=${resolution##*.}           # e.g. jpg
        echo "The extension is $extension"
        resolutiononly=${resolution%."$extension"}   # e.g. 300x500
        echo "The resolution only is $resolutiononly"
        if [[ $resolution == *.* && $resolutiononly =~ $pattern ]]; then
            echo "The pattern matches"
            originalfilename=${image%-"$resolution"}.$extension
            echo "The original filename is $originalfilename"
            # Look for the original next to the resized copy.
            if [[ -f "$directory/$originalfilename" ]]; then
                echo "The file exists $originalfilename"
            else
                rm -f -- "$path"
            fi
        else
            continue    # not a resized copy; move on to the next file
        fi
    done
done < <(find . -type d -print0)
