Pairwise comparison using diff on multiple files. If there are same - display only them. If none of

Time:02-03

I want to compare multiple small files (over 70) pairwise using diff in bash.

If identical files are found, only those pairs should be displayed:

>Files t01 and t03 are the same
>Files t10 and t15 are the same

If all files are unique, a single message should be displayed:

>All files are unique

The following snippet loops over all matching files, checks the return value of diff, and if the value is 0, prints a message saying which files are the same:

FILES=./dir/t*
for data1 in $FILES; do
    for data2 in $FILES; do
        if [[ "$data1" != "$data2" ]]
        then
            diff "$data1" "$data2" > /dev/null
            if [[ $? -eq 0 ]]
            then
                echo "The files $(basename "${data1}") and $(basename "${data2}") are same."
            fi
        fi
    done
done

If I delete the identical files from the folder, nothing is displayed.

If I add an else branch to the if [[ $? -eq 0 ]] statement, the output looks like this when identical files exist:

    >Files t01 and t03 are the same
    >All files are unique
    >All files are unique
    >Files t10 and t15 are the same
    >All files are unique
    >All files are unique
    ...

Unfortunately, I don't know how to continue the code to make it work properly. I would be very thankful if someone could help.

CodePudding user response:

First expand the glob into an array:

files=( ./dir/t* )

then use nested for loops to generate the pairs and do the comparison inside; add a variable to remember whether there was a hit:

found=false

for ((i = 0; i < ${#files[@]}; i++))
do
    for ((j = i + 1; j < ${#files[@]}; j++))
    do
        if diff -q "${files[i]}" "${files[j]}" > /dev/null
        then
            found=true
            echo "Files ${files[i]##*/} and ${files[j]##*/} are identical"
        fi
    done
done

! "$found" && echo "Files are unique"
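
For reference, the pieces above can be put together into one script. This is only a minimal sketch, not the one true way to do it: the function name report_duplicates is made up for illustration, the nullglob setting is an addition (so that an unmatched t* pattern gives an empty array instead of the literal pattern), the messages use the wording from the question, and the demo files at the end are invented test data.

```bash
#!/usr/bin/env bash
# Sketch only: report_duplicates is a hypothetical helper name.
# The body is wrapped in ( ... ) so it runs in a subshell and the
# shopt change below does not leak into the calling shell.
report_duplicates() (
    dir=$1
    found=false

    # Addition not in the answer above: with nullglob, a pattern that
    # matches nothing expands to nothing instead of to itself.
    shopt -s nullglob
    files=( "$dir"/t* )

    # j starts at i + 1, so every unordered pair is compared exactly once.
    for ((i = 0; i < ${#files[@]}; i++)); do
        for ((j = i + 1; j < ${#files[@]}; j++)); do
            # diff exits with status 0 only when the two files do not differ.
            if diff -q "${files[i]}" "${files[j]}" > /dev/null; then
                found=true
                echo "Files ${files[i]##*/} and ${files[j]##*/} are the same"
            fi
        done
    done

    # Print the fallback message once, after all pairs have been checked.
    [[ $found == true ]] || echo "All files are unique"
)

# Demo on a throwaway directory (file names and contents are made up).
demo=$(mktemp -d)
printf 'a\n' > "$demo/t01"
printf 'b\n' > "$demo/t02"
printf 'a\n' > "$demo/t03"
report_duplicates "$demo"
rm -r "$demo"
```

With the three demo files, the only line printed is "Files t01 and t03 are the same". Since the "unique" message is printed after both loops have finished rather than in an else branch inside them, it appears at most once.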
