I have found code that reads all the text files in a directory, but I don't know how to find what they have in common. Please help me with the code, or point me to the areas I need to explore further. There is a lot to learn and I am short on time.
use strict;
use warnings;
use English;

my $dir = 'C:\Perl_Example\Data';

foreach my $fp (glob("$dir/*.txt")) {
    # print the file name as a header
    printf "%s\n", $fp;

    # open each file in the directory for reading
    open my $fh, "<", $fp or die "can't open '$fp' for reading: $OS_ERROR";

    # print the file content, indented by one space
    while (<$fh>) {
        printf " %s", $_;
    }

    close $fh or die "can't close '$fp': $OS_ERROR";
}
CodePudding user response:
Here is one way to "find the common lines". Of course, there is more than one way to do that in Perl :)
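#!/usr/bin/perl
use strict;
use warnings;

my %h;    # line => list of files that contain it

for my $file (@ARGV) {
    open(my $fh, '<', $file) or die "$file: $!\n";
    my %seen;    # lines already recorded for this file
    while (<$fh>) {
        chomp;
        # record each file at most once per line, so a line repeated
        # within a single file is not reported as "common"
        push @{ $h{$_} }, $file unless $seen{$_}++;
    }
    close $fh;
}

# report every line that occurs in more than one file
for (sort keys %h) {
    if (@{ $h{$_} } > 1) {
        print "line <$_>\n";
        print " occurs in ", join(", ", @{ $h{$_} }), "\n";
    }
}
exit 0;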
Now the test files, named {1,2,3}:
% cat 1
PING YA.RU (87.250.250.242): 56 data bytes
64 bytes from 87.250.250.242: icmp_seq=0 ttl=249 time=14.615 ms
64 bytes from 87.250.250.242: icmp_seq=1 ttl=249 time=14.943 ms
64 bytes from 87.250.250.242: icmp_seq=2 ttl=249 time=14.381 ms
64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms
64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms
% cat 2
PING YA.RU (87.250.250.242): 56 data bytes
64 bytes from 87.250.250.242: icmp_seq=0 ttl=249 time=14.615 ms
64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms
64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms
% cat 3
64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms
64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms
And a test run of the script:
% ./try.pl 1 2 3
line <64 bytes from 87.250.250.242: icmp_seq=0 ttl=249 time=14.615 ms>
 occurs in 1, 2
line <64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms>
 occurs in 1, 2, 3
line <64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms>
 occurs in 1, 2, 3
line <PING YA.RU (87.250.250.242): 56 data bytes>
 occurs in 1, 2
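If "commonality" is meant more strictly, as the lines present in every file rather than in at least two, the same hash idea can be narrowed a little. The following is only a sketch under that assumption: `common_lines` is a hypothetical helper name of my own, and it counts each line at most once per file so that a repeat inside one file does not inflate the count.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return, sorted, the lines that occur in every one of the given files.
# common_lines() is a hypothetical helper, not part of the script above.
sub common_lines {
    my @files = @_;
    my %count;    # line => number of distinct files containing it

    for my $file (@files) {
        open my $fh, '<', $file or die "$file: $!\n";
        my %seen;    # lines already counted for this file
        while (my $line = <$fh>) {
            chomp $line;
            $count{$line}++ unless $seen{$line}++;
        }
        close $fh;
    }

    # keep only the lines whose file count equals the number of files
    my @common = grep { $count{$_} == @files } keys %count;
    return sort @common;
}

# file names are taken from the command line, as in the script above
print "$_\n" for common_lines(@ARGV);
```

Run against the three test files above, this should print only the icmp_seq=3 and icmp_seq=4 lines, since those are the only ones present in all of 1, 2 and 3. To feed it a whole directory as in the question, the call could be changed to `common_lines(glob("$dir/*.txt"))`.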
