How to get Perl to match \r in files with windows EOL characters

Time:01-24

I'm trying to write a Perl script to identify files with Windows EOL characters, but matching \r doesn't seem to work.

Here's my test script:

#!/usr/bin/perl
use File::Slurp;

$winfile = read_file('windows-newlines.txt');
if($winfile =~ m/\r/) {
  print "winfile has windows newlines!\n"; # I expect to get here
}
else {
  print "winfile has unix newlines!\n"; # But I actually get here
}

$unixfile = read_file('unix-newlines.txt');
if($unixfile =~ m/\r/) {
  print "unixfile has windows newlines!\n";
}
else {
  print "unixfile has unix newlines!\n";
}

Here's what it outputs:

winfile has unix newlines!
unixfile has unix newlines!

I'm running this on Windows, and I can confirm in Notepad that the files definitely have the correct EOL characters:

[Screenshots: windows-newlines.txt and unix-newlines.txt open in Notepad]

CodePudding user response:

Unless the binmode option is set (it is not in your code), read_file converts \r\n to \n on Windows, so the string you get back never contains \r. From the File::Slurp source:

# line endings if we're on Windows
${$buf_ref} =~ s/\015\012/\012/g if ${$buf_ref} && $is_win32 && !$opts->{binmode};
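To see why the m/\r/ test in the question can never succeed after that conversion, here is a minimal sketch that applies the same substitution to a sample string. The variable names $buf and $converted are made up for illustration and are not taken from File::Slurp.

```perl
use strict;
use warnings;

# Sample buffer with Windows (CRLF) line endings; \015 is CR and \012 is LF.
my $buf = "first line\015\012second line\015\012";
print "before: ", ($buf =~ /\r/ ? "contains \\r" : "no \\r"), "\n";

# Apply the same substitution that read_file runs on Windows
# when the binmode option is not set.
(my $converted = $buf) =~ s/\015\012/\012/g;
print "after:  ", ($converted =~ /\r/ ? "contains \\r" : "no \\r"), "\n";
```

The first check finds a carriage return and the second does not, which matches the behaviour reported in the question.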

To keep the original line endings, set binmode, as shown in the documentation:

my $bin = read_file('/path/file', { binmode => ':raw' });
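If you would rather not depend on File::Slurp, the same check can probably be done with core Perl by opening the file with the :raw layer, which turns off the CRLF translation Perl applies by default on Windows. This is only a sketch: has_cr is a hypothetical helper name, and the file names are the ones from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of File::Slurp): returns 1 if the file at
# $path contains at least one carriage return, 0 otherwise.
sub has_cr {
    my ($path) = @_;

    # '<:raw' reads the bytes exactly as stored, with no \r\n -> \n translation.
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                 # slurp mode: read the whole file in one go
    my $content = <$fh>;
    close $fh;

    return (defined $content && $content =~ /\r/) ? 1 : 0;
}

# File names taken from the question; files that do not exist are skipped.
for my $file ('windows-newlines.txt', 'unix-newlines.txt') {
    next unless -e $file;
    printf "%s has %s newlines!\n", $file, has_cr($file) ? 'windows' : 'unix';
}
```

Reading in raw mode also keeps any other bytes untouched, so it is the safer choice whenever the goal is to inspect a file rather than process its text.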