I am trying to count the characters that were committed to git by the author "JohnJohnson" using this command:
wc -m $(git log --use-mailmap --no-merges --author="JohnJohnson" --name-only --pretty=format:"" | sort -u)
The problem is that it produces different results on Linux and on Windows (git-bash), at least partly because on Windows a newline consists of two characters, '\r\n'. Is there a way to make wc -m ignore '\r' so that I get consistent results on both OSs with the same command?
CodePudding user response:
NOTE: While running dos2unix on each file before running wc -m should suffice, I'm assuming a) dos2unix is not available and/or b) OP may find there are other characters (besides \r) that need to be removed.
Assuming the objective is to generate the same exact output as wc -m, one idea using a user-defined function:
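my_wc () {
    local fname charcount=0 totcount=0
    for fname in "$@"                 # quoted so file names with spaces survive
    do
        charcount=$(tr -d '\r' < "$fname" | wc -m)   # count without carriage returns
        echo "$charcount $fname"
        ((totcount += charcount))     # running total across all files
    done
    echo "$totcount total"
}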
Applying to OP's example:
my_wc $(git log --use-mailmap --no-merges --author="JohnJohnson" --name-only --pretty=format:"" | sort -u)
If OP finds additional characters (besides \r) to skip, add them to the tr -d '\r' call.
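As a rough sketch of that idea, the set of characters to delete can be passed in as an argument instead of being hard-coded. The name my_wc_strip and its argument order are made up for this illustration and are not part of the answer above; the sample file is throwaway data.

```shell
# Hypothetical variant of my_wc: the first argument is the set of
# characters to ignore (in tr notation), the rest are file names.
my_wc_strip () {
    local strip=$1 fname charcount totcount=0
    shift
    for fname in "$@"
    do
        charcount=$(tr -d "$strip" < "$fname" | wc -m)
        charcount=$((charcount))          # drop padding that some wc builds print
        echo "$charcount $fname"
        totcount=$((totcount + charcount))
    done
    echo "$totcount total"
}

# Sample data: 8 characters, including one tab and two carriage returns.
sample=$(mktemp)
printf 'a\tb\r\nc\r\n' > "$sample"

my_wc_strip '\r'   "$sample"   # counts 6 characters
my_wc_strip '\r\t' "$sample"   # counts 5 characters
```

Because the set is handed straight to tr -d, anything tr understands (escapes such as '\t', or ranges) should work the same way.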
Another function idea but this one uses awk:
my_wc() {
    awk 'BEGIN { RS="^$" }   # whole file becomes one single, long record (regex RS, e.g. GNU awk)
         { gsub(/\r/,"")     # drop every carriage return
           n=length($0)
           tot+=n            # running total across all files
           print n,FILENAME
         }
         END { print tot,"total" }' "$@"
}
Demonstrating these functions on a few sample files:
$ head f?
==> f1 <==
a 13
a 5
b 7
a 20
a 3
==> f2 <==
a 13
a 5
b 7
a 20
a 3
==> f3 <==
a 13
a 5
b 7
a 20
a 3
$ dos2unix f?
$ wc -m f?
22 f1
22 f2
22 f3
66 total
$ unix2dos f?
$ wc -m f?
27 f1
27 f2
27 f3
81 total
$ my_wc f?
22 f1
22 f2
22 f3
66 total
CodePudding user response:
Operate in a repo configured to do no newline translation in your work tree, i.e. with eol processing turned off. You can do this anywhere, say git config core.autocrlf false. The easiest way to avoid interference is probably to do this in a scratch clone,
git clone -ns . "$(mktemp -d)"; cd "$_"
git config core.autocrlf false
git checkout
and now you've got a pristine checkout with no eol munging applied.
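One way to check what git actually stored and checked out is git ls-files --eol, which reports the line endings found in the index (the i/ column) and in the work tree (the w/ column). The sketch below is only an illustration: it builds a throwaway repo with two sample files rather than using a real clone, and it assumes conversion is switched off with core.autocrlf false and that no .gitattributes rule forces it back on.

```shell
# Throwaway repo with one CRLF file and one LF file (sample data only).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config core.autocrlf false    # assumption: no eol conversion wanted

printf 'a 13\r\na 5\r\n' > "$repo/crlf.txt"
printf 'a 13\na 5\n'     > "$repo/lf.txt"
git -C "$repo" add crlf.txt lf.txt

# i/ is what the index holds, w/ is what sits in the work tree.
git -C "$repo" ls-files --eol
```

If the i/ and w/ columns agree for every file, wc -m on the checkout is counting the same bytes git stores, whichever OS it runs on.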
