How to check if a STDIO stream is gzipped and consequently gunzip it in the same stream

Time:01-19

I have the following (simplified) situation with bash:

A stream of gzipped data comes in. Sometimes, however, the data has been gzipped twice by accident, in which case I need to run ...incoming stream... | gunzip | gunzip | ...continue process... . The second gunzip should only run when I am sure that the output of the first gunzip is itself another gzipped data stream.

My initial thought was:

echo "My plain text" | gzip | gzip | gunzip | if [ $(mimetype -b --stdin ) == 'application/gzip' ]; then zcat; else cat; fi

but this leaves zcat (the second gunzip) with an empty stdin (gzip: stdin: unexpected end of file). My suspicion is that the stream can only be read once and that mimetype has already consumed it, therefore leaving no input for zcat.
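That suspicion can be checked in isolation. The following is only a minimal sketch, not one of the original commands: the first cat stands in for mimetype and reads the pipe to the end, and wc -c then counts whatever is left for the next reader. The variable name leftover is made up for the illustration.

```shell
# The first reader (cat) drains the pipe, so the second reader (wc -c)
# starts at end-of-file. tr strips the padding some wc versions add.
leftover=$(printf 'one\ntwo\n' | { cat >/dev/null; wc -c; } | tr -d ' ')
echo "bytes left for the second reader: $leftover"
```

A pipe has no rewind: bytes that one process has read are gone for every process that reads after it.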

The if/else logic itself can be checked by hard-coding the comparison:

echo "My plain text" | gzip | gzip | gunzip | if [ 'application/gzip' == 'text/plain' ]; then cat; else zcat; fi

which runs fine.

What I'd like to find out:

  1. Is the assumption correct that you can only process the stream once?
  2. What could be a way forward without saving the data to hard-disk in between?

CodePudding user response:

Use zless instead of zcat. If its input isn't compressed, it will pass it through unchanged (this allows you to use a single run of zless with a mixture of compressed and uncompressed files). It's normally an interactive utility, but if the output is not a terminal it's just a filter.

echo "My plain text" | gzip | gzip | zless | zless
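If the detection step has to stay explicit, one possible approach is to read only the first two bytes, compare them with the gzip magic number (1f 8b in hex, 037 213 in octal), and then write those two bytes back in front of the rest of the stream. The sketch below assumes a POSIX shell with dd, od, sed and gunzip available; the function name maybe_gunzip is hypothetical and not part of any tool.

```shell
# Hypothetical helper: gunzip stdin only if it starts with the gzip magic bytes.
# The body runs in a subshell, so the variable oct stays private to it.
maybe_gunzip() (
  # Consume exactly two bytes and render them as octal, e.g. "037213".
  oct=$(dd bs=1 count=2 2>/dev/null | od -An -to1 | tr -d ' \n')
  {
    # Replay the consumed bytes (as \ddd escapes), then pass the rest through.
    printf "$(printf '%s' "$oct" | sed 's/.../\\&/g')"
    cat
  } | if [ "$oct" = "037213" ]; then gunzip; else cat; fi
)

# Works whether the input was gzipped twice, once, or not at all.
echo "My plain text" | gzip | gzip | maybe_gunzip | maybe_gunzip
```

Only the two peeked bytes are held in the shell, so nothing is written to disk. Checking the magic number alone can misfire on uncompressed data that happens to begin with those two bytes.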