Python Decoding binary data back to file

Time:01-20

I have an MSSQL database containing compressed and converted files. The values look like this:

(Screenshot of the values; each of them is about 40k characters long.)

I need to decode these values into PDF, DOCX and PNG files.

I've tried to do this with base64, but it didn't produce valid files.

Do you have any ideas how I could decode all of them and build the correct files?

CodePudding user response:

The data is hex-encoded. Try:

from base64 import b16decode

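# Sample value; the real column values are much longer
encoded = '0x48656C6C6F'
# Skip the leading '0x' marker, then decode the hex digits
# (b16decode expects uppercase digits unless casefold=True is passed)
decoded = b16decode(encoded[2:])
print(decoded)

This outputs b'Hello'.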
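Since the question mentions PDF, DOCX and PNG files, a possible next step is to decode each value and look at its first bytes to guess the file type. The following is only a sketch: the function names `hex_to_bytes` and `guess_extension` are made up for illustration, and it assumes each value holds one complete file that starts at the first decoded byte, which may not be true for this database.

```python
import binascii

# Well-known leading signatures ("magic bytes"). The extensions are only labels.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF": "pdf",
    b"PK\x03\x04": "docx",  # any ZIP container; DOCX is one of them
}

def hex_to_bytes(value):
    """Decode a hex string, tolerating an optional 0x prefix and mixed case."""
    if value[:2].lower() == "0x":
        value = value[2:]
    return binascii.unhexlify(value)

def guess_extension(data):
    """Return a file extension guessed from the leading bytes, or 'bin'."""
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            return ext
    return "bin"

if __name__ == "__main__":
    data = hex_to_bytes("0x89504E470D0A1A0A")
    print(guess_extension(data))
```

A ZIP signature alone does not prove the file is a DOCX, so treat the result as a hint rather than a verdict.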

CodePudding user response:

As you're learning the hard way, stuffing blobs into a text database is probably the worst sin a novice data manager can commit: the result is bloated, unwieldy, and slow. It is best to leave the source files in their fast, native compressed state and simply reference them in the DB by a related unique ID and file storage name. Rant over.

The fact that they are fixed-size blocks of 40K suggests they are chunked into pieces, so several chunks are needed to recreate one whole BLOB.

The blob you presented appears to be just part of a PNG image which, if I am interpreting it correctly, should be

2164 pixels wide by 835 pixels high

However, the output is only 5 pixels high within that oddly sized canvas, which might be correct if it's just the first part of a much longer, truncated stream.
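The declared canvas size can be read straight from the PNG header without rendering anything. In a PNG stream the 8-byte signature is followed by the IHDR chunk, whose first two fields are the width and height as big-endian 32-bit integers. A minimal sketch, assuming `data` already starts at the PNG signature (the helper name `png_dimensions` is hypothetical):

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data):
    """Return (width, height) declared in the IHDR chunk of a PNG stream."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG stream starting with an IHDR chunk")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

This only reports what the header claims; a truncated stream can still declare the full size while containing just the first few rows of pixels.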

Your 40K chunk translates to about 22K of binary with the characteristics of a PNG, but a PNG starts with the byte 89, so you have a problem: the data is prefixed with 0x 00 22 40 DD BF.

We can discard the 0x as the marker of a hex stream and use the remainder, as I did above, but what is the significance of the odd 00 22 40 DD BF? Most likely it contains, in part, an indicator of the final full length or a pointer to the next chunk.
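One way to separate those extra leading bytes from the image data is to search the decoded bytes for the PNG signature and split there. This is a sketch only: `split_prefix` is a hypothetical helper, and what the prefix bytes actually mean is still unknown, so it just hands them back for inspection.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def split_prefix(data, signature=PNG_SIGNATURE):
    """Split decoded bytes into (prefix, payload) at the first signature match."""
    offset = data.find(signature)
    if offset < 0:
        raise ValueError("signature not found in data")
    return data[:offset], data[offset:]

if __name__ == "__main__":
    # The five prefix bytes quoted above, followed by a stand-in payload.
    sample = bytes.fromhex("002240DDBF") + PNG_SIGNATURE + b"..."
    prefix, payload = split_prefix(sample)
    print(prefix.hex(), len(payload))
```

Comparing the prefix across several rows may show whether it encodes a length, a sequence number, or something else.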

What you need to do is extract that image by your normal method and compare its size with the total expected file size, since the 22 KB of binary may amount to only a small percentage of the whole. In that case you need to determine how and where the rest of the image is stored so that you can concatenate all the parts into one homogeneous blob, i.e. a single image.

You need to have sight of the method where chunks are extracted slowly, converted slowly, and stitched together slowly, but using some measure of the expected file size.
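If the remaining chunks can be located, the stitching itself is short. The sketch below assumes the chunks are already available as hex strings in the correct order and carry no per-chunk header bytes; neither is confirmed for this database. `join_chunks` and its `expected_size` parameter are hypothetical names.

```python
import binascii

def join_chunks(hex_chunks, expected_size=None):
    """Decode hex chunks in order and concatenate them into one blob."""
    parts = []
    for value in hex_chunks:
        # Drop an optional 0x marker before decoding the hex digits.
        if value[:2].lower() == "0x":
            value = value[2:]
        parts.append(binascii.unhexlify(value))
    blob = b"".join(parts)
    # Optional sanity check against a known total size, if one is available.
    if expected_size is not None and len(blob) != expected_size:
        raise ValueError(
            "expected %d bytes, got %d" % (expected_size, len(blob))
        )
    return blob
```

The result could then be written with `open(path, "wb")`; a size mismatch suggests that chunks are missing or that each chunk carries extra bytes that need to be removed first.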
