Python 3 garbage data into the struct.unpack

Time:01-15

I am trying to send and receive messages via socket using Python 3.

BUFFER_SIZE = 1024
# Create message
MLI = struct.pack("!I", len(MESSAGE))
MLI_MESSAGE = MLI + str.encode(MESSAGE)

Sending the message and receiving the response:

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(MLI_MESSAGE)
print ("Sent data: ‘", MESSAGE, "’")
# Receive MLI from response (you might want to add some timeout handling as well)
resp = s.recv(BUFFER_SIZE)
resp = struct.unpack("!I", resp)[0]
print(resp)

resp:

b'\x00\t\xeb\x07\xdf\x01\x00\xdf\x02\x010'

I am getting this error:

struct.error: unpack requires a buffer of 4 bytes

I think it is related to the \t character in resp, but I am not sure. How can I remove that \t character, and how do I solve this issue?

CodePudding user response:

You are basically trying to do the following (sockets removed):

1 import struct
2 
3 msg = "foobar"
4 mli = struct.pack("!I", len(msg))
5 mli_msg = mli + str.encode(msg)
6 
7 len = struct.unpack("!I", mli_msg)[0]
8 print(len)

The extraction of the length in line 7 fails because you pass the whole mli_msg to unpack, not just the 4 bytes that "!I" expects for the length. Instead, slice off the first 4 bytes:

7 len = struct.unpack("!I", mli_msg[:4])[0] 

Apart from that, it is wrong to take the length of the message first and convert the message to bytes afterwards. len(msg) counts characters, while the encoded message is measured in bytes, and the two differ when non-ASCII characters are involved: compare len("ü") with len(str.encode("ü")). Convert the message to bytes first, then take the length of the result, so that the prefix carries the correct byte length of what you send.

4 encoded_msg = str.encode(msg)
5 mli_msg = struct.pack("!I", len(encoded_msg)) + encoded_msg
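
To make the character-count versus byte-count point concrete, here is a minimal sketch of packing and unpacking a length-prefixed message. The helper names frame and unframe are made up for this illustration, and UTF-8 is assumed as the encoding (it is the default for str.encode).

```python
import struct

def frame(text: str) -> bytes:
    """Prefix the UTF-8 bytes of text with their length as a 4-byte big-endian integer."""
    payload = text.encode("utf-8")          # encode first ...
    return struct.pack("!I", len(payload)) + payload  # ... then measure the bytes

def unframe(data: bytes) -> str:
    """Read the 4-byte length prefix, then decode exactly that many payload bytes."""
    (size,) = struct.unpack("!I", data[:4])  # only the first 4 bytes go to unpack
    return data[4:4 + size].decode("utf-8")

char_count = len("ü")                  # 1 character
byte_count = len("ü".encode("utf-8"))  # 2 bytes in UTF-8
framed = frame("ü")
print(char_count, byte_count)  # 1 2
print(framed)                  # b'\x00\x00\x00\x02\xc3\xbc'
print(unframe(framed))         # ü
```

With len("ü") as the prefix, the receiver would be told to expect 1 byte and would cut the 2-byte character in half.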

CodePudding user response:

Explanation of !I:

  • ! indicates network byte order (big-endian), with standard sizes and no alignment padding
  • I indicates unsigned integer type, occupying 4 bytes.
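
A quick way to check both points is to ask struct for the size of the format and to compare the packed bytes against the little-endian variant. This is only an illustration; the variable names are arbitrary.

```python
import struct

size = struct.calcsize("!I")   # number of bytes "!I" packs to and expects
big = struct.pack("!I", 1)     # network byte order: most significant byte first
little = struct.pack("<I", 1)  # little-endian, shown for contrast

print(size)    # 4
print(big)     # b'\x00\x00\x00\x01'
print(little)  # b'\x01\x00\x00\x00'
```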

The value of resp is b'\x00\t\xeb\x07\xdf\x01\x00\xdf\x02\x010', which is 11 bytes long, more than the 4 bytes that "!I" expects.

You can slice off the first 4 bytes and parse only those, like below.

import struct

resp = b'\x00\t\xeb\x07\xdf\x01\x00\xdf\x02\x010'
print(struct.unpack("!I", resp[:4])[0])

# 649991
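
Slicing works once the data is in hand, but recv(BUFFER_SIZE) may return fewer or more bytes than one message. A common pattern, sketched here, is to read exactly 4 bytes for the length and then exactly that many bytes for the body. The helpers recv_exact and recv_message are hypothetical names, and the sketch assumes the peer really sends a 4-byte "!I" length prefix, which the question does not confirm for this server. A local socket pair stands in for the real connection.

```python
import socket
import struct

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_message(sock: socket.socket) -> bytes:
    """Read a 4-byte big-endian length prefix, then that many payload bytes."""
    (size,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, size)

# Local demonstration: a connected socket pair replaces the real server.
a, b = socket.socketpair()
payload = "foobar".encode("utf-8")
a.sendall(struct.pack("!I", len(payload)) + payload)
received = recv_message(b)
print(received)  # b'foobar'
a.close()
b.close()
```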