2.3. Byte Sequences

Binary formats often contain raw byte sequences that have no further internal structure. Caterpillar provides several structs for handling them, whether the data is exposed as a memoryview, converted to a bytes object, or prefixed with length information.

2.3.1. Memory

The Memory struct is the right choice when the data can be wrapped by a memoryview. It lets you define fields with a static or dynamic size, and it keeps the printed form of unpacked objects short and readable.

Python

>>> m = F(Memory(5)) # static size; dynamic size is allowed too
>>> pack(bytes(range(5)), m)
b'\x00\x01\x02\x03\x04'
>>> unpack(m, _)
<memory at 0x00000204FDFA4411>
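
The `<memory at ...>` result above is an ordinary Python memoryview. As a minimal sketch using only the standard library (no Caterpillar API is involved, and the variable names are illustrative), the following shows how a view over a buffer differs from a bytes copy of it:

```python
# Stdlib-only illustration; no Caterpillar API is used here.
buffer = bytearray(range(5))   # mutable backing storage
view = memoryview(buffer)      # wraps the buffer without copying it
copy = bytes(view)             # materializes an independent copy

buffer[0] = 0xFF               # mutate the backing storage

print(view[0])                 # the view reflects the change
print(copy[0])                 # the copy keeps the old value
print(view)                    # short repr of the form <memory at 0x...>
print(view.tobytes())          # explicit conversion when the bytes are needed
```

Because a view does not copy the data, it suits large payloads that are only passed along or sliced.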

2.3.2. Bytes

If you need direct access to byte sequences, the Bytes struct is the solution. This struct converts a memoryview to bytes for easy manipulation. You can define fields with static, dynamic, or greedy sizes based on your needs.

Python

>>> bytes_obj = Bytes(5) # static, dynamic and greedy size allowed

Caterpillar C

>>> b = octetstring(5) # static, dynamic size allowed
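
To make the three sizing modes concrete, here is a rough sketch in plain Python. It does not use Caterpillar, the helper names are hypothetical, and it assumes the usual reading of the terms: a static size is fixed in the definition, a dynamic size is taken from data parsed earlier, and a greedy size consumes the rest of the stream.

```python
import io
import struct


def read_static(stream, size):
    """Static size: the length is fixed when the format is defined."""
    return stream.read(size)


def read_dynamic(stream):
    """Dynamic size: the length comes from data parsed earlier (here a uint8)."""
    (size,) = struct.unpack("B", stream.read(1))
    return stream.read(size)


def read_greedy(stream):
    """Greedy size: consume everything that is left in the stream."""
    return stream.read()


# 4 fixed bytes, then a length-prefixed field, then the remainder
stream = io.BytesIO(b"HEAD" + b"\x03abc" + b"the rest")
header = read_static(stream, 4)
body = read_dynamic(stream)
tail = read_greedy(stream)
print(header, body, tail)
```

A greedy field only makes sense as the last field of a structure, since nothing is left to read after it.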

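Let’s implement a struct for the fDAT chunk, which stores frame data in the APNG extension of the PNG format. The chunk body holds a 4-byte sequence number followed by the frame data, so the frame data occupies the chunk length minus 4 bytes. The Python version uses the Memory struct for the frame data, while the Caterpillar C version uses octetstring.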

Python

Implementation for the frame data chunk

@struct(order=BigEndian)                    # <-- endianness as usual
class FDATChunk:
    sequence_number: uint32
    # We use a Memory instance here rather than Bytes()
    frame_data: Memory(parent.length - 4)

Caterpillar C

Implementation for the frame data chunk

parent = ContextPath("parent.obj")

@struct(endian=BIG_ENDIAN)
class FDATChunk:
    sequence_number: u32
    frame_data: octetstring(parent.length - 4)
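
The expression `parent.length - 4` is easier to follow next to a hand-written parser. The sketch below uses only the standard library and is not the Caterpillar implementation. The helper names `build_chunk` and `parse_fdat` are hypothetical, and it assumes that `parent.length` refers to the length field of the enclosing PNG chunk (length, type, data, CRC).

```python
import struct
import zlib


def build_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Assemble a PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def parse_fdat(raw: bytes):
    """Return (sequence_number, frame_data) from one complete fDAT chunk."""
    length, chunk_type = struct.unpack_from(">I4s", raw, 0)
    if chunk_type != b"fDAT":
        raise ValueError("not an fDAT chunk")
    body = memoryview(raw)[8:8 + length]        # chunk data, viewed without copying
    (sequence_number,) = struct.unpack_from(">I", body, 0)
    frame_data = body[4:]                       # remaining length - 4 bytes
    return sequence_number, frame_data


payload = b"\x01\x02\x03"
raw = build_chunk(b"fDAT", struct.pack(">I", 7) + payload)
sequence_number, frame_data = parse_fdat(raw)
print(sequence_number, bytes(frame_data))
```

Slicing a memoryview keeps the frame data as a view on the original buffer, which mirrors the reason for preferring Memory over Bytes in the Python struct above.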

Challenge

If you feel ready for a more advanced structure, try implementing the zTXt chunk for compressed textual data.
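
As a reference for checking your result, the zTXt data layout can be decoded with the standard library alone: a Latin-1 keyword, a null separator, a one-byte compression method (0 means zlib/deflate), and the compressed text. This is a sketch of the layout and not a Caterpillar solution. The function names are hypothetical.

```python
import zlib


def build_ztxt(keyword: str, text: str) -> bytes:
    """Encode zTXt chunk data: keyword, null separator, method 0, zlib stream."""
    compressed = zlib.compress(text.encode("latin-1"))
    return keyword.encode("latin-1") + b"\x00" + b"\x00" + compressed


def parse_ztxt(data: bytes):
    """Decode zTXt chunk data into (keyword, text)."""
    keyword, separator, rest = data.partition(b"\x00")
    if not separator or not rest:
        raise ValueError("missing null separator or compression method")
    if rest[0] != 0:
        raise ValueError("unknown compression method")
    text = zlib.decompress(rest[1:]).decode("latin-1")
    return keyword.decode("latin-1"), text


keyword, text = parse_ztxt(build_ztxt("Comment", "hello zTXt"))
print(keyword, text)
```

The keyword is a null-terminated string and the compressed text runs to the end of the chunk data, so a Caterpillar struct would need a terminated field followed by a field sized from the chunk length.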
