2. Basic Concepts#

In this section, we’ll explore some common techniques used in binary file formats, setting the stage for more advanced topics in the next chapter.

Note

Some examples using the interpreter prompts make use of a shortcut to define Field objects:

>>> from caterpillar.shortcuts import F
>>> field = F(uint8)

2.2. Context#

Caterpillar uses a special Context to keep track of the current packing or unpacking process. A context contains special variables, which are discussed in the Context reference in detail.

The current object that is being packed or parsed can be referenced with a shortcut this. Additionally, the parent object (if any) can be referenced by using parent.

Python

Understanding the context#

@struct
class Format:
    length: uint8
    foo: CString(this.length)   # <-- just reference the length field

Caterpillar C

Understanding the context#

this = ContextPath("obj")

@struct
class Format:
    length: u8
    foo: cstring(this.length)

Note

You can apply any operation on context paths. However, be aware that conditional branches must be encapsulated by lambda expressions.

2.2.1. Runtime length of objects#

In cases where you want to retrieve the runtime length of a variable that is within the current accessible bounds, there is a special class designed for that use-case: lenof.

You might have seen this special class before when calculating the length of some strings. It simply applies the len(...) function of the retrieved variable.

Tip

To access elements of a sequence within the context, you can just use this.foobar[...].

2.3. Standard Structs#

We still have some important struct types to discuss to start defining complex structs.

2.3.1. Constants#

Proprietary file formats or binary formats often store magic bytes usually at the start of the data stream. Constant values will be validated against the parsed data and will be applied to the class automatically, eliminating the need to write them into the constructor every time.

2.3.1.1. ConstBytes#

These constants can be defined implicitly by annotating a field in a struct class with bytes. For example, in the case of starting the main PNG struct:

Starting the main PNG struct#

@struct(order=BigEndian) # <-- will be relevant later on
class PNG:
    magic: b"\x89PNG\x0D\x0A\x1A\x0A"
    # other fields will be defined at the end of this tutorial.

2.3.1.2. Const#

Raw constant values require a struct to be defined to parse or build the value. For example:

>>> field = F(Const(0xbeef, uint32))

2.3.2. Compression#

This library also supports default compression formats like zlib, lzma, bz2 and, if installed via pip, lzo (using lzallright).

>>> field = ZLibCompressed(100) # length or struct here applicable

2.3.3. Specials#

All of the following structs may be used in special situations where all other previously discussed structs can’t be used.

2.3.3.1. Computed#

A runtime computed variable that does not pack any data. It is rarely recommended to use this struct, because you can simply define a @property or method for what this structs represents, unless you need the value later on while packing or unpacking.

>>> struct = Computed(this.foobar) # context lambda or constant value

Challenge

Implement the gAMA chunk for our PNG format and use a Computed struct to calculate the real gamma value.

2.3.3.2. Pass#

In case nothing should be done, just use Pass. This struct won’t affect the stream in any way.

Important

Congratulations! You have successfully mastered the basics of caterpillar! Are you ready for the next level? Brace yourself for some breathtaking action!

2. Basic Concepts#

2.1. Standard Types#

2.1.1. Numbers#

2.1.1.1. Custom-sized integer#

2.1.1.2. Variable-sized integer#

2.1.2. Enumerations#

2.1.3. Arrays/Lists#

2.1.4. String Types#

2.1.4.1. CString#

2.1.4.2. String#

2.1.4.3. Prefixed#

2.1.5. Byte Sequences#

2.1.5.1. Memory#

2.1.5.2. Bytes#

2.1.6. Padding#