Decompiler Components

pairipcore.decompiler._decompiler.decompiler_main(vm: VM, decomp: Decompiler, callback=None) -> Code

Main function to decompile the VM bytecode, executing the appropriate handlers for each opcode.

Parameters:
  • vm (VM) – The virtual machine instance.

  • decomp (Decompiler) – The decompiler instance with opcode handlers.

  • callback (Optional[Callable[[VM, int], None]]) – An optional callback function for handling unknown opcodes.

Returns:

The generated code as a Code object.

Return type:

Code
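
The dispatch described above can be sketched as follows. This is a minimal, self-contained illustration and not pairipcore's implementation: ToyVM, ToyDecompiler, and toy_decompiler_main are hypothetical stand-ins, and both the one-byte instruction layout and the choice to stop at the first unknown opcode are assumptions.

```python
from typing import Callable, Dict, List, Optional


class ToyVM:
    """Hypothetical stand-in for VM: a byte buffer plus a program counter."""

    def __init__(self, bytecode: bytes) -> None:
        self.bytecode = bytecode
        self.pc = 0


class ToyDecompiler:
    """Hypothetical stand-in for Decompiler: maps opcodes to handlers."""

    def __init__(self) -> None:
        self.handlers: Dict[int, Callable[[ToyVM], str]] = {}


def toy_decompiler_main(
    vm: ToyVM,
    decomp: ToyDecompiler,
    callback: Optional[Callable[[ToyVM, int], None]] = None,
) -> List[str]:
    """Run the handler for each opcode; report unknown opcodes to callback."""
    code: List[str] = []
    while vm.pc < len(vm.bytecode):
        opcode = vm.bytecode[vm.pc]
        handler = decomp.handlers.get(opcode)
        if handler is None:
            if callback is not None:
                callback(vm, opcode)
            break  # assumption: stop at the first unknown opcode
        code.append(handler(vm))
        vm.pc += 1  # assumption: every instruction is one byte long
    return code


decomp = ToyDecompiler()
decomp.handlers[0x01] = lambda vm: "a = 1;"
decomp.handlers[0x02] = lambda vm: "b = a;"

unknown: List[int] = []
lines = toy_decompiler_main(
    ToyVM(bytes([0x01, 0x02, 0xFF])),
    decomp,
    callback=lambda vm, op: unknown.append(op),
)
```

Here the callback only records the unknown opcode; a real callback with the documented Callable[[VM, int], None] shape could just as well log it or raise.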

pairipcore.decompiler._decompiler.from_opcode_def(opcode_def: dict[int | str, str | dict[str, str]]) -> Decompiler

Create a Decompiler instance from opcode definitions.

>>> dec = pairipcore.decompiler.from_opcode_def(...)
>>> code = pairipcore.interpret(vm, dec)
>>> for line in code:
...     print(line)

Parameters:

opcode_def (Dict[Union[int, str], Union[str, dict]]) – A dictionary where keys are opcodes and values are format IDs or dictionaries containing format IDs.

Returns:

The initialized Decompiler instance.

Return type:

Decompiler
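
To make the documented shape of opcode_def concrete, the sketch below flattens such a dictionary into a plain opcode-to-format-ID mapping. It is illustrative only: normalize_opcode_def is a hypothetical helper, the format IDs are placeholders, and two details are assumptions that the reference above does not state, namely that string keys are integer literals and that nested dictionaries keep the format ID under a "format_id" key.

```python
from typing import Dict, Union

OpcodeDef = Dict[Union[int, str], Union[str, Dict[str, str]]]


def normalize_opcode_def(opcode_def: OpcodeDef, key: str = "format_id") -> Dict[int, str]:
    """Flatten an opcode definition to {opcode: format ID}.

    Assumptions (not from pairipcore): string keys are integer literals such
    as "0x2a", and nested dictionaries store the format ID under `key`.
    """
    flat: Dict[int, str] = {}
    for opcode, value in opcode_def.items():
        # int(..., 0) accepts decimal, hex ("0x..") and other prefixed literals.
        number = int(opcode, 0) if isinstance(opcode, str) else opcode
        flat[number] = value[key] if isinstance(value, dict) else value
    return flat


example: OpcodeDef = {
    1: "FMT_A",                      # placeholder format ID
    "0x2a": {"format_id": "FMT_B"},  # placeholder nested entry
}
flat = normalize_opcode_def(example)
```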

CFG

class pairipcore.cfg.LocationDB

A database for managing address ranges and labels within a virtual machine.

__ranges_

List of address ranges.

Type:

List[range]

__labels_

Dictionary of address labels.

Type:

Dict[addr_t, str]

__init__() -> None
Return type:

None

add_insn(insn: Insn) -> None

Add an instruction’s address range to the database.

Parameters:

insn (Insn) – The instruction to add.

Return type:

None

add_memory(start: addr_t, length: int) -> None

Add a memory range to the database.

Parameters:
  • start (addr_t) – Starting address of the memory range.

  • length (int) – Length of the memory range.

Return type:

None

add(start: addr_t, end: addr_t) -> None

Add a custom address range to the database.

Parameters:
  • start (addr_t) – Starting address of the range.

  • end (addr_t) – Ending address of the range.

Return type:

None

get_label(addr: addr_t, is_mem: bool = False) -> str

Retrieve or generate a label for an address.

Parameters:
  • addr (addr_t) – The address to label.

  • is_mem (bool) – Whether the address is for memory or not.

Returns:

The label for the address.

Return type:

str

has_label(addr: addr_t) -> bool

Check if an address has an assigned label.

Parameters:

addr (addr_t) – The address to check.

Returns:

True if the address has a label, False otherwise.

Return type:

bool
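
The behaviour documented for LocationDB (store ranges, then retrieve or generate labels on demand) can be pictured with the sketch below. ToyLocationDB is a hypothetical stand-in, not the library class: the exclusive end address and the "mem_"/"loc_" label prefixes are assumptions made for the example.

```python
from typing import Dict, List


class ToyLocationDB:
    """Hypothetical stand-in for LocationDB: address ranges plus labels."""

    def __init__(self) -> None:
        self.ranges: List[range] = []
        self.labels: Dict[int, str] = {}

    def add(self, start: int, end: int) -> None:
        # Assumption: `end` is exclusive, as with Python's range().
        self.ranges.append(range(start, end))

    def add_memory(self, start: int, length: int) -> None:
        self.add(start, start + length)

    def has_label(self, addr: int) -> bool:
        return addr in self.labels

    def get_label(self, addr: int, is_mem: bool = False) -> str:
        # Retrieve the existing label, or generate and remember a new one.
        # Assumption: the "mem"/"loc" prefixes are invented for this sketch.
        if addr not in self.labels:
            prefix = "mem" if is_mem else "loc"
            self.labels[addr] = f"{prefix}_{addr:#06x}"
        return self.labels[addr]


db = ToyLocationDB()
db.add_memory(0x100, 16)
first = db.get_label(0x100, is_mem=True)
second = db.get_label(0x100)  # already labelled, so the label is reused
```

Generating labels lazily keeps the database small: only addresses that are actually referenced ever receive a name.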

class pairipcore.cfg.GraphDelegate

A delegate class for managing graph nodes and edges related to VM instructions and memory.

graph

The graph object for visualization.

Type:

graphviz.Digraph

opcode_ids

Mapping of opcodes to human-readable names.

Type:

Dict[int, str]

Parameters:
  • graph (graphviz.Digraph) – The graph object for visualization.

  • opcode_ids (Dict[int, str]) – Mapping of opcodes to human-readable names.

__init__(graph, opcode_ids: dict) -> None
Parameters:

opcode_ids (dict)

Return type:

None

memory_node(mem_label: str, mem_addr: addr_t) -> None

Create a graph node for a memory location.

Parameters:
  • mem_label (str) – The label for the memory node.

  • mem_addr (addr_t) – The memory address.

Return type:

None

leaf_insn_node(label: str, opcode: int) -> None

Create a graph node for a leaf instruction.

Parameters:
  • label (str) – The label for the instruction node.

  • opcode (int) – The opcode of the instruction.

Return type:

None

insn_node(label: str, opcode: int, insn: Insn, loc_db: LocationDB) -> None

Create a graph node for an instruction, including its memory accesses.

Parameters:
  • label (str) – The label for the instruction node.

  • opcode (int) – The opcode of the instruction.

  • insn (Insn) – The instruction object.

  • loc_db (LocationDB) – The location database for memory and label management.

Return type:

None

pairipcore.cfg.instruction_handler(vm: VM, depth: int) -> None

Recursively handle and visualize instructions in the VM’s memory, updating the location database and graph delegate with instruction details.

Parameters:
  • vm (VM) – The virtual machine instance.

  • depth (int) – The current depth of recursion.

Return type:

None

pairipcore.cfg.new_cfg(vm: VM, opcode_def: dict, loc_db=None, depth=None, path=None, layout=None) -> Digraph

Generates a control flow graph (CFG) for a VM’s instructions.

>>> vm = VM(...)
>>> cfg = pairipcore.cfg.new_cfg(vm, opcode_def={...}, depth=5)
>>> with open("cfg.dot", "w") as fp:
...     fp.write(str(cfg))

Parameters:
  • vm (VM) – The virtual machine instance.

  • opcode_def (dict) – Definitions of opcodes, including format IDs.

  • loc_db (LocationDB, optional) – Pre-existing location database. If None, a new one will be created.

  • depth (int, optional) – Maximum depth for recursion in CFG generation.

  • path (list[int], optional) – Specific opcode path to follow in the CFG.

  • layout (str, optional) – Graphviz layout direction (e.g., ‘TB’ for top-to-bottom).

Returns:

The generated control flow graph.

Return type:

graphviz.Digraph
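
The role of the depth and layout parameters can be illustrated with a depth-limited recursive walk that emits DOT text. This is a sketch under stated assumptions, not how new_cfg is implemented: the SUCCESSORS table, walk, and to_dot are hypothetical, the output is built as a plain string rather than a graphviz.Digraph so the example needs no third-party package, and layout is assumed to map onto Graphviz's rankdir attribute.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical program: address -> addresses reachable from it.
SUCCESSORS: Dict[int, List[int]] = {0: [4], 4: [8, 12], 8: [0], 12: []}


def walk(addr: int, depth: int, max_depth: Optional[int],
         seen: Set[int], edges: List[Tuple[int, int]]) -> None:
    """Recursively follow successors, stopping at max_depth or on revisits."""
    if addr in seen or (max_depth is not None and depth > max_depth):
        return
    seen.add(addr)
    for nxt in SUCCESSORS.get(addr, []):
        edges.append((addr, nxt))
        walk(nxt, depth + 1, max_depth, seen, edges)


def to_dot(edges: List[Tuple[int, int]], layout: str = "TB") -> str:
    """Render the edges as DOT text; layout is written as rankdir."""
    body = "\n".join(f'    "{a:#x}" -> "{b:#x}";' for a, b in edges)
    return f"digraph cfg {{\n    rankdir={layout};\n{body}\n}}"


edges: List[Tuple[int, int]] = []
walk(0, depth=0, max_depth=1, seen=set(), edges=edges)
dot = to_dot(edges)
```

With max_depth=1 the walk records the edges out of addresses 0 and 4 but does not expand 8 or 12, which is the kind of pruning a depth limit provides on large programs.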

Disassembler

class pairipcore.disassembler.VMOp_DisasmHandler

Handler class for disassembling VM operations with detailed debugging output.

debug(vm: VM) -> None

Output debugging information about the opcode execution.

Parameters:

vm (VM) – The virtual machine instance.

Return type:

None

pairipcore.disassembler.as_disasm(dec: Decompiler) -> Decompiler

Configure the decompiler to use disassembly handlers for debugging.

>>> dec = pairipcore.decompiler.from_opcode_def(...)
>>> dis = pairipcore.disassembler.as_disasm(dec)
>>> code = pairipcore.interpret(vm, dis)
>>> for line in code:
...     print(line)

Parameters:

dec (Decompiler) – The decompiler instance to configure.

Returns:

The configured decompiler instance with disassembly handlers.

Return type:

Decompiler
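
One way to picture "disassembly handlers for debugging" is a handler table whose entries record debug information before delegating to the original handler. The sketch below shows only that wrapping pattern; with_debug, the dict-based VM stand-in, and the log format are all hypothetical and are not taken from pairipcore.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], str]


def with_debug(handlers: Dict[int, Handler], log: List[str]) -> Dict[int, Handler]:
    """Return a new handler table whose entries log before delegating."""

    def wrap(opcode: int, handler: Handler) -> Handler:
        def debug_handler(vm: dict) -> str:
            # Record where we are and which opcode runs, then delegate.
            log.append(f"pc={vm['pc']:#x} opcode={opcode:#04x}")
            return handler(vm)

        return debug_handler

    return {op: wrap(op, h) for op, h in handlers.items()}


log: List[str] = []
table = with_debug({0x01: lambda vm: "nop;"}, log)
out = table[0x01]({"pc": 0x10})  # the VM is modelled as a plain dict here
```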

Code Writer

class pairipcore.decompiler.code.LazyCodeStatement

A base class representing a lazy code statement that will be converted to a string representation.

c_str() -> str

Convert the lazy code statement to a C-style string representation. This method should be implemented by subclasses.

Return type:

str

class pairipcore.decompiler.code.AssignmentExpr

A class representing an assignment expression in code.

c_str() -> str

Convert the assignment expression to a C-style string representation.

Return type:

str

__init__(left: VMVariable | str | LazyCodeStatement, right: VMVariable | str | LazyCodeStatement) -> None
Parameters:
  • left (VMVariable | str | LazyCodeStatement)

  • right (VMVariable | str | LazyCodeStatement)

Return type:

None

class pairipcore.decompiler.code.Line

A class representing a single line of code.

c_str() -> str

Convert the line to a C-style string representation.

Return type:

str

__init__(text: str) -> None
Parameters:

text (str)

Return type:

None

class pairipcore.decompiler.code.Block

A class representing a block of code, which consists of multiple statements.

__init__(*statements) -> None
Return type:

None

c_str() -> str

Convert the block and its statements to a C-style string representation.

Return type:

str

class pairipcore.decompiler.code.Comment

A class representing a comment in code.

c_str() -> str

Convert the comment to a C-style string representation.

Return type:

str

__init__(text: str, other: LazyCodeStatement | None = None) -> None
Parameters:
  • text (str)

  • other (LazyCodeStatement | None)

Return type:

None

class pairipcore.decompiler.code.CallExpr

A class representing a function or method call expression.

c_str() -> str

Return the function call expression in C-style syntax.

Return type:

str

__init__(obj: str | VMVariable | None, func: str, args: list[str | VMVariable] | None = None, is_ptr: bool = False) -> None
Parameters:
  • obj (str | VMVariable | None)

  • func (str)

  • args (list[str | VMVariable] | None)

  • is_ptr (bool)

Return type:

None

class pairipcore.decompiler.code.Code

A class representing a collection of code statements and variable declarations.

Parameters:

vm (VM) – The virtual machine containing variable information.

__init__(vm: VM) -> None
Parameters:

vm (VM)

Return type:

None

lines()

Generate lines of code including variable declarations and statements.

Yields:

str – The lines of code, including variable declarations and statements.
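
The lazy-statement pattern used by the classes above (build a tree of statement objects first, render it with c_str() only when text is needed) can be sketched as follows. The Toy* classes are hypothetical stand-ins that mirror the documented constructor arguments; the exact output format, including the use of "->" when is_ptr is set and the trailing semicolon on assignments, is an assumption.

```python
from typing import List, Optional, Union


class ToyStatement:
    """Hypothetical stand-in for LazyCodeStatement."""

    def c_str(self) -> str:
        raise NotImplementedError


def render(value: Union[str, ToyStatement]) -> str:
    # Strings are used as-is; statements are rendered only when needed.
    return value.c_str() if isinstance(value, ToyStatement) else str(value)


class ToyCall(ToyStatement):
    def __init__(self, obj: Optional[str], func: str,
                 args: Optional[List[str]] = None, is_ptr: bool = False) -> None:
        self.obj, self.func = obj, func
        self.args, self.is_ptr = args or [], is_ptr

    def c_str(self) -> str:
        # Assumption: is_ptr selects "->" over "." for member access.
        target = "" if self.obj is None else self.obj + ("->" if self.is_ptr else ".")
        return f"{target}{self.func}({', '.join(self.args)})"


class ToyAssign(ToyStatement):
    def __init__(self, left: Union[str, ToyStatement],
                 right: Union[str, ToyStatement]) -> None:
        self.left, self.right = left, right

    def c_str(self) -> str:
        return f"{render(self.left)} = {render(self.right)};"


class ToyBlock(ToyStatement):
    def __init__(self, *statements: ToyStatement) -> None:
        self.statements = statements

    def c_str(self) -> str:
        inner = "\n".join("    " + s.c_str() for s in self.statements)
        return "{\n" + inner + "\n}"


stmt = ToyAssign("v0", ToyCall("env", "GetVersion", is_ptr=True))
text = stmt.c_str()
block_text = ToyBlock(stmt).c_str()
```

Deferring rendering this way means operands can be renamed or retyped after a statement has been created, and the final text still reflects the latest state.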

Utilities

pairipcore.decompiler.utils.VMJump(vm: VM, insn: Insn, verify=True) -> None

Set the program counter (PC) to the instruction’s next address, verifying the instruction’s hash first unless verify is False.

Parameters:
  • vm (VM) – The virtual machine instance.

  • insn (Insn) – The instruction containing the jump information.

  • verify (bool) – Whether to perform hash verification before jumping.

Return type:

None

pairipcore.decompiler.utils.VMDeref(vm: VM, addr: addr_t)

Dereference a memory address to retrieve the stored value.

Parameters:
  • vm (VM) – The virtual machine instance.

  • addr (addr_t) – The address to dereference.

Returns:

The value stored at the given address.

Return type:

Any

pairipcore.decompiler.utils.VMNewGlobalVar(vm: VM, addr: addr_t, type: str, value=None) -> VMVariable

Create a new global variable and add it to the VM’s memory.

Parameters:
  • vm (VM) – The virtual machine instance.

  • addr (addr_t) – The address for the new global variable.

  • type (str) – The type of the variable.

  • value (Optional[Any]) – The initial value of the variable, if any.

Returns:

The newly created global variable.

Return type:

VMVariable

pairipcore.decompiler.utils.VMGetGlobalVar(vm: VM, addr: addr_t) -> VMVariable

Retrieve an existing global variable by its address.

Parameters:
  • vm (VM) – The virtual machine instance.

  • addr (addr_t) – The address of the global variable.

Returns:

The global variable at the given address.

Return type:

VMVariable

pairipcore.decompiler.utils.VMGetOrCreateGlobalVar(vm: VM, addr: addr_t, type: str, value=None) -> VMVariable

Get an existing global variable or create a new one if it doesn’t exist.

Parameters:
  • vm (VM) – The virtual machine instance.

  • addr (addr_t) – The address of the global variable.

  • type (str) – The type of the variable.

  • value (Optional[Any]) – The value to set for the variable if created.

Returns:

The global variable at the given address, either newly created or existing.

Return type:

VMVariable
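
The get-or-create behaviour can be sketched with a plain dictionary keyed by address. ToyVariable and get_or_create are hypothetical stand-ins for VMVariable and VMGetOrCreateGlobalVar, and storing globals in a dict is an assumption about the underlying bookkeeping, not a description of the library's internals.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ToyVariable:
    """Hypothetical stand-in for VMVariable."""
    addr: int
    type: str
    value: Optional[Any] = None


def get_or_create(globals_: Dict[int, ToyVariable], addr: int,
                  type: str, value: Any = None) -> ToyVariable:
    """Return the variable at addr, creating it only if it is missing."""
    var = globals_.get(addr)
    if var is None:
        var = ToyVariable(addr, type, value)
        globals_[addr] = var
    return var


table: Dict[int, ToyVariable] = {}
a = get_or_create(table, 0x40, "int", 7)
b = get_or_create(table, 0x40, "int", 99)  # existing entry wins; 99 is ignored
```

As the parameter description above says, value is only used when the variable is created, so the second call returns the original object unchanged.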