2. Smali Reader API
This module contains an implementation of a line-based Smali source code parser. It can be used to parse Smali files produced by a decompilation routine.
The parsing model is rather simple and is described in the following section.
Note
Please note that no optimization is done by default, so methods or fields that won’t be visited by a SmaliWriter won’t be covered and won’t be visible in the final file.
To copy non-visited structures, just add the reader variable to the SmaliWriter when creating a new one:
from smali.reader import SmaliReader
from smali.writer import SmaliWriter

reader = SmaliReader(...)
writer = SmaliWriter(reader)

reader.visit(source_code, writer)
Hint
You can add your own copy_handler to the reader instance if you want to use your own callback to copy raw lines.
2.1. Parsing model
Parsing is done by inspecting each line as a possible input. Although some statements consume more than one line, only one line per statement is used.
Statements can be divided into the following groups:
Token:
Statements begin with a leading . and can open a new statement block. They are specified in the Token class described in Smali token.
Invocation blocks:
Block statements used within method declarations start with a : and just specify the block’s id.
Annotation values:
Annotation values don’t have a leading identifier and will only be parsed within .annotation or .subannotation statements.
Method instructions:
The same goes for method instructions: they are handled only if a method context is present.
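The four groups above can be illustrated with a small, self-contained classifier. This is only a sketch of the idea, not the reader's actual code: `LineKind`, `classify_line` and the `in_method` / `in_annotation` flags are hypothetical names introduced here to stand in for the parsing context.

```python
from enum import Enum


class LineKind(Enum):
    """Hypothetical labels for the four statement groups described above."""
    TOKEN = "token"
    BLOCK = "block"
    ANNOTATION_VALUE = "annotation_value"
    INSTRUCTION = "instruction"
    UNKNOWN = "unknown"


def classify_line(line: str, in_method: bool = False,
                  in_annotation: bool = False) -> LineKind:
    """Assign one Smali line to a statement group (illustrative only)."""
    stripped = line.strip()
    if not stripped:
        return LineKind.UNKNOWN
    # Tokens always start with a leading '.'
    if stripped.startswith("."):
        return LineKind.TOKEN
    # Block ids start with ':' and only make sense inside a method
    if stripped.startswith(":") and in_method:
        return LineKind.BLOCK
    # Annotation values have no leading identifier and need an annotation context
    if in_annotation:
        return LineKind.ANNOTATION_VALUE
    # Everything else inside a method is treated as an instruction
    if in_method:
        return LineKind.INSTRUCTION
    return LineKind.UNKNOWN
```

Note that the same text is classified differently depending on the context flags, which mirrors the statement that instructions and annotation values are handled only inside their respective contexts.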
- class smali.reader.SupportsCopy
Interface for classes that can act as a copy handler for a SmaliReader.
Note that the context is used to distinguish the current visitor.
- copy(line: str, context: type = <class 'smali.visitor.ClassVisitor'>) -> None
Copies the given line.
- Parameters:
line (str) – the line to copy
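As a rough illustration of the interface, the following sketch defines a handler that records raw lines together with the visitor type they belong to. It does not import the library: `SupportsCopy`, `ClassVisitor` and `MethodVisitor` are local stand-ins that only mirror the documented names, and `CollectingCopyHandler` is a hypothetical class.

```python
from typing import List, Protocol, Tuple


class ClassVisitor:
    """Local stand-in for smali.visitor.ClassVisitor (assumption)."""


class MethodVisitor:
    """Local stand-in for a method-level visitor type (assumption)."""


class SupportsCopy(Protocol):
    """Local mirror of the documented interface, not the library class."""

    def copy(self, line: str, context: type = ClassVisitor) -> None: ...


class CollectingCopyHandler:
    """Hypothetical handler that stores every copied line with its context."""

    def __init__(self) -> None:
        self.lines: List[Tuple[str, str]] = []

    def copy(self, line: str, context: type = ClassVisitor) -> None:
        # The context type tells us which kind of visitor the line belongs to.
        self.lines.append((context.__name__, line))


handler: SupportsCopy = CollectingCopyHandler()
handler.copy(".field private x:I")
handler.copy("return-void", context=MethodVisitor)
```

How such an object is attached to a real reader instance is not shown in this section beyond the copy_handler hint above.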
- class smali.reader.SmaliReader(validate: bool = True, comments: bool = False, snippet: bool = False, errors: str = 'strict')
Basic implementation of a line-based Smali source code parser.
- Parameters:
validate (bool, optional) – Indicates the reader should validate the input code, defaults to True
comments (bool, optional) – With this option enabled, the parser will also notify about comments in the source file, defaults to False
snippet (bool, optional) – With this option enabled, the initial class definition will be skipped, defaults to False
errors (str, optional) – Indicates whether this reader should throw errors (values: strict, ignore), defaults to ‘strict’
- _class_def(next_line=True, inner_class=False)
Parses (and verifies) the class definition.
- Parameters:
visitor (ClassVisitor) – the visitor instance
next_line (bool, optional) – whether the next line should be used, defaults to True
inner_class (bool, optional) – whether the class is an inner class, defaults to False
- Raises:
SyntaxError – if EOF is reached
SyntaxError – if EOL is reached
- Returns:
an inner class ClassVisitor instance if inner_class is True
- Return type:
ClassVisitor | None
- _collect_values(strip_chars=None) -> list
Collects all values stored in the rest of the current line.
Note that values will be split if ',' is in a value, for instance:
>>> line = "const/16 b,0xB"
>>> _collect_values(',')
['const/16', 'b', '0xB']
- Parameters:
strip_chars (str, optional) – the chars to strip first, defaults to None
- Returns:
the collected values
- Return type:
list
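The splitting behaviour in the example can be reproduced with a few lines of standalone Python. This is a sketch under assumptions, not the method's real body: `collect_values` is a hypothetical free function that takes the rest of the line as an argument, and the order of stripping and splitting is a guess that happens to match the documented output.

```python
from typing import List, Optional


def collect_values(rest_of_line: str, strip_chars: Optional[str] = None) -> List[str]:
    """Split the remaining line into values (illustrative sketch)."""
    values: List[str] = []
    for chunk in rest_of_line.split():
        if strip_chars:
            # Remove leading/trailing characters such as a trailing comma
            chunk = chunk.strip(strip_chars)
        # A chunk like 'b,0xB' still contains a comma and is split further
        values.extend(part for part in chunk.split(",") if part)
    return values


result = collect_values("const/16 b,0xB", ",")
```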
- _do_visit() -> None
Performs the source code visit.
- Parameters:
source (io.IOBase) – the source to read from
visitor (ClassVisitor) – the visitor to notify
- _handle_end() -> None
Removes the active visitor from the stack.
- _handle_source() -> None
Handles .source definitions and their comments.
- Parameters:
visitor (ClassVisitor) – the visitor to notify
- _next_line()
Reads until the next code statement.
Comments will be returned to the visitor immediately.
- Parameters:
source (io.IOBase) – the source to read from
visitor (ClassVisitor) – the visitor to notify
- Raises:
EOFError – if the end of file has been reached
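The described behaviour (skip to the next code statement, hand comments to the visitor, raise EOFError at the end) could look roughly like the sketch below. `next_statement` and the `on_comment` callback are hypothetical names, and treating every line that starts with `#` as a comment is a simplifying assumption.

```python
import io
from typing import Callable


def next_statement(source: io.TextIOBase, on_comment: Callable[[str], None]) -> str:
    """Return the next non-empty, non-comment line (illustrative sketch)."""
    while True:
        raw = source.readline()
        if raw == "":
            # readline() returns an empty string only at end of file
            raise EOFError("reached end of file")
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            # Forward the comment text immediately, then keep reading
            on_comment(line[1:].strip())
            continue
        return line


comments = []
src = io.StringIO("# header\n\n.class public Lcom/example/ABC;\n")
statement = next_statement(src, comments.append)
```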
- _read_access_flags() -> list
Tries to resolve all access flags of the current line.
- Returns:
the list of access flags
- Return type:
list
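One way to picture access-flag resolution is to consume leading words as long as they are known flag keywords. The sketch below assumes that approach; `read_access_flags` is a hypothetical free function and `ACCESS_FLAGS` is an assumed keyword list, not taken from the library.

```python
from typing import List, Tuple

# Assumed set of Smali access-flag keywords (illustrative, may be incomplete)
ACCESS_FLAGS = frozenset({
    "public", "private", "protected", "static", "final", "synchronized",
    "bridge", "varargs", "native", "abstract", "strictfp", "synthetic",
    "constructor", "declared-synchronized", "interface", "enum",
    "annotation", "volatile", "transient",
})


def read_access_flags(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Split tokens into leading access flags and the remaining tokens."""
    index = 0
    # Flags always precede the name, so stop at the first non-flag word
    while index < len(tokens) and tokens[index] in ACCESS_FLAGS:
        index += 1
    return tokens[:index], tokens[index:]


flags, rest = read_access_flags("public static final CONST:I".split())
```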
- _validate_descriptor(name: str) -> None
Validates the given name if validation is enabled.
- Parameters:
name (str) – the type descriptor, e.g. ‘Lcom/example/ABC;’
- Raises:
SyntaxError – if the provided string is not a valid descriptor
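A descriptor check of this kind can be approximated with a regular expression. The following is a minimal sketch, assuming a simplified grammar (primitive letters, `L...;` class names and array prefixes); `validate_descriptor` and its `validate` argument are hypothetical and the real rules may be stricter.

```python
import re

# 'V' alone, or any number of '[' followed by a primitive or an L...; class name
_DESCRIPTOR = re.compile(r"V|\[*(?:[ZBSCIJFD]|L[^;\[\s]+;)")


def validate_descriptor(name: str, validate: bool = True) -> None:
    """Raise SyntaxError for malformed descriptors when validation is on."""
    if not validate:
        return  # validation disabled: accept anything
    if not _DESCRIPTOR.fullmatch(name):
        raise SyntaxError(f"invalid type descriptor: {name!r}")


validate_descriptor("Lcom/example/ABC;")  # passes silently
validate_descriptor("[[I")                # two-dimensional int array
```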
- _validate_token(token: str, expected: Token) -> None
Validates the given token if validation is enabled.
- Parameters:
token (str) – the token to verify
expected (Token) – the expected token value
- Raises:
SyntaxError – if validation failed
- property _visitor: VisitorBase
Returns the active visitor instance.
- Returns:
the active visitor.
- Return type:
VisitorBase
- comments: bool = True
With this option enabled, the parser will also notify about comments in the source file.
- errors: str = 'strict'
Indicates whether this reader should throw errors (values: ‘strict’, ‘ignore’).
- snippet: bool = False
With this option enabled, the initial class definition will be skipped.
- source: IOBase
The source to read from.
- stack: list = []
Stores the current visitors (index 0 stores the initial visitor).
A null value indicates that no visitors are registered for the current parsing context.
- validate: bool = False
Indicates the reader should validate the input code.
- visit(source: IOBase, visitor: ClassVisitor) -> None
Parses the given input which can be any readable source.
- Parameters:
source (io.IOBase | str | bytes) – the Smali source code
visitor (ClassVisitor, optional) – the visitor to use, defaults to None
- Raises:
ValueError – If the provided values are null
TypeError – if the source type is not accepted
ValueError – if the source is not readable
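The accepted source types and the documented exceptions suggest a normalization step before parsing starts. The sketch below shows one plausible shape for it; `open_source` is a hypothetical helper, and decoding bytes as UTF-8 is an assumption not stated in this section.

```python
import io
from typing import Union


def open_source(source: Union[io.IOBase, str, bytes, None],
                encoding: str = "utf-8") -> io.IOBase:
    """Turn a str, bytes or stream into a readable stream (illustrative)."""
    if source is None:
        raise ValueError("source must not be None")
    if isinstance(source, str):
        return io.StringIO(source)
    if isinstance(source, bytes):
        # Assumption: byte input is decoded with the given encoding
        return io.StringIO(source.decode(encoding))
    if isinstance(source, io.IOBase):
        if not source.readable():
            raise ValueError("source is not readable")
        return source
    raise TypeError(f"unsupported source type: {type(source).__name__}")


stream = open_source(b".class public Lcom/example/ABC;\n.end class\n")
first_line = stream.readline().strip()
```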