1. Smali API¶
Implementation of a line-based Smali source code parser using a visitor API.
1.1. Overview¶
The overall structur of a decompiled Smali class is rather simple. In fact a decompiled source class contains:
A section describing the class’ modifiers (such as
public
orprivate
), its name, super class and the implemented interfaces.One section per annotation at class level. Each section describes the name, modifiers (such as
runtime
orsystem
), and the type. Optionally, there are sub-annotations included as values in each section.One section per field declared in this class. Each section describes the name, modifiers (such as
public
orstatic
), and the type. Optionally, there are annotations placed within the section.One section per method, constructor (
<cinit>
) and static initializer block (clinit
) describing their name, parameter types and return type. Each section may store instruction information.Sections for line-based comments (lines that start with a
#
).
As Smali source code files does not contain any package
or import
statements
like they are present Java classes, all names must be fully qualified. The structure
of these names are described in the Java ASM documentationin section 2.1.2 [1].
1.1.1. Type descriptors¶
Type descriptors used in Smali source code are similar to those used in compiled Java classes. The following list was taken from ASM API [1]:
Smali type |
Type descriptor |
Example value |
---|---|---|
|
V |
– |
|
Z |
|
|
C |
|
|
B |
|
|
S |
|
|
I |
|
|
F |
|
|
J |
|
|
D |
|
|
Ljava/lang/Object; |
– |
|
[Z |
– |
The descriptors of primitive types can be represented with single characters. Class
type descriptors always start with a L
and end with a semicolon. In addition to
that, array type descriptors will start with opened square brackets according to the
number of dimensions. For instance, a two dimensional array would get two opened square
brackets in its type descriptor.
This API contains a class called SVMType
that can be used to retrieve type descriptors
as well as class names:
1from smali import SVMType, Signature
2
3# simple type instance
4t = SVMType("Lcom/example/Class;")
5t.simple_name # Class
6t.pretty_name # com.example.Class
7t.dvm_name # com/example/Class
8t.full_name # Lcom/example/Class;
9t.svm_type # SVMType.TYPES.CLASS
10
11# create new type instance for method signature
12m = SVMType("getName([BLjava/lang/String;)Ljava/lang/String;")
13m.svm_type # SVMType.TYPES.METHOD
14# retrieve signature instance
15s = m.signature or Signature(m)
16s.return_type # SVMType("Ljava/lang/String;")
17s.parameter_types # [SVMType("[B"), SVMType("Ljava/lang/String;")]
18s.name # getName
19s.declaring_class # would return the class before '->' (only if defined)
20
21# array types
22array = SVMType("[B")
23array.svm_type # SVMType.TYPES.ARRAY
24array.array_type # SVMType("B")
25array.dim # 1 (one dimension)
As an input can be used anything that represents the class as type descriptor, original class name or internal name (array types are supported as well).
1.1.2. Method descriptors¶
Unlike method descriptors in compiled Java classes, Smali’s method descriptors contain the
method’s name. The general structure, described in detail in the ASM API [1] documentation,
is the same. To get contents of a method descriptor the SVMType
class, introduced before,
can be used again:
1from smali import SVMType
2
3method = SVMType("getName([BLjava/lang/String;)Ljava/lang/String;")
4# get the method's signature
5signature = method.signature
6# get parameter type descriptors
7params: list[SVMType] = signature.parameter_types
8# get return type descriptor
9return_type = signature.return_type
10
11# the class type can be retrieved if defined
12cls: SVMType = signature.declaring_class
Caution
The initial method descriptor must be valid as it can cause undefined behaviour if custom strings are used.
1.2. Interfaces and components¶
The Smali Visitor-API for generating and transforming Smali-Source files
(no bytecode data) is based on the ClassVisitor
class, similar to the
ASM API [1] in Java. Each method in
this class is called whenever the corresponding code structure has been
parsed. There are two ways how to visit a code structure:
- Simple visit:
All necessary information are given within the method parameters
- Extendend visit:
To deep further into the source code, another visitor instance is needed (for fields, methods, sub-annotations or annotations and even inner classes)
The same rules are applied to all other visitor classes. The base class of
all visitors must be VisitorBase
as it contains common methods all sub
classes need:
class VisitorBase:
def __init__(self, delegate) -> None: ...
def visit_comment(self, text: str) -> None: ...
def visit_eol_comment(self, text: str) -> None: ...
def visit_end(self) -> None: ...
All visitor classes come with a delegate that can be used together with the
initial visitor. For instance, we can use our own visitor class together with
the provided SmaliWriter
that automatically writes the source code.
Note
The delegate must be an instance of the same class, so FieldVisitor
objects can’t be applied to MethodVisitor
objects as a delegate.
The provided Smali API provides three core components:
The
SmaliReader
class is an implementation of a line-based parser that can handle .smali files. It can use both utf-8 strings or bytes as an input. It calls the corresponding visitXXX methods on theClassVisitor
.The
SmaliWriter
is a subclass ofClassVisitor
that tries to build a Smali file based on the visited statements. It comes together with anAnnotationWriter
,FieldWriter
andMethodWriter
. It produces an output utf-8 string that can be encoded into bytes.The
XXXVisitor
classes delegate method calls to internal delegate candicates that must be set with initialisation.
The next sections provide basic usage examples on how to generate or transform Smali class files with these components.
1.2.1. Parsing classes¶
The only required component to parse an existing Smali source file is the SmaliReader
component. To illustrate an example usage, assume we want to print out the parsed
class name, super class and implementing interfaces:
1from smali import ClassVisitor
2
3class SmaliClassPrinter(ClassVisitor):
4 def visit_class(self, name: str, access_flags: int) -> None:
5 # The provided name is the type descriptor - if we want the
6 # Java class name, use a SVMType() call:
7 # cls_name = SVMType(name).simple_name
8 print(f'.class {name}')
9
10 def visit_super(self, super_class: str) -> None:
11 print(f".super {super_class}")
12
13 def visit_implements(self, interface: str) -> None:
14 print(f".implements {interface}")
The second step is to use our previous defined visitor class with a SmaliReader
component:
1# Let's assume the source code is stored here
2source = ...
3
4printer = SmaliClassPrinter()
5reader = SmaliReader(comments=False)
6reader.visit(source, printer)
The fifth line creates a SmaliReader
that ignores all comments in the source
file to parse. The visit method is called at the end to parse the source code
file.
1.2.2. Generating classes¶
The only required component to generate a new Smali source file is the SmaliWriter
component. For instance, consider the following class:
1.class public abstract Lcom/example/Car;
2.super Ljava/lang/Object;
3
4.implements Ljava/lang/Runnable;
5
6.field private id:I
7
8.method public abstract run()I
9.end method
It can be generated within seven method calls to a SmaliWriter
:
1from smali import SmaliWriter, AccessType
2
3writer = SmaliWriter()
4# Create the .class statement
5writer.visit_class("Lcom/example/Car;", AccessType.PUBLIC + AccessType.ABSTRACT)
6# Declare the super class
7writer.visit_super("Ljava/lang/Object;")
8# Visit the interface implementation
9writer.visit_implements("Ljava/lang/Runnable")
10
11# Create the field id
12writer.visit_field("id", AccessType.PRIVATE, "I")
13
14# Create the method
15m_writer = writer.visit_method("run", AccessType.PUBLIC + AccessType.ABSTRACT, [], "V")
16m_writer.visit_end()
17
18# finish class creation
19writer.visit_end()
20source_code = writer.code
At line 3 a SmaliWriter
is created that will actually build the source code
string.
The call to visit_class
defines the class header (see line 1 of smali source
code). The first argument represents the class’ type descriptor and the second its
modifiers. To specify additional modifiers, use the AccessType
class. It provides
two ways how to retrieve the actual modifiers:
Either by referencing the enum (like
AccessType.PUBLIC
)or by providing a list of keywords that should be translated into modifier flags:
modifiers = AccessType.get_flags(["public", "final"])
The calls to visit_super
defines the super class of our previously defined
class and to visit_implements
specifies which interfaces are implemented by our
class. All arguments must be type descriptors to generate accurate Smali code (see
section Type descriptors for more information on the type class)