In this blog post, I describe how to use Soot to read an Android APK (without the source code), change its methods and classes (even add a new class), and write the new code into a working APK. A few notes:
One way to analyze Android apps is to run them on a device (or an emulator) and observe the logs to capture the information you need. If you have the app's source code, you can log whatever you want, e.g., record the execution of a method at its beginning. However, when the source code is not available, such as in security analysis where you are dealing with possibly malicious apps, you need to instrument the APK itself, which is compiled to Dalvik bytecode. And as you may have guessed, Soot is going to save the day.
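For comparison, here is roughly what that logging looks like when you do have the source. This is only a sketch: the class SourceLevelLogging and its add method are made up for illustration, and the message format mirrors the one used later in this post.

```java
class SourceLevelLogging {
    // Builds the log line for a given method name.
    static String entryMessage(String methodName) {
        return "Beginning of method: " + methodName;
    }

    // A hypothetical app method with the logging statement written by hand
    // as its first statement.
    static int add(int a, int b) {
        System.out.println(entryMessage("add"));
        return a + b;
    }

    public static void main(String[] args) {
        // Logs the entry message, then prints the result of the call.
        System.out.println(add(1, 2));
    }
}
```

Instrumenting an APK aims for the same effect, except that the statement is injected into the compiled code instead of being typed into the source.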
The following figure shows an overview of how Soot reads, modifies, and writes an APK. First, Soot uses Dexpler to convert the Dalvik bytecode to Jimple bodies. Then it runs Whole Packs that can transform, optimize, and annotate the whole program (for example, call graph generation). Next, the Jimple Transformation Packs run on each Jimple body (this is the part where we modify the code). Finally, Soot converts all Jimple bodies to Baf (a low-level intermediate representation in Soot), and Dexpler compiles the whole code into an APK. The instrumented APK can be installed on an Android device (you just need to sign it first). Now, let's instrument some apps!
An overview of instrumenting an Android APK by Soot (the circles are Soot packs, more info here)
We are going to add a simple statement (System.out.println("Beginning of method: " + METHOD_NAME)) at the beginning of each APK method using a BodyTransformer. Before reading further, please clone the SootTutorial repository and have AndroidLogger.java in front of you.
To analyze an Android APK with Soot, you need to install the Android SDK on your machine. You can either use the SootTutorial docker image (docker run -it noidsirius/soot_tutorial:lates) or follow this link.
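As far as I know, what Soot actually needs from the SDK is the platforms directory, whose android-<API> subfolders each contain an android.jar. Before running an analysis, it can help to check that this directory looks right. The following is a plain-Java sketch of such a pre-flight check; the class and method names are hypothetical, and it assumes the standard SDK layout and that the SDK root is either passed as an argument or set in ANDROID_HOME.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class AndroidPlatformsCheck {
    // Returns the sorted names of the android-<API> folders under platformsDir
    // that contain an android.jar. Returns an empty list if the directory
    // does not exist.
    static List<String> installedPlatforms(Path platformsDir) throws IOException {
        List<String> found = new ArrayList<>();
        if (!Files.isDirectory(platformsDir)) {
            return found;
        }
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(platformsDir, "android-*")) {
            for (Path dir : dirs) {
                if (Files.isRegularFile(dir.resolve("android.jar"))) {
                    found.add(dir.getFileName().toString());
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: SDK root comes from the first argument or from ANDROID_HOME.
        String sdkRoot = args.length > 0 ? args[0] : System.getenv("ANDROID_HOME");
        if (sdkRoot == null) {
            System.err.println("Pass the SDK root as an argument or set ANDROID_HOME");
            return;
        }
        Path platforms = Paths.get(sdkRoot, "platforms");
        System.out.println("Platforms with android.jar in " + platforms + ": "
                + installedPlatforms(platforms));
    }
}
```

If the list comes back empty, the path is probably wrong or no platform has been installed yet.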
Soot needs a special configuration for analyzing Android apps. The following code shows the options that I used for the instrumentation. Each option is accompanied by a comment that describes it.
import java.util.Collections;

import soot.G;
import soot.Scene;
import soot.SootClass;
import soot.options.Options;

public static void setupSoot(String androidJar, String apkPath, String outputPath) {
    // Reset the Soot settings (necessary if you are analyzing several APKs)
    G.reset();
    // Generic options
    Options.v().set_allow_phantom_refs(true); // Tolerate classes that cannot be resolved
    Options.v().set_whole_program(true); // Run in whole-program mode
    Options.v().set_prepend_classpath(true); // Prepend the given classpath to Soot's default classpath
    // Read (APK Dex-to-Jimple) options
    Options.v().set_android_jars(androidJar); // The path to Android Platforms
    Options.v().set_src_prec(Options.src_prec_apk); // Specify that the input is an APK
    Options.v().set_process_dir(Collections.singletonList(apkPath)); // Provide the path to the APK
    Options.v().set_process_multiple_dex(true); // Inform Dexpler that the APK may have more than one .dex file
    Options.v().set_include_all(true); // Do not exclude any packages by default
    // Write (APK generation) options
    Options.v().set_output_format(Options.output_format_dex);
    Options.v().set_output_dir(outputPath);
    Options.v().set_validate(true); // Validate Jimple bodies in each transformation pack
    // Resolve required classes
    Scene.v().addBasicClass("java.io.PrintStream", SootClass.SIGNATURES);
    Scene.v().addBasicClass("java.lang.System", SootClass.SIGNATURES);
    Scene.v().loadNecessaryClasses();
}
Soot Initialization
The last part of the setup code does not set an option; it resolves the classes required for the instrumentation. Recall that we want to add a new statement at the beginning of each APK method, whose Jimple representation is:
$r1 = <java.lang.System: java.io.PrintStream out>
virtualinvoke $r1.<java.io.PrintStream: void println(java.lang.String)>("<SOOT_TUTORIAL> Beginning of method METHOD_NAME")
Since these statements require the classes java.lang.System and java.io.PrintStream, we should resolve them in Soot, which is done by the two Scene.v().addBasicClass calls near the end of setupSoot.
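To see why exactly these two classes are involved, it may help to read the two Jimple statements back as ordinary Java. The sketch below is my own illustration (the class JimpleEquivalent and its methods are hypothetical, not part of SootTutorial): logBeginning is the Java counterpart of the Jimple code, and describeReferences uses reflection to look up the field and the method that the Jimple signatures refer to.

```java
import java.io.PrintStream;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class JimpleEquivalent {
    // Java counterpart of the two Jimple statements.
    static void logBeginning() {
        // $r1 = <java.lang.System: java.io.PrintStream out>
        PrintStream r1 = System.out;
        // virtualinvoke $r1.<java.io.PrintStream: void println(java.lang.String)>(...)
        r1.println("<SOOT_TUTORIAL> Beginning of method METHOD_NAME");
    }

    // Looks up the referenced field and method by reflection and returns
    // a short description of their types.
    static String describeReferences() throws ReflectiveOperationException {
        Field out = System.class.getField("out");
        Method println = PrintStream.class.getMethod("println", String.class);
        return out.getType().getName() + " " + out.getName() + " / "
                + println.getReturnType().getName() + " " + println.getName()
                + "(java.lang.String)";
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        logBeginning();
        System.out.println(describeReferences());
    }
}
```

The field lives in java.lang.System and the method in java.io.PrintStream, so Soot needs at least the signatures of both classes to build references to them.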