Ninja: Towards Transparent Tracing and Debugging on ARM

Ninja makes use of Arm hardware features such as TrustZone, the PMU, and the ETM to build a transparent tracing and debugging tool.

The paper was first published at USENIX Security in 2017.

Ninja is work done by my own group, COMPASS, back in 2017, when my advisor Fengwei was still at Wayne State University. He joined SUSTech in 2019, and I joined his group at the same time. I began attending the lab meetings in 2020 as a junior student trying to learn something about system security.

My first task in the lab was to deploy Ninja as a training exercise, which helped me get familiar with Arm features. Our group continues to work on Arm, trying to make full use of its features and explore more possibilities.

Okay, you may be wondering why this guy still hasn't introduced the paper. Enough of the backstory; let's dive in.


Transparency is an increasingly important topic in malware analysis. Malware may find ways to detect the presence of an analyzer and hide its malicious behavior, which makes dynamic analysis frameworks ineffective.

In the paper, the authors argue that analyzers running in the operating system or hypervisor are likely to be detected because of the easily found footprints they leave. An analyzer backed by hardware, however, benefits from a well-isolated environment. Arm provides such an isolated environment for analyzers: the secure world of TrustZone.

TrustZone divides the hardware into two parts, called the secure domain (or secure world) and the non-secure domain (or normal world). According to Arm, applications running in the secure world, a.k.a. trusted applications, are able to access everything in the non-secure world. On the other hand, applications running in the normal, non-secure world cannot see what is going on in the secure world. This asymmetry helps keep Ninja transparent to malware, since Ninja lives in the secure world.
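The one-way visibility described above can be written down as a tiny access rule. This is only a toy model to make the asymmetry concrete; the function name `can_access` and the boolean flags are made up for illustration and do not correspond to any real TrustZone interface.

```python
def can_access(requester_is_secure, region_is_secure):
    """Toy access rule: the secure world sees everything,
    the normal world sees only non-secure regions."""
    if requester_is_secure:
        return True
    return not region_is_secure


# An analyzer in the secure world can inspect normal-world memory...
analyzer_sees_malware = can_access(requester_is_secure=True, region_is_secure=False)
# ...but malware in the normal world cannot look back at the analyzer.
malware_sees_analyzer = can_access(requester_is_secure=False, region_is_secure=True)
```

In this model the analyzer can read the malware's memory while the malware's attempt to read the analyzer's memory is refused, which is the property Ninja relies on.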

Reliable Domain Switch

Debugging needs to stop, in other words interrupt, the program. Since Ninja wants to be as transparent as possible, it needs a non-invasive way to interrupt the running program, switch to the secure domain, and gain control. Here is the most exciting part of the paper: instead of using the smc instruction, which is the usual way to switch domains, Ninja uses the PMI (Performance Monitor Interrupt) to interrupt the program automatically at a precise point.

The Performance Monitor Unit (PMU) provides counters that increment whenever a selected event happens. There are multiple predefined events, like instructions retired, branch load misses, memory accesses, etc. Ninja presets the counter to its maximum value, 0xFFFFFFFF, so the very next event overflows it and raises a PMI. Re-arming the counter the same way each time means every occurrence of the event triggers a PMI. Ninja configures the PMI as a secure interrupt and handles it in the secure world.
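To see why presetting the counter to 0xFFFFFFFF gives an interrupt on every event, here is a small simulation of a 32-bit counter that signals on overflow. It is a sketch of the arithmetic only, not Ninja's code: the class `ToyPmuCounter` and the helper `preset_for` are hypothetical names, and real PMU registers are programmed very differently.

```python
COUNTER_BITS = 32
COUNTER_MAX = (1 << COUNTER_BITS) - 1  # 0xFFFFFFFF


class ToyPmuCounter:
    """Toy model of a 32-bit event counter that signals on overflow."""

    def __init__(self, preset):
        self.preset = preset & COUNTER_MAX
        self.value = self.preset
        self.interrupts = 0

    def on_event(self):
        """Count one event; return True if the counter overflowed."""
        self.value = (self.value + 1) & COUNTER_MAX
        if self.value == 0:           # wrapped past 0xFFFFFFFF
            self.interrupts += 1
            self.value = self.preset  # the handler re-arms the counter
            return True
        return False


def preset_for(events_until_interrupt):
    """Preset value so that the Nth event causes the overflow."""
    return (COUNTER_MAX + 1 - events_until_interrupt) & COUNTER_MAX


# Preset to the maximum: every single event overflows the counter.
every_event = ToyPmuCounter(0xFFFFFFFF)
fired = [every_event.on_event() for _ in range(3)]

# Preset a little lower: only every 4th event overflows.
every_fourth = ToyPmuCounter(preset_for(4))
fired_fourth = [every_fourth.on_event() for _ in range(8)]
```

The same trick generalizes: presetting the counter to 2^32 - N makes the Nth event the one that overflows, which is how a counter like this can be used to stop a program after a chosen number of events.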

Embedded Trace Macrocell

The Embedded Trace Macrocell (ETM) is Arm's counterpart to Intel PT. It traces the instructions executed on the processor. Ninja enables this feature to do the tracing.

I have found ETM to be a great building block for many other projects.

Semantic Gap

Using hardware components as an analyzer has benefits like transparency. However, there are still some drawbacks. For example, it is hard for hardware to understand what is going on in the upper layers. From the hardware's point of view, the operating system schedules all tasks in an unknown order. How can it tell where an instruction comes from, program A or program B?

That’s why Ninja has to fill the semantic gap, which is the gap between an upper layer like the OS and a lower layer like the hardware. Ninja finds a way to fetch task_struct, which maintains the information about the current thread in Linux.
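To make the idea of bridging the semantic gap concrete, here is a minimal sketch of turning raw bytes into OS-level meaning: given the address of a task record in a memory snapshot, decode the process ID and name. Everything here is an assumption for illustration. The offsets `PID_OFFSET` and `COMM_OFFSET` are made up (real task_struct offsets depend on the kernel version and build configuration), the helpers `read_task_info` and `make_toy_memory` are hypothetical, and the sketch does not show how Ninja actually locates task_struct.

```python
import struct

# Hypothetical layout of a toy task record; real task_struct offsets
# depend on the kernel version and build configuration.
PID_OFFSET = 0x10
COMM_OFFSET = 0x20
COMM_LEN = 16


def read_task_info(memory, task_addr):
    """Decode (pid, name) from raw bytes at task_addr."""
    (pid,) = struct.unpack_from("<i", memory, task_addr + PID_OFFSET)
    start = task_addr + COMM_OFFSET
    raw = bytes(memory[start:start + COMM_LEN])
    name = raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return pid, name


def make_toy_memory(task_addr, pid, name):
    """Build a fake memory snapshot holding one toy task record."""
    memory = bytearray(task_addr + COMM_OFFSET + COMM_LEN)
    struct.pack_into("<i", memory, task_addr + PID_OFFSET, pid)
    encoded = name.encode("ascii")[:COMM_LEN - 1]  # keep a trailing NUL
    start = task_addr + COMM_OFFSET
    memory[start:start + len(encoded)] = encoded
    return memory


snapshot = make_toy_memory(0x100, 1234, "suspicious_app")
pid, name = read_task_info(snapshot, 0x100)
```

With the process ID and name in hand, an analyzer looking only at raw memory can attribute the instructions it observes to a specific program rather than to an anonymous stream.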

Ninja also does great work analyzing the Android Runtime framework and manages to capture the Android API calls being made, which lets it trace Android applications running in the Java virtual machine.