Brahmastra: Driving Apps to Test the Security of Third-Party

This paper is included in the Proceedings of the

23rd USENIX Security Symposium.

August 20–22, 2014 • San Diego, CA

ISBN 978-1-931971-15-7

Open access to the Proceedings of

the 23rd USENIX Security Symposium

is sponsored by USENIX

Brahmastra: Driving Apps to Test the Security of

Third-Party Components

Ravi Bhoraskar, Microsoft Research and University of Washington; Seungyeop Han,

University of Washington; Jinseong Jeon, University of Maryland, College Park;

Tanzirul Azim, University of California, Riverside; Shuo Chen, Jaeyeon Jung, Suman Nath,

and Rui Wang, Microsoft Research; David Wetherall, University of Washington

https://www.usenix.org/conference/usenixsecurity14/technic al-sessions/presentation/bhoraskar

USENIX Association 23rd USENIX Security Symposium 1021

Brahmastra: Driving Apps to Test the Security of Third-Party Components

Ravi Bhoraskar

1,2

, Seungyeop Han

, Jinseong Jeon

, Tanzirul Azim

Shuo Chen

, Jaeyeon Jung

, Suman Nath

, Rui Wang

, David Wetherall

Microsoft Research

University of Washington

University of Maryland, College Park

University of California, Riverside

Abstract

We present an app automation tool called Brahmastra for

helping app stores and security researchers to test third-

party components in mobile apps at runtime. The main

challenge is that call sites that invoke third-party code

may be deeply embedded in the app, beyond the reach

of traditional GUI testing tools. Our approach uses static

analysis to construct a page transition graph and discover

execution paths to invoke third-party code. We then per-

form binary rewriting to “jump start” the third-party code

by following the execution path, efﬁciently pruning out

undesired executions. Compared with the state-of-the-

art GUI testing tools, Brahmastra is able to successfully

analyse third-party code in 2.7× more apps and decrease

test duration by a factor of 7. We use Brahmastra to un-

cover interesting results for two use cases: 175 out of

220 children’s apps we tested display ads that point to

web pages that attempt to collect personal information,

which is a potential violation of the Children’s Online

Privacy Protection Act (COPPA); and 13 of the 200 apps

with the Facebook SDK that we tested are vulnerable to

a known access token attack.

1 Introduction

Third-party libraries provide a convenient way for mo-

bile application developers to integrate external services

in the application code base. Advertising that is widely

featured in “free” applications is one example: 95% of

114,000 popular Android applications contain at least

one known advertisement library according to a recent

study [22]. Social media add-ons that streamline or en-

rich the user experience are another popular family of

third-party components. For example,

Facebook Login

lets applications authenticate users with their existing

Facebook credentials, and post content to their feed.

Despite this beneﬁt, the use of third-party components

is not without risk: if there are bugs in the library or the

way it is used then the host application as a whole be-

comes vulnerable. This vulnerability occurs because the

library and application run with the same privileges and

without isolation under existing mobile application mod-

els. This behavior is especially problematic because a

number of third-party libraries are widely used by many

applications; any vulnerability in these libraries can im-

pact a large number of applications. Indeed, our inter-

est in this topic grew after learning that popular SDKs

provided by Facebook and Microsoft for authentication

were prone to misuse by applications [30], and that ap-

plications often make improper use of Android cryptog-

raphy libraries [20].

In this paper, we present our solution to the problem

of third-party component integration testing at scale, in

which one party wishes to test a large number of appli-

cations using the same third-party component for a po-

tential vulnerability. To be useful in the context of mo-

bile app stores, we require that a successful solution test

many applications without human involvement. Observe

that it is not sufﬁcient to simply test the third-party li-

brary for bugs in isolation. This is because vulnerabil-

ities often manifest themselves due to the interaction of

the application and the third-party component. Thus our

focus is to develop tools that enable testers to observe in

situ interactions between the third-party component and

remote services in the context of a speciﬁc application at

runtime.

We began our research by exploring automated run-

time analysis tools that drive mobile UIs (e.g., [5, 23,

26]) to exercise the third-party component, but quickly

found this approach to be insufﬁcient. Although these

tools are effective at executing many different code paths,

they are often unable to reach speciﬁc interactions deep

within the applications for a number of reasons that we

explore within this paper. Instead, our approach lever-

ages the structure of the app to improve test hit rate and

execution speed. To do this, we characterize an app by

statically building a graph of its pages and transitions

between them. We then use path information from the

graph to guide the runtime execution towards the third-

1022 23rd USENIX Security Symposium USENIX Association

party component under test. Rather than relying on GUI

manipulation (which requires page layout analysis) we

rewrite the application under test to directly invoke the

callback functions that trigger the desired page transi-

tions.

We built Brahmastra to implement our approach for

Android apps. Our tool statically determines short exe-

cution paths, and dynamically tests them to ﬁnd one that

correctly invokes a target method in the third-party li-

brary. At this stage, behavior that is speciﬁc to the li-

brary is checked. Because our techniques do not require

human involvement, Brahmastra scales to analyze a large

number of applications. To show the beneﬁts of our ap-

proach, we use our tool for two new studies that con-

tribute results to the literature: 1) checking whether chil-

dren’s apps that source advertisements from a third-party

comply with COPPA privacy regulations; and 2) check-

ing that apps which integrate the Facebook SDK do not

have a known security vulnerability [30].

From our analysis of advertisements displayed in 220

kids apps that use two popular ad providers, we ﬁnd that

36% apps have displayed ads whose content is deemed

inappropriate for kids—such as offering free prizes, or

displaying sexual imagery. We also discover that 80%

apps have displayed ads with landing pages that attempt

to collect personal information from the users, such as

name, address, and online contact information—which

can be a violation of the Children’s Online Privacy Pro-

tection Act [6]. Apart from creating an unsafe environ-

ment for kids, this also leaves the app developers vulner-

able to prosecution, since they are considered liable for

all content displayed by their app.

For our analysis of a vulnerability in third party login

libraries, we run a test case proposed by Wang et al. [30]

against 200 Android apps that bundle Facebook SDK.

We ﬁnd that 13 of the examined apps are vulnerable.

Contributions: We make two main contributions. The

ﬁrst is Brahmastra, which embodies our hybrid approach

of static and dynamic analysis to solve the third-party

component integration testing problem for Android apps.

We discuss our approach and key techniques in §4 and

their implementation in §5. We show in §6 that our

techniques work for a large fraction of apps while ex-

isting tools such as randomized testing (Monkey) often

fail. We have made the static analysis part of Brah-

mastra available at https://github.com/plum-

umd/redexer.

Our second contribution is an empirical study of two

security and privacy issues for popular third-party com-

ponents. We ﬁnd potential violations of child-safety laws

by ads displayed in kids apps as discussed in §7; several

apps used in the wild display content in potential viola-

tion of COPPA due to the behavior of embedded compo-

nents. We ﬁnd that several popular Android apps are vul-

nerable to the Facebook access token attack as discussed

in §8; A Facebook security team responded immediately

to our ﬁndings on 2/27/2014 and had contacted the af-

fected developers with the instructions to ﬁx.

2 Background

As our system is developed in the context of Android,

we begin by describing the structure of Android apps and

support for runtime testing.

Android app structure: An Android app is organized as

a set of pages (e.g., Figure 1) that users can interact with

and navigate between. In Android, each page is repre-

sented by an

activity object. Each activity class repre-

sents one kind of page and may be initialized with differ-

ent data, resulting in different

activity instances. We

use the terms page and activity instance interchangeably.

Each page contains various GUI elements (e.g., buttons,

lists, and images), known as

views. A view can be as-

sociated with a callback function that is invoked when a

user interacts with the view. The callback function can

instantiate a new activity by using a late binding mecha-

nism called

intent. An intent encapsulates the descrip-

tion of a desired action (e.g., start a target activity) and

associated parameters. The main

activity (or the ﬁrst

page) of an app, deﬁned in its manifest ﬁle, is started by

the application launcher by passing a

START intent to it.

For example, in Figure 1, clicking the “Done” button

on activity

A1 invokes its event handler, which calls a

callback function deﬁned by the app developer. The call-

back constructs an intent to start activity

A2 with nec-

essary parameter

P12. The Activity Manager then con-

structs an instance of

A2, and starts it with P12 as parame-

ters. We refer to the documentation of Android internals

for more details [2].

Automated dynamic analysis: Recent works have used

a class of automation tools, commonly called a Mon-

key, that, given a mobile app binary, can automatically

execute it and navigate to various parts (i.e., states) of

the app. Examples include PUMA [23], DECAF [25],

AppsPlayground [26], A3E [14], and VanarSena [27]. A

Monkey launches the app in a phone or an emulator, in-

teracts with it by emulating user interactions (e.g., click-

ing a button or swiping a page) to recursively visit vari-

ous pages, and performs speciﬁc tasks (e.g., checking ad

frauds in the page or injecting faults) on each page.

In Figure 1, a Monkey may be able to visit the se-

quence of states

A1 → A2 → A3→ A4 if it knows the

right UI actions (e.g., type in mother’s name and select

“Due Date” in

A1) to trigger each transition. However, if

Monkey clicks a button in

A3 other than “Account”, the

app would navigate to a different activity. If the goal of

testing is to invoke speciﬁc methods (e.g., Facebook lo-

gin as shown in the example), then without knowing the

structure of the app, a Monkey is likely to wander around

USENIX Association 23rd USENIX Security Symposium 1023

Figure 1: Activity sequences of com.alt12.babybumpfree that invoke Facebook single sign-on window in the forth

activity (

A4): Clicking “I Agree” (E1) then clicking “Done” (E2) opens up A2 with the parameter, fromLoader : tru

(P12). Clicking “Settings” (E3) in A2 opens up the settings activity, A3 and then clicking “Account” (E4) opens up the

A4 with the parameter, WHICHCLASS : com.alt12.babybumpcore.activity.settings.Settings. Finally,

clicking “Login with Facebook” (

E5) opens up the sign-on window within the same activity, A4.

many activities until it reaches

A4, if it ever does.

3 Problem and Insights

Our goal is to develop the ability to automatically and

systematically test a large set of mobile apps that embed

a speciﬁc third-party component for a potential vulnera-

bility associated with the use of that component. This

ability would let app store operators rapidly vet apps

to contain security vulnerabilities caused by popular li-

braries. It would let component developers check how

apps use or misuse their interfaces. It would also let se-

curity researchers such as ourselves empirically assess

vulnerabilities related to third-party libraries.

A straightforward approach is to use existing Mon-

keys. Unfortunately, this approach does not work well:

it often fails to exercise the target third-party compo-

nent of the app under test. Although recent works pro-

pose techniques to improve various types of coverages,

computed as the fraction of app activities or methods in-

voked by the Monkey, coverage still remains far from

perfect [26, 14, 13, 25, 27]. Moreover, in contrast to tra-

ditional coverage metrics, our success metric is binary

for a given app indicating whether the target third-party

component (or a target method in it) is invoked (i.e., hit)

or not (i.e., miss). Our experiments show that even a

Monkey with a good coverage can have a poor hit rate

for a target third-party component that may be embed-

ded deep inside the app. We used an existing Monkey,

PUMA that reports a > 90% activity coverage compared

to humans [23], but in our experiments it was able to in-

voke a target third-party component only in 13% of the

apps we tested (see §6 for more details). On a close ex-

amination, we discovered several reasons for this poor

hit rate of existing Monkeys:

R1. Timeout: A Monkey can exhaust its time budget be-

fore reaching the target pages due to its trial-and-

error search of the application, especially for apps

with many pages that “blow up” quickly.

R2. Human inputs: A Monkey is unable to visit pages

that are reached after entering human inputs such

as login/password, or gestures beyond simple clicks

that the automated tester cannot produce.

R3. Unidentiﬁed elements: A Monkey fails to explore

clickable UI elements that are not visible in the cur-

rent screen (e.g., hidden in an unselected tab) or are

not activated yet (e.g., a “Like” button that is acti-

vated only after the user registers to the app) or are

not identiﬁed by underlying UI automation frame-

work (e.g., nonstandard custom control).

R4. Crashes: By stressing the UI, a Monkey exacerbates

app crashes (due to bugs and external dependencies

such as the network) that limit exploration.

Note that, unlike existing Monkeys, our goal is not to

exhaustively execute all the possible code paths but to

execute particular code paths to invoke methods of in-

terest in the third-party library. Therefore, our insight is

to improve coverage by leveraging ways how third party

components are integrated with application code base.

These components are incorporated into an app at the

activity level. Even if the same activity is instantiated

multiple times with different contents, third-party com-

ponents typically behave in the same way in all those

instantiations. This allows us to restrict our analysis at

the level of activity rather than activity instances. Fur-

ther, even if an app contains a large number of activities,

only a small number of them may actually contain the

third-party component of interest. Invoking that compo-

1024 23rd USENIX Security Symposium USENIX Association

nent requires successfully executing any and only one of

those activities.

Using this insight, our testing system, Brahmastra,

uses three techniques described below to signiﬁcantly

boost test hit rate and speed compared to a Monkey that

tries to reach all pages of the app.

Static path pruning: Brahmastra considers only the

“useful” paths that eventually invoke the target third-

party methods and ignores all other “useless” paths. In

Figure 1, Brahmastra considers the execution path

A1 →

A2 → A3→ A4 for exploration and ignores many other

paths that do not lead to a target activity,

A4.

Such useful paths need to be identiﬁed statically be-

fore dynamic analysis is performed. The key challenges

in identifying such paths by static analysis arise due to

highly asynchronous nature of Android apps. We discuss

the challenges and our solution in §4.

Dynamic node pruning: Brahmastra opportunistically

tries to start from an activity in the middle of the path.

If such “jump start” is successful, Brahmastra can ignore

all preceding activities of the path. For example, in Fig-

ure 1, Brahmastra can directly start activity

A3, which can

lead to the target activity

A4.

Dynamic node pruning poses several challenges —

ﬁrst, we need to enable jump-starting an arbitrary activity

directly. Second, jump starting to the target activity may

fail due to incorrect parameters in the intent, in which

case we need to ﬁnd a different activity that is close to

the target, for which jump start succeeds. We discuss

these in detail in next section.

Self-execution of app: Brahmastra rewrites the app bi-

nary to automatically call methods that cause activity

transitions. The appropriate methods are found by static

analysis. In Figure 1, instead of clicking on the button

with label “Done” in

A1, Brahmastra would invoke the

onClick() method that would make the transition from

A1 to A2. The advantage over GUI-driven automation is

that it can discover activity-transitioning callbacks even

if they are invisible in the current screen.

In summary, our optimizations can make dynamic

analysis fast by visiting only a small number of activ-

ities of an app. More importantly, they also improve

the test hit rate of such analysis. Faster analysis helps

to avoid any timeouts (R1). Dynamic node pruning can

bypass activities that require human inputs (R2). In Fig-

ure 1, Brahmastra can jump to

A3 and bypass A1 that re-

quires selecting a future due date. Intent-driven naviga-

tion helps Brahmastra to make transitions where a Mon-

key fails due to unidentiﬁed GUI elements (R3). Finally,

visiting fewer activities reduces the likelihood of crashes

(R4). We quantitatively support these claims in §6.



#

"























 



 



 

!















%%%





















Figure 2: A simpliﬁed call graph of

ch.smalltech.battery.free that shows multiple transi-

tion paths composed of multiple activities. Boxes and

ovals represent classes and methods. Solid edges corre-

spond to synchronous calls; (red) dotted edges indicate

activity transitions; and (blue) dashed edges represent

implicit calls due to user interactions. Three different

paths starting from

Home.onOptionItemSelected() reach

AboutBox.onCreate() and share the remaining part.

4 Design

Brahmastra requires as input: a test application binary;

the names of target methods to be invoked within the

context of the application; and the plug-in of a spe-

ciﬁc security analysis to run once the target method is

reached. Our system is composed of three parts:

1. Execution Planner statically analyzes the test app

binary and discovers an execution path to invoke the

target third-party method.

2. Execution Engine receives execution paths from the

Planner and launches the test app in one or multi-

ple emulators and automatically navigates through

various pages according to the execution path.

3. Runtime Analyzer is triggered when the test app in-

vokes the target method. It captures the test app’s

runtime state (e.g., page content, sensors accessed,

network trace) and runs the analysis plug-in.

4.1 Execution Planner

The job of the Execution Planner is to determine: (1) the

activities that invoke the target third-party method; and

(2) the method-level execution paths that lead to the tar-

get activities. To accomplish these tasks, we statically

analyze the app binary to construct a call graph that en-

compasses its activities and interactions that cause activ-

ity transitions.

Constructing call graph: A call graph is a graph where

vertices are methods and edges are causal relationship

between method invocation. More precisely, there exists

USENIX Association 23rd USENIX Security Symposium 1025

1 ImageButton b = (ImageButton)

2 findViewById(R.id.b1);

3 b.setOnClickListener(new OnClickListener() {

4 public void onClick(View v) {

5 ...

6 }});

Figure 3: Example of a programmatic handler registra-

tion.

onClick() is bound to setOnClickListener()

an edge from method m

to m

if m

invokes m

. Based

on how m

is invoked by m

, there are three types of

edges: (1) synchronous edges, if m

directly calls m

, (2)

asynchronous edges, if m

invokes m

asynchronously,

and (3) activity transition edges, if m

starts an activity

that automatically calls m

. Figure 2 depicts a call graph

of one real app.

While synchronous edges can be identiﬁed easily by

scanning the app binary code, discovering other edges

can be difﬁcult. To ﬁnd activity transition edges, we

rely on the fact that one activity can start another ac-

tivity by generating an intent and passing it to the

startActivity() method. We perform constant propa-

gation analysis [12] so as to track such intent creations

and detect activity transitions. We also conduct class hi-

erarchy analysis [19] to conservatively determine possi-

ble receiver types for dynamic dispatch, where the target

call sites depend on the runtime types of the receivers.

To discover asynchronous edges, we need to consider

all the different ways asynchronous methods can be in-

voked by a mobile app:

1. Programmatic handler registrations: These are call-

backs explicitly bound to methods (e.g., event han-

dler of GUI elements) within the code. Figure 3

shows an example.

2. XML-based handler registrations: These are call-

backs speciﬁed in the layout or resource XML ﬁles.

Figure 4 shows an example.

3. Lifetime methods: These are methods provided by

the underlying framework that automatically make

transitions to other methods on speciﬁc events. Ex-

amples are splash screens and message boxes that

transition to next activities after a timeout or after

user acknowledgment, respectively.

To discover the ﬁrst and third types, we use constant

propagation analysis to trace callbacks attached to var-

ious event handlers. To handle the second case, we parse

layout XML ﬁles corresponding to each activity to ﬁgure

out the binding between UI elements and callback meth-

ods.

Efﬁcient call graph computation: A call graph can be

extremely large, thus computing the entire call graph can

1 // layout/about_box_share.xml

2 <Button android:id="@id/mShareFacebook"

3 style="@style/ABB_Black_ShareButton" ... />

4 <Button android:id="@id/mShareTwitter"

5 style="@style/ABB_Black_ShareButton" ... />

6 // values/styles.xml

7 <style name="ABB_Black_ShareButton ... >

8 <item name="android:onClick">onShareClick</item>

9 </style>

10 // ch.smalltech.common.feedback.ShareActivity

11 public void onShareClick(View v){

12 // different behavior depending on argument v

13 }

Figure 4: Example of a XML-based handler registration

observed from

ch.smalltech.battery.free. Two buttons

share the

onShareClick callback. The binding between

onShareClick and setOnClickListener of each button

can be determined through layout and styles XML ﬁles.

be very expensive. For example, the app shown in Fig-

ure 1 declares 74 activities in the manifest; we ﬁnd at

least 281 callbacks over 452 registering points; and its

call graph is composed of 1,732 nodes and 17,723 edges.

To address this, we use two optimizations to compute a

partial call graph that includes target methods and the

start activity methods. First, we exclude system’s static

libraries and other third-party libraries that are not re-

lated to the target methods. Second, we search transition

paths backwards on call graph. We pinpoint call sites

of target methods while walking through bytecodes. We

then construct a partial call graph, taking into accounts

component transitions via intent and bindings between

views and listeners. Finally, starting from the call sites,

we build backward transition paths, until public compo-

nents including the main activity are reached. If failed,

partial paths collected at the last phase will be returned.

Determining target activity(s): Given the call graph

and a target method, we determine the activities that in-

voke the method as follows. From the call graph, we

can identify the activity boundaries such that all meth-

ods within the same boundary are invoked by the same

activity. Since an activity can be started only through

an activity transition edge in the call graph, any maxi-

mal connected component whose edges are either syn-

chronous or asynchronous deﬁne the boundary of an ac-

tivity. In Figure 2, bigger rectangles denote the activity

boundaries. Given the boundaries, we identify the activ-

ities that contain the target method.

Finding activity transition paths: Reaching a target ac-

tivity from the start activity may require several transi-

tions between multiple activities. For example, in Fig-

ure 2, navigating from the start activity (

HomeFree) to

a target activity (

ShareActivity) requires three transi-

1026 23rd USENIX Security Symposium USENIX Association

tions. This implies that Brahmastra requires techniques

for automatic activity transitions, which we describe in

the next subsection. Second, a target activity may be

reachable via multiple transition paths. While the short-

est path is more attractive for fast exploration, the path

may contain blocking activities and hence not executable

by Brahmastra. Therefore, Brahmastra considers all tran-

sition paths (in increasing order of their length); if exe-

cution of a short path fails, it tries a longer one.

Given the call graph G, the Planner computes a small

set P of acyclic transition paths that the Execution En-

gine need to consider. P includes a path if and only if

it terminates at a target activity without a cycle and is

not a sufﬁx of any other path in P. This ensures that P is

useful, complete (i.e., Execution Engine does not need to

consider any path not in P), and compact. For instance

Figure 5 shows one out of three paths contained in P.

HomeFree;.onCreate

---> Home;.onCreate

-#-> Home;.onOptionsItemSelected

---> Home;.showAbout

---> AboutBox;.onCreate

-#-> AboutBox;.onLikeClicked

---> ShareActivity;.onCreate

-#-> ShareActivity;.onShareClick

---> ShareActivity;.share

---> ShareActivity;.onFacebookShare

Figure 5: An example path information for

ch.smalltech.battery.free Dashed arrows stand

for explicit calls or activity transition, whereas ar-

rows with a hash tag represent implicit invocations,

which are either callbacks due to user interactions or

framework-driven callbacks, such as lifecycle methods.

P can be computed by breadth-ﬁrst traversals in G,

starting from each target activity and traversing along the

reverse direction of the edges.

4.2 Execution Engine

The useful paths P produced by the Execution Planner

already give an opportunity to prune exploration: Brah-

mastra considers only paths in P (and ignore others), and

for each path, it can simply navigate through its activi-

ties from the beginning of the path (by using techniques

described later). Exploration can stop as soon as a target

method is invoked.

Rewriting apps for self-execution: One might use a

Monkey to make activity transitions along useful paths.

Since a Monkey makes such transitions by interacting

with GUI elements, this requires identifying mapping be-

tween GUI elements and transitioning activities and in-

teract with only the GUI elements that make desired tran-

sitions.

We address this limitation with a technique we develop

called self execution. At a high level, we rewrite app bi-

naries to insert code that automatically invokes the call-

backs that trigger desired activity transitions, even if their

corresponding GUI elements are not visible. Such code

is inserted into all the activities in a useful path such that

the rewritten app, after being launched in a phone or an

emulator, would automatically make a series of activity

transitions to the target activity, without any external in-

teraction with its GUI elements.

Jump start: Brahmastra goes beyond the above opti-

mization with a node pruning technique called “jump

start”. Consider a path p =(a

,...,a

), where a

is a

target activity. Since we are interested only in the target

activity, success of Brahmastra is not affected by what

activity a

in p the execution starts from, as long as the

last activity a

is successfully executed. In other words,

one can execute any sufﬁx of p without affecting the hit

rate. The jump start technique tries to execute a sufﬁx —

instead of the whole — useful path. This can improve

Brahmastra’s speed since it can skip navigating through

few activities (in the preﬁx) of a useful path. Interest-

ingly, this can also improve the hit rate of Brahmastra.

For example, if the ﬁrst activity a

requires human in-

puts such as user credentials that an automation system

cannot provide, any effort to go beyond state a

will fail.

Note that directly executing an activity a

,i > 0, with-

out navigating to it from the start activity a

, may fail.

This is because some activities are required to be invoked

with speciﬁc intent parameters. In such cases, Brahmas-

tra tries to jump start to the previous activity a

i−1

in the

path. In other words, Brahmastra progressively tries to

execute sufﬁxes of useful paths, in increasing order of

lengths, until the jump start succeeds and the target activ-

ity a

is successfully reached or all the sufﬁxes are tried.

Algorithm 1 shows the pseudocode of how execution

with jump start works. Given the set of paths, the al-

gorithm ﬁrst generates sufﬁxes of all the paths. Then it

tries to execute the sufﬁxes in increasing order of their

length. The algorithm returns true on successful execu-

tion of any sufﬁx. Note that Algorithm 1 needs to know

if a path sufﬁx has been successfully executed (line 9).

We inject lightweight logging into the app binary to de-

termine when and whether target methods are invoked at

runtime.

4.3 Runtime Analyzer

Runtime Analyzer collects various runtime states of the

test app and makes it available to custom analysis plug-

ins for scenario-speciﬁc analysis. Runtime states include

UI structure and content (in form of a DOM tree) of the

current app page, list of system calls and sensors invoked

by the current page, and network trace due to the current

page. We describe two plug-ins and analysis results in

USENIX Association 23rd USENIX Security Symposium 1027

Algorithm 1 Directed Execution

1: INPUT: Set of useful paths P from the Planner

2: OUTPUT: Return true if execution is successful

3: S ← set of sufﬁxes of all paths in P

4: for i from 0 to ∞ do

5: S

← set of paths of length i in S

6: if S

is empty then

7: return false

8: for each path sufﬁx p in S

9: if Execute(p) = true then

10: return true

11:

12:

return false

later sections.

5 Implementation of Brahmastra

We implement Brahmastra for analyzing Android apps,

and use the tool to perform two security analyses which

we will describe in §7 and §8. This section describes

several highlights of the tool, along with practical chal-

lenges that we faced in the implementation process and

how we resolved them.

5.1 Execution Planner

Static analyses for constructing a call graph and ﬁnd-

ing transition paths to target methods are performed us-

ing Redexer [24], a general purpose bytecode rewriting

framework for Android apps. Redexer takes as input

an Android app binary (APK ﬁle) and constructs an in-

memory data structure representing

DEX ﬁle for various

analyses. Redexer offers several utility functions to ma-

nipulate such

DEX ﬁle and provides a generic engine to

perform data-ﬂow analysis, along with call graph and

control-ﬂow graph.

For special APIs that trigger activity transitions, e.g.,

Context.startActivity(), we perform constant propa-

gation analysis (see Appendix A for details) and identify

a target activity stored inside the intent. Figure 6 de-

picts example bytecode snippets that create and initial-

ize an intent (lines 2 and 6), along with the target activ-

ity (line 4), and starts that activity via

startActivity()

(line 8). Mappings between each bytecode show how

we accumulate data-ﬂow information, from empty intent

through class name to intent with the speciﬁc target ac-

tivity. We apply the same analysis to bindings between

views and listeners.

5.2 App Rewriting

We use the Soot framework [29] to perform the byte-

code rewriting that enables self execution. Dexpler [7]

converts an Android app binary into Soot’s intermedi-

ate representation, called Jimple, which is designed to

1 // { v3 → this }

2 new-instance v0, Landroid/content/Intent;

3 // { v0 → Intent(), ... }

4 const-class v1, ...AboutBox;

5 // { v1 → Clazz(AboutBox), ... }

6 invoke-direct {v0, v3, v1}, ...Intent;.<init>

7 // { v0 → Intent(AboutBox), ... }

8 invoke-virtual {v3, v0}, ...;.startActivity

9 // { ... }

Figure 6: An example bytecode of activity transition ex-

cerpted from

ch.smalltech.battery.free. Mappings be-

tween bytecode represent data-ﬂow information, which

shows what values registers must have. Only modiﬁed or

newly added values are shown.

ease analysis and manipulation. The re-writing tool is

composed of Soot’s class visitor methods and an An-

droid XML parser. Given an app binary and an execution

path, the rewriter generates a rewritten binary which ar-

tiﬁcially invokes a callback method upon the completion

of the exercising the current activity, triggering the next

activity to be launched. The inserted code depends on the

type of the edge found by the Planner (Recall three kinds

of asynchronous edges described in §4.1). For program-

matic and XML-based registrations, the rewriter ﬁnds the

view attached to it — by parsing the activity code, and

the manifest respectively — and invokes the appropri-

ate UI interaction on it after it has completed loading.

Lifetime methods are invoked by the Android framework

directly, and the rewriter skips code insertion for these

cases. In other cases, the rewriter inserts a timed call to

the transition method directly, to allow the activity and

any dependencies of the method to load completely.

5.3 Jump Start

Jump start requires starting an activity even if it is not de-

ﬁned as the

Launcher activity in the app. To achieve that,

we manipulate the manifest ﬁle of the Android app. The

Intent.ACTION MAIN entry in the manifest ﬁle declares

activities that Android activity manager can start directly.

To enable jump start, we insert an

ACTION MAIN entry for

each activity along the path speciﬁed, so that it can be

started by the Execution Engine. Manifest ﬁle also de-

clares an intent ﬁlter, which determines the sources from

which an activity may be started, which we modify to

allow the Execution Engine to launch the activity. The

Engine then invokes desired activity by passing an intent

to it. We use the Android Debug Bridge (ADB) [4] for

performing jump start. ADB allows us to create an intent

with the desired parameters and target, and then passes

it to the Android Activity Manager. The activity man-

ager in turn loads the appropriate app data and invokes

the speciﬁed activity. Starting the (jump started) activ-

1028 23rd USENIX Security Symposium USENIX Association

ity immediately activates self execution from that activ-

ity onwards.

6 Evaluation of Brahmastra

We evaluate Brahmastra in terms of two key metrics: (1)

hit rate, i.e., the fraction of apps for which Brahmastra

can invoke any of target methods, and (2) speed, i.e.,

time (or number of activity transitions) Brahmastra takes

to invoke a target method in an app for which Brahmas-

tra has a hit. Since we are not aware of any existing tool

that can achieve the same goal, we compare Brahmas-

tra against a general Android app exploration tool called,

PUMA [23]. This prototype is the best-performing Mon-

key we were able to ﬁnd that is amenable to experimen-

tation. In terms of speed and coverage, PUMA is far

better than a basic “random” Monkey. PUMA incorpo-

rates many key optimizations in existing Monkeys such

as AppsPlayground [26], A3E [14], and VanarSena [27]

and we expect it to perform at least on a par with them.

6.1 Experiment Methodology

Target method: For the experiments in this section, we

conﬁgure Brahmastra to invoke authentication methods

in the Facebook SDK for Android.

We choose Face-

book SDK because this is a popular SDK and its meth-

ods are often invoked only deep inside the apps. Using

the public documentation for the Facebook SDK for An-

droid, we determined that it has two target methods for

testing. Note that apps in our dataset use the Facebook

SDK version 3.0.2b or earlier

Lcom/facebook/android/Facebook;->authorize

Lcom/facebook/android/Facebook;->dialog

Figure 7: Target methods for evaluation

Apps: We crawled 12,571 unique apps from the Google

Play store from late December 2012 till early January

2013. These apps were listed as 500 most popular free

apps in each category provided by the store at the time.

Among them, we ﬁnd that 1,784 apps include the Face-

book SDK for Android. We consider only apps that in-

voke the authentication method—Over 50 apps appear to

have no call sites to Facebook APIs, and over 400 apps

use the API but do not invoke any calls related to autho-

rization. We also discard apps that do not work with our

tool chain, e.g., crash on the emulator or have issues with

apktool [1] since our analysis depends on the disassem-

bled code of an apk ﬁle. This leaves us with 1,010 apps.

https://developers.facebook.com/docs/

android/login-with-facebook

The later version of Facebook SDK was released in the middle

of data collection and appears to use different methods to display a

adapted the new version yet.















   



#

 

Figure 8: Failure causes of Brahmastra and PUMA.

App execution: In order to determine if Brahmastra or

PUMA is able to reach a program point that invokes the

target method, we instrument apps. The instrumentation

detects when any of the target methods are invoked dur-

ing runtime, by comparing signatures of executing meth-

ods with signatures of target methods. For Brahmastra,

we consider only 5 of all paths generated by the Execu-

tion Planner. For PUMA, we explore each app for up to

250 steps; higher timeouts signiﬁcantly increase overall

testing time for little gain in hit rate.

6.2 Hit Rate

In our experiments, PUMA was able to successfully in-

voke a target method in 127 apps (13%). Note that

PUMA’s hit rate is signiﬁcantly lower than its activity

coverage (> 90% compared to humans) reported in [23],

highlighting the difﬁculty in invoking speciﬁc program

points deep inside an app. In contrast, Brahmastra was

successfully able to invoke a target method in 344 (34%)

apps, a 2.7× improvement over PUMA. A closer exam-

ination of our results, as shown in Table 1, reveals that

Brahmastra’s technique can help circumventing all the

root causes for PUMA’s poor hit rate as mentioned in §3.

We now investigate why PUMA and Brahmastra

sometimes fail to invoke the target method. For PUMA,

this is due to the aforementioned four cases. Figure 8

shows the distribution of apps for which PUMA fails due

to speciﬁc causes. As shown, all the causes happen in

practice. The most dominant cause is the failure to ﬁnd

UI controls to interact with, which is mostly due to com-

plex UI layouts of the popular apps we tested. Figure 8

also shows the root causes for Brahmastra’s failure. The

key reasons are as follows:

Blocking page: Even if jump start succeeds, successive

activity transition may fail on a blocking page. Brah-

mastra fails for 20% of the apps due to this cause. We

would like to emphasize that Brahmastra experiences

more blocking pages than PUMA only because Brah-

mastra explores many paths that PUMA does not (e.g.,

because those paths are behind a custom control that

USENIX Association 23rd USENIX Security Symposium 1029

Case Apps

R1: Timeout in PUMA, success in Brahmastra 62%

R2: Blocking page in PUMA, success in Brahmastra 48%

R3: Unknown control in PUMA, success in Brahmastra 43%

R4: Crash in PUMA, success in Brahmastra 30%

Table 1: % of apps for which Brahmastra succeeds but

PUMA fails due to various reasons mentioned in §3.

PUMA cannot interact with, but Brahmastra can ﬁnd and

invoke the associated callback) and many of these paths

contain blocking pages. If PUMA tried to explore those

paths, it would have failed as well due to these blocking

pages.

Crash: Jump start can crash if the starting activity ex-

pects speciﬁc parameters in the intent and Brahmastra

fails to provide that. Brahmastra fails for 7% of the apps

due to this cause.

Custom components: Execution Planner may fail to

ﬁnd useful paths if the app uses custom components

which can be used to override standard event handlers,

thus breaking our model of standard Android apps. With-

out useful paths, Brahmastra can fail to invoke the target

methods. In our experiments, this happens with 16% of

the apps. We leave as future work a task to extend Execu-

tion Planner to handle custom components. We ﬁnd that

PUMA also failed 91% on these apps, proving the dif-

ﬁculty of navigating apps with custom components. In

fact, PUMA suffers much more than Brahmastra due to

custom components.

Improving the hit rate: There are several ways we can

further improve the hit rate of Brahmastra. First, 16%

failures of Brahmastra come because the static analysis

fails to identify useful paths. A better static analysis that

can discover more useful paths can improve Brahmas-

tra’s hit rate. Second, in our experiments, Brahmastra

tried only up to 5 randomly selected useful paths to in-

voke the target method and gave up if they all failed. In

many apps, our static analysis found many tens of use-

ful paths, and our results indicate that the more paths we

tried, the better was the hit rate. More speciﬁcally, Brah-

mastra succeeded for 207 apps after considering only

one path, and for 344 apps after considering up to ﬁve

paths. This suggests that considering more paths is likely

to improve the hit rate. Additionally, we should select

the paths to avoid any nodes or edges for which explo-

ration failed in previously considered paths instead of

choosing them randomly. In 72 apps, Brahmastra was

unable to ﬁnd the binding between a callback method

and the UI element associated with it, causing it to fall

back on a direct invocation of the callback method. A

better static analysis can help in this case as well. In 22

http://developer.android.com/guide/topics/

ui/custom-components.html



!

"

#

$



 ! " # $ 

'







Figure 9: Test speed comparison of Brahmastra and

PUMA

apps, Brahmastra deemed a page blocked due to UI el-

ements in the Android SDK (e.g., list elements) whose

behavior was not encoded in the instrumentation engine.

An engineering effort in special-case handling of these

and more views would increase hit rate. We plan to ex-

plore such optimizations in future. Finally, PUMA (and

other Monkeys) and Brahmastra use fundamentally dif-

ferent techniques to navigate between app pages and it

might be possible to combine them in a single system

where PUMA is used if Brahmastra fails (or vice versa).

In our experiments, such a hybrid approach would give

an overall hit rate of 39% (total 397 apps).

6.3 Speed

We use the number of activity transitions required to

reach the target activity as a proxy for speed, since the

actual time will vary depending on a variety of com-

putational factors (e.g., network speed, device speciﬁ-

cations). In Figure 9, we plot the CDF of the number

of transitions required to reach the target activity for the

apps which are successfully tested by both Brahmastra

and PUMA. Since Brahmastra prunes away many un-

necessary paths using static analysis, it runs faster than

PUMA that suffers from uninformed activity transitions

and large fanout in the activity transition graphs. On av-

erage, PUMA requires 18.7 transitions per app, while

Brahmastra requires 2.5 transitions per app, resulting in

7 fold speedup.

7 Analysis of Ads in Kids Apps

Our ﬁrst scenario is to use Brahmastra to study whether

ad libraries for Android apps meet guidelines for protect-

ing the online privacy of children. We give results for two

popular ad components embedded in 220 kids apps. Our

analysis shows that 80% of the apps displayed ads with a

link to landing pages that have forms for collecting per-

sonal information, and 36% apps displayed ads deemed

inappropriate to children.

1030 23rd USENIX Security Symposium USENIX Association

7.1 Motivation and Goals

The Children’s Online Privacy Protection Act (COPPA)

lays down a variety of stipulations that mobile app de-

velopers must follow if their apps are directed at children

under 13 years old [6]. In particular, COPPA disallows

the collection of personal information by these apps un-

less the apps have ﬁrst obtained parental consent.

COPPA holds the app developer responsible for the

personal information collected by the embedded third

party components as well as by the app’s code [6]. Since

it is common to see ad components included in free apps,

we aim to measure the extent of potentially non-COPPA-

compliant ad components in kids apps. Speciﬁcally, our

ﬁrst goal is to determine whether in-app ads or landing

pages pointed by these ads present forms that collect per-

sonal information. Although displaying collection forms

itself is not a violation, children might type in requested

personal information, especially if these websites claim

to offer free prizes or sweepstakes. In such cases, if

these ads or landing pages do collect personal informa-

tion without explicit parental consent, this act could be

considered as a violation according to COPPA. Since it is

difﬁcult to model these legal terms into technical speciﬁ-

cations, we only report potential concerns in this section.

Our second goal is to test whether content displayed in

in-app ads or landing pages is appropriate for children.

Since this kind of judgement is fundamentally subjective,

we show the breakdown of content categories as labeled

by human testers.

Note that runtime observation is critical for this test-

ing, since ads displayed within apps change dynamically

depending on the inventory of ads at the time of request.

7.2 Testing Procedure

The testing has two steps. For a given app, we ﬁrst col-

lect ads displayed within apps and landing pages that are

pointed by the ads. Second, for each ad and landing page,

we determine: (1) whether they present forms to collect

personal information such as ﬁrst and last name, home

address, and online contact as deﬁned in COPPA; and

(2) whether their content appears inappropriate to chil-

dren and if so why.

Driving apps to display ads: We use Brahmastra to au-

tomatically drive apps to display ads. In this study, we

focus on two popular ad libraries, AdMob and Millen-

nial Media, because they account for over 40% of free

Android apps with ads [11]. To get execution paths that

produce ads, we use the following target methods as in-

put to Brahmastra:

Lcom/google/ads/AdView;-><init>

Lcom/millennialmedia/android/MMAdView;-><init>

Collecting ads & landing pages: We redirect all the net-

work trafﬁc from executing test apps through a

Fiddler

proxy [8]. We install the Fiddler SSL certiﬁcate on the

phone emulator as a trusted certiﬁcate to allow it to ex-

amine SSL trafﬁc as well. We then identify the requests

made by the ad libraries to their server component us-

ing domain names. Once these requests are collected,

we replay these traces (several times, over several days),

to fetch ad data from the ad servers as if these requests

were made from these apps. This ad data is generally in

the form of a JSON or XML object that contains details

about the kind of ad served (image or text), the content

to display on the screen (either text or the URL of an im-

age), and the URL to redirect to if the ad is clicked upon.

We record all of above for analysis.

Analyzing ads & landing pages: We use two methods

to characterize ads and landing pages. First, for each

landing page URL collected, we probe the Web of Trust

(WoT) database [9] to get the “child safety” score. Sec-

ond, to better understand the reasons why landing pages

or ads may not be appropriate for children, we use crowd-

sourcing (via Amazon Mechanical Turk [3]) to label each

ad and landing page and to collect detailed information

such as the type of personal information that landing

pages collect. As data collected from crowds may in-

clude inconsistent labeling, we use majority voting to ﬁl-

ter out noise.

7.3 Results

Dataset: We collected our dataset in January 2014. To

ﬁnd apps intended for children, we use a list of apps

categorized as “Kids” in Amazon’s Android app store

Since apps offered from the Amazon store are protected

with DRM and resist bytecode rewriting, we crawled the

Play store for apps with the same package name.

Starting from slightly over 4,000 apps in the Kids cat-

egory, we found 699 free apps with a matching package

name in the Play store. Among these, we ﬁnd 242 apps

that contain the AdMob or Millennial Media ad libraries.

Using Brahmastra, we were successfully able to retrieve

at least one ad request for 220 of these apps (also in Jan-

uary 2014), for which we report results in this section.

For the remaining 22 apps, either Brahmastra could not

navigate to the correct page, or the app did not serve any

ad despite reaching the target page.

Results: We collected ads from each of the 220 apps

over 5 days, giving us a total collection of 566 unique

ads, and 3,633 unique landing pages. Using WoT, we de-

termine that 183 out of the 3,633 unique landing pages

have the child-safety score below 60, which fall in the

“Unsatisfactory”, “Poor” or “Very Poor” categories. 189

out of the 220 apps (86%) pointed to at least one of these

pages during the monitoring period. Note that WoT did

not contain child-safety ratings for 1,806 pages, so these

Google Play store does not have a separate kids category.

USENIX Association 23rd USENIX Security Symposium 1031

Info Type Landing Pages Apps

Home address 47 58

First and last name 231 174

Online contact 100 94

Phone number 17 15

Total 235 175

Table 2: Personal information collected by landing pages

numbers represent a lower bound. We then used Amazon

Mechanical Turk to characterize all 566 ads, and 2,111

randomly selected landing pages out of the 3,633. For

each ad and landing page, we asked Amazon Mechan-

ical Turk to check whether they collect personal infor-

mation (of each type) and whether they contain inappro-

priate content for children (see Appendix B for the task

details). We offered 7 cents (US) per each task (which

involves answering various questions for each website or

banner ad) and collected three responses per data point.

As discussed above, we only counted responses that were

consistent across at least two out of three respondents, to

ﬁlter out noise.

Table 2 summarizes the types of personal informa-

tion that landing pages ask users to provide as labeled

by Amazon Mechanical Turk. We ﬁnd that at least 80%

of the apps in the dataset had displayed ads that point to

landing pages with forms to collect personal information.

On a manual examination of a subset of these pages, we

found no labeling errors. We also found that none of the

sites we manually checked attempt to acquire parental

consent when collecting personal information. See Ap-

pendix B for examples.

Table 3 breaks down child-inappropriate content of the

ads displayed in apps as labeled by Amazon Mechani-

cal Turk. Although COPPA does not regulate the con-

tent of online services, we still ﬁnd it concerning that

36% (80 out of 220) of the apps display ads with con-

tent deemed inappropriate for children. In particular 26%

(58 apps) displayed ads that offer free prizes (e.g., Fig-

ure 13), which is considered a red ﬂag of deceptive adver-

tising, especially in ads targeting children as discussed in

guidelines published by the Children’s Advertising Re-

view [10]. We also analysed the content displayed on

the landing pages, and found a similar number of content

violations as the ad images.

8 Analysis of Social Media Add-ons

Our second use case is to test apps against a recently

discovered vulnerability associated with the Facebook

SDK [30]. Our testing with Brahmastra shows that 13

out of 200 Android apps are vulnerable to the attack.

Fixing it requires app developers to update the authen-

tication logic in their servers as recommended by [30].

Content Type Image Ads Apps

Child exploitation 2 8

Gambling, contests, lotteries or

sweepstakes

3 2

Misleading users about the

product being advertised

7 16

Violence, weapons or gore 4 5

Alcohol, tobacco or drugs 3 3

Profanity and vulgarity 0 0

Free prize 39 58

Sexual or sexually suggestive

content

12 29

Total 62 80

Table 3: Breakdown of child-inappropriate content in ads

8.1 Testing Goal

The Facebook access token vulnerability discussed

in [30] can be exploited by attackers to steal the vic-

tim’s sensitive information stored in vulnerable apps. For

instance, if a malicious-yet-seemingly benign news app

can trick the victim once to use the app to post a fa-

vorite news story on the victim’s Facebook wall (which

is a common feature found in many news apps), then the

malicious app can use the access token obtained from the

Facebook identity service to access sensitive information

stored by any vulnerable apps that the victim had inter-

acted with and have been associated with the victim’s

Facebook account. This attack can take place ofﬂine—

once the malicious app obtains an access token, then it

can send the token to a remote attacker who can imper-

sonate as the victim to vulnerable apps.

Figure 10 gives the steps that allow a malicious appli-

cation to steal

Victim’s information from VulApp

. The

fact that the online service (

VulApp

) is able to retrieve

the user’s information from Facebook only means that

the client (

MalApp

) possesses the privilege to the Face-

book service, but is not a proof of the client app’s iden-

tity (

MalApp

= VulApp

). The shaded boxes in Figure 10

highlight the vulnerability. See [30] for more detail.

Wang et al. [30] manually tested 27 Windows 8 apps

and showed that 78% of them are vulnerable to the ac-

cess token attack. Our goal is to scale up the testing to a

large number of Android applications. Note that testing

for this vulnerability requires runtime analysis because

the security violation assumptions are based on the inter-

actions among the application, the application service,

and Facebook service.

8.2 Testing Procedure

The testing has three steps. For a given app, we ﬁrst need

to drive apps to load a Facebook login screen. Second,

we need to supply valid Facebook login credentials to ob-

serve interactions between the test application and Face-

1032 23rd USENIX Security Symposium USENIX Association

7. Get user info with access_token

8. Provide Victim’s info

1. Click Login with Facebook

Victim MalApp

VulApp

Facebook ID service

5. Return access_token

2. Initiate login with Facebook

3. Prompt Facebook login screen

4. Provide Victim’s Facebook credentials

9. Authenticate this session as Victim

6. Authenticate with

access_token

Figure 10: Facebook’s access token, intended for autho-

rizing access to

Victim’s info, is used by VulApp

to au-

thenticate the session as

Victim. From step 9, MalApp

can steal Victim’s sensitive information in VulApp

book ID service. Third, we need to determine whether

the test application misuses a Facebook access token for

authenticating a client (steps 7-9) by monitoring network

trafﬁc and application behavior after providing a fraudu-

lent access token.

Driving apps to display Facebook login: We use Brah-

mastra to automatically drive apps to invoke the Face-

book SDK’s authentication methods shown in Figure 7.

Once the authentication methods open the sign-in win-

dow, we supply valid Facebook credentials.

Manipulating trafﬁc with MITM proxy: As before,

we direct all network trafﬁc through a

Fiddler proxy.

Since Facebook sign-in trafﬁc is encrypted over SSL, we

also install a

Fiddler SSL certiﬁcate on the phone emu-

lator to decrypt all SSL trafﬁc.

To manipulate the login, we record an

access token

from a successful login session associated with another

application (and therefore simulating an attacker as illus-

trated in the steps 1-5 of Fig. 10) and use the script shown

in Fig. 11. It runs over HTTP responses, and overwrites

an incoming

access token with a recorded one.

8.3 Experiments

Dataset: We randomly draw 200 apps from the dataset

used in §6 for this testing.

Results: We ﬁnd that 18 out of 200 apps use a Face-

book access token for authentication, and among them 13

apps are vulnerable to a fraudulent access token (72%). 5

apps appear not vulnerable, and show some sort of error

message when given a fraudulent access token. The re-

maining 182 apps use the Facebook SDK merely to post

content to the user’s wall, and not as an authentication

mechanism. We determined this by looking at the net-

1 if(oSession.url.Contains("m.facebook.com")) {

2 var toReplace = "access_token=CAAHOi...";

3 ...

4 if(oSession.oResponse.headers.

5 ExistsAndContains("Location", "access_token"))

6 {

7 oSession.oResponse.headers["Location"] =

8 oSession.oResponse.headers["Location"].

9 replace(oRegEx, toReplace);

10 oSession["ui-customcolumn"] = "changed-header";

11 }}

Figure 11: A script used to manipulate access token:

We only show the ﬁrst 6 bytes of the access token used

in the attack.

work trafﬁc at the login event, and observing that all of it

is sent only to Facebook servers.

To understand how widespread the vulnerability is, we

look at the statistics for the number of downloads on the

Google Play store. Each of the 13 vulnerable apps has

been downloaded more than 10,000 times, the median

number of app downloads is over 500,000, and the most

popular ones have been downloaded more than 10 mil-

lion times. Further, these 13 apps have been built by

12 distinct publishers. This shows that the problem is

not restricted to a few na

ıve developers. We shared the

list of vulnerable apps with a Facebook security team on

2/27/2014 and got a response immediately that night that

they had contacted the affected developers with the in-

structions to ﬁx. The privacy implications of the pos-

sessing the vulnerability are also serious. To look at what

user data can potentially be exﬁltrated, we manually in-

vestigated the 13 vulnerable apps. Users of these apps

may share a friends list, pictures, and messages (three

dating apps); photos and videos (two apps); exercise logs

and stats (one app); homework info (one app) or favorite

news articles, books or music preferences (remaining six

apps). By exploiting the vulnerability, a malicious app

could exﬁltrate this data.

9 Related Work

Automated Android app testing: A number of recent

efforts proposed improvements over Android Monkey:

AndroidRipper [13] uses a technique known as GUI rip-

ping to create a GUI model of the application, and ex-

plores its state space. To improve code coverage, An-

droidRipper relies on human testers to type in user cre-

dentials to get through blocking pages. However, de-

spite this manual effort, the tool shows less than 40%

code coverage after exploring an app for 4.5 hours. App-

sPlayground [26] employs a number of heuristics—by

guessing the right forms of input (e.g., email address, zip

code) and by tracking widgets and windows in order to

USENIX Association 23rd USENIX Security Symposium 1033

reduce duplicate exploration. It shows that these heuris-

tics are helpful although the coverage is still around 33%.

SmartDroid [31] uses a combination of static and dy-

namic analysis to ﬁnd the UI elements linked to sensitive

APIs. However, unlike Brahmastra, SmartDroid explores

every UI element at runtime to ﬁnd the right view to

click. A3E [14] also uses static analysis to ﬁnd an activ-

ity transition graph and uses the graph to efﬁciently ex-

plore apps. We leveraged the proposed technique when

building an execution path. However, similarly to the

tools listed above, A3E again uses runtime GUI explo-

ration to navigate through activities. In contrast to these

works, Brahmastra determines an execution path using

static analysis and rewrites an app to trigger a planned

navigation, bypassing known difﬁculties related to GUI

exploration.

Security analysis of in-app ads: Probably because only

recently COPPA [6] had been updated to include mo-

bile apps

, we are not aware of any prior work looking

into the issues around COPPA compliance of advertise-

ments (and the corresponding landing pages) displayed

within apps directed at children. However, several past

works investigated security and privacy issues with re-

spect to Android advertising libraries. AdRisk [22] is a

static analysis tool to examine advertising libraries inte-

grated with Android apps. They report that many ad li-

braries excessively collect privacy-sensitive information

and expose some of the collected information to adver-

tisers. Stevens et al. examine thirteen popular Android

ad libraries and show the prevalent use of tracking iden-

tiﬁers and the collection of private user information [28].

Worse, through a longitudinal study, Book et al. show

that the use of permissions by Android ad libraries has

increased over the past years [18].

Analyzing logic ﬂaws in web services and SDKs: The

authentication vulnerability discussed in §8 falls into the

category of logic ﬂaws in web programming. Recent pa-

pers have proposed several technologies for testing var-

ious types of logic ﬂaws [16, 17, 21]. However, these

techniques mainly target logic ﬂaws in two-party web

programs, i.e., programs consisting of a client and a

server. Logic ﬂaws become more complicated and in-

triguing in multi-party web programs, in which a client

communicating with multiple servers to accomplish a

task, such as the Facebook-based authentication that we

focus in this paper. AuthScan is a recently developed

technique to automatically extract protocol speciﬁcations

from concrete website implementations, and thus dis-

cover new vulnerabilities in the websites [15]. In con-

trast, our goal is not to discover any new vulnerability on

a website, but to scale up the testing of a known vulnera-

bility to a large number of apps.

The revision was published on July 2013.

10 Discussion

Limitations: Although Brahmastra improves test hit

rates over Monkey-like tools, we discover several id-

iosyncratic behaviors of mobile apps that challenge run-

time testing. Some apps check servers upon launching

and force upgrading if newer versions exist. Some apps

constantly load content from remote servers, showing

transient behaviors (e.g., extremely slow at times). We

also have yet to implement adding callbacks related to

sensor inputs. Another challenge is to isolate dependent

components in the code. We assume that each activity is

more or less independent (except that they pass parame-

ters along with intent) and use our jump start technique

to bypass blocking pages and to speed up testing. How-

ever, we leave as future work a task to statically deter-

mine dependent activities to ﬁnd activities to jump-start

to without affecting the program behavior.

Other runtime security testing of mobile apps: As

mobile apps are highly driven by user interaction with

visual components in the program, it is important to an-

alyze the code behavior in conjunction with runtime UI

states. For instance, malicious third-party components

can trick users into authorizing the components to access

content (e.g., photos) that the users intended to share with

the application. Brahmastra can be used to capture visual

elements when certain APIs are invoked to check against

such click jacking attempts. Brahmastra can also auto-

mate the testing to check whether privacy-sensitive APIs

are only invoked with explicit user interactions.

11 Conclusion

We have presented a mobile app automation tool, Brah-

mastra, that app store operators and security researchers

can use to test third-party components at runtime as they

are used by real applications. To overcome the known

shortcomings of GUI exploration techniques, we analyze

application structure to discover desired execution paths.

Then we re-write test apps to follow a short path that in-

vokes the target third-party component. We ﬁnd that we

can more than double the test hit rate while speeding up

testing by a factor of seven compared to a state-of-the-art

Monkey tool.

We use Brahmastra for two case studies, each of which

contributes new results: checking if third-party ad com-

ponents in kids apps are compliant with child-safety reg-

ulations; and checking whether apps that use Facebook

the kids apps, we discover 36% of 220 kids apps dis-

play ads deemed inappropriate for children, and 80% of

the apps display ads that point to landing pages which

attempt to collect personal information without parental

consent. Among the apps that use Facebook Login, we

ﬁnd that 13 applications are still vulnerable to the Face-

1034 23rd USENIX Security Symposium USENIX Association

book access token attack even though the attack has been

known for almost a year. Brahmastra let us quickly check

the behavior of hundreds of apps for these studies, and

it can easily be used for other studies in the future—

checking whether privacy-sensitive APIs can be invoked

without explicit user interaction, discovering visible UI

elements implicated in click jacking attempts, and more.

Acknowledgments

This material is based on research sponsored in part by

DARPA under agreement number FA8750-12-2-0107,

NSF CCF-1139021, and University of Maryland Part-

nership with the Laboratory of Telecommunications Sci-

ences, Contract Number H9823013D00560002. The

U.S. Government is authorized to reproduce and dis-

tribute reprints for Governmental purposes notwithstand-

ing any copyright notation thereon.

References

[1] A tool for reverse engineering Android apk ﬁles.

http://code.google.com/p/android-

apktool/.

[2] Activity — Android Developers. http:

//developer.android.com/reference/

android/app/Activity.html.

[3] Amazon Mechanical Turk. https://www.

mturk.com.

[4] Android Debug Bridge. http://developer.

android.com/tools/help/adb.html.

[5] Android Developers, The Developer’s

Guide. UI/Application Exerciser Monkey.

http://developer.android.com/

tools/help/monkey.html.

[6] Complying with COPPA: Frequently Asked

Questions. http://business.ftc.gov/

documents/Complying-with-COPPA-

Frequently-Asked-Questions.

[7] Dexpler: A Dalvik to Soot Jimple Translator.

http://www.abartel.net/dexpler/.

[8] Fiddler. http://www.telerik.com/

fiddler.

[9] Web of Trust. https://www.mywot.com/.

[10] Self-Regulatory Program for Childrens Ad-

vertising, 2009. http://www.caru.org/

guidelines/guidelines.pdf.

[11] AppBrain, Feb. 2014. http://www.

appbrain.com/stats/libraries/ad.

[12] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers:

Principles, Techniques and Tools. Addison-Wesley,

1986.

[13] D. Amalﬁtano, A. R. Fasolino, S. D. Carmine,

A. Memon, and P. Tramontana. Using GUI Ripping

for Automated Testing of Android Applications. In

Proceedings of the IEEE Conference on Automated

Software Engineering (ASE), 2012.

[14] T. Azim and I. Neamtiu. Targeted and depth-ﬁrst

exploration for systematic testing of android apps.

In OOPSLA, 2013.

[15] G. Bai, J. Lei, G. Meng, S. S. V. P. Saxena, J. Sun,

Y. Liu, and J. S. Dong. Authscan: Automatic ex-

traction of web authentication protocols from im-

plementations. In NDSS, 2013.

[16] P. Bisht, T. Hinrichs, N. Skrupsky, R. Bobrowicz,

and V. N. Venkatakrishnan. Notamper: Automati-

cally detecting parameter tampering vulnerabilities

in web applications. In CCS, 2010.

[17] P. Bisht, T. Hinrichs, N. Skrupsky, and V. N.

Venkatakrishnan. Waptec: Whitebox analysis of

web applications for parameter tampering exploit

construction. In CCS, 2011.

[18] T. Book, A. Pridgen, and D. S. Wallach. Longitu-

dinal analysis of android ad library permissions. In

IEEE Mobile Security Technologies (MoST), 2013.

[19] J. Dean, D. Grove, and C. Chambers. Optimization

of Object-Oriented Programs Using Static Class

Hierarchy Analysis. In Proceedings of the 9th Eu-

ropean Conference on Object-Oriented Program-

ming, pages 77–101, 1995.

[20] M. Egele, D. Brumley, Y. Fratantonio, and

C. Kruegel. An Empirical Study of Cryptographic

Misuse in Android Applications. In CCS, 2013.

[21] V. Felmetsger, L. Cavedon, C. Kruegel, and G. Vi-

gna. Toward automated detection of logic vulner-

abilities in web applications. In USENIX Security,

2010.

[22] M. Grace, W. Zhou, X. Jiang, and A.-R. Sadeghi.

Unsafe Exposure Analysis of Mobile In-App Ad-

vertisements. In WiSec, 2012.

[23] S. Hao, B. Liu, S. Nath, W. G. Halfond,

and R. Govindan. PUMA: Programmable UI-

Automation for Large Scale Dynamic Analysis of

Mobile Apps. In Mobisys, 2014.

USENIX Association 23rd USENIX Security Symposium 1035

[24] J. Jeon, K. K. Micinski, J. A. Vaughan, A. Fogel,

N. Reddy, J. S. Foster, and T. Millstein. Dr. Android

and Mr. Hide: Fine-grained Permissions in Android

Applications. In ACM CCS Workshop on Security

and Privacy in Smartphones and Mobile Devices,

2012.

[25] B. Liu, S. Nath, R. Govindan, and J. Liu. DECAF:

Detecting and Characterizing Ad Fraud in Mobile

Apps. In USENIX NSDI, 2014.

[26] V. Rastogi, Y. Chen, and W. Enck. Appsplay-

ground: Automatic security analysis of smartphone

applications. In Proceedings of the ACM Confer-

ence on Data and Application Security and Pri-

vacy, 2013.

[27] L. Ravindranath, S. Nath, J. Padhye, and H. Balakr-

ishnan. Automatic and Scalable Fault Detection for

Mobile Applications. In Mobisys, 2014.

[28] R. Stevens, C. Gibler, J. Crussell, J. Erickson, and

H. Chen. Investigating user privacy in android ad

libraries. In IEEE Mobile Security Technologies

(MoST), 2012.

[29] R. Valle-Rai, P. Co, E. Gagnon, L. J. Hendren,

P. Lam, and V. Sundaresan. Soot - a Java byte-

code optimization framework. In IBM Centre for

Advanced Studies Conference, 1999.

[30] R. Wang, Y. Zhou, S. Chen, S. Qadeer, D. Evans,

and Y. Gurevich. Explicating SDKs: Uncover-

ing Assumptions Underlying Secure Authentica-

tion and Authorization. In USENIX Security, 2013.

[31] C. Zheng, S. Zhu, S. Dai, G. Gu, X. Gong, X. Han,

and W. Zou. Smartdroid: an automatic system for

revealing ui-based trigger conditions in android ap-

plications. In ACM CCS Workshop on Security and

Privacy in Smartphones and Mobile Devices, 2012.

A Constant Propagation Analysis

We extend the existing constant propagation analysis so

as to trace

intents, UI elements, and listeners. In ad-

dition to traditional value types, such as numerical or

string constant, we add meta-class, object, and intent

sorts, which track class ids, object references, and in-

tent instances, respectively. For instructions that create

objects; load class ids; or invoke special APIs such as

Intent.setClass(), we add their semantics into the data-

ﬂow transfer function.

Figure 12 illustrates how we extend data-ﬂow lattice;

how we conform to meet operation property; and how we

deﬁne semantics of relevant instructions.

1 type lattice = ...

2 | Clazz of string (∗ const−class ∗)

3 | Object of string (∗ instance ∗)

4 | Intent of string (∗ Intent for a speciﬁc component ∗)

5 | ...

6 let meet l1 l2 = match l1 , l2 with ...

7 | Clazz c1, Clazz c2 when 0 = compare c1 c2 → l1

8 | Object o1, Object o2 when 0 = compare o1 o2 → l1

9 | Intent i1 , Intent i2 when 0 = compare i1 i2 → l1

10 | ...

11 let transfer (inn: lattice Map.t) (op, opr) = ...

12 else if OP NEW = op then (∗ NEW ∗)

13 (

14 let dst :: id ::[] = opr in

15 let cname = Dex.get ty name id in

16 if 0 = compare cname ‘‘ Intent ’’

17 then Map.add dst ( Intent ‘‘’’) inn

18 else Map.add dst (Object cname) inn

19 ) ...

Figure 12: Abbreviated source code of extended constant

propagation analysis. Meta-class, object, and intent sorts

maintain information as string, and they can be merged

only if internal values are identical, hence must-analysis.

As an example, this shows how to handle opcode

NEW.

1036 23rd USENIX Security Symposium USENIX Association

B Examples of Ads in Kids Apps

(a)

(b)

(c)

(d)

(e) clicking

the banner ad

Figure 13: a) and (b) offer a free prize and (c) and (d) are sexually suggestive. (e) shows an example where clicking a

banner ad displayed in a kids app opens up a landing page that presents forms to collect personal information.

Figure 14: A screenshot of the Amazon Mechanical Turk task that we created to characterize landing pages pointed

by ads displayed in kids apps