2011: The year of mobile malware? Nope.

One of the discussion topics at this week’s Mobile Security Barcamp in Sophia Antipolis was mobile malware, with some people claiming that 2011 will be the year of mobile malware. I agree with them that, as mobile gains more and more ground, and as platforms like iOS and Android become more and more common, they become a prime target for malware developers. So yes, more mobile malware will be produced.

However, naming 2011 “The Year of Mobile Malware” also implies that this malware will have a real effect on users, and that’s where I am much less sure. The main reason for this doubt is the use of application stores as the dominant way to deploy applications. The fact that most mobile applications go through the same few filters (Apple’s App Store, Google’s Android Market, and a few more) is of great help against malware, for several reasons:

Application vetting. On an application store, applications are vetted against a policy. The process depends greatly on the store, with Apple’s process being the best known, mostly because the policy it enforces is not completely clear, and the process is fully opaque. In all cases, the existence of a vetting process provides a good place to attempt the identification and rejection of malware.
Application banning. As soon as an application is recognized as malware, it can be banned from an application store, immediately stopping its distribution. This also allows application store managers to analyze the application and enhance their vetting process to better catch similar malware.
Kill switches. That’s the extra mile after banning, which is (confirmed?) implemented in both Apple’s and Google’s application stores. Once an application is banned from the store, it can be removed or deactivated on all devices, as soon as they connect to the store (usually to check for updates). This further reduces the impact of malware.
Other enforcement means. Some people argue that automated kill switches are overkill. Alternatives are possible, which may show up if the amount of malware rises significantly. For instance, once an application is identified as malware, a popup could be used to warn the user and propose the deactivation/removal of the application.
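
To make the difference between an automated kill switch and a user warning concrete, here is a minimal sketch in Python. It is not the protocol of any actual application store: the function `sync_with_store`, its `mode` parameter, and the `user_accepts` callback are all hypothetical names introduced for illustration, and the only assumption is that a device learns the list of banned applications when it connects to the store.

```python
def sync_with_store(installed, banned, mode="warn", user_accepts=lambda app_id: False):
    """Return the set of application ids still installed after one store sync.

    Hypothetical model, not a real store API:
    - mode="kill": banned applications are removed without asking the user.
    - mode="warn": the user is asked (via user_accepts) whether to remove each one.
    """
    if mode not in ("kill", "warn"):
        raise ValueError("mode must be 'kill' or 'warn'")

    remaining = set()
    for app_id in installed:
        if app_id not in banned:
            remaining.add(app_id)      # not banned: nothing to do
        elif mode == "kill":
            continue                   # kill switch: removed silently
        elif user_accepts(app_id):
            continue                   # warning shown, user agreed to remove
        else:
            remaining.add(app_id)      # warning shown, user kept the application
    return remaining
```

In this model, the two approaches only differ when the user declines the removal: the kill switch guarantees that the banned application disappears, while the warning leaves the final decision to the user.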

My background is in the vetting process, and more specifically, in the security analysis of mobile applications. At Trusted Labs, I have practiced several types of analysis of applications, including:

In-depth white-box evaluations. That’s the best kind of analysis, in which the evaluator has access to the application’s source code and related documentation, in addition to the running code. Hiding malware is possible, of course, but it requires significant effort, and it is very hard to use the same technique twice (at least with the same evaluator/lab). However, such evaluations are very expensive, they require the developer to provide source code, and they are basically used only for really sensitive applications, like payment applications.
In-depth black-box evaluations. Here, the evaluator only gets a copy of the application (i.e., binary code only), together with the user documentation. Usually, such an evaluation is performed in a fixed, limited time, and it relies heavily on the intuition of the evaluator. The process usually starts with some automated analysis of the application, at least a simple one. For instance, knowing which permissions are requested and in which APIs they are used already provides an interesting point of view on the application: it may help identify an application that requests Internet access and read access to contact data, and combines them to dump its user’s contacts to a spam vendor. The costs are usually lower than for white-box evaluations, and all necessary information is available in application stores. However, such processes remain expensive, and should be reserved for some applications (for instance, suspicious applications, applications that require sensitive permissions, or high-profile applications).
Shallow black-box evaluations. In that case, cost is the priority, and the process is likely to be mostly automated, with static code analysis at its heart. The objective is to design an automated analysis with an extremely low rate of false negatives (applications that violate the policy and pass the evaluation), and the lowest possible rate of false positives (applications that follow the policy and are rejected by the evaluation). Of course, the policy must also be flexible enough to allow the most common behaviors to pass the evaluation. The cost of the static analysis itself is very low, so the actual cost depends on the proportion of applications that need to undergo an in-depth evaluation.
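
The permission example and the cost argument above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: an application is summarized by the set of permissions it requests, and nothing else. The names `AppSummary`, `SENSITIVE_COMBINATIONS`, `triage`, and `in_depth_share` are hypothetical; the permission strings follow Android naming, but the policy itself is made up for the example.

```python
from dataclasses import dataclass

# Hypothetical policy: permission combinations that warrant a closer look.
# The combination below mirrors the "Internet + contacts" example; it is an
# illustrative assumption, not any store's actual policy.
SENSITIVE_COMBINATIONS = [
    frozenset({"android.permission.INTERNET",
               "android.permission.READ_CONTACTS"}),
]


@dataclass(frozen=True)
class AppSummary:
    name: str
    permissions: frozenset


def triage(app):
    """Shallow check: 'pass' if no sensitive combination is requested,
    'in-depth' if the application should go to a human evaluator."""
    for combo in SENSITIVE_COMBINATIONS:
        if combo <= app.permissions:   # every permission of the combo is requested
            return "in-depth"
    return "pass"


def in_depth_share(apps):
    """Proportion of applications sent to in-depth evaluation, which is
    what drives the overall cost of the process."""
    if not apps:
        return 0.0
    flagged = sum(1 for app in apps if triage(app) == "in-depth")
    return flagged / len(apps)
```

A real analysis would also look at how the permissions are used in the code, as described above; a permission-only check like this one flags every legitimate application that synchronizes contacts, which is exactly the kind of false positive that ends up on a human evaluator’s desk.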

When we discussed malware at the Barcamp, I claimed (live and on Twitter) that static analysis could be effective, and it seems that not everybody agreed with me, in particular Craig Heath, although we reach similar conclusions about mobile malware in 2011.

The basic reason for our disagreement may be the definition of static analysis, and what the target is. Here are a few things about static analysis:

Static analysis is about analyzing binary code. We want to analyze the code that will actually run on the devices, so let’s not bother with source code analysis: source code analysis is a tool for developers, basically useless in application stores.
Static analysis is about proving properties on this binary code. The notion of proof is important, as we want to make sure that the applications that pass static analysis actually satisfy all policy rules.
Static analysis works much better on structured code, i.e., on code on which we have some guarantees, like Java bytecode. We can do really interesting things on bytecode-based frameworks, because the technology is now ready. On native code, the problem is more difficult, because it is much more difficult to prove properties. Nevertheless, technology keeps evolving, and native code analysis is progressing.
Static analysis should not be fully automated. Instead, it should be complemented by other ways to verify what it wasn’t able to prove. Of course, a good tool will reject few applications and require little human involvement, but it is always necessary to monitor developer practices and manually approve or reject some applications.
Static analysis should evolve, or at least the policies it checks should evolve. Policies usually evolve by replacing a very restrictive policy with a less restrictive and more precise one. On the first evaluations I did, a typical rule was “The application should not connect to Internet”; it became “The application should only connect to its issuer’s domain”, then “The application should only connect to statically defined domains”, and finally basically no rule at all (although in practice the rule about statically defined domains should be satisfied by most applications, which only connect to a few fixed sites). Such changes follow current practice, as well as the evolution of trust between users/operators/distributors and developers.
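
The successive connection rules quoted above can be written as three checks of decreasing strictness. The sketch below is in Python and rests on a big assumption: that an earlier static analysis pass has already extracted the URL constants that reach network calls (`static_urls`) and has determined whether some destination could not be resolved statically (`has_dynamic_destination`). Those two inputs and all function names are hypothetical; the sketch does not perform the analysis itself.

```python
from urllib.parse import urlparse


def hosts(urls):
    """Host names of the given URLs (None for a URL without a host)."""
    return {urlparse(u).hostname for u in urls}


def no_internet(static_urls, has_dynamic_destination):
    """Rule 1: the application should not connect to Internet."""
    return not static_urls and not has_dynamic_destination


def issuer_domain_only(static_urls, has_dynamic_destination, issuer_domain):
    """Rule 2: the application should only connect to its issuer's domain
    (the domain itself or one of its subdomains)."""
    if has_dynamic_destination:
        return False
    return all(
        h is not None and (h == issuer_domain or h.endswith("." + issuer_domain))
        for h in hosts(static_urls)
    )


def static_domains_only(static_urls, has_dynamic_destination):
    """Rule 3: the application should only connect to statically defined domains."""
    return not has_dynamic_destination
```

Each rule accepts every application accepted by the previous one, which is the sense in which the policy becomes less restrictive over time while still stating precisely what is checked.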

So, static analysis is a way to exploit the adapted Business Intelligence techniques that Craig Heath rightly favors. What I have noticed over several years of using static analysis tools in evaluations is that they greatly simplify the work of a human evaluator, but they have a hard time dealing with two things: very complex applications, and applications that are “borderline” with respect to the policy. Because of that, a human being needs to validate their results (at least, the rejections). As we know, borderline applications are often the disruptive ones, and it is not acceptable to reject them automatically.

In 2011, we will see what kind of malware comes up, but at least on some platforms, static analysis will play a role in the application vetting process, making it much more efficient on simple applications, and allowing evaluators to spend more time vetting sensitive/complex applications. And in the end, the infrastructure is quite likely to win the fight with very limited impact on end users. So, as in the past 10 years, we are not yet in the year of mobile malware.

3 Comments

Craig Heath wrote:

January 20, 2011 at 19:42

I think we are actually in agreement here, it just goes to show the limitations of Twitter’s 140 characters! Bytecode is much easier to do effective static analysis on, because of the type safety and the inability of the code to manipulate references to arbitrary objects. The vast majority of malware on the Symbian platform is compiled native ARM code though, which is rather harder to make any reliable judgements about. Nevertheless, static analysis has its part to play, and we agree that there has to be both an element of automation and an element of human discretion, so the trick is just how to balance those two most efficiently.
Eric Vétillard wrote:

January 20, 2011 at 22:21

Hi Craig,

There is still one point on which I kind of disagree. Strong typing makes the analysis easier, but I don’t think that it is the most difficult part. I rather believe that the most difficult part comes from the fact that native code execution is arbitrary, and that it can be used to manipulate code pointers. This makes the analysis much more complex, because it greatly increases the possible number of execution patterns. There is already some work on such native code analysis, which works under the condition that the code satisfies some properties. This is quite likely to come in the near future.

If we turn to rumors, I would even say that static analysis is one good reason for Apple to force the use of their tools. If the code generated by their tools satisfies a few interesting properties, it makes advanced static analysis possible, which can significantly reduce the cost of vetting applications. Now that can be considered as a competitive advantage, or as a very nice way to balance automation and human discretion very efficiently.
Tweets that mention 2011: The year of mobile malware? Nope. – On the road to Bandol -- Topsy.com wrote:

January 22, 2011 at 11:08

[…] This post was mentioned on Twitter by David Rogers, Eric Vetillard. Eric Vetillard said: Will 2011 be the year of mobile malware? I don't think so, and static analysis will help http://bit.ly/dYnb1F (#MoCaSa followup) […]

On the road to Bandol