JC101-12C: Defending against attacks

UPDATED (05/06/08): Fixed problem with loops that zapped examples.

UPDATED (06/06/08): Fixed some bugs.

In the previous entry, we have looked at a few common attacks on smart cards. In this one, we will look at possible defenses against such attacks. Instead of applying them to our example, we will look at one simple example, of which we will write several versions, more or less following its evolution over time. This simple function will be the standard PIN comparison method. The naive version can be written as follows:

  boolean verify(byte[] buffer, short ofs, byte len)
  {
    // No comparison if PIN is blocked
    if (triesLeft < 0)
      return false ;

    // Main comparison
    for(short i=0; i < len; i++)
      if (buffer[ofs+i] != pin[i])
      {
        triesLeft-- ;
        authenticated[0] = false ;
        return false ;
      }

    // Comparison is successful
    triesLeft = maxTries ;
    authenticated[0] = true ;
    return true ;
  }

Well, I believe that this version is at least correct. It will compare a proposed value to the refernce PIN, it will count the number of attempts, and refuse to perform further comparisons after a fixed number of attempts. Of course, presenting the right PIN will reset all that.

However, in terms of security (as defending agaisnt attacks), this thing is quite poor. There are many improvement to be made, in order to defend agaisnt many attacks. If you want to do an exercise, try to think of improvements before you continue with the rest of the entry.

As I mentioned earlier, the idea is here to give some kind of a historical approach. One of the early attacks was based on basic physics. On early smart cards, there were two power supplies: VCC for standard use, and VPP for operations like EEPROM programming that requires more intensity (or voltage, physics has never been my thing). Still, the attack simply consisted in coating the VPP contact with a layer of insulation material (nail varnish seems to do the trick). Without hits supply of power, no more EEPROM programming, and counters never decrease.

This attack occurred on early phone cards (the ones that were used on public phones, not SIM cards). It provided their users with an unlimited supply of phone minutes. This lasted until some checks were performed after the counter change in order to ensure that the change actually occurred.

Such attacks don't apply to Java Card, since the Java Card platform needs to guarantee the atomicity of the persistent updates. Of course, VPP is not used on modern cards, which is another reasons for this attack not to work. I will not describe other historical attacks, but this example shows that defenses are built slowly. The first smart cards did not resist any modern attacks, and the defenses have been built as sophisticated attacks were designed and impemented.

Timing attacks

Still, let's start with a rather old attack, the timing attack. This attack requires some equipment, typically an oscilloscope. The attack consists in monitoring the card's power consumption during a sensitive computation. Some operations that consume a lot of energy are easy to spot, such as memory updates and cryptographic operations. These operations can be used as a reference mark for timing other operations.

In our example, we can time the PIN comparison itself. In all cases, the comparison ends with a persistent update, which should be easy to notice. The weakness is here very simple: the duration of the comparison depends on the number of correct bytes in the PIN code. Let's apply it to a 4-byte PIN code:

If the first byte is incorrect, then the first comparison fails.
If the second byte is incorrect, the comparison loops once, and the second comparison fails.
If the third byte is incorrect, the comparison loops twice, and the third comparison fails.
If the fourth byte is incorrect, the comparison loops three times, and the third comparison fails.
If all bytes are correct, the comparison does four complete loops before a write attempt.

It is quite easy to make the difference between the four first cases, but it is more difficult to make the difference between the two last cases, because the difference in terms of timing is quite subtle (just a loop index increment and comparison). This still represents several instructions, and a few cycles, so timing remains largely possible.

Now, we have a good vulnerability: we are able to spy on the comparison and to determine whether or not it succeeded. The next question is: how can I exploit this vulnerability? This is a very important question. We find vulnerabilities of various kinds in most evaluations; nevertheless, most of these vulnerabilities don't lead to a rejection of the evaluated product, because they can't be exploited by an attack.

In this particular case, exploitation is really easy. The idea is to cut the power right after the beginning of the persistent write operation at the end of the comparison. If an attacker does that, the write operation will fail. And guess what? the counter will not be decreased. With this attack, we get an unlimited number of tries, since the try counter is not decreased. If we combine it with the fact that we are able to time the operation precisely and determine whether or not the comparison succeeded, we have a good way to guess PIN codes.

Now, we have a full attack, not just a vulnerability. We can start assessing it. Here, we will look at the time it takes to perform the attack. If we assume a 4-digit PIN, there are 10,000 possible values. Guessing a value out of 10,000 will take on average 5,000 tries. If we assume that each attack (resetting the card, selected the appropriate application, sending the command to be attacked, cutting power, analyzing the result) takes about 30 seconds, we can compute that it will take 2,500 minutes on average (i.e., 41 hours and 40 minutes) to break a PIN code. Two days to break a PIN on a card is quite long; if the card was stolen, its owner is quite likely to have cancelled it by then.

The next step is to optimize our attack. If we look at the vulnerability, we know that we can measure not only the fact that a PIN is good or bad, but also the number of correct bytes. If we assume an encoding with a digit per byte, there are 10 possible values per byte, and it will take on average 5 tries to guess the first digit. It will then take 5 more tries to guess the second one, 5 more for the third one, and 5 more for the last one. If we sum this up, we can guess a PIN code on an average of 20 tries. With the same performance assumptions as above, it will take about 10 minutes to break a PIN. Finally, we hava a complete and well-optimized attack.

Let's now switch role and build a defense against this attack. The first countermeasure is quite obvious: we must ensure that the timing vulnerability cannot be exploited to get an infinite number of tries. The countermeasure is rather simple: we can simply decrement the try counter before to make the comparison, like shown below:

  boolean verify(byte[] buffer, short ofs, byte len)
  {
    // No comparison if PIN is blocked
    if (triesLeft < 0)
      return false ;

    // First decrements the number of remaining tries
    triesLeft-- ;

    // Main comparison
    for(short i=0; i < len; i++)
      if (buffer[ofs+i] != pin[i])
      {
        authenticated[0] = false ;
        return false ;
      }

    // Comparison is successful
    triesLeft = maxTries ;
    authenticated[0] = true ;
    return true ;
  }

With this piece of code, the timing attack remains possible, but the number of tries will be respected: 3 only. If we consider that there are 10,000 possible values, the probability to find the correct one is very low: about 1 in 3000. This means that we need over 3000 stolen cards to get one card with its PIN. Not an efficient attack. However, if we think of the optimized version, which guesses PIN codes one digit at a time, the probability to find the correct PIN in 3 guesses is much better, ad issuers may still consider it a risk.

The solution the is to do a constant-time comparison, in which the comparison is always performed on all bytes. It is fairly easy to write:

  boolean verify(byte[] buffer, short ofs, byte len)
  {
    // No comparison if PIN is blocked
    if (triesLeft < 0)
      return false ;

    // First decrements the number of remaining tries
    triesLeft-- ;

    // Main comparison
    boolean equal = true ;
    for(short i=0; i < len; i++)
      equal = equal && (buffer[ofs+i] != pin[i]) ;

    if (!equal)
    {
      // Comparison failed
      authenticated[0] = false ;
      return false ;
    } else {
      // Comparison is successful
      triesLeft = maxTries ;
      authenticated[0] = true ;
      return true ;
    }
  }

We now have a constant time comparison, or close to it. I am not claiming that there are no differences at all in a power trace, because, for instance, the data are different. But the attacker's task is now difficult, where it was rather easy before to add the countermeasure.

Observation attacks

This is going to be easy. Apart from the good old timing attack described above, there are no really interesting observation attacks on a PIN code, for at least two reasons. First, there is not much to observe. But more importantly, the operation is protected by a counter. Counters are a very efficient weapon against most attacks of all kinds, for two reasons:

The setup of an attack is is a fairly long trial-and-err process, in which the attack fails many times before to succeed. A counter can severely hinder such attacks.
Most attacks don't have a 100% success rate. Typical success rates are around 10% or 20%. If you have to combine two such attacks, your success rate is already around 1% to 4%. A counter with a small threshold can then severely reduce the likelihood of a successful attack on the field.

If we thing of the PIN comparison, there are actually things to observe. A good SPA analysis could provide information about the Hamming weight of the value compared, or it could even figure out the difference between the Boolean values true and false. Without a counter, these attacks could be worth trying.

Just to update the example, we will here clear the Hamming weight issue, by using two constants that are different from true and false, and happen to have the same Hamming weight. Let's first define our constants:

  public final static short BOOL_TRUE  = (short)0x5a5a ;
  public final static short BOOL_FALSE = (short)0xa5a5 ;

Now, let's update the code:

  boolean verify(byte[] buffer, short ofs, byte len)
  {
    // No comparison if PIN is blocked
    if (triesLeft < 0)
      return false ;

    // First decrements the number of remaining tries
    triesLeft-- ;

    // Main comparison
    short equal = BOOL_TRUE ;
    for(short i=0; i < len; i++)
      equal = equal && (buffer[ofs+i] != pin[i]) ;

    if (equal == BOOL_TRUE)
    {
      // Comparison is successful
      triesLeft = maxTries ;
      authenticated[0] = true ;
      return true ;
    } else {
      // Comparison failed
      authenticated[0] = false ;
      return false ;
    }
  }

Two small details here:

We are using the short type, because Java Card Classic uses a 16-bit virtual machine. Computations on boolean, byte and short are performed on 16-bit values. So, we may as well use all the bits.
I have reversed the test, because of an inherent flaw of the C language, inherited by Java. The value false is 0, and the value true is anything else. If anything goes wrong in a computation (intentionally or not), a false turns into a true. With an explicit test on an explicit true value, this flaw disappears.

Finally, what I have done here is not actually very efficient against observation attacks. We will see that it will be useful for other things. And also, this is an public introductory tutorial, and the really efficient countermeasures are neither obvious nor public.

Perturbation attacks

Perturbation attacks are always an issue, as soon as you are doing something with your chip, like running code or using data. No process is inherently safe from these attacks. In addition, few countermeasures are possible, and suicide (of the application, or of the card) often happens to be the only thing that an application can do. We will see that even this requires a significant effort. But before that, the application needs to be able to detect that it is actually being attacked; without detection, nothing can happen. Two levels of mechanisms are available for detection:

Hardware detectors. All modern chips include detectors for the most commonly used attacks (glitch, light, etc.). These detectors are monitoring the attacks themselves, or their direct effect on the silicon.The good news for Java developers is that they don't have to worry about this part, which is taken care of by the underlying system. The bad news is that these detectors cannot be perfected, so attackers can circumvent them.
Integrity checks and redundancy. That part is under the direct responsibility of the Java Card developer. The idea is here to detect the effect of a perturbation. The good news is that integrity checks do work if they are well written. The bad news is that they introduce redundancy in the application, which can be very/too costly for a Java Card application.

When thinking about adding integrity protections to a Java Card application, one has to think about the possible perturbation attacks, and to decide against which potential attacks to defend. Here is a non-exhaustive list of potential effects of good perturbation attacks:

Making the chip believe that a value read from EEPROM is 0.
Replacing a value read from EEPROM with a random value.
Reading from a random memory address instead of the expected address.
Writing a fixed/random value instead of the expected value in EEPROM
Forcing a test to go in one direction or another, independently of the result.
Skipping a method invocation.
Modify a key bit during a RSA computation, in order to get a false and interesting result.

Of course, it is hard to defend against all these attacks at all points of the program. This means that, in the development process, one needs to decide which attacks to defend against at which point of the program. And let's face it, this requires a bit of experience. Basically, the idea is to decide what an attacker can do. First, we will consider an attack on memory read operations.

The countermeasure against such attacks is simple: redundancy. Instead of trusting immediately a value that comes from EEPROM, we will perform additional checks. The most common response is to include some kind of control value in EEPROM, which will also be read and compared to the original value.

Let's apply this on the first comparison on triesLeft. In the simplest instance, we can duplicate the value of triesLeft in a variable named triesLeftBackup. Here is a typical first cut at it:

    // First checks the integrity of the variable
    if (triesLeft != triesLeftBackup)
      takeCountermeasure() ;

    // No comparison if PIN is blocked
    if (triesLeft < 0)
      return false ;

Is this correct? No. It provides adequate protection against modification of EEPROM between the last write and the current read, but it does not protect against attacks on the read operation itself. An attacker will most likely not even notice the countermeasure, since he will attack the second read operation from EEPROM (in the test). In order to adequately protect against perturbation attacks, the protected value must only be read once, and then stored in a local variable:

    // First checks the integrity of the variable
    byte tl = triesLeft ;
    if (tl != triesLeftBackup)
      takeCountermeasure() ;

    // No comparison if PIN is blocked
    if (tl < 0)
      return false ;

This means that we definitely make a difference between RAM and EEPROM, by assuming that reading from RAM is safe, and that attackers can't successfully perturb it. A less optimistic view would be to say that, if attackers were successful in perturbing RAM read operations, software countermeasures would not be of any effect anyway, so we don't need to bother about it.

Let's now extend the protection to the entire method. The code becomes as follows:

  boolean verify(byte[] buffer, short ofs, byte len)
  {
    // First checks the integrity of the variable
    byte tl = triesLeft ;
    if (tl != (short)(~triesLeftBackup))
      takeCountermeasure() ;

    // No comparison if PIN is blocked
    if (tl < 0)
      return ;

    // First decrements the number of remaining tries
    JCSystem.beginTransaction() ;
      triesLeft = --tl ;
      triesLeftBackup++ ;
    JCSystem.commitTransaction() ;

    // Verifies the new value
    if (triesLeft != (short)(~triesLeftBackup))
      takeCountermeasure() ;

    // Main comparison
    short equal = BOOL_TRUE ;
    for(short i=0; i < len; i++)
      equal = equal && (buffer[ofs+i] != pin[i]) ;

    if (equal == BOOL_TRUE)
    {
      // Comparison is successful
      // Reset the remaining tries to the max
      JCSystem.beginTransaction() ;
        triesLeft = maxTries ;
        triesLeftBackup = (byte)(~maxTries) ;
      JCSystem.commitTransaction() ;

      // Verifies the new value
      if ( (triesLeft != (short)(~triesLeftBackup)) ||
           (triesLeft != triesLeftBackup) )
        takeCountermeasure() ;

      authenticated[0] = true ;
      return true ;
    } else {
      // Comparison failed
      authenticated[0] = false ;
      return false ;
    }
  }

There are at least three things to notice on that code:

We have used a bitwise negation on the backup value in order to make it stronger. If an attacker is only capable of getting some value, he will also need to be able to get its negation, which could be a problem.
A transaction is used around the update, because the two values that are updated need to be update together; otherwise, we fall into the countermeasure.
We have not protected the integrity of the maxTries variable as we should have. This is a voluntary decision, because we have assumed that perturbation attacks on this variable would be more difficult to exploit (in particular since this value is only used after presenting a correct PIN.

Of course, the method starts to be complicated. Before stopping, we can add a last countermeasure, which is used to verify that the execution of code is not perturbed. The idea is very simple: store a counter in a local variable, and increment it at or around every important operation. The idea is that, if an attacker perturbs an operation (by attempting to skip it), it will also skip the increment, and the attack will end up being detected. Here is our method after that little treatment:

  boolean verify(byte[] buffer, short ofs, byte len)
  {
    // Initializes the step counter
    short stepCounter = INITIAL_COUNTER ;
    // First checks the integrity of the variable
    byte tl = triesLeft ;
    stepCounter++ ;
    if (tl != (short)(~triesLeftBackup))
      takeCountermeasure() ;
    stepCounter++ ;

    // No comparison if PIN is blocked
    if (tl < 0)
      return false ;
    stepCounter++ ;

    // First decrements the number of remaining tries
    JCSystem.beginTransaction() ;
      triesLeft = --tl ;
      stepCounter++ ;
      triesLeftBackup++ ;
    JCSystem.commitTransaction() ;
    stepCounter++ ;

    // Verifies the new value
    if (triesLeft != (short)(~triesLeftBackup))
      takeCountermeasure() ;
    stepCounter++ ;

    // Main comparison
    short equal = BOOL_TRUE ;
    stepCounter++ ;
    for(short i=0; i < len; i++)
      equal = equal && (buffer[ofs+i] != pin[i]) ;
    stepCounter++ ;

    if (equal == BOOL_TRUE)
    {
      // Comparison is successful
      // Reset the remaining tries to the max
      stepCounter++ ;
      JCSystem.beginTransaction() ;
        triesLeft = maxTries ;
        triesLeftBackup = (byte)(~maxTries) ;
      JCSystem.commitTransaction() ;
      stepCounter++ ;

      // Verifies the new value
      if ( (triesLeft != (short)(~triesLeftBackup)) ||
           (triesLeft != triesLeftBackup) )
        takeCountermeasure() ;

      stepCounter++ ;
      authenticated[0] = true ;
      if (stepCounter == (short)(INITIAL_VALUE+11) )
        return true ;
    } else {
      // Comparison failed
      stepCounter++ ;
      authenticated[0] = false ;
      if (stepCounter == (short)(INITIAL_VALUE+9) )
        return false ;
    }

    // Should have returned at this point
    takeCountermeasure() ;
  }

Now, if we remember that this method simply compares implements a PIN comparison, we can see a part of the problem: adding security features to a Java Card program is a difficult task. We have more than doubled the size of the program, and we have most likely made it a much slower as well.

Of course, there are other attacks, there are other countermeasures (some of them more sophisticated and more subtle than the ones described here). There are even some platforms that offer specific countermeasures through proprietary libraries. And, one always needs to remember that there is no silver bullet. The only thing that we can teach in a tutorial like here is that the essential thing is to learn about the attacks; then, the countermeasures are often quite simple to design and implement. For the details, nothing can replace practical experimentation.

General countermeasures

To conclude this tutorial post, we need to say a few words about the general countermeasures, i.e., the part of security that is not specific to a given access. We will here discuss two small points:

How to react to the detection of an attack. We have mentioned a takeCountermeasure() method, but we haven't instantiated it. However, the kind of reaction i crucial for the application's security.
How to expect the unexpected. The effect of attacks often is to trigger an unexpected behavior. The application should not be taken by surprise in such cases.

React to attacks

There are many possible reactions to the detection of an attack. Among these reactions, we have:

Do nothing. The command fails, but the attacker can immediately issue another command. This is not recommended, because it does not sufficiently hinder attacks.
Reset or mute the card. Doing this has two objectives: first, it resets all transient data, and makes it impossible to spy on. Then, it is time-consuming. When a card is muted, an attacker is likely to waste at least 30 seconds before to be able to send another command. This may sound short, but it is efficient against attacks like DPA attacks, which need to run a large number of commands. Therefore, this countermeasure may be efficient to reduce the practical applicability of an attack, but it is also unlikely to deter laboratory attackers, who design attacks.
Block or terminate the card. This is the most typical kind of countermeasure, in which the card is destroyed. It is a very good deterrent, but one must be careful not to trigger some countermeasures without a good reason: having a million cards committing suicide because of a software glitch is not something to be proud of.
Block or terminate the application. This is an interesting variant of the "terminate card" countermeasure, mostly useful on applications that are not a card's default application. In such cases, it makes sense to terminate an application witout terminating the card itself.

One also needs to be careful when implementing these countermeasures, as attackers are also able to detect them and avoid them (in particular the more drastic ones). In particular, the termination of a card must be both very fast and very concealed. Some processors propose very fast card termination countermeasures, which destroy the persistent memory in a very short time. However, such countermeasures cannot be concealed, as they consume a significant amount of memory.

Naturally, there are more choices than the ones described above. For instance, it is possible to combine counters to these reactions, or even to combine the reactions. Here are a few examples of policies:

Mute the card and increment a counter. When the counter reaches a threshold value, terminate the card.
Mute the card and increment an application-level counter and a platform-level counter. If the application counter reaches a threshold, terminate the application. If the platform counter reaches a threshold, terminate the card.

Such policies are to be discussed with issuers. On some platforms, proprietary APIs allow developers to combine their attack detection mechanisms with the platform's own mechanisms, providing even more flexibility.

Expect the unexpected

Attacks are usually not taken into account in an application's functional specification. This means that the effect of attacks is not taken in consideration by the application. We will here consider this at two different levels.

First, let's start by something practical. If an application is targeted by a perturbation attack, unexpected things may happen. We have already taken some steps against successful attacks, but all attacks will not be successful. If the behavior of the wrong instruction is modified, one of the likely outcomes is an exception, as the virtual machine will detect a problem. This exception can be caught and processed, or it can be ignored, which then means that it will be caught by the JCRE, which will return a 6F00 status word. Since this status word (unknown error) is not expected in most cases, it may be interesting to catch these exceptions, especially around sensitive operations.

Then, we can continue by a more indirect effect of attacks. If your application manages a secret, it may be targeted by attackers, who will try to disclose your secret through side-channel attacks. In terms of design, it is essential to think about the behavior of your application (at a global level) once this secret has been disclosed.

If a single breach makes your security collapse, then your system is too brittle, and it needs to be reinforced in order to sustain more attacks. For instance, think about payment cards. A while ago, I experienced my financial institution's backup countermeasures while traveling in four countries in 2 days (thanks to bad plane connections), and making purchases in every country, plus an online purchase. My card got canceled by risk management. It was not convenient for me, and there were most likely more intelligent ways to manage the issues. Nevertheless, I did something uncommon (buying in four countries within a few hours), which fits much better the pattern of "stolen card" than the pattern of "normal use". If this had been a real attack, they would have detected it, without knowing anything about the card itself.

This is the kind of things that all applications should do, at the appropriate scale. Security measures should be pervasive enough to be hard to avoid in the case of a fraud.

3 Comments

a reader wrote:

May 16, 2008 at 13:03

Code samples with a for loop statement are incomplete. May be problem with ‘>’ character not converted to > ?

Thanks.
Emilio Wuerges wrote:

June 5, 2008 at 20:09

Even if your card is not vulnerable to cross site scripting, your blog looks like it is.
Like “a reader” posted, try parsing the input of the coments.

But hey, your blog is really unique. It is the best source about javacard outside the specifications I saw.
Thanks for the hard work!
Eric VÃ©tillard wrote:

June 5, 2008 at 23:20

First, thanks for the compliment.

Now, about the ” < ” issue, I believe that it is now fixed (these HTML-like interfaces are sometimes a bit strange). There are also quite a few other bugs in this code, and my original intention was to fix everything at the same time. However, I hadn’t measured the depth of the comment. So now, there is a good opportunity to find errors in this code (no I haven’t even compiled it yet). Apparently, there are a few functional glitches in the application’s PIN code, but these will get fixed in a future post about testing. Thanks for your patience!

On the road to Bandol