Smart-cards vs USB tokens: esoteric form factors (part III)

[continued from part II]

A smart-card is a smart-card is a…

Once we take away the literal interpretation of “card” out of smart-card, we can see the same underlying IC appearing in a host of other form factors. Here are some examples, roughly in historical order of appearance.

  • Trusted Platform Module or TPM. Defined by the Trusted Computing Group in the early 2000s, TPM provides separate hardware intended to act as root of trust and provide security services such as key management to the primary operating system. Their first major application was Bitlocker full disk-encryption in the ill-fated Windows Vista operating system. In an entertaining example of concepts coming full-circle, Windows 7 introduced the notion of virtual smart-cards which leverage the TPM to simulate smart-cards for scenarios such as remote authentication.
  • Electronic passports, or Machine Readable Travel Documents (MRTD) as they are officially designated by the standardizing body ICAO. This is an unusual scenario where NFC is the only interface to the chip; there is no contact plate to interface with vanilla card readers.
  • Embedded secure elements on mobile devices. While it is possible to view these as “TPM for phones” the integration model tends to be very different. In particular TPMs are tightly integrated into the boot process for PC to provide measured boot functionality, while eSE historically has been limited to a handful of scenarios such as payments. Nokia and other manufacturers experimented with SEs in the late 2000s, before Google Wallet first introduced  it to the US market at scale on Android devices. Apple Pay followed suit a few years later. SE is effectively a dual-interface card. The “contact” interface is permanently connected to the mobile operating system (eg Android or iOS) while the contactless interface is hooked to the NFC antenna, with one caveat: there is an NFC controller in the middle actively involved in the communication. That controller can alter communications, for example directing traffic either to the embedded SE or the host operating system depending on which “card” application is being accessed. By contrast a vanilla card usually has no controller standing between antenna and the chip.

One vulnerability, multiple shapes

An entertaining example of shared heritage of hardware involves the ROCA (Return Of Coppersmith Attack) vulnerability from 2017. Researchers discovered a vulnerability in the RSA key generation logic in Infineon chips. This was not a classic case of randomness failure: the hardware did not lack for entropy for a change. Instead it had a faulty logic that over-constrained the large prime numbers chosen to create the RSA modulus. It turns out that moduli generated as the product of such primes were vulnerable to ancient attack that allowed efficient factoring, breaking the security of such keys. This one bug affected a wide range of products all based on the same underlying platform:

One bug, one secure hardware platform, multiple manifestations.

Beyond form-factor: trusted user interface

There are some USB tokens with additional functionality that at least theoretically improve security compared to using vanilla cards. Most of these involve the introduction of a trusted user interface, augmenting the token with input/output devices so it can communicate information with the user independently of the host. Consider a scenario where digitally signing a message authorized the in transfer of money to a specified bank account. In addition to securing the secret key material that generates those signatures, one would have to be very careful about the messages being signed. An attacker does not need to have access to the key bits to wreak havoc, although that is certainly sufficient. They can trick the authorized owner of the key to sign a message with altered contents, diverting funds into a different account without ever gaining direct access to the private key.

This problem is difficult to solve with standard smart-cards, since the message being signed is delivered by the host system the card is attached to. On-card applications assume the message is correct. With dual-interface cards, there are some kludgy solutions involving the use of two different hosts. For examples the message can be streamed over contact interface initially, but must be confirmed from again over contactless interface. This allows using a second machine to verify that the first one did not submit a bogus message. Note that dual-interface is necessary for this, since the card can not otherwise detect the difference between initial delivery and secondary approval.

A much better solution is to equip the “card” with UI that can display pertinent details about the message being signed along with a mechanism for users to accept or cancel the operation. Feitian LCD PKI token is an example of a USB token with exactly that capability. It features a simple LCD display and buttons. On-board logic parses and extracts crucial fields from messages submitted for signing, such as amount, currency and recipient information in the case of a money transfer. Instead of immediately signing the message, the token displays a summary on the display and waits for affirmative approval from the user via one of the buttons. (Those buttons are physically part of the token itself. Malicious code running on the host can not fake a button press.) Similar ideas have been adopted for cryptocurrency hardware wallets, with one important difference: most of those wallets are not built around a previously validated smart-card platform. The difference is apparent in the sheer number of trivial & embarrassing vulnerabilities in such products that have long been eliminated in more mature market segments such as EMV.

The challenge for all of these designs is that they are necessarily application specific, because the token must make sense of the message and display summarize it in a succinct way for the user to make an informed decision. For all the time standards groups have spent putting angle brackets (XML) or curly braces (JSON) around data, there is no universal standard format for encoding information, much less its semantics. This requires at least some parsing and presentation logic hard-coded in the token itself, in a way that can not be influenced by the host. Otherwise if a malicious host can influence presentation on display, it could also alter the appearance of an unauthorized transaction to appear benign, defeating the whole point of getting having a human in the loop to sanity check what gets signed. Doing this in a general and secure way outside narrow applications such as cryptocurrency is an open problem.

CP