Using the secure element on Android devices (1/3)

As earlier posts noted, many Android devices in use have an NFC controller and an embedded secure element (SE). That SE contains the same hardware internals as a traditional smart card and is capable of similar security-sensitive functionality, such as managing cryptographic keys. While Google Wallet is the canonical application leveraging the SE for contactless payments, in principle other use cases such as authentication or data encryption can be implemented with appropriate code running on the SE. This is owing to the convenient property that in card emulation, the phone looks like a regular smart card to a standard PC, ready to transparently substitute in any scenario where traditional plastic cards were used. This property is easiest to demonstrate on Windows, but owing to the close fidelity of PC/SC API ports to OS X and Linux, it holds true for other popular operating systems as well. All of this raises the question of what it would take to leverage this embedded SE hardware for additional scenarios, such as logging into a Windows machine or encrypting a portable volume using BitLocker To Go.

First a disclaimer: using the phone as a smart card does not require secure element involvement at all. There is a notion of host-based card emulation: APDUs sent from another device over NFC are delivered to the Android OS for handling at the application processor, as opposed to being routed to the SE and bypassing the main OS. This mode is not exposed out-of-the-box in Android, even though the PN544 NFC controller used in most Android phones is perfectly capable of it. Courtesy of an open-source environment, a Cyanogen patch exists for enabling the functionality. The Android Explorations blog has a neat demonstration of using that tweak to emulate a smart card running a simple PKI “applet”– except this applet is implemented as a vanilla Android user-mode application that processes APDUs originating from an external peer.

The problem with this model is that sensitive data used by the emulated smart card application, such as cryptographic keys, are by definition accessible to the Android OS. That makes these assets vulnerable to software vulnerabilities in a commodity operating system, as well as more subtle hardware risks such as side-channel leaks. While Android has a robust defense-in-depth model, the secure element has a smaller attack surface by virtue of its simpler functionality and built-in hardware tamper resistance.

With that caveat out of the way, there are two notions of “using the secure element” on Android:

  • Exchanging APDUs with the SE, preferably from an Android application running on the same phone.
  • Managing the contents of the SE. In particular, provisioning code that implements new functionality, such as the public-key authentication required for smart card logon.

Let’s start with the easy piece first. The embedded SE sports two interfaces: a contact (or “wired”) interface accessible to Android applications on the phone, and a contactless interface over NFC, used by external devices such as point-of-sale terminals or smart card readers. In principle applets running on the SE can detect which interface they are being accessed from and discriminate based on that. Fortunately most management tasks, including code installation, can be done over the NFC interface using card-emulation mode, without involving Android at all. That said, it is often more convenient and natural to perform some actions (such as PIN entry) on the phone itself. That calls for an ordinary Android application to access the embedded SE over its contact interface.

Consistent with the Android security model, such access is strictly controlled by permissions granted to applications. Unlike other capabilities such as network access or making phone calls, this is not a discretionary permission that can be requested at install time subject to user approval. In Gingerbread the access model was based on the signature of the calling application; more precisely, only applications signed with the same key as the NFC stack were granted access. Starting with ICS a new model was introduced, based on white-listing code-signing certificates. There is an XML file on the system partition containing a list of certificates. Any APK signed with one of these certificates is granted access to the “NFC execution environment,” a fancy term for the embedded SE, and can send arbitrary APDUs to any applet present on the SE. That includes the special Global Platform card manager, which is responsible for managing card contents and installing new code on the card.

[continued]

CP

Smart card logon with EIDAuthenticate — under the hood

The architecture of Windows logon and its extensibility model is described in a highly informative piece by Dan Griffin focusing on custom credential providers. (While that article dates back to 2007 and refers to Vista, the same principles apply to Windows 7 and 8.) The MSDN article even provides a code sample for a credential provider implementing local smart card logon– exactly the functionality of interest discussed in the previous post. A closer look at the implementation turns up an unexpected design property: it leverages built-in authentication schemes which are in turn built on passwords. Regardless of what the user is doing on the outside, such as presenting a smart card with PKI capabilities, at the end of the day the operating system is still receiving a static password for verification. EIDAuthenticate follows the same model. The tell-tale sign is a prompt for the existing local account password during the association sequence described earlier. The FAQ on the implementation says as much:

A workaround is to store the password, encrypted by the public key and decrypted when the logon is done. Password change is handled by a password package which intercepts the new password and encrypts it using the public key stored in the LSA.

In plain terms, the password is encrypted using the public key located in the certificate from the card. The resulting ciphertext is stored on the local drive. As the smart card contains the corresponding private key, it can decrypt that ciphertext to reveal the original password, to be presented to the operating system just as if the user typed it into a text prompt. (The second sentence about intercepting password changes and re-encrypting the new password using the public key of the card is a critical part of the scheme. Otherwise smart card logon would break after a password change because the decrypted password is no longer valid.)
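
The data flow can be sketched in a few lines. This toy uses textbook RSA with deliberately tiny, insecure parameters standing in for the card’s real key pair; it is only meant to show how enrollment and logon fit together, not how EIDAuthenticate actually implements the scheme:

```python
# Toy model of the scheme: the "card" holds an RSA private key, while the
# certificate exposes the public half. Textbook RSA with tiny numbers --
# wildly insecure, purely illustrative.
p, q = 61, 53
n = p * q                           # public modulus (from the certificate)
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (never leaves the card)

def protect(password_byte: int) -> int:
    """Enrollment: encrypt to the card's public key; ciphertext is stored on disk."""
    return pow(password_byte, e, n)

def recover(ciphertext: int) -> int:
    """Logon: the card applies its private key, recovering the password
    to hand to the OS as if the user had typed it."""
    return pow(ciphertext, d, n)

stored = protect(ord("h"))          # one byte of the password, for brevity
assert recover(stored) == ord("h")
```

A real implementation would of course encrypt the whole password with properly padded RSA on a full-size key, and re-run the enrollment step whenever the password changes, as the FAQ excerpt describes.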

This is decidedly not the same situation as enterprise use of smart cards. Domain logon built into Windows does not use smart cards to recover a glorified password. Instead it uses an extension to Kerberos called PKINIT. Standardized in RFC 4556, PKINIT bootstraps initial authentication to the domain controller using a private key held by the card. Unlike the local equivalent, there is no “password equivalent” that can be used to complete that step in the protocol. While smart cards may coexist with passwords in an enterprise (e.g. depending on security policy, some “low security” scenarios permit passwords while sensitive operations require smart card logon), these two modes of authentication do not converge to an identical path from the perspective of the domain controller. For example, a company can implement a policy that certain users with highly privileged accounts, such as domain administrators, must log in with smart cards. It would not be possible to work around such a policy by somehow emulating the protocol with passwords.

It is tempting to label EIDAuthenticate and solutions in the same vein as not being “true” smart card logon because they degenerate into passwords downstream in the process. While that criticism is accurate in a strict sense, the more relevant question is how these solutions stack up compared to plain passwords typed into the logon screen each time. It’s difficult to render a verdict here, because the risks/benefits depend on the threat model. In particular, for stand-alone PCs the security concerns about console logon– e.g. while sitting in front of the machine– are closely linked to the security of the data stored on the machine. The next post in the series will attempt to answer this question.

CP

Smart card logon without Active Directory

Ever since the prescient Wired declared that passwords are passé, a natural question comes up around exploring alternative authentication schemes. While Windows has boasted smart card support since the days of Win2K, the catch is that the capability gets classified as an “enterprise feature.” This is shorthand for a medium or large company, with a managed computing environment and dedicated IT staff. Translated into technical terms, the “managed” requirement implies an Active Directory installation, with centralized servers responsible for administering resources remotely and individual user PCs joined to a domain under the oversight of these servers.

At first blush that rules out consumer scenarios. Most home users do not even have the right edition of Windows to join a domain if one existed. For example, anyone running the Windows 7 Home Basic or Home Premium editions is out of luck. For users on a more advanced version of the OS that meets the prerequisites, a more fundamental problem looms: creating a domain requires a Windows Server-class machine. At best home PCs are likely to be members of a home group, introduced in Windows 7. Home groups can be created without a dedicated server and function as rudimentary AD domains, with support for cross-machine authentication and file sharing. But they still lack the more advanced capabilities of an enterprise domain, including support for strong authentication.

Shifting our attention to third-party solutions, the picture becomes more complicated:

  • Several custom schemes exist for smart card logon to stand-alone computers
  • Caveat emptor: it turns out the benefits of doing that are marginal for the typical home-user scenario, where the machine only permits local access.

Control panel screenshot showing new item

In this post we tackle the first point. EIDAuthenticate is a popular example of a freely available third-party solution that permits local logon using a wide range of card types, including European eID cards and the US PIV standard. EIDAuthenticate is based on the idea of associating a smart card with an existing local account that has already been set up with a password. After completing installation, a new control panel option appears for configuring smart-card logon, as shown in the screenshot on the right.

Selecting this new option brings up a window with three options, one of which (“Disable smart card logon”) is initially grayed out the first time around, since the functionality is already disabled. Assuming we already have a compatible smart card, such as a PIV card, we can choose the first option and follow the setup sequence:

  • Insert/tap a compatible smart card
  • Choose one of the X509 certificates located on the card
  • Fix any certificate validation errors. Since no prior trust relationship with Active Directory is assumed, the certificate on the card could have been issued by a certificate authority that is not recognized by the machine.
  • Enter current Windows password for the user account
  • [Optional] Dry run: simulate a login with the selected card to verify that everything is working as intended. The experience here depends on the card profile. For example, in the case of PIV cards, a dialog will be displayed to collect the PIN.
  • On successful completion of the dry run, the control panel displays a confirmation page.

[Screenshots: Configure smart-card, eID_CheckCertificate, SmartCardCheckComplete]

After this association is created between a particular card (more precisely, a certificate on that card, since there can be more than one usable), that card can be used to log in to Windows by selecting one of the “Other credentials” or “Insert smartcard” buttons on the logon screen, or simply by inserting/tapping a card to implicitly select the smart card path. Case in point: the screenshots from the November post on using Android devices as smart cards were captured on a machine with EIDAuthenticate installed.

[continued]

CP

Inspecting communications from a smart card (1/2)

A previous post featured communication logged in transit to/from the secure element of an Android phone tapped against an NFC reader attached to a Windows 7 machine. This article provides a brief description of how such traces can be obtained, providing a glimpse into the internals of the smart card stack.

To recap the Windows architecture detailed earlier, here is a simplified depiction of the code paths invoked when an application is using a smart card:

Communication flow to a smart card

There are a lot of moving pieces here. Starting from the top:

  1. An application interested in performing a cryptographic operation. Typically this is a user-land application such as IE or Outlook, although it could be part of the operating system, as in the case of BitLocker disk encryption or the winlogon process for login to the OS.
  2. The application calls into a platform cryptographic API. To complicate matters, there are two of these in recent versions of Windows: the “legacy” but-not-quite-deprecated CAPI, which has existed since NT4 days, and the shiny new CNG introduced in Vista.
  3. Depending on which API the application targets, the code paths diverge. CAPI and CNG have different extensibility mechanisms for abstracting cryptographic hardware. The former defines cryptographic service providers (CSP) while the latter has key storage providers (KSP).
  4. But the paths converge again downstream, as the smart card modules associated with CAPI and CNG both define an extensibility scheme based on the same notion of smart card mini-drivers.
  5. These drivers communicate with the card using the PC/SC (Personal Computer/Smart Card) API. This is a standard initially developed by MSFT and later ported to OS X and Linux with an open-source clone called pcsclite.
  6. A transceive operation in PC/SC in turn results in data being funneled to a device driver for the card reader. Typically this is the CCID class driver on Windows, although some manufacturers also provide drivers unique to their particular hardware.
  7. This kernel-mode driver communicates with the card reader hardware over a channel such as USB.
  8. Finally the reader communicates with the card via a physical layer, in this case NFC.

When the objective is observing the way Windows interacts with cards in a given scenario, there is a sweet spot in this stack: between the card driver and PC/SC. Above that layer, there are no APDUs in the picture, only higher-level semantics such as “verify PIN.” Below the PC/SC layer, APDUs are encapsulated in lower-level protocols which typically fragment them and envelop the content in additional metadata, which will eventually get stripped away by the reader when the final APDU contents are delivered to the card. While one could attempt to reconstruct the final contents closer to the source, this would require special hardware such as the Proxmark for intercepting NFC transmissions or the Smart Card Detective for contact cards. By contrast, watching APDUs at the PC/SC layer can be done entirely in software, by intercepting a couple of entry points in the DLL that implements PC/SC.
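
The interception idea can be shown in miniature. The sketch below wraps whatever function carries APDUs– on Windows that would be an entry point such as SCardTransmit in winscard.dll; here a fake stand-in card– so that every command/response pair is recorded on the way through:

```python
# Minimal sketch of PC/SC-layer interception: wrap the transmit function
# so all traffic is logged. The fake card below is an assumption purely
# for illustration; it answers every command with "file not found".
def hex_str(data: bytes) -> str:
    """Render bytes as the space-separated hex typically seen in APDU traces."""
    return " ".join(f"{b:02X}" for b in data)

def fake_transmit(apdu: bytes) -> bytes:
    return bytes([0x6A, 0x82])   # status word: file not found

def logging_transmit(transmit, log):
    """Return a wrapper that records each command/response pair,
    mimicking a hook installed on the real transmit entry point."""
    def wrapper(apdu: bytes) -> bytes:
        response = transmit(apdu)
        log.append((hex_str(apdu), hex_str(response)))
        return response
    return wrapper

trace = []
transmit = logging_transmit(fake_transmit, trace)
transmit(bytes([0x00, 0xA4, 0x04, 0x00, 0x02, 0x3F, 0x00]))  # a SELECT command
assert trace == [("00 A4 04 00 02 3F 00", "6A 82")]
```

The same wrapping pattern applies regardless of where the hook is planted, which is what makes the PC/SC layer such a convenient vantage point.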

Side note: certain cross-platform applications such as Firefox do not use the Windows cryptography API, and instead require an alternative mechanism such as PKCS#11 modules for smart card support. These applications bypass the first couple of layers in the above picture, but still rely on PC/SC for card communication. As such, techniques that rely on intercepting PC/SC calls continue to work, in contrast to those operating at the higher layers.

[continued]

CP

NFC tags and authenticating wine

From the random-innovations department comes an unusual application of NFC tags: authenticating rare wine bottles. According to the NFC Times story, ClikGenie demonstrated an NFC-based solution to the problem of counterfeit wine. The idea is based on embedding tags in the label affixed to the bottle, along with a mobile application running on Android for scanning these tags. The description of the target audience is somewhat contradictory: one section suggests that any consumer could verify the certificate of authenticity by scanning the label, but a subsequent paragraph hints at a restricted audience: “The CLIKSecure app itself is available only to designated employees, such as product inspectors.” (Or counterfeiters posing as one in order to reverse engineer the system, one might add.)

This is an interesting use case enabled by the combination of decreasing prices for NFC tags on the one hand, and the increasing popularity of Android devices with embedded NFC readers on the other. It can be extended to other luxury goods subject to a high incidence of counterfeiting. But the design and implementation are fraught with problems.

While technical details are sparse, the article suggests the scheme relies on the unique ID of the tag. Therein lies the first problem. NXP Semiconductors has been clear in pointing out that the UID is intended for anti-collision. Anti-collision refers to distinguishing between multiple tags present in the range of an NFC reader. When the reader turns on the RF field, all tags will be activated and start responding back. The UID permits separating these responses. It was not intended as a way to authenticate a tag with any degree of assurance. While genuine Mifare tags do not permit overwriting the UID assigned at the factory, counterfeit tags allow setting the UID arbitrarily. This effectively “clones” the tag from the perspective of any application relying purely on the UID. Given that tags with cryptographic protection such as Ultralight-C or Mifare Classic are only marginally more expensive, not to mention that the scanning device likely has Internet connectivity, the standard Mifare authentication could have been proxied. (Granted, the original Mifare cryptography is broken, but the cost of that attack is higher than programming a bogus tag with the same UID.)

The second issue is that tags cannot authenticate wine per se, because it is relatively easy for the precious contents and their container to part ways. In fact the tag does not even authenticate the bottle, because it is part of a label affixed to the bottle. The article already points out that moderately sophisticated counterfeiting operations “… might remove an authentic label and place it on a bottle with a similar shape.” So the risk of refilling an authentic bottle with cheaper wine already exists, and could impact the resale market. Suppose a bottle is purchased by consumer Alice and a few years later, after it has appreciated in value, she wants to sell it. Is it the original wine or was it replaced? Meanwhile the bottle and its label are still intact, and the tag will continue to authenticate.

At best the scheme only prevents Alice from scaling this fraud and creating many copies from one bottle. Such cloning can be detected if scanned UIDs are transmitted to an online service for reconciliation, where new entries are compared against existing ones in the database. (Although we have to account for the legitimate resale scenario, where Alice scans the tag, then makes an honest sale to Bob, who also insists on scanning the tag to verify the product.) Indeed the article implies the presence of such checks, involving a unique ID assigned to the phone and its GPS location. But even this reasoning is dubious, as it assumes everyone is using the application and diligently scanning every bottle. The final irony is that NFC is not required to prevent cloning when there is an online service keeping track of all observed labels. QR codes or plain serial numbers printed in digits would be equally effective. Sure, they are easier to clone than NFC tags. But the cloud will notice any duplicate entries just as quickly, with no fancy hardware or mobile apps required. If anything, removing the need for specialized hardware could increase the chance of locating clones, because more people are able to report on their inventory.
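
A minimal sketch of such a reconciliation service, assuming scans report the tag UID and a coarse location; real logic would also need to handle legitimate resale and transport, as noted:

```python
# Bare-bones cloud-side reconciliation: record every (uid, location) scan
# and flag a uid that turns up in more than one place. UIDs and locations
# below are invented for illustration.
from collections import defaultdict

scans = defaultdict(list)   # uid -> list of locations where it was seen

def report_scan(uid: str, location: str) -> bool:
    """Record a scan; return True if this uid was already seen elsewhere,
    which suggests a cloned tag."""
    duplicate = any(loc != location for loc in scans[uid])
    scans[uid].append(location)
    return duplicate

assert report_scan("04AABBCC", "Paris") is False     # first sighting
assert report_scan("04AABBCC", "Paris") is False     # re-scan at same place
assert report_scan("04AABBCC", "Shanghai") is True   # clone flagged
```

Note that nothing here depends on NFC: the same bookkeeping works just as well for serial numbers or QR codes, which is exactly the irony pointed out above.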

The choice quote:

“[…] CLIKSecure’s partners are working with customers in the luxury apparel and wine industries, though he declined to name any clients, he said, for fear of alerting counterfeiters.”

Because criminals can’t buy their own Android phones to check for tags before counterfeiting some object?

CP

Observing Windows smart card discovery in action

The previous post described how Windows picks drivers for unfamiliar smart cards. The low-level operation of that process can be observed by looking at a trace of the communications occurring when the system encounters an “unfamiliar” type of card, such as an NFC-enabled Android phone in card-emulation mode.

First, some background on host-to-card communication. The logical format for exchanging data with a smart card is the Application Protocol Data Unit, or APDU for short. These can be viewed as the analog of packets in a networking protocol. These building blocks come in two variants, command and response. Usually cards operate in passive mode: the host initiates communication by sending command APDUs, and cards reply with response APDUs. (There are a few exceptions to this model, such as the SIM application toolkit, where the card is calling the shots.) The basic structure of APDUs is defined in ISO 7816 part 4. For a command APDU, there are a few mandatory bytes at the beginning with special significance defined by the standard, such as the instruction type and parameters. That header is followed by a variable-length payload up to 256 bytes. The logical structure of the payload itself is completely up to the card application to define. In practice compact binary encodings such as ASN.1 with BER are de rigueur. Response APDUs have even less structure. Two bytes at the end are reserved for a mandatory status word. These are preceded by an arbitrary response payload, also capped at 256 bytes. ISO 7816 defines categories of status words corresponding to successful operations as well as a litany of error codes.
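
The layout described above is simple enough to capture in a few lines. This deliberately simplified parser handles only short command APDUs without a trailing Le byte, which is enough for the traces discussed here:

```python
# ISO 7816-4 short APDUs: a command is CLA INS P1 P2 [Lc data],
# a response is [data] SW1 SW2. (Le and extended length are omitted
# for simplicity.)
def parse_command(apdu: bytes) -> dict:
    cla, ins, p1, p2 = apdu[:4]                      # mandatory 4-byte header
    data = apdu[5:5 + apdu[4]] if len(apdu) > 4 else b""
    return {"CLA": cla, "INS": ins, "P1": p1, "P2": p2, "data": data}

def parse_response(apdu: bytes) -> dict:
    return {"data": apdu[:-2],                       # optional payload
            "SW": apdu[-2:].hex().upper()}           # mandatory status word

# SELECT (INS A4) of the master file by its 2-byte identifier 3F00
cmd = parse_command(bytes.fromhex("00A40000023F00"))
assert cmd["INS"] == 0xA4 and cmd["data"] == bytes.fromhex("3F00")

# Empty response carrying only status word 6A82, "file not found"
assert parse_response(bytes.fromhex("6A82")) == {"data": b"", "SW": "6A82"}
```

Reading raw APDU dumps becomes much easier once the header bytes and the trailing status word are mentally separated this way.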

For completeness, it’s worth mentioning that later updates also introduced extended-length APDUs. These allow sending/receiving up to 64K of data at once. The catch is that support for extended length is not universal in card operating systems. Windows follows a least-common-denominator approach and the built-in PIV/GIDS drivers do not use this feature. Neither does the discovery process, but it would have made little difference given that only small amounts of data are exchanged, as the trace reveals. In the case of the secure element in Android, support depends on the interface: it is possible to send extended-length APDUs over NFC but not over the wired interface from the Android side.

With that background, we can look at an APDU trace. This contains raw APDU dumps from three identical rounds of discovery on the Android secure element. A couple of differences from the driver specification jump out in the trace. The first request/response pair goes according to expectation: an attempt to select the MSFT-defined discovery applet by its AID. Status word 6A82 means “file not found” in ISO 7816 error codes. So far so good. But the second request/response attempts to obtain the device ID record from the non-existent applet. This is strange to say the least. If an application does not exist, it is hardly reasonable to ask that application to perform additional tasks. Instead the command was routed to the currently selected applet, which happens to be the Global Platform card manager. Luckily the card manager had no information associated with the proprietary Windows tag and returned an error corresponding to “referenced data not found,” instead of returning some unrelated object that could have confused the discovery process.

Windows next proceeds with trying to select the PIV applet by its AID. Here is another divergence from the discovery spec: the attempt to select the master file is skipped, in favor of directly looking for PIV. After getting another 6A82 it tries the same with GIDS using that AID. (These well-known AIDs are documented in appendix D of the driver specification.) Because no GIDS applet exists on this stock Android device, Windows gives up at this point and falls back to using the ATR historical bytes to construct a device ID. (All of this happens in less than a tenth of a second, barely perceptible compared to the time required for users to move the phone into the RF field.) While this is not reflected in the APDU trace, the OS would have checked for a driver at Windows Update using that identifier. Since there is none published for the Android embedded secure element, the discovery process concludes with the driver installation error observed in the experiment.
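
A small decoder for the status words appearing in this trace; the table covers only the handful of codes mentioned, out of the much larger catalog in ISO 7816:

```python
# Map the last two bytes of a response APDU (the status word) to a
# human-readable label. Only the codes discussed in this post are listed.
STATUS_WORDS = {
    "9000": "success",
    "6A82": "file (application) not found",
    "6A88": "referenced data not found",
}

def decode_sw(response: bytes) -> str:
    sw = response[-2:].hex().upper()
    return STATUS_WORDS.get(sw, f"unknown status {sw}")

assert decode_sw(bytes.fromhex("6A82")) == "file (application) not found"
assert decode_sw(bytes.fromhex("9000")) == "success"
```

Running raw dumps through a decoder like this makes the failed SELECT attempts in the discovery sequence immediately legible.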

The next post will briefly discuss how this APDU trace was obtained.

CP

Quick primer on the Windows smart-card stack (3/3)

In the final post of this three-part series, we will discuss the discovery mechanism. The definitive reference on this topic is the smart card mini driver specification at MSDN. Here only a few points will be highlighted.

There are two slightly different notions of “discovery” involved in smart cards:

  • Mapping a card to an existing driver when the card is presented to the system. In this situation drivers are already installed, but the system must decide which one to associate with the card at hand.
  • Installing an appropriate driver when a new type of card is encountered for the first time. This is the analog of “plug-and-play” technology for smart cards. PnP allows connecting a peripheral such as a printer to the PC and having that device automatically recognized as the appropriate model and configured for use. Gone are the days of manually installing software from CDs shipped by the manufacturer. (Not that hardware vendors seem to have gotten that memo, since packages usually include a CD with proprietary software that installs the drivers.)

Appendix D in the driver specification describes the steps involved in each of these processes. For the case of plug-and-play, the overarching goal is to obtain a device ID which can be used to query Windows Update for the appropriate driver corresponding to that hardware.

  1. The system first attempts to locate a proprietary smart-card plug-and-play applet on the card, with functionality defined by MSFT. If that applet exists, it can be queried for a particular object containing the device ID.
  2. If the PnP applet does not exist, Windows tries selecting the master file (MF) on the card and then the EF.ATR object to derive the ID from these sources.
  3. If that also fails, the smart card stack goes into guessing mode, trying to locate PIV and GIDS applets in that order.
  4. If the card does not support either of these profiles, a final attempt is made to construct device ID from the historical bytes portion of the ATR.
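
The four-step fallback above can be sketched as a simple probe chain. The probe functions here are placeholders standing in for the real APDU exchanges; only the control flow is the point:

```python
# Placeholder probes, each returning a device ID string on success or
# None on failure, tried in the order of the four steps above.
def derive_device_id(probes):
    """Walk an ordered list of (name, probe) pairs; first hit wins."""
    for name, probe in probes:
        device_id = probe()
        if device_id is not None:
            return name, device_id
    return None

# Simulate a card with no PnP applet, no MF/EF.ATR, and no PIV or GIDS
# applet: only the ATR historical bytes are left to identify it.
probes = [
    ("pnp_applet",           lambda: None),
    ("mf_and_ef_atr",        lambda: None),
    ("piv_then_gids",        lambda: None),
    ("atr_historical_bytes", lambda: "3B888001"),  # made-up value
]
assert derive_device_id(probes) == ("atr_historical_bytes", "3B888001")
```

This simulated card falls through every probe until the ATR, which is exactly the path the Android secure element takes in the trace discussed elsewhere in this series.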

For subsequent driver selection, there are three paths:

  1. By card ATR. A mapping in the registry allows associating a particular driver with all cards returning a specific ATR. (ATR matching can be done partially, by specifying an associated bitmask. This allows covering all cards with a particular prefix, for example.)
  2. If no explicit registry mapping is defined, Windows next consults a cache of ATRs for cards previously identified as having PIV and GIDS applets. (This mapping is also maintained in the registry.) In this case ATR matching is exact.
  3. In case the cache does not help, the system goes into guessing mode, trying the GIDS and PIV AIDs in that order– note this is the opposite of the order used during plug-and-play installation.
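
Masked ATR matching from the first path might look like the following; the ATR, mask, and card values are invented for illustration rather than taken from any real registry entry:

```python
# Partial ATR matching: a candidate matches when it agrees with the
# stored ATR on every bit selected by the mask. A mask of all FF bytes
# demands an exact match; trailing 00 bytes ignore the tail.
def atr_matches(candidate: bytes, atr: bytes, mask: bytes) -> bool:
    if len(candidate) != len(atr):
        return False
    return all((c & m) == (a & m) for c, a, m in zip(candidate, atr, mask))

atr  = bytes.fromhex("3B8F8001804F0CA0000003060300")
mask = bytes.fromhex("FFFFFFFFFFFFFF00000000000000")  # match 7-byte prefix only
card = bytes.fromhex("3B8F8001804F0C11223344556677")  # same prefix, new tail
assert atr_matches(card, atr, mask)
```

A prefix mask like this is how one registry entry can cover an entire family of cards from the same vendor.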

One important difference in Windows 8 is that when no driver is found, the card is associated with the null driver. That is a “steady state” and requires manual override to associate the card with a different driver. By contrast Windows 7 would repeatedly attempt to perform PnP discovery on the same card.

CP

Quick primer on the Windows smart-card stack (2/3)

While the original CSP model made it possible to support smart cards, it was neither efficient nor scalable. It put a significant burden on both upstream consumers and authors of each CSP. Applications had to know which CSP to load in advance– a tall order for smart cards, since they can be introduced at any point. In reality most applications were written on the assumption that a single CSP could service all their requirements, because the model did not permit easily mixing and matching. That meant even dedicated CSPs had to effectively duplicate all of the functionality in the default CSP, including features that have nothing to do with smart cards. These limitations motivated MSFT to introduce a different model with the ill-fated Vista OS, eventually reaching its full potential with Windows 7.

The new model takes advantage of the fact that a significant amount of logic shared across smart-card CSPs– such as prompting for a PIN– can be refactored into a platform component. The new architecture can be depicted in simplified form as follows:

Simplified model of smart-cards and CAPI for Vista and above

Some observations about this architecture:

  • The CSP layer is still intact, but third-party CSPs are largely deprecated. While existing ones will continue to work, MSFT can at some point discourage publication of new ones due to another legacy of the crypto wars: every CSP must be signed by MSFT itself. In the logic of export restrictions, this would have prevented the bad guys from writing stronger cryptography implementations than they were entitled to according to the framework. Again the original motivation is long gone, but the restriction survived in the architecture. (Tangent: it is trivial for users to override this, with administrative privileges. The first approach is to configure the machine for kernel debugging, which permits loading unsigned CSPs by design. Another approach, adopted by this blogger for a past project, involves modifying the function in advapi32 responsible for loading new providers. Reverse engineering reveals the location where the return value from signature verification is used in a conditional branch. Hot-patching that code in memory with NOPs allows ignoring failed signature checks. This is best automated with a DLL injected into all processes– using the AppInit DLLs mechanism, for example– such that existing apps can transparently load the unsigned CSP without modification.)
  • There is a single CSP responsible for all smart cards. One more layer of indirection has been added under that CSP, branching out to individual mini-drivers responsible for a particular card edge. Vendors introducing a proprietary card edge– as opposed to simply shipping new hardware that conforms to an existing standard– are now tasked with writing smart card mini-drivers hooking into the generic smart card CSP, instead of writing wholesale CSPs specific to one model of card.
  • To complicate the picture– Vista also introduced a new cryptographic API called Crypto Next Generation (CNG) with its own abstraction called Key Storage Providers, or KSP for short. Smart cards can also be accessed via this path using the unified smart-card KSP. One convenient property is that drivers can operate as part of either stack, depending on whether the calling application uses CAPI or CNG.
  • Drivers are assumed to communicate via the PC/SC interface to the card itself, and they are invoked only after a connection to the card is established at the PC/SC layer. A corollary is that cryptographic hardware that does not appear as a CCID device, such as networked HSMs or USB tokens communicating via a home-brew HID interface, either has to stick with writing a legacy CSP or create the appearance of a fake smart card, complete with an associated fake smart-card reader to appease PC/SC, as in the case of virtual smart cards built on top of the TPM.
  • The adjective “mini” is relative. Counting by sheer number of APIs, card drivers have a significantly more complex interface than the corresponding CSP they replaced.
  • Two card edges are built into the OS out of the box: PIV and GIDS. Logically these appear as different card types but in reality they are implemented by a single driver, msclmd.dll. For coping with the terra incognita of card models beyond that, there exists a fairly elaborate discovery mechanism to map a given smart card to a driver, as well as to install new drivers via the plug-and-play mechanism when confronted with an unfamiliar “card” such as an Android phone with an embedded secure element. This will be the subject of the last post in the series.
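The hot-patching tangent in the first bullet above can be illustrated at the byte level. The sketch below operates on a synthetic byte buffer standing in for machine code near the signature check (the opcodes and offsets are invented for illustration, not actual advapi32 bytes) and shows the mechanical effect of replacing a short conditional jump with NOPs:

```python
# Simulate hot-patching a conditional branch with NOPs (0x90).
# The buffer below is a synthetic stand-in for machine code near the
# signature check; these bytes are illustrative, not real advapi32 code.

JNZ = 0x75   # short conditional jump: taken when verification fails
NOP = 0x90   # no-operation

def patch_signature_check(code: bytearray) -> bytearray:
    """Replace the first short JNZ (2 bytes: opcode + displacement)
    with two NOPs, so execution falls through as if the check passed."""
    i = code.find(bytes([JNZ]))
    if i == -1:
        raise ValueError("conditional jump not found")
    code[i] = NOP        # jump opcode
    code[i + 1] = NOP    # displacement byte
    return code

# synthetic sequence: TEST eax,eax; JNZ fail_path; MOV eax,1 (continue)
fake_code = bytearray([0x85, 0xC0, JNZ, 0x12, 0xB8, 0x01, 0x00, 0x00, 0x00])
patched = patch_signature_check(fake_code)
print(patched.hex())
```

In the real scenario the same two-byte overwrite would be performed on live code pages (after making them writable), which is why the trick requires administrative privileges and is best packaged in an injected DLL.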

[continued]

CP

Quick primer on the Windows smart-card stack (1/3)

Earlier posts considered the idea of using a smartphone as a smart-card replacement for security scenarios such as logical access and encryption. The upshot is that the presence of an embedded secure element and support for NFC card-emulation mode get surprisingly close to that goal without installing any software on either side. A commodity OS such as Windows already recognizes the phone as a smart card when it is tapped against an NFC reader, and attempts to use it for traditional smart card scenarios such as login. This article takes a closer look at exactly what goes on inside Windows when interacting with the phone over NFC, a story best related through a historical perspective on cryptography in Windows.

The first systematic usage of cryptography in Windows dates back to the 1990s and the Clinton administration, when the crypto wars were still raging. On one side the US government lobbied for strong export restrictions, arguing that encryption made it easier for criminals to evade surveillance by law enforcement agencies. On the other side an alliance of industry, civil libertarians, self-described cypherpunks and academics pushed for widespread commercial availability of strong cryptography. Meanwhile the sudden growth of the web and ecommerce gave rise to the first large-scale use of crypto for consumers: the Secure Sockets Layer protocol, or SSL, was the ultimate kitchen sink, mixing all of the basic building blocks: public-key encryption, symmetric ciphers, hash functions, digital certificates. The dependence of ecommerce on SSL translated into a product requirement to have all of those moving pieces implemented.

Eventually the US would back down from its position and the restrictions were largely lifted in 2000, but not before they took their toll on the design of the Windows cryptography API, or CAPI for short. CAPI tried to walk the fine line of enabling the “permitted” use cases such as online banking for good guys while making it difficult for bad guys to implement the “frowned upon” scenarios. Many of the inexplicable design choices reflect these constraints. Case in point: for the longest time it was not possible for users to generate their own keys and provide them to the API for use in an encryption operation. Otherwise users could have overridden export restrictions, for example by using 128-bit keys when they were supposed to be limited to weak ciphers only. (Even when the regulatory constraints went away, the API remained hobbled with the restrictions, leading to the bizarre exponent-of-one trick to import keys with a dummy RSA operation.)
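To see why the exponent-of-one trick works: RSA with public exponent 1 is the identity map, since m^1 mod n = m for any m smaller than the modulus. A caller could therefore wrap arbitrary key material in a "dummy" RSA operation that changes nothing, satisfying an API that only accepted RSA-wrapped keys. A toy sketch (the numbers are textbook-sized for illustration, nowhere near real key sizes):

```python
# RSA "encryption" with exponent e = 1 is a no-op: m^1 mod n = m.
# CAPI only allowed importing keys wrapped by an RSA operation, so callers
# built a dummy RSA key with e = 1 and imported plaintext key material.

n = 3233                    # toy modulus (61 * 53); real moduli are 1024+ bits
e_normal, e_one = 17, 1     # a normal public exponent vs. the trick exponent

secret_key_material = 65    # stand-in for a symmetric key, must be < n

# with a normal exponent the "ciphertext" differs from the input...
assert pow(secret_key_material, e_normal, n) != secret_key_material

# ...but with e = 1 the wrap is the identity, so the "wrapped" key
# that goes into the API is exactly the plaintext key material
wrapped = pow(secret_key_material, e_one, n)
print(wrapped)  # prints 65, unchanged
```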

Still CAPI tried to provide an extensibility mechanism, using the notion of cryptographic service providers or CSPs. Applications did not simply ask to encrypt a message or compute a hash. Instead they first picked a CSP, which provided the implementation for a set of algorithms, and invoked the functionality through a consistent interface. (A similar extensibility mechanism was adopted later by OpenSSL in the notion of engines.) While this made life slightly more cumbersome for consumers of cryptography– even simple tasks such as “hash this message with SHA1” required dozens of lines of code– on paper the model had significant advantages:

  • Support for new algorithms could be added by end users. In other words a third party could author a CSP and install it without waiting for MSFT to update the OS.
  • Default implementation of existing algorithms could be replaced. If a customer was concerned about side-channel attacks in the built-in RSA implementation, they could in principle write an improved version.
  • System-wide policies could be set on choice of algorithms. One could mandate that all applications that use hashing would pick SHA256 for example.
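The indirection behind these advantages can be sketched abstractly. This is not the actual CAPI interface, just a minimal model (the provider names and registry are invented) of the idea that applications acquire a named provider and invoke algorithms through it, so implementations can be swapped without touching application code:

```python
import hashlib

# Minimal model of the CSP idea: apps pick a named provider, then invoke
# algorithms through it. Swapping or adding providers changes the available
# implementations without changing application code. Names are invented.

class Provider:
    def __init__(self, name, algorithms):
        self.name = name
        self.algorithms = algorithms   # algorithm id -> implementation

    def hash(self, alg, data):
        return self.algorithms[alg](data).hexdigest()

_registry = {}

def register_provider(p):
    _registry[p.name] = p

def acquire_context(name):
    # loosely analogous to CryptAcquireContext: look up a provider by name
    return _registry[name]

register_provider(Provider("Base Provider", {"sha1": hashlib.sha1}))
# a third party can register a provider that adds a newer algorithm
register_provider(Provider("Enhanced Provider",
                           {"sha1": hashlib.sha1, "sha256": hashlib.sha256}))

ctx = acquire_context("Enhanced Provider")
print(ctx.hash("sha256", b"hello"))
```

A system-wide policy in this model amounts to controlling which provider name applications resolve, which is the third advantage listed above.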

In reality it did not quite pan out that way, for reasons beyond the scope of this post. But for our purposes the main feature is how smart cards were incorporated into the framework. Originally support for smart cards and custom cryptographic hardware such as HSMs was viewed as a special case of the second scenario. In this model each smart-card vendor writes a CSP tailored to the precise capabilities of their hardware. Customers would install that CSP on the machines where they planned to use that card. Magically CAPI ensures that requests from existing applications– web browser, email client, VPN app etc.– are transparently routed to the new CSP without changing the application. This is what the initial architecture looked like:


Original Windows CAPI architecture for smart-cards

[continued]

CP

Login with Facebook as .NET Passport V2, essentially

(Full disclosure: this blogger worked on Passport and its later incarnation Windows Live ID.)

Facebook is close to accomplishing what Microsoft set out to do with .NET Passport in the late 90s: become an identity provider trusted by the majority of popular websites. While recent usage/adoption statistics are difficult to come by, the increasing number of “login with Facebook” buttons popping up on sites ranging from Remember The Milk to KickStarter suggests that companies representing all types of business segments are on board.

Facebook login may also be the lone success story for identity federation– or to be more precise, federation in the consumer space. The enterprise scenario has received far more attention and enjoys an abundance of commercial solutions designed to solve a well-defined problem: employee Alice working at a medium/large enterprise, typically a Windows shop running Active Directory, wants to use some cloud provider such as Salesforce. The main requirement is for Alice to log in to that external resource using her existing Windows domain credentials, without managing a different username/password. SAML and its more Redmond-centric counterpart WS-Trust have been used with varying degrees of success to bridge that gap. What has never worked reliably is the consumer equivalent of that scenario: Alice using her Yahoo account to log in to Twitter, for example. There are isolated cases of interoperability, such as Remember The Milk also accepting login with Google identities, via a combination of OpenID and OAuth. But these are the happy exceptions. For the most part, each cloud provider stands on its own island of identity, with occasional one-off agreements and forays into federation experiments.

Except for Facebook. Starting with Facebook Connect, the service has seen increasing adoption of its identity system, often under the guise of adding social features to websites. This is a far cry from the response to Passport when MSFT started pitching it around. Conceived from the beginning as an identity provider for the entire Internet– as opposed to merely all Microsoft properties, which would have been ambitious enough– the service had little adoption. Expedia, originally spun out of MSFT, and eBay were among the few large sites accepting Passport login alongside their own identities. (Expedia would later phase out the service.) It was not for lack of trying either, at least on the Windows front. For example, the IIS 6.0 web server in Windows Server 2003 had built-in support for Passport authentication at the HTTP level.

What was the difference? Several theories come to mind:

  • Better value proposition for relying parties. Passport provided identity and very little else. Granted there was an associated “profile” of user-provided personal information. But there were few forcing functions for that profile to be accurate (as opposed to “John Smith” living at zipcode 90210)– nothing comparable to the stringent real-names requirement of Google Plus or the self-imposed convention followed by most Facebook users of using their true name. Hailstorm, aka .NET MyServices, was the one ill-fated attempt by MSFT to attach large amounts of data to identities and broker it to third parties. That effort soon went up in flames. By contrast Facebook login brings an associated wealth of user-generated content, and even the ability to post updates to the user timeline.
  • Fear of outsourcing in general. In this day and age of EC2 instances spun up on demand and sites scraped together by mashing up third-party data feeds, it’s difficult to imagine a time when everyone insisted on running their own data center and having full control of every feature in their service. That attitude, combined with an over-confidence that everything could be done better in house, predisposed developers against relying on some other authentication service. (The sheer number of password mishaps since would prove them very wrong. It turns out the majority of those sites could better serve their users’ security by delegating authentication to more competent entities.)
  • Fear of MSFT in particular. With diversified business interests in operating systems, productivity software, servers, gaming and entertainment, it is easy for any given website to consider MSFT a competitor, and shy away from entrusting a critical business function to the one company they are most concerned about.
  • Privacy concerns. Plenty of FUD, culminating in a complaint to the FTC by EPIC and other privacy advocacy groups, did not help the matter. Facebook has arguably displaced MSFT as the great privacy boogeyman of the decade, but this was not true when Facebook Connect debuted in 2008, before the company got embroiled in multiple rounds of privacy controversy of its own making.
  • Lack of standardization. Passport started out with a proprietary protocol, partly because there were few good options available for a web-based protocol that did not require changes to the web browser. Later Kerberos support was added but the corresponding functionality in browsers lagged behind. In principle the Passport service could serve as a Kerberos Domain Controller (KDC) compatible with Windows, and users could use Kerberos via the negotiate package supported by IE. But the user experience for that would have been very restricted– it is a native dialog from the OS– compared to the full control that an HTML page gives the website for customizing the login experience. In any case WS-Trust and SAML followed soon with browser-aware profiles, and a few years later came the backlash against angle brackets in the form of OpenID and OAuth. Facebook login is built on OAuth2.
  • Finally, one can’t rule out the passage of time helping out. Federated login was a foreign concept in 2000. It confused users: people had become so accustomed to having a different identity at every site that they did not expect their Hotmail credentials could also get them into their instant messaging client. Web site designers meanwhile had very little incentive to fix that, so they continued to put their own identity system front and center while offering federated login half-heartedly as a hidden option that required jumping through hoops. (Reflecting that lack of confidence, they hedged their bets and often required even federated users to also register a “native” account, just in case the external identity system disappeared overnight.)

CP