About those strange P3P compact policies (2/2)

With the background on P3P compact policies covered in the first part of this series, time to answer the vexing question: why do nonsensical P3P policies appear to meet the Internet Explorer privacy settings?

This is partially a consequence of the way IE privacy settings are specified. As described in MSDN, compact policies are evaluated using a rules-based system, triggered by the presence or absence of specific policy tokens. For example, the token CUS stands for “customization” and is part of the P3P vocabulary for data-collection purposes. Similarly, FIN is a token indicating the category of data collected, in this case financial information. The IE privacy engine is a series of rules, where each condition tests for the presence or absence of some combination of tokens and the action defines what to do with that cookie. For example, it is possible to state that if financial data (FIN) is being shared with third parties (OTR) and the user has no recourse (no LEG token present), then reject the cookie.
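To make that rule structure concrete, here is a minimal sketch in Java of this kind of token-driven evaluation. The class and its rule set are hypothetical; only the single example rule from the paragraph above is encoded, and the token names come from the P3P compact-policy vocabulary.

```java
import java.util.*;

/** Hypothetical sketch of a token-driven compact-policy evaluator.
 *  Token names come from the P3P compact-policy vocabulary; the single rule
 *  below encodes the example from the text, not IE's actual rule set. */
public class CompactPolicyRules {

    enum Action { ACCEPT, REJECT }

    /** A rule fires when all required tokens are present and no forbidden token is. */
    record Rule(Set<String> required, Set<String> forbidden, Action action) {
        boolean matches(Set<String> tokens) {
            return tokens.containsAll(required) && Collections.disjoint(tokens, forbidden);
        }
    }

    public static void main(String[] args) {
        // Financial data (FIN) shared with third parties (OTR), no recourse (no LEG): reject.
        List<Rule> rules = List.of(new Rule(Set.of("FIN", "OTR"), Set.of("LEG"), Action.REJECT));

        // Blacklist approach: the default action is ACCEPT, rules carve out rejections.
        Set<String> policy = new HashSet<>(Arrays.asList("CUS FIN OTR".split(" ")));
        Action verdict = rules.stream()
                .filter(r -> r.matches(policy))
                .map(Rule::action)
                .findFirst()
                .orElse(Action.ACCEPT);

        System.out.println(verdict);   // prints REJECT
    }
}
```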

In principle this mechanism is expressive enough to implement either a blacklist or a whitelist approach. In the first case, one accepts all policies except those containing certain combinations of tokens, which are subject to additional restrictions. In the second case, the browser is more strict and by default rejects/downgrades cookies except when the policy meets particular criteria. Looking at the medium privacy setting, which is the default for the Internet zone, IE takes the former approach: the default action is to accept.

The catch is that if Internet Explorer runs into unrecognized tokens such as “HONK”, it will simply ignore them. The original motivation for this is forward compatibility: IE6 was finalized before the P3P standard itself was completed, creating the possibility that the vocabulary could be expanded. In fact, even if the P3P standard had been finalized as a W3C recommendation, that would have been version 1.0; future revisions could introduce new tokens, with the result that users running earlier versions of IE would be faced with unrecognized tokens. That mindset is hard to imagine today, when software is updated periodically, and often automatically. In 2001 the picture was different, with no monthly Patch Tuesday or near-instant Chrome updates.

There is also a correctness problem in ignoring unknown tokens, in conjunction with the blacklisting approach used for the settings. Any new token introduced in the spec could have signalled some pernicious data practice worse than those the existing rules were trying to block. Ignoring the new token in that case leads to a decision with less privacy and more cookies accepted than intended. This highlights a cultural preference common at MSFT at the time: failing open, favoring compatibility at all costs over privacy and security. (Trustworthy Computing has been successful in shifting that attitude.)

In reality, of course, P3P never went anywhere, with the W3C group eventually disbanding in frustration, citing “… insufficient support from current Browser implementers for the implementation of P3P 1.1.” That was 2006. With the vocabulary stabilized, a stricter parser could have been implemented. Even allowing for the possibility of new tokens, sanity checks could have been added: since compact policies are supposed to be derived from a full XML policy, the well-formedness requirement for the XML rules out certain situations, such as an empty policy without any valid, recognized tokens.
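As an illustration of what such a sanity check might look like, here is a sketch in Java of a stricter parser that treats a compact policy with no recognized tokens as equivalent to having no policy at all. The KNOWN set is a small sample of the vocabulary rather than the complete list, and the class name is made up.

```java
import java.util.*;

/** Sketch of the stricter parsing described above: a compact policy with no
 *  recognized tokens is treated as if no policy were present at all.
 *  KNOWN is a small sample of the vocabulary, not the complete list. */
public class StrictCompactPolicyParser {

    private static final Set<String> KNOWN = Set.of(
            "CUS", "FIN", "OTR", "LEG", "NOI", "DSP", "COR", "OUR", "IND", "STA");

    static Optional<Set<String>> parse(String compactPolicy) {
        Set<String> tokens = new HashSet<>(Arrays.asList(compactPolicy.trim().split("\\s+")));
        tokens.retainAll(KNOWN);                       // drop unrecognized tokens
        return tokens.isEmpty()
                ? Optional.empty()                     // nothing recognizable: no usable policy
                : Optional.of(tokens);
    }

    public static void main(String[] args) {
        System.out.println(parse("CUS FIN OTR"));      // recognized tokens survive
        System.out.println(parse("HONK POTATO"));      // Optional.empty: treated as unsatisfactory
    }
}
```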

With the perfect hindsight of 10+ years, that is one feature one of the designers regrets not implementing.

CP

About those strange P3P compact policies (1/2)

There are times when past mistakes come back to haunt the designers and developers of a system in unexpected ways. The implementation of the privacy standard P3P in Internet Explorer is proving to be that example for this blogger.

First some background: P3P stands for the Platform for Privacy Preferences Project. P3P was forged over a decade ago, amidst the great privacy scares of 2000, in what can be seen as a more innocent and idyllic time before September 11, when the greatest threat to online users was evil marketers trying to track them with third-party cookies. Under the charter of the World Wide Web Consortium’s Technology and Society group, P3P was an ambitious effort to introduce greater transparency and user control over the collection of information online. In many ways it was also ahead of its time. In the vein of similar initiatives that attempt to prescribe technological fixes to what are fundamentally economic incentive problems, only a tiny fraction of the ideas found their way into widespread implementation. (It would be another 10 years before W3C would dabble on the policy front again, with Do-Not-Track, instantly getting mired in as much controversy as P3P in its heyday. To think: DNT introduces just one modest HTTP header representing a yes/no decision. P3P is enormously complex by comparison.)

In the original vision, websites express their privacy policies (often couched in legalese and not written with the purpose of informing users) in a machine-readable XML format. The web browser could then retrieve and compare these policies against the user’s preferences as they navigated to different websites. P3P even proposed a machine-readable standard for expressing user preferences called APPEL, also in XML naturally, which went nowhere. It’s difficult to argue against greater transparency, although several advertising networks managed to do precisely that, out of concern that shining a light on data-collection practices could paint an unflattering picture.

Earlier iterations of the protocol also had serious disconnects with the way web browsers operate and their focus on performance. Blocking, synchronous evaluation of privacy policies for every resource associated with a web page, as originally envisioned in the draft spec, would have been an enormous speed penalty. After some reality checks in the name of efficiency, attention eventually settled on the perceived privacy boogeyman du jour: HTTP cookies. In order to avoid out-of-band retrieval of privacy statements, compact policies were introduced, a summary of the full XML policy that could be expressed succinctly in HTTP response headers accompanying cookies. Compact policies are derived from the full XML version via a deterministic transformation. This process is lossy and produces a worst-case picture: while the full XML format allows specifying that a particular type of data (say, email address) is collected for a specific purpose, retention period and third-party usage, the compact policy simply lists all categories, all purposes, all retention times etc. as a one-dimensional list, collapsing such subtle distinctions. Still, compact policies could be specified in HTTP headers or even in the body of the HTML document, allowing fast decisions about cookies.
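For a concrete picture of where a compact policy lives on the wire, here is a toy Java server that sets a cookie along with a P3P header. The token string is illustrative only, assembled from the real compact-policy vocabulary; it does not correspond to any particular site’s policy, and the class name is made up.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

/** Toy server showing where a compact policy lives on the wire: a P3P response
 *  header sent alongside the cookie. The token string is illustrative only. */
public class CompactPolicyServer {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().add("Set-Cookie", "id=12345; Path=/");
            // Compact policy summarizing access, purposes, recipients, retention and categories.
            exchange.getResponseHeaders().add("P3P", "CP=\"NOI DSP COR CURa OUR NOR STA\"");
            byte[] body = "hello".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }
}
```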

So what was implemented in practice? Internet Explorer ended up being the only web browser supporting P3P, and a very specific subset at that (full disclosure: this blogger was involved in the standards effort and the implementation in IE):

  • IE uses compact policies for cookie management.
  • IE does not evaluate full XML policies or otherwise act differently based on the presence/absence of that document. It does not even make an attempt to retrieve the XML or verify its consistency against the compact policy. There is an option under the privacy report to retrieve the policy and render it in natural language, if the user goes out of their way to ask for it. (Not surprisingly, many sites only deployed compact policies, never bothering to publish the XML.)
  • No APPEL or other automatic policy evaluation triggers, for example before submitting a form or logging in to a new service when it would be a useful data point for the user.

Even with this subset, P3P had a significant effect on web sites because of its default settings. Belying the assertion that default settings are just that, easily modified by users who disagree with them, the default choice of “medium” privacy became the de facto standard for websites that depended on cookies. First-party cookies were given a wide berth, not requiring a compact policy and permitting existing usage to continue functioning without any changes, while third-party cookies without an associated satisfactory policy were summarily rejected. That means advertising networks must not only implement P3P, they must have a policy that meets the default settings for IE. Otherwise all of those banner ads and iframes with punch-the-monkey animated Flash ads get stripped of their cookies, losing their capability to accurately track distinct users.

This is a great example of regulation by code, as Lawrence Lessig described it brilliantly in “Code and Other Laws of Cyberspace.” By choosing a particular default configuration in the most popular web browser, MSFT had established a minimum privacy bar for a segment of the online industry. (The irony is inescapable: at the same time that MSFT was trying to discredit Lessig in the antitrust trial, the engineers were busy providing a textbook example of his central thesis around regulation via West Coast Code.)

[continued]

CP

Secure elements and mobile devices

After the previous post covering NFC modes in Android, time to turn our attention to a closely related subject: the embedded secure element.

In principle a hardware secure element can be viewed as a completely independent entity, orthogonal to whether there is NFC capability on the same device. Sure enough, such a “secure element” already exists in a good chunk of phones: the lowly SIM card, or UICC as it goes by its formal name, is a type of secure element capable of executing security-critical functions. Its raison d’etre is the storage of authentication keys for connecting to GSM networks, a scenario near-and-dear to the mobile carriers. But as is often the case, market demand has influenced hardware requirements: the driving force for including an SE (or even a second SE, counting the SIM for GSM devices) is tightly coupled to the primary NFC use case, contactless payments.

The secure element is a system-on-a-chip, or SoC, which is to say that it has its own processor, RAM and persistent storage. It can be viewed as a tiny computer inside the main “computer” that is the smart phone. That in itself is not very remarkable, as the average phone contains plenty of such chips: everything from the Bluetooth adapter to the flash controller could arguably meet that definition. What differentiates the secure element?

  1. Locked-down operating system which cannot be directly controlled by the host device. In other words, Android OS, even with root privileges, cannot reflash the contents of the SE, read or write its memory, or install new code. (Managing the SE requires privileged access, authenticated by cryptographic keys, for such operations.) For most other chips such restrictions are undesirable. For example, it is important that the Bluetooth controller can have its firmware updated locally as the OEM releases updates or bug fixes.
  2. Hardware tamper-resistance measures designed to guard against attacks that involve direct physical access to the chip. This includes intrusive attacks such as peeling open the chip to try to read its EEPROM directly, or attempting to cause glitches in execution by subjecting it to environmental stress: heat, over/under-voltage, zaps with laser beams, etc.
  3. Built-in root of trust, with unique identity. It is possible, for parties armed with the right cryptographic keys, to authenticate an SE remotely and set up a secure channel where communications to and from that SE are not visible even to the host operating system.

Secure elements appear in any number of different physical form factors, ranging from the very familiar “smartcard” in ID-1 format (the typical dimensions of a credit card) to USB tokens employed for authentication in enterprise settings. While these objects seem “large” in relation to the size of a mobile device, it should be noted that the bulk is not taken up by the electronics. (In particular, the brass-colored metal area on a smart card is not the size of the IC; those are the contact points for interfacing with a card reader, and their dimensions are fixed by international standards.) The chip itself is tiny and continues to shrink over time as fabrication techniques improve. By contrast, overall physical dimensions are subject to interop constraints, such as being wide enough to cover a USB slot.
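Regardless of form factor, the host reaches the SE over a narrow command/response interface. As a hedged illustration, here is a small Java sketch using the standard javax.smartcardio API to reach a secure element in the familiar smartcard form factor through a PC/SC reader and issue an ISO 7816-4 SELECT command; the applet AID is a made-up placeholder.

```java
import javax.smartcardio.*;
import java.util.List;

/** Sketch of talking to a secure element in the smartcard form factor through a
 *  PC/SC reader, using the standard javax.smartcardio API. The applet AID below
 *  is a made-up placeholder, not a real identifier. */
public class SecureElementDemo {
    public static void main(String[] args) throws Exception {
        TerminalFactory factory = TerminalFactory.getDefault();
        List<CardTerminal> terminals = factory.terminals().list();
        CardTerminal terminal = terminals.get(0);        // first attached reader

        Card card = terminal.connect("*");               // whichever protocol the card supports
        CardChannel channel = card.getBasicChannel();

        // ISO 7816-4 SELECT-by-AID: CLA=00 INS=A4 P1=04 P2=00, data = applet AID (placeholder).
        byte[] aid = {(byte) 0xF0, 0x01, 0x02, 0x03, 0x04, 0x05};
        ResponseAPDU response = channel.transmit(new CommandAPDU(0x00, 0xA4, 0x04, 0x00, aid));
        System.out.printf("SW=%04X%n", response.getSW()); // 0x9000 indicates success

        card.disconnect(false);
    }
}
```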

In the spirit of experimentation, different form factors have been tried for incorporating a secure element into a mobile device:

  1. SIM card and its smaller brethren found in the iPhone (because the Apple design has to be different and incompatible)
  2. MicroSD cards which include a secure element, such as the Giesecke & Devrient Mobile Security Card and Tyfone SideSafe designs. These combine mass storage suitable for the SD slot on a phone with a secure element accessed over the same interface. (Tyfone even boasts a version with integrated NFC.)
  3. Embedded SE coupled to the NFC controller: this is the Android architecture, where the secure element is part of the phone.

The list does not even include ways that an external SE can be used in conjunction with the phone. For example, there have been mobile payment designs based on stickers, where a sticker containing an SE and an integrated NFC antenna is applied to the back of the phone. (These end up being relatively thick, because a layer of ferrite is necessary to separate the antenna from metal on the back of the phone.) Likewise, the US government adoption of smartcards with the CAC and PIV programs has inspired highly awkward-looking sleeves and Bluetooth card readers designed to allow reading such cards from a mobile device.

CP

Android and NFC modes

Quick note about the different modes for NFC usage supported in Android:

  1. Reader/writer mode. This is probably the most common scenario. The host device functions as the active participant, while on the other side is a passive tag powered by the induction field generated by the phone. Examples include scanning a URL from a tag (such as Pay-By-Phone stickers on parking meters in SF) or reading information from a US passport. The Android NFC stack provides extensive support for this mode, in a callback model: applications can register to receive notifications on discovery of tags, either by NDEF content or by tag type, such as Mifare Classic tags or all ISO 14443 smartcards. (A minimal sketch of this mode appears after this list.)
  2. Peer-to-peer, the basis for Android Beam. It is not possible to directly use this mode via the Android API on either the sender or recipient side. Instead, applications can declare an NDEF record to be transmitted if a Beam transfer is initiated. The stars have to align for this, with another device in RF range of the phone and the transfer confirmed by the user tapping the screen at the right instant. (There is also an optimization to register a callback that creates the NDEF record on demand, without committing to it in advance.) On the recipient side, Beam is handled directly by the NFC service by invoking the right application.
  3. Card emulation. In this mode the phone emulates an NFC tag. Specifically, for Android card emulation means routing communication from an external NFC reader directly to the embedded secure element, which can appear either as an ISO 14443 contactless smart-card or a Mifare Classic tag. The host operating system is completely out of the picture: the traffic goes directly between the NFC antenna and the SE, without traversing the Android stack at all. It follows that applications on the host OS have no control over the data exchanged in this model, except indirectly by influencing the behavior of applets present on the SE. Card emulation is used by Google Wallet to execute contactless payments with the secure element, as well as offer redemption. By default card-emulation state is tied to the state of the screen: CE is on only when the display is on. (As an aside: the screen does not have to be unlocked. This is in contrast to reader/writer mode, where the polling loop will not operate when the screen is locked. For this reason it is not possible to scan tags without first getting past the Android screen lock, while tap-payments can be initiated by simply turning on the display and holding the phone against an NFC reader.)
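Here is the minimal sketch of the reader/writer callback model mentioned in the first item, using Android’s foreground dispatch mechanism so that NDEF tags discovered while the activity is visible are delivered to it. The activity name is hypothetical, and error handling plus the NFC permission in the manifest are omitted.

```java
import android.app.Activity;
import android.app.PendingIntent;
import android.content.Intent;
import android.content.IntentFilter;
import android.nfc.NdefMessage;
import android.nfc.NfcAdapter;
import android.os.Parcelable;

/** Hypothetical activity using foreground dispatch: while it is in the foreground,
 *  NDEF tags are delivered to it instead of going through normal intent resolution. */
public class TagReaderActivity extends Activity {

    private NfcAdapter adapter;

    @Override
    protected void onResume() {
        super.onResume();
        adapter = NfcAdapter.getDefaultAdapter(this);
        PendingIntent pending = PendingIntent.getActivity(this, 0,
                new Intent(this, getClass()).addFlags(Intent.FLAG_ACTIVITY_SINGLE_TOP), 0);
        IntentFilter ndef = new IntentFilter(NfcAdapter.ACTION_NDEF_DISCOVERED);
        try {
            ndef.addDataType("*/*");                     // match any NDEF payload type
        } catch (IntentFilter.MalformedMimeTypeException e) {
            throw new RuntimeException(e);
        }
        adapter.enableForegroundDispatch(this, pending, new IntentFilter[]{ndef}, null);
    }

    @Override
    protected void onPause() {
        super.onPause();
        adapter.disableForegroundDispatch(this);
    }

    @Override
    protected void onNewIntent(Intent intent) {
        super.onNewIntent(intent);
        Parcelable[] raw = intent.getParcelableArrayExtra(NfcAdapter.EXTRA_NDEF_MESSAGES);
        if (raw != null && raw.length > 0) {
            NdefMessage message = (NdefMessage) raw[0];  // first NDEF message on the tag
            // ... inspect message.getRecords() for a URI or text record
        }
    }
}
```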

Card emulation mode is particularly interesting because it allows the phone to function as a smart-card and substitute for single-purpose dedicated cards that were traditionally used in scenarios such as transit, physical access control and identity/authentication. In other words, subsuming the capabilities of an EMV contactless credit-card is the proverbial tip of the iceberg.

CP

GoDaddy outage and lessons for certificate revocation (2/2)

Windows includes a helpful utility called certutil that serves as a Swiss-army knife for troubleshooting PKI problems on that platform. Its -urlcache option can be used to look at URL cache entries, where previously obtained OCSP responses and CRLs are stored. By running this query and looking for objects associated with GoDaddy, one can determine the extent of revocation information that would have been available to the client locally if further network requests were ruled out.

Running this experiment on a couple of actively used Windows 7 machines shows a decidedly mixed record:

  • On one machine there were no GoDaddy entries at all. In this case all revocation checks for GoDaddy sites would have failed.
  • On another laptop, there were two dozen OCSP responses, as well as CRLs for the root and intermediate issuers.

“Actively used” is the operative phrase here, because paradoxically the effectiveness of revocation checking as implemented on Windows is directly correlated with its frequency of use. The chain-building engine contains sophisticated optimizations on when to prefer CRL over OCSP (if multiple certificates are checked for a given issuer, it becomes more efficient to download the CRL) and also on which issuers are most frequently observed, to allow prefetching those OCSP responses and CRLs ahead of time, before the current ones expire.
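As a toy illustration of the kind of heuristic described here (not the actual CAPI2 logic, and with an arbitrary threshold), a client could track how often it checks certificates from each issuer and cut over from per-certificate OCSP queries to a single CRL download once the count gets high enough:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy illustration of the heuristic described above, not the actual CAPI2 logic:
 *  once enough certificates from the same issuer have been checked, switch from
 *  per-certificate OCSP queries to downloading that issuer's CRL. */
public class RevocationStrategy {

    enum Method { OCSP, CRL }

    private static final int CRL_THRESHOLD = 50;   // arbitrary cut-over point
    private final Map<String, Integer> checksPerIssuer = new ConcurrentHashMap<>();

    Method choose(String issuerName) {
        int count = checksPerIssuer.merge(issuerName, 1, Integer::sum);
        // Many checks against one issuer: a single CRL download now covers them all.
        return count >= CRL_THRESHOLD ? Method.CRL : Method.OCSP;
    }
}
```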

(As an aside, this makes revocation checking something of a cooperative enterprise between multiple applications on the machine. Everyone wants to avoid doing a costly CRL/OCSP check over the network, hoping that a response is already in the cache. But to the extent that applications skip revocation checking or instruct CAPI2 to use offline checks based on cached information only, the chances of that happy condition occurring go down. This is why applications such as Chrome, which “defect” from revocation checking, are doing a disservice to other applications using the feature.)

The sensitivity of caching to navigation patterns is helpful. Any website the user visits often will likely have an OCSP response cached, helping tide over any temporary outage of the certificate issuer when visiting those sites again. In fact, if the user happened to visit many sites with GoDaddy-issued certificates, it may even exceed the threshold where a CRL download is triggered, covering all sites affiliated with that issuer, including those not yet visited. But while navigation history is highly clustered around particular sites, which makes the first case realistic, there is no reason to expect that the multiple sites a user visits are more likely to have certificates issued by the same CA.

There is one more ray of hope: OCSP stapling. This is an SSL extension that permits the server to return a recent OCSP response to the client, saving the client from having to do the lookup on its own. In principle this would also increase resilience against outages of the OCSP responder, as long as the server has a fresh response obtained prior to the outage. (This still has edge cases around a brand new server being deployed from scratch, or perhaps rebooted during the outage; typically it would need to reach an OCSP responder as part of initialization.) In reality the less-than-stellar uptake of this optimization outside of the Windows platform means it would have been of limited use in the GoDaddy debacle. This may change in the near future. For example, nginx recently announced support for OCSP stapling.

CP

GoDaddy outage and lessons for certificate revocation (1/2)

One of the unintended side-effects of the recent GoDaddy outage was providing a data point in the debate over whether certificate revocation checks can be made to fail hard. In addition to being the registrar and DNS provider for several million websites, GoDaddy also operates a certificate authority. The outage inadvertently created an Internet-wide test of the ability of revocation checks to operate in offline mode.

Quick recap: when establishing a secure connection to a website using SSL, web browsers will verify the identity of that site, presented in a format known as X.509 digital certificates. Most of these checks can be done locally and are very efficient: for example, verifying that the certificate has been issued by a trusted authority, that it has not expired, and that the name specified in the certificate is consistent with the name of the website the user expected to visit, e.g. what appears in the address bar. But there is an additional check that may require looking up additional information from the web: verifying that the certificate has not been revoked since its time of issuance.

It is fair to say that web browser developers hate revocation checking because of its performance implications. The web is all about speed, with each browser vendor cherry-picking their own set of performance benchmarks. Over time an extensive bag of tricks has been developed to squeeze every last ounce of bandwidth available from the network and fetch those webpages an imperceptible fraction of a second faster than the competing browser. Revocation checking throws a wrench into that by stopping everything in its tracks until information about the validity of the certificate has been retrieved. (In fact more than one connection is stalled: one of the standard speed improvements involves creating multiple connections to request resources in parallel, such as fetching the style-sheet and images concurrently. All of these are blocked on vetting the certificate status.)

Almost always, the revocation checks pass uneventfully. In rare cases the client may discover that a certificate was indeed revoked, saving the user from a man-in-the-middle attack, although there have been no recorded cases of that happening in recent memory. (For the most epic CA failures, such as the DigiNotar incident, the certificates in question were blacklisted through out-of-band channels and the attacks stopped as soon as they were publicized, long before revocation checks could save the day.) Then there is the third possibility of a gray area: the revocation check is inconclusive because the network request to ascertain the certificate status has failed. A conservative design favors failing safe and assuming the worst. In reality, due to historically low confidence in services providing revocation status (real or imagined), most implementations fail open and assume the certificate is not revoked. For example, Internet Explorer does not even warn about failed checks in the default configuration. This is the basis for rants to the effect that revocation checking is useless: after all, in any scenario where an adversary has enough control over the network to orchestrate a man-in-the-middle attack, they are also capable of blocking any traffic from that user to the revocation provider.
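The fail-open versus fail-hard choice is ultimately a policy knob in the client. As a hedged illustration, here is a sketch using Java’s PKIX APIs (not how IE or any other browser implements it) showing how the same validation can be switched between hard-fail and soft-fail behavior; the certificate chain and trust anchors are assumed to be supplied by the caller.

```java
import java.security.cert.*;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Sketch of the fail-hard versus fail-open choice using Java's PKIX API; this is
 *  only an illustration of the policy knob, not how any browser implements it.
 *  The chain and trust anchors are assumed to be supplied by the caller. */
public class RevocationPolicyDemo {

    static void validate(List<X509Certificate> chain, Set<TrustAnchor> anchors,
                         boolean hardFail) throws Exception {
        CertPathValidator validator = CertPathValidator.getInstance("PKIX");
        PKIXRevocationChecker checker =
                (PKIXRevocationChecker) validator.getRevocationChecker();

        if (!hardFail) {
            // Soft-fail: an unreachable OCSP responder or CRL distribution point is not fatal.
            checker.setOptions(EnumSet.of(PKIXRevocationChecker.Option.SOFT_FAIL));
        }
        // Hard-fail is the checker's default: validation throws unless fresh status is obtained.

        CertPath path = CertificateFactory.getInstance("X.509").generateCertPath(chain);
        PKIXParameters params = new PKIXParameters(anchors);
        params.addCertPathChecker(checker);   // replaces the built-in revocation checking

        validator.validate(path, params);     // throws CertPathValidatorException on failure
    }
}
```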

Luckily the situation is not quite as bleak as described above. There are several optimizations in place to avoid costly, unreliable network lookups for each certificate validated. First there is aggregation within each CA: a certificate revocation list, or CRL, contains a list of all revoked certificates for that issuer. This spreads the network cost of a revocation check across multiple certificates. The list in turn can be broken up into incremental updates called delta CRLs, to avoid downloading an ever-expanding list each time. Finally, each update has a well-defined lifetime, including an expected point when the next update will be published. Combined with the fact that CRLs are signed, this means they can be cached both locally and by intermediate proxies at the edge of the network for scaling.
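Here is a minimal sketch of why that caching works, written against Java’s standard certificate classes rather than any browser’s implementation: a downloaded CRL can answer revocation queries for every certificate from that issuer until its nextUpdate time passes. The class is hypothetical, the CRL URL would normally come from the certificate’s CRL distribution point extension, and signature verification of the CRL is omitted for brevity.

```java
import java.io.InputStream;
import java.net.URL;
import java.security.cert.CertificateFactory;
import java.security.cert.X509CRL;
import java.security.cert.X509Certificate;
import java.util.Date;

/** Minimal sketch of CRL caching: one downloaded CRL answers revocation queries
 *  for every certificate from that issuer until its nextUpdate time passes.
 *  Verifying the CRL signature against the issuer key is omitted for brevity. */
public class CrlCacheDemo {

    private X509CRL cached;

    boolean isRevoked(X509Certificate cert, URL crlUrl) throws Exception {
        if (cached == null || cached.getNextUpdate().before(new Date())) {
            // No cached copy, or it is stale: fetch a fresh CRL over the network.
            try (InputStream in = crlUrl.openStream()) {
                cached = (X509CRL) CertificateFactory.getInstance("X.509").generateCRL(in);
            }
        }
        return cached.isRevoked(cert);   // purely local lookup, no further network traffic
    }
}
```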

The story gets more complicated when considering that there is another way to perform revocation checking: the Online Certificate Status Protocol, or OCSP. This is a more targeted approach to querying trust status: instead of downloading a list of all the bad certificates, one queries a server about a particular certificate identified by serial number. OCSP does not amortize cost over multiple queries, since finding the status for one website does not help answer the same question about a different one. On the bright side, OCSP responses also have well-defined lifetimes, much like CRLs, obviating the need for additional queries during that period. Also in the spirit of CRLs, they are signed objects, permitting caching by intermediate proxies to decentralize distribution.

All of this would suggest that perhaps clients could cope with a temporary outage of revocation servers, even if they opted for the hard-fail approach. (Recall that hard-fail means connections will not proceed without positive proof that the certificate was not revoked.) In principle all that caching could permit users to still visit websites when the revocation infrastructure becomes unreachable, when both the CRL distribution point (CDP) and the OCSP responder are down, which is exactly what happened to GoDaddy.

The question is, how well would that have worked out for GoDaddy during its outage?
Not very well, it turns out.

[continued]

CP