Intro to trouble: LinkedIn and trusting the cloud (part II)

[continued from part I]

Expanding attack surface

In terms of risk, Intro amounts to an expansion of the attack surface: the universe of ways a system can be targeted by adversaries. It’s not that email was absolutely safe before Intro and somehow became intolerably dangerous afterwards. Instead users incur additional risks: their messages can be compromised in transit to, or during processing at, LinkedIn datacenters.

LinkedIn’s response outlines mitigations in place to manage that risk. But discussing defenses is getting ahead of ourselves. The critical question is not whether the Intro design takes the necessary steps at the technology level to manage the delta. Before going down the path of evaluating countermeasures, there is a more basic question: does the value proposition make sense? Is the service provided by LinkedIn valuable enough to justify the risk? That question can not be answered in isolation without looking at both the benefit and risk sides of the equation. Much like deciding whether an investment is appropriate, we need to compare its expected returns to the incremental addition of attack surface.

Weighing risks and benefits

In this case the expected reward from installing Intro is that email messages are annotated with information about the sender, drawn from their LinkedIn profile. The potential risks are also clear: email flowing through LinkedIn systems is susceptible to attacks both in transit to/from LinkedIn as well as during the brief time it is being processed by LinkedIn systems. (This is a best-case, generous interpretation; we are taking the designers at their word that messages are not stored. That statement can not be verified without access to the LinkedIn operational environment.) What could possibly go wrong? Here is a sampling of potential risks:

  • State-sponsored attackers can break into LinkedIn systems to capture email as it is routed through them.
  • Messages can be intercepted in transit by breaking SSL, using fraudulent digital certificates issued for LinkedIn by an incompetent or dishonest CA.
  • LinkedIn insiders can modify the system to divert certain messages.
  • Law-enforcement and surveillance requests can compel LinkedIn to start storing messages, against the stated design intent.

Again these are all incremental risks. It’s not that SSL was absolutely safe when used only for connecting to the original email provider, or that the provider was somehow immune from getting 0wned by China. The point is that all of those risks are increased by having one more participant attackers can target. How much depends on the relative security of LinkedIn compared to the email provider already entrusted by the user with access to their messages. If a Gmail user started routing their traffic via Intro, chances are the risks have drastically increased: given its past experience responding to APT attacks and its investments in SSL such as certificate pinning, Google is likely a much harder target than LinkedIn.

Reasonable people may disagree

Is it worth it? The answer may well vary between individuals or, in managed IT environments, between different enterprise philosophies. At least for this blogger, there is no conceivable universe where scribbling profile information in email messages– information that can be obtained in other ways, if a little less conveniently, by visiting the LinkedIn website to run a manual search– is worth the risk of exposing raw email messages to a third party. Simply put, LinkedIn is not an appropriate “trusted third-party” for access to user email. This is not a reflection on LinkedIn or the quality of its internal security practices. The same concept implemented by Facebook or Twitter would be equally inappropriate and equally dubious in value proposition.

Also worth pointing out: this is not an automatic rejection of relying on cloud services, or special treatment afforded to email. Enterprises often contract with a third party for security services that screen all incoming email for the company. This is accomplished by routing the messages to servers run by that third party to be scanned for malware and spam. A decade ago commentators were asking whether it is appropriate to outsource such services. Two key differences from Intro make it easier to answer that question:

  • Clear security benefits to counter-balance risks. Blocking malware and phishing attacks arriving via email is a security feature. On the one hand, routing messages to third-party systems increases attack surface in ways similar to Intro. On the other hand, the enterprise expects reduced malware prevalence and corresponding improvement in host security.
  • Alternatives are significantly more costly or less effective. While email screening can be done on-premises as installed software, such designs face the problem of keeping up-to-date with new attack mitigations. By contrast outsourced systems benefit from having visibility into attacks across multiple customers and can respond to new threats faster by aggregating this information.

This is why it is not completely gratuitous for an outsourced security provider to have access to email traffic. Screening email is the raison d’être for these services; they could not provide any value otherwise. There is no similar urgency or necessity for a social network such as LinkedIn to access user email. As the existence of Facebook and any number of other successful specialized social networking sites demonstrates, access to user email is not in any way a prerequisite to operating a viable business in that space.

CP

Intro to trouble: LinkedIn and trusting the cloud (part I)

It has been a tough start for LinkedIn’s Intro feature, designed to add contact information from the social networking site to email messages. The project was announced on the company engineering blog with much fanfare, chronicling the challenges faced in implementing the concept on iOS. Whatever the technical complexity and virtuosity involved in pulling this together, the main reaction was one of skepticism and outright hostility from the security research community. In particular Bishop Fox eviscerated the concept in a detailed point-by-point critique, and LinkedIn responded with another blog post addressing technicalities while dancing around the fundamental question of trust.

Non-issues

Before discussing the problem with Intro, let’s dispense with one non-reason that appears to be dredged up in every article covering the feature. LinkedIn suffered a massive password breach in 2012, netting the company a Pwnie award nomination for Most Epic Fail. Incidents, or lack thereof, are not a good metric for evaluating the security of a service. While a data breach usually implies the existence of weaknesses and defects in the defenses, whether or not someone gets around to exploiting an existing weakness is influenced as much by sheer luck. Granted, there were disturbing signs in this episode indicating that suboptimal design played a significant role in amplifying the damage: the way LinkedIn stored passwords violated industry best practices. There was no salt applied to diversify passwords before hashing, and just one iteration of the hash function was used instead of iterating thousands of times to slow down guessing.
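
The two flaws can be illustrated with a short sketch using Python’s standard library. The `weak_hash` function mirrors what the breach revealed (one round of unsalted SHA-1); `stronger_hash` shows the practice the post describes, using PBKDF2 with an iteration count that is purely illustrative:

```python
import hashlib
import os

def weak_hash(password: str) -> str:
    # What the breach revealed: a single round of unsalted SHA-1.
    # Identical passwords collapse to identical hashes, and one
    # precomputed (rainbow) table covers the entire dataset.
    return hashlib.sha1(password.encode()).hexdigest()

def stronger_hash(password: str, salt: bytes, iterations: int = 100_000) -> str:
    # Industry practice: a per-user random salt plus thousands of
    # iterations (PBKDF2 here) to slow down offline guessing.
    # The iteration count is an illustrative parameter, not a recommendation.
    dk = hashlib.pbkdf2_hmac("sha1", password.encode(), salt, iterations)
    return dk.hex()

# Two users choosing the same password are instantly linkable with the weak scheme:
assert weak_hash("hunter2") == weak_hash("hunter2")
# With per-user salts, the same password produces unrelated hashes:
s1, s2 = os.urandom(16), os.urandom(16)
assert stronger_hash("hunter2", s1) != stronger_hash("hunter2", s2)
```
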

That was not an isolated instance when it comes to questionable decisions on the security front. As noted earlier, the service continues to use the password anti-pattern, phishing users for their passwords on other sites instead of adopting the industry-standard OAuth protocol for constrained access to user data at those sites.

Still, there is a statute of limitations for incidents. It is not rational risk-management to reject every new offering from a company on the basis of one incident, or for that matter to let failure to follow optimal security design in one feature color judgments about every other one. This post will give LinkedIn a free pass for such transgressions and evaluate Intro on its own terms.

Fundamental problem with Intro

The key observation about Intro is that the functionality is not implemented locally. In order for this email rewriting to take place, the message is sent out to LinkedIn servers, modified there in the cloud and then returned to the user. This means that LinkedIn servers get access to every single message sent to that particular email account. The difference is best explained by contrasting Intro with two other common systems that operate on email messages.

Gmail keyword advertising

Since 2004, Gmail has been controversial for offering targeted advertising based on keywords in email. Strictly speaking Gmail does not tamper with messages, unlike LinkedIn Intro. Sponsored advertising appears off to the side, in a clearly demarcated area. Still the experience of the user– something Microsoft repeatedly capitalized on in the Scroogled series of TV commercials– is that their messages are being “read.” Why is Gmail keyword scanning not a security risk? (Even though it may well be construed as a significant privacy infringement.) Because Google servers already have access to the email message. There is no new user data being made available to Google in order for their servers to decide which advertisements will be displayed alongside the message. This stands in sharp contrast with the LinkedIn situation: before Intro, LinkedIn did not have access to emails sent or received. It is the act of installing Intro that causes otherwise private messages to start flowing through LinkedIn servers.

PGP and S/MIME

Other examples of software that do in fact modify email messages are the PGP and S/MIME extensions for email. Both are standards for adding encryption and digital signatures to messages. Sometimes the functionality is built into an email client: MSFT Outlook supports S/MIME. In other cases it is a third-party extension that integrates with an existing email application. For example GPGTools hooks into the standard Apple mail client on OS X.

So what is the difference between installing Intro on iOS and installing a GPG client for OS X that integrates with the built-in mail application? GPG clients operate locally. No data is ever shipped to a third party in the cloud. (Incidentally, the reason LinkedIn implemented Intro as a remote service is that the iOS mail application lacks the necessary extensibility mechanism for other local applications to hook into the email processing pipeline.)

Local vs cloud

Having a local application does not completely eliminate the trust question. Users still have to trust the author of the software. After all, that code could secretly leak a copy of every message to a server in China or rootkit the machine. But such properties can be verified locally. A complete copy of the implementation is available for direct observation. It can be debugged, audited or reverse-engineered if necessary– many versions are open source, so they can be audited directly. It can be tweaked to run with reduced privileges in a sandboxed environment. More importantly for the purpose of future-proofing the trust decision, there are strong assurances in place that these properties will not change magically. Users retain visibility and control over changes to the application going forward. If the software publisher decides to go rogue or is compelled by law enforcement to start installing spyware on user machines, they will have to go through a public process of pushing out malicious updates. This is conceivable, but much harder to hide compared to making equivalent changes behind closed doors inside a datacenter.

Leaps of faith

By definition, critical parts of the Intro implementation belong in the cloud, inside LinkedIn data centers. Regardless of how much LinkedIn swears up and down that this environment has the necessary safeguards (the blog cites an iSEC audit, but it is telling that iSEC Partners itself has not come forward to defend the design), that aspect remains a black box for anyone who is not directly affiliated with the company. A significant leap of faith is required to accept that all is well inside that black box, not just in the present moment but indefinitely into the future.

Granted, such leaps of faith are made all the time when adopting cloud computing. Gmail users have made a decision (perhaps implicitly, without spelling out the full consequences) that it is an acceptable level of risk for Google to have access to their written communications. Ascertaining whether the same risk can be justified for Intro calls for stepping back to examine the broader question of how trust decisions are made.

[continued]

CP

NFC payments on personal machines: PCI versus innovation (part II)

[continued from part I]

In this follow-up we look at the missing pieces for enabling ecommerce with tap-and-pay from end-user devices.

Integration with web-browsers

The first problem is that neither the NFC reader hardware nor the higher-level APIs leveraging it (such as PC/SC for smart cards) are directly exposed to websites. This is good for security– it is dangerous to give websites access to smart cards that happen to be attached to the system, considering those cards may contain valuable credentials. “Directly exposed” in this case refers to being accessible from the standard web platform of HTML5 and JavaScript. It is certainly possible to execute arbitrary code and go directly to operating system APIs via various extension mechanisms such as ActiveX and NPAPI. But these are not portable and strongly discouraged for security reasons, not to mention impractical for each website to rewrite from scratch.

Not to worry: from their humble beginnings, web browsers have been getting increasingly bloated with random APIs over time. W3C recently standardized a basic cryptography API. (There is even a proposal for a secure element API in the same vein, promoted by Gemalto. But that proposal operates at too low a level, exposing raw APDU communications to cards instead of encapsulating a payment transaction.) It is not too much of a stretch to imagine some payments API being introduced, either by industry consensus or by one of the players such as Google or MSFT deciding to field their own proprietary design. Such an API could abstract away the logistics of stepping through the payment protocol and communicating with the card. It would function identically for contact-based chip & PIN as well as contactless cards with NFC readers. Implementation of the API would also handle critical user-interface elements such as collecting the PIN from the user, which can not be entrusted to websites.
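
The shape of such an API might look like the following sketch. Everything here is hypothetical– the class and method names are invented for illustration, and the fake implementation exists only so the flow can be exercised without card hardware:

```python
from dataclasses import dataclass

@dataclass
class PaymentProof:
    # Opaque authorization produced by the card; the website never sees
    # the PIN or card keys, only this result.
    scheme: str        # e.g. "paypass-magstripe" (illustrative label)
    track_data: bytes  # emulated track data carrying the dynamic CVC3

class PaymentApi:
    """Hypothetical browser-exposed payments API; all names are made up."""
    def request_payment(self, amount_cents: int, currency: str) -> PaymentProof:
        # A real implementation, living in the browser rather than the
        # website, would: (1) collect the PIN in trusted browser UI,
        # (2) talk to the card via the contact or NFC reader, (3) run the
        # EMV exchange with the amount as an authenticated input, and
        # (4) hand back only the resulting proof.
        raise NotImplementedError

class FakePaymentApi(PaymentApi):
    # Toy stand-in so the flow can be demonstrated end to end.
    def request_payment(self, amount_cents: int, currency: str) -> PaymentProof:
        return PaymentProof(scheme="paypass-magstripe",
                            track_data=b"<emulated track 2 with CVC3>")

proof = FakePaymentApi().request_payment(1999, "USD")
assert proof.scheme == "paypass-magstripe"
```

Note the division of labor: the website supplies only the amount and currency; PIN collection and card communication stay inside the trusted implementation.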

Integration with payment processors

After the protocol is executed, the online merchant ends up with some “proof” of authorization generated by the card and conveyed by the web browser. The contents of this proof are partially dependent on inputs chosen by the website itself. For example in certain EMV protocols, the amount being charged is itself an input authenticated by the card. The next challenge is for the website to use that proof and somehow get paid.

This is an interoperability concern. Typically websites work with a payment processor which expects to receive specific inputs to pass upstream: credit card number, expiration date, cardholder name and optionally CVC2. Depending on the exact protocol variant used, a contactless transaction will not produce all of these. For example the cardholder name is redacted from the emulated “magnetic stripe” for Paypass transactions. That problem is easily solved by asking the user– chances are the merchant is interested in the customer name for other reasons already. But others are more difficult to work around. Contactless payments use a dynamic CVC, also known as CVC3. (By the way: that applies only for the simplest case of EMV protocols run in backwards-compatible mode with swipe transactions. This is the “mag-stripe profile” intentionally designed to produce output that looks like track data coming from a plain plastic card. Pure EMV has far more elaborate schemes for authorizing transactions, with nothing resembling a CVC in sight.) That value is not static; it is computed as a function of a challenge from the reader and an incrementing sequence counter maintained by the card. On the surface CVC3 has the same format as CVC2: typically three digits, although this is a configurable parameter. However they are not interchangeable; CVC3 values can not be used to make a card-not-present transaction.
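
A toy model makes the dynamic structure concrete. Real cards derive CVC3 with card-resident secret keys per the Paypass specification, not with HMAC-SHA256 as below; this sketch only illustrates the shape of the computation– the output depends on a fresh reader challenge and the card’s transaction counter, so a captured value is useless later:

```python
import hashlib
import hmac

def toy_cvc3(card_key: bytes, reader_challenge: bytes, atc: int) -> str:
    # Toy model only: NOT the real Paypass derivation. The point is that
    # the result is a function of (a) a challenge chosen by the reader and
    # (b) the card's incrementing application transaction counter (ATC),
    # so replaying an old value fails verification at the issuer.
    msg = reader_challenge + atc.to_bytes(2, "big")
    mac = hmac.new(card_key, msg, hashlib.sha256).digest()
    # Truncate to three decimal digits, matching the CVC2-like format.
    return str(int.from_bytes(mac[:4], "big") % 1000).zfill(3)

cvc = toy_cvc3(b"card-secret", b"reader-challenge", atc=41)
assert len(cvc) == 3 and cvc.isdigit()
```
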

At this point one can fall back to the strategy of simply asking the user for the other fields. The problem is that it defeats the point of using a contactless payment protocol over NFC: having a stronger assurance that the card is indeed present, and resisting some class of attacks that involve copying cardholder data. Receiving a fresh CVC3 response based on a challenge chosen by the website provides guarantees that are not present when using a static CVC2, which could have been captured from the user in a different context. Put another way, retrieving only the card number and other static data from the card is only “tap-and-pay” in spirit; it amounts to using NFC as a glorified numeric keypad, without leveraging the higher security level possible for contactless payments.

A better solution is to allow the upstream payment processor to accept the output format generated by EMV protocols. Luckily the mag-stripe profiles provide an easy transition path for this, since they were designed for backwards compatibility with “legacy” swipe transactions using CVC1. While POS transactions are card-present (as opposed to the card-not-present mode used in online purchases), processors typically have the capability to process these if they are also facilitating in-store transactions. In other words, one approach to interop is treating a tap-and-pay transaction from an end-user device as a card-present transaction conducted at a POS terminal, returning track data with CVC3. This emulated track data would be relayed to the merchant and passed through to the payment processor, who in turn forwards it to the network.
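
The relay step above can be sketched as a simple mapping. The field names here are illustrative and do not correspond to any particular processor’s API; the point is that the emulated track data, with its dynamic CVC3 baked in, travels upstream as a card-present transaction rather than being re-keyed as card-not-present fields:

```python
def to_card_present_request(emulated_track2: str, amount_cents: int) -> dict:
    # Sketch of the interop approach: pass the emulated track data through
    # as though it came from a swipe at a POS terminal, preserving the
    # dynamic CVC3 instead of downgrading to static PAN/expiry/CVC2.
    # Field names are invented for illustration.
    return {
        "entry_mode": "contactless-magstripe",  # card-present, not CNP
        "track2": emulated_track2,
        "amount": amount_cents,
    }

req = to_card_present_request("<emulated track 2 data>", 2500)
assert req["entry_mode"] != "card-not-present"
```
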

[continued]

CP

NFC payments on personal machines: PCI versus innovation (part I)

Nearly six months ago Engadget reported that NFC payments may be coming to Lenovo laptops equipped with built-in NFC readers. Digging deeper, the headline proves somewhat misleading– Lenovo only announced that these models have an integrated NFC reader. The author appears to have optimistically jumped to the conclusion that it could be used to enable payments. (HP already beat them to the punch with the Envy Spectre over a year ago, but the story neglects to mention that.) Humoring this line of speculation, it is a fair question what exactly stops users from completing an ecommerce transaction by tapping their contactless credit card against their laptop. This is an interesting case where technology is running well ahead of self-imposed certification rules.

Accepting payments vs. making payments

First, one clarification: the Engadget story refers to using off-the-shelf devices as payment terminals (point-of-sale, or POS, in industry parlance). This is not to be confused– as the article does– with using the device as a payment card, as in the case of making in-store payments with Google Wallet running on an Android device.

At a superficial level, both scenarios use the same technology and follow the same EMV suite of payment protocols. But the roles are reversed. In the first case the device is in card-emulation mode, behaving like an ISO-14443 compatible smart card. It is acting as the source of payment instruments. By contrast when a device is accepting payments, it operates in reader-mode and behaves like a point-of-sale terminal. The device communicates with the card over NFC to drive the transaction, ultimately obtaining necessary information to authorize a charge against the card over some payment network. (Somewhat confusingly both can be combined: tap Android phone running Google Wallet against NFC-enabled laptop. Earlier we described a proof-of-concept along those lines but it was used for authentication rather than payments.)

The scenario Lenovo is supposedly piloting would fall into the second case: visit an ecommerce site and, during checkout, tap an NFC-enabled credit card to pay.

Easy enough

Playing the role of an NFC-enabled credit card requires having the cryptographic keys associated with that card. This information is not printed or encoded on the plastic card itself. It is not possible to collect it from the user and somehow convert an ordinary magnetic-stripe card into one that can successfully interoperate with compliant POS terminals. It is up to the issuer to decide what type of hardware they are willing to entrust with these keys. Due to the high security concerns around safety of payment information, that usually means special tamper-resistant hardware. In the case of mobile devices, this is either an embedded secure element that is part of the phone hardware (used by Google Wallet) or a hardened SIM card with additional capabilities, which Google’s ill-fated competitor ISIS plans to leverage.

By contrast the reader side is not authenticated at all. Anyone can write reader software that will walk any EMV-compliant card through the transaction protocol. There is no “permission” required at the technology level for this. Of course there is a very stringent certification process for anyone planning to manufacture and sell credit card readers for use in commercial settings. But such criteria exist above the application protocol layer. The wire protocol has no provisions for recognizing when a particular reader is not “EMV-approved” and declining to proceed with payment. From a strict compatibility perspective, there is no reason why any laptop, tablet or smartphone equipped with an NFC reader could not function as a point-of-sale terminal for accepting payments.
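
To see how little “permission” is involved, consider the opening move of any contactless reader: a SELECT command addressed to the Proximity Payment System Environment by its well-known name. Any card in the field will answer it; no reader credential appears anywhere. A sketch of constructing that APDU:

```python
def select_ppse_apdu() -> bytes:
    # SELECT by name (CLA=00 INS=A4 P1=04 P2=00) for the contactless
    # Proximity Payment System Environment, whose well-known name is
    # "2PAY.SYS.DDF01". Nothing in this command identifies or
    # authenticates the reader sending it.
    name = b"2PAY.SYS.DDF01"
    return bytes([0x00, 0xA4, 0x04, 0x00, len(name)]) + name + b"\x00"

apdu = select_ppse_apdu()
assert apdu[:4] == bytes([0x00, 0xA4, 0x04, 0x00])
assert apdu[4] == 14  # length of the PPSE name
```
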

Proof-of-concept, no assembly required

MasterCard provides a Paypass emulator– Paypass being the MasterCard variant of EMV contactless payment protocols. This emulator runs on Windows, executing a “payment” complete with a trace of messages exchanged with the card. Running the emulator typically requires an external smart-card reader attached to the system, compatible with the PC/SC standard used by Windows for reader hardware. But as we pointed out, many laptops are already shipping with integrated NFC readers. Given appropriate PC/SC device drivers, these can function as smart card readers. In fact we demonstrated this earlier by using an NXP-provided PC/SC driver for the PN533 in HP Envy laptops for a proof-of-concept.

The only part of this simulation requiring quotes around “payment” is that no money changes hands. The emulator is not connected to any payment processor or issuer that will act on the information obtained from the card. But the card itself has no idea whether funds were actually moved. All of the bits exchanged over NFC are identical in both cases.

(Caution: the Paypass protocol includes an “application transaction counter” or ATC. Each completed tap– even when the reader is not connected to an actual cash register and no money is exchanging hands– will increment the ATC. If the ATC on the phone gets too far ahead of the ATC stored by the card issuer, future payments may be declined. Luckily in the case of Google Wallet, performing a full wallet reset from the Settings menu will provision a brand-new card with the ATC reset to zero.)
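
The issuer-side check implied by that caution can be sketched as a window test. The window size here is a made-up illustrative value; real issuers set their own risk parameters:

```python
def atc_plausible(last_seen_atc: int, reported_atc: int, window: int = 16) -> bool:
    # Issuer-side sanity check (the window of 16 is invented for
    # illustration): decline if the card's counter jumped too far ahead
    # of the last transaction the issuer processed -- too many "phantom"
    # taps against disconnected readers -- or failed to increase at all.
    return last_seen_atc < reported_atc <= last_seen_atc + window

assert atc_plausible(10, 11)        # normal next transaction
assert not atc_plausible(10, 100)   # counter drifted too far ahead
assert not atc_plausible(10, 10)    # replay: counter must increment
```
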

Missing pieces

The proof-of-concept covers the first hop– from the card to the NFC-equipped laptop. But that is not quite the full picture, since transaction data must eventually make its way to the remote website or service and from there into the payment rails associated with a network such as MasterCard or Visa. So what are the missing pieces?

  • Browser integration
  • Differences in payment authorization between what websites use today and data generated by an NFC transaction
  • Compliance with payment network regulations

[continued]

CP


Fine-grained control over framing (2/2)

Authenticating the framer

Continuing on the list of error-prone kludges required before the X-Frame-Options header acquired its ability to explicitly name another website as authorized framer:

  • Direct hand-off with federated authentication. This assumes the user is logged into both the framer and framee with the same identity (for example an OpenID login where one side is the identity provider and the other is a relying party.) The framer can pass a signed message to the framee containing a time-stamp, the URL of the page being framed and the identity of the user. The framee verifies this signature and compares the identity asserted in the message with the identity authenticated from the request. (Generalizing this slightly, it is not necessary for both sides to recognize the user with the same identity, as long as the identities can be compared. For example, the user could have different pseudonyms at the framer and framee, but as long as one side knows the pseudonym on the other side, this solution works.)
  • In the absence of shared identity, an improvised redirect scheme can be used as last resort. (To paraphrase Wheeler, most problems in web security can be solved by adding another layer of redirection.)
    • Container includes a query string parameter C identifying itself when linking to the framed content.
    • Framee looks at incoming parameter C. If it is among the containers permitted to frame this page, it sets a session cookie containing { R, C } where R is a random challenge, and redirects back to the container with R.
    • Container notices the challenge, determines that it was indeed trying to frame content for this user and redirects back to framee, this time with { R, C } as query string parameters.
    • Framee notices the presence of the session cookie, and compares the R in the cookie against the one in the query string. If these are identical, it deletes the cookie and returns the full content intended to be rendered in the frame.
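
The redirect scheme above can be simulated end to end in a few lines. The names (`PERMITTED_CONTAINERS`, the dictionary standing in for session cookies) are illustrative; the logic follows the four steps listed:

```python
import secrets

PERMITTED_CONTAINERS = {"container.example"}   # illustrative allowlist

session_cookies = {}  # framee-side stand-in for per-user session cookies

def framee_first_request(session_id: str, container: str):
    # Step 2: if the container is permitted, record {R, C} in the session
    # cookie and redirect back to the container carrying R.
    if container not in PERMITTED_CONTAINERS:
        return ("deny", None)
    challenge = secrets.token_hex(16)
    session_cookies[session_id] = (challenge, container)
    return ("redirect-to-container", challenge)

def framee_second_request(session_id: str, challenge: str, container: str):
    # Step 4: compare the cookie's {R, C} against the query string,
    # deleting the cookie either way so the exchange is single-use.
    stored = session_cookies.pop(session_id, None)
    if stored == (challenge, container):
        return "full-content"
    return "deny"

# An honest container completes the round trip:
status, r = framee_first_request("sess1", "container.example")
assert status == "redirect-to-container"
assert framee_second_request("sess1", r, "container.example") == "full-content"
# Replaying the same challenge fails because the cookie was deleted:
assert framee_second_request("sess1", r, "container.example") == "deny"
```
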

Strictly speaking these kludges do not verify that the framer is one of the expected websites– only that it had the cooperation of one of those websites. For example in the first scheme it is possible for a dishonest container to create signed assertions for one of its own URLs and then hand them off to another site served from a completely different URL.

The catch with the cache

Another curious property: these schemes stop the HTTP response from being returned to the browser when framing conditions are incorrect. In contrast the X-Frame-Options mechanism does not suppress the response– in fact it must be returned along with the regular HTTP response– but relies on the browser to block rendering of content when the intended framing conditions are not satisfied. This is a crucial difference, and at first blush looks like an advantage, in that not serving content seems the safest option. But caching can throw a wrench into that. Once the framed content is served to an authorized container in a cacheable manner, future references from other containers will load it straight from the cache, bypassing any checks performed during the initial fetch. This makes it critical that any page emitting an ALLOW conditionally, based on a query-string parameter, ensure that the same page can not be loaded out of the cache by a different website trying to frame it. In the examples above, the first protocol achieves this by making the URL unique to each user based on their authenticated identity. The second relies on the unpredictability of the random value R and its existence as a session cookie for that user.
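
The safe pattern for per-request framing decisions can be sketched as a header-emitting function. This is illustrative pseudologic, not any framework’s API; note the RFC 7034 spelling of the directive is “ALLOW-FROM”:

```python
def framing_headers(framer_param: str, authorized_framers: set) -> dict:
    # Sketch: when the X-Frame-Options value is chosen per-request based
    # on a query-string parameter, the response must be marked
    # non-cacheable. Otherwise a later framer can replay an earlier
    # grant straight out of the browser cache.
    if framer_param in authorized_framers:
        xfo = "ALLOW-FROM https://" + framer_param
    else:
        xfo = "DENY"
    return {"X-Frame-Options": xfo, "Cache-Control": "no-store"}

headers = framing_headers("foo.com", {"foo.com", "bar.com"})
assert headers["X-Frame-Options"].endswith("foo.com")
assert headers["Cache-Control"] == "no-store"
```
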

Caching is also the basis for confusion in the X-Frame-Options RFC, in section 2.3.2.4:

Caching the X-Frame-Options header for a resource is not recommended.
[...]
However, we found that none of the major browsers listed in
Appendix A cache the responses.

Both statements are misleading. The X-Frame-Options header is in fact cached along with the response and “replayed” from the cache, just like X-Content-Type-Options and many other HTTP response headers. Major browsers including Internet Explorer, Firefox and Chrome all implement this correctly. (Of course if the resource itself is not cacheable according to the Cache-Control header, nothing will be stored. But there is no intermediate scenario where only the payload is stored and the header is stripped out.) It is easy to verify that web browsers are handling this correctly: here is a page with multiple test-cases for X-Frame-Options that includes one example with cacheable content.

In fact not storing X-Frame-Options for a cached resource would lead to a vulnerability, allowing the restrictions to be bypassed:

  • Load the target resource in a top-level document, such as a new window, causing it to be cached by the client.
  • Close that window and then reload the resource in a frame inside the malicious website.
  • Because the web browser retrieves the page out of the cache without its X-Frame-Options header (in this hypothetical implementation), it does not realize that framing is disallowed. The content renders inside the iframe, creating a potential clickjacking vulnerability.

The problematic case the RFC tried to warn about is when ALLOWFROM is used to distinguish between multiple trusted framers. Suppose that foo.com and bar.com are both allowed to frame a cacheable resource served by a web-site. If this page is returned with ALLOWFROM=foo.com and later loaded by bar.com from the cache, it will not render. The cached header is granting access to the wrong framer.

Web browsers can’t divine when this problematic scenario arises. Looking at a single response containing an ALLOWFROM directive, the browser does not know whether there are multiple other authorized framers in addition to the presently named one. In the above example, if foo.com were the only authorized website, there would be no problem with caching that restriction. Only the website has visibility into that logic. At best an RFC can point out this scenario and recommend that servers mark such responses non-cacheable. Realistically this is an edge case: ALLOWFROM is not supported by Chrome. In fact a perfectly good patch implementing it was deliberately declined in favor of incorporating the functionality into Content Security Policy at some unspecified future date. In the absence of equivalent support across major browsers, few websites can rely on ALLOWFROM exclusively for conditional framing.

Deciding after page-load

One can also devise workarounds where content is returned and rendered but somehow deactivated until the identity of the framer is established. For example all the UI elements may be disabled, or the entire page obscured by an opaque div. The access check is performed in JavaScript, based on a postMessage notification from the container. Since the message received includes the identity of the sender as vouched for by the web browser itself, the framed content can activate its UI based on the origin. These schemes do not suffer from caching problems because the access check is done every time the frame is loaded.
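
The framed page’s handler logic amounts to an origin allowlist check, sketched here as plain Python (in a real page this would be a JavaScript `message` event handler; the allowlist contents are illustrative):

```python
TRUSTED_FRAMERS = {"https://foo.example", "https://bar.example"}  # illustrative

def on_message(event_origin: str, event_data: str) -> bool:
    # The key property: event.origin is supplied by the browser, not by
    # the sender, so a hostile container cannot spoof it. Only a message
    # from a trusted origin removes the opaque overlay / enables the UI.
    return event_origin in TRUSTED_FRAMERS and event_data == "activate"

assert on_message("https://foo.example", "activate")
assert not on_message("https://evil.example", "activate")
```
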

CP

Forward secrecy and TLS: detecting active attacks (part II)

(Continued from part 1)

Checking for man-in-the-middle attacks

Imagine planting special “observer” nodes around the network. Each of them would try to make a TLS connection to the website we are interested in monitoring, using one of the PFS ciphersuites. Each observer records a transcript of the TLS handshake conducted, including both its own messages and those purportedly sent by the server. Finally these transcripts are uploaded to a centralized monitoring service for processing. (This could be the same as the website under observation or an independent third party.) Likewise the website would also upload its own transcript of the TLS handshake when the request is made to a specially designated URL, designed to flag requests for special processing. With transcripts received from both sides, the monitoring service can reconcile them and verify that they are consistent.
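
Reconciliation can be sketched by having both sides hash the ordered handshake messages they saw; matching digests mean matching transcripts. The message names below are illustrative stand-ins for the actual TLS handshake records:

```python
import hashlib

def transcript_digest(handshake_messages) -> str:
    # Both the observer and the server compute this over the handshake
    # messages in order. Length-prefixing each message prevents two
    # different message sequences from producing the same byte stream.
    h = hashlib.sha256()
    for msg in handshake_messages:
        h.update(len(msg).to_bytes(4, "big") + msg)
    return h.hexdigest()

server_view   = [b"ClientHello", b"ServerHello", b"ServerKeyExchange"]
observer_view = [b"ClientHello", b"ServerHello", b"ServerKeyExchange"]
tampered_view = [b"ClientHello", b"ServerHello", b"MITM-KeyExchange"]

# Consistent views reconcile; a man-in-the-middle substitution does not.
assert transcript_digest(server_view) == transcript_digest(observer_view)
assert transcript_digest(server_view) != transcript_digest(tampered_view)
```
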

Authenticating the transcripts

But how can transcripts be delivered reliably? Let’s focus on the simplest case, where the monitoring endpoint is operated by the same website. If observers tried to upload the data over TLS, this would lead to a circularity: we are trying to determine whether TLS connections are being tampered with, so we can not rely on that channel for integrity. Our threat model envisions a powerful attacker who can manipulate all network traffic, including what were previously assumed to be “protected” TLS communications. Such an attacker can impersonate the server when it comes time to upload the report, deceiving the observer into believing the report was delivered while forwarding a bogus one to the original server, consistent with the unauthorized modifications induced by the MITM attack.

But this is not a fundamental limitation. We can posit that each observer has its own public/private key pair for signing the transcripts before upload. These keys are held only by the observer nodes, distinct from the TLS keys held by the website. The risk of compromise for TLS keys is mostly uncorrelated with that of individual keys distributed across ordinary PCs, tablets and smartphones around the Internet. Of course an adversary that controls the network can still block delivery of transcripts to the monitoring location. But a complete absence of reports will itself look suspicious. Presumably temporary network outages or observers being powered off could mean that reports are often delayed or queued up to be uploaded in batches. This is not a problem as long as they are eventually delivered and reconciled successfully. On the other hand, once an MITM attack is executed, the adversary must suppress all reporting, because the next report– however delayed– will include the evidence pointing to a discrepancy.
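The signing-and-reconciliation step can be sketched in a few lines of Python. This is a toy model under stated assumptions: an HMAC with a per-observer key stands in for a real public-key signature, and all names are illustrative rather than part of any actual protocol.

```python
import hashlib
import hmac

def sign_transcript(observer_key: bytes, transcript: bytes) -> bytes:
    """Observer: authenticate the transcript with its own independent key."""
    return hmac.new(observer_key, transcript, hashlib.sha256).digest()

def reconcile(server_view: bytes, observer_view: bytes,
              observer_key: bytes, tag: bytes) -> bool:
    """Monitoring service: check the report is authentic, then check that
    both sides saw the same handshake bytes."""
    if not hmac.compare_digest(sign_transcript(observer_key, observer_view), tag):
        return False  # forged or corrupted report
    return server_view == observer_view  # mismatch implies tampering in transit

key = b"per-observer signing key"
handshake = b"ClientHello|ServerHello|ServerKeyExchange|..."

# Honest case: both transcripts match.
tag = sign_transcript(key, handshake)
assert reconcile(handshake, handshake, key, tag)

# MITM case: the server saw different ServerKeyExchange bytes than the
# observer, because the attacker substituted its own DH parameters.
tampered = b"ClientHello|ServerHello|ServerKeyExchange-forged|..."
assert not reconcile(tampered, handshake, key, tag)
```

Because the observer key is unrelated to the website's TLS key, an attacker holding only the latter can neither forge a matching report nor silently alter a queued one.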

Limitations

There are a few caveats for making this type of decentralized observation system work:

  • Observer traffic must be indistinguishable from ordinary user traffic at the network level. Otherwise the attacker would simply exempt these specific connections from MITM while successfully attacking ordinary users. For example, if the observation agent had an unusual TLS configuration in terms of supported ciphersuites/extensions, this would allow the attacker to decide at a very early stage whether to intercept or let the connection go through.
    A typical mitigation would be for observation agents to use off-the-shelf TLS libraries such as NSS or SChannel in the same configurations used by popular web browsers, such as Chrome and Internet Explorer respectively.
  • The website being monitored must cooperate. This is crucial since the MITM detection relies on reconciling transcripts from the “user” point of view with those from the “website” point of view. If the website for any reason wanted to hide the existence of such interception, it could always collude with the attacker and report bogus transcripts consistent with the MITM attack.
  • There is limited ability for dispute resolution, or proving to a third party whether MITM occurred. At first this seems possible due to the way ephemeral DH/ECDH key exchange is implemented: the server signs its DH inputs in the ServerKeyExchange message using the long-lived key from its X509 certificate. That allows verifying that the ServerKeyExchange message was in fact part of a genuine exchange with that server. In fact it even binds that message partially to other fragments of the handshake; the signature also covers the client-random and server-random values. This prevents observers from fabricating completely bogus transcripts to report false-positive MITM attacks. At a minimum the ServerKeyExchange message must have originated with the server, or an attacker in possession of the same keys.
    But that alone can’t prevent observers from swapping transcripts wholesale, e.g. making two connections to the website with transcripts A and B, then uploading B for the first transaction. The website can detect this by recording messages it signed and realizing that a claimed MITM attack is in fact a confused client uploading a valid but mismatched transcript. The reconciliation point however can not make that determination without access to all TLS handshakes from that website.

CP

Forward secrecy and TLS: limits of PFS ciphersuites (part I)

Much has been made recently of switching to perfect forward secrecy in TLS. CNet lavished praise on Google for being a pioneer in this area in a puff piece (never mind that these suites were included in the TLS 1.2 spec in 2008 and shipped in Windows client and server circa 2009, for anyone who cared to use them.) The latest update to the SSL/TLS deployment best practices includes the recommendation to prioritize PFS ciphersuites when configuring a web server.

Perfect-forward secrecy

First, a bit about what PFS is. PFS ciphersuites use a nested key-exchange, adding one more step to the process of deriving the session keys used to protect the exchange of information between client and server. In ordinary TLS ciphersuites those session keys are exchanged in one step using the server’s long-lived RSA key, followed by a confirmation step to verify that both sides ended up with the same value. But that means if the server RSA key is ever compromised– even at a later date– someone who recorded a transcript of a previous handshake can go back and decrypt it to obtain those session keys. Session keys in turn allow decrypting all of the subsequent traffic.

PFS introduces a Diffie-Hellman key exchange (either plain-vanilla DH or elliptic-curve) that is in turn authenticated by the long-lived RSA or ECDSA private key. Even if that RSA key is later compromised, it is too late for the DH exchanges already protected in the past. The attacker faces the problem of solving each particular Diffie-Hellman instance, which is conjectured to be computationally difficult and related to the problem of discrete logarithms. More importantly, each DH exchange is a separate problem; there is no single “key” to break that will magically solve all of the other instances with no additional effort. (Except for the disturbing possibility that most of them are based on a small number of elliptic curves. Some discrete-log algorithms involve a precomputation phase tailored to a specific curve, after which solving individual logarithms in that curve becomes more efficient.)
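The key agreement at the heart of PFS can be illustrated with a toy Diffie-Hellman exchange. The parameters below are deliberately tiny and not cryptographically safe; real TLS deployments use standardized large groups or elliptic curves.

```python
import secrets

# Toy Diffie-Hellman parameters: a small 32-bit prime keeps the arithmetic
# readable. NOT cryptographically safe -- for illustration only.
p = 0xFFFFFFFB  # the prime 2**32 - 5
g = 5

def dh_keypair():
    """Generate a hidden random value and the public input sent on the wire."""
    x = secrets.randbelow(p - 2) + 1
    return x, pow(g, x, p)

client_priv, client_pub = dh_keypair()
server_priv, server_pub = dh_keypair()

# Each side combines its own hidden value with the input sent by the other
# side; both converge on the same shared secret, from which session keys
# would be derived.
client_secret = pow(server_pub, client_priv, p)
server_secret = pow(client_pub, server_priv, p)
assert client_secret == server_secret
```

The hidden values `client_priv` and `server_priv` never appear on the wire, which is why a later compromise of the server's long-lived signing key does not help decrypt a recorded transcript.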

Limits of forward secrecy

PFS can not prevent an attacker from decrypting future communications after coming into possession of the secret keys. But due to the way TLS implements forward secrecy, attacking such communication requires tampering with the traffic in real time, using an active man-in-the-middle attack. Armed with the server RSA keys, an attacker can impersonate the server to perform a different Diffie-Hellman exchange, using inputs chosen by the attacker instead of the original website. This allows arriving at the same session keys as the client for encrypting future traffic. To make the exploit truly transparent, the attacker then has to turn around and relay the decrypted traffic to the original website using an independent TLS connection.

Forcing an active MITM already raises the bar in three ways:

  1. Actively modifying traffic is more difficult than simply monitoring and recording it. For example setups that involve “diverting” a copy of each packet to a collection point will not work.
  2. It must be done in real-time. It is not an option to store lots of traffic, in the hopes of going back to decrypt it when keys are later obtained.
  3. It can be discovered– in principle.

#3 is where things get interesting.

Working around Diffie-Hellman exchange

When the adversary is carrying out the second part of the MITM attack– connecting back to the original server to relay the exact same traffic the user sent– she has to initiate another TLS handshake. This handshake can reuse some of the same bits sent by the original user. For example, the exact same initial key-exchange message can be recycled. The contents of that message were only protected using the long-lived RSA key of the server; by assumption our resourceful attacker already has her hands on that key and can decrypt it.

But she can not reuse exactly the same DH messages that the original user picked. Those messages were based on a random, hidden value known only to the user and never revealed during the execution of the protocol. (Recall that in a DH exchange both sides converge on the same result– which becomes the agreed-upon secret key– by combining their own hidden value with the input sent by the other side.) That means the attacker has to improvise: she substitutes a different DH input whose underlying random value she knows. That leads to a divergence in protocol messages: the bits communicated when the user (mistakenly) believed they were talking to the server are different from the bits the server received when it (mistakenly) believed it was talking to the original user.
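That divergence can be made concrete with the same style of toy DH parameters as before. Since the attacker can not reproduce the user's hidden value, her relayed handshake necessarily carries a different DH input, and the two transcript halves no longer match. (Parameters and message labels are illustrative only.)

```python
import secrets

# Toy parameters (the prime 2**32 - 5); NOT cryptographically safe.
p, g = 0xFFFFFFFB, 5

def dh_input():
    """Hidden random value plus the public DH input derived from it."""
    x = secrets.randbelow(p - 2) + 1
    return x, pow(g, x, p)

_, user_input = dh_input()                   # what the user actually sent
attacker_priv, attacker_input = dh_input()   # attacker's substitute input
while attacker_input == user_input:          # loop only to keep the toy deterministic
    attacker_priv, attacker_input = dh_input()

# Transcript fragment as seen by the user (talking to the attacker) versus
# as seen by the server (talking to the attacker relaying traffic):
user_side_transcript = ("ClientKeyExchange-DH", user_input)
server_side_transcript = ("ClientKeyExchange-DH", attacker_input)

# The two halves of the "same" session now disagree -- this is the evidence
# a transcript-reconciliation scheme would catch.
assert user_side_transcript != server_side_transcript
```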

Updated: Oct 15, to clarify PFS handshake details.

[continued]

CP

 

Physical access with PIV card: untapped potential

“Build it, and they will come” does not always work out for standards. Case in point: the sad state of physical access implementations for the US government PIV (Personal Identity Verification) card. The specification, NIST publication SP800-73, lays out an ambitious vision, supporting both physical and logical access control. The first category is access to buildings and restricted areas such as airport tarmacs. In the second category are scenarios such as smart-card logon for computers, connecting to a wireless network that uses 802.1x authentication or creating a VPN tunnel to the corporate network. The standard defines multiple public/private key-pairs and associated X509 certificates that a card can carry, intended for different purposes such as encryption or document signing. It even has some limited flexibility in choosing algorithms, supporting both RSA and ECDSA.

Strong authentication with public-key cryptography

The capabilities outlined in the PIV specification lend themselves to a straightforward physical access protocol with a high level of assurance. A very rough sketch of the interaction between card and compatible readers would run like this:

  • Cardholder presents their card to a badge reader.
  • The reader queries the PIV card for one of its digital certificates.
  • It verifies the certificate up to a trust root and performs revocation checking.
  • Then the reader extracts the public-key from the certificate and issues a cryptographic challenge to the card that can only be answered with the corresponding private key.
  • Card computes the response to the challenge.
  • Reader uses the public-key to verify that the card response is correct. If this step fails, the protocol terminates with failure.
  • If the response is correct, the reader has successfully verified the identity of the cardholder.
  • This is not quite the end of the story however, since we still have to determine whether that person is allowed access to the restricted space. Typically that involves querying a back-end system that keeps track of access rules. These rules can be arbitrarily complex. For example some users may only be granted access to restricted area during business hours. But such policies are independent of the authentication scheme used between card and reader.
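The challenge-response steps above can be sketched with textbook-sized RSA numbers. Everything here is illustrative: a real PIV card uses 2048-bit RSA or ECDSA, and the private key never leaves the card's secure element.

```python
import secrets

# Textbook RSA key (p = 61, q = 53): far too small for real use,
# chosen only so the arithmetic is easy to follow.
n, e = 3233, 17   # public key, as carried in the card's certificate
d = 2753          # private key, held only inside the card

def card_respond(challenge: int) -> int:
    """Card: answer the challenge with the private key it never reveals."""
    return pow(challenge, d, n)

def reader_verify(challenge: int, response: int) -> bool:
    """Reader: check the response using the public key from the certificate."""
    return pow(response, e, n) == challenge

# Reader issues a fresh random challenge on every tap.
challenge = secrets.randbelow(n - 1) + 1
assert reader_verify(challenge, card_respond(challenge))

# A clone that merely replays a response captured for some earlier
# challenge cannot answer the new one.
old_challenge = (challenge + 1) % n  # guaranteed to differ from `challenge`
replayed_response = card_respond(old_challenge)
assert not reader_verify(challenge, replayed_response)
```

The freshness of the challenge is what distinguishes this design from the static-data schemes described below: copying bits off a genuine card is not enough.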

Reality: static data, no authentication

In reality, many readers that claim to support PIV cards do not implement anything near this level of security assurance. To take one example: the RP40 is a widely-deployed contactless reader from HID’s multiCLASS family. Along with the legacy 125KHz band and the flawed, broken HID iClass protocol, the reader supports the modern 13.56MHz band associated with NFC.

The PIV card also happens to be dual-interface, meaning it can be used either through the metal contact plate on the card surface or wirelessly, by holding the card in the induction field generated by an NFC reader. The standard goes to great lengths to distinguish between NFC and contact-based usage, describing which operations are permitted in each case. Of the different key-pairs specified in the PIV standard, only one– the card authentication key– can be used over NFC. The others are only accessible over the contact interface. (This restriction correlates with the requirement for PIN entry: any key that requires PIN entry prior to use can only be invoked over the contact interface.)

RP40 specifications state that these readers support the “US Government PIV” standard. In principle, then, RP40 readers could have implemented a sound public-key based cryptographic protocol, compliant with the PIV standard, by using the card-authentication key along the lines sketched above. But it turns out they don’t. Much like other early-generation PIV-compatible readers, they rely on one of two pieces of static data:

  • UID associated with the card. This operates at the NFC layer, independent of the PIV standard. The UID is supposed to be a unique identifier for NFC tags. In reality it is neither guaranteed to be unique across all tags nor stable. Some cards deliberately emit a random UID that changes on each NFC activation, as a privacy measure designed to deter tracking. The NFC standard only depends on the UID being unique among multiple tags introduced into the reader field at the same time, for so-called “anti-collision” purposes. It is not intended to be used for authentication. While genuine NFC tags are required to have globally unique identifiers burnt in at the factory, counterfeit chips exist that allow changing the UID to masquerade as any other tag.
  • CHUID, or card-holder unique identifier. Despite the name similarity, the CHUID is a data object defined by the PIV standard. It is just a static piece of information stored on the card. It may have its own signature or other integrity protection, but that signature is also static. The CHUID can be trivially copied to another card and replayed. (Incidentally, an update to FIPS 201, the basis for the PIV standard, clarified this further and deprecated the use of the CHUID for access control.)

In neither case is there a challenge-response protocol to verify that the static data emitted by the card was not cloned from a legitimate one. In fairness, HID also has a newer line of readers called pivCLASS which does implement proper authentication, using either the card-authentication key over NFC or the PIV authentication key with a card slot and numeric keypad for PIN entry. But this is a relatively recent offering, specifically targeted at the government sector. Many commercial office buildings– including this blogger’s current and previous office locations– have an installed base of HID multiCLASS readers. Ripping out readers and installing new ones is a difficult proposition. Until they are upgraded, physical access with PIV falls short of its full potential.

CP

Using cloud services as glorified drive: a wishlist (part VIII)

A recap from this series of posts exploring the idea of creating a private cloud storage system (where the service provider can not read user data even if they want to or are compelled to) using only commodity systems:

Given that none of these proof-of-concept implementations were practical, time to ask a different question: what would an ideal cloud storage system look like?

1. 100% user ownership of keys

Cryptographic keys used for encryption are generated by the user and stored only on user-owned and operated devices. Keys are never “loaned” to the cloud provider, not even temporarily to perform on-the-fly decryption when the user is accessing data; otherwise the provider could make a copy of the key, extending that “temporary” access into permanent access. Similarly, keys can not be stored by the cloud provider, not even in password-protected form, because that would permit the provider to mount an offline attack to guess the password. (SpiderOak fails this criterion, as noted earlier.)

2. Locally installed and managed application

The code performing encryption/decryption must be a locally installed client application. This allows the user to exercise much better control over the behavior of that code and guard against unauthorized changes. This is in contrast with on-the-fly delivery of such code from the cloud provider, as in the case of a web page. If the encryption logic were embedded in JavaScript loaded every time from the cloud service, it would be trivial for that service provider to go rogue and serve modified logic that surreptitiously copies user keys or otherwise subtly undermines the integrity of the encryption. This is exactly what happened with Hushmail, which relied on JavaScript code delivered by the service provider into the user’s browser.

Of course the line between “locally installed”– what used to be called shrink-wrap software in the days when applications came in boxes lining store shelves– and “web-based” is increasingly blurred. Even local applications can have update mechanisms that call home and receive additional pieces of code to execute. Depending on the vendor, such a mechanism may or may not provide any user control. Microsoft, for instance, tends to make automatic updates at least opt-out. Google, on the other hand, typically favors forcing updates on users.

In this model the software publisher could slip in malicious logic targeted at specific users to undermine the encryption process. That said, this is a relatively high-risk move. In principle such a backdoor would be discoverable if someone went to the trouble of reverse-engineering updates. It would also be undeniable for the publisher, since software updates are typically digitally signed. A reputation for delivering malicious updates can be limiting for future business prospects. (As an aside, Hushmail also has an option for using a Java applet, touted as a “safer” option– never mind that Java in the browser has been a source of constant vulnerabilities. But that applet itself comes from Hushmail, so there is no reason the service provider could not tamper with its logic if it were inclined to do so.)

3. Standardized, modular protocol for cloud synchronization

To avoid the type of situation described earlier, it is best to decouple the local software that provides encryption from the remote service that provides storage of bits. Ideally this means a modular design: multiple local encryption schemes can be coupled with a given storage provider. Conversely for a given local encryption scheme, there are multiple providers who can store the resulting ciphertext, competing on factors such as space allocated, bandwidth and cost. This relieves the user from depending on a single entity for both providing the encryption logic and storing the resulting ciphertext. More importantly it gives users full control over the implementation: if they distrust a particular software publisher, they can choose a different interoperable one.
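A minimal sketch of this modular split, under stated assumptions: the class names are hypothetical, a toy hash-based stream construction stands in for real authenticated encryption (AES-GCM or similar), and an in-memory dictionary stands in for a remote storage API. The point is only the interface boundary: any encryptor can be paired with any storage provider.

```python
import hashlib
import os

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher built from SHA-256; illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))

class LocalEncryptor:
    """User-controlled piece: the key never leaves this object."""
    def __init__(self):
        self.key = os.urandom(32)
    def seal(self, plaintext: bytes) -> bytes:
        nonce = os.urandom(16)
        return nonce + keystream_xor(self.key, nonce, plaintext)
    def open(self, blob: bytes) -> bytes:
        nonce, ciphertext = blob[:16], blob[16:]
        return keystream_xor(self.key, nonce, ciphertext)

class DictStorageProvider:
    """Stand-in for any remote storage service: sees only ciphertext."""
    def __init__(self):
        self._blobs = {}
    def put(self, name: str, blob: bytes):
        self._blobs[name] = blob
    def get(self, name: str) -> bytes:
        return self._blobs[name]

# The same local encryptor works against any interchangeable provider.
enc = LocalEncryptor()
for provider in (DictStorageProvider(), DictStorageProvider()):
    provider.put("notes.txt", enc.seal(b"private data"))
    assert enc.open(provider.get("notes.txt")) == b"private data"
```

Swapping providers requires no change to the encryption side, which is exactly the leverage users gain from decoupling the two roles.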

4. Encryption at individual file-level

This is primarily to simplify access from multiple devices. It is much easier to merge or manage changes at the granularity of individual files than at the lower level of filesystems. The reason Bitlocker-based designs did not handle concurrent access well is that a filesystem is effectively a global structure that can not be managed piecemeal by multiple devices unaware of each other. The worst-case scenario would be one device overwriting an edit made elsewhere, but this is far more amenable to existing technologies for tracking/merging changes, as long as all versions of the file can be retrieved from the cloud.

CP

Reminder: oauth tokens and application passwords give attackers persistence (part II)

[continued from part I]

The password anti-pattern

The Oauth protocol is an example of design-by-committee. It started out as a solution to a simple data-sharing problem. Before long it branched out into a series of edge-cases for solving every possible use-case, while blurring the line between authentication and authorization along the way.

The starting objective can be plainly stated as: allow user data to be reused across websites. To take a contemporary example, suppose LinkedIn wants to access user contacts from Gmail in order to suggest existing professional connections by comparing email addresses. The original approach adopted by every website in these situations came to be called the password anti-pattern. LinkedIn simply asked users to type in their Gmail password, then turned around and impersonated the user to Google, logging into their account to scrape contacts. (We could also call it “institutionalized phishing” but when respectable web services engage in the practice, a more neutral expression is preferred in polite company. Incidentally LinkedIn has been sued over their aggressive contact scraping, and the plaintiffs allege “hacking” into user accounts. That sounds like a creative attorney describing this practice of impersonating users with their password.)

There are many problems with the password anti-pattern. It trains users to get phished by creating the misleading impression that it is OK for any website to ask for any other website’s password. It is not compatible with two-factor authentication because it assumes that only a password is needed. (To add insult to injury, LinkedIn could also have asked for the one-time passcode since 2-factor authentication with OTP is still susceptible to phishing. Luckily they have not gone that far.) Finally any access granted will be lost when the user changes their password, requiring another round of collection.

Oauth addressed this problem by defining a protocol for the user to grant one website (the “consumer”) access to specific resources associated with that user at another website (the “service provider”). Not only does this avoid password sharing but it offers fine-grained access control: LinkedIn could request permission to access contacts only, without getting access to email or documents, for instance. The end result of completing the oauth consent flow is an access token, obtained by the consumer, that can be used to access user data in the future.
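The end state of the consent flow can be modeled in a few lines: a scoped token that unlocks contacts but nothing else. All names here are illustrative, and a real flow involves browser redirects, consent UI and token endpoints that this sketch elides.

```python
import secrets

class ServiceProvider:
    """Toy resource provider that mints and enforces scoped access tokens."""
    def __init__(self):
        self._tokens = {}  # token -> set of granted scopes
        self.user_data = {"contacts": ["alice@example.com"],
                          "email": ["private message"]}

    def grant(self, scopes):
        """User approves the consumer's request: mint a scoped token."""
        token = secrets.token_hex(16)
        self._tokens[token] = set(scopes)
        return token

    def fetch(self, token, resource):
        """Consumer presents the token; access is limited to granted scopes."""
        if resource not in self._tokens.get(token, set()):
            raise PermissionError(f"token not authorized for {resource}")
        return self.user_data[resource]

provider = ServiceProvider()
token = provider.grant(["contacts"])  # consent flow granted contacts only

assert provider.fetch(token, "contacts") == ["alice@example.com"]
try:
    provider.fetch(token, "email")    # outside the granted scope
    raise AssertionError("should have been denied")
except PermissionError:
    pass
```

Note that the token works without any password: this is precisely why, as discussed below, it also gives an attacker a persistent foothold if granted during a brief account compromise.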

Oauth for unauthorized access

By the definition of the earlier post, oauth counts as an “alternative account access” mechanism. It can be used independently of passwords or any other credentials to access user data. Of course if this works for legitimate websites the user intended to grant access to, it works just as well for websites controlled by attackers. After gaining temporary access to an account, an attacker can go through the oauth approval flow and grant her own website access to all possible user resources associated with that service provider.

Oauth for client applications

The original oauth use case was an example of an authorization problem: controlling access to resources. Oauth did not prescribe how users authenticate at either the consumer or the resource provider. Almost immediately the protocol came to be repurposed for different use-cases: accessing user data from devices and client applications. The distinction between these two is becoming blurred. Originally the first category was intended to cover special-purpose appliances such as DVD players or gaming consoles, while the second referred to applications running on commodity platforms, such as a Windows desktop application or a mobile app on iPhone.

Both have two distinguishing features. At a superficial level, they lack the standard web browser interface for interacting with the ordinary oauth approval flow. More importantly, the ultimate destination for user data is a device he/she owns, as opposed to a service in the cloud with its own distinct identity. This is a somewhat bizarre notion of “authorization”: devices and applications are not independent actors with their own volition. In traditional security models, they are perceived as agents working on behalf of the user without any distinction made. Accessing Netflix from a DVD player is not a case of “authorizing” the DVD player to download movies, any more than logging into a banking website is an act of “authorizing” the web browser to access financial data.

Oauth and Android

Android relies heavily on this model for managing Google accounts. Because authentication on mobile devices is highly inconvenient, the operating system attempts to do it only once and persist some type of credential for the life of the phone. When the user sets up their account on ICS and newer flavors of Android, an all-powerful oauth token is stored by the account manager. This token has the special login scope: it can be used to obtain oauth tokens for any other scope. Much like other access tokens, it can be revoked by the user. Unlike ordinary oauth tokens, it is invalidated automatically on a password change, providing some damage control when recovering from account hijacking.
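A toy model of this arrangement, assuming (as described above) a login-scope token that mints tokens for other scopes and is invalidated by a password change. The class and method names are hypothetical; Google's actual account manager works differently in its details.

```python
import secrets

class AccountService:
    """Toy account backend with login-scope tokens tied to the password."""
    def __init__(self):
        self.password_generation = 0   # bumped on every password change
        self._login_tokens = {}        # login token -> generation at issuance

    def issue_login_token(self) -> str:
        token = secrets.token_hex(16)
        self._login_tokens[token] = self.password_generation
        return token

    def mint_scoped_token(self, login_token: str, scope: str) -> dict:
        generation = self._login_tokens.get(login_token)
        if generation != self.password_generation:
            raise PermissionError("login token revoked or invalidated")
        return {"scope": scope, "token": secrets.token_hex(16)}

    def change_password(self):
        self.password_generation += 1  # all outstanding login tokens die

svc = AccountService()
login = svc.issue_login_token()
assert svc.mint_scoped_token(login, "mail")["scope"] == "mail"

# Damage control after hijacking: a password change kills the login token.
svc.change_password()
try:
    svc.mint_scoped_token(login, "mail")
    raise AssertionError("stale login token should be rejected")
except PermissionError:
    pass
```

The password-change invalidation is what gives the user a recovery lever that ordinary oauth tokens, which survive password changes, do not provide.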

[continued]

CP