Temporarily using a Nexus S in Istanbul

Some pitfalls for the unwary, before popping in a new SIM:

  • Switching SIMs removes passwords from saved accounts and breaks existing sync. This is a general property of Android, and perhaps someone can explain the reason for this “feature.” Conspiracy-minded critics are likely to cry “carrier-humping surrender monkeys!” again: the SIM is the instrument of customer lock-in for carriers, so why create one more hurdle for switching providers, even when the switch is temporary? Replacing the original SIM does not recreate the lost credentials. Granted, this is not irreversible: account names are still persisted and one can retype passwords– although entering symbols and punctuation marks on the inane virtual keyboard can be quite frustrating. Let’s not even get started on the difficulty of obtaining access codes for accounts set up with the new 2-step verification feature. It is not clear what threat this is defending against; merely removing the SIM without replacing it does not have this effect. Only inserting a new SIM appears to trigger the behavior, so it is useless in theft scenarios where the adversary removes the SIM to prevent remote-wipe instructions. Incidentally, it would be a real security feature if credentials were stored on the SIM card and never exported, with an applet on the SIM responsible for authentication. After all, the SIM is the only ubiquitous secure element found in every GSM phone. The carrier lock-in effect would persist, but at least there would be a redeeming virtue in improved protection for credentials. Unfortunately the contents of the SIM are tightly controlled by carriers, and uploading your own Javacard applet there for other useful functionality has been a non-starter as far as business plans go. This is a major squandered opportunity for improving authentication across the board.
  • Configure the OS not to lock the SIM card. In the US most SIM cards do not require a PIN; in Turkey they apparently do– all the prepaid Turkcell cards I have seen had both the regular PIN and PIN2 for restricting call numbers. This adds one more step to the phone unlock process, on top of the pattern or existing passcode. A better design would have been for the operating system to realize that there is already an existing lock mechanism for the device and cache the PIN automatically. (That said, the screen lock is easier to bypass, as it is implemented in software; even the smudge patterns left on the screen have been shown vulnerable recently. By comparison the tamper-resistant SIM enforces its own lockout mechanism against guessing attempts.)
  • Mysteriously, navigation does not work. Google Maps itself works like a charm– at least for now; Turkey does have a track record of blocking/unblocking Google services at seemingly random intervals. Not surprisingly, GPS is very accurate and turn-by-turn directions are correct. But the device does not switch into navigation mode, hanging on “checking if navigation is available.” Fail.

CP

Stuxnet and collateral damage

To update von Clausewitz’s maxim for contemporary times: “Malware is the continuation of politics by other means.” This is one of the lessons from the ongoing Stuxnet debate: targeted computer attacks have become part and parcel of nation states’ arsenal for carrying out foreign policy objectives.

There have been solid technical analyses of Stuxnet’s complex inner workings, but the debate on policy implications is only now starting in earnest. One question that has been overlooked is the extent of collateral damage tolerable in carrying out this type of attack.

Stuxnet was an odd combination of being targeted very precisely and casting an extremely wide net. The malicious payload that infected industrial controllers only kicked into gear when it detected a very specific environment, believed to represent the uranium enrichment plant operating in Iran. On the other hand, because software development for such critical facilities typically takes place behind air-gapped networks, the worm had to be released into the wild. Its humble beginnings were no different than the self-propagating malware that wreaked havoc in the past: Code Red, Nimda, Blaster, Slammer, … Except Stuxnet was light-years ahead of its predecessors in terms of sophistication and the sheer number of different vectors used to infect new targets.

Because it was after a very specific target that would not be reachable directly from the Internet, the designers threw the kitchen sink at the problem, including an exploit that allowed the malware to propagate by USB drives between machines. This meant Stuxnet would eventually reach places that vanilla malware does not, including compartmentalized networks that had been assumed to be isolated from the warzone that is the Internets. Stuxnet was designed to explore every nook and cranny in that space, in pursuit of its ultimate target, the programmable logic controllers destined to spin enrichment centrifuges. Given its non-discriminatory approach to spreading, it is surprising that most of the infections remained contained in Iran, with smaller numbers in Indonesia and India– countries starting with “I” apparently did not fare well. By comparison the number of infections in the US was not significant. The first question, then, is what other systems are “fair game” on the way to reaching an objective. The Stuxnet case is complicated by the fact that the presumed target is not directly reachable. Intermediate stepping stones are required to get there, which may end up being personal computers, Internet cafes, anything that is ultimately connected to the persons of interest in some unexpected six-degrees-of-separation logic. (This brings to mind the quote from Robert H. Morris Sr: “To a first approximation, every computer in the world is connected with every other computer.”) Worse, the connections are not known in advance: it is a massively parallel search, exploring every possible path in hopes that one may cross paths with the actual target. Such expansive views on scope risk turning every machine in the world into collateral damage in the name of reaching the destination.

The second dimension concerns damage. On most machines it infected, Stuxnet did nothing but propagate to other targets. Again there is a similarity to the massive worm outbreaks of the good old days– with the exception of Witty, most contained no malicious payload. Even if it happened to land on a computer where some unlucky engineer had been tasked with developing software for industrial controllers in an unrelated industry, the tampered product would likely have worked flawlessly in its intended environment. This is not to say that there was no cost to Stuxnet for those in its path: there is still time and productivity wasted on removing the malware from the system, both for individuals and companies. On the other hand, the economic impact for software vendors is murky. Antivirus vendors benefit from trumping up scare stories, and this one fits the bill perfectly, complete with cloak-and-dagger nation-state implications. Similarly it is difficult to argue that MSFT suffered great expense in addressing the vulnerabilities implicated in Stuxnet, considering their leisurely patch schedule in the presence of known 0-days.

In any case, it is misleading to focus on the designers’ intent in not harming systems– far from being a magnanimous gesture on their part, it was simply following best practices in malware design. Noisy/buggy malware is the kind that gets noticed and removed. Stealth is a survival strategy: even run-of-the-mill keystroke recorders designed to steal credit cards in the name of petty theft strive to be very stable. Vandalizing user data, blue-screening the system or displaying in-your-face popup advertisements is the surefire way to get your malware noticed by an AV vendor. (Interestingly enough, Stuxnet was noticed by Kaspersky and filed away as vanilla malware a full year before its inner workings were properly understood.) The problem is that modern operating systems are incredibly complex, and it is not possible to ensure that malware lives up to its promise of zero collateral damage. When Robert Morris Jr. released the Internet worm, he intended it to propagate only, with no malicious payload and barely noticeable load on infected systems. But a slight miscalculation/bug in the logic caused it to overwhelm networks and machines. Even MSFT can not ship software updates without breaking users in some unexpected, obscure configuration– and they have much higher QA expertise and a larger test matrix than organizations developing malware.

The network infrastructure has long been a battleground, with participants of every scale, from hobbyist vandals to organized crime groups and nation states, duking it out with packets. The question raised by Stuxnet is whether these frontlines will expand to include the machines owned/used by ordinary citizens, turning them into dispensable pawns in pursuit of an elusive objective.

CP

Choices and security: when designers can not decide

(Reflections on Joel Spolsky’s talk at the Google NYC office last week.)

Joel Spolsky has previously harped on the problem of arrogant UI design interrupting users with self-important questions on trivial settings– how many items to display under recently opened files, whether to upgrade to release R8, etc. This is one of the main themes in his 2001 book “User Interface Design for Programmers”– the options/preferences menu, to paraphrase Spolsky, is a record of all the design controversies the developers ever faced and failed to resolve decisively, punting them to the user. Given the mediocre quality of most UI design, it is difficult to argue with this. In fact finding hilariously awful examples of lame dialogs popping up at inopportune moments is about as difficult as shooting fish in a barrel. But two of the points cited in the talk deserve closer scrutiny.

One example came from the Options dialog in Visual Studio. There are literally hundreds of possible settings to tweak in that particular application, and bringing up that dialog must be like opening Pandora’s box. But there is a big difference between an element of the interface that the user intentionally seeks out versus one that interrupts the primary activity with a question the user is likely not interested in at that point. This is similar to the “about:config” option in Firefox– no one would fault the Firefox developers for burying ultra-advanced options, such as whether to enable the ecdhe_ecdsa_des_ede3_sha cipher suite in TLS. It would rightly justify ridicule if Firefox asked this question in the middle of connecting to a website, or even displayed a checkbox for it under the security-options tab; but they did not. Clicking past the semi-humorous warning about voiding your warranty implies an assumption of risk that complex beasts lie ahead.

The second example is the standard Authenticode dialog from Windows, the dreaded “do you want to install software published by Acme Inc?” question. A former colleague at MSFT who also worked on IE once joked that the text should be replaced with “Do you feel lucky today?” (Being polite, our software would drop the modifier from the original Dirty Harry version.) The user often has exactly zero context to make a decision more informed than flipping a coin. Let’s suspend disbelief for a moment and pretend that certificate authorities were competent, and that the company name displayed in the dialog accurately represented the identity of the software publisher, with no misleading, sound-alike names. There are thousands of companies publishing software for Windows. A handful may have brand recognition: if the dialog claims the ActiveX control is signed by Microsoft, chances are it is not intentionally malicious. (Of course this does not mean that it is not buggy or does not contain an unintended security vulnerability that will still lead to grief– only that the developers started out with “good intentions,” assuming their interests are aligned with those of the user.) The vast majority of developers are not household names. Worse, the bundling of spyware means that even publishers with the benefit of name recognition, such as Kazaa and Morpheus in the heyday of P2P file sharing, had a dubious record of shipping adware.

In other words, Joel Spolsky is right: the user is not in a great position to make this security decision, because they have very little information to go by. Unfortunately the designers of the software are in an even worse position: they are just as ignorant of the facts, and worse, they do not share the user’s value judgments.

Going back to that Authenticode prompt: its designers are no more prescient than the user in divining the quality of software development practices or for that matter the integrity of the business model from the vendor name. MSFT provides the platform for independent software vendors; grading the efforts of those vendors has traditionally been a matter for customers voting with their dollars.

Most of the obvious security decisions are already settled by reasonable defaults. IE no longer prompts users to decide what to do about an expired certificate issued from a trusted authority with a mismatched name. It practically dead-ends the user in a semi-threatening error page that is very difficult to get past. This is the easy case: designers can make the right call with high confidence. Here they made the call that SSL depends on certificates validating correctly, and if you can not configure your website correctly, you deserve to lose traffic. The first one is a fact, the second a value judgment, and a relatively new one at that: it certainly was not the case in the early days of the web, when “making it work” took priority over security. Yet it is a sentiment most people will agree with today, except for the clueless website owners still struggling with their certificate setup. For most of the interesting trust decisions, there are no such clear-cut answers.

Second, designers may face significant legal concerns: if they favor installing software from Acme but not from its competitor, legal sparks will fly. This is why efforts to classify malware need air cover from watertight definitions of spyware, applied consistently, to leave no room for allegations of playing favorites.

Finally, designers and users differ in their values. This is a case where deciding on behalf of the user is the arrogant and presumptuous option. For a moment replace “Acme Inc” with “Government of China.” Do we want the publisher deciding that it is OK to trust software authored by the Chinese government for automatic install? One can decry the sad state of compartmentalization in modern operating systems, but the current reality is that installing an application has significant consequences. This is not a cosmetic change to the appearance of a seldom-used menu or the color of the background: the confidentiality and integrity of everything the user has on that computer is at stake. Fundamentally this user is facing a trust decision. Designers can not make that decision for him/her, because everyone has different values predisposing them to embrace certain institutions wholeheartedly while being inherently skeptical of others. They have different levels of risk tolerance– the Internet cafe user looking for the proverbial dancing-squirrels clip versus the attorney with confidential documents to protect. This is one case where the decision belongs to the user.

CP

Unlinkable identifiers and web architecture: connecting the dots (3/3)

[Final piece of the series, see first and second posts.]

The greater challenge with trying to create unlinkable user identifiers on the web is the ease of linking them online. Standard models of “linking” assume that the sites the user visited get together offline, long after the user has visited both of them, and try to ascertain whether two activity sequences they observed belong to the same user. It is relatively easy in this model to come up with ways of assigning identifiers to users that are deterministic, unique to each site and computationally difficult to link even when multiple sites collude.
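Such a scheme can be sketched concretely. The snippet below is a hypothetical illustration (the key, names and truncation length are all made up): an identity provider derives a deterministic per-site identifier with HMAC, so each site sees a stable name that other sites can not link without the provider’s secret.

```python
import hmac
import hashlib

def pairwise_id(master_key: bytes, user_id: str, site: str) -> str:
    """Derive a deterministic per-site identifier for one user.

    Without master_key, linking the outputs for two different sites is
    as hard as breaking HMAC-SHA256, yet the same (user, site) pair
    always maps to the same identifier.
    """
    msg = f"{user_id}|{site}".encode()
    return hmac.new(master_key, msg, hashlib.sha256).hexdigest()[:16]

key = b"identity-provider-secret"  # hypothetical provider secret
print(pairwise_id(key, "alice", "bookstore.example"))   # stable per site
print(pairwise_id(key, "alice", "videostore.example"))  # unlinkable to the first
```

Determinism matters here: the bookstore must see the same #456 on every visit, while the video store’s #123 reveals nothing about it.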

The problem is, websites are not constrained to this simplistic attack model. Even today user tracking on the web involves a type of collusion enabled by one of the elementary assumptions in HTML: any website is free to include content from any other website. That is by design. Any website can include an image, a frame, a video or a song from another website. That means the web browser will automatically follow hypertext links crafted by one site and pointing to another. That is a problem for privacy– a link can encode arbitrary information.

Consider an authentication system that assigns cryptographically unlinkable identifiers to users. The movie rental website knows this user as #123 while the bookstore knows her as #456. Enterprising marketing teams at these websites decide they want to collude and link user information. The end goal is that the bookstore learns the user’s movie preferences and the video store gets an idea about her library, in the hopes that they can create personalized offers. This is a tall order when users are offline, because there is no unique identifier to key off. (Assuming we suspend disbelief– in reality credit card numbers or shipping addresses are the fly in the ointment, as explained in the second part.) Instead they must capitalize on the window of opportunity when the user is online and logged into both sites.

Every page on the bookstore website has an image or other embedded content pointing to the video download site, and vice versa. Using transparent 1×1 images is customary for this purpose, but such attempts at stealth are not required. The link for the embedded content contains the pairwise-unique identifier for the user as observed by one site. When the user follows that link, they are going to be communicating two identifiers:

  1. The first is implicitly encoded in the link crafted by the sender, say #123. This is the identifier observed by the originating site, and it will be encoded in the URL or another piece of the request, such as the form fields.
  2. The second is explicitly asserted in the authentication protocol used by the destination. This is the identifier associated with the destination, say #456.

At this point the destination site has enough information to link the two: user #123 at the bookstore is the same person as user #456 over here. Once that association is made, databases can be joined offline: everything about her book purchases can be joined against everything known about her tastes in bad 1980s cinema.
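A hypothetical sketch of what the destination’s beacon handler might do (the URL shape, cookie table and all names are invented for illustration): the foreign identifier arrives inside the crafted link, the local identifier comes from the session, and one dictionary entry is all it takes to join the two databases later.

```python
from urllib.parse import urlparse, parse_qs

# The destination's session table: cookie -> local user id (fabricated data).
sessions = {"cookie-abc": "#456"}
linkage = {}  # local id -> foreign id, accumulated as beacon requests arrive

def handle_beacon(request_url: str, cookie: str) -> None:
    """Record the foreign identifier smuggled inside an embedded-content URL."""
    foreign_id = parse_qs(urlparse(request_url).query)["uid"][0]
    local_id = sessions.get(cookie)
    if local_id is not None:
        # The two databases can now be joined offline via this mapping.
        linkage[local_id] = foreign_id

# "%23" is the percent-encoding of "#": the bookstore encoded its id #123.
handle_beacon("https://videostore.example/pixel.gif?uid=%23123", "cookie-abc")
print(linkage)  # {'#456': '#123'}
```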

Granted, this attack has some significant limitations: the user must be authenticated at both sites simultaneously, or at least have some persistent identifier (such as a cookie) stored on both sites that encodes their identity. This turns out not to be a significant limitation, since users do authenticate to multiple sites in a single browser session, and in any case they need to fall into this trap just once for the permanent linkage to be created. A bigger problem is that linkage is limited to a pair of websites only. If 10 websites need to collude, there are 45 pairs of identifiers to sort out, so this approach implemented naively would not scale.
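The quadratic growth is easy to check: n colluding sites need C(n, 2) pairwise linkages.

```python
from math import comb

# Pairwise linking scales quadratically: n colluding sites need C(n, 2) links.
for n in (2, 10, 100):
    print(n, "sites ->", comb(n, 2), "pairs")
# 10 sites -> 45 pairs, 100 sites -> 4950: hence the appeal of a central hub.
```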

Fortunately or unfortunately, depending on the perspective, having each pair of websites in the conspiracy link identifiers is not necessary. A much simpler solution is to designate a single tracking agent against which everyone else’s identifiers are linked. Every website embeds content from this one site, which observes and stores all of the identity pairs seen together.

In the real world, of course, such tracking agents go by a more mundane name: advertising network. Display advertising networks have the unique benefit that they in fact appear as embedded content on any number of websites, by design. Any time a user is authenticated to multiple sites and these sites contain third-party content hosted by the network, there is an opportunity to link the two identities together. In fact explicit authentication to the advertising network is not required: even a weak, temporary identity such as a session cookie works. Any time more than one external ID is observed, that creates a permanent record. If the network observes #123 and #456 appearing in one session today, and sees #456 and #987 in an independent session tomorrow, the conclusion is that all three identities are linked.
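The transitive linking described above is, in effect, a union-find computation. The sketch below is an illustrative model, not any real network’s code: every set of identifiers observed together in one session is merged, so #123 and #987 end up linked through the shared #456.

```python
class IdentityLinker:
    """Union-find over identifiers observed together in browser sessions."""

    def __init__(self):
        self.parent = {}

    def _find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def observe_session(self, ids):
        """All identifiers seen in one session belong to the same person."""
        ids = list(ids)
        for other in ids[1:]:
            self.parent[self._find(other)] = self._find(ids[0])

    def same_person(self, a, b):
        return self._find(a) == self._find(b)

linker = IdentityLinker()
linker.observe_session(["#123", "#456"])  # today's session
linker.observe_session(["#456", "#987"])  # an independent session tomorrow
print(linker.same_person("#123", "#987"))  # True
```

The key point the data structure makes vivid: the linkage is permanent and transitive, so the hub never needs to see all identifiers at once.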

What this suggests is that until automatic loading of embedded content on pages is controlled better, unlinkable identities will be facing an uphill battle against one of the basic design principles behind the web.

CP

Incompetent by design: why certificate authorities fail

(And more importantly, why we can not do anything to fix that.)

It sounds like a great business model, one that ought to be advertised on late-night TV infomercials. It might work this way– imagine a loud, overenthusiastic pitchman reciting:

“Become a certificate authority! For a few pennies a day, you can sign largely meaningless statements that users around the world depend on to protect their Internet communications. Forget about hard work and all those complicated problems of identity verification. Let your servers figure out the trustworthiness of the customers. And if they get it wrong? Hey, who cares– your certification policy statement indemnifies you from any damages resulting from gross negligence.

And if you call in the next 10 minutes, we will even allow you to issue EV certificates!”

There is an abundance of evidence that certificate authorities are by and large incompetent, clueless and outright bone-headed:

  • Bogus Microsoft code-signing certificates, valid until 2017. (Yes, they are revoked, but look in your Windows cert store: under “Untrusted Certificates” you will see these two, shipping in every version of Windows out of the box, marked with the equivalent of a danger-Will-Robinson sign just in case the revocation check failed for some reason.)
  • The real Mozilla: bogus Mozilla.com certificate issued for the distribution website of Firefox.
  • Circular dependency on DNS. Dan Kaminsky’s work on DNS vulnerabilities revealed that some certificate authorities verified ownership of a domain simply by sending email to that domain– a security solution designed to be resilient in the face of a completely 0wned network is rooted in the assumption that email routing can be trusted?
  • Having an email address with the word “admin” in the title does not entitle the user to SSL certificates for that domain. This is a lesson email-hosting providers found out the hard way, when free email accounts with authoritative-sounding names were enough to convince CAs to issue certificates to the user who squatted on that account name.
  • The crowning achievement: RapidSSL continued to use MD5 for issuing new certificates in 2008– four years after the pioneering work of Xiaoyun Wang made it clear that MD5 can not be used for new signatures due to the possibility of hash collisions. The resulting attack, in spite of being an entirely predictable train wreck in slow motion, was spectacular enough to earn best-paper honors at Crypto 2009.
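The collision issue behind that last bullet can be demonstrated at toy scale. The sketch below truncates MD5 to 24 bits as a stand-in for a weak hash (purely illustrative parameters) and finds a collision by brute-force birthday search in a few thousand tries– the 2**(n/2) effort that makes a broken hash unusable for new signatures, since any scheme that signs only the digest certifies both colliding messages.

```python
import hashlib
from itertools import count

def toy_hash(data: bytes, nbytes: int = 3) -> bytes:
    """Truncated MD5: a stand-in for a weak hash (24-bit output here)."""
    return hashlib.md5(data).digest()[:nbytes]

def birthday_collision(nbytes: int = 3):
    """Find two distinct inputs with the same digest.

    By the birthday bound this takes roughly 2**(8 * nbytes / 2),
    i.e. ~4096 tries for a 24-bit digest.
    """
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = toy_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg
        seen[h] = msg

a, b = birthday_collision()
assert a != b and toy_hash(a) == toy_hash(b)
print("collision:", a, b)
# Any signature computed over toy_hash(cert) now vouches for both a and b.
```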

It’s not that every certificate authority is clueless, but the abundance of these examples makes it clear that somewhere, someone is bound to do something remarkably dumb. That brings us to problem #1:

It takes only one CA to bring down the system. To simplify the picture a bit, operating systems have a set of “trust anchors”– the list of certificate authorities they are willing to trust for important assertions, such as the identity of a website the user is about to share some personal information with. These anchors are interchangeable: a certificate issued by one is trusted as much as any other. Any one of them is good enough to show the padlock icon in IE, the yellow address bar in Chrome, or … (In an ingenious marketing scheme, a new category called extended validation or EV was created to separate the “competent” CAs from the hoi polloi. We will address in the second part of this post why EV is a very profitable type of delusion.)

This is a classic case of the weakest link in the chain. Successfully compromising any one CA is enough to defeat the security guarantees offered by the public-key infrastructure (PKI) system they constitute collectively.
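As a toy model of this weakest-link property (issuer names and relationships are fabricated, and real chain building with signatures is elided): a validator accepts a certificate if its chain terminates at any trusted root, so a single rogue or compromised root is exactly as authoritative as the most diligent one.

```python
# Toy model of the "weakest link" property: a chain is accepted if it
# terminates at ANY root in the store, so one rogue root defeats them all.
trusted_roots = {"Verisign", "Skaitmeninio sertifikavimo centras", "RogueCA"}

# Leaf -> issuing root (fabricated for illustration; signature checks elided).
issued_by = {
    "www.bank.example": "RogueCA",   # forged certificate from a bad root
    "www.real.example": "Verisign",  # legitimate certificate
}

def chain_valid(leaf: str) -> bool:
    """Accept the leaf if its issuer is any member of the root store."""
    return issued_by.get(leaf) in trusted_roots

print(chain_valid("www.bank.example"))  # True: the rogue root is as good as any
print(chain_valid("www.real.example"))  # True
```

Nothing in the validation logic ranks one root above another, which is precisely the economic problem discussed next.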

A quick peek at the Windows certificate store in XP reveals somewhere north of 100 trusted certificates. Among these, Verisign and Microsoft are probably the only recognized brands. How many users have heard of Starfield Class 2? UserTrust, based out of Salt Lake City, Utah? For some international flavor, try Skaitmeninio sertifikavimo centras, based in Lithuania. There is also “NO LIABILITY ACCEPTED (c) 97 Verisign Inc”– yes, that is the name of the issuer on one of the root certificates. (Complete with all-capital words; Verisign could use some help from Danah Boyd on capitalization, it appears.)

That list is actually a conservative estimate. Windows has a feature to auto-download new root certificates from Windows Update on demand during chain building. That is why the sparse appearance of roots in Vista and Windows 7 out of the box is misleading: they are still implicitly trusted, waiting to be triggered during a search for roots. MSFT has a KB article pointing to the full list of current roots. 0wn one of these 100+ entities, and you can mint certificates for any website. 0wn one of the couple dozen trusted for code signing, and you can start writing malware “signed” by Microsoft or Google.

What does it take to join this privileged club? It’s not yet being advertised in infomercials, but both Microsoft and Mozilla publish the criteria. MSFT, being under the scrutiny of regulators, is under particular pressure to keep a level playing field and keep the ranks of membership relatively open. Mozilla has its own requirements to allow low-cost issuers, since the open source community generally views PKI as no less than a cabal of rent-seeking oligopolists. (Google Chrome simply picks up the trust anchors of the platform, namely Windows roots from CAPI on Windows and NSS/Mozilla roots on Linux.) In reality, since an SSL certificate that only works for some fraction of users is largely worthless, the criteria CAs live up to are the intersection of both programs.

The interchangeable nature of CAs for end-user security brings us to the fundamental economic problem: there is no incentive for better security in a commodity product. CAs are competing on a commodity, and market dynamics will drive prices towards zero– and along with them, the quality of the product. If one CA decides to do a better job of verifying customer identity and charge extra for this, customers will simply move on to the next CA who rubber-stamps everything sent its way. Competence is self-defeating.

There is a more subtle economic problem in the model: the paying “customer” is the website or company that needs a digital identity (or, as the examples above demonstrate, the miscreants trying to impersonate the legitimate owner– the CA gets paid either way). But the end-users who depend on that certificate authority doing its job correctly are not privy to the transaction. This externality is not factored into the price.

One could argue the direct customer is on the hook: if a company suffers an attack because of a forged certificate in its name, its own reputation is on the line, and this provides an economic incentive to do the right thing. But this is deceptive. Even if Alice cares enough about her reputation to shell out extra $$$ for a high-quality certificate from the most diligent CA, she can not fend off an attack from Mallory, who tricks a dirt-cheap and careless certificate authority into issuing a bogus “Alice” identity to Mallory. The existing installed base of web browsers does not provide a way for Alice to instruct her current customers to only trust certificates from the competent CA and ignore any other identities. (Even assuming Alice wanted to do this– it would mean creating lock-in for the CA and voluntarily relinquishing the competitive pressure on pricing.)

[continued]

CP

Unlinkable identifiers on the web: rearranging deck chairs (2/3)

CardSpace boasts a limited degree of unlinkability, based on a weak attack model: for self-signed cards, the user can generate two assertions for two different websites that appear independent. (My colleague Ben from the Google security team disputes even that weak guarantee, arguing that only assertions that can not be linked even with help from the identity provider qualify as “unlinkable.”)

OpenID gets a bad reputation for allowing linkability, but in fact there is no requirement of a universal identifier in the specification. An OpenID provider could choose to assert two different “names” for the same user to two different websites. (Of course they are still linkable in the sense that the ID provider knows what is going on, even if the sites can not put the picture together on their own– sort of; see the next two points.)

The problem is that even this weak guarantee of “unlinkable” identities at multiple websites breaks down in the real world, for two reasons.

The first problem is that websites insist on an email address or other unique identifier– and they want it at authentication time. When inherently identifying information such as an email address is shared, unlinkability of the underlying protocol becomes largely irrelevant, since there is another, even more universal identifier to go by. The same email address would appear even when the user authenticates via two different identity providers– this is linkage across independent providers.

Federated ID providers are not in a position to say no: they are trying to convince relying sites to interoperate. Everyone already has a proprietary identity management system requiring users to sign up. This registration process collects some basic information, and the availability of that information is firmly embedded in the business logic. Going from a model where the site has an email address to one where they know the user as “pq2t45x” is not an appealing proposition. Similarly, any time the user shares a global identifier such as an address, real name or credit card number, they void any privacy guarantees from the identity model.

As a matter of architecture, authentication systems should strive for minimum disclosure– more identifying information can always be added after the fact, but it is impossible to go back in the direction of greater privacy. Even if the majority of transactions end up with the user sharing PII at some point (making them very linkable regardless of authentication), it’s fair to argue that underlying protocols need to optimize for the best case of no disclosure and casual browsing. But the reliance on email addresses in existing scenarios means that redesigning basic protocols to disclose less will be an exercise in rearranging deck chairs.

In many ways the email address is the easiest attribute to fix, especially when the ID provider is also the email provider– true for the three largest ID providers, Windows Live/Passport, GMail/Google and Yahoo– since they could simply fabricate email aliases that forward to the original. Unfortunately that still breaks support scenarios, because when alice@gmail.com calls asking for help, the system has her records filed under the very private dx4r2p6@gmail.com. Other identifiers have their own private versions depending on the provider: some credit card companies support issuing one-time card numbers billing to the original. Mail services allow signing up for a PO box to hide the original physical address (although good luck getting many ecommerce merchants to deliver to one, due to their high incidence of fraud), and conceivably they could start algorithmically generating those PO box numbers to break linkage.
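A sketch of the forwarding-alias idea, with made-up names, key and alias format: the provider mints a distinct HMAC-derived alias per relying site, all delivering to the same mailbox. The support problem described above is visible in the code– a lookup table is needed to map the opaque alias back to the real account.

```python
import hmac
import hashlib

# alias -> real mailbox; this reverse map is exactly what a support desk
# would need when alice calls in about an account filed under an alias.
forwarding = {}

def site_alias(secret: bytes, real_addr: str, site: str) -> str:
    """Mint a deterministic per-site forwarding alias (hypothetical scheme)."""
    tag = hmac.new(secret, f"{real_addr}|{site}".encode(),
                   hashlib.sha256).hexdigest()[:10]
    alias = f"{tag}@gmail.com"
    forwarding[alias] = real_addr  # mail to the alias is delivered normally
    return alias

a1 = site_alias(b"provider-key", "alice@gmail.com", "bookstore.example")
a2 = site_alias(b"provider-key", "alice@gmail.com", "videostore.example")
print(a1, a2)  # two different aliases, one inbox
```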

Even if every instance of linkable PII could be replaced by a pairwise unique variant, there is a second problem: linkage between identifiers is possible when the user is authenticated to multiple sites at the same time.

[continued]

CP

MD2: reply to comments

Glad to see a comment from Dan Kaminsky on the last post about the severity of the MD2 issues.

Follow-ups:

  • There is no denying the problem with MD2. Discontinuing its use (e.g. rejecting certificates signed with MD2, as OpenSSL already has and the upcoming MSFT patch will implement) is the right response.
  • The point argued in the post is that the severity and urgency of the problem are low. Compared to other X509 problems disclosed by Dan Kaminsky and Moxie Marlinspike, including the null-byte handling, OID confusion and the even more deadly remote code execution in NSS, the MD2 issue is a distant second. The sky is not (yet) falling.
  • It’s not clear the MD5 parallel holds: when Wang and her colleagues found actual collisions, people were still widely using MD5 for new signatures. In fact the forgery of an intermediate CA certificate in December 2008 proved some certificate authorities are so clueless that they continued using MD5 for new signatures after 4 years and several improved attacks. (The fact that SSL CAs are bound to be incompetent and clueless as the expected competitive outcome deserves its own blog post.) MD2 has long been retired for new signatures, leaving only past signatures to exploit.
  • Basic birthday attacks are enough to exploit new signatures. Advances in the types of collisions possible, such as controlling the prefix, only improve the odds. But leveraging past signatures under a hash function that is no longer used requires a second-preimage attack. Nobody has managed to produce even a single one for MD2.
  • As of this writing, the best second-preimage attacks have time complexity comparable to 2**73 MD2 invocations and storage complexity of 2**73 bytes. That second number makes the attack impractical: eight billion terabytes is an awful lot of spare disk drives. (As an aside, Daniel Bleichenbacher looked into this and did not see any low-hanging improvements to the storage requirement either.)

Bottom line: yes, there is a problem with MD2, but it never presented an immediate danger. Cryptographic attacks are fascinating, but the more mundane X509 parsing bugs disclosed around the same time, and the continuing tradition of CA incompetence, are far more fatal to PKI.

CP

MD2: hash-collision scare of the day

Overshadowed by the far more serious X509 parsing vulnerabilities disclosed at BlackHat, one of the problems noted by Dan Kaminsky et al. was the existence of an MD2-signed root certificate.

On the surface it looks bad. If finding MD2 preimages were feasible, an enterprising attacker could forge other certificates chaining up to this one, “transferring” the signature from the root to the bogus certificate, compliments of the MD2 collision. Root certificates are notoriously difficult to update: Verisign cannot afford (for business reasons, even if it is the “right thing” for the Internet) to risk revoking all certificates chaining up to the root. Re-publishing the root signed with a better hash function is a no-op, since the existing signature will not be invalidated. The only option is to not trust any certificate signed with MD2 except for the roots.

But looked at from another perspective, the MD2 problem is a tempest in a teapot. Luckily no CA is using MD2 to issue new certificates. (At least as far as anyone can determine; CA incompetence is generally unbounded.) This matters because the MD5 forgery from last December depended on a bone-headed CA continuing to use MD5 to sign new certificate requests. With only past MD2 signatures to target, a second-preimage attack is necessary; simple birthday attacks will not work. Finding a second message that hashes to the same value as a given one is a much harder problem than finding two meaningful, but partially unconstrained, messages that collide.

Eager to join in the fray against PKI, the researchers point to a recent result, An improved preimage attack on MD2, to argue that such a possibility is indeed around the corner. It turns out the feasibility of this attack, and the 0wnership of MD2, was slightly exaggerated, to paraphrase Mark Twain. The paper does indeed quote 2**73 applications of the MD2 hash function as the time required to find a second preimage. This is an order of magnitude above what any previous brute-force attack has succeeded in breaking, but Moore’s law can fix that. What seems to have been neglected is a far more severe resource constraint, stated bluntly in the original paper and mysteriously absent from the Kaminsky et al. summary: the attack also requires 2**73 bytes of space. Outside the NSA, nobody is likely to have this type of storage lying around. None of the existing distributed cryptographic attacks have come anywhere near this limit; in fact most of them made virtually no demands on space from participants. To put this in context, if one hundred million people were participating, each would have to dedicate nearly a hundred terabytes of disk space. Not happening. This does not even take into account the communication and network overhead now required between the different users, each holding one fragment of this massive table, as they need to query other fragments.
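
The arithmetic is easy to check; using binary terabytes and the figures quoted above:

```python
# MD2 produces a 128-bit digest, so a generic birthday collision costs
# roughly 2**64 work; the published second-preimage attack instead needs
# about 2**73 time *and* 2**73 bytes of storage.
storage_bytes = 2 ** 73
TiB = 2 ** 40

total_tib = storage_bytes // TiB                        # total storage
per_person_tib = storage_bytes // (100_000_000 * TiB)   # split 100M ways

print(total_tib)        # 8589934592 -- about 8.6 billion terabytes
print(per_person_tib)   # 85 TiB each -- still absurd for volunteers
```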

CP

Pairwise identifiers and linkability online (1/3)

There has been a lot of talk about “unlinkability” in the context of web authentication systems. For example Cardspace is touted as being somehow more privacy-friendly than OpenID because it can support different identifiers for each site the user interacts with. This post is a first attempt to summarize some points around that, complete with very rusty blogging skills.

To start with, the type of unlinkability envisioned here is a very weak form compared to the more elaborate cryptographic protocols involving anonymous credentials. It comes down to a simple question: when a user participates in a federated authentication system (which is to say, they have an account with an identity provider that allows them to authenticate to websites controlled by different entities), does the user appear under a single, consistent identifier everywhere he/she goes?

It is not a stretch to see that when such a universal identifier is handed out to all of the sites, it enables tracking. More precisely, it allows correlating information known by different sites. Netflix might know the user’s movie rental history and iTunes might know their music preferences; if the user is known to both sites by the same consistent identifier, the argument goes, the two can collude and build an even more comprehensive dossier about the user. This is a slightly contrived example, because movie rentals are uniquely identifying already (as the recent deanonymization paper showed) and chances are so is the music collection, but it is easy to imagine scenarios where neither site has enough information to identify a user uniquely, yet when they collude and put all of the data together a single candidate emerges. Consider Latanya Sweeney’s discovery in 2000 that 87% of the US population can be identified by birthdate, five-digit zipcode and gender. It does not require very many pieces of information, if they can be assembled together with the help of a unique ID each one is associated with, to pick out individuals from the online crowd.
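
To make the collusion concrete, here is a toy illustration (all records invented): neither site’s data alone pins down a person, but joined on the shared identifier the Sweeney triple emerges.

```python
# Hypothetical per-site records, keyed by the same universal user ID.
netflix = {"user-123": {"zipcode": "02139", "birthdate": "1975-03-02"}}
itunes  = {"user-123": {"gender": "F", "genre": "jazz"}}

# Collusion is nothing more than a join on the shared key.
dossier = {uid: {**netflix.get(uid, {}), **itunes.get(uid, {})}
           for uid in netflix.keys() | itunes.keys()}

# zipcode + birthdate + gender: per Sweeney, enough to identify ~87%
# of the US population.
print(dossier["user-123"])
```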

The obvious solution is to project a different identity to each website. Alice might appear as user #123 to Netflix but iTunes remembers her as user #987. With a little help from cryptography it is easy to design schemes where such distinct IDs are generated easily by the authentication system, yet are virtually impossible to correlate by the websites receiving them, even when they are trying to collude.
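
One common construction (a sketch of the general idea, not any particular product’s scheme) keys a PRF with a provider-side secret over the user and site identities. Each site gets a stable pseudonym, but without the key no two sites can tell whether their pseudonyms belong to the same user:

```python
import hmac
import hashlib

def pairwise_id(provider_key: bytes, user_id: str, site: str) -> str:
    """Per-site pseudonym: deterministic for a (user, site) pair, yet
    uncorrelatable across sites without the provider's secret key."""
    mac = hmac.new(provider_key, f"{user_id}|{site}".encode(), hashlib.sha256)
    return mac.hexdigest()[:16]

key = b"identity-provider secret"   # hypothetical provider-side key
netflix_id = pairwise_id(key, "alice", "netflix.example")
itunes_id  = pairwise_id(key, "alice", "itunes.example")
print(netflix_id != itunes_id)      # True: nothing for the sites to join on
```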

[continued]

CP

OCSP: “This fail brought to you by the number three”

There is much to write about the disclosure of vulnerabilities in X509, independently found by Moxie Marlinspike and Dan Kaminsky. There was some overlap in their discoveries, but also unique aspects. One worth highlighting is Moxie’s attack defeating OCSP with a single byte.

Online Certificate Status Protocol attempts to verify the validity of web site certificates by contacting an OCSP responder, operated by the certification authority (CA), and asking the question “is the certificate with serial number #123 still valid?” This is one of two approaches to revocation checks, the Achilles heel of PKI. The other option is CRLs, or certificate revocation lists. In that model the CA periodically publishes a list of revoked certificates; clients download these (hopefully in advance, so that by the time they have to check it is locally cached) and look for the certificate in question there.

Now since the revocation status of a certificate is a critical bit of information, the answer from the CA, whether packaged into a CRL or generated on demand in response to an OCSP query, has to be somehow protected. Otherwise the bad guys, on top of having obtained a bogus certificate and private key, can simply forge a response to suggest that all is well with the certificate.

It’s tempting to say “download the CRL / run the OCSP check over SSL,” but that would create a circularity in boot-strapping. Not to mention that SSL is a very expensive solution to the problem of content integrity. Instead CRLs and OCSP responses are digitally signed, typically by another certificate that chains up to the issuing root CA.

Now if that had been the entire story revocation checking would be sound.

But Moxie noticed that by design OCSP allows unauthenticated responses, namely the set of responses collectively dubbed non-authoritative. These include conditions such as an internal error at the service, a malformed request, and “try again later,” suggesting that the server might be overwhelmed with demand at the moment. These replies do not require a signature, by design; it’s in the RFC. In these cases a single byte indicating the response status constitutes a valid OCSP response.
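
Per RFC 2560 an unsuccessful OCSPResponse carries only the status enumeration and no signature. A sketch of what such a “tryLater” response looks like on the wire: the whole message is five DER bytes, and the only meaningful payload is the single status byte 3.

```python
# OCSPResponse ::= SEQUENCE { responseStatus OCSPResponseStatus,
#                             responseBytes [0] EXPLICIT ResponseBytes OPTIONAL }
# tryLater(3) with responseBytes absent:
try_later = bytes([
    0x30, 0x03,        # SEQUENCE, 3 bytes of content
    0x0A, 0x01, 0x03,  # ENUMERATED, 1 byte: 3 = tryLater
])
print(try_later.hex())   # 30030a0103 -- nothing here for a client to verify
```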

Of course this defeats one important security guarantee: when a non-authoritative response is received, the client can never be sure it came from the OCSP responder. An attacker pulling off a man-in-the-middle attack could always forge one of these responses. Granted, such an attacker could also drop the traffic and make it appear that the OCSP responder has vanished from the surface of the Earth. The bottom line is that in the absence of a signed response, the client cannot draw any conclusions about the status of the certificate.

Of course implementations of OCSP must deal with this condition. They need to report what happened, and the buck stops somewhere along the application stack, where one developer decides what to do with these non-authoritative error codes.
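
The safe way to make that decision is to fail closed, treating anything unsigned as “revocation status unknown.” A sketch of the logic (function names are hypothetical; the status constants are from RFC 2560):

```python
# OCSPResponseStatus values, RFC 2560 section 4.2.1
SUCCESSFUL, MALFORMED_REQUEST, INTERNAL_ERROR, TRY_LATER = 0, 1, 2, 3

def revocation_check_passed(status: int, signature_valid: bool) -> bool:
    """Only a successful, signature-verified response proves anything.
    Non-authoritative statuses (1, 2, 3) carry no signature, so treating
    them as success -- the CAPI/NSS bug -- lets a MITM pass with one byte."""
    return status == SUCCESSFUL and signature_valid

print(revocation_check_passed(TRY_LATER, False))   # False: fail closed
```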

Moxie’s discovery is that for both Windows CAPI and NSS, that decision is to treat the “tryLater” response with code 3 as a successful revocation check. That means IE and Chrome (built on top of CAPI) and Firefox (built on top of NSS) are trivially fooled in OCSP checks… with a one-byte response containing “3.” Neatly summed up by one of the money slides in the presentation: a giant three superimposed over the OCSP RFC.

Granted this is not the entire story: starting with Vista there is complex revocation-checking logic in CAPI that load-balances between OCSP and downloading CRLs. CRLs are more efficient at scale: if every user in the world started hammering Verisign’s OCSP responder for every SSL request, Verisign would fall over in a matter of seconds. But they are highly inefficient in the short term: in order to check the status of a single certificate, the client is tasked with downloading a massive document in the middle of setting up a connection. Vista tries to solve this problem by looking for frequent revocation checks and scheduling CRL downloads for them. Once a non-expired CRL has been downloaded, in principle the OCSP check is not required, because consulting the locally cached document is faster and will reveal the revoked status of the certificate. In other words there may be edge cases where the Moxie attack stops working, depending on the past history of revocation checks.
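
That policy amounts to something like the following (a simplification with hypothetical data structures, not the actual CAPI code): consult a fresh cached CRL first and only fall back to a live OCSP query when none is available.

```python
import time

def check_revoked(serial: int, crl_cache: dict, fetch_ocsp_status):
    """Prefer a fresh cached CRL; fall back to a live OCSP query otherwise."""
    crl = crl_cache.get("crl")
    if crl and crl["next_update"] > time.time():
        # Cheap local lookup, immune to the one-byte OCSP forgery.
        return serial in crl["revoked_serials"]
    # No usable CRL: live OCSP query (hypothetical callback).
    return fetch_ocsp_status(serial) == "revoked"

cache = {"crl": {"next_update": time.time() + 3600,
                 "revoked_serials": {123}}}
print(check_revoked(123, cache, lambda s: "good"))   # True, from cached CRL
```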

Still a remarkable way to cap off a series of attacks against X509 parsing.

cemp