Issuance vs revocation
Certificate revocation remains the Achilles heel of public-key infrastructure systems. The promise of public-key cryptography was stand-alone credentials that can be used in a peer-to-peer manner: a trusted issuer provisions the credential and then steps aside. The issuer is not involved each time the credential is used to prove your identity. Contrast that with prevalent models on the web today such as “login with Facebook” or “login with Google.” These are glorified variants of an enterprise authentication system called Kerberos that dates from the 1980s. Facebook or Google sit in the middle of every login, mediating the transaction. In the PKI model, they would be replaced by a trusted third party, known as the “certificate authority,” issuing long-lived credentials to users. Armed with that certificate and the corresponding private key, the person could now prove their identity anywhere without having to involve the CA again. This is the model for proving the identity of websites: the TLS protocol, behind that once-ubiquitous lock icon on the address bar or green highlight around the name of the website, is predicated on the owner of that site having obtained a digital certificate from a trusted CA.
Revocation throws a wrench in that simple model. Credentials can expire but they can also be revoked: the website could lose their private-key or one of the assertions made in the certificate (“this company controls the domain acme.com”) may stop being true at any point. There are two main approaches to revocation not counting a home-brew design used by Google in Chrome, presumably because the relevant standards were not invented in Mountain View. The first one is Certificate Revocation Lists or CRLs: the certificate authority periodically creates a digitally signed list of revoked certificates. That blacklist is made available at a URL that is itself encoded in the certificate, in a field appropriately called the CDP, for “CRL Distribution Point.” Verifiers download CRLs periodically— each CRL defines its lifetime, to help with caching and avoid unnecessary polling— and check to see if the serial number of the certificate they are currently inspecting appears in the blacklist. There are additional optimizations such as delta-CRLs that can be used to reduce the amount of data transferred when dealing with large-scale PKI deployments where millions of certificates could have a revoked status at any time.
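The CRL lookup described above can be sketched in a few lines. This is a simplified model rather than a real parser: `ToyCrl`, `check_revocation` and the sample values are hypothetical names introduced for illustration, and the signature check on the list (which a real verifier must perform) is omitted entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ToyCrl:
    """Simplified stand-in for a CRL; real CRLs are signed ASN.1 structures."""
    issuer: str
    this_update: datetime       # when the CA produced this list
    next_update: datetime       # when the CA promises a newer list
    revoked_serials: frozenset  # serial numbers of revoked certificates


def check_revocation(crl: ToyCrl, issuer: str, serial: int, now: datetime) -> str:
    """Return 'revoked', 'good' or 'unknown' for one certificate."""
    if crl.issuer != issuer:
        return "unknown"  # this list says nothing about other CAs
    if not (crl.this_update <= now < crl.next_update):
        return "unknown"  # list is outside its lifetime: fetch a new one
    return "revoked" if serial in crl.revoked_serials else "good"


# Made-up example data: a list valid for one week with two revoked serials.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
crl = ToyCrl(
    issuer="Example CA",
    this_update=now - timedelta(days=1),
    next_update=now + timedelta(days=6),
    revoked_serials=frozenset({0x1234, 0x5678}),
)
print(check_revocation(crl, "Example CA", 0x1234, now))
```

Returning a third "unknown" state, instead of forcing a yes/no answer, keeps the question of what to do with an undetermined status separate from the lookup itself.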
If CRLs are the batch mode for checking revocation, Online Certificate Status Protocol is the interactive version. OCSP defines an HTTP-based protocol for verifiers to query a service affiliated with the CA to ask about the current status of a certificate. Again the location of that service is embedded in the certificate in the “Authority Information Access” field. Interestingly OCSP responses are also signed and cacheable with a well-defined lifetime, which allows optimizations such as OCSP stapling.
Taken in isolation, the cost of doing a revocation check may seem daunting: web-browsers try to shave milliseconds from the time it takes to render a web page or start streaming that video. Imagine having to pause everything while a CRL is downloaded in the background or OCSP status queried. But recall that this cost is amortized over thousands of certificates: one CRL covers not just one website but all websites with a certificate from that CA. Considering that a handful of public certificate authorities account for the majority of TLS certificates, not having a fresh CRL already present would be the exceptional scenario, not the norm. More importantly the retrieval of this information need not be done in real-time: for example Windows has sophisticated heuristics to download CRLs for frequently used CAs in the background. It can even prefetch OCSP responses for frequently visited individual sites, based on a predictive model that those sites are likely to be visited again.
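The amortization argument can be made concrete with a toy cache. Everything here is an assumption for illustration: `CrlCache`, `fake_fetch` and the URL are invented, and the download is simulated instead of going over the network. It is not a description of how Windows or any browser actually implements caching.

```python
from datetime import datetime, timedelta, timezone

START = datetime(2024, 1, 1, tzinfo=timezone.utc)


class CrlCache:
    """Keeps one revocation list per distribution-point URL until it expires."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: url -> (revoked_serials, next_update)
        self._entries = {}    # url -> (revoked_serials, next_update)
        self.downloads = 0    # how many times we had to fetch a list

    def is_revoked(self, cdp_url, serial, now):
        entry = self._entries.get(cdp_url)
        if entry is None or now >= entry[1]:
            # Cache miss or expired list: pay the download cost once.
            entry = self._fetch(cdp_url)
            self._entries[cdp_url] = entry
            self.downloads += 1
        return serial in entry[0]


def fake_fetch(url):
    # Hypothetical stand-in for an HTTP download plus signature verification.
    return frozenset({666}), START + timedelta(days=7)


cache = CrlCache(fake_fetch)
url = "http://crl.example/ca.crl"
# A thousand different certificates from the same CA share one download.
results = [cache.is_revoked(url, serial, START) for serial in range(1000)]
print(cache.downloads, sum(results))
```

Only the first lookup triggers a fetch; the other 999 are answered from memory until the list reaches its stated expiry.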
Weakest link in the (certificate) chain
There is one aspect of PKI operation that no amount of client-side optimization can compensate for: CA incompetence. If CAs are not promptly acting on reports of lost credentials, failing to maintain CRLs correctly or experiencing frequent outages with their OCSP responders, relying parties will at best remain in the dark about the status of certificates (having to make a difficult decision on what to do if revocation status cannot be determined) or worse, reach the wrong conclusion and make trust decisions based on a bogus credential. Examples of CA ineptitude with TLS certificates are legion, and do not bear repeating here other than to point out an economic truth: it would be astonishing to expect a different outcome. All CAs trusted by popular web browsers are effectively on equal standing; there is nothing distinguishing a certificate issued by a competent CA from one issued by a CA that constantly screws up. In both cases the websites get their all-important lock icon or avoid being publicly shamed by Google Chrome. (Extended validation certificates were introduced a decade ago with better vetting requirements, but that only had the effect of creating two tiers. All EV-qualified issuers still stand on equal ground in terms of trust afforded by browsers.) That means issuers can only compete on price. There is zero incentive for CAs to compete on operational competence, beyond the minimal baseline required by the CA/Browser Forum to avoid getting kicked out. It turns out even that modest baseline is too high a bar: Symantec managed to get kicked out of both the Mozilla and Google root programs.
Operational errors by certificate authorities have been well documented in the context of TLS certificates. From relatively “minor” infractions such as issuing certificates to unauthorized parties pretending to be affiliated with major websites to epic failures such as accidentally signing subordinate CA certificates (effectively delegating the ability to issue any certificate to the bad guys), public PKI has been a veritable zoo of incompetence. It turns out that code signing and revocation create even more novel ways for CAs to stumble.
Same mathematics, different purposes
A public-key algorithm such as RSA can be used for different purposes such as encryption, authentication or digital signatures. The underlying computations are the same: whether we are using RSA to protect email messages in transit or prove our identity to a remote website, there is a secret value that is used according to the same mathematical function. While there are superficial differences in formatting and padding, there is no reason to expect a fundamental difference in risk profile. Yet each of these scenarios has different consequences associated with loss or compromise of keys. (To be clear, “loss” here is used to mean a key becoming permanently unusable, both for the legitimate owner and any potential attacker. By contrast “key compromise” implies the adversary managed to get their hands on the key.)
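To see how literally “the same mathematical function” applies, here is a toy, unpadded RSA example. The parameters are the small textbook primes 61 and 53, chosen only for readability, and `public_op` and `private_op` are names made up for this sketch. Real systems add hashing and padding and use keys thousands of bits long.

```python
# Toy RSA parameters (p = 61, q = 53); far too small for real use.
N = 61 * 53   # public modulus
E = 17        # public exponent
D = 2753      # private exponent: (E * D) % ((61 - 1) * (53 - 1)) == 1


def public_op(x: int) -> int:
    """Anyone can compute this: used to encrypt or to verify a signature."""
    return pow(x, E, N)


def private_op(x: int) -> int:
    """Only the key holder can compute this: used to decrypt or to sign."""
    return pow(x, D, N)


message = 65

# Encryption scenario: sender uses the public key, owner decrypts.
ciphertext = public_op(message)
recovered = private_op(ciphertext)

# Signature scenario: owner signs, anyone verifies with the public key.
signature = private_op(message)
verified = public_op(signature) == message

print(recovered == message, verified)
```

Decrypting and signing are the same modular exponentiation with the same secret exponent; only the surrounding scenario differs, which is where the different consequences of loss and compromise come from.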
Consider the case of a key used for authentication. In a realistic enterprise scenario, this could be a private key residing on a smart-card used for logging into Active Directory. If that key is rendered unusable for some reason (let’s say the card inadvertently went through a cycle in the washing machine) the user will be temporarily inconvenienced. Until they replace the credential, they cannot access certain resources. Granted, that inconvenience could be significant. Depending on group policy, they may not even be able to log in to their own machine if smart-card logon is mandatory. But as soon as the IT department issues a new card with a new set of credentials, access is restored. The important point is that the new card will have equivalent credentials but a different key-pair. The previous key was permanently bound to the malfunctioning hardware. It cannot be resurrected. Instead a new key-pair is generated on the replacement card and a corresponding new X509 certificate is issued, containing the new public key. As far as the user is concerned, the new key is just as good as the previous one: there is nothing they are prevented from doing that used to be possible with the previous key material.
It would be an entirely different story if that same smart-card was used for email encryption, for example S/MIME which is popular in enterprise deployments. While we can reissue this employee a new card, the replacement cannot decrypt email encrypted to the previous key. If there was no backup or key-escrow in place, there has been permanent loss of functionality: past encrypted correspondence has become unreadable. That goes beyond a “temporary inconvenience” the IT department can resolve. This is one example of a more general observation: loss or compromise of a key can have different consequences depending on the high-level scenario it is used for, even when all of the cryptographic operations are otherwise identical.
Loss vs compromise
At first glance, the signature scenario looks similar to authentication as far as the consequences from key loss are concerned. If a signing key becomes unusable, simply generate a new one and start signing new messages with that replacement key. This is indeed the approach taken in systems such as PGP: users still have to distribute the new key, perhaps by publishing it on a key-server. There is no need to go and re-sign previous messages: they were signed with a valid key, and recipients can continue to validate them using the previous key. (Note we are assuming that recipients are flexible about choice of signing key. There are other systems, notably cryptocurrencies such as Bitcoin, where control over funds is asserted by signing messages with one particular key which cannot be changed. In that model, loss of the private key has serious repercussions, directly translating to dollars.)
The picture gets murky once again when considering compromise of the key, defined as adversaries coming into possession of the secret. For encryption keys, the results are straightforward: attackers can read all past messages encrypted to that key. The best-case scenario one can hope for is publicizing the replacement quickly, to prevent any more messages from being encrypted to the compromised key. Not much can be done to save past messages, beyond purging local copies. If an attacker can get hold of those ciphertexts by any avenue, they can decrypt them. There is no way to “revoke” all existing copies of previously encrypted ciphertexts. In fact the adversary may already have intercepted and stored those ciphertexts, optimistically betting on a future capability for obtaining the decryption key.
Consequences from the compromise of a signing key are more subtle, because time enters into the picture as a variable. Authentication commonly occurs in real time and once, with a single known counter-party attempting to verify your identity. Signatures are created once and then verified an unbounded number of times by an open-ended collection of participants indefinitely into the future. That creates two complications when responding to a key compromise:
- Future messages signed by the attacker using the compromised key must be invalidated, such that recipients are not accidentally tricked into accepting them.
- Yet past signatures made by the authorized signer when they still had full control over the private key must continue to validate properly.
To appreciate why that second property is critical, consider code-signing: software publishers digitally sign the applications they distribute, to prevent malware authors from tricking users into installing bogus, back-doored versions imitating the original. Suppose a publisher discovers that their signing key has been compromised and used to sign malware, a very common scenario that figured prominently in high-profile campaigns such as Stuxnet. The publisher can quickly race to revoke their signing certificate. Clients properly implementing revocation checks will then be protected against the signed malware, rejecting the signature as invalid. But what about legitimate pieces of software distributed by the same publisher? Indiscriminate revocation would result in all of those apps also becoming untrusted, requiring the publisher to not only re-sign every app they ever shipped but also guarantee that every single copy existing anywhere a customer may download apps from is replaced by the updated version.
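One way to picture both requirements at once is a verifier that compares the time a signature was made against the time from which the key is considered revoked. The sketch below rests on a large assumption that the text has not established: that the verifier can learn a signing time the attacker cannot forge. The function name and dates are hypothetical, chosen only to illustrate the two cases.

```python
from datetime import datetime, timezone
from typing import Optional


def signature_trusted(signed_at: datetime, revoked_from: Optional[datetime]) -> bool:
    """Accept a signature unless it was made at or after the revocation cutoff.

    signed_at is ASSUMED to come from a source the attacker cannot forge.
    revoked_from is None while the certificate has not been revoked.
    """
    if revoked_from is None:
        return True
    return signed_at < revoked_from


# Made-up timeline: key compromised and revoked as of 1 March 2024.
revoked_from = datetime(2024, 3, 1, tzinfo=timezone.utc)
old_release = datetime(2023, 11, 20, tzinfo=timezone.utc)  # legitimate app
malware = datetime(2024, 3, 5, tzinfo=timezone.utc)        # signed by attacker

print(signature_trusted(old_release, revoked_from))  # legitimate app survives
print(signature_trusted(malware, revoked_from))      # attacker's binary rejected
```

If the signing time were simply a field the signer fills in, an attacker holding the key could backdate the malware, so this comparison is only as strong as the source of `signed_at`.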
[continued]
CP