Use and misuse of code-signing (part II)

[continued from part I]

There is no “evil-bit”

X509 certificates represent assertions about identity. They are not assertions about competence, good intentions, code-quality or sound software engineering practices. Code-signing solutions including Authenticode can only communicate information about the identity of the software publisher— the answer to the question: “who authored this piece of software?” That is the raison d’etre for the existence of certificate authorities and why they ostensibly charge hefty sums for their services. When developer Alice wants to obtain a code-signing certificate with her name on it, the CA must perform due diligence that it is really Alice requesting the certificate. Because if an Alice certificate is mistakenly issued to Bob, suddenly applications written by Bob will be incorrectly attributed to Alice, unfairly using her reputation and in the process quite possibly tarnishing that reputation. In the real world, code-signing certificates are typically issued not to individuals toiling alone— although many independent developers have obtained one for personal use— but large companies with hundreds or thousands of engineers. But the principle is same: a code-signing certificate for MSFT must not be given willy-nilly to random strangers who are not affiliated with MSFT. (Incidentally that exact scenario was one of the early debacles witnessed in the checkered history of public CAs.)

Nothing in this scheme vouches for the integrity of the software publisher or the fairness of their business model. CAs are only asserting that they have carefully verified the identity of the developer prior to issuing the certificate. Whether or not the software signed by that developer is “good” or suitable for any particular purpose is outside the scope of that statement. In that sense, there is nothing wrong— as far as X509 is concerned— with a perfectly valid digital certificate signing malicious code. There is no evil bit required in a digital certificate for publishers planning to ship malware. For that matter there is no “competent bit” to indicate that software published by otherwise well-meaning developers will not cause harm nevertheless due to inadvertent bugs or dangerous vulnerabilities. (Otherwise no one could issue certificates to Adobe.)

1990s called, they want their trust-model back

This observation is by no means novel or new. Very early on in the development of Authenticode in 1997, a developer made this point loud and clear. He obtained a valid digital certificate from Verisign and used it to sign an ActiveX control dubbed “Internet Exploder” [sic] designed to shut-down a machine when it was embedded on a web page. That particular payload was innocuous and at best a minor nuisance, but the message was unambiguous: the same signed ActiveX control could have reformatted the drive or steal information. “Signed” does not equal “trustworthy.”

Chalk it up to the naivete of the 1990s. One imagines a program manager at MSFT arguing this is good enough: “Surely no criminal will be foolish enough to self-incriminate by signing malware with their own company identity?” Yet a decade later that exact scenario is observed in the wild. What went wrong? The missing ingredient is deterrence. There is no global malware-police to chase after every malware outfit even when they are operating brazenly in the open, leaving a digitally authenticated trail of evidence in their wake. Requiring everyone to wear identity badges only creates meaningful deterrence  when there are consequences to being caught engaging in criminal activity while flashing those badges.

Confusing authentication and trust

Confusing authentication with authorization is a common mistake in information security. It is particularly tempting to blur the line when authorization can be revoked by deliberately failing authentication. A signed ActiveX control is causing potential harm to users? Let’s revoke the certificate and that signature will no longer verify. This conceptual shortcut is often a sign that a system lacks proper authorization design: when the only choices are binary yes/no, one resorts to denying authorization by blocking authentication.

Developer identity is neither necessary or sufficient for establishing trust. It is not necessary because there is plenty of perfectly useful open-source software maintained by talent developers only known by their Github handle, without direct attribution of each line of code to a person identified by their legal name. It is not sufficient either, because knowing that some application was authored by Bob is not useful on its own, unless one has additional information about Bob’s qualifications as a software publisher. In other words: reputation. In the absence of any information about Bob, there is no way to decide if he is a fly-by-night spyware operation or honest developer with years of experience shipping quality code.

Certificate authorities as reluctant malware-police

Interesting enough, that 1997 incident set another precedent: Verisign responded by revoking the certificate, alleging that signing this deliberately harmful ActiveX control was a violation of the certificate policy that this software developer agreed to as a condition for issuance. Putting aside the enforceability of TOUs and click-through agreements, this is a downright unrealistic demand for certificate authorities to start policing developers on questions of policy completely unrelated to verifying their identity. It’s as if the DMV had been tasked with revoking driver’s licenses for people who are late on their credit-card payments.

That also explains why revoking certificates for a misbehaving vendors is not an effective way to stop that developer from churning out malware. As the paper points out, there are many ways to game the system, all of which being used in the wild by companies with a track record of publishing harmful applications:

  • CA shopping: after being booted from one CA, simply walk over to their competitor to get another certificate for the exact same corporate entity
  • Cosmetic changes: get certificates for the same company with slightly modified information (eg variant of address or company name) from the same CA
  • Starting over: create a different shell-company doing exactly same line of business to start with a clean-slate

In effect CAs are playing whack-a-mole with malware authors, something they are neither qualifier or motivated to do. In the absence of a reputation system, the ecosystem is stuck with a model where revoking trust in malicious code requires revoking the identity of the author. This is a very different use of revocation than what the X509 standard envisioned. Here are the possible reasons defined in the specification– incidentally these appear in the published revocation status:

CRLReason ::= ENUMERATED {

unspecified             (0),
 keyCompromise           (1),
 cACompromise            (2),
 affiliationChanged      (3),
 superseded              (4),
 cessationOfOperation    (5),
 certificateHold         (6),
 -- value 7 is not used
 removeFromCRL           (8),
 privilegeWithdrawn      (9),
 aACompromise           (10) }

Note there is no option called “published malicious application.” That’s because none of the assertions made by the CA are invalidated upon discovering that a software publisher is churning out malware. Compare that to key-compromise (reason #1 above) where the private-key of the publisher has been obtained by an attacker. In that case a critical assertion has been voided: the public-key appearing in the certificate no longer speaks exclusively for the certificate holder. Similarly a change of affiliation could arise when an employee leaves a company, a certificate issued in the past now contains inaccurate information for “organization” and “organizational unit” fields. There is no analog for the discovery of signed malware, other than vague reference to compliance with the certificate policy. (In fairness, the policy itself can appear as URL in the certificate but it requires careful legal analysis to answer the question of how exactly the certificate subject has diverged from that policy.)

Code-signing is not the only area where this mission creep has occurred but it is arguably the one where highest demands are put on the actors least capable of fulfilling those expectations. Compare this to issuance of certificates for SSL: when phishing websites pop-up impersonating popular services, perhaps with a subtle misspelling of the name, complete with a valid SSL certificate. Here there may be valid legal grounds to ask the responsible CA to revoke a certificate because there may be trademark claim. (Not that it does any good, since the implementation of revocation in popular browsers ranges from half-hearted to comically flawed.) Lukcily web browsers have other ways to stop users from visiting harmful websites: for example, Safe Browsing and SmartScreen maintain blacklists of malicious pages. There is no reason to wait for CA to take any action- and for malicious sites that are not using SSL, it would not be possible anyway.

Code-signing presents a different problem. In open software ecosystems, reputation systems are rudimentary. Antivirus applications can recognize specific instances of malware but most applications start from a presumption of innocence. In the absence of other contextual clues, the mere existence of verifiable developer identity becomes a proxy for trust decision: unsigned applications are suspect, signed ones get a free pass. At least, until it becomes evident that the signed application was harmful. At that point, the most reliable way of withdrawing trust is to invalidate signatures by revoking the certificate. This uphill battle requires enlisting CAs in a game of whack-a-mole, even when they performed their job correctly in the first place.

This problem is unique to open models for software distribution, where applications can be sourced from anywhere on the web. By contrast, the type of tightly controlled “walled-garden” ecosystem Apple favors with its own App Store rarely has to worry about revoking anything, even though it may use code signing. If Apple deems an application harmful, it can be simply yanked from the store. (For that matter, since Apple has remote control over devices in the field, they can also uninstall existing copies from users’ devices.)

Reputation systems can solve this problem without resorting to restrictive walled-gardens or locking down application distribution to a single centralized service responsible for quality. They would also take CAs out of policing miscreants, a job they are uniquely ill-suited for. In order to block software published by Bob, it is not necessary to revoke Bob’s certificate. It is sufficient instead to signal a very low reputation for Bob. This also moves the conflict one-level higher, because reputations are attached to persons or companies, not to specific certificates. Getting more certificates from another CA after one has been revoked does not help Bob. As long as the reputation system can correlate the identities involved, the dismal reputation will follow Bob. Instead of asking CAs to reject customers who had certificates revoked from a different CA, the reputation system allows CAs do their job and focus on their core business: vet the identity of certificate subjects. It is up to the reputation system to link different certificates based on a common identity, or even related families of malware published by seemingly distinct entities acting on behalf of the same malware shop.

CP

Use and misuse of code-signing (part I)

Or, there is no “evil bit” in X509

A recent paper from CCS2015 highlights the incidence of digitally signed PUP— potentially unwanted programs: malicious applications that harm users, spying on them, stealing private information or otherwise acting against the interests of the user. While malware is dime-a-dozen and occurence of malware digitally-signed with valid certificates is not new either, this is one of the first systemic studies of how malware authors operate when it comes to code signing. But before evaluating the premise of the paper, let’s step back and revisit the background on code-signing in general and MSFT Authenticode in particular.

ActiveX: actively courting trouble

Rewinding the calendar back to the mid-90s: the web is still in its infancy and browsers highly primitive in their capabilities compared to native applications. These are the “Dark Ages” before AJAX, HTML5 and similar modern standards which make web applications competitive with their native counterparts. Meanwhile JavaScript itself is still new and awfully slow. Sun Microsystems introduced Java applets as an alternative client-side programming model to augment web pages. Ever paranoid, MSFT responds in the standard MSFT way: by retreating to the familiar ground of Windows and trying to bridge the gap from good-old Win32 programming to this this scary, novel web platform. ActiveX controls were the solution the company seized on in hopes of continuing the hegemony of the Win32 API. Developers would not have to learn any new tricks. They would write native C/C++ applications using COM and invoking native Windows API as before—conveniently guaranteeing that it could only run on Windows— but they could now deliver that code over the web, embedded into web pages. And if the customers visiting those web pages were running a different operating system such as Linux or running on different hardware such as DEC Alpha? Tough luck.

Code identity as proxy for trust

Putting aside the sheer Redmond-centric nature of this vision, there is one minor problem: unlike the JavaScript interpreter, these ActiveX controls execute native code with full access to  operating system APIs. They are not confined by an artificial sandbox. That creates plenty of room to wreak havoc with machine: read files, delete data, interfere with the functioning of other applications. Even for pre-Trustworthy-Computing MSFT with nary a care in the world for security, that was an untenable situation: if any webpage you visit could take over your PC, surfing the web becomes a dangerous game.

There are different ways to solve this problem, such as constraining the power of these applications delivered over the web. (That is exactly what JavaScript and Java aim for with a sandbox.) Code-signing was the solution MSFT pushed: code retains full privileges but it must carry some proof of its origin. That would allow consumers to make an informed decision about whether to trust the application, based on the reputation of the publisher. Clearly there is nothing unique to ActiveX controls about code-signing. The same idea applies to ordinary Windows applications sure enough was extended to cover them. Before there were centralized “App Stores” and “App Markets” for purchasing applications, it was common for software to be downloaded straight from the web-page of the publisher or even a third-party distributor website aggregating applications. The exact same problems of trust arises here: how can consumers decide whether some application is trustworthy? The MSFT approach translates that into a different question: is the author of this application trustworthy?

Returning to the paper, the researchers make a valuable contribution in demonstrating that revocation is not quite working as expected. But the argument is undermined by a flawed model. (Let’s chalk up the minor errors to lack of fact-checking or failure to read specs: for example asserting that Authenticode was introduced in Windows 2000 when it  predates that, or stating that only SHA1 is supported when MSFT signtool has supported SHA256 for some time.) There are two major conceptual flaws in this argument:
First one is misunderstanding the meaning of revocation, at least as defined by PKIX standards. More fundamentally, there is a misunderstanding what code-signing and identity of the publisher represent, and the limits of what can be accomplished by revocation.

Revocation and time-stamping

The first case of confusion is misunderstanding how revocation dates are used: the authors have “discovered” that malware signed and timestamped continues to validate even after the certificate has been revoked. To which the proper response is: no kidding, that is the whole point of time-stamping; it allows signatures to survive expiration or revocation of the digital certificate associated with that signature. This behavior is 100% by design and makes sense for intended scenarios.

Consider expiration. Suppose Acme Inc obtains a digital certificate valid for exactly one year, say the calendar year 2015. Acme then uses this certificate so sign some applications published on various websites. Fast forward to 2016, and a consumer has downloaded this application and attempts to validate its pedigree. The certificate itself has expired. Without time-stamping, that would be a problem because there is no way to know whether the application was signed when the certificate was still valid. With time-stamping, there is a third-party asserting that the signature happened while the certificate was still valid. (Emphasis on third-party; it is not the publisher providing the timestamp because they have an incentive to backdate signatures.)

Likewise the semantics of revocation involve a point-in-time change in trust status. All usages of the key afterwards are considered void; usage before that time is still acceptable. That moment is intended to capture the transition point when assertions made in the certificate are no longer true. Recall that X509 digital certificates encode statements made by the CA about an entity, such as “public key 0x1234… belongs to the organization Acme Inc which is headquartered in New York, USA.” While competent CAs are responsible for verifying the validity of these facts prior to issuance, not even the most diligent CA can escape the fact that their validity can change afterwards. For example the private-key can be compromised and uploaded to Pastebin, implying that it is no longer under sole possession of Acme. Or the company could change its name and move its business registration to Timbuktu, a location different than the state and country specified in the original certificate.  Going back to the above example of the Acme certificate valid in 2015: suppose that half-way through the calendar year Acme private-key is compromised. Clearly signatures produced after that date can not be reliably attributed to Acme: it could be Acme or it could be the miscreants that stole the private-key. On the other hand signatures made before, as determined by third-party trusted timestamp should not be affected by events that occurred later.**

In some scenarios this distinction between before/after is moot. If an email message was encrypted using the public-key found in an S/MIME certificate months ago, it is not possible for the sender to go back in time and recall the message now that the certificate is revoked. Likewise authentication happens in real-time and it is not possible to “undo” previous instances when a revoked certificate was accepted. Digital signatures on the other hand are unique: the trust status of a certificate is repeatedly evaluated at future dates when verifying a signature created in the past. Intuitively signatures created before revocation time should still be afforded full trust, while those created afterwards are considered bogus. Authenticode follows this intuition. Signatures time-stamped prior to the revocation instant continue to validate, while those produced afterwards (or lacking a time-stamp altogether) are considered invalid. The alternative does not scale: if all trust magically evaporates due to revocation, one would have to go back and re-create all signatures.

To the extent that there is a problem here, it is an operational error on the part of CAs in choosing the revocation time. When software publishers are caught red-handed signing malware and this behavior is reported to certificate authorities, it appears that CAs are setting revocation date to the time of the report, as opposed to all the way back to original issuance time of the certificate. That means signed malware still continues to validate successfully according to Authenticode policy, as long as the crooks remembered to timestamp their signatures. (Not exactly a high-bar for crooks, considering that Verisign and others also operate free, publicly accessible time-stamping services.) The paper recommends “hard-revocation” which is made-up terminology for setting revocation time all the way back to issuance time of the certificate, or more precisely the notBefore date. This is effectively saying some assertion made in the certificate was wrong to begin with and the CA should never have issued it in the first place. From a pragmatic stance, that will certainly have the intended effect of invalidating all signatures. No unsuspecting user will accidentally trust the application because of a valid signature. (Assuming of course that users are not overriding Authenticode warnings. Unlike the case of web-browser SSL indicators which have been studied extensively, there is comparatively little research on whether users pay attention to code-signing UI.) While that is an admirable goal, this ambitious project to combat malware by changing CAs behavior is predicated on misunderstanding of what code-signing and digital certificates stand for.

[continued]

CP

** In practice this is complicated by the difficulty of determining precise time of key-compromise and typically involves conservatively estimating on the early side.