Interactive services detection and crypto hardware: when security features collide

It is not uncommon for security features to have unexpected interactions, undercutting each other. For example Tor and Bitcoin do not mix. More subtle are situations when one feature designed to mitigate a specific threat blocks some other security feature from working. This blogger recently ran into an example with Windows Server.

Enterprises frequently have to operate public-key infrastructure (PKI) systems to issue credentials to their own employees—and arguably such closed PKI systems have been far more successful than the house-of-cards that is SSL certificate issuance for the web. There are stand-alone certificate authority products such as the open-source EJBCA package but for most MSFT environments, the requirement is typically addressed by CA functionality built into Windows Server. Certificate Services is a role that can be added to the server configuration, either integrated with Active Directory (not necessarily colocated with the domain controller) or as stand-alone CA.

Offloading key-management

Since the security of PKI is critically dependent on security of the cryptographic keys used by the certificate authority, one of the standard ways to harden such a system is to move key material into dedicated cryptographic hardware. In enterprise environments, this usually means a hardware security module (HSM) connected to the CA servers. Lately the meaning of “HSM” has been greatly watered-down by companies suggesting that a glorified smart-card qualifies- perhaps these designers envision an unorthodox data-center layout with card readers glued to the side of server racks. But if one is willing to live without the improved tamper-resistance and higher performance of a dedicated HSM product, there is a more attractive option built-in. Starting with Windows 8, a Trusted Platform Module can be used to emulate virtual smart cards based on the GIDS specification.

Regardless of hardware choice, from the tried-and-true FIPS140 certified massive box to the jankiest USB token from an over-enthusiastic DIY project, these solutions all have the same defining feature. Private key material used by the CA for signing certificates is stored on an external device. No matter how badly the Windows server itself has been compromised, that key can not be extracted. (Of course the device will happily oblige if the compromised server asks to sign any message. That can be almost as bad as having direct key access when the signature has high value, as Adobe found out in the code-signing case.)

Getting along with external cryptographic hardware

At the nuts and bolts level, getting this scenario to work requires that Windows have some awareness of external cryptographic hardware. Windows crypto stack includes an extensibility layer for vendors to integrate their own device by authoring smart card mini-drivers. Certificate Services role in turn has an option to pick a particular cryptographic service provider during setup:

Screenshot from ADCS configuration UI

As noted earlier, the smart-card provider is actually a “meta-provider” that can route operations to other hardware using a mini-driver for that model of hardware. So the most direct route would be:

Create a virtual-smart card on TPM and initialize it with PIN
Configure Certificate Server to use smart-card provider
Generate the signing key on the virtual smart card

By all appearances, this process appears to work during initial configuration of certificate services. When the CA is being initialized, a PIN prompt for the virtual smart-card will appear and after authenticating, a self-signed certificate will be created as expected. (Assuming we are going with the most common configuration of a root certificate. There are other options such as creating a CSR to source a certificate from a third-party; in that case the CSR will be created correctly.)

But when the service itself is started, something strange happens. The certificate server management console can not connect to the service as RCPs time out. It appears to be stuck; in fact it does not look like it started successfully. It can not be stopped or restarted either, short of killing the hosting process. So what is going on?

Shatter attacks

Explaining why the CA service got stuck involves a flash back to 2005. Prior to Vista, different applications showing UI on a Windows desktop were not isolated from each other. For example a privileged application running with administrator or system account could show a prompt in the same desktop session that unprivileged user applications operate. While this might seem obvious— where else would the UI appear if not on the user’s screen?— there is a serious security problem here. By design applications can send UI-related messages, called “window messages,” to each other. For example one application can send a message to simulate clicking on a button in another application or pasting text into a dialog box.

The original example of this vulnerability dubbed Shatter, involved more than just simulating button clicks or faking keystrokes. It relied on the existence of a specific message that includes a callback function- effectively instructing the application to invoke arbitrary piece of code. As envisioned, these callback functions were supposed to have been specified by the application itself and accordingly trusted. But nothing prevented a different app running under a different OS account from injecting the same type of message into the message queue of another application. When you can influence the flow of execution in a process to the point of making it jump to an arbitrary specified memory address, you have full control. (The original attack also relied on injecting shell-code into the memory space of the target via an earlier window message simulating pasting of text.)

But even if that dangerous message type was deprecated or applications modified to validate the incoming callback before invoking it, the broader architectural problem remains: applications running at different privilege levels can influence each other. For example if the user opened an elevated command line prompt running as administrator, even her unprivileged user applications could send keystrokes to that window, executing arbitrary commands as administrator.

Vista introduced proper UI isolation to address this problem. These changes also affected a special class of applications that normally would never be expected to show UI or interact with users: services. But it turns out many background services are not content to run invisible in the background, and occasionally feel compelled to converse with users. Session-0 isolation comes into play for that case. There are now multiple sessions in Windows, and services all operate in the special privileged session 0. Any UI displayed there would have no effect on the main desktop the user is staring at. This uses the same principle as terminal server: if multiple users were logged into a server, an application opened by one user would only render on his/her desktop, with no effect on others users with their own remote desktop session.

Interactive services detection

Hiding UI from background services under the rug may trades the security problem for an application-compatibility problem. Services are not supposed to display UI directly. Instead they are supposed to have an unprivileged counterpart in the user session they can communicate with via standard inter-process channels such as named-pipes or LPC. Knowing that developers do not necessarily know- much less care to follow- best practices, Windows team faced the problem of accommodating “legacy” services. As far as the service is concerned, there is nothing obviously wrong. It has rendered UI and is waiting for the user to make a decision, which appears to be taking forever.

Interactive services detection attempts to solve that problem by detecting such UI and alerting the user in their own session with a notification. By acting on the notification, user can switch to session 0 temporarily, which has only that one dialog from the interactive service visible, and deal with the prompt.

“Our code never makes that mistake”

That provides an explanation for why certificate services is stuck:

During initial configuration of certificate services role, the cryptographic hardware is being accessed from an MMC console process running in the user session. PIN collection dialog renders without a problem.
During sustained operation of certificate services, the same hardware is accessed from a background service.

So why did interactive service detection not kick-in and alert the user that there is some UI demanding attention in session 0?

The answer is an optimistic assumption made by MSFT that by “now” (defined as Windows Server 2012 time-frame) all legacy services will have been fixed, rendering interactive service detection redundant. In WS2012 the feature is disabled by default. It turns out even Windows Server 2008 had traces of that optimism: 64-bit services were exempt on the theory that developers porting their service from x86 to x64 might as well be forced to fix any interactivity. But in this case the “faulty” code is the built-in certificate service running in native 64-bit mode from MSFT: credential-collection prompts from the smart-card stack are showing up in session 0.

Luckily the feature can be enabled with a registry tweak. With interactive service detection enabled, when certificate services starts up, the expected notification does show up in user session. Switching to session 0 one finds the familiar PIN prompt for the virtual smart card. (Note that entering the PIN is not required each time the CA uses the external crypto device to sign. It is only collected once to create an authenticated session, allowing the system to operate as a true hands-off service after the operator has kick-started it.)

Taskbar notification for interactive service

Interactive services detection dialog on main desktop

The problem is not confined to use of virtual smart cards. Other vendors appear to have run into the same problem. Thales who manufactures the nSafe (formerly nCipher) line of HSMs has a white-paper noting that interactive service detection must be enabled for their product to operate correctly. Oddly enough, the configuration dialog for certificate server already has a checkbox to indicate that administrator interaction may be required for use of signing keys. That alone should have been a hint that this “background service” may in fact need to interact with the administrator, especially when the UI-related behavior of vendor-specific drivers can not be known in advance.

From stock options to predatory lending for tech employees

“We are an investment firm interested in structuring liquidity for current or former [company name redacted] employees. If you or anyone you know is interested in help exercising options […] or is making a major life purchase, we may be able to help.”

Not exactly your average spam message. Addressed by name and referring specifically to a particular start-up, this email contained an offer for an unusual type of financial transaction: loan to help offset the cost of exercising options. Welcome to the Silicon Valley version of predatory lending.

To make sense of the connection between high-risk loans and technology IPOs, we need to revisit the history of equity-based compensation. While employee ownership is by no means new phenomenon, starting with 1990s it became a significant if not primary component of employee compensation for technology startups. Leaving established companies to join startups often involved exchanging predictable salaries for higher-risk, higher-reward structure conferred by equities. In many ways, the original success story for equity in technology was a blue-chip company predating that first dot-com bubble: Microsoft. “Never touch your options” was the advice offered to new employees, urging them to hold out on exercising as long as possible— because the price of MSFT much like tulip bulbs (and later real-estate) could only go up, the argument went.

Equity rewards come in two main flavors: stock grants or stock options. (Strictly speaking there are also important distinctions within each category, such as incentive-stock options versus non-qualifying stock options which we will put aside here.) Stock grants are straightforward. Employees are rewarded a number of shares as part of employment offer, subject to a vesting schedule. Typically there is a “cliff” at one-year for starting to vest some fraction—employees who don’t last that long end up with nothing. This is followed by incremental additions monthly or quarterly until the end of the vesting schedule. It is also not uncommon to issue refresher grants yearly or based on performance milestones.

Options, or more precisely call-options, represent the right to buy shares of stock at a predetermined price. For example, employee Alice joins Acme Inc when company stock is trading at $100. That would be her “strike price.” She is given a grant, which represents some number of option, also subject to a vesting schedule. Fast forward a year later when vesting starts, let’s say Acme stock is trading at $120— representing a remarkable 20% yearly increase, which was not uncommon for technology companies in their early growth phase. These options allow Alice to still purchase her shares at the original $100 price and if she chooses to, turn around and sell at prevailing market price.

Due to vagaries of accounting, options were far more favorable to a company than outright grant of stock. Downside is they transfer significant risk downstream to employees. Notwithstanding the irrational exuberance of asset bubbles, it turns out there is no iron-clad rule guaranteeing a sustained increase in stock prices, not even for technology companies. Even if the company itself is relatively healthy, stock price can be dragged down by overall economic trends— such as recession— such that current market price may well be below the strike price at any given time, rendering the options worthless. There is also a great element of volatility. Employees with start dates only weeks apart may observe very different outcomes if there were wild price swings determining the initial price— not at all uncommon for small growth stocks. It is not even guaranteed that an earlier start is necessarily an advantage. If the strike price happens to be set right before a major correction, employees may find their options underwater for a long time. (After Microsoft experienced a steep decline following its ill-advised adventure with DOJ anti-trust trial and dot-com crash of 2000, some employees observed that quitting and re-applying for their current job would be the rational course of action.) Such volatility runs deep; small fluctuations in price can make a significant difference. Consider an employee whose strike price was $100, when it is trading at $110 currently. So far, so good. Another 10% increase in price would roughly doubles the value of those options. 10% decrease by contrast renders them completely worthless. Stock grants on the other hand are far more stable against such perturbations. A 10% movement in the underlying stock represents exactly 10% change in either direction in the value of the grant.

No wonder then that options have historically invited foul-play. There has been more than one scandal including at the mighty Apple for backdating. Executives have faced criminal charges for these shenanigans. Even Microsoft, the original pioneer of option-based compensation, eventually threw in the towel and switched to stock grants after years of stubborn persistence. Of course for MSFT there was a more compelling reason: the stock price plateaued, or even slightly declined in inflation adjusted terms rendering the options worthless for most employees. (Interestingly, Google also switched from options to RSU stock grants around 2010, even when the core business was still experiencing robust growth, unlike its mismanaged counterpart in Redmond. But Google also created a one-time opportunity for employees to exchange underwater options for new ones priced at post-2008 collapse valuation.)

So what does all this have to do with questionable loans for startup employees? Our scenario involving the hypothetical employee made one assumption: she can sell the stock after exercising the option. Recall that the option by itself is a right to buy the stock. In other words, using the option involves spending money and ending up in possession of actual stock. After that initial outlay, one can turn around immediately and sell those shares on the market to realize the difference as profit. This process is common enough to have its own designation: exercise-and-sell. Most employee stock-option plans allow employees to execute it without fronting the funds for initial purchase of stock. Similarly there is exercise-and-sell-to-cover, where some fraction is sold at market price to offset exercise costs while holding on to remaining shares. In neither case does the employee have to provide any funds upfront for the purchase of stock at the strike price.

Therein lies a critical assumption: that there exists a market for trading the shares. That holds for publicly traded companies but gets murky in pre-IPO situations. Facebook shares were traded on Second Market prior to the IPO. Next generation of startups took that lesson to heart and incorporated clauses into their employee agreements to remove that liquidity avenue. Typically the equity rewards have been structured such that shares are not transferrable and can not be used as collateral.

Suppose employee Alice is leaving her pre-IPO company to take another opportunity. If the equity plan involved outright grants, the picture is simple: Alice retains what she has vested until her final day of employment and leaves on the table the unvested portion. In the case of stock options, the picture gets more complicated. Option plans are typically contingent on employment. Not only does vesting end, but unexercised options are forfeited unless they are exercised before departure or within short time-window afterwards. That means if Alice wants to retain her investment in future success of the company, she must buy the stock outright at its current valuation (typically set based on private financing rounds, at artificially low 409A valuations) and pay tax on “gains” depending on the type of option involved. Of course such gains exist purely on paper at that point, since there is no way for Alice to capture the difference until there is a liquidity event to enable selling her equity. In other words, Alice is forced to take on additional risk to protect the value of options. Alice must come up with cash to cover both the purchase of an asset that has presently zero liquidity— and may well turn out to have zero value, if the company flames out before IPO— and income tax on the profits IRS believes have resulted from this transaction.

This is where the shady lenders enter the picture. Targeting employees at start-ups who have recently changed jobs on LinkedIn, these enterprising folks offer to lend money for covering the value of the options. While this blogger can not comment on the legality of such services, it is clear that both sides are taking on significant risk. Recall that agreements signed by the employee preclude transfer of unexercised options— more precisely, the company will not recognize the new “owner.” Such assets can’t be used as meaningful collateral for the loan, leaving the lender with no recourse if he/she were to default. They could report the default to credit bureaus or forgive the loan— in which case it becomes “income” as far as IRS is concerned, triggering additional tax obligations— but none of that reduces the lender exposure.

Meanwhile the employee is taking a gamble on the same outcome, namely that the executed future value of these options will be realized. If the company does not IPO or market valuation ends up significantly lower than the projections used for purposes of calculating exercise costs, they will end up in the red against the loan. That is materially worse than ending up with underwater options. Recall that an option is a right but not an obligation to purchase stock at previously agreed upon (and one hopes, lower than current market value) price. There is no requirement to exercise an option that is underwater; the rational strategy is to walk away from it. But once options are exercised and shares owned outright, the employee is fully exposed to risk of future price swings. They have effectively “bought” those options on margin, except that instead of a traditional brokerage house extending margin with full control over the account, it is a new breed of opportunistic predatory lenders.

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

Random Oracle

Building and breaking systems

Month: May 2015