Trading cryptocurrency without trusted third-parties (part II)

[continued from part I]

To recap the scenario: Alice and Bob are interested in trading bitcoin (BTC) for ether (ETH). Alice owns BTC, Bob has ETH, and they have agreed on pricing and quantity. (Note we are fast-forwarding past the scene where Alice and Bob miraculously located each other and organized this trade. That is one of the most valuable functions of a market, a point that we will return to.) Now they want to set up a fair-exchange where Alice only receives her ETH if Bob receives the corresponding amount of BTC.

Fragility of ECDSA as a feature

One way to do this involves turning what could be considered a “bug” in the ECDSA signature algorithm—used by both Bitcoin and Ethereum—into a feature. ECDSA is a randomized signature algorithm. Signing a message involves picking a random nonce each time. The random choice of nonce for each operation means even signing the same message multiple times can yield a different result each time. This is in contrast to RSA for example, where the most common padding mode is deterministic: processing the same message again will yield the exact same signature.** It is critical for this nonce to be unpredictable and unique, otherwise the security of ECDSA completely breaks down:

  • If you know the nonce, you can recover the private key.
  • If the same unknown nonce is reused across different messages, you can recover the private key, as sketched in the code after this list. (Just ask Sony about their PlayStation code-signing debacle.)
  • It gets worse: if multiple messages are signed with different nonces that have a known relationship (for example, a linear combination of some nonces equals another one), you can still recover the private key.
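
To make the nonce-reuse failure concrete, here is a minimal sketch of the standard key-recovery calculation in Python, assuming two signatures (r, s1) and (r, s2) over message hashes z1 and z2 that share the same nonce; the constant is the secp256k1 group order used by both Bitcoin and Ethereum.

    # Sketch: recover an ECDSA private key from two signatures that reuse a nonce.
    # Inputs are integers: message hashes z1, z2 and signatures (r, s1), (r, s2).
    N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # secp256k1 group order

    def recover_private_key(z1, s1, z2, s2, r):
        # s_i = (z_i + r*d) / k (mod N); subtracting the two equations isolates k
        k = (z1 - z2) * pow(s1 - s2, -1, N) % N
        # back-substitute to solve for the private key d
        return (s1 * k - z1) * pow(r, -1, N) % N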

That makes ECDSA highly fragile, critically dependent on a robust source of randomness. It also means implementations are susceptible to backdoors: a malicious version can leak private keys by rigging the nonce while appearing to operate correctly, since it still produces valid signatures. Variants have been introduced to improve this state of affairs. For example, deterministic ECDSA schemes compute the nonce as a one-way function of the secret key and the message, without relying on any source of randomness from the environment.
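
As a rough illustration of the idea (deliberately simplified, not the actual RFC 6979 construction, which uses a more careful HMAC-DRBG procedure), a deterministic nonce can be derived along these lines:

    # Toy illustration of a deterministic nonce: derive k from the private key
    # and the message instead of an RNG. This only conveys the principle; real
    # schemes such as RFC 6979 are more involved.
    import hashlib, hmac

    N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # secp256k1 group order

    def deterministic_nonce(private_key: int, message: bytes) -> int:
        mac = hmac.new(private_key.to_bytes(32, "big"), message, hashlib.sha256).digest()
        return int.from_bytes(mac, "big") % (N - 1) + 1   # force into the range [1, N-1]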

But this same fragility can prove useful as a primitive for exchanging funds across different blockchains, by deliberately forcing disclosure of a private key. Specifically, it is possible to craft an Ethereum smart-contract that releases funds on the condition of observing two valid signatures for different messages with the same nonce.

Setup

  • Alice has her public-key A, which can be used to create corresponding addresses on both the Bitcoin and Ethereum blockchains.
  • Bob likewise has public-key B.
  • Alice generates a temporary ECDSA key, the “transfer-key” T.

Before starting execution, Alice rearranges her funds and moves the agreed-upon quantity of bitcoin into a UTXO with a specific redeem script (a sketch of such a script follows the list). The script is designed to allow spending if either one of these two conditions is satisfied:

  • One signature using Alice’s own public key A, but only after some time Δ has elapsed. This is a time-lock enabled by the check-locktime-verify instruction.
  • 2-of-2 multi-signature using Bob’s public key B and the transfer key T.
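
A sketch of what such a redeem script could look like, spelled out with standard opcode names; the keys and locktime below are placeholders, and a real implementation would serialize this into P2SH form with a proper Bitcoin library:

    # Sketch of the two-path redeem script guarding Alice's UTXO.
    # Placeholder values only; not a tested, serialized script.
    pubkey_A = "<Alice's public key A>"
    pubkey_B = "<Bob's public key B>"
    pubkey_T = "<transfer key T>"
    locktime = 500000   # block height (or timestamp) after which Alice can reclaim

    redeem_script = f"""
    OP_IF
        {locktime} OP_CHECKLOCKTIMEVERIFY OP_DROP
        {pubkey_A} OP_CHECKSIG
    OP_ELSE
        OP_2 {pubkey_B} {pubkey_T} OP_2 OP_CHECKMULTISIG
    OP_ENDIF
    """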

Once this UTXO is confirmed, Alice sends Bob a pointer to the UTXO on the blockchain. In practice she would also have to send the redeem script itself, so that Bob can verify it has been constructed as agreed. (Since the P2SH address is based on a one-way hash of the script, it is not possible in general to infer the original script from an address alone.)

Once Bob is satisfied that Alice has put forward the expected Bitcoin amount subject to the right spending conditions, he sets up an Ethereum contract. This contract has two methods:

  • Refund(): Can only be called by Bob using B and only after some future date. Sends all funds back to Bob’s address. This is used by Bob to reclaim funds tied up in the contract in case Alice abandons the protocol.
  • Exchange(signature1, signature2): This method is called by Alice and implements the fair-exchange logic (a sketch of the check follows the list). It expects two signatures using the transfer-key T over predefined messages, which can be fixed ahead of time, such as “foo” and “bar”. The method verifies that both signatures are valid and, more importantly, that they reuse the same ECDSA nonce. (In other words, the private key for T has been disclosed.) If these conditions are met, the contract sends all of its available balance to Alice’s address.
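
For concreteness, here is roughly what that check amounts to, transliterated into Python using the third-party ecdsa package; the on-chain version would express the same logic with Solidity/EVM primitives, and the messages are just the placeholder values from the description above.

    # Sketch of the fair-exchange check performed by Exchange(), expressed in
    # Python for readability. Assumes the third-party "ecdsa" package.
    import hashlib
    from ecdsa import VerifyingKey, BadSignatureError, SECP256k1
    from ecdsa.util import sigdecode_string

    MSG1, MSG2 = b"foo", b"bar"   # messages fixed when the contract is created

    def exchange_check(pubkey_T: VerifyingKey, sig1: bytes, sig2: bytes) -> bool:
        try:
            # both signatures must verify under the transfer key T...
            pubkey_T.verify(sig1, MSG1, hashfunc=hashlib.sha256)
            pubkey_T.verify(sig2, MSG2, hashfunc=hashlib.sha256)
        except BadSignatureError:
            return False
        # ...and must share the r component, i.e. reuse the same nonce,
        # which is what discloses the private key for T
        r1, _ = sigdecode_string(sig1, SECP256k1.order)
        r2, _ = sigdecode_string(sig2, SECP256k1.order)
        return r1 == r2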

Alice in turn needs to verify that this contract has been set up correctly. As a practical matter, all instances of the contract can share the same source code, differentiated only by the parameters they receive during contract creation. These constructor parameters are the Ethereum addresses for Alice and Bob, along with the public key for T to check signatures against. That way there is no need to reverse-engineer the contract logic from EVM byte-code. A single reference implementation can be used for all invocations of the protocol. Only the constructor arguments need to be compared against expected values, along with the current contract balance.

Assuming this smart-contract is set up correctly, Alice can proceed with taking delivery of the ETH from Bob. She signs two messages with her private key, reusing the same nonce for both. Then she invokes the Exchange method on the contract with these signatures. Immutability of smart-contract logic dictates that upon receiving two signatures with the right properties, the contract has no choice but to send all its funds to Alice.
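
A minimal sketch of how those two signatures could be produced, again assuming the third-party ecdsa package and its optional k argument to sign(), which lets the caller fix the nonce:

    # Sketch: deliberately sign two different messages with the same nonce,
    # which allows anyone observing both signatures to recover the key for T.
    import hashlib
    import secrets
    from ecdsa import SigningKey, SECP256k1
    from ecdsa.util import sigdecode_string

    transfer_key = SigningKey.generate(curve=SECP256k1)          # the transfer key T
    shared_nonce = secrets.randbelow(SECP256k1.order - 1) + 1    # reused on purpose

    sig1 = transfer_key.sign(b"foo", hashfunc=hashlib.sha256, k=shared_nonce)
    sig2 = transfer_key.sign(b"bar", hashfunc=hashlib.sha256, k=shared_nonce)

    r1, _ = sigdecode_string(sig1, SECP256k1.order)
    r2, _ = sigdecode_string(sig2, SECP256k1.order)
    assert r1 == r2   # same nonce => same r component => private key for T disclosed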

At this point Alice has her ETH but Bob has not claimed his BTC. This is where the fair-exchange logic comes into play: Alice staked her claim to the ETH by deliberately disclosing the private key T. Looking back at the redeem script for Alice’s UTXO on the blockchain, possession of T and Bob’s key B allows taking control of those funds. Bob can now sign a transaction using both private keys to move that BTC to a new address he controls exclusively. Meanwhile Alice is prevented from taking those funds back herself because of the timelock.

The fine-print: caveats and improvements

A few subtleties about this protocol are worth calling out. Invoking Exchange() on the contract means the entire world learns the private key for T, not just Bob; blockchain messages are broadcast so all nodes can verify correct execution. Why not have Alice send one of the signatures to Bob out-of-band, in private? A related question is why not allow the Bitcoin funds to be moved using the transfer-key T only, instead of requiring a multi-signature? The answer to both is that Bob can not count on his knowledge of T being exclusive. Even if the Ethereum smart-contract only expected a single signature (having the expected nonce hard-coded), Alice could still publish the private key for T to the entire world after she receives her ETH. If the funds depended only on the single key T for control, it would become a race between Bob and everyone else in the world to claim them. Alice does not care; once she discloses T, someone will take her BTC. But Bob cares very much that he is the only possible recipient and does not have to race against others to get their transaction mined first. Including an additional key B known only to Bob guarantees this, while also making it moot whether other people come into possession of the private key for T.

Speaking of race conditions, there is still one case of Bob racing against the clock: he must claim the bitcoin before the time-lock on the alternative spending path expires. Recall that Alice can claw back her bitcoin after some time/block-height is reached. That path is reserved for the case when the protocol does not run to completion, for example if Bob never publishes the Ethereum smart-contract. But even after Bob has published the contract and Alice invoked it to claim her ETH, the alternative redemption path remains. So there is an obligation for Bob to act in a timely manner. The deadline is driven by how the time-locks are chosen. Recall that the Ethereum smart-contract also has a deadline after which Bob can claw the funds back if Alice fails to deliver T. If this is set to, say, midnight on a given day while the Bitcoin UTXO is time-locked to midnight the next day (these are approximate, especially when specified as block-height since mining times are randomly distributed), then Bob has 24 hours to broadcast the transaction. That time window can be adjusted based on the preferences of the two sides, but only at the risk of increasing recovery time after the protocol is abandoned. In that situation Alice is stuck waiting out the expiration of this lock before she can regain control of her funds.

Another limitation in the basic protocol as described is lack of privacy. The transaction is linkable across blockchains: the keys A, B and T are reused on both sides, allowing observers to trace funds from Bitcoin into Ethereum. This situation can be improved. There is no reason for Alice to reuse the same key A for reclaiming her Bitcoin as the key she uses to receive Ethereum from Bob. (In fact Bob only cares about the second one, since that is given as a parameter to the contract.) Similarly Bob can split B into two different keys. Dealing with T is a little trickier. At first it looks like this key must be identical on both chains for the private-key disclosure to work. But there is another trick Alice and Bob can use. After Alice gives the public key for T to Bob, Bob can craft his Ethereum contract to expect the related key T* = m·T for a random scalar m used to mask the original key. He in turn shares this masking factor with Alice. Since Alice has the private key for T, she can also compute the private key for T* by simply multiplying with m. When she discloses that private key, Bob can recover the original key for T by using the inverse of m. Meanwhile to outside observers the keys T and T* appear unrelated. This provides a form of plausible deniability. If many people were engaging in transactions of this exact format with identical parameters, it would not be possible to link the Bitcoin side of the exchange to the Ethereum side. But “identical parameters” is the operative qualification. If Alice and Bob are trading 1 BTC while Carol and David are trading 1000 BTC, the transactions are easily separated. Similarly, if the time-locks on the ETH and BTC sides do not overlap, it becomes possible to rule out an ETH contract as being the counterpart of a BTC transaction posted around the same time.
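
A sketch of the masking arithmetic on the private scalars, assuming the secp256k1 group order; the corresponding public keys T and T* = m·T would be derived from these scalars by the usual point multiplication:

    # Sketch of the key-masking trick on private scalars.
    import secrets

    N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # secp256k1 group order

    t = secrets.randbelow(N - 1) + 1    # Alice's private key for the transfer key T
    m = secrets.randbelow(N - 1) + 1    # Bob's random masking factor, shared with Alice

    t_star = (m * t) % N                          # private key for T* = m*T, disclosed on Ethereum
    recovered_t = (t_star * pow(m, -1, N)) % N    # Bob unmasks it to recover the key for T
    assert recovered_t == t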

Finally, an implementation detail: why use the repeated-nonce trick for disclosing the private key instead of simply sending the private-key bits to the contract? Because the Solidity language used for writing smart-contracts has a convenient primitive for verifying ECDSA signatures given a public key. It does not have a similar primitive to check whether a given private key corresponds to a public key. In fact it makes sense for Solidity to have no facilities for working with private keys. Since all smart-contract execution is public, the assumption is that only publicly available information would ever be processed by the contract and never secret material. For this reason we resort to the nonce-reuse trick. The Ethereum virtual machine also has the additional primitives required to compare two signatures for nonce equality. Interestingly, the Bitcoin script language is exactly one instruction shy of being able to accomplish that. The instruction OP_CAT is already defined in the scripting language but currently disabled, and for good reason: without other limits, it can be used as a denial-of-service vector. But if OP_CAT were enabled, it could be used to construct a redeem script that receives ECDSA signatures in suitably encoded form (nonce and second component as individual stack operands) and checks them for nonce reuse. Other “splicing” opcodes such as OP_SUBSTR can achieve the same effect by parsing the full ASN.1-encoded ECDSA signature to extract the nonce piece into an individual stack operand, where it can be compared for equality against another nonce. Either way, it would allow inverting the protocol sequence: Bob posts a smart-contract on the Ethereum blockchain first, Alice sets up the corresponding Bitcoin UTXO, and Bob proceeds to claim it by disclosing the transfer key.

[continued]

CP

** RSA does have a randomized padding mode as well called PSS.

Trading cryptocurrency without trusted third-parties (part I)

[Full disclosure: the author works on security for a cryptocurrency exchange]

The collapse of Mt. Gox in 2014 and its aftermath have inspired a healthy dose of skepticism towards storing cryptocurrency with online services. They have also inspired the search for decentralized exchange models where the functionality provided by Mt. Gox can be realized without a single point-of-failure where all risk is concentrated. While the mystery of what went on at Mt. Gox remains unresolved to this day, blockchain designs have continued to evolve. Bitcoin itself has not changed much at the protocol level, although it added a couple of new instructions to the scripting language. More significant advances happened with the introduction of segregated-witness, along with the emergence of so-called “layer 2” solutions for scaling such as the Lightning Network. Even more promising is the emergence of alternative blockchains capable of expressing more complex semantics, most notably Ethereum with its Turing-complete smart-contract language. This makes it a good time to revisit the problem of decentralized cryptocurrency exchange without the concentration risk created by storing funds.

Answering this question in turn requires reexamining the purpose of an exchange. At the simplest level, an exchange connects buyers and sellers. Sellers post the quantity they are willing to part with and a price they are willing to accept. Buyers in turn place bids to purchase a specific quantity at a price of their choosing. When these two sides “cross”—the bid meets or exceeds an ask— a trade is executed. The exchange facilitates the transfer of assets in both directions, delivering assets to the buyer while compensating the seller with the funds provided by the buyer.

In an ideal world where everyone is good for their word, this arrangement does not require parking any funds with the exchange. If Alice offers to sell 1 BTC and Bob has agreed to purchase it for $1200, we can count on Alice to deliver the cryptocurrency and Bob to send the US dollars. In this hypothetical universe, they do not have to place funds in escrow with the exchange or for that matter any other third-party. Bob can wire fiat currency to Alice’s bank account and Alice sends bitcoin over the blockchain to Bob’s address. In reality of course people frequently deviate from the expected protocol, violate contractual obligations or engage in outright fraud. Perhaps Bob never had the funds to begin with, or he had a change of heart after finding a cheaper price on another exchange once he had already agreed to the trade with Alice.

These are examples of counter-party risk, and it becomes increasingly unmanageable at scale. It would be one thing if Alice and Bob happened to know each other, or expected to be doing business continuously—in these scenarios “defecting” and trying to cheat the other side becomes counterproductive. With thousands of participants in the market and interactions between any pair being infrequent, there is not much of an opportunity to build up a reputation. It is infeasible for everyone to keep tabs on the trustworthiness of every potential counter-party they may be trading with, and doing so would disadvantage new participants who have no prior history to evaluate.

The standard model for exchanges provides one possible solution to this problem: Alice and Bob both deposit their funds with the exchange. The exchange is responsible for ensuring that all orders are fully covered by funds under custody. Using the example of BTC/USD trading, Alice can only offer to sell Bitcoin she has stored at the exchange and Bob can only place buy orders that his fiat balance can cover. Bob can be confident that the assets he just bid on are not phantom-Bitcoins that may fail to materialize after the trade completes. Likewise Alice knows she is guaranteed to receive USD regardless of which customer ends up being paired with her order.

The counter-party risk is mitigated, but only at the expense of creating new challenges. In this model, the exchange becomes a custodian of funds for everyone participating in the market. Aside from the obvious risk of a Mt. Gox-type implosion, it creates a liquidity problem for these actors: their funds are tied up. Consider that a trader will be interested in betting on multiple cryptocurrencies across multiple exchanges. Even within a single trading pair such as USD/BTC, there are significant disparities in prices across exchanges, creating arbitrage opportunities. But exploiting such disparities requires either maintaining positions everywhere or rapid movement of funds between exchanges. Speed of Bitcoin movement is governed by mining time—which is an immutable property of the protocol, fixed at 10 minutes on average—and competition against other transactions vying for scarce room in the next block. In principle fiat currency can be moved much faster using the Federal Reserve wire system, but that too depends on the implementation of wire-transfer functionality at each exchange. All of this spells increased friction for moving in and out of markets, as well as a greater amount of capital committed at multiple exchanges in anticipation of trading opportunities.

Is it possible to eliminate counter-party risk without introducing these inefficiencies? Over the years, alternative models have been put forward for trading cryptocurrency while eliminating or at least greatly reducing the concentration of risk. For example Bitsquare bills itself as a decentralized exchange, noting that it does not hold any user funds. Behind the scenes, this is achieved by relying on trusted arbitrators to mediate exchanges and resolve disputes:

“If Trader A fails to confirm the receipt of a national currency transfer within the allotted time (e.g. six days for SEPA, one day for OKPay, etc.), a button to contact the arbitrator will appear to both traders. Trader B will then be able to submit evidence to the arbitrator that he did, in fact, send the national currency. Alternatively, if Trader B never sent the national currency, Trader A will be able to submit evidence to the arbitrator that the funds were never received.”

In other words, counter-party risk is managed by having humans in the loop acting as trusted third-parties, rendering judgment on which side of the trade failed to live up to their obligations. The system is designed with economic incentives to encourage following the protocol: backing out of a trade or failing to deliver the promised asset does result in loss of funds for the party at fault. (Interestingly enough, the punitive damages are awarded to the arbitrator, rather than the counter-party inconvenienced by the transgression. It is practically in the interest of arbitrators to have participants misbehave, since they get to collect additional payments above and beyond their usual fee.) Arbitrators are also required to post a significant bond, which they will lose if they are caught colluding with participants to deviate from the protocol.

Even with the fallibility of human arbitrators, this system achieves the stated goal of diffusing risk: instead of relying on the exchange to safeguard all funds, participants rely on individual arbitrators to watch over much smaller amounts at stake in specific trades. But there are other types of risk this arrangement can not hedge against, notably that of charge-backs. This is a very common challenge when trying to design a system for trading fiat currency against cryptocurrency. Blockchain transfers are irreversible by design. By contrast, most common options for transmitting fiat can be reversed in case they are disputed. For example, if an ACH transfer is initiated using stolen online banking credentials, the legitimate owner can later object to this transaction by notifying their bank in writing. Depending on the situation, they may have up to 60 days to do so. If the bank is convinced that the ACH was unauthorized, they can reverse the ACH transfer. What this means is that Alice can face an unpleasant surprise many weeks after releasing Bitcoin to Bob. Bob— or whoever owns the account Bob used to send those funds— can recover the funds Alice received as proceeds, leaving her holding the proverbial bag, since she has no recourse to clawing back bitcoin.

Also note that functionality is somewhat reduced compared to a traditional exchange. As the FAQ notes, the settlement phase can take multiple days depending on how fiat currency is sourced. Bitcoin purchased this way is not available immediately; it can not be transferred to a personal wallet or used to pay for purchases. That is a stark contrast with a conventional exchange, where settlement is nearly instantaneous. Once the trade has executed, either side can take their USD or BTC and use it right away, withdraw it to another address, or place orders for a different pair such as BTC/ETH. In P2P models, availability of funds depends on the fiat payment clearing to the satisfaction of the counter-party, and that person getting around to sending the cryptocurrency. High-frequency trading in the blink of an eye, this is not.

Looking beyond fielded systems to what is possible in theory, we can ask whether there are any results in cryptography that can provide a basis for truly decentralized, trust-free trading of currencies. Here the news is somewhat mixed.

This problem in the abstract has been studied under the rubric of fair-exchange. A fair-exchange protocol is an interactive scheme for two parties to exchange secrets in an all-or-nothing manner. That is, Alice has some secret A and Bob has a different secret B. The goal is to design a protocol such that after a number of back-and-forth messages, one of two outcomes happen:

  • Alice has obtained B and Bob has obtained A.
  • Neither one has learned anything new.

This protocol is “fair” because neither side comes out ahead in any outcome. By contrast, if there was an outcome where Alice learns B and Bob walks away empty-handed, the result would be decidedly unfair to Bob. There is a nagging question here of how participants can verify the value and/or legitimacy of their respective secrets ahead of time. But assuming that problem can be solved, such protocols would be incredibly useful in many contexts including cryptocurrency. For example if A happens to be a private-key controlling an Ethereum account while B controls some bitcoin, one could implement BTC/ETH trade by arranging for an exchange of those secrets.

Now the bad news: there is an impossibility result proving that such protocols can not exist. A 1999 paper titled “On the Impossibility of Fair Exchange without a Trusted Third Party” shows exactly what the title says: there exists no protocol which can achieve the above objectives with only Alice and Bob in the picture. There must be an impartial referee Trent such that if either Alice or Bob deviate from the protocol, Trent can intervene and force the protocol to produce an equitable outcome. The silver lining is that the negative result does not rule out so-called optimistic fair-exchange, where third-party involvement is not required provided everyone duly performs their assigned role. The referee is only asked to intervene when one side deviates from the expected sequence. But “hope is not a method,” as the saying goes. Given the sordid history of scams and fraudulent behavior in cryptocurrency, counting on everyone to follow the protocol is naive.

On paper this does not bode well for the vision of implementing trust-free exchange. But this is where blockchains provide a surprising assist: it has been observed that the blockchain itself can assume the role of an impartial third-party. Here is a simple example from 2014 where Andrychowicz et al. leverage Bitcoin to improve on a well-known cryptographic protocol for coin-flipping. Slightly simplified, the original protocol proceeds this way (a code sketch follows the list):

  1. Alice and Bob both pick a random bit string
  2. They “commit” to their strings, by computing a cryptographic hash of that value and publishing that commitment
  3. After both have committed, each side “opens” the commitment by revealing the original string
  4. Since the hash function is public, both sides can check that commitments were opened correctly
  5. Alice and Bob now compare the least-significant bits of the two unveiled strings. If those bits are identical, Alice wins the coin-toss. Otherwise Bob wins.
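
Here is a minimal sketch of those steps, assuming SHA-256 commitments over random 32-byte strings; a real protocol would also bind the parties’ identities and other context into the commitments:

    # Sketch of the commit-reveal coin flip.
    import hashlib
    import secrets

    def commit(value: bytes) -> bytes:
        return hashlib.sha256(value).digest()

    # Steps 1-2: each side picks a random string and publishes only its hash
    alice_value, bob_value = secrets.token_bytes(32), secrets.token_bytes(32)
    alice_commitment, bob_commitment = commit(alice_value), commit(bob_value)

    # Steps 3-4: after both commitments have been exchanged, the strings are
    # revealed and each side checks the opening against the earlier commitment
    assert commit(alice_value) == alice_commitment
    assert commit(bob_value) == bob_commitment

    # Step 5: compare least-significant bits; identical bits mean Alice wins
    alice_wins = (alice_value[-1] & 1) == (bob_value[-1] & 1)
    print("Alice wins" if alice_wins else "Bob wins")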

This is great in theory, but what happens if Bob stops at step #3? After all, once Alice reveals her commitment, Bob has full knowledge of both strings. He can already see the writing on the wall if he lost. That would be a great time to feign network connection issues, a Windows 10 upgrade or any other excuse to stop short of revealing his original choice, preventing Alice from obtaining the information necessary to prove she won the coin-toss.

Enter Bitcoin. Blockchains allow defining payments according to predetermined rules. Those rules have fixed capabilities; they can not magically reach out into the real world, dive-tackle Bob and compel him to continue protocol execution. But they can arrange for the next best outcome: make it economically costly for Bob to deviate from the protocol. Specifically, Alice and Bob must both commit some funds as a good-faith deposit at the outset. To reclaim their money, they must open the commitment and reveal their original bit-string by a set deadline. If either side fails to complete the protocol in a timely manner, the other party can claim their deposit. This outcome is “fair” in the sense that Bob backing out (regardless of how creative his excuse is) results in Alice being compensated.

Variants of this idea can be used to design protocols for fair exchange of crypto-currency between different blockchains. The next post will look at a specific example involving Bitcoin and Ethereum. This is admittedly a case of looking for keys under the lamp-post; developing protocols to exchange crypto-currencies is much easier than trading against fiat. Blockchain payments proceed according to well-defined mathematical structures. By contrast, fiat movement involves notions such as ACH or wire-transfers that are extrinsic to the blockchain, and not easily mapped to those constructs.

[continued in part II]

CP

Bitcoin and the C-programmer’s disease

Revenge of the C programmer

The Jargon File, a compendium of colorful terminology from the early days of computing later compiled into “The New Hacker’s Dictionary,” defines the C programmer’s disease as the tendency of software written in that particular programming language to feature arbitrary limits on its functionality:

C Programmer’s Disease: noun.
The tendency of the undisciplined C programmer to set arbitrary but supposedly generous static limits on table sizes (defined, if you’re lucky, by constants in header files) rather than taking the trouble to do proper dynamic storage allocation. If an application user later needs to put 68 elements into a table of size 50, the afflicted programmer reasons that he or she can easily reset the table size to 68 (or even as much as 70, to allow for future expansion) and recompile. This gives the programmer the comfortable feeling of having made the effort to satisfy the user’s (unreasonable) demands, …

Imagine spreadsheets limited to 50 columns, word-processors that assume no document will exceed 500 pages or a social network that only lets you have one thousand friends. What makes such upper bounds capricious—earning a place in the jargon and casting aspersions on the judgment of C programmers everywhere— is that they are not derived from any inherent limitation of the underlying hardware itself. Certainly handling a larger document takes more memory or disk space. Even the most powerful machine will max out eventually. But software afflicted with this problem pays no attention to how much of either resource the hardware happens to possess. Instead the software designers in their infinite wisdom decided that no sane user needs more pages/columns/friends than what they have seen fit to define as universal limit.

It is easy to look back and make fun of these decisions with the passage of time, because they look incredibly short-sighted. “640 kilobytes ought to be enough for anybody!” Bill Gates allegedly said in reference to the initial memory limit of MS-DOS (although the veracity of this quote is often disputed.) Software engineering has thankfully evolved beyond using C for everything. High-level languages these days make it much easier to do proper dynamic resource allocation, obviating the need for guessing at limits in advance. Yet more subtle instances of hardwired limits keep cropping up in surprising places.

Blocked on a scaling solution

The scaling debate in Bitcoin is one of them. There is a fundamental parameter in the system, the so-called block-size, which has been capped at a magic number of 1MB. That number has a profound effect on how many transactions can take place, in other words how many times funds can be moved from one person to another, the sine qua non of a payment system. When there are more transactions than available block space, congestion results: transactions take longer to appear in a block and miners can become more picky about which transactions to include.

Each transaction includes a small fee paid to miners. In the early days of the network, these fees were so astronomically low that Bitcoin was being touted as the killer-app for any number of problems with entrenched middlemen. (In one example, someone moved $80M paying only cents in fees.) Losing too much of your profit margin to credit-card processing fees? Accept Bitcoin and skip the 2-3% “tax” charged by Visa/Mastercard. No viable alternative to intrusive advertising to support original content online? Use Bitcoin micro-payments to contribute a few cents to your favorite blogger each time you share one of their articles. Today transaction fees can exceed $1 on average. No one is seriously suggesting paying for coffee in Bitcoin any longer, but some other scenarios, such as cross-border remittances where much larger amounts are typically transferred at near-usurious rates charged by incumbents like Western Union, remain economically competitive.

Magic numbers and arbitrary decisions

Strictly speaking the block-size cap is not a case of the C programmer’s disease. Bitcoin Core having been authored in C++ has nothing to do with the existence of this limit. To wit, many other parameters are fully configurable or scale automatically to utilize available resources on the machine where the code runs. The block-size is not an incidental property of the implementation; it is a deliberate decision built into the protocol. Even alternative implementations written in other languages are required to follow it. The seemingly-innocuous limit was introduced to prevent disruption to the network caused by excessive blocks. In other words, there are solid technical reasons for introducing some limit. Propagating blocks over the network gets harder as their size increases, a problem acutely experienced by the majority of mining power, which happens to be based in China and relies on high-latency networks behind the Great Firewall. Even verifying that a block is correctly produced is a problem, due to some design flaws in how Bitcoin transactions are signed. In the worst-case scenario the complexity of block verification scales quadratically: a transaction twice as large can take four times as much CPU time to verify. (A pathological block containing such a giant transaction was mined at least once, in what appears to have been a well-intentioned attempt by a miner to clean up previous errant transactions. Creating such a transaction is much easier than verifying it.)

In another sense, there is a case of the C programmer attitude at work here. Someone, somewhere made an “executive decision” that 1MB blocks are enough to sustain the Bitcoin network. Whether they intended that as a temporary stop-gap measure to an ongoing incident, to be revisited later with a better solution, or as an absolute ceiling for now and ever is open to interpretation. But one thing is clear: that number is arbitrary. From the fact that a limit must exist, it does not follow that 1MB is that hallowed number. There is nothing magical about this quantity to confer an aura of inevitable finality on the status quo. It is a nice, round number pulled out of thin air. There was no theoretical model built to estimate the effect of block-size on system properties such as propagation time, orphaned blocks, or the bandwidth required for a viable mining operation—the last one being critical to the idea of decentralization. No one solved a complex optimization problem involving varying block-sizes and determined that an even 1000000 bytes is the ideal number. That was not even done in 2010, much less in the present moment where presumably different, better network conditions exist around bandwidth and latency. If anything, when academic attention turned to this problem, initial results based on simulation suggested that the present population of nodes can accommodate larger blocks.

Blocksize and its discontents

Discontent around the block-size limit grew louder in 2015, opening the door to one of the more acrimonious episodes in Bitcoin history. The controversy eventually coalesced around two camps. The opening salvo came from a group of developers who pushed for creating an incompatible version called Bitcoin XT, with a much higher limit: initially 20MB, later “negotiated” down to 8MB. Activating this version would require a disruptive upgrade process across the board, a hard-fork where the network risks splintering into two unless the vast majority of nodes upgrade. Serious disruption can result if a sizable splinter faction continues to run the previous version, which rejects large blocks. Transactions appearing in these super-sized blocks would not be recognized by this group. In effect Bitcoin as an asset would itself splinter into two. For each Bitcoin there would have been one “Bitcoin XT” owned on the extended ledger with large blocks and one garden-variety old-school Bitcoin owned on the original ledger. These two ledgers would start out identical but later evolve as parallel universes, diverging further with each transaction that appears on one chain without being mirrored in the other.

To fork or not to fork

If the XT logic for automatically activating a hard-fork sounds like a reckless ultimatum to the network, the experience of the Ethereum project removed any doubts on just how disruptive and unpredictable such inflection points can get. An alternative crypto-currency built around smart contracts, Ethereum had to undertake its own emergency hard-fork to bail out the too-big-to-fail DAO. The DAO (Decentralized Autonomous Organization) was an ambitious project to create a venture capital firm as a smart-contract running on Ethereum, with direct voting on proposals by investors. It had amassed $150M in funds until an enterprising crook noticed that the contract contained a security bug and exploited it to siphon funds away. The Ethereum Foundation sprang into action, arranging for a hard-fork to undo the security breach and restore stolen funds back to the DAO participants. But not everyone in the community was impressed. Equating this action to the crony-capitalism and bailouts of failed institutions common in fiat currencies—precisely the interventionist streak that crypto-currencies were supposed to leave behind—a vocal minority declined to go along. Instead of going along with the fork, they dedicated their resources to keeping the original Ethereum ledger going, now rebranded as “Ethereum Classic.” To this day ETC survives as a crypto-currency with its own miners, its own markets for trading against other currencies (including USD) and most importantly its own blockchain. In that parallel universe, the DAO theft has never been reverted and the alternate ending of the DAO story is the thief riding off into the sunset holding bags of stolen virtual currency.

The XT proposal arrived on the scene a full year before Ethereum provided this object lesson on the dangers of going full-speed ahead with contentious forks. But the backlash against XT was nevertheless swift. Ultimately one of its key contributors rage-quit, calling Bitcoin a failed experiment. One year after that prescient comment, the Bitcoin price had tripled, proving Yogi Berra’s maxim about the difficulty of making predictions. But the scaling controversy would not go away. Blocks created by miners continued to edge closer to the absolute limit, and the fees required to get transactions into those blocks started to fluctuate and spike, as did confirmation times.

Meanwhile the Bitcoin Core team quietly pursued a more cautious, conservative approach, opting to introduce non-disruptive scaling improvements, such as faster signature verification to improve block verification times. This path avoided creating any ticking time-bombs or implied upgrade-or-else threats for everyone in the ecosystem. But it also circumscribed what types of changes could be introduced, since maintaining backwards compatibility is a nonnegotiable design goal. The most significant of these improvements was segregated-witness. It moves part of the transaction data outside the space allotted to transactions within a block. This also provides a scaling improvement of sorts, a virtual block-size increase without violating the sacred 1MB covenant: by slimming down the representation of transactions on the ledger, one could squeeze more of them into the same scarce space available in one block. The crucial difference: this feature could be introduced as a soft-fork. No ultimatums to upgrade by a certain deadline, no risk of network-wide chaos in case of failure to upgrade. Miners indicate their intention to support segregated witness in the blocks they produce. The feature is activated when a critical threshold is reached. If anything segregated witness was too deferential to miner votes, requiring an unusually high degree of consensus at 95% before going into effect.

Beyond kicking the can down the road

At the time of writing, blocks signaling support for segregated witness have plateaued around 30%. Meanwhile Bitcoin Unlimited (BU) has inherited the crown from XT in pushing for disruptive hard-forks, by opening the door to miners voting on block size. It has gained enough support among miners that a contentious fork is no longer out of the question. Several exchanges have signed onto a letter describing how Bitcoin Unlimited would be handled if it does fork into a parallel universe, and at least one exchange has already started trading futures on the fork.

Instead of trying to make predictions about how this stand-off will play out, it is better to focus on the long-term challenge of scaling Bitcoin. The one-time increase in capacity enabled by segregated witness (up to 2x, depending on assumptions about adoption rate and mix of transactions) is no less arbitrary than the original 1MB limit that all sides are railing against. Even BU, with the lack of limitations implied by its name, turns out to cap block size at 256MB—not to mention that in a world where miners decide block size, it is far from clear that the result will be a relentless competition to increase it over time. Replacing one magic number pulled out of thin air with an equally bogus one that does not derive from any coherent reasoning built on empirical data is not a “scaling solution.” It is just an attempt to kick the can down the road. The same circumstances precipitating the current crisis—congested blocks, high and unpredictable transaction fees, periodic confirmation delays—will crop up again once network usage starts pushing against the next arbitrary limit.

Bitcoin needs a sustainable solution for scaling on-chain without playing a dangerous game of chicken with disruptive forks. Neither segregated witness nor Bitcoin Unlimited provides a vision for solving that problem. It is one thing to risk a disruptive hard-fork once to solve the problem for good. It is irresponsible to engage in such brinkmanship as standard operating procedure.

CP

Extracting OTP seeds from Authy

An OTP app is an OTP app is an…

A recent post on the Gemini blog outlined changes to two-factor authentication (2FA) on Gemini, providing additional background on the Authy service. Past comments suggest there are common misconceptions about Authy, perhaps none more prominent than the assumption that it is based on SMS. Authy is a service which includes multiple options for 2FA: SMS, voice, a mobile app for generating codes, and OneTouch. At the same time a common question often asked is: “Can I use Google Authenticator or another favorite 2FA application instead?” Making that scenario work turns out to be a good way to gain insight into how the Authy app itself operates under the covers.

Authy has mobile applications for Android and iOS, as well as two incarnations for the desktop: a Chrome extension and a Chrome application. All of them can generate one-time passcodes (OTP) to serve as a second factor when logging into a website. A natural question is how these codes are generated and whether they are compatible with other popular OTP applications such as Google Authenticator or Duo Mobile.

This is not a foregone conclusion; not all OTP-generation algorithms are identical. For example the earliest design in the market was SecurID by RSA Security. These were small hardware tokens with a seven-segment LCD display for showing numerical codes. RSA not only sold these tokens but also operated the service for verifying codes entered by users. For all intents and purposes, the tokens were a blackbox: the algorithm used to generate the codes was undocumented and users could not reprogram the tokens on their own. (That did not work out all that well when RSA was breached by nation-state attackers in 2011, resulting in the downstream compromise of RSA customers relying on SecurID—including notably Lockheed Martin.)

Carrying around an extra gadget for a single purpose limits usability, all but guaranteeing that solutions such as SecurID are confined to “enterprise” scenarios—in other words, where employees have no say in the matter because their IT department decided this is how authentication works. The ubiquity of smart-phones made it possible to replace one-off special-purpose gadgets with so-called “soft tokens:” mobile apps running on smartphones that can implement similar logic without the baggage of additional hardware.

Google Authenticator was among the first of these intended for mass-market consumption, as opposed to the more niche enterprise scenarios. [Full disclosure: This blogger worked on two-factor authentication at Google, including as maintainer of the Android version of Google Authenticator] Early versions were open-sourced, although the version on the Play Store diverged significantly after 2010 without updates to the released source. Still, looking at the code and surrounding documentation explains how OTPs are generated. Specifically it is based on two open standards:

  • TOTP: Time-based OTP as standardized in RFC 6238. Codes are generated by applying a keyed-hash function (specifically HMAC) to the current time, suitably quantized into intervals.
  • HOTP: HMAC-based OTP standardized by RFC 4226. As the RFC number suggests, this predates TOTP. Codes are generated by applying a keyed hash function to an incrementing counter. That counter is incremented each time an OTP is generated. (Incidentally the internal authentication system for Google employees— as opposed to end-users/customers— leveraged this mode rather than TOTP.)

The mystery algorithm

So what is Authy using? TOTP is a reasonable guess because Authy is documented to be compatible with Google Authenticator, and can in fact import seeds using the same URL scheme represented in QR-codes. But strictly speaking that only proves that Authy includes a TOTP implementation as subset of its functionality. Recall that Authy also includes a cloud-service responsible for provisioning seeds to phones; these are the “native” accounts managed by Authy, as opposed to stand-alone accounts where the user must scan a QR code. It is entirely conceivable that OTP generation for native Authy accounts follows some other standard. The fact that native Authy accounts generate 7-digit codes lends some support to that theory, since GA can only generate 6-digit codes. (Interestingly the URL scheme for QR codes allows 6 or 8 digits, but as the documentation points out GA ignores that parameter.)

Answering this question requires looking under the hood of the Authy app itself. In principle we can pick any version. This blog post uses the Chrome application as a case study, because reviewing Chrome apps does not require any special software beyond the Developer Tools built into Chrome itself. No pulling APKs from the phone, no disassembling Dalvik binaries or firing up IDA Pro necessary.

There is another benefit to going after the Chrome application: since the end-goal is using an alternative OTP application to generate codes compatible with Authy, it is necessary to extract the secret key used by the application. Depending on how keys are managed, this can be more challenging on a mobile device. For example some models of Android feature hardware-backed key storage provided by TrustZone, where keys are not extractable after provisioning. An application can ask the hardware for specific operations to be performed with the key, such as signing a message as called for by TOTP. But it can not obtain the raw bits of the key to ship them off-device. (While the security level of TrustZone is weak compared to an embedded secure element or TPM—hardened chips with real physical security—it still raises the bar against trivial software attacks.) By contrast browser applications are confined to standard Javascript interfaces, with no access to dedicated cryptographic hardware even if one exists on the machine.

Understanding the Chrome application

First install the Authy application from the Chrome store:

Install_Authy_app.PNG

Next visit chrome://extensions and make sure Developer Mode is enabled:

Chrome_extensions_view.PNG

Now open chrome://apps in another tab and launch the Authy app:

Authy_app_launched.PNG

Back in the extensions tab, click on “main.html” to inspect the source code, switch to the “Sources” tab, and expand the JS folder. There is one Javascript file here called “app.js”. The code has been minified and does not look very legible:

Main_HTML_application_JS.PNG

Luckily Chrome has an option to pretty-print the minified Javascript, prominently advertised at the top. After taking Chrome up on that offer, the code becomes eminently readable, with many recognizable symbols. Searching for the string “totp” finds a function definition:

Formatted_JS_TOTP_function.PNG

This function computes TOTP in terms of HOTP, which may seem puzzling at first—time-based OTP based on counter-mode OTP? The intuition is that both schemes share the same pattern: applying a cryptographic function to some internal state represented by a positive number. In the case of HOTP that number is an incrementing counter, increasing precisely by 1 each time an OTP is generated. In the case of TOTP the number is based on the current time, effectively fast-forwarded to skip any unused values.
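
A minimal sketch of that relationship, assuming the HMAC-SHA1 construction from the RFCs along with the 7-digit, 10-second parameters observed in the app:

    # Sketch: TOTP expressed in terms of HOTP, as in RFC 6238 / RFC 4226.
    import hashlib, hmac, struct, time

    def hotp(secret: bytes, counter: int, digits: int = 7) -> str:
        mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
        offset = mac[-1] & 0x0F                                   # dynamic truncation per RFC 4226
        code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    def totp(secret: bytes, interval: int = 10, digits: int = 7) -> str:
        # the "counter" is just the current time quantized into intervals
        return hotp(secret, int(time.time()) // interval, digits)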

Looking around this function there are other hints, such as the default number of digits set to 7, as expected. More puzzling is the default time-interval set to 10 seconds. TOTP uses a time interval to “quantize” time into discrete blocks. If codes were a function of a very precise timestamp measured down to the millisecond, it would be very difficult for the server to verify them without knowing exactly when they were generated with the same accuracy. It would also require both sides to have perfectly synchronized clocks. (Recall that OTPs are being entered by users, so it is not an option to include additional metadata about time.) To get around this problem, TOTP implementations round the time down to the nearest multiple of an interval. For example if one were using 60-second intervals, the OTP code would be identical during an entire minute. Incidentally the TOTP spec defines time as the number of seconds since the Unix epoch, so these intervals need not start on an exact minute boundary.

The apparent discrepancy arises from the fact that the Authy app displays a code for 20 seconds, suggesting it is good for that period of time. But the underlying generator is using 10-second intervals, implying that codes change after that. What is going on here? The answer is based on another trick used to help OTP implementations deal with clocks getting out of sync or delays in submitting an OTP over a network. Instead of checking strictly against the current time interval, most servers will also check the submitted OTP against a few preceding and following intervals. In other words, there is usually more than one valid OTP code at any given time, including those corresponding to intervals slightly in the past and slightly in the future. For this reason even a “stale” OTP code generated 20 seconds ago can still be accepted even when the current time (measured in 10-second intervals) has advanced one or two steps.
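
A sketch of that server-side check, reusing the hotp() helper from the previous sketch; the window of two intervals on either side is an arbitrary choice for illustration:

    # Sketch: accept codes from a few intervals in the past or future to
    # tolerate clock drift and the delay between generating and typing a code.
    def verify_with_window(secret: bytes, submitted: str, window: int = 2,
                           interval: int = 10, digits: int = 7) -> bool:
        now = int(time.time()) // interval
        return any(hotp(secret, now + delta, digits) == submitted
                   for delta in range(-window, window + 1))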

But we are jumping ahead—there is one more critical step required to verify that we are looking at the right function, that this is the code-path invoked by the Chrome app when generating OTPs. The easiest way to check involves setting a breakpoint in that function, by clicking on the line number:

Breakpoint_Set.PNG

Now we wait for the app to generate a code. Sure enough it freezes with a “paused in debugger” message:

Paused_in_debugger.PNG

Back to Chrome developer tools, where the debugger has paused on a breakpoint. The debugger has helpfully annotated the function parameters and will also display local variables if you hover over them:

Breakpoint_Hit.PNG

Tracing the call a few steps further would show that “e” is the secret seed encoded in hex, “r” is the sequence number and “o” is the number of digits. (More precisely, it used to be the secret seed; this particular Authy install has since been deleted. Each install receives a different seed, even for the same account.) Comparing “r” against the current epoch time shows that it is roughly one-tenth of that value, confirming the hypothesis of 10-second intervals.

Recreating the OTP generation with another app

A TOTP generator is defined by three parameters:

  • Secret-seed
  • Time interval
  • Number of digits

Since we have extracted all three, we have everything necessary to generate compatible codes using another TOTP implementation. Google Authenticator has defined a URL scheme starting with otpauth:// for encoding these parameters, which is commonly represented as a 2-dimensional QR code scanned with a phone camera. So in principle one can create such a URL and import the generator into Google Authenticator or any other soft-token application that groks the same scheme. (Most 2FA applications including Duo Mobile have adopted the URL scheme introduced by GA for compatibility, to serve as drop-in replacements.) One catch is the key must be encoded using the esoteric base32 encoding instead of the more familiar hex or base64 options; this can be done with a couple of lines of Python, sketched below. The final URL will take the form:

otpauth://totp/Authy:alice@foobar.com?secret=<base32-encoded>&digits=8&issuer=Authy&period=10
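
The base32 conversion alluded to above really does fit in a couple of lines of Python; the hex seed here is a placeholder standing in for the value of “e” pulled out of the debugger:

    # Sketch: convert the extracted hex seed to base32 and assemble the URL.
    import base64

    hex_seed = "00112233445566778899aabbccddeeff"   # placeholder for the extracted seed
    b32_seed = base64.b32encode(bytes.fromhex(hex_seed)).decode().rstrip("=")

    url = ("otpauth://totp/Authy:alice@foobar.com"
           f"?secret={b32_seed}&digits=8&issuer=Authy&period=10")
    print(url)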

Why 8 digits? There is a practical problem with Authy using 7-digit output. The specification for the URL scheme states that valid choices are 6 or 8. (Not that it matters, since Google Authenticator does not support any option other than 6.) Luckily we do not need exactly 7 digits. Due to how HOTP/TOTP convert the HMAC output into a sequence of digits, the correct “7-digit code” can be obtained from the 8-digit one by throwing away the first digit. Armed with this trick we can create an otpauth:// URL containing the secret extracted from Authy and specify 8 digits. Neither Google Authenticator nor Duo Mobile was able to import an account using this URL, but the FreeOTP app from Red Hat succeeds. Here is the final result, showing screenshots captured from both applications around the same time.

Side-by-side OTP apps

On the left: Authy Chrome app; on the right: FreeOTP running in Android emulator

Note the extra leading digit displayed by FreeOTP. Because the generator is configured to output 8 digits, it outputs one more digit that needs to be ignored when using the code.

There is no “vulnerability” here

To revisit the original questions:

  1. What algorithm does Authy use for generating codes? We have verified that it is based on TOTP.
  2. Can a different 2FA application be used to generate codes? Sort of: it is possible to copy the secret seed from an existing Authy installation and redeploy it on a different, stand-alone OTP application such as FreeOTP.  (One could also pursue other avenues to getting the seed, such as leveraging the server API used by the provisioning process.)

One point to be very clear on: there is no new vulnerability here. When using the Authy Chrome application, it is a given that an attacker with full control of the browser can extract the secret seeds. Similar attacks apply on mobile devices: seeds can be scraped from a jailbroken iPhone or rooted Android phone; it is just a matter of reverse-engineering how the application stores those secrets. Even when hardware-backed key storage is used, the seeds are vulnerable at the point of provisioning when they are delivered from the cloud or initially generated for upload to the cloud.

CP

[Updated Apr 27th to correct a typo]

Principle of least-privilege: looking beyond the insider risk

Second-law applied to access rights

“I have excessive privileges for accessing this system. Please reduce my access rights to avoid unintended risk to our users.”

That is a statement you are unlikely to hear from any engineer or operations person tasked with running an online service. Far more likely are demands for additional access, couched in terms of an immediate crisis involving some urgent task that can not be performed without a password, root access on some server or being added to some privileged group. The principle of least privilege states that people and systems should only be granted those permissions absolutely necessary to perform their job. Holding this line in practice is an uphill battle. Similar to the increase of entropy dictated by the second-law of thermodynamics, access rights inexorably seem to expand over time, converging on a steady state where everyone has access to everything.

That isn’t surprising given the dynamics of operating services. Missing rights are quickly noticed when someone going about their work runs into an abrupt “access denied” error. Excessive, unnecessary privileges have no such overt consequences. They are only identified after a painstaking audit of access logs and security policies. A corollary: it is easy to grant more access, and much harder to take it away. The effects of removing an existing privilege are difficult to predict. Will that break an existing process? Was this person ever accessing that system? No wonder that employee departures are about the only time most companies reduce access. Doing it preemptively at any other time requires some data crunching: mine audit logs to infer who accessed some resource in the last couple of weeks and compare that against all users with access. But even that may not tell the whole story. For example, often access is granted to plan for a worst-case disaster scenario, when employees who carry out some role in the normal course of affairs are no longer able to perform that role and their backups—who may never have accessed the system until that point—must step in.

Part of the problem is scaling teams: a policy that starts out entirely consistent with least-privilege can become a free-for-all when amplified to a larger team. Small companies often have very little in the way of formal separation of duties between engineering and operations. In a startup of 5 employees, people naturally gravitate to being jack-of-all-trades. If any person may be asked to perform any task, it is not entirely unreasonable for everyone to have root/administrator rights on every server the company manages. But once that rule is applied reflexively for every new hire (“all engineers get root on every box”) as the company scales to 100 people, there is massive risk amplification. It used to be that only five people could cause significant damage, whether by virtue of inadvertent mistakes or deliberate malfeasance. Now there are a hundred people capable of inflicting that level of harm, and not all of them may even have the same level of security awareness as the founding group.

There is also a subtle, cultural problem with attempting to pare down access. It is commonly interpreted as signaling distrust. In the same way that granting access to some system implies that person can be trusted with access to resources on that system—user data, confidential HR records, company financials— retracting that privilege can be seen as a harsh pronouncement on the untrustworthiness of the same person.

This post is an attempt to challenge that perception, primarily by pointing out that internal controls are not only, or even primarily, concerned with insider risk. Following the logic of least-privilege hardens systems against the more common threats from external actors. More subtly it protects employees from becoming a high-value target for persistent, highly-skilled attackers.

Game of numbers

Consider the probability of an employee device being compromised by malware. This could happen through different avenues such as unwittingly installing an application that had been back-doored or visiting a malicious website with a vulnerable browser or plugin. Suppose that there is a one in a thousand (0.1%) chance of this happening to any given employee in the course of a month as they go about their daily routine surfing the web and installing new apps/updates/packages from different locations. Going back to our hypothetical examples above, for the garage company with five engineers there is about a 6% chance that at least one person will get their machine owned after a year. But the post-seed-round startup with 100 engineers is looking at odds of 70%—in other words, more likely than not.
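For readers who want to check the arithmetic, here is a minimal sketch; the 0.1% monthly compromise rate is the same illustrative assumption used above, not an empirical figure:

# Probability that at least one of N engineers has a machine compromised
# over a year, assuming independent 0.1% odds per engineer per month.
def p_at_least_one_compromise(engineers, monthly_rate=0.001, months=12):
    return 1 - (1 - monthly_rate) ** (engineers * months)

print(p_at_least_one_compromise(5))    # ~0.058, about 6%
print(p_at_least_one_compromise(100))  # ~0.70, more likely than not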

If every one of those hundred engineers had unfettered access to company systems, all of those resources would be at risk. As the saying goes: if an attacker is executing arbitrary code on your computer, it is no longer your computer. The adversary can use compromised hosts as stepping stones to access other resources such as internal websites, databases containing customer information or production systems. Note that neither two-factor authentication nor hardware tokens help under this threat model. A patient attacker can simply wait for the legitimate user to scan their fingerprint, connect a USB token/smart-card, push buttons, perform an interpretive dance or whatever other bizarre ritual is required for successful authentication, piggy-backing on the resulting session to access remote resources once the user initiates a connection.

That in a nutshell is the main problem with over-extended access policies: every person in the organization becomes a critical point of failure against external attacks. This is purely a matter of probabilities; it is not casting aspersions on whether the employees in question are honest, well-meaning or diligent. The never-ending supply of Flash 0-day vulnerabilities affects hard-working employees just as much as it affects the slackers. Similarly malicious ads finding their way into popular websites pose a threat to all visitors running vulnerable software, without regard to their intentions.

Painting a target on the home team

There is a more subtle reason that over-broad access rights are dangerous not only for the organization overall, but also for the individuals carrying them. It may end up painting a target on those employees in the eyes of a persistent attacker. Recall that one of the distinguishing factors for a targeted attack is the extensive investment in mapping out the organization the adversary is attempting to infiltrate. For example, they may perform basic reconnaissance through open-source channels including LinkedIn or Twitter to identify key employees and organizational structure. Assuming that information about internal roles can not be kept secret indefinitely, one concludes that attackers will develop a good idea of which persons they need to go after in order to succeed. Exactly who ends up in the virtual cross-hairs depends on their objectives.

For run-of-the-mill financial fraud, high-level executives and accounting personnel are obvious targets. For instance the 2014 attack on BitStamp started out with carefully tailored spear-phishing messages to executives, followed by successfully impersonating those executives to request transfer of Bitcoin. The FBI issued a general warning about these scams last year, noting that “… schemers go to great lengths to spoof company e-mail or use social engineering to assume the identity of the CEO, a company attorney, or trusted vendor. They research employees who manage money and use language specific to the company they are targeting, then they request a wire fraud transfer using dollar amounts that lend legitimacy.”

While the FBI attributes a disputed figure of $2.3B in losses to such scams, perhaps the more remarkable part of this trend is that these are relatively simple and broadly accessible attacks. There are no fancy zero-days or creative new exploit techniques involved requiring nation-state levels of expertise. Yet even crooks carrying out these amateurish smash-and-grab operations were capable of mapping out the organization structure and homing in on the right individuals.

More dangerous than the get-rich-quick attackers are those pursuing long-term, stealth persistence for intelligence gathering. This class of adversary is less interested in hawking credit-card numbers than retaining long-term access to company systems for ongoing information gathering. For example they could be interested in customer data, stealing intellectual property or simply using the current target as a stepping stone for the next attack. That requires gaining deep access to company systems, not just reading the email inbox of an executive or two. Emails alone are not enough when the objective is nothing less than full control of IT infrastructure.

This is where highly-privileged employees come into the picture. It is a conservative assumption that resourceful attackers will identify the key personnel to go after. (That need not happen all at once; gaining initial access to an unprivileged account often allows looking around to identify other accounts.) Those with root access on production systems where customer data is stored are particularly high-value targets. They provide the type of low-level access to infrastructure which permits deploying additional offensive capabilities, such as planting malware, which is not possible with access to email alone. Equally high-profile as targets are internal IT personnel with the means to compromise other employee machines, for example by using a centralized-management solution to deploy arbitrary code.

It is difficult to avoid the existence of highly privileged accounts. When something goes wrong on a server in the data-center, the assumption is that someone can remotely connect and get root access in order to diagnose/fix the problem. Similarly for internal IT: when an employee forgets their password or experiences problems with their machine, there is someone they can turn to for help and that person will be able to reset their password or log into their machine to troubleshoot. Those users with the proverbial keys to the kingdom are understood to be in the cross-hairs. They will be expected to exercise a greater degree of caution and maintain higher standards of operational security than the average employee. Their activity will be monitored carefully and any signs of unusual activity investigated promptly.

The challenge for an organization is keeping the size of that group “manageable” in relation to that of the company itself. If everyone has root everywhere, or everyone can effectively compromise every other person through shared infrastructure (sometimes such dependencies can be subtle), the entire company becomes one high value target. Any individual failure has far-reaching consequences. Every successful attack against one employee becomes a major incident potentially impacting all assets.

Setting the right example

Information security professionals can lead by example in pushing back against gratuitous access policies. It is very common for security teams to be among the worst offenders in not following least privilege. When this blogger was in charge of security at Airbnb, everyone wanted to gift the security team more access to everything: administrator role at AWS, SSH to production servers running the site and many other creative ways to access customer data. The mere appearance of the word “security” in job titles seems to confer a sense of infallibility: not only are these fellows deserving of ultimate trust with the most valuable company information, but they must also have perfect operational security practices and zero chance of falling victim to attacks that may befall the standard engineer. These assumptions are dangerous. By virtue of getting access, we all introduce risk to the system regardless of how good our opsec game may be. Comparing the incremental risk to benefits generated on a case-by-case basis is a better approach to crafting access policies. If someone is constantly exercising their new privileges to help their overloaded colleagues solve problems, the case is easy to make. If on the other hand the system is used infrequently or never—suggesting access rights were given out of a reflexive process born out of hypothetical future scenarios—the benefits are unclear. Meanwhile risks will increase, because the person is less likely to be familiar with sound security practices for correctly using that system. Making these judgment calls is part of sound risk management.

CP

Still one click away? Lessons from Yahoo on lock-in & competition

[Full disclosure: This blogger worked on MSFT and Google security teams]

The tar pit of platform lock-in

“Our competitors are just one click away.”

That used to be one of the oft-repeated slogans at Google. On its face, this is the type of cliched motivational line senior leadership likes to throw around for rallying the troops: warning against complacency, with visions of users walking away, lured by the siren song of a more nimble competitor. But read at another level, it was a subtle dig at MSFT and their business strategy. MSFT had become an industry juggernaut by relying on the lock-in effects created by the Windows platform. Once consumers bought a copy of Windows, they were caught hook, line and sinker in the entire ecosystem, buying more applications written for Windows. (Microsoft Office suite ranking near the top of that list did not hurt either.) Many of those applications were either not available for other platforms or the porting job was at best an afterthought, as is still the case for Office on OSX today. Trying to switch from this ecosystem to an alternative platform such as Macintosh OSX or Linux became the IT equivalent of getting out of the tar pit. In fact the challenges MSFT faced deprecating Windows XP suggest that even movement within the ecosystem can be a daunting challenge for participants.

Such lock-in effects are even more pronounced in enterprise software. A company with thousands of employees running Windows inevitably finds itself setting up Active Directory to manage that fleet. AD in turn comes with an array of auxiliary features, and before long the IT department is shelling out $$ for more Windows Server licenses to operate VPN services for remote-access, Sharepoint for internal collaboration, Exchange for hosting email and that is only the beginning. Coupled with a slew of proprietary undocumented protocols which discouraged the emergence of competing implementations (at least until the EU settlement forced MSFT’s hand on open documentation), these dependencies all but assured that any attempt to migrate out of the ecosystem would be an expensive and painful project for any large enterprise.

Lower switching costs online?

Online services were supposed to be different— in theory. If Google search quality goes downhill, it is not that difficult for users to surf over to a competing search engine to run the same query. Everything goes through a standard web browser: no new software to install, no dependencies to untangle, no compatibility nightmare involving other applications breaking because the user decided to search on Bing.

This is not to say that the market for search engines is immune to inertia in consumer preferences or brand-name effects. For years MSFT ran a campaign involving “blind comparisons” designed to prove that Bing search results were at least as good as, if not better than, Google. (Leading to the insider joke at MSFT that Bing stands for “But It’s Not Google” to explain why users continued to favor Google.) The campaign did not seem to have convinced many people outside Redmond, but at least it was predicated on a reasonable assumption: search-engine choices can be swayed. If consumers were convinced they had a better option, nothing prevented them from switching.

Except that assumption had long been under assault. Search-engine preferences were increasingly becoming part of software configuration with varying degrees of control. Once it became clear that search was strategic, software designers decided it was too important to leave it up to users to navigate to the right website and type a query. Instead search functionality was integrated into toolbars, web-browsers and in the case of Windows, the operating system itself. Consumers only had to type their query into a magical search field and results would come back. Part of that convenience involved a decision by software authors (as opposed to users) on which search engine gets to provide those answers. Not surprisingly queries from the Google toolbar were routed to Google, those from Internet Explorer went to Bing and the Yahoo toolbar seemingly routed queries to whoever paid Yahoo more that year. Naturally that led to plenty of disgruntlement and accusations of anti-competitive behavior from the provider who did not come out on top. Regulators got involved. In the case of the European Union, an investigation prodded by Google resulted in MSFT agreeing to change Internet Explorer to force users to choose a search engine on first run. The micro-management did not stop there: to prevent any bias the list of search options had to be randomly ordered for each user. (Some behavioral economics experiments suggest consumers have a preference for picking the first or last object out of a line-up.)

Random ordering of search engines in IE

Personalization: lock-in through data

A particularly puzzling feature of this episode is that online search is not even a personalized service at its core. Google has aggressively promoted its ability to return search results tailored to each user, and not coincidentally encouraged/nagged/bribed users into staying logged-in online while performing those searches to build a more comprehensive history. But online search can be carried out fully anonymously, with the service provider having no idea about the person behind the query. Reasonable people may disagree about the extent to which this degrades the quality of results. One data-point: there are search engines such as DuckDuckGo which promise to not save search history or engage in other privacy-infringing user tracking. The corollary is that switching search providers does not “leave behind” much with the previous provider. There is a surprising asymmetry in the value attached to search history. It is a priceless asset for search engines, which can mine it for patterns and improve their own accuracy. On the other hand it is not something that users get attached to or wax nostalgic over, wondering “what was I searching for last Thanksgiving?” There is no concept of downloading your search history from one provider and uploading it to another to maintain continuity.

Holding users hostage

That brings up the subject of Yahoo Mail and the (possibly inadvertent) “hostage taking” the company engaged in by disabling email forwarding at a particularly inopportune moment in the middle of a PR crisis with users heading for the exits en masse. Unlike search, email is intrinsically personalized. Switching email providers comes with the prospect of leaving some resources behind. There is all the past archive of messages to begin with, which may run into the gigabytes, not a trivial amount to download. Some protocols such as IMAP can make it easier to retrieve all messages in bulk, although the consumer is still stuck with locally managing this stash—making sure it is properly backed-up, confidential messages encrypted etc. More subtly there is the email address itself. Registering for a new email address is easy; updating every place where the previous email address was used is hard. Luckily there is a standard solution for this: email forwarding. If Alice can forward all incoming messages from alice@yahoo.com to her new account at alice@gmail.com, she can bid farewell to Yahoo and only use Google going forward. Meanwhile her friends and associates can continue writing to her former Yahoo address until they gradually update their address books. Message delivery will not be interrupted; Alice will receive the forwarded messages and reply from Gmail.

This is good news for consumers who want to switch providers. It is also good news for fostering competition among email providers to lure away users from their rivals. On the other hand, it is bad news for email providers who are on the losing side of that competition. Case in point, Yahoo. Plagued by a scandal over having sanctioned NSA mass-surveillance of all email, the company found itself facing a mass exodus of users, complete with step-by-step guides published in mainstream media explaining how to close a Yahoo account. (Not surprisingly, there is no link to closing an account from the Account page where one would actually expect to find it.)

Coincidentally around this time, the company decided to disable email-forwarding citing improvements in progress:

“While we work to improve it, we’ve temporarily disabled the ability to turn on Mail Forwarding for new forwarding addresses,”

In keeping with Hanlon’s razor, let us give Yahoo the benefit of the doubt and assume that the decision was indeed motivated by engineering concerns, as opposed to strategic maneuvering to hold users hostage until the PR crisis blows over. (Indeed the functionality was restored a few days later, prompting an Engadget headline to declare “you can finally leave.”) It is nonetheless a staggering demonstration of how deceptive that “one-click away” premise for competition can be. Many Yahoo users may have been outraged enough to register a new email account with Google or MSFT after reading the news but that is not the same as abandoning ship altogether. As long as they are still receiving email at their Yahoo address and they can not forward those messages automatically, they are still chained to Yahoo. These users can be counted on to visit the Yahoo web-properties, see banner ads chosen by Yahoo (in other words, generate advertising revenue for the company) or run the mobile app on their phones. It is not until email forwarding is operational or customers decide they can afford to abandon any messages sent to their old email address that they can fully sever their ties with Yahoo.

That raises the question: does Yahoo have any obligation to provide email forwarding for the lifetime of the account? After all there is some cost to operating an email service. The unstated, if not deliberately obscured, assumption is that users indirectly pay for “free email” with their attention and privacy, being subjected to advertising while providing the raw-material of clicks for the data mining required to fine-tune the delivery of those ads. Arguably users who are no longer visiting the website or seeing ads are not holding up their end of this implicit bargain. They represent a net negative to the cloud provider. However the economics of carrying such inactive users may shake out, major providers do not appear to have embraced the Yahoo logic. Both Gmail and Outlook.com allow forwarding messages to another address chosen by the user. That seeming generosity may have something to do with the relatively small number of users taking advantage of the feature, representing negligible cost to either service. While both MSFT and Google were implicated in the NSA PRISM program revealed by Edward Snowden, neither company has quite faced the type of persistent backlash Yahoo experienced over its own surveillance debacle, or for that matter the gradual but steady decline in the company’s fortunes over Mayer’s tenure.

In fact Google goes above and beyond mail forwarding. In 2011 the company introduced a feature called Takeout, as part of the aptly named “Data Liberation Front” project. It allows users to download data associated with their Google services. The list, which has been expanding since its introduction, now includes not only the usual suspects of email and Google Drive files, but also search history, location, images, notes, calendar and YouTube videos. This is an upfront commitment to customers that their data will not be held hostage at Google. Indeed Takeout seems to go out of its way to play well with rivals: it has an option to upload the resulting massive archive to Dropbox or MSFT OneDrive.

Google Takeout results can be saved to competing services

The bad news is that the dynamics of competition for cloud services have shifted: the dreaded switching costs and lock-in effects associated with old-school enterprise software arise in this space too. The good news is that cloud services can still voluntarily go out of their way to offer functionality that restores some semblance of the conventional wisdom: “Our competitors are just one click away.”

CP

“Code is law” is not a law

(Reflections on Ethereum Classic and modes of regulation)

East Coast code vs West Coast code

When Lawrence Lessig published Code and Other Laws of Cyberspace in 1999, it became an instant classic on technology policy. In a handful of chapters it chronicled how technology, or West Coast code, functions as a force for regulating behavior alongside its more visible old-school counterparts: regulation by law, or East Coast code; economics (tax and raise the costs → discourage consumption); and historic social-norms enforced by peer pressure. “Code is law” became the cliff-notes version of that pattern. Today that thesis is increasingly interpreted to mean that code ought to be law. Most recently Ethereum Classic has grabbed that banner in arguing against intervention in the DAO theft. This is a good time to revisit the original source. The subtlety of Code’s argument has been lost in the new incarnation: Lessig was no cheerleader for this transformation where code increasingly takes the place of laws in regulating behavior.

Renegade chains: Ethereum and Ethereum Classic

Ethereum Classic grew out of a major schism in the crypto-currency community around what “contracts” stand for. To recap: a German start-up attempted to put together a crowd-sourced venture capital firm as a smart-contract, called The DAO for “decentralized autonomous organization.” This was no ordinary contract spelled out in English, not even the obscure dialect legal professionals speak. Instead it was expressed in the precise, formal language of computer code. The contract would live on the Ethereum blockchain, its structure in public-view, every step of its execution transparent for all to observe. This was a bold vision, raising over $130 million based on the Ether/USD exchange rate of the time. Perhaps too bold, it turns out: there was a critical bug in the contract logic, which allowed an attacker to waltz away with $60 million of those funds.

The crypto-currency space has a decidedly libertarian ideology. The more extreme positions veer towards a conspiratorial view of the Federal Reserve and the debasement of currency through centralized intervention. So it was understandable that the decision by the Ethereum Foundation— the closest analog to a governing body/standards forum/Politburo in this purportedly decentralized system— to bail out the DAO proved controversial. After all, there was no defect in the Ethereum protocol itself. As a decentralized platform for executing smart-contracts expressed in computer code, the system performed exactly as advertised. Instead there was a bug in one specific contract out of thousands written to execute on that platform. Rather inconveniently that one contract happened to carry close to 10% of all Ether, the currency of the realm, in existence at the time. It might as well have been a textbook behavioral economics experiment to demonstrate how bailouts, crony capitalism and “too-big-to-fail” can emerge naturally even in decentralized systems. The solution was a hard-fork to rewrite history on the blockchain, undoing the theft by reversing those transactions exploiting the vulnerability.

Between a fork and a hard-place

“When you come to a fork in the road, take it.” – Yogi Berra

A blockchain is the emergent consensus out of a distributed system containing thousands of individual nodes. If consensus breaks down and nodes disagree about the state of the world— which transactions are valid, the balance of funds in each account, who owns some debt etc.— there is no longer a single chain. Instead there are two or more chains, a “fork” that splits the system into incompatible fragments or parallel universes with different states: a payment has been accepted in one but never received in the other, or a debt has been paid in one chain only. Before Ethereum made forks into a regular pastime, they were dreaded and avoided at all costs. Blockchains are designed to quickly put them out of existence and “converge” back on a happy consensus. Very short-lived forks happen all the time: in a large distributed system it is expected that not every far-flung node will be in-sync with everyone else. It is not uncommon in Bitcoin for competing miners to discover new blocks almost simultaneously, with each group proceeding to build on their own block, resulting in diverging chains. But the protocol corrects such disagreements with rules designed to anoint a single chain as the “winner” to survive and all others to quickly vanish, with no new blocks mined to extend them. This works well in practice because most forks are not deliberate. They are an accidental side-effect of decentralization and limits on the propagation of information in a distributed system. Occasionally forks may even be introduced by bugs: Bitcoin experienced one in 2013 when nodes running an older version of the software started rejecting blocks deemed valid by the newer version.
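As a toy illustration of that convergence rule, where nodes adopt whichever competing chain represents the most accumulated proof-of-work (often loosely described as the “longest” chain), consider the sketch below. The block structure is invented purely for illustration and glosses over the details of real implementations:

# Toy fork-choice rule: among competing chains, keep the one with the most
# cumulative work; the others are abandoned and receive no further blocks.
def best_chain(chains):
    # each chain is a list of blocks; each block is a dict with a "work" field
    return max(chains, key=lambda chain: sum(block["work"] for block in chain))

chain_a = [{"work": 10}, {"work": 12}]
chain_b = [{"work": 10}, {"work": 11}, {"work": 13}]
print(best_chain([chain_a, chain_b]) is chain_b)  # True: the heavier chain wins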

Until recently it was unusual for a fork to be introduced deliberately. The Bitcoin Core team in particular adopted a fork-averse philosophy, even if that means forgoing the opportunity to quickly evolve the protocol by forcing upgrades across the board with a threatened deadline. Such a game-of-chicken is exactly what the Ethereum Foundation proposed to undo the theft of funds from the DAO. Updated versions of the Ethereum software would disregard specific transactions implicated in the heist, in effect rewriting “history” on the blockchain to revert those funds back to their rightful owner. It’s as if Intel, the manufacturer that makes x86 processors that power most consumer PCs, decided to redesign their perfectly good hardware in order to work around a Windows bug because Windows was the most popular operating system running on x86. (Alternatively: some critics pointed to a conflict of interest in Ethereum Foundation members having personal stakes in the DAO. The analogy becomes Intel redesigning its chips in order to compensate for Windows bugs if Intel were an investor in Microsoft.)

Not everyone agreed this was a good idea. Ethereum Classic was the name adopted by the splinter faction refusing to go along with the fait accompli. Instead this group opted to run the previous version of the Ethereum software which continued to build a blockchain on existing, unaltered history. The Ethereum Foundation initially did not pay much attention to the opposition. It was assumed that the situation would resolve itself just like naturally occurring forks: one chain emerging 100% victorious and the other one dying out completely, with all miners working on the winning chain. That’s not quite how reality played out, and in hindsight this should have been expected, given the material difference in intent between accidental forks arising intrinsically from decentralization vs deliberate forks introduced by fiat. Ethereum Classic (“ETC”) retained close to 10% of the hash-rate of mainline Ethereum. It also achieved a valuation around 10% of Ethereum’s and became a liquid currency in its own right once the exchange Poloniex listed ETC for trading.

The dust may have settled after the hard-fork but the wisdom of bailing-out the DAO remains a highly divisive topic in the cryptocurrency space. Recently ETC proponents have rallied around an old idea: Code Is Law. According to this line of argument, the DAO contract was faithfully executed on the blockchain exactly as written. Everything that transpired, from the initial fund-raising to the eventual theft and desperate “Robin Hood” recovery attempts, proceeded according to terms spelled out in the original contract. If the Ethereum system performed as advertised in enforcing the terms of the contract, what justification can there be for resorting to this deus ex machina to override those terms? If Code is Law as Lessig decreed, the DAO hard-fork constitutes an “unlawful” intervention in a financial system built around contracts by virtue of violating contractual terms expressed in code:

Code is law on the blockchain. In the sense, all executions and transactions are final and immutable. So, from our (Ethereum Classic supporters) standpoint by pushing the DAO hard fork EF broke the “law” in the sense that they imposed an invalid transaction state on the blockchain.

Code: benevolent and malicious

This is where revisiting “Code” is helpful. Lessig was by no means indifferent to the ways code, or the architecture of physical space before there were computers, had been leveraged in the past to achieve political ends. One example cited in the book is the bridges leading to Long Island: they were built too low for buses to pass, deterring minorities dependent on public transportation. Even in that unenlightened time, there were no overtly discriminatory laws on the books saying outright that African-Americans could not visit Long Island. Instead it was left up to the “code” of road infrastructure to implement that disgraceful policy. Code may have supplanted law in this example but it was clearly not the right outcome.

In fact much of “Code” is a check on the unbridled optimism of the late 1990s when it was fashionable to portray the Internet as an unambiguous force for good: more avenues for self-expression, greater freedom of speech, improved privacy for communication through strong cryptography, an environment inhospitable to surveillance. In short, more of the good stuff everyone wants. More importantly the prevailing opinion held that this was the “manifest destiny” of the Internet. The Internet could not help but propel us closer towards this happy outcome because it was somehow “designed” to increase personal freedom, defend privacy and combat censorship.

That view sounds downright naive in this day and age of the Great Firewall of China, locked-down appliances chronicled in The Future of the Internet, state-sponsored disinformation campaigns and NSA mass-surveillance. But Lessig was prescient in sounding the alarm at the height of dot-com euphoria: “Code” spoke of architectures of control as well as architectures of freedom as being equal possibilities for the future. When the process of public legislation, however dysfunctional and messy it may be, is supplanted by private agendas baked into software, there is no guarantee that the outcome will align with the values associated with the Internet in its early years. There is no assurance that a future update to code running the infrastructure will not nudge the Internet towards becoming a platform for mass consumer surveillance, walled-gardens, echo-chambers, invisible censorship and subtle manipulation.

Hard-forks as deus ex machina

There is much to be said about not making random edits to blockchains when the intervention can not be justified on technical merits. It’s one thing to change the rules of the game to implement some critical security improvement, as Ethereum recently did to improve resilience against DDoS attacks. This time there were no splinter-cells taking up the banner of the abandoned chain. By contrast, Ethereum Foundation actively cheerleading the controversial DAO hard-fork opens Pandora’s box: here is proof that blockchain interventions can be orchestrated on demand by a centralized group, even in a purportedly decentralized system that was supposed to be at the whim of its own users. What prevents repressive regimes from asking the Foundation to block funds sent to political dissidents in a future update? Could a litigious enterprise with creative lawyers take the Foundation to court over some transaction they would like to see reversed?

These questions are only scratching the surface. Many valid arguments can be advanced in favor of or in opposition to the DAO hard-fork. It is not the intent of this blog post to spill more electrons on that debate. The meta-point is that such complexity can not be dismissed with a simplistic appeal to “code-is-law,” however appealing such slogans may be. Lessig’s original observation was descriptive— an observation about how the architecture of the Internet is being used to supplant or subvert existing regulation. Ethereum Classic misappropriates that into a normative statement: code should be law and this is an unadulterated good.

Comparing this notion of “law” to physical laws such as gravity is misleading. One does not have any choice in following the laws of nature; Mother Nature requires no policing to enforce her rules and has no need to mete out punishment for transgressions. By contrast, laws in a free society represent a voluntary bargain members of that society have collectively agreed to. They are effective only to the extent that such agreements are honored nearly universally and vigorously enforced against the few who run afoul of them. The consensus in that agreement can change over time. Unjust laws can be challenged through democratic channels. With sufficient support they can be changed. At one point several states in the US had anti-miscegenation laws on the books. Today such discrimination would be considered unthinkable. “Code is law” in the Ethereum Classic sense represents not an inescapable fact of life but a deliberate choice to cede control over enforcement of contracts to pieces of code executed on a blockchain. That choice is not a death pact. Code itself may have no room for ambiguity in its logic, but what lends power to that code is the voluntary decision by blockchain participants to prop up the platform it executes on. The validity of that platform can be challenged and its rules modified by consensus. In fact every hard-fork is in effect changing the rules of the game on a blockchain: some contract that used to be valid under the previous design is invalidated.

Not all instances of West Coast code supplanting East Coast code are beneficial or desirable from a social standpoint. Blind adherence to the primacy of West Coast code is unlikely to yield an acceptable alternative to contract law.

CP

(Edited 11/06: fixed incomplete sentence in the first paragraph, added clarification about ideological positions)

Use and misuse of code-signing (part II)

[continued from part I]

There is no “evil-bit”

X509 certificates represent assertions about identity. They are not assertions about competence, good intentions, code-quality or sound software engineering practices. Code-signing solutions including Authenticode can only communicate information about the identity of the software publisher— the answer to the question: “who authored this piece of software?” That is the raison d’etre for the existence of certificate authorities and why they ostensibly charge hefty sums for their services. When developer Alice wants to obtain a code-signing certificate with her name on it, the CA must perform due diligence that it is really Alice requesting the certificate. Because if an Alice certificate is mistakenly issued to Bob, suddenly applications written by Bob will be incorrectly attributed to Alice, unfairly using her reputation and in the process quite possibly tarnishing that reputation. In the real world, code-signing certificates are typically issued not to individuals toiling alone— although many independent developers have obtained one for personal use— but to large companies with hundreds or thousands of engineers. But the principle is the same: a code-signing certificate for MSFT must not be given willy-nilly to random strangers who are not affiliated with MSFT. (Incidentally that exact scenario was one of the early debacles witnessed in the checkered history of public CAs.)

Nothing in this scheme vouches for the integrity of the software publisher or the fairness of their business model. CAs are only asserting that they have carefully verified the identity of the developer prior to issuing the certificate. Whether or not the software signed by that developer is “good” or suitable for any particular purpose is outside the scope of that statement. In that sense, there is nothing wrong— as far as X509 is concerned— with a perfectly valid digital certificate signing malicious code. There is no evil bit required in a digital certificate for publishers planning to ship malware. For that matter there is no “competent bit” to indicate that software published by otherwise well-meaning developers will not cause harm nevertheless due to inadvertent bugs or dangerous vulnerabilities. (Otherwise no one could issue certificates to Adobe.)

1990s called, they want their trust-model back

This observation is by no means new. Very early on in the development of Authenticode in 1997, a developer made this point loud and clear. He obtained a valid digital certificate from Verisign and used it to sign an ActiveX control dubbed “Internet Exploder” [sic] designed to shut down a machine when it was embedded on a web page. That particular payload was innocuous and at best a minor nuisance, but the message was unambiguous: the same signed ActiveX control could have reformatted the drive or stolen information. “Signed” does not equal “trustworthy.”

Chalk it up to the naivete of the 1990s. One imagines a program manager at MSFT arguing this is good enough: “Surely no criminal will be foolish enough to self-incriminate by signing malware with their own company identity?” Yet a decade later that exact scenario is observed in the wild. What went wrong? The missing ingredient is deterrence. There is no global malware-police to chase after every malware outfit even when they are operating brazenly in the open, leaving a digitally authenticated trail of evidence in their wake. Requiring everyone to wear identity badges only creates meaningful deterrence when there are consequences to being caught engaging in criminal activity while flashing those badges.

Confusing authentication and trust

Confusing authentication with authorization is a common mistake in information security. It is particularly tempting to blur the line when authorization can be revoked by deliberately failing authentication. A signed ActiveX control is causing potential harm to users? Let’s revoke the certificate and that signature will no longer verify. This conceptual shortcut is often a sign that a system lacks proper authorization design: when the only choices are binary yes/no, one resorts to denying authorization by blocking authentication.

Developer identity is neither necessary nor sufficient for establishing trust. It is not necessary because there is plenty of perfectly useful open-source software maintained by talented developers known only by their Github handle, without direct attribution of each line of code to a person identified by their legal name. It is not sufficient either, because knowing that some application was authored by Bob is not useful on its own, unless one has additional information about Bob’s qualifications as a software publisher. In other words: reputation. In the absence of any information about Bob, there is no way to decide if he is a fly-by-night spyware operation or an honest developer with years of experience shipping quality code.

Certificate authorities as reluctant malware-police

Interestingly enough, that 1997 incident set another precedent: Verisign responded by revoking the certificate, alleging that signing this deliberately harmful ActiveX control was a violation of the certificate policy that this software developer agreed to as a condition for issuance. Putting aside the enforceability of TOUs and click-through agreements, it is a downright unrealistic demand that certificate authorities start policing developers on questions of policy completely unrelated to verifying their identity. It’s as if the DMV had been tasked with revoking driver’s licenses for people who are late on their credit-card payments.

That also explains why revoking certificates for misbehaving vendors is not an effective way to stop that developer from churning out malware. As the paper points out, there are many ways to game the system, all of which are being used in the wild by companies with a track record of publishing harmful applications:

  • CA shopping: after being booted from one CA, simply walk over to their competitor to get another certificate for the exact same corporate entity
  • Cosmetic changes: get certificates for the same company with slightly modified information (eg variant of address or company name) from the same CA
  • Starting over: create a different shell-company doing exactly same line of business to start with a clean-slate

In effect CAs are playing whack-a-mole with malware authors, something they are neither qualified nor motivated to do. In the absence of a reputation system, the ecosystem is stuck with a model where revoking trust in malicious code requires revoking the identity of the author. This is a very different use of revocation than what the X509 standard envisioned. Here are the possible reasons defined in the specification; incidentally, these appear in the published revocation status:

CRLReason ::= ENUMERATED {
 unspecified             (0),
 keyCompromise           (1),
 cACompromise            (2),
 affiliationChanged      (3),
 superseded              (4),
 cessationOfOperation    (5),
 certificateHold         (6),
 -- value 7 is not used
 removeFromCRL           (8),
 privilegeWithdrawn      (9),
 aACompromise           (10) }

Note there is no option called “published malicious application.” That’s because none of the assertions made by the CA are invalidated upon discovering that a software publisher is churning out malware. Compare that to key-compromise (reason #1 above) where the private-key of the publisher has been obtained by an attacker. In that case a critical assertion has been voided: the public-key appearing in the certificate no longer speaks exclusively for the certificate holder. Similarly a change of affiliation could arise when an employee leaves a company: a certificate issued in the past now contains inaccurate information for the “organization” and “organizational unit” fields. There is no analog for the discovery of signed malware, other than a vague reference to compliance with the certificate policy. (In fairness, the policy itself can appear as a URL in the certificate but it requires careful legal analysis to answer the question of how exactly the certificate subject has diverged from that policy.)

Code-signing is not the only area where this mission creep has occurred but it is arguably the one where the highest demands are put on the actors least capable of fulfilling those expectations. Compare this to the issuance of certificates for SSL: phishing websites pop up impersonating popular services, perhaps with a subtle misspelling of the name, complete with a valid SSL certificate. Here there may be valid legal grounds to ask the responsible CA to revoke a certificate because there may be a trademark claim. (Not that it does any good, since the implementation of revocation in popular browsers ranges from half-hearted to comically flawed.) Luckily web browsers have other ways to stop users from visiting harmful websites: for example, Safe Browsing and SmartScreen maintain blacklists of malicious pages. There is no reason to wait for the CA to take any action, and for malicious sites that are not using SSL it would not be possible anyway.

Code-signing presents a different problem. In open software ecosystems, reputation systems are rudimentary. Antivirus applications can recognize specific instances of malware but most applications start from a presumption of innocence. In the absence of other contextual clues, the mere existence of verifiable developer identity becomes a proxy for the trust decision: unsigned applications are suspect, signed ones get a free pass. At least, until it becomes evident that the signed application was harmful. At that point, the most reliable way of withdrawing trust is to invalidate signatures by revoking the certificate. This uphill battle requires enlisting CAs in a game of whack-a-mole, even when they performed their job correctly in the first place.

This problem is unique to open models for software distribution, where applications can be sourced from anywhere on the web. By contrast, the type of tightly controlled “walled-garden” ecosystem Apple favors with its own App Store rarely has to worry about revoking anything, even though it may use code signing. If Apple deems an application harmful, it can be simply yanked from the store. (For that matter, since Apple has remote control over devices in the field, they can also uninstall existing copies from users’ devices.)

Reputation systems can solve this problem without resorting to restrictive walled-gardens or locking down application distribution to a single centralized service responsible for quality. They would also take CAs out of policing miscreants, a job they are uniquely ill-suited for. In order to block software published by Bob, it is not necessary to revoke Bob’s certificate. It is sufficient instead to signal a very low reputation for Bob. This also moves the conflict one level higher, because reputations are attached to persons or companies, not to specific certificates. Getting more certificates from another CA after one has been revoked does not help Bob. As long as the reputation system can correlate the identities involved, the dismal reputation will follow Bob. Instead of asking CAs to reject customers who had certificates revoked from a different CA, the reputation system allows CAs to do their job and focus on their core business: vetting the identity of certificate subjects. It is up to the reputation system to link different certificates based on a common identity, or even related families of malware published by seemingly distinct entities acting on behalf of the same malware shop.

CP

Use and misuse of code-signing (part I)

Or, there is no “evil bit” in X509

A recent paper from CCS2015 highlights the incidence of digitally signed PUP— potentially unwanted programs: malicious applications that harm users by spying on them, stealing private information or otherwise acting against the interests of the user. While malware is a dime-a-dozen and the occurrence of malware digitally-signed with valid certificates is not new either, this is one of the first systematic studies of how malware authors operate when it comes to code signing. But before evaluating the premise of the paper, let’s step back and revisit the background on code-signing in general and MSFT Authenticode in particular.

ActiveX: actively courting trouble

Rewinding the calendar back to the mid-90s: the web is still in its infancy and browsers highly primitive in their capabilities compared to native applications. These are the “Dark Ages” before AJAX, HTML5 and similar modern standards which make web applications competitive with their native counterparts. Meanwhile JavaScript itself is still new and awfully slow. Sun Microsystems introduced Java applets as an alternative client-side programming model to augment web pages. Ever paranoid, MSFT responds in the standard MSFT way: by retreating to the familiar ground of Windows and trying to bridge the gap from good-old Win32 programming to this scary, novel web platform. ActiveX controls were the solution the company seized on in hopes of continuing the hegemony of the Win32 API. Developers would not have to learn any new tricks. They would write native C/C++ applications using COM and invoking the native Windows API as before—conveniently guaranteeing that it could only run on Windows— but they could now deliver that code over the web, embedded into web pages. And if the customers visiting those web pages were running a different operating system such as Linux or running on different hardware such as DEC Alpha? Tough luck.

Code identity as proxy for trust

Putting aside the sheer Redmond-centric nature of this vision, there is one minor problem: unlike the JavaScript interpreter, these ActiveX controls execute native code with full access to operating system APIs. They are not confined by an artificial sandbox. That creates plenty of room to wreak havoc with the machine: read files, delete data, interfere with the functioning of other applications. Even for pre-Trustworthy-Computing MSFT with nary a care in the world for security, that was an untenable situation: if any webpage you visit could take over your PC, surfing the web becomes a dangerous game.

There are different ways to solve this problem, such as constraining the power of these applications delivered over the web. (That is exactly what JavaScript and Java aim for with a sandbox.) Code-signing was the solution MSFT pushed: code retains full privileges but it must carry some proof of its origin. That would allow consumers to make an informed decision about whether to trust the application, based on the reputation of the publisher. Clearly there is nothing unique to ActiveX controls about code-signing. The same idea applies to ordinary Windows applications and sure enough it was extended to cover them. Before there were centralized “App Stores” and “App Markets” for purchasing applications, it was common for software to be downloaded straight from the web-page of the publisher or even a third-party distributor website aggregating applications. The exact same problem of trust arises here: how can consumers decide whether some application is trustworthy? The MSFT approach translates that into a different question: is the author of this application trustworthy?

Returning to the paper, the researchers make a valuable contribution in demonstrating that revocation is not quite working as expected. But the argument is undermined by a flawed model. (Let’s chalk up the minor errors to lack of fact-checking or failure to read specs: for example asserting that Authenticode was introduced in Windows 2000 when it predates that, or stating that only SHA1 is supported when MSFT signtool has supported SHA256 for some time.) There are two major conceptual flaws in this argument:

  • The first is misunderstanding the meaning of revocation, at least as defined by PKIX standards.
  • More fundamentally, there is a misunderstanding of what code-signing and the identity of the publisher represent, and the limits of what can be accomplished by revocation.

Revocation and time-stamping

The first case of confusion is misunderstanding how revocation dates are used: the authors have “discovered” that malware signed and timestamped continues to validate even after the certificate has been revoked. To which the proper response is: no kidding, that is the whole point of time-stamping; it allows signatures to survive expiration or revocation of the digital certificate associated with that signature. This behavior is 100% by design and makes sense for intended scenarios.

Consider expiration. Suppose Acme Inc obtains a digital certificate valid for exactly one year, say the calendar year 2015. Acme then uses this certificate to sign some applications published on various websites. Fast forward to 2016, and a consumer has downloaded this application and attempts to validate its pedigree. The certificate itself has expired. Without time-stamping, that would be a problem because there is no way to know whether the application was signed when the certificate was still valid. With time-stamping, there is a third-party asserting that the signature happened while the certificate was still valid. (Emphasis on third-party; it is not the publisher providing the timestamp because they have an incentive to backdate signatures.)

Likewise the semantics of revocation involve a point-in-time change in trust status. All usages of the key afterwards are considered void; usage before that time is still acceptable. That moment is intended to capture the transition point when assertions made in the certificate are no longer true. Recall that X509 digital certificates encode statements made by the CA about an entity, such as “public key 0x1234… belongs to the organization Acme Inc which is headquartered in New York, USA.” While competent CAs are responsible for verifying the validity of these facts prior to issuance, not even the most diligent CA can escape the fact that their validity can change afterwards. For example the private-key can be compromised and uploaded to Pastebin, implying that it is no longer under the sole possession of Acme. Or the company could change its name and move its business registration to Timbuktu, a location different than the state and country specified in the original certificate. Going back to the above example of the Acme certificate valid in 2015: suppose that half-way through the calendar year the Acme private-key is compromised. Clearly signatures produced after that date can not be reliably attributed to Acme: it could be Acme or it could be the miscreants that stole the private-key. On the other hand signatures made before, as determined by a trusted third-party timestamp, should not be affected by events that occurred later.**

In some scenarios this distinction between before/after is moot. If an email message was encrypted using the public-key found in an S/MIME certificate months ago, it is not possible for the sender to go back in time and recall the message now that the certificate is revoked. Likewise authentication happens in real-time and it is not possible to “undo” previous instances when a revoked certificate was accepted. Digital signatures on the other hand are unique: the trust status of a certificate is repeatedly evaluated at future dates when verifying a signature created in the past. Intuitively signatures created before revocation time should still be afforded full trust, while those created afterwards are considered bogus. Authenticode follows this intuition. Signatures time-stamped prior to the revocation instant continue to validate, while those produced afterwards (or lacking a time-stamp altogether) are considered invalid. The alternative does not scale: if all trust magically evaporates due to revocation, one would have to go back and re-create all signatures.
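To make those semantics concrete, here is a rough sketch of the trust decision described above, using the Acme example; this is a simplification for illustration, not the actual Authenticode verification logic:

from datetime import datetime

def signature_trusted(signing_time, not_before, not_after, revocation_time=None):
    # The trusted third-party timestamp must fall within the certificate's
    # validity window...
    if not (not_before <= signing_time <= not_after):
        return False
    # ...and, if the certificate was later revoked, precede the revocation
    # instant. Signatures meeting both conditions keep validating even after
    # expiration or revocation.
    if revocation_time is not None and signing_time >= revocation_time:
        return False
    return True

# Acme certificate valid for calendar year 2015, private-key compromised mid-year:
validity = (datetime(2015, 1, 1), datetime(2015, 12, 31))
revoked = datetime(2015, 7, 1)
print(signature_trusted(datetime(2015, 3, 1), *validity, revocation_time=revoked))  # True
print(signature_trusted(datetime(2015, 9, 1), *validity, revocation_time=revoked))  # False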

To the extent that there is a problem here, it is an operational error on the part of CAs in choosing the revocation time. When software publishers are caught red-handed signing malware and this behavior is reported to certificate authorities, it appears that CAs are setting the revocation date to the time of the report, as opposed to all the way back to the original issuance time of the certificate. That means signed malware still continues to validate successfully according to Authenticode policy, as long as the crooks remembered to timestamp their signatures. (Not exactly a high bar for crooks, considering that Verisign and others also operate free, publicly accessible time-stamping services.) The paper recommends “hard-revocation,” which is made-up terminology for setting the revocation time all the way back to the issuance time of the certificate, or more precisely the notBefore date. This is effectively saying some assertion made in the certificate was wrong to begin with and the CA should never have issued it in the first place. From a pragmatic stance, that will certainly have the intended effect of invalidating all signatures. No unsuspecting user will accidentally trust the application because of a valid signature. (Assuming of course that users are not overriding Authenticode warnings. Unlike the case of web-browser SSL indicators which have been studied extensively, there is comparatively little research on whether users pay attention to code-signing UI.) While that is an admirable goal, this ambitious project to combat malware by changing CA behavior is predicated on a misunderstanding of what code-signing and digital certificates stand for.

[continued]

CP

** In practice this is complicated by the difficulty of determining the precise time of key-compromise; the usual approach is to estimate conservatively, erring on the early side.

The problem with devops secret-management patterns

Popular configuration management systems such as Chef, Puppet and Ansible all include some variant of a secret-management solution. These secrets are the passwords, cryptographic keys and similar pieces of sensitive information that must be deployed to specific servers in a data-center environment, while limiting access by other machines or even by the people operating the infrastructure.

Chef encrypted data-bags and Ansible vault share a fundamental design flaw: they store long-term secrets as files encrypted under a symmetric key. When it is time to add a new secret or modify an existing one, the engineer responsible for the change decrypts the file using that key, makes the edits and re-encrypts with the same key. Depending on the design, decryption during an actual deployment can happen locally on the engineer's machine (as in the case of Ansible) or remotely on the server where those secrets are intended to be used.
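To make that workflow concrete, here is a minimal sketch in Python, with Fernet standing in for whatever symmetric construction a given tool actually uses; the file format and function names are hypothetical.

    import json
    from cryptography.fernet import Fernet

    # A single shared symmetric key protects the whole secrets file.
    # Every engineer who may ever edit the file must hold a copy of it.
    shared_key = Fernet.generate_key()

    def add_or_update_secret(encrypted_blob: bytes, name: str, value: str) -> bytes:
        f = Fernet(shared_key)
        secrets = json.loads(f.decrypt(encrypted_blob))   # read access required...
        secrets[name] = value                             # ...merely to add one entry
        return f.encrypt(json.dumps(secrets).encode())    # re-encrypted with the same key

    # Example: create an empty store, then add a secret to it.
    empty = Fernet(shared_key).encrypt(b"{}")
    blob = add_or_update_secret(empty, "db_password", "hunter2")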

This model is broken for two closely related reasons.

Coupling read and write access

Changing secrets also implies being able to view existing ones. In other words, adding a secret to the encrypted store or modifying an existing one (in keeping with best practices, secrets are rotated periodically, right?) requires knowledge of the same passphrase that also allows viewing the current collection of secrets.

Under normal circumstances, “read-only” access is considered less sensitive than “write” access. But when it comes to managing secrets, this wisdom is inverted: being able to steal a secret by reading a file is usually more dangerous than being able to clobber the contents of that file without learning what existed before.**

Scaling problems

Shared passphrases do not scale well in a team context. “Three may keep a secret, if two of them are dead,” as Benjamin Franklin put it. Imagine a team of 10 site-reliability engineers in charge of managing secrets. The encrypted data-bag or vault can be checked into a version-control system such as git, but the passphrase protecting it must be managed out of band. (If the passphrase itself were available in the same location, it would be a case of locking the door and leaving the key under the doormat.) That means coordinating a secret shared among multiple people, which creates two problems:

  • The attack surface is increased. An attacker need only compromise one of those ten individuals to unlock all the data.
  • It increases the complexity of revoking access. Consider what happens when an employee with access to decrypt this file leaves the company. It is not enough to generate a new passphrase to reencrypt the file. Under worst-case assumptions, the actual secrets contained in that file were visible to that employee and could have been copied. At least some of the most sensitive ones (such as authentication keys to third-party services) may have to be assumed compromised and require rotation.

Separating read and write access

There is a simple solution to these problems. Instead of using symmetric cryptography, secret files can be encrypted using public-key cryptography. Every engineer has a copy of the public-key and can edit the file (or fragments of the file, depending on the format, as long as secret payloads are encrypted) to add or remove secrets. But the corresponding private-key required to decrypt the secrets does not have to be distributed at all. In fact, since these secrets are destined for servers in a data-center, the private-key can reside entirely in the operational environment. Anyone can add secrets by editing a file on their laptop, but the results are decrypted and made available only to machines in the data-center.
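A minimal sketch of that arrangement, assuming RSA-OAEP from the Python cryptography package as one possible choice. (Real tooling would wrap a per-value symmetric key under the RSA envelope, since a 2048-bit OAEP/SHA-256 encryption only holds about 190 bytes of plaintext; the function names here are hypothetical.)

    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # Generated once; the private half never leaves the data-center.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_pem = private_key.public_key().public_bytes(
        serialization.Encoding.PEM, serialization.PublicFormat.SubjectPublicKeyInfo)

    def add_secret(pem: bytes, value: bytes) -> bytes:
        """Runs on any engineer's laptop: only the public key is required."""
        pub = serialization.load_pem_public_key(pem)
        return pub.encrypt(value, OAEP)

    def reveal_secret(ciphertext: bytes) -> bytes:
        """Runs only in the data-center, the one place the private key exists."""
        return private_key.decrypt(ciphertext, OAEP)

The asymmetry is the whole point: add_secret needs nothing sensitive, so anyone can run it; reveal_secret depends on material that only the operational environment holds.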

If it turns out that one of the employees editing the file had a compromised laptop, the attacker can only observe the secrets specifically added by that person. Similarly, if that person leaves the company, only the secrets he/she added need to be considered for rotation. Because that person never had the ability to decrypt the entire file, the remaining secrets were never exposed.

Case-study: Puppet

An example of getting this model right is Puppet's encrypted hiera. Its original PGP-based approach to storing encrypted data is now deprecated, but there is an alternative called hiera-eyaml which works by encrypting individual fields in YAML files. By default it uses public-key encryption in PKCS7 format as implemented by OpenSSL. The downside is that it suffers from low-level problems, such as the lack of an integrity check on ciphertexts and no binding between keys and values.

Improving Chef encrypted data-bags

An equivalent approach was implemented by a former colleague at Airbnb for Chef. Chef data-bags define a series of key/value pairs, and encryption is applied at the level of individual items, covering only the value payload. The original Chef design was a case of amateur hour: version 0 used the same initialization vector for all CBC ciphertexts. Later versions fixed that problem and added an integrity check. Switching to asymmetric encryption allows for a clean slate: since the entire toolchain for editing as well as decrypting data-bags has to be replaced, there is no requirement to follow the existing choice of cryptographic primitives.

On the other hand, it is still useful to apply encryption independently to each value, as opposed to the entire file. That allows for distributed edits: secrets can be added or modified by different people without being able to see other secrets. It is worth pointing out that this can introduce additional problems, because ciphertexts are not strongly bound to the key they appear under. For example, one can move values around, pasting the ciphertext of an encryption key into a field meant to hold a password or API key, resulting in a secret being used in an unexpected context. (Puppet hiera-eyaml has the same problem.) This can be addressed by including the key-name and other meta-data in the construction of the ciphertext; a natural solution is to use those attributes as additional data in an authenticated-encryption mode such as AES-GCM.
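One possible shape for that construction, sketched in Python; this is not the format the Airbnb tooling actually uses, and the per-value data key shown here would itself be wrapped under the RSA public key along the lines of the earlier sketch. The data-bag name and key-name are authenticated as associated data, so a ciphertext pasted under a different key fails to decrypt.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def seal_value(data_key: bytes, bag: str, item_key: str, value: bytes) -> bytes:
        """Bind the ciphertext to its location: the bag and key names are
        authenticated (not encrypted), so relocating the blob breaks the tag."""
        nonce = os.urandom(12)
        aad = f"{bag}/{item_key}".encode()
        return nonce + AESGCM(data_key).encrypt(nonce, value, aad)

    def open_value(data_key: bytes, bag: str, item_key: str, blob: bytes) -> bytes:
        nonce, ciphertext = blob[:12], blob[12:]
        aad = f"{bag}/{item_key}".encode()
        # Raises InvalidTag if the value was moved under a different key.
        return AESGCM(data_key).decrypt(nonce, ciphertext, aad)

    # Example: a value sealed for "prod-secrets/db_password" cannot be opened
    # as if it were "prod-secrets/api_key".
    key = AESGCM.generate_key(bit_length=256)
    blob = seal_value(key, "prod-secrets", "db_password", b"hunter2")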

Changes to the edit process

With symmetric encryption, edits are straightforward: all values are decrypted to produce a plain data-bag, which is written to a temporary file. That file can be loaded in a favorite text editor, modified and saved. The updated contents are then encrypted from scratch with the same key.

With asymmetric keys this approach will not fly, because existing values cannot be decrypted by the person editing. Instead the file is run through an intermediate parser that extracts the structure, namely the sequence of keys defined, into a plain file containing only blank values. Edits are made on this intermediate representation: new key/value pairs can be added, or new values can be specified for an existing key. These correspond to defining a new secret or updating an existing one, respectively. (Note that “update” in this context strictly means overwriting with a new secret; there is no mechanism to incrementally edit the previous secret in place.) After edits are complete, another utility merges the output with the original, encrypted data-bag. For keys that were not modified, existing ciphertexts are carried over from the original data-bag. For new or updated keys, the plaintext values supplied during the edit session are encrypted under the appropriate RSA public-key to create the new values.
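A sketch of that merge step, with hypothetical helper names; the blank-value convention stands in for whatever sentinel the real tool uses to mark keys left untouched during the edit session.

    def merge_databag(original: dict, edited: dict, encrypt_value) -> dict:
        """original maps key -> existing ciphertext; edited maps key -> plaintext,
        with "" for keys that were not touched during the edit session."""
        merged = {}
        for key, plaintext in edited.items():
            if plaintext == "" and key in original:
                merged[key] = original[key]                 # untouched: carry over old ciphertext
            else:
                merged[key] = encrypt_value(key, plaintext)  # new or updated: encrypt afresh
        return merged

Here encrypt_value is whatever routine wraps a plaintext under the RSA public key (or under a data key bound to the key-name, as in the earlier sketch).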

The main benefit is that all of these steps can be executed by any engineer or SRE with commit access to the repository where the encrypted data-bags are maintained. Unlike the case of existing Chef encrypted data-bags, there is no secret key to be shared with every team member who may someday need to update secrets.

A secondary benefit is that when files are updated, diff comparisons accurately reflect the differences. In contrast, existing Chef symmetric encryption selects a random IV for each invocation: simply decrypting and re-encrypting a file with no changes will still result in every value appearing modified. (In principle the edit utility could compensate by detecting changes and reusing ciphertexts when possible, but Chef does not attempt that.) With the new scheme, standard utilities for merging, resolving conflicts and cherry-picking work as before.

Changes to deployment

Since Chef does not natively grok the new format, deployment also requires an intermediate step to convert the asymmetrically-encrypted data-bags into plain data-bags. This step is performed on the final hop, on the server(s) where the encrypted data-bag is deployed and its contents are required. That is the only time the RSA private key is used; it does not have to exist anywhere other than on the machines where the secrets themselves will be used.
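That final-hop conversion could look something like the sketch below, assuming a hypothetical file layout where the encrypted data-bag is a JSON object mapping key names to base64-encoded RSA-OAEP ciphertexts. It runs only on the destination server, where the private key lives.

    import base64
    import json
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    def convert_databag(encrypted_path: str, plain_path: str, key_path: str) -> None:
        """Decrypt every value with the locally held private key and emit a
        plain data-bag in a format Chef already understands."""
        with open(key_path, "rb") as f:
            private_key = serialization.load_pem_private_key(f.read(), password=None)
        with open(encrypted_path) as f:
            encrypted = json.load(f)                      # key -> base64 ciphertext
        plain = {k: private_key.decrypt(base64.b64decode(v), OAEP).decode()
                 for k, v in encrypted.items()}
        with open(plain_path, "w") as f:
            json.dump(plain, f, indent=2)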

CP

** For completeness, it is possible under some circumstances to exploit write-access to disclose secrets. For example, by surgically modifying parts of a cryptographic key and forcing the system to perform operations with the corrupted key, one can learn information about the original (unaltered) secret. This class of attacks falls under the rubric of differential fault analysis.