Designing “honest ransomware” with Ethereum smart contracts (part II)

[continued from part I]

There are special-case solutions for common public-key algorithms to prove that a given ciphertext decrypts to an alleged plaintext. For example, with RSA the key-holder can return the plaintext without removing its padding. Recall that RSA is typically used in conjunction with a padding scheme such as PKCS1.5 or OAEP. These schemes apply randomized transformation to plaintext prior to encryption and undo that transformation after decryption. Without knowing the padding used during encryption, it is not possible to check that a given unpadded plaintext corresponds to some known ciphertext. But once the fully padded version is revealed, the user can both check that it encrypts to the original ciphertext and that removing the padding results in expected plaintext, because both of those steps are fully deterministic.

This straightforward approach does not carry over to other algorithms. For example ElGamal is a randomized public-key encryption scheme—each encryption operation uses a randomly chosen nonce, so reencrypting the exact same message can yield different ciphertexts. Yet it is not possible to recover the random nonce used after the fact, even with knowledge of the private key.  Unless the private-key owner takes additional steps to stash that nonce someplace, they are stuck in a strange position: they can decrypt any ciphertext but they can not simply prove the decryption is correct by sharing the result. That is not the case for RSA: if you can decrypt, you can also recover the transformed input complete with randomized padding. ElGamal calls for something more complex, such as a zero-knowledge proof of discrete logarithm equality to show that the exponentiation of the random masking value (first part of the ciphertext) by the private-key was done correctly.

For a more generic solution, we can leverage blind decryption: ask the private-key holder to decrypt a ciphertext without that party realizing which ciphertext is being decrypted. This is typically achieved by exploiting homomorphic properties of cryptosystems. For instance raw RSA—without any padding— has a simple multiplicative relationship: encryption of a product is the product of encryptions.

RSA-ENC(a·b) = RSA-ENC(a) * RSA-ENC(b)

where all multiplication is done modulo N associated with the key. A similar property also holds true for decryption where X and Y are ciphertexts:

RSA-DEC(X·Y) = RSA-DEC(X) * RSA-DEC(Y)

Looked another way, we can decrypt a ciphertext indirectly by decrypting a different ciphertext with known relationship to the original:

RSA-DEC(X) = RSA-DEC(X·Y) * RSA-DEC⁻¹(Y)

To decrypt X, we ask the ransomware operator to instead decrypt X·Y as long we know the decryption of Y. (Typically because we obtained Y by encrypting a random plaintext m of our choice to begin with.) Note the original encryption can still be using a strict padding mode such as OAEP, as long as the decryption side is willing to handle arbitrary ciphertext that results in plaintext with no specific padding format.

This additional step prevents the ransomware author from cheating: X·Y looks like a random ciphertext, completely unrelated to the original encryption X of the symmetric key. Even if there is a cheat sheet listing every such X and its associated plaintext to answer chalenges, there is no way to correlate the particular challenge presented to one of these entries. Each challenge is equally likely to be derived from any ciphertext in the collection. The intuition is that if files were not encrypted according to a consistent scheme with the same RSA public-key, it would not be possible to produce the correct answer when presented with such a random-looking ciphertext.

This step can be repeated for a small subset of files to achieve high-confidence that most files have been encrypted according to the expected scheme. As long as the challenges are selected randomly, even a small number of successful tests provides high assurance against cheating. For example, suppose ransomware encrypted a million files but replaced 2% of them with random data instead of following the claimed hybrid-encryption process. If 100 files out of that collection are randomly selected for spot-checking, odds are 87% that this departure from protocol will be caught. The downside is diminishing returns with increased coverage. While a handful of challenges can rule out cheating on a wide scale— for example every other file being corrupted— it is not feasible to prove that 100% of data is recoverable. If ransomware deliberately mangled exactly 1 file in the collection, it would take a great deal of luck to discover that; one must pick that specific file as a challenge. That also suggests a strategy of picking the files you care about most for the challenge phase, given that for every other files there is a small but non-zero probability of failure to recover. (Incidentally the ransomware author has a diametrically opposed interest in making sure users do not get to choose the challenge subset unilaterally. Otherwise they could pick the 100 files they care about most and abandon the rest.)

There is one more subtlety here: the decryption challenge returns a symmetric key, which is then used to undo the bulk symmetric-encryption applied to the file. But there is still the problem of deciding whether the outcome of that decryption is “correct” in the sense that the plaintext was a file originally belonging to this user. Neither authenticated encryption or integrity checks help with that. After all ransomware could have replaced an MP3 music file with the “correct” AES-GCM encryption of random noise. Given the right AES key, that file will decrypt successfully, including validation of the GCM tag. It will not result in recovery of the original data, which is the only outcome the user cares about. To work around this we have to posit that users have access to an oracle that can examine a file (complete with all meta-data such as directory path and timestamps) and determine whether it is one of the documents originally belonging to their collection. In practice this could be based on a catalog of hashes or digitally signatures over documents, or in the worst-case scenario, manual inspection of files.

Once the user is convinced all files are indeed encrypted correctly, the next step is crafting a smart-contract to release payment conditional on the private-key being revealed. This contract will have a function intended to be invoked by the key-holder. After checking that the parameters supplied reveal enough information to recover the private-key, the function sends funds to an address agreed upon in advance. There is one subtlety here: specifying the destination address as part of the function call or inferring it from the message sender results in a race condition. Since Ethereum contract calls are broadcast on the network before they are mind in, anyone could copy the disclosed private-key and craft an alternative method invocation to shuttle funds some place else, hoping to get mined in first. Fixing the destination during contract creation solves this problem. Even if someone else succeeds in preempting the original method invocation with a different one sending the exact same information, the funds are still delivered to the intended address.

Just in case the private-key holder never shows up, the contract also needs an “escape hatch.” That is a second, time-locked method that can be invoked by the user after a mandatory delay to recover unclaimed funds. It is critical that these are the only ways of withdrawing any funds from the contract. If there was some other avenue for getting funds out, it would result in a race condition. When the private-key holder invokes the contract to collect payment, the contract owner can observe that transaction (including the disclosed private-key) and try to race them with a different transaction that siphons funds from the contract before it can pay out.

In terms of disclosing a private-key in a manner that can be verified by Ethereum smart-contracts, there are two natural solutions:

  1. Staying in the RSA setting, the smart-contract method that receives two large integers p and q, and verifies that their product is equal to the modulus N. While doable in principle, this requires implementing arbitrary precision integer multiplication in Solidity, which does not natively support such operations. The underlying Ethereum Virtual Machine (EVM) has 256-bit integer primitives and supports multiplication of words into 256-bit product, discarding more significant bits. One could write a big-number library to multiple large numbers by splitting them into 128-bit chunks. But this runs into another practical issue around gas costs: Ethereum smart-contract execution costs money proportional to the complexity of operations. Trying to check whether the product of two 1024-bit factors is equal to a known RSA modulus can become an “expensive” operation.
  2. Switch to an elliptic-curve setting and leverage the fragility of ECDSA for deliberately disclosing private keys. Because Ethereum natively supports verification of ECDSA signatures over the secp256k1 curve, this makes for a straightforward implementation. So instead of trying to make RSA operations work in Ethereum, we alter file-encryption model. ECIES or ElGamal can both be adapted to work over secp256k1. ElGamal in particular has a simple homomorphism that can be used to mask ciphertexts: multiplying both components of an ElGamal ciphertext by a scalar produces the encryption of the original message multiplied by that scalar. As with ECDSA, the public-key is a point on the curve and the private-key is the discrete logarithm of that with respect to the generator point. Since that key-pair is equally usable for ECDSA signatures, built-in EVM operations are sufficient to check for private key disclosure. As before, the function expects two valid ECDSA signatures over fixed messages such as “hello” and “world.” But in addition verifying these signatures— which is a primitive operation built into EVM— the contract also confirms these signatures share an identical nonce.

Summary

To recap: an honest ransomware scheme— or “third-party backup encryption service”— can be implemented using smart-contracts to make data recovery contingent on payment. We encrypt all files of interest using a hybrid encryption scheme with a single public-key. When it is time to recover the data, the user and TBES execute an interactive challenge protocol to decrypt randomly selected files, verifying that the encryption followed the expected format. (This step is redundant if encryption was done by the user, unless the integrity of encrypted backups is itself in question.) Assuming the proofs check out, the next step is for the user to create a smart-contract on the Ethereum blockchain. This contract is parametrized by an amount of Ether agreed upon, public-key used for encryption and address chosen by the TBES. Once the contract is setup and funded, TBES can invoke one of its methods to disclose the private-key associated with that public-key and collect the funds.

The logic of Ethereum smart-contract execution guarantees this exchange will be fair to both sides: payment only in exchange for valid private-key and no way to get out of payment once private-key is delivered.

[continued]

CP

Updated: 6/19, to describe alternative solution for ElGamal.

Designing “honest ransomware” with Ethereum smart contracts (part I)

“But to live outside the law, you must be honest” – Bob Dylan

By all accounts Wannacry ransomware made quite the splash, bringing thousands of systems to a standstill and forcing victims to shell out Bitcoin with little hope of recovery. In fact the ransomware aspect may well have been a diversion. Attribution for such attacks is tricky but both Kaspersky Labs and Symantec research groups have linked Wannacry to the Lazarus Group, a threat actor associated with the DPRK. (WaPo recently reported that the NSA concurs.) Their previous claim to fame: massive theft of funds from the Bangladesh central bank by exploiting the SWIFT network. That heist netted somewhere in the neighborhood of $80M, but the take would have been much higher were it not for the attackers’ rookie mistakes that resulted in even larger transfers being stopped or reversed. By comparison Wannacry earned a pittance, less than $100K at the time of writing.

Worse, this malware is not even  capable of living up to its raison d’etre: decrypting files after the attackers are paid off. For starters, only a handful of Bitcoin addresses were hard-coded in the binary, as opposed to unique deposit addresses for each victim. That makes it difficult to distinguish between different victims paying the bounty, which in turn violates the cardinal rule of ransomware: only users who pay the ransom get their data back. This is not so much a principle of fairness— there is no honor among thieves— as it is one of economic competitiveness: ransomware can only scale if victims are convinced that by paying the ransom they can recover their files. This is why successful ransomware campaigns in the past went so far as to feature helpful instructions to educate users about Bitcoin and staff customer-support operations to help “customers” recovery their data. Wannacry seems to have taken little interest in living up to such lofty standards of customer service.

But this incident does raise a question: are users at the mercy of ransomware authors when it comes to recovering their data? Is there a way to guarantee that payment will result in disclosure of the decryption key? After all, the crooks are demanding payment in cryptocurrency. Even Bitcoin with its relatively modest scripting language can express complex conditions for payment. Is it possible to design a fair-exchange protocol where decryption key is released if and only if corresponding payment is made?

This scenario is admittedly contrived and unlikely to be implemented. If there is no honor among thieves, certainly there is no desire to adapt more transparent and fair payment mechanisms to protect consumers from getting ripped-off by unscrupulous ransomware operators. But one can imagine more legitimate use-cases such as backup/escrow services that assist users encrypt their data for long-term storage. To avoid a single point of failure, the encryption key would be split using a threshold secret-sharing scheme. Specifically it is shared into N shares such that any quorum of M can reconstitute the original secret. Each share is in turn encrypted to the public-key of one trustee. When the time comes to decrypt this data, the consumer asks some subset of trustees to decrypt their share. This interaction calls for a fair-exchange protocol where the consumer receives the decryption result if and only if the trustee gets paid for its assistance.

Ethereum can solve this problem using the same idea behind fair-exchange of cryptocurrency across different blockchains, with a few caveats. The smart contract sketched out in previous blog-posts is designed to send funds when a caller discloses a specific private-key. But there are is a deeper problem around knowing which private-key to look for. In theoretical cryptography this falls under the rubric of “verifiable encryption” where it is possible to prove that some ciphertext is the encryption of an unspecified plaintext value that meets certain properties. Typically these constructs operate on abstract mathematical properties of plaintext, such as proving that it is an even number. In the more concrete setting of ransomware, plaintext under consideration are not mathematical structures but large complex data formats such as PDF documents. This model lends itself better to less-efficient statistical approach for verifying that the encryption process has been followed according to specification.

Let’s assume all files are encrypted using a hybrid-encryption scheme:

  • For each file/object to be encrypted, a random symmetric key is generated and bulk data is encrypted using a symmetric block-cipher such as AES.
  • The symmetric key is in turn encrypted using a fixed public-key cryptosystem such as RSA using the public-key of the trustee. This “wrapped” key is saved with the output of the bulk encryption.

This is similar to the format used by email encryption standards such as PGP and S/MIME. It is also how Wannacry operates, with an additional level of indirection. It generates a unique 2048-bit RSA keypair on each infected target and then encrypts that private-key using a different, fixed RSA public-key, presumably held by the ransomware crooks.  That means revealing the private-key used in step #2 is sufficient to decrypt all ciphertexts created according to this recipe.

There is one major difference between ransomware and the more legitimate, voluntary backup scenarios sketched out earlier: in the latter case the user can be certain of the public-key used for the encryption—because they performed the encryption themselves. In the former situation, they have just stumbled open a collection of ciphertext along with a ransom note asserting that all data has been encrypted using the process above with a private-key held by the author. Some proof is required that this claim is legitimate and the author is in possession of private-key required to recover their data. (Similar to fake DDoS threats, one can imagine fake ransomware authors reaching out to users to offer assistance, with no real capability to decrypt anything.)

A naive solution is to challenge the private-key holder to decrypt a handful of ciphertexts, effectively asking for free samples. But such a protocol can be cheated if it works by sending the full ciphertext or even the wrapped symmetric-key produced in step #2. For all we know, the ransomware encrypts each file using random symmetric keys and then stores those keys in a database. (In other words, it does not use a single public-key to wrap each of the symmetric keys; that part of the ciphertext is a decoy.) This operation could still respond to every challenge query successfully with the correct symmetric keys, by doing database lookups. But the encryption does not conform to the expected pattern above; there is no single private-key to unlock all files. In effect the user would be paying for a bogus key that has no bearing on the ability to decrypt ciphertexts of interest.

[continued]

CP

Two-factor authentication: a matter of time

How policy change in Russia locked Google users out of their account

Two-factor authentication is increasingly becoming common for online services to improve the protection afforded to customer accounts. Google started down this road in 2009, and its efforts were greatly accelerated when the company got 0wned by China in the Aurora attacks. [Full disclosure: this blogger worked on Google security team 2007-2013] At the time, and to a large extent today, the most popular way of augmenting an existing password-based system with an additional factor involved one-time passcodes or “OTP” for short. Unlike passwords, OTPs changed each time and could not be captured once for indefinite access going forward. (Note this is not equivalent to saying they can not be phished—nothing prevents a crook from creating fake login pages which ask for both password and OTP, and in fact such attacks have been observed in the wild.)

Counters and ticking seconds

There are many ways to generate OTPs but they all follow a similar pattern: there is a secret key, often referred to as a “seed” and this secret is combined with a variable factor such as current time or an incrementing sequence number using a one-way function. This one-way property is critical: it is a security requirement that even if an adversary can observe multiple OTPs and know the conditions they were generated in (such as exact timing) they can not recover the secret-seed required to generate future codes.

Perhaps the most well-known 2FA solution and one of the oldest is a proprietary offering from RSA called SecurID. It  was initially implemented only on hardware tokens sold by the company. SecurID is a decidedly closed ecosystem, at least in its original incarnation: not only did customers have to buy the hardware from RSA Inc but you also had to integrate with a service operated by the same company in order to verify if OTP codes submitted by users. That is because only RSA knew the secret-seed embedded in each token; customers did not receive these secrets or have any way to reprogram the token with their own secrets.

HOTP

Such closed models may have been great for customer lock-in but would clearly not fly in a world accustomed to open standards, interoperability and transparency. (Not to mention the wisdom of relying on a third-party for your authentication, a lesson that many companies including Lockheed-Martin would learn the hard way when RSA was breached by threat-actors linked to China in 2010.) Other industry players began pushing for an open standard, eventually resulting in a new design called HOTP being published as an RFC. The letter “H” stands for HMAC, a modern cryptographic primitive with a sound security model, compared to the home-brew Rube-Goldberg contraption used in SecurID. HOTP uses an incrementing counter as the internal “state” of the token. Each time an OTP is generated, this counter is incremented by one.

That model relies on synchronization beween the side generating OTP codes and the side responsible for verifying them. Both must use the same sequence number in order to arrive at the same OTP value. The sides can get out of sync due to any number of problems: imagine that you generated an OTP (bumping up your own sequence number) but your attempt to login to a website failed because of a network timeout. The server never received the OTP and therefore its sequence number is one behind yours. In practice this is solved by checking a submitted OTP not just against the current sequence number N but a range of values {N, N+1, N+2, …, N + t} for some tolerance value t. If a given OTP checks out against one of the later values in this sequence, the server updates its own copy of the counter, on the assumption that the client skipped past a few values. Even then things can go awry since collisions are possible: a user can accidentally mistype an OTP which then happens to match one of these later numbers, incorrectly advancing the counter.

TOTP

An alternative to maintaining a counter is using implicit state both sides independently have access to without having to synchronize. Time is the most obvious example: as long as both the client generating OTP value and server verifying it have access to an accurate clock, they can agree on the state of the OTP generator. Again in practice there is some clock drift permitted; instead of using a very accurate time down the millisecond, it is instead quantized into intervals of say 30 or 60 seconds. OTP codes are then generated by applying the HMAC function to the secret seed and this time-interval count. This was standardized in an open standard called TOTP, or Time-Based One-Time Password Algorithm.

There is one catch with TOTP: both sides must have an accurate source of time. This is easier on the server-side verifying OTP codes, but more difficult client-side where OTP generation takes place. The reason HOTP and sequence-numbers historically came first is that they could be implemented offline, using compact hardware without network connectivity. While embedded devices can have a clock, the challenge is those clocks require a battery to remain powered 24/7 and more importantly they eventually start drifting, running slower/faster than true time. A token that runs one second too fast every day will be a full 6 minutes ahead after a year.

Fast forward to 2009 with the smartphone revolution already underway, our model envisioned mobile apps handling OTP generation. Unlike stand-alone tokens, apps running on a smartphone have access to a system clock that is constantly being synchronized as long as the device is online. That take cares of time drift—even when the phone is only sporadically connected to the internet— making TOTP more appealing.

Which time?

One of the first questions that comes up about TOTP is the effect of time-zones. What happens when a user sitting in New York computes a TOTP that is submitted to a service in California for verification? In this case client and server separated by three time-zones. At first glance it looks like since they disagree on the current time, time-based OTP would break down in this model. Luckily that is not the case: as with most protocols relying on an accurate clock, TOTP calls on both participants to use an absolute frame of reference, namely the UNIX epoch time. Defined as the number of seconds elapsed since midnight January 1st, 1970 on Greenwich timezone, it does not depend on current location or daylight saving adjustments. That means TOTP generators only need to worry about having an approximately correct clock, a problem that is easily solved when the 2FA app runs on a mobile device with internet connectivity periodically checking some server in the cloud for authoritative time.

Daylight saving time considered harmful

Never underestimate the ability of the real world to throw a wrench in the plans. In 2011 Russia announced that it would not adjust back from daylight saving in the fall:

“President Medvedev has announced that Russia will not come off daylight saving time starting autumn 2011. Medvedev argued that switching clocks twice a year is harmful for people’s health and triggers stress. “

While the medical profession may continue to debate the effects of switching clocks on the general population, this change caused a good deal of frustration for software engineers. Typically “local time” displayed to users is determined by starting from a reference time such as GMT, making adjustments for local timezone and seasonal factors such as daylight saving. In the case of the Android operating system, those adjustments were hard-coded. If Russia did not going switch back to DST as scheduled, those devices would end up displaying the “wrong” time, even when they have perfectly accurate internal clocks.

In principle, this is only a matter of updating the operating system to follow the new, health-conscious Russian regime. In reality of course updating Android devices in the field has been a public quagmire of indifference and mutual hostility amongst device manufacturers, wireless carriers and Google. While the picture has improved drastically with Google moving to assert greater control over the update pipeline, in 2011 the situation was dire. Except for the handful of users on “pure” Google-experience devices such as Nexus S, everyone else was at the mercy of their wireless carrier for receiving software updates and those carriers were far more interested in locking users into 2-3 year contracts by selling another subsidized device than supporting existing units in the field. Critical security updates? Maybe, if you are lucky. Bug fixes and feature improvements? Forget about it.

Given that abject negligence from carriers and handset manufacturers, what is the average user to do when their phone displays 3’o clock when every one else is convinced it is 4’o clock? This user will take matters into their own hands and fix it somehow. The “correct” way to do that is shifting the timezone over by one, effectively going from Kaliningrad to Moscow. But this is far from obvious: the more intuitive fix given this predicament is to manually adjust the system clock forward by an hour. (One soon discovers that automatic time adjustments must also be disabled, or the next check-in against an authoritative time-server on the Internet will promptly restore the “correct” time.) Problem solved, the phone now reports that the local-time is 4’o clock as expected.

Off-by-one (hour)

Except for the unintended interaction with two-factor authentication, specifically TOTP which uses the current time to generate temporary codes. Overriding the system clock will shift the UNIX epoch time too. Now the TOTP generator is being fed from a clock with full one-hour skew. Garbage-in, garbage out. Most TOTP implementations will try to correct for slight clock drifts by checking a few adjacent intervals around current time, where each “interval” is typically 30 or 60 seconds. But no sane deployment is going to look back/forward as far as one hour, on the assumption that if your local clock is that far off, you are going to have many other problems.

Sure enough, reports started trickling in that users in Russia were getting locked out of their Google accounts because the 2FA codes generated by Google Authenticator on Android were not working. (In this blogger’s recollection, our response was adding a special-case check for a handful intervals around the +1 hour mark measured from current time— not all intervals between now and +1 hour mark. This has the effect of slightly lowering security, by increasing the number of “valid” OTP codes accepted, since each interval typically corresponds to a different OTP code modulo collisions.)

Real world deployments have a way of rudely bringing about the “impossible” condition. Until this incident, it was commonplace to assert that the reliability of TOTP based two-factor authentication is not affected by timezones or quirks of daylight saving time. In a narrow sense that statement is still true but that would have been no consolation to the customers in Russia locked out of their own accounts. Looking for a pace to pointer fingers, this is decidedly not a case of PEBKAC. Confronted with an obvious bug in their software, those Android users picked the most intuitive way of solving it. Surely they are not responsible for understanding the intricacies of local-time computation or the repercussions of shifting epoch time. Two other culprits emerge. Is the entire Android ecosystem to blame for not being able to deliver software upgrades to users, a problem the platform is still struggling with today? After all the announcement in Russia came months ahead of the actual change and iOS devices were not affected to the same extent. Or is the root-cause a design flaw in Android, having hard-coded rules about when daylight saving time kick-in? This information could have been retrieved from the cloud periodically, allowing the platform to respond gracefully when the powers-that-be declare DST a threat to the well-being of their citizenry.

Either way this incident is a great example of a security feature going awry because of decisions made in a completely different policy sphere affecting factors originally considered irrelevant to the system.

CP

 

Bitcoin and the ship of Theseus

Change and identity in a decentralized system

The Ship of Theseus is a philosophical conundrum about the continuity of identity in the face of change. Theseus and his ship sail the wide-open seas. Natural wear-and-tear takes its toll on the vessel, requiring its components to be replaced gradually over time. One day it is a few planks in the hull, the next season one of the masts are swapped out, followed by the sails. Eventually there comes a point where not a single nail or piece of fabric is left from the original build, and some parts have been replaced several times over. But unbeknownst to Theseus, a mysterious collector of maritime souvenirs has carefully preserved every component from the original ship taken out during repairs. (This is a variation on the original paradox, due to Hobbes.) In what may be the first case of retro-design, this person meticulously reassembles the original components into their original configuration. The riddle on which much ink has been spilled: which one is the true ship of Theseus? The one that has been sailing the seas all this time or the carefully restored one in the docks, which contains every last nut, bolt and rope from the original?

Bitcoin has been confronting a version of this riddle, most acutely during a few days in March when a hard-fork of the network appeared imminent. To recap: Bitcoin is a distributed ledger recording transactions and ownership of funds. This ledger is organized into “blocks,” with miners competing to tack on new blocks to the ledger, one block on average every 10 minutes. The catch is there is a limit on the size of blocks, which constrains how many transactions can be processed. Currently that stands at 1 megabyte—a gratuitous and arbitrary limit which may have seemed generous back in 2011 when it was first introduced, with plenty of spare room left in blocks to solve the problem. But kicking the can down the road predictably ends exactly as one would expect: increasing popularity of the network all but guaranteed that ceiling would be hit. Results of scarcity follow: transactions both became slower and more expensive. The time expected for a transaction to appear in a block increased and the fees paid to miners for that privilege sky-rocketed.

It is clear the situation calls for some form of scaling improvement. But there is no governance framework for Bitcoin. A system marvelously effective at bringing about distributed consensus at the technology level—everyone agrees on who owns what and which payments were sent—turns out to be terrible at producing consensus at the political level among its participants. Even the existence of a scaling crisis has been disputed. Some argue that Bitcoin excels as a settlement layer or store of value, and there is no reason to increase its on-chain capacity to handle everyday payment scenarios.

After much internecine fighting and several false-starts, the community coalesced around two opposing camps. One side mobilized under the Bitcoin Unlimited (BU) banner seeks to increase block size with a disruptive change, ratcheting up the arbitrary 1MB cap to some other, equally arbitrary but higher limit. On the other side is a group pushing for segregated-witness, a more complex proposal that solves multiple problems (including transaction malleability) but conveniently has the side-effect of providing an effective capacity increase. At the time of writing, this controversy remains in a deadlock. Segregated witness must reach roughly 75% miner support to activate. It has stalled at 30%, prompting supporters to give up on miners and seek an alternative approach called user-activated soft-fork or UASF. Meanwhile BU has been plagued by code-quality problems and DDoS attacks.

At some point in March, the miner support for BU was hovering dangerously close to the magic 50% mark. If that threshold is crossed, those miners could realistically start producing large blocks. While anyone can mine a large block any time, such blocks would be ignored by miners following the 1MB limit. It makes no sense to start producing them when they are only recognized by a minority. Such additions to the ledger would be quickly crowded-out and discarded in favor of alternative blocks obeying the 1MB restriction. But suppose BU exceeds 50%. (With some safety margin thrown in; otherwise there is the risk of block reorganization, where the original chain catches up and results in the entire history of big-blocks getting overwritten.) What happens if BU miners in the majority start producing those large blocks?
This was the question on everyone’s mind in March. It would result in a spit or “forking” of the Bitcoin ledger. Instead of one ledger there would be two parallel ledgers, maintained according to different rules. All blocks up to the point of the fork would be identical. If you owned 1 bitcoin, you still have 1 bitcoin according to both ledgers. But new transactions after the fork point can result in divergence, appearing on only one ledger. In effect two parallel universes emerge, where the same funds are owned by different people.

That brings us full-circle to the philosophical riddle of Theseus: which one of these is “Bitcoin”? Major cryptocurrency exchanges opted for a pragmatic answer: the original chain is Bitcoin-proper. According to this interpretation, the alternate ledger with large blocks will be considered an alternative cryptocurrency traded under its own ticker symbol BTU. (Reuse of the acronym for “British Thermal Unit,” a measure of heat, provides unintended irony for those who consider BU to be a dumpster-fire.)

While that tactical response addressed the uncertainty in markets, it did not provide a coherent definition of what exactly counts as Bitcoin. In effects the signatories were declaring that the chain with the large blocks would be relegated to “alternate coin” status regardless of hash power. For a system where security is derived from miners’ hash power to issue a blanket declaration of the irrelevance of hash power is extraordinary. Arguably the harshest criticism came from left field: the Ethereum community. Ethereum itself had taken flak in the past for taking exactly the same stance of ignoring miner choices during the 2016 DAO bailout. Orchestrated by the Ethereum Foundation and widely panned as crony-capitalism, this intervention resulted in a permanent, with a minority chain “Ethereum Classic” continuing at ~10% of hash power. In a case Orwellian terminology, this alternative chain is in reality the true continuation of the original Ethereum blockchain. What is now referred to as “Ethereum” incorporates the deus ex machina of the DAO intervention. But from a governance perspective, the most troubling aspect of the DAO debacle concerns how the legitimacy of the fork was ordained. When faced with a faction of the community expressing doubts about the wisdom of intervention, the Foundation insisted that regardless of what miners do, the branch reversing the DAO theft would become the official Ethereum branch. Such a priori declarations of the “correct chain” go against the design principle of miners providing integrity of the blockchain through costly, energy-intensive proof-of-work. Why waste all that electricity if you can just ask the Ethereum Foundation what the correct ledger is? In anointing a winner of the hard-fork by fiat without regard for hash power, the Bitcoin community had exhibited precisely the same disregard for market preferences.

Once unmoored from the economics of hash power, arguments about which chain is “legitimate” quickly devolves into philosophical questions about identity. If you hold that 1MB block-size is the sine quo non of Bitcoin, then any hard-fork modifying that property is by definition not Bitcoin. Yet some of the same individuals arguing that changing to 2MB blocks would results in complete loss of identity have also proposed  changing the proof-of-work function, after evidence emerged suggesting that mining hardware from a particular vendor may be exploiting a quirk of the existing PoW function to optimize their hardware. Changing the PoW is arguably a far more disruptive change than tweaking block size.

So it remains an open question what exactly defines Bitcoin and to what extent the system can evolve over time while unambiguously retaining its identity as Bitcoin. Is it still Bitcoin if block sizes are allowed to increase based on demand? If the proof-of-work function is replaced by a different one? Or if the environmentally wasteful proof-of-work model is abandoned entirely in favor of proof-of-stake approach? What if the distribution of coinbase rewards is altered to decrease continuously instead of having abrupt “halving” moments? Or to take a more extreme example, if the deflationary model with money supply capped at 21 million bitcoin is lifted, allowing the money supply to continue expanding indefinitely? Is it still “Bitcoin” or does that system deserve to be relegated to alt-coin status with an adjective attached to its name? A related question is who gets to make the branding determination? When Ethereum went through its hard-fork to bail to the DAO, it was the altered chain bestowed with the privilege of carrying the Ethereum name; the original, unmodified chain got relegated to second-class citizen as “Ethereum Classic.” Would the situation have been reversed if the Ethereum Foundation was instead opposed to intervention and the hard-fork was instead driven by a grassroots community effort to rescue the DAO at all costs?

Having been pronounced for dead multiple times, Bitcoin continues to defy the odds. It is already up more than 50% against the USD for 2017 at the time of writing, having survived a crack-down on capital controls in China. Yet the contentious scaling debate shows no signs of slowing down. BU proponents continue to lobby for a disruptive hard-fork, while segregated-witness adherents play a a game-of-chicken with user-activated soft forks.  Will the resulting system—or one of the resulting systems, in case the contentious fork results in a proliferation of incompatible blockchains— still qualify as “Bitcoin”? Beyond the crisis du jour, it remains unclear if Bitcoin is capable of improving by incorporating new ideas, especially when these ideas call for a disruptive change  breaking backwards compatibility. If the community interprets every hard-fork as an identity crisis that calls into question the meaning of “Bitcoin,” the resulting stasis will  place Bitcoin at a disadvantage compared to alternative cryptocurrencies which are more responsive to market demand. (To wit, the so-called “Bitcoin dominance index” which measures the market capitalization of BTC as a fraction of all cryptocurrencies is now at an all-time low, having dipped below 50% mark.) There is something to be said about stability and consistency. Whimsical changes and excessive interventionism of the type demonstrated during the Ethereum DAO hard-fork do not inspire confidence in the long-term reliability of a currency either. Bitcoin so far has stubbornly occupied the opposite end of the spectrum, clinging to a literal, originalist interpretation of its identity defined by Satoshi.

That is one way of dodging the paradox of Theseus: this ship may be taking on water, but at least every single one of its planks is original.

CP

 

Trading cryptocurrency without trusted third-parties (part III)

[continued from part II]

Reliable execution matters

The preceding discussions suggest it is possible in principle to exchange cryptocurrency across different blockchains, without calling on a trusted third-party to hold funds in escrow. (Or viewed another way, the blockchain itself is the trusted third-party equivalent, its immutable rules guaranteeing all-or-nothing fair exchange where neither side can cheat the other.) That however is not enough to achieve feature parity with existing marketplaces. It provides only one piece of the puzzle: arranging for settlement of funds after a trade is agreed upon. The problem statement made a leap of faith in assuming that Alice and Bob already found each other, and somehow came to an agreement on price/quantity. But that arguably is the raison d’être of markets: helping buyers and sellers locate each other while facilitating price discovery. In realistic equity markets,  such arrangements of crossing buy/sell orders can take far more complex forms than  pairwise arrangements. For example it is very common for an order to be executed piece-meal: when Alice places an order to sell 10BTC, some fraction of that order is paired with Bob while the remainder goes to Carol. These can even take place at different times; Alice has a partially-executed order in the interim before Carol shows up, and she could even cancel the remainder.

Meanwhile accurate price discovery depends on reliable trade execution. Suppose the exchange stopped at matching Alice and Bob, delegating the actual trading for the individual parties to work out. Imagine Alice and Bob each receiving an email: “Congratulations, we found a counter-party for your trade. Here is their contact information.” At this point there is no guarantee that settlement will take place. If Alice or Bob back-out—which does not violate the fair-exchange property as long as neither side delivered anything—this trade did not occur as far as the market is concerned. That means the bid/ask quotes come with a prominent disclaimer: you can buy/sell at this price as long as your counter-party is in the mood for executing the settlement. This is very different from the expectation of trading in equity markets: if there is an order on the book to sell 10 shares of Google stock at a specified price, and a buyer shows up offering that exact price/quantity, there is very high confidence that this trade will execute. (In fact one of the main objections to high-frequency trading popularized by accounts like Flash Boys involve edge-cases where those guarantees are weakened due to order- spoofing and phantom liquidity: seemingly available trades disappearing when somebody attempts to take advantage of it by posting the corresponding buy/sell order.)

In principle, trade execution can be incentivized in a P2P exchange by creating an economic structure of rewards and fines. Customers who bail out on the settlement can be forced to pay additional restitution to the exchange or their counter-party. In some cases the guilty party is easy to determine. Fair-exchange protocols that leverage the blockchain can be audited publicly. Anyone can observe its progress and determine who backed out at which stage. But turning this into a fee/reward structure already requires creating a financial dependency between the exchange and its customers. For example, customers may have to post bond as insurance against abandoned trades.

Second there is still the problem of friction and delays introduced by forcing every trade to hit the blockchain. In a traditional exchange, when Alice and Bob swap BTC for ETH that trade is not reflected on any external blockchain. Only an internal ledger reflecting their balances is updated. Requiring all such transactions to execute on-chain both introduces delays and aggravates the scaling challenge, particularly in the case of Bitcoin which is already facing acute congestion while proposed solutions are mired in political gridlock. (Ethereum is relatively fast with block times measured in seconds and plenty of room in blocks to accommodate expanded usage.) At its current capacity, the network has an estimated total capacity around ~7 transactions per second. By comparison roughly 1 trade per second occurred for USD/BTC alone on major exchanges over the last thirty days. If all of that activity were reflected on the blockchain, it would account a significant fraction of overall capacity and further strain the overloaded network. And that is just one trading pair among many currencies for which BTC markets exist. Not to mention variable fees that must be paid to miners for moving bitcoin on chain, compared to the efficiency of updating an internal ledger.

Unless settlement is immediate and guaranteed, at best Alice and Bob have something akin to a futures contract in place. Each is promising to deliver some asset (BTC or ETH) at a future time for a price agreed upon today. That price is not an accurate reflection of present BTC/ETH price: neither party is guaranteed to receive their BTC or ETH immediately at that price. Especially for volatile assets such as cryptocurrency where price can fluctuate widely in a short span of time, this is an important consideration. Paradoxically such high volatility can encourage parties to back out of settlement if they have the chance. Suppose Alice agreed to sell 1 BTC for 20ETH but while she is working through the peer-to-peer settlement process with Bob, a spike in BTC price makes her assets now worth 21ETH. She has every incentive at this point to walk away from the trade and seek an alternative buyer at the improved price. Meanwhile Bob who assumed that he had a deal to exchange his ETH for BTC discovers that the offer was a mirage. Without a forcing mechanism to guarantee timely (ideally, real-time) settlement of trades, prices quoted on the order-book become an unreliable indicator of supply/demand.

Granted none of the preceding implies that a fully decentralized, trust-free exchange with real-time settlement can not exist. It simply points to the chasm that exists between current attempts to replace the traditional exchange model, and places the problem of settlement—which may well turn out to be the easy piece—in context with the full spectrum of functionality that full-fledged market places are expected to provide. There are many challenges and open problems involved in designing a solution that can reasonably compete with the existing paradigm.

CP

Trading cryptocurrency without trusted third-parties (part II)

[continued from part I]

To recap the scenario: Alice and Bob are interested in trading bitcoin (BTC) for ether (ETH.) Alice owns BTC, Bob has ETH, and they have agreed on pricing and quantity. (Note we are fast-forwarding past the scene where Alice and Bob miraculously located each other and organized this trade. That is one of the most valuable functions of a market, a point that we will return to.) Now they want to set up a fair-exchange where Alice only receives her ETH if Bob receives the corresponding amount of BTC.

Fragility of ECDSA as a feature

One way to do this involves turning what could be considered a “bug” in the ECDSA signature algorithm—used by both Bitcoin and Ethereum— into a feature. ECDSA is a randomized signature algorithm. Signing a message involves picking a random nonce each time. The random choice of nonce for each operations means even signing the same message multiple times can yield a different result each time. This is in contrast to RSA for example, where the most common padding mode is deterministic. Processing the same message again will yield the exact same signature.** It is critical for this nonce to be unpredictable and unique, otherwise the security of ECDSA completely breaks down:

  • If you know the nonce, you can recover the private key.
  • If the same unknown nonce is reused across different messages you can recover the private key. (Just ask Sony about their PlayStation code-signing debacle.)
  • It gets worse: if multiple messages are signed with different nonces with known relationship (such as, linear combination of some nonces equals another one) you can still recover the private key.

That makes ECDSA highly fragile, dependent critically on a robust source of randomness. It also means implementations susceptible to backdoors: a malicious version can leak private-keys by cooking the nonce while appearing to operate correctly by producing valid signatures. Variants have been introduced to improve this state of affairs. For example deterministic ECDSA schemes compute the nonce as a  one-way function of secret-key and message, without relying on any source of randomness from the environment.

But this same fragility can prove useful as a primitive for exchanging funds across different blockchains, by deliberately forcing disclosure of a private key. Specifically, it’s possible to craft an Ethereum smart-contract that releases funds conditionally on observing two valid signatures for different messages with the same nonce.

Setup

  • Alice has her public-key A, which can be used to create corresponding addresses on both Bitcoin & Ethereum blockchains.
  • Bob likewise has public-key B.
  • Alice generates a temporary ECDSA key, the “transfer-key” T.

Before starting execution, Alice rearranges her funds and moves the agreed-upon quantity of bitcoin into a UTXO with a specific redeem script. The script is designed to allow spending if either one of these two conditions are satisfied:

  • One signature using Alice’s own public key A but only after some time Δ has elapsed. This is a time-lock enabled by the  check-locktime-verify instruction.
  • 2-of-2 multi-signature using Bob’s public key B and the transfer key T.

Once this UTXO is confirmed, Alice sends Bob a pointer to the UTXO on the blockchain. In practice she would also have to send the redeem script for Bob to verify that it has been constructed. (Since the P2SH address is based on a one-way hash of the script, it is not possible in general to infer the original script from an address alone.)

Once Bob is satisfied that Alice has put forward the expected Bitcoin amount subject to the right spending conditions, he sets up an Ethereum contract. This contract has two methods:

  • Refund(): Can only be called by Bob using B and only after some future date. Sends all funds back to Bob’s address. This is used by Bob to reclaim funds tied up in the contract in case Alice abandons the protocol.
  • Exchange(signature1, signature2): This method is called by Alice and implements the fair-exchange logic. It expects two signatures using the transfer-key T over predefined messages, which can be fixed ahead of time such as “foo” and “bar”. The method verifies that both signatures are valid and more importantly they are reusing the ECDSA nonce. (In other words, the private key for T has been disclosed.) If these conditions are met, the contract sends all of its available balance to Alice’s address.

Alice in turn needs to verify that this contract has been setup correctly. As a practical matter, all instances of the contract can share the same source-code, differentiated only by parameters they receive during the contract creation. These constructor parameters are the Ethereum addresses for Alice and Bob, along with the public-key for T to check signatures against. That way there is no need to reverse-engineer the contract logic from EVM byte-code. A single reference implementation can be used for all invocations of the protocol. Only the constructor arguments need to be compared against expected values, along with the current contract balance.

Assuming this smart-contract is setup correctly, Alice can proceed with taking delivery of the ETH from Bob. She signs two messages with her private-key, reusing the same nonce for both. Then she invokes the Exchange method on the contract with these signatures. Immutability of smart-contract logic dictates that upon receiving two signatures with the right properties, the contract has no choice but to send all its funds to Alice.

At this point Alice has her ETH but Bob has not claimed his BTC. This is where the fair-exchange logic comes into play: Alice staked her claim to the ETH by deliberately disclosing the private-key T. Looking back at the redeem script for Alice’s UTXO on the blockchain, possession of T and Bob’s key B allows taking control of those funds. Bob can now sign a transaction using both private-keys to move that BTC to a new address he controls exclusively. Meanwhile Alice herself is prevented from taking those funds back herself because of the timelock.

The fine-print: caveats and improvements

A few subtleties about this protocol. Invoking Exchange() on the contract means the entire world learns the private key for T, not just Bob; blockchain messages are broadcast so all nodes can verify correct execution. Why not have Alice send one of the signatures to Bob out-of-band, in private? A related question is why not allow the Bitcoin funds to be moved using the transfer-key T only, instead of requiring a multi-signature? The answer to both of these is that Bob can not count on his knowledge of T being exclusive. Even if the Ethereum smart-contract only expected a single signature (having the expected nonce hard-coded) Alice can still publish the private-key for T to the entire world after she receives her ETH. If her funds only depended on a single key T for control, it would become a race-condition between Bob and everyone else in the world to claim them. Alice does not care; once she discloses T someone will take her BTC. But Bob cares very much that he is the only recipient and not have to race against others to get their TX mined first. Including an additional key B only known to Bob guarantees this, while also making it moot whether other people come in possession of the private-key for T.

Speaking of race conditions, there is still one case of Bob racing against the clock: he must claim the bitcoin before the time-lock on the alternative spending path expires. Recall that Alice can claw-back her bitcoin after some time/block-height  is reached. That path is reserved for the case when the protocol does not run to completion, for example Bob never publishes the ethereum smart-contract. But even after Bob has published the contract and Alice invoked it to claim her ETH, the alternative redemption path remains. So there is an obligation for Bob to act in a timely manner. The deadline is driven by how the time-locks are chosen. Recall that the Ethereum smart-contract also has a deadline after which Bob can claw the funds back if Alice fails to deliver T. If this is set to say midnight on a given day while the Bitcoin UTXO is time-locked to midnight the next day (these are approximate, especially when specified as block-height since mining times are randomly distributed) then Bob has 24 hours to broadcast the transaction. That time window can be adjusted based on the preferences of two sides, but only at the risk of increasing recovery time after protocol is abandoned. In that situation Alice is stuck waiting out the expiration of this lock before she can regain control of her funds.

Another limitation in the basic protocol as described is lack of privacy. The transaction is linkable across blockchains: the keys A, B and T are reused on both sides, allowing observes to trace funds from Bitcoin into Ethereum. This situation can be improved. There is no reason for Alice to reuse the same key A for reclaiming her Bitcoin as the key she uses to receive Ethereum from Bob. (In fact Bob only cares about the second one since that is given as a parameter to the contract.) Similarly Bob can split B into two different keys. Dealing with T is a little more tricky. At first it looks like this must be identical on both chains to allow private-key disclosure to work. But there is another trick Alice and Bob can use. After Alice gives the public-key for T to Bob, Bob can craft his Ethereum contract to expect the related key T* = m·T for a random scalar m used to mask the original key. He in turn shares this masking factor with Alice. Since Alice has the private key for T, she can also compute the private key for T* by simply multiplying with m. When she discloses that private-key, Bob can now recover the original key for T by using the inverse of m. Meanwhile to outside observers the keys T and T* appear unrelated. This provides a form of plausible deniability. If many people were engaging in transactions of this exact format with identical parameters, it would not be possible to link the Bitcoin side of the exchange to the Ethereum side. But “identical parameters” is the operative qualification. If Alice and Bob are trading 1BTC while Carol and David are trading 1000BTC, the transactions are easily separated. Similarly if the time-locks on ETH and BTC side are not overlapping, it becomes possible to rule out an ETH contract as being the counterpart of another BTC transaction posted around the same time.

Finally an implementation detail: why use the repeated-nonce trick for disclosing private key instead of simply sending private-key bits to the contract? Because the Solidity language used for writing smart-contract has a convenient primitive for verifying ECDSA signatures given a public-key. It does not have a similar primitive to check if a given private-key corresponds to a public-key. In fact it makes sense for Solidity to have no facilities for working with private-keys. Since all smart-contract execution is public, the assumption is only publicly available information would ever be processed by the contract and never secret material. For this reason we resort to the nonce reuse trick. Ethereum virtual machine also has the additional primitives required to compare two signatures for nonce equality. Interestingly Bitcoin script-language is exactly one instruction shy of being able to accomplish that. The instruction OP_CAT is already defined in the scripting language but currently disabled and for good reason: without other limits, it can be used as a denial-of-service vector. But if OP_CAT were enabled, it could be used to construct a redeem script that receives ECDSA signatures in suitably encoded form (nonce and second component as individual stack-operands) and checks them for nonce reuse. Other “splicing” opcodes such as OP_SUBSTR can also achieve the same effect by parsing the full ASN1 encoded ECDSA signature to extract the nonce piece into an individual stack operand where it can be compared for equality against another nonce. Either way, it would allow inverting the protocol sequence: Bob posts a smart-contract on the Ethereum blockchain first, Alice sets up the corresponding Bitcoin UTXO, which Bob proceeds to claim by disclosing the transfer key.

[continued]

CP

** RSA does have a randomized padding mode as well called PSS.

Trading cryptocurrency without trusted third-parties (part I)

[Full disclosure: the author works on security for a cryptocurrency exchange]

The collapse of Mt. Gox in 2014 and its aftermath has inspired a healthy dose of skepticism towards storing cryptocurrency with online services. It has also inspired the search for decentralized exchange models where the functionality provided by MtGox can be realized without a single point-of-failure where all risk is concentrated. While the mystery of what went on at Mt Gox remains unresolved to this day, blockchain designs have continued to evolve. Bitcoin itself has not changed much at the protocol level, although it added a couple of new instructions to the scripting language. More significant advances happened with the introduction of segregated-witness, along with the emergence of so-called “layer 2” solutions for scaling such as the Lightning Network. Even more promising is the emergence of alternative blockchains capable of expressing more complex semantics, most notably Ethereum with its Turing-complete smart-contract language. This makes it a good time to revisit the problem of decentralized cryptocurrency exchange without the concentration risk created by storing funds.

Answering this question in turn requires reexamining the purpose of an exchange. At the simplest level, an exchange connects buyers and sellers. Sellers post the quantity they are willing to part with and a price they are willing to accept. Buyers in turn place bids to purchase a specific quantity at a price of their choosing. When these two sides “cross”—the bid meets or exceeds an ask— a trade is executed. The exchange facilitates the transfer of assets in both directions, delivering assets to the buyer while compensating the seller with the funds provided by the buyer.

In an ideal world where everyone is good for their word, this arrangement does not require parking any funds with the exchange.  If Alice offers to sell 1BTC and Bob has agreed to purchase that for $1200, we can count on Alice to deliver the cryptocurrency and Bob to send the US dollars. In this hypothetical universe, they do not have to place funds in escrow with the exchange or for that matter any other third-party. Bob can wire fiat currency to Alice’s bank-account and Alice sends Bitcoin over the blockchain to Bob’s address. In reality of course people frequently deviate from expected protocol, violate contractual obligations or engage in outright fraud. Perhaps Bob never had the funds to begin with or he had a change of heart after finding a cheaper price on another exchange after agreeing to the trade with Alice.

These are examples of counter-party risk. It becomes increasingly unmanageable at scale. It would be one thing if Alice and Bob happen to know each other, or expect to be doing business continuously—in these scenarios “defecting” and trying to cheat the other side becomes counterproductive. With thousands of participants in the market and interactions between any pair being infrequent, there is not much of an opportunity to build up a reputation. It is infeasible for everyone to keep tabs on the trustworthiness of every potential counter-party they may be trading with, or to disadvantage new participants because they have no prior history to evaluate.

The standard model for exchanges provides one possible solution to this problem: Alice and Bob both deposit their funds with the exchange. The exchange is responsible for ensuring that all orders are fully covered by funds under custody. Using the example of BTC/USD trading, Alice can only offer to sell Bitcoin she has stored at the exchange and Bob can only place buy orders that his fiat balance can cover. Bob can be confident that the assets he just bid on are not phantom-Bitcoins that may fail to materialize after the trade completes. Likewise Alice knows she is guaranteed to receive USD regardless of which customer ends up being paired with her order.

The counter-party risk is mitigated but only at the expense of creating new challenges. In this model, the exchange becomes a custodian funds for everyone participating in the market. Aside from the obvious risk of a MtGox-type implosion, it creates a liquidity problem for these actors: their funds are tied up. Consider that a trader will be interested in betting on multiple cryptocurrencies across multiple exchanges. Even within a single trading pair such as USD/BTC, there are significant disparities in prices across exchange, creating arbitrage opportunities. But exploiting such disparities requires either maintaining positions everywhere or rapid funds movement between exchanges. Speed of Bitcoin movement is governed by mining time—which is an immutable property of the protocol, fixed at 10 minutes on average— and competition against other transactions vying for scarce room in the next block. In principle fiat currency can be moved much faster using the Federal Reserve wire system but that too depends on the implementation of wire transfer functionality at each exchange. All of this spells increased friction for moving in/out of markets, as well as greater amount of capital committed at multiple exchanges in anticipation of trading opportunities.

Is it possible to eliminate counter-party risk without introducing these inefficiencies? Over the years, alternative models have been put forward for trading cryptocurrency while eliminating or at least greatly reducing the concentration of risk. For example Bitsquare bills itself as a decentralized exchange, noting that it does not hold any user funds. Behind the scenes, this is achieved by relying on trusted arbitrators to mediate exchanges and resolve disputes:

“If Trader A fails to confirm the receipt of a national currency transfer within the allotted time (e.g. six days for SEPA, one day for OKPay, etc.), a button to contact the arbitrator will appear to both traders. Trader B will then be able to submit evidence to the arbitrator that he did, in fact, send the national currency. Alternatively, if Trader B never sent the national currency, Trader A will be able to submit evidence to the arbitrator that the funds were never received.”

In other words, counter-party risk is managed by having humans in the loop acting as trusted third-parties, rendering judgment on which side of the trade failed to live up to their obligations. The system is designed with economic incentives to encourage following the protocol: backing out of a trade or failing to deliver promised asset does result in loss of funds for the party at fault. (Interesting enough, the punitive damages are rewarded to the arbitrator, rather than the counter-party inconvenienced by that transgression. It is practically in the interest of arbitrators to have participants misbehave, since they get to collect additional payments above and beyond their usual fee.) Arbitrators are also required to post a significant bond, which they will lose if they are caught colluding with participants to deviate from the protocol.

Even with the fallibility of human arbitrators, this system achieves the stated goal of diffusing risk: instead of relying on the exchange to safeguard all funds, participants rely on individual arbitrators to watch over much smaller amounts at stake in specific trades. But there are other types of risk this arrangement can not hedge against, notably that of charge-backs. This is a very common challenge when trying to design a system for trading fiat currency against cryptocurrency. Blockchain transfers are irreversible by design. By contrast, most common options for transmitting fiat can be reversed in case they are disputed. For example, if an ACH transfer is initiated using stolen online banking credentials, the legitimate owner can later object to this transaction by notifying their bank in writing. Depending on the situation, they may have up to 60 days to do so. If the bank is convinced that the ACH was unauthorized, they can reverse the ACH transfer. What this means is that Alice can face an unpleasant surprise many weeks after releasing Bitcoin to Bob. Bob— or whoever owns the account Bob used to send those funds— can recover the funds Alice received as proceeds, leaving her holding the proverbial bag, since she has no recourse to clawing back bitcoin.

Also note that functionality is somewhat reduced compared to a traditional exchange. As the FAQ notes, settlement phase can take multiple days depending on how fiat-currency is sourced. Bitcoin purchased this way is not available immediately; it can not be transferred to a personal wallet or used to pay for purchase. That’s a stark contrast from a conventional exchange where settlement is nearly instantaneous. Once the trade has executed, either side can take their USD or BTC, and use it right away, withdraw to another address or place orders for a different pair such as BTC/ETH. In P2P models, availability of funds depends on the fiat payment clearing to the satisfaction of the counter-party, and that person getting around to sending the cryptocurrency. High-frequency trading in the blink of an eye, this is not.

Looking beyond fielded systems to what is possible in theory, we can ask whether there are any results in cryptography that can provide a basis for truly decentralized, trust-free trading of currencies. Here the news is somewhat mixed.

This problem in the abstract has been studied under the rubric of fair-exchange. A fair-exchange protocol is an interactive scheme for two parties to exchange secrets in an all-or-nothing manner. That is, Alice has some secret A and Bob has a different secret B. The goal is to design a protocol such that after a number of back-and-forth messages, one of two outcomes happen:

  • Alice has obtained B and Bob has obtained A.
  • Neither one has learned anything new

This protocol is “fair” because neither side comes out ahead in any outcome. By contrast, if there was an outcome where Alice learns B and Bob walks away empty-handed, the result would be decidedly unfair to Bob. There is a nagging question here of how participants can verify the value and/or legitimacy of their respective secrets ahead of time. But assuming that problem can be solved, such protocols would be incredibly useful in many contexts including cryptocurrency. For example if A happens to be a private-key controlling an Ethereum account while B controls some bitcoin, one could implement BTC/ETH trade by arranging for an exchange of those secrets.

Now the bad news: there is an impossibility result proving that such protocols can not exist. A 1999 paper titled “On the Impossibility of Fair Exchange without a Trusted Third Party” shows exactly what the title says: there exists no protocol which can achieve the above objectives with only Alice and Bob in the picture. There must be an impartial referee Trent such that if either Alice or Bob deviate from the protocol, Trent can intervene and force the protocol to produce an equitable outcome. The silver lining is that the negative result does not rule out so-called optimistic fair-exchange, where third-party involvement is not required provided everyone duly performs their assigned role. The referee is only asked to intervene when one side deviates from the expected sequence. But “hope is not a method,” as the saying goes. Given the sordid history of scams and fraudulent behavior in cryptocurrency, counting on everyone to follow the protocol is naive.

On paper this does not bode well for the vision of implementing trust-free exchange. But this is where blockchains provide a surprising assist: it has been observed that the blockchain itself can assume the role of an impartial third-party. Here is a simple example from 2014 where Andrychowicz et al. leverage Bitcoin to improve on a well-known cryptographic protocol for coin-flipping. Slightly simplified, the original protocol proceeds this way:

  1. Alice and Bob both pick a random bit string
  2. They “commit” to their strings, by computing a cryptographic hash of that value and publishing that commitment
  3. After both have committed, each side “opens” the commitment by revealing the original string
  4. Since the hash function is public, both sides can check that commitments were opened correctly
  5. Alice and Bob now compare the least-significant bits of the two unveiled strings. If those bits are identical, Alice wins the coin-toss. Otherwise Bob wins.

This is great in theory but what happens if Bob stops at step #3? After all, once Alice reveals her commitment, Bob has full-knowledge of both strings. He can already see the writing on the wall if he lost. That would be a great time to feign network connection issues, Windows 10 upgrade or any other excuse to stop short of revealing his original choice to prevent Alice from obtaining the information necessary to prove she won the coin-toss.

Enter Bitcoin. Blockchains allow defining payments according to predetermined rules. Those rules have fixed capabilities; they can not magically reach out into the real world, dive-tackle Bob and compel him to continue protocol execution. But they can arrange for the next best outcome: make it economically costly for Bob to deviate from the protocol. Specifically Alice and Bob both must commit some funds as good-faith deposit at the outset. To reclaim their money, they must open the commitment and reveal their original bit-string by a set deadline. If either side fails to complete the protocol in a timely manner, the other party can claim their deposit. This outcome is “fair” in the sense that Bob backing out (regardless of how creative his excuse is) results in Alice being compensated.

Variants of this idea can be used to design protocols for fair exchange of crypto-currency between different blockchains. The next post will look at a specific example involving Bitcoin and Ethereum. This is admittedly a case of looking for keys under the lamp-post; developing protocols to exchange crypto-currencies is much easier than trading against fiat. Blockchain payments proceed according to well-defined mathematical structures. By contrast, fiat movement involves notions such as ACH or wire-transfers that are extrinsic to the blockchain, and not easily mapped to those constructs.

[continued in part II]

CP

Bitcoin and the C-programmer’s disease

Revenge of the C programmer

The Jargon File, a compendium of colorful terminology from the early days of computing later compiled into “The New Hacker’s Dictionary” defines the C programmer’s disease as the tendency of software written in that particular programming language to feature arbitrary limits on its functionality:

C Programmer’s Disease: noun.
The tendency of the undisciplined C programmer to set arbitrary but supposedly generous static limits on table sizes (defined, if you’re lucky, by constants in header files) rather than taking the trouble to do proper dynamic storage allocation. If an application user later needs to put 68 elements into a table of size 50, the afflicted programmer reasons that he or she can easily reset the table size to 68 (or even as much as 70, to allow for future expansion) and recompile. This gives the programmer the comfortable feeling of having made the effort to satisfy the user’s (unreasonable) demands, …

Imagine spreadsheets limited to 50 columns, word-processors that assume no document will exceed 500 pages or a social network that only lets you have one thousand friends. What makes such upper bounds capricious—earning a place in the jargon and casting aspersions on the judgment of C programmers everywhere— is that they are not derived from any inherent limitation of the underlying hardware itself. Certainly handling a larger document takes more memory or disk space. Even the most powerful machine will max out eventually. But software afflicted with this problem pays no attention to how much of either resource the hardware happens to possess. Instead the software designers in their infinite wisdom decided that no sane user needs more pages/columns/friends than what they have seen fit to define as universal limit.

It is easy to look back and make fun of these decisions with the passage of time, because they look incredibly short-sighted. “640 kilobytes ought to be enough for anybody!” Bill Gates allegedly said in reference to the initial memory limit of MS-DOS (although the veracity of this quote is often disputed.) Software engineering has thankfully evolved beyond using C for everything. High-level languages these days make it much easier to do proper dynamic resource allocation, obviating the need for guessing at limits in advance. Yet more subtle instances of hardwired limits keep cropping up in surprising places.

Blocked on a scaling solution

The scaling debate in Bitcoin is one of them. There is a fundamental parameter in the system, the so-called block-size, which has been capped at a magic number of 1MB. That number has a profound effect on how many transactions can take place, in other words how many times funds can be moved from one person to another, the sine qua non When there are more transactions available than blocks, congestion results: transactions take longer to appear in a block and miners can become more picky about which transactions to include.

Each transaction includes a small fee paid to miners. In the early days of the network, these fees were so astronomically low that Bitcoin was being touted as the killer-app for any number of problems with entrenched middlemen. (In one example, someone moved $80M paying only cents in fees.) Losing too much of your profit margin to credit-card processing fees? Accept Bitcoin and skip the 2-3% “tax” charged by Visa/Mastercard. No viable alternative to intrusive advertising to support original content online? Use Bitcoin micro-payments to contribute a few cents to your favorite blogger each time you share one of their articles. Today transactions can exceed ~$1 on average. No one is seriously suggesting paying for coffee in Bitcoin any longer, but some other scenarios such as cross-border remittances where much larger amounts are typically transferred with near-usurious rates charged by incumbents like Western Union remain economically competitive.

Magic numbers and arbitrary decisions

Strictly speaking the block-size cap is not a case of the C programmer’s disease. Bitcoin Core having been authored in C++ has nothing to do with the existence of this limit. To wit, many other parameters are fully configurable or scale automatically to utilize available resources on the machine where the code runs. The blocksize is not an incidental property of the implementation, it is a deliberate decision built into the protocol. Even alternative implementations written in other languages are required to follow it. The seemingly-innocuous limit was introduced to prevent disruption to the network caused by excessive blocks. In other words, there are solid technical reasons for introducing some limit. Propagating blocks over the network gets harder as their size increases, a problem acutely experienced by the majority of mining power which happens to be based in China and relying on high-latency networks behind the Great Firewall. Even verifying that a block is correctly produced is a problem, due to some design flaws in how Bitcoin transactions are signed. In the worst-case scenario the complexity of block verification scales quadratically: a transaction twice as large can take four times as much CPU time to verify. (A pathological block containing such a giant transaction was mined at least once, in what appears to have been a good-intentioned attempt by a miner to clean up previous errant transactions. Creating such a transaction is much easier than verifying it.)

In another sense, there is a case of the C programmer attitude at work here. Someone, somewhere made an “executive decision” that 1MB blocks are enough to sustain the Bitcoin network. Whether they intended that as a temporary stop-gap measure to an ongoing incident, to be revisited later with a better solution, or as an absolute ceiling for now and ever is open to interpretation. But one thing is clear: that number is arbitrary. From the fact that a limit must exist, it does not follow that 1MB is that hollowed number. There is nothing magical about this quantity to confer an aura of inevitable finality on the status quo. It is a nice, round number pulled out of thin-air. There was no theoretical model built to estimate the effect of block-size on system properties such as propagation time, orphaned blocks,  bandwidth required for a viable mining operation— the last one being critical to the idea of decentralization. No one solved a complex optimization problem involving varying block-sizes and determined that an even 1000000 bytes is the ideal number. That was not even done in 2010, much less in the present moment where presumably different, better network conditions exist around bandwidth and latency. If anything, when academic attention turned to this problem, initial results based on simulation suggested that the present population of nodes can accommodate larger blocks.

Blocksize and its discontents

Discontent around the blocksize limit grew louder in 2015, opening the door to one of the more acrimonious episodes in Bitcoin history. The controversy eventually coalesced around two camps. The opening salvo came from a group of developers who pushed for creating an incompatible version called Bitcoin XT, with a much higher limit: initially 20MB, later “negotiated” down to 8MB. Activating this version would require a disruptive upgrade process across the board, a hard-fork where the network risks splintering into two unless the vast majority of nodes upgrade. Serious disruption can result if a sizable splinter faction continues to run the previous version which rejects large blocks. Transactions appearing in these super-sized blocks would not be recognized by this group. In effect Bitcoin an asset itself would splinter into two. For each Bitcoin there would have been one“Bitcoin XT” you own on the extended ledger with large blocks and one garden-variety old-school Bitcoin owned on the original ledger. These two ledgers would start out identical but later evolve as parallel universes, diverging further with each transaction that appears on one chain without being mirrored in the other.

To fork or not to fork

If the XT logic for automatically activating a hard-fork sounds like a reckless ultimatum to the network, the experience of the Ethereum project removed any doubts on just how disruptive and unpredictable such inflection points can get. An alternative crypto-currency built around smart contracts, Ethereum had to undertake its own emergency hard-fork to bailout the too-big-to-fail DAO. The DAO (Distributed Autonomous Organization) was an ambitious project to create a venture capital firm as a smart-contract running on Ethereum with direct voting on proposal by investors. It had amassed $150M in funds until an enterprising crook noticed that the contract contained a security bug and exploited it to siphon funds away. The Ethereum Foundation sprung into action, arranging for a hard-fork to undo the security breach and restore stolen funds back to the DAO participants. But the rest of the community was unimpressed. Equating this action to crony-capitalism and bailout of failed institutions common in fiat currencies—precisely the interventionist streak that crypto-currencies were supposed to leave behind— a vocal minority declined to go along. Instead of going along with the fork, they dedicated their resources to keeping the original Ethereum ledger going, now rebranded as “Ethereum Classic.” To this day ETC survives as a crypto-currency with its own miners, its own markets for trading against other currencies (including USD) and most importantly its own blockchain. In that parallel universe, the DAO theft has never been reverted and the alternate ending of the DAO story is the thief riding off into the sunset holding bags of stolen virtual currency.

The XT proposal arrived on the scene a full year before Ethereum provided this abject lesson on the dangers of going full-speed ahead on contentious forks. But the backlash against XT was nevertheless swift. Ultimately one of its key contributors rage-quit, calling Bitcoin a failed experiment. One year after that prescient comment, Bitcoin price had tripled, proving Yogi Berra’s maxim about the difficulty of making predictions. But the scaling controversy would not go away. Blocks created by miners continued to edge closer to the absolute limit, fees required to get transactions into those blocks started to fluctuate and spike, as did confirmation times.

Meanwhile Bitcoin Core team quietly pursued a more cautious, conservative approach, opting for introducing non-disruptive scaling improvements, such as faster signature verification to improve block verification times. This path avoided creating any ticking time-bombs or implied upgrade-or-else threats for everyone in the ecosystem. But it also circumscribed limits on what types of changes could be introduced when maintaining backwards compatibility is a nonnegotiable design goal. The most significant of these improvement was segregated-witness. It moves part of transaction data outside the space allotted to transactions within a block. This also provides a scaling improvement of sorts, a virtual block-size increase without violating the sacred 1MB covenant: by slimming down the representation of transactions on the ledger, one could squeeze more of them into the same scarce space available in one block. The crucial difference: this feature could be introduced as soft-fork. No ultimatums to upgrade by a certain deadline, no risk of network-wide chaos in case of failure to upgrade. Miners indicate their intention to support segregated witness in the blocks they produce. The feature is activated when a critical threshold is reached. If anything segregated witness was too deferential to miner votes, requiring an unusually high degree of consensus at 95% before going into effect.

Beyond kicking the can down the road

At the time of writing, blocks signaling support for segregated witness plateaued around 30%. Meanwhile Bitcoin Unlimited (BU) inherited the crown from XT in pushing for disruptive hard-forks, by opening the door to miners voting on block size. It has gained enough support among miners that a contentious fork is no longer out of the question. Several exchanges have signed onto a letter describing how Bitcoin Unlimited would be handled if it does fork into a parallel universe, and at least one exchange has already started trading in futures about the fork.

Instead of trying to make predictions about how this stand-off will play out, it is better to focus on the long-term challenge of scaling Bitcoin. One-time increase in capacity  enabled by segregated witness (up to 2x, depending on assumptions about adoption rate and mix of transactions) is no less arbitrary than the original 1MB limit that all sides are railing against. Even BU with the implied of lack of limitations in the name turns out to cap blocksize at 256MB—not to mention that in a world where miners decide block size, it is far from clear that the result will be a relentless competition to increase it over time. Replacing one magic number pulled out of thin air with an equally bogus one that does not derive from any coherent reasoning built on empirical data is not a “scaling solution.” It is just an attempt to kick the can down the road. The same circumstances precipitating the current crisis—congested blocks, high and unpredictable transaction fees, periodic confirmation delays— will crop up again once network usage starts pushing against the next arbitrary limit.

Bitcoin needs a sustainable solution for scaling on-chain without playing a dangerous game of chicken with disruptive forks Neither segregated witness or Bitcoin Unlimited provides a vision for solving that problem. It is one thing to risk disruptive hard-forks once to solve the problem for good. It is irresponsible to engage in such brinkmanship as the standard operating procedure.

CP

Extracting OTP seeds from Authy

An OTP app is an OTP app is an…

A recent post on the Gemini blog outlined changes to two-factor authentication (2FA) on Gemini, providing additional background on the Authy service. Past comments suggest there were common misconceptions about Authy, perhaps none more prominent than the assumption that it is based on SMS. Authy is a service which includes multiple options for 2FA: SMS, voice, mobile app for generating codes and OneTouch. At the same time a common question often asked is: “Can I use Google Authenticator or other favorite 2FA application instead?” Making that scenario work turns out to be a good way to gain insight into how Authy app itself operates under the covers.

Authy has mobile applications for Android, iOS as well as two incarnations for desktops: a Chrome extension and a Chrome application. All of them can generate one-time passcodes (OTP) to serve as second-factor when logging into a website. A natural question is how these codes are generated and whether they are compatible with other popular OTP applications such as Google Authenticator or Duo Mobile.

This is not a foregone conclusion and not all OTP-generation algorithms are identical. For example the earliest design in the market was SecurID by RSA Security. These were small hardware tokens with a seven-segment LCD display for showing numerical codes. RSA not only sold these tokens but also operated the service for verifying codes entered by users. For all intents and purposes, the tokens were a blackbox: the algorithm used to generate the codes was undocumented and users could not reprogram the tokens on their own. (That did not work out all that well when RSA was breached by nation-state attackers in 2011, resulting in the downstream compromise of RSA customers relying on SecurID— including notably Lockheed Martin.)

Carrying around an extra gadget for a single purpose limits usability, all but guaranteeing that solutions such as Secure ID are confined to “enterprise” scenarios— in other words, where employees have no say in the matter because their IT department decided this is how authentication works. Ubiquity of smart-phones made it possible to replace one-off special purpose gadgets with so-called “soft tokens:” mobile apps running on smartphones that can implement similar logic without the baggage of additional hardware.

Google Authenticator was among the first of these intended for mass market consumption, as opposed to the more niche enterprise scenarios. [Full disclosure: This blogger worked on two-factor authentication at Google, including as maintainer of the Android version of Google Authenticator] Early versions were open-sourced, although the version on Play Store diverged significantly after 2010 without releasing updates to source. Still looking at the code and surrounding documentation explains how OTPs are generated. Specifically it is based on two open standards:

  • TOTP: Time-based OTP as standardized in RFC 6238. Codes are generated by applying a keyed-hash function (specifically HMAC) to the current time, suitably quantized into intervals.
  • HOTP: HMAC-based OTP standardized by RFC 4226. As the RFC number suggests, this predates TOTP. Codes are generated by applying a keyed hash function to an incrementing counter. That counter is incremented each time an OTP is generated. (Incidentally the internal authentication system for Google employees— as opposed to end-users/customers— leveraged this mode rather than TOTP.)

The mystery algorithm

So what is Authy using? TOTP is a reasonable guess because Authy is documented to be compatible with Google Authenticator, and can in fact import seeds using the same URL scheme represented in QR-codes. But strictly speaking that only proves that Authy includes a TOTP implementation as subset of its functionality. Recall that Authy also includes a cloud-service responsible for provisioning seeds to phones; these are the “native” accounts managed by Authy, as opposed to stand-alone accounts where the user must scan a QR code. It is entirely conceivable that OTP generation for native Authy accounts follows some other standard. The fact that native Authy accounts generate 7-digit codes lends some support to that theory, since GA can only generate 6-digit codes. (Interestingly the URL scheme for QR codes allows 6 or 8 digits, but as the documentation points out GA ignores that parameter.)

Answering this question requires looking under the hood of the Authy app itself. In principle we can pick any version. This blog post uses the Chrome application as case study, because reviewing Chrome apps does not require any special software beyond the Developer Tools built into Chrome itself. No pulling APKs from the phone, no disassembling Dalvik binaries or firing up IDA Pro necessary.

There is another benefit to going after the Chrome application: since the end-goal is using an alternative OTP application to generate codes compatible with Authy, it is necessary to extract the secret-key used by the application. Depending on how keys are managed, this can be more challenging on a mobile device. For example some models of Android feature hardware-backed key storage provided by TrustZone, where keys are not extractable after provisioning. AN application can ask the hardware for specific operations to be performed with the key, such as signing a message as called for by TOTP. But it can not obtain raw bits of the key to ship them off-device. (While the security level of TrustZone is weak compared to an embedded secure element or TPM— hardened chips with real physical security— it still raises the bar against trivial software attacks.) By contrast browser applications are confined to standard Javascript interfaces, with no access to dedicated cryptographic hardware even if one exists on the machine.

Understanding the Chrome application

First install the Authy application from the Chrome store:

Install_Authy_app.PNG

Next visit chrome://extensions and make sure Developer Mode is enabled:

Chrome_extensions_view.PNG

Now open chrome://apps in another tab and launch the Authy app:

Authy_app_launched.PNG

Back to the extensions tab and click on “main.html” to inspect the source code, switch to “Sources” tab, and expand the JS folder. There is one Javascript file here called “app.js” The code has been minimized and does not look very legible:

Main_HTML_application_JS.PNG

Luckily Chrome has an option to pretty-print the minimized Javascript, prominently advertised at the top. After taking up Chrome on that offer, the code becomes eminently readable  with many recognizable symbols. Searching for the string “totp” finds a function definition:

Formatted_JS_TOTP_function.PNG

This function computes TOTP in terms of HOTP, which may seem puzzling at first— time-based OTP based on counter-mode OTP? The intuition is both schemes share the same pattern: applying a cryptographic function to some internal state represented by a positive number. In the case of HOTP that number is an incrementing counter, increasing precisely by 1 each time an OTP is generated. In the case of TOTP the number is based on current time, effectively fast-forwarded to skip any unused values.

Looking around this function there are other hints, such as default number of digits set to 7 as expected.  More puzzling is the default time-interval set to 10 seconds. TOTP uses a time interval to “quantize” time into discrete blocks. If codes were a function of a very precise timestamp measured down to the millisecond, it would be very difficult for the server to verify it without knowing exactly when it was generated with the same accuracy. It would also require both sides to have perfectly synchronized clocks. (Recall that OTPs are being entered by users, so it is not an option to include additional metadata about time.) To work get around this problem, TOTP implementations round the time down to the nearest multiple of an interval. For example if one were using 60-second intervals, the OTP code would be identical during an entire minute. Incidentally TOTP spec defines time as number of seconds since Unix epoch, so these intervals needs not start on an exact minute boundary.

The apparent discrepancy arises from the fact that Authy app displays a code for 20 seconds, suggesting it is good for that period of time. But the underlying generator is using 10 second intervals, implying that codes change after that. What is going on here? The answer is based on another trick used to make OTP implementations deal with clocks getting out of sync or delays in submitting OTP over a network. Instead of checking strictly against the current time interval, most servers will also check submitted OTP against a few preceding and following intervals. In other words, there is usually more than 1 valid OTP code at any given time, including those corresponding to times a few minutes back and a few minutes into the future. For this reason even a “stale” OTP code generated 20 seconds ago can still be accepted even when the current time (measured in 10-second intervals) has advanced one or two steps.

But we are jumping ahead— there is one more critical step required to verify that we are looking at the right function, that this is the code-path invoked by the Chrome app when generating OTPs. Easiest way to check involves setting a breakpoint in that function, by clicking on the line number:

Breakpoint_Set.PNG

Now we wait for the app to generate a code. Sure enough it freezes with a “paused in debugger” message:

Paused_in_debugger.PNG

Back to Chrome developer tools, where the debugger has paused on a breakpoint. The debugger has helpfully annotated the function parameters and will also display local variables if you hover over them:

Breakpoint_Hit.PNG

Tracing the call a few steps further would show that “e” is the secret-seed encoded in hex, “r” is the sequence number and “o” is the number of digits. (More precisely it used to be the secret-seed; this particular Authy install has since been deleted. Each install receives a different seed, even for the same account.) Comparing “r” against the current epoch time shows that it is roughly one-tenth the value, confirming the hypothesis of 10 second intervals.

Recreating the OTP generation with another app

A TOTP generator is defined by three parameters:

  • Secret-seed
  • Time interval
  • Number of digits

Since we have extracted all three, we have everything necessary to generate compatible codes using another TOTP implementation. Google Authenticator has defined a URL scheme starting with otp:// for encoding these parameters, which is commonly represented as 2-dimensional QR code scanned with a phone camera. So in principle one can create such a URL and import the generator into Google Authenticator or any other soft-token application that groks the same scheme. (Most 2FA applications including Duo Mobile have adopted the URL scheme introduced by GA for compatibility, to serve as drop-in replacements.) One catch is the key must be encoded using the esoteric base32 encoding instead of the more familiar hex or base64 options; this can be done with a line of Python. The final URL will take the form:

otpauth://totp/Authy:alice@foobar.com?secret=<base32-encoded>&digits=8&issuer=Authy&period=10

Why 8 digits? There is a practical problem with Authy using 7 digits output. The specification for the URL scheme states that valid choices are 6 or 8. (Not that it matters, since Google Authenticator does not support any option other than 6.) Luckily we do not need exactly 7 digits. Due to how HOTP/TOTP convert the HMAC output into a sequence of digits, the correct “7-digit code” can be obtained from the 8-digit one by throwing away the first digit. Armed with this trick we can create an otp:// URL containing the secret extracted from Authy and specify 8 digits. Neither Google Authenticator or Duo Mobile were able to import an account using this URL but the FreeOTP app from RedHat succeeds. Here is the final result, showing screenshots captured from both applications around the same time.

Side-by-side OTP apps

On the left: Authy Chrome app; on the right: FreeOTP running in Android emulator

Note the extra leading digit displayed by FreeOTP. Because the generator is configured to output 8 digits, it outputs one more digit that needs to be ignored when using the code.

There is no “vulnerability” here

To revisit the original questions:

  1. What algorithm does Authy use for generating codes? We have verified that it is based on TOTP.
  2. Can a different 2FA application be used to generate codes? Sort of: it is possible to copy the secret seed from an existing Authy installation and redeploy it on a different, stand-alone OTP application such as FreeOTP.  (One could also pursue other avenues to getting the seed, such as leveraging the server API used by the provisioning process.)

One point to be very clear on: there is no new vulnerability here. When using the Authy Chrome application, it is a given that an attacker with full control of the browser can extract the secret seeds. Similar attacks apply on mobile devices: seeds can be scraped from a jailbroken iPhone or rooted Android phone; it is just a matter of reverse-engineering how the application stores those secrets. Even when hardware-backed key storage is used, the seeds are vulnerable at the point of provisioning when they are delivered from the cloud or initially generated for upload to the cloud.

CP

[Updated Apr 27th to correct a typo]

Principle of least-privilege: looking beyond the insider risk

Second-law applied to access rights

“I have excessive privileges for accessing this system. Please reduce my access rights to avoid unintended risk to our users.”

That is a statement you are unlikely to hear from any engineer or operations person tasked with running an online service. Far more likely are demands for additional access, couched in terms of an immediate crisis involving some urgent task that can not be performed without a password, root access on some server or being added to some privileged group. The principle of least privilege states that people and systems should only be granted those permissions absolutely necessary to perform their job. Holding this line in practice is an uphill battle. Similar to the increase of entropy dictated by the second-law of thermodynamics, access rights inexorably seem to expand over time, converging on a steady state where everyone has access to everything.

That isn’t surprising given the dynamics of operating services. Missing rights are quickly noticed when someone going about their work runs into an abrupt “access denied” error. Excessive, unnecessary privileges have no such overt consequences. They are only identified after a painstaking audit of access logs and security policies. A corollary: it is easy to grant more access, it is much harder to take it away. The effects of removing existing privilege are difficult to predict. Will that break an existing process? Was this person ever accessing that system? No wonder that employee departures are about the only time most companies reduce access. Doing it any other time preemptively requires some data crunching: mine audit logs to infer who accessed some resource in the last couple of weeks and compare that against all users with access. But even that may not tell the whole story. For example, often access is granted to plan for a worst-case disaster scenario, when employees who carry out some role in the normal course of affairs are no longer able to perform that role and their backups— who may never have accessed the system until that point— must step in.

Part of the problem is scaling teams: a policy that starts out entirely consistent with least-privilege can become a free-for-all when amplified to a larger team. Small companies often have very little in the way of formal separation of duties between engineering and operations. In a startup of 5 employees, people naturally gravitate to being jack-of-all-trades. If any person may be asked to perform any task, it is not entirely unreasonable for everyone to have root/administrator rights on every server the company manages. But once that rule is applied reflexively for every new hire (“all engineers get root on every box”) as the company scales to 100 people, there is massive risk amplification. It used to be that only five people could cause significant damage, whether by virtue of inadvertent mistakes or deliberate malfeasance. Now there are a hundred people capable of inflicting that level of harm, and not all of them may even have the same level of security awareness as the founding group.

There is also a subtle, cultural problem with attempting to pare down access. It is commonly interpreted as signaling distrust. In the same way that granting access to some system implies that person can be trusted with access to resources on that system—user data, confidential HR records, company financials— retracting that privilege can be seen as a harsh pronouncement on the untrustworthiness of the same person.

This post is an attempt to challenge that perception, primarily by pointing out that internal controls are not only, or even primarily, concerned with insider risk. Following the logic of least-privilege hardens systems against the more common threats from external actors. More subtly it protects employees from becoming a high-value target for persistent, highly-skilled attackers.

Game of numbers

Consider the probability of an employee device being compromised by malware. This could happen through different avenues such as unwittingly installing an application that had been back-doored or visiting a malicious website with a vulnerable browser or plugin. Suppose that there is one in a thousand (0.1%) chance of this happening to any given employee in the course of a month when they are going about their daily routine surfing the web and installing new app/updates/packages from different locations. Going back to our hypothetical examples above, for the garage company with five engineers there is about 6% chance that at least one person will get their machine owned after a year. But the post-seed-round startup with 100 engineers is looking at odds of 70%—in other words, more likely than not.

If every one of those hundred engineers had unfettered access to company systems, all of those resources at risk. As the saying goes: if an attacker is executing arbitrary code on your computer, it is no longer your computer.  The adversary can use compromised hosts as stepping stone to access other resources such as internal websites, databases containing customer information or production system. Note that neither two-factor authentication nor hardware tokens help under this threat model. A patient attacker can simply wait for the legitimate user to scan their fingerprint, connect USB token/smart-card, push buttons, perform an interpretive dance or whatever other bizarre ritual is required for successful authentication, piggy-backing on the resulting session to access remote resources once the user initiates a connection.

That in a nutshell is the main problem with over-extended access policies: every person in the organization becomes a critical point of failure against external attacks. This is purely a matter of probabilities; it is not casting aspersions on whether employees in question are honest, well-meaning or diligent. The never-ending supply of Flash 0-day vulnerabilities affects hard-working employees just as much as it affects the slackers. Similarly malicious ads finding their way into popular websites pose a threat to all visitors running vulnerable software without discriminating against their intentions.

Painting a target on the home team

There is a more subtle reason that over-broad access rights are dangerous not only for the organization overall, but for the individuals carrying them. It may end up painting a target on those employees in the eyes of a persistent attacker. Recall that one of the distinguishing factors for a targeted attack is the extensive investment in mapping out the organization the adversary is attempting to infiltrate. For example, they may perform basic reconnaissance through open-source channels including LinkedIn or Twitter to identify key employees and organizational structure. Assuming that information about internal roles can not be kept secret indefinitely, one concludes that attackers will develop a good idea of which persons they need to go after in order to succeed. Exactly who ends up being under the virtual cross-hairs depends on their objectives.

For run of the mill financial fraud, high-level executives and accounting personnel are obvious targets. For instance the 2014 attack on BitStamp started out with carefully tailored spear-phishing messages to executives, followed by successfully impersonating those executives to request transfer of Bitcoin. FBI has issued a general warning about these scams last year warning that “… schemers go to great lengths to spoof company e-mail or use social engineering to assume the identity of the CEO, a company attorney, or trusted vendor. They research employees who manage money and use language specific to the company they are targeting, then they request a wire fraud transfer using dollar amounts that lend legitimacy.”

While the FBI attributes a disputed figure of $2.3B in losses to such scams, perhaps the more remarkable part of this trend is: these are relatively simple and broadly accessible attacks. There are no fancy zero-days or creative new exploits techniques involved requiring nation-state level of expertise. Yet even crooks carrying out these amateurish smash-and-grab operations were capable of mapping out the organization structure and homing in on the right individuals.

More dangerous than the get-rich-quick attackers are those pursuing long-term, stealth persistence for intelligence gathering. This class of adversary is less interested in hawking credit-card numbers than retaining long-term access to company systems for ongoing information gathering. For example they could be interested in customer data, stealing intellectual property or simply using the current target as stepping stone for the next attack. That level of access requires gaining deep access to company systems, not just reading the email inbox an executive or two. Emails alone are not enough when the objective is nothing less than full control of IT infrastructure.

This is where highly-privileged employees come into the picture. It is a conservative assumption that resourceful attackers will identify the key personnel to go after. (That need not happen all at once;  gaining initial access to an unprivileged account often allows looking around to identify other accounts.)  Those with root access on production systems where customer data is stored are particularly high-value targets. They provide the type of low-level access to infrastructure which permit deploying additional offensive capabilities such as planting malware which is not possible with access to email alone. Equally high-profile as targets are internal IT personnel with the means to compromise other employee machines. for example by using a centralized-management solution to deploy arbitrary code.

It is difficult to avoid the existence of highly privileged accounts. When something goes wrong on a server in the data-center, the assumption is that someone can remotely connect and get root access in order to diagnose/fix the problem. Similarly for internal IT: when an employee forgets their password or experiences problems with their machine, there is someone they can turn to for help and that person will be able to reset their password or log into their machine to troubleshoot. Those users with the proverbial keys to the kingdom are understood be in the cross-hairs. They will be expected to exercise greater degree of caution and maintain higher standards of operational security than the average employee. Their activity will be monitored carefully and any signs of unusual activity investigated promptly.

The challenge for an organization is keeping the size of that group “manageable” in relation to the that of the company itself. If everyone has root everywhere or everyone can effectively compromise every other person through shared infrastructure (sometimes such dependencies can be subtle) the entire company becomes one high value target. Any individual failure has far-reaching consequences. Every successful attack against one employee becomes a major incident potentially impacting all assets.

Setting the right example

Information security professionals can lead by example in pushing back against gratuitous access policies. It is very common for security teams to be among the worst-offenders in not following least privilege. When this blogger was in charge of security at Airbnb, everyone wanted to gift security team more access to everything: administrator role at AWS, SSH to production servers running the site and many other creative ways to access customer data. Mere appearance of the word “security” in job titles seems to confer a sense of infallibility: not only are these fellows deserving of ultimate trust with the most valuable company information, but they must also have perfect operational security practices and zero chance of falling victim to attacks that may befall the standard engineer. These assumptions are dangerous. By virtue of getting access, we all introduce risk to the system regardless of how good our opsec game may be. Comparing the incremental risk to benefits generated on a case-by-case basis is a better approach to crafting access policies. If someone is constantly exercising their new privileges to help their overloaded colleagues solve problems, the case is easy to make. If on the other hand the system is used infrequently or never—suggesting access rights were given out of a reflexive process born out of hypothetical future scenarios—the benefits are unclear. Meanwhile risks will increase, because the person is less likely to be familiar with sound security practices for correctly using that system. Making these judgment calls is part of sound risk management.

CP