An exchange is a mixer, or why few people need Tornado Cash

The OFAC sanctions against the Ethereum mixer Tornado Cash have been widely panned by the cryptocurrency community as an attack on financial privacy. This line of argument claims that Tornado has legitimate uses (never mind that its actual usage appears to be largely laundering the proceeds of criminal activity) for consumers looking to hide their on-chain transactions from prying eyes. The problem with this argument is that the alleged target audience already has access to mixers that work just as well as Tornado Cash for most scenarios and happen to be a lot easier to use. Every major cryptocurrency exchange naturally functions as a mixer— and for the vast majority of consumers, that is a far more logical way to improve their privacy on-chain compared to interacting with a smart-contract.

Lifecycle of a cryptocurrency trade

To better illustrate why a garden-variety exchange functions—inadvertently—as a mixer, let’s look at the lifecycle of a typical trade. Suppose Alice wants to sell 1 bitcoin under her own self-custody wallet for dollars and conversely Bob wants to buy 1 bitcoin for USD. Looking at the on-chain events corresponding to this trade:

  1. Alice sends her 1 bitcoin into the exchange. This is one of the unusual aspects of trading cryptocurrency: there are no prime brokers involved and all trades must be prefunded by delivering the asset to the exchange ahead of time. This is an on-chain transaction, with the bitcoin moving from Alice’s wallet to a new address controlled by the exchange.
  2. Similarly Bob must deliver his funds in fiat, via ACH or wire transfers.
  3. Alice and Bob place orders on the exchange order book. The matching engine pairs those trades and executes the order. This takes place entirely off-chain, only updating the internal balances assigned to each customer.
  4. Bob withdraws the proceeds of the trade. This is an on-chain transaction with 1 bitcoin moving from an exchange-controlled address to one designated by Bob.
  5. Similarly Alice can withdraw her proceeds by requesting an ACH or wire transfer to her own bank account.

Omnibus wallet management

One important question is the relationship between the exchange addresses involved in steps #1 and #4. Alice must send her bitcoin to some address owned by the exchange. In theory an exchange could use the same address to receive funds from all customers. But this would make it very difficult to attribute incoming funds. Recall that an exchange may be receiving deposits from hundreds of customers originating from any number of bitcoin addresses at any given moment. Each of those transactions must be attributed to the correct customer account. A standard bitcoin transaction does not have a “memo” field where Alice could indicate that a particular deposit was intended for her account. (Strictly speaking, it is possible to inject extra data into signature scripts. However that advanced capability is not widely supported by most wallet applications and in any case would require everyone to agree on conventions for conveying sender information, not just for Bitcoin but for every other blockchain.)

This is where the concept of dedicated deposit addresses comes into play. Typically exchanges assign one or more unique addresses to each customer for deposits. Having distinct deposit addresses provides a clean solution to the attribution problem: any incoming funds to one of Alice’s deposit addresses will always be attributed to her and result in crediting her balance on the internal exchange ledger. This holds true regardless of where the deposit originated from. For example, she could share her deposit address with a friend and the friend could send bitcoin payments directly to Alice’s address. Alice does not even have to alert the exchange that she is expecting a payment: any blockchain transfer to that address is automatically credited to Alice.
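The attribution rule can be sketched in a few lines of JavaScript. This is a minimal illustration under our own assumptions, not the code of any real exchange: the function names, the in-memory maps and the placeholder address strings are all hypothetical.

```javascript
// Hypothetical in-memory model of deposit attribution (illustration only).
const depositAddressOwner = new Map(); // deposit address -> customer id
const balances = new Map();            // customer id -> balance in satoshis

// Record that a deposit address has been assigned to a customer.
function assignDepositAddress(customerId, address) {
  depositAddressOwner.set(address, customerId);
}

// Credit an observed on-chain transfer to whoever owns the receiving address.
// The sending address plays no role in attribution.
function creditDeposit(toAddress, amountSats) {
  const customerId = depositAddressOwner.get(toAddress);
  if (customerId === undefined) return null; // not one of our deposit addresses
  balances.set(customerId, (balances.get(customerId) || 0) + amountSats);
  return customerId;
}

assignDepositAddress('alice', 'addr-alice-1');
creditDeposit('addr-alice-1', 100000000); // Alice's own 1 bitcoin deposit
creditDeposit('addr-alice-1', 5000000);   // a friend's payment to the same address
console.log(balances.get('alice'));
```

Both transfers end up on Alice's balance even though they came from different senders, which is exactly why no memo field is needed.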

(Aside: Similar attribution problems arise for fiat deposits. ACH attribution is relatively straightforward since it is initiated by the customer through the exchange UI; in other words, it is a “pull” approach. But wire transfers pose a problem since there is no such thing as per-customer bank accounts. All wires are delivered to a single bank account associated with the exchange. Commonly this is solved by having customers provide wire IDs to match incoming wires to the sender.)

Incoming and outgoing

Where things get interesting is when Bob is withdrawing his newly purchased 1 bitcoin balance. While it is tempting to assume that 1 bitcoin must come from Alice’s original deposit address where she sent her funds, this is not necessary. Most exchanges implement a commingled “omnibus” wallet where funds are not segregated per customer on-chain. When Alice executes a trade to sell her bitcoin to Bob, that transaction takes place entirely off-chain. The exchange makes an update to its own internal ledger, crediting and debiting entries in a database recording how much of each asset every customer owns. That trade is not reflected on-chain. Funds are not moved from an “Alice address” into a “Bob address” each time trades execute.

This is motivated by efficiency concerns: blockchains have limited bandwidth and moving funds on-chain costs money in the form of miner fees. Settling every trade on-chain by redistributing funds between addresses would be prohibitively expensive. Instead, the exchange maintains a single logical wallet that holds funds for all its customers. The allocation of funds among all these customers is not visible on chain; it is tracked on an internal database.
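To make the off-chain settlement concrete, here is a small sketch of an internal ledger update for the Alice and Bob trade. The ledger layout, the `settleTrade` name and the price used are assumptions chosen for illustration; amounts are kept as integers (satoshis and cents) to avoid floating-point rounding.

```javascript
// Hypothetical internal ledger: customer id -> balances in integer units.
const ledger = new Map([
  ['alice', { btcSats: 100000000, usdCents: 0 }], // Alice prefunded 1 bitcoin
  ['bob', { btcSats: 0, usdCents: 3000000 }],     // Bob prefunded dollars (amount is arbitrary)
]);

// Settle a matched trade purely by debiting and crediting database entries.
// No blockchain transaction is created here.
function settleTrade(seller, buyer, btcSats, usdCents) {
  const s = ledger.get(seller);
  const b = ledger.get(buyer);
  if (s.btcSats < btcSats || b.usdCents < usdCents) {
    throw new Error('insufficient prefunded balance');
  }
  s.btcSats -= btcSats;
  s.usdCents += usdCents;
  b.btcSats += btcSats;
  b.usdCents -= usdCents;
}

// Alice sells 1 bitcoin to Bob at an arbitrary illustrative price.
settleTrade('alice', 'bob', 100000000, 2000000);
console.log(ledger.get('alice'), ledger.get('bob'));
```

The on-chain picture is unchanged by this call: the coins still sit wherever the omnibus wallet happens to hold them.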

A corollary of this is that when a customer requests to withdraw their cryptocurrency, that withdrawal can originate from any address in the omnibus wallet. Exchange addresses are completely fungible. In the example above, while Bob “bought” his bitcoin from Alice (in the sense that his buy order executed against a corresponding sell order from Alice), there is no guarantee that his withdrawal of proceeds will originate from Alice’s address. Depending on the blockchain involved, different strategies can be used to satisfy withdrawal requests in an economical manner. In the case of bitcoin, complex strategies are required to manage “unspent transaction outputs” or UTXOs in an efficient manner. Among other reasons:

  • It is more efficient to supply a single 10BTC input to serve a 9BTC withdrawal, instead of assembling nine different inputs of one bitcoin each. (More inputs → larger transaction → higher fees)
  • Due to long confirmation times on bitcoin, exchanges will typically batch withdrawals. That is, if 9 customers are each requesting 1 bitcoin, it is more economical to broadcast a single transaction with a 10BTC input and 9 outputs each going to one customer, as opposed to nine distinct transactions with one input/output.
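A rough size estimate shows why both strategies save on fees. The sketch below assumes commonly quoted approximate byte counts for legacy (P2PKH) transactions; real sizes vary with address and script type, and the change output added to two of the cases is our own assumption. Since fees scale with transaction size, comparing bytes is enough.

```javascript
// Rough size model for a legacy (P2PKH) bitcoin transaction, in bytes.
// These constants are approximations assumed for illustration only.
const OVERHEAD_BYTES = 10;
const BYTES_PER_INPUT = 148;
const BYTES_PER_OUTPUT = 34;

function estimateTxBytes(numInputs, numOutputs) {
  return OVERHEAD_BYTES + BYTES_PER_INPUT * numInputs + BYTES_PER_OUTPUT * numOutputs;
}

// Serving one 9 BTC withdrawal:
const oneLargeInput = estimateTxBytes(1, 2);   // one 10 BTC input -> destination + change
const nineSmallInputs = estimateTxBytes(9, 1); // nine 1 BTC inputs -> destination

// Serving nine 1 BTC withdrawals:
const batched = estimateTxBytes(1, 10);        // one input -> nine customers + change
const unbatched = 9 * estimateTxBytes(1, 1);   // nine separate one-in/one-out transactions

console.log({ oneLargeInput, nineSmallInputs, batched, unbatched });
```

Under these assumptions the single-input withdrawal is roughly a sixth the size of the nine-input one, and the batched transaction is under a third of the combined size of nine separate ones.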

In short, there is no relationship between the original address where incoming funds arrive and the final address which appears as the sender of record when those funds are withdrawn after a trade.

Coin mixing by accident

This hypothetical example tracked the life cycle of a bitcoin going through a trade between Alice and Bob. But the same points about omnibus wallet management also apply to a single person. Consider this sequence of events:

  1. Alice deposits 1 bitcoin into the exchange
  2. At some future date she withdraws 1 bitcoin

While the first transaction is going into one of her unique deposit addresses, the second one could be coming out of any address in the exchange omnibus wallet. It looks indistinguishable from all other 1 bitcoin withdrawals occurring around the same time. As long as Alice uses a fresh destination address to withdraw, external observers cannot link the deposit and withdrawal actions. In effect the exchange “mixed” her coins by accepting bitcoin that was known to be associated with Alice and spitting out an identical amount of bitcoin that is not linked to the original source on-chain.

In other words, an exchange with an omnibus wallet also functions as a natural mixer.

Centralized vs decentralized mixers

How favorably that mixer compares to Tornado Cash depends on the threat model. The main selling points of Tornado Cash are trustless operation and open participation.

  • Tornado is implemented as a set of immutable smart-contracts on Ethereum. Those contracts are designed to perform one function and exactly one function: mix coins. There is no leeway in the logic. It cannot abscond with funds or even refuse to perform the designated function. There is no reliance on the honest behavior of a particular counterparty. This stands in stark contrast to using a centralized exchange— those venues have full custody over customer funds. There is no guarantee the exchange will return the funds after they have been deposited. It could experience a security breach resulting in theft of assets. Or it could deliberately choose to freeze customer assets in response to a court order. Those possibilities do not exist for a decentralized system such as Tornado.
  • Closely related is that privacy is provided by all other users taking advantage of the mixer around the same time. The more transactions going through Tornado, the better each transaction is shielded among the crowd. Crucially, there is no single trusted party able to deanonymize all users, regardless of how unpopular the usage. By contrast, a centralized exchange has full visibility into fund flows. It can “connect the dots” between incoming and outgoing transactions.
  • There are no restrictions on who can interact with the Tornado smart contracts. Meanwhile centralized exchanges typically have an onboarding flow and may impose restrictions on sign-ups, such as only permitting customers from specific countries or requiring proof of identity to comply with Know-Your-Customer regulations.

Reconciling the threat model

Whether these theoretical advantages translate into a real difference for a given customer depends on the specific threat model. Here is a concrete example from CoinGecko arguing for legitimate uses of Tornado:

“For instance, a software employee paid in cryptocurrency and is unwilling to let their employer know much about their financial transactions can use Tornado Cash for payment. Also, an NFT artist who has recently made a killing and is not ready to draw online attention can use Tornado Cash to improve their on-chain privacy.”

CoinGecko article

The problem with these hypothetical examples is they assume all financial transactions occur in the hermetically sealed ecosystem of cryptocurrency. In reality, very few commercial transactions can be conducted in cryptocurrency today, and those are primarily in Bitcoin using the Lightning Network, where Tornado is of exactly zero value since it operates on the unrelated Ethereum blockchain. The privacy-conscious software developer still needs an off-ramp from Ethereum to a fiat currency such as US dollars. That means an existing relationship with an exchange that allows exchanging digital assets for old-fashioned fiat. (While it is possible to trade ether for stablecoins such as Tether or USDC using permissionless decentralized exchanges, that still does not help. The landlord and the utility company expect to get paid in real fiat, not fiat equivalents.)

Looked at another way, the vast majority of cryptocurrency holders already have an existing relationship with an exchange because that is where they purchase and custody their cryptocurrency in the first place. For these investors, using one of those exchanges as a mixer to improve privacy is the path of least resistance. While there have been notable failures of exchanges resulting in loss of customer funds (FTX being a prominent example), it is worth noting that the counterparty exposure is much more limited for this usage pattern. Funds are routed through an exchange wallet temporarily, not custodied long term. There is a limited time-window when the exchange holds the funds, until they are withdrawn in one or more transactions to new blockchain addresses that are disconnected from the original source. If anything, a major centralized exchange will afford more privacy from external observers due to its large customer base and ease of use, compared to the difficulty of interacting with Tornado contracts through web3 layers such as Metamask. While the customer has no privacy against the exchange, this is not the threat model under consideration: recall the above excerpt refers to a software developer trying to shield their transactions from their employer who pays their salary in cryptocurrency. That employer does not have any more visibility into what goes on inside the exchange than they have into, say, personal ATM or credit-card transactions for their employees. (In an extra-paranoid threat model where we are concerned about, say, Coinbase ratting on its customers, one is always free to choose a different, more trustworthy exchange or better yet mix coins through a cascade of multiple exchanges, requiring collusion among all of them to link inputs and outputs.)

That leaves Tornado Cash as a preferred choice only for a niche group of users: those who are unable to onboard with any reputable exchange (because they are truly toxic customers, e.g. OFAC-sanctioned entities) or those operating under the combination of a truly tin-foil-hat threat model (“no centralized exchange can be trusted, they will all embezzle funds and disclose customer transactions willy-nilly…”) and an abiding belief that all necessary economic transactions can be conducted on a blockchain without ever requiring an off-ramp to fiat currencies.

CP

Immutable NFTs with plain HTTP

Ethereal content

One of the recurring problems with NFT digital art has been the volatility of storage. While the NFT recording ownership of the artwork lives on a blockchain such as Ethereum, the content itself (the actual image or video) is usually too large to keep on chain. Instead there is a URL reference in the NFT pointing to the content. In the early days those were garden-variety web links. That made all kinds of shenanigans possible, some intended, others not:

  • Since websites can go away for good (because the domain is not renewed) the NFT could disappear for good.
  • Alternatively the website could still be around but its contents can change. There is no rule that says some link such as https://example.com/MyNFT will always return the same content. The buyer of an NFT could find that the artwork they purchased has morphed. It could even be different based on time of day or the person accessing the link. (This last example was demonstrated in a recent stunt arguing that Web3 is not decentralized at all, by returning a deliberately different image when the NFT is accessed through OpenSea.)

IPFS, Arweave and similar systems have been proposed as a solution to this problem. Instead of uploading NFTs to a website which may go out of business or start returning bogus content, they are stored on special distributed systems. In this blog post we will describe a proof-of-concept for approximating the same effect using vanilla HTTPS links.

Before diving into the implementation details, we need to distinguish between two different requirements behind the ambiguous goal of “persistence:”


1. Immutability

2. Censorship resistance

The first one states that the content does not change over time. If the image looked a certain way when you purchased the NFT, it will always look that way when you return to view it again. (Unless of course the NFT itself incorporates elements of randomness, such as an image rendered slightly different each time. But even in that scenario, the algorithmic model for generating the image itself is constant.)

The second property states that the content is always accessible. If you were able to view the NFT once, you can do so again in the future. It will not disappear or become unavailable due to a system outage.

This distinction is important because each can be achieved independently of the other. Immutability alone may be sufficient for some use cases. In fact there is an argument to be made that #2 is not a desirable requirement in the absolute sense. Most would agree that beheading videos, CSAM or even copyrighted content should be taken down even if they were minted as an NFT.

To that end we focus on the first objective only: create an NFT that is immutable. There is no assurance that the NFT will be accessible at all times, or that it cannot be permanently taken down if enough people agree. But we can guarantee that if you can view the NFT, it will always be this particular image or that particular movie.

Subresource Integrity

At first it looks like there is already a web standard that solves this problem out of the box: subresource integrity or SRI for short. With SRI one can link to content such as a Javascript library or a stylesheet hosted by an untrusted third-party. If that third-party attempts to tamper with the appearance and functionality of your website by serving an altered version of the content (for example a back-doored version of the Javascript library that logs keystrokes and steals passwords) it will be detected and blocked from loading. Note that SRI does not guarantee availability: that website may still have an outage or it may outright refuse to serve any content. Both of those events will still interfere with the functioning of the page; but at least the originating site can detect this condition and display an error. From a security perspective that is a major improvement over continuing to execute logic that has been corrupted (undetected) by a third-party.

Limitations & caveats

While the solution sketched here is based on SRI, there are two problems that preclude a straightforward application:

  • SRI only works inside HTML documents.
  • SRI only applies to link and script elements. Strictly speaking this is not a limitation of the specification, but the practical reality of the extent most web-browsers have implemented the spec.

To make the first limitation more concrete, this is how a website would include a snippet of JS hosted by a third-party:

<script src="https://example.com/third-party-library.js"
        integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
        crossorigin="anonymous"></script>

That second attribute is SRI at work. By specifying the expected SHA256 hash of the Javascript code to be included in this page, we are preventing the third-party from serving any other code. Even the slightest alteration to the script returned will be flagged as an error and prevent the code from executing.
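For readers wondering where such a value comes from, the following Node.js sketch computes an integrity string in the "sha256-" plus base64 form used above. The helper name `sriSha256` is our own; the only library call is Node's built-in `crypto.createHash`.

```javascript
const crypto = require('crypto');

// Compute an SRI-style integrity value: the algorithm name, a dash, and the
// base64 encoding of the raw SHA256 digest of the content.
function sriSha256(content) {
  const digest = crypto.createHash('sha256').update(content).digest('base64');
  return 'sha256-' + digest;
}

// Any change to the input, however small, yields a different value.
console.log(sriSha256('alert("hello");'));
console.log(sriSha256('alert("hello"); '));
```

In practice the input would be the exact bytes of the hosted script file, for example the result of `fs.readFileSync` on a local copy.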

It is tempting to conclude that this one trick is sufficient to create an immutable NFT (according to the modest definition above) but there are two problems.

1. There is no “short-hand” version of SRI that encodes this integrity check in the URL itself. In an ideal world one could craft a third-party link along the lines of:

https://example.com/code.js#/script[@integrity='sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU=']

This (entirely hypothetical) version is borrowing syntax from XPath, combining URIs with an XML-style query language to “search” for an element that meets a particular criterion, in this case having a given SHA256 hash. But as of this writing, there is no web standard for incorporating integrity checks into the URI this way. (The closest is an RFC for hash-links.) For now we have to content ourselves with specifying the integrity as an out-of-band HTML attribute of the element.

2. As a matter of browser implementations, SRI is only applied to specific types of content; notably, JavaScript and stylesheets. This is consistent across Chrome, Firefox and Edge. Neither images nor iframes are covered. That means even if we could somehow solve the first problem, we cannot link to an “immutable” image by using an ordinary HTML image tag.

Emulating SRI for images

Working around both of these limitations requires a more complicated solution, where the document is built up in stages. While it is not possible to make a plain HTTPS URL immutable due to limitation #1 in SRI, there is one scheme that supports immutability by default. In fact all URLs of this type are always immutable. This is the “data” scheme where the content is inlined; it is in the URL itself. Since no content is retrieved from an external server, this is immutable by definition. Data URLs can encode an HTML document, which serves as our starting point or stage #1. The URL associated with the NFT on-chain will have this form.

In theory we could encode an entire HTML document, complete with embedded images, this way. But that runs into a more mundane problem: blockchain space is expensive and the NFT URL lives on chain. That calls for minimizing the amount of data stored within the smart-contract, using only the minimal amount of HTML to bootstrap the intended content. In our case, the specific HTML document will follow a simple template:

<!DOCTYPE html>
<html>
<head>
<script src="https://example.com/stage2.js"
        integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
        crossorigin="anonymous">
</script>
</head>
</html>

This is just a way of invoking stage #2, which is a chunk of bootstrap JavaScript hosted on an external service and made immutable using SRI. If that hosting service decides to go rogue and start returning different content, the load will fail, leaving the user staring at a blank page. But the hosting service cannot successfully cause altered JavaScript to execute, because of the integrity check enforced by SRI.
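Turning such a stage #1 document into a data URL is mechanical. The Node.js sketch below shows one way to do it, assuming base64 encoding and a UTF-8 charset; the helper names `stage1Html` and `toDataUrl` are ours, and the script URL and hash are simply the placeholders from the template above.

```javascript
// Build the minimal stage #1 HTML document from the template.
// scriptUrl and integrity are supplied by the caller (placeholders here).
function stage1Html(scriptUrl, integrity) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head>',
    `<script src="${scriptUrl}" integrity="${integrity}" crossorigin="anonymous"></script>`,
    '</head>',
    '</html>',
  ].join('\n');
}

// Inline an HTML document into a base64 data URL.
function toDataUrl(html) {
  const b64 = Buffer.from(html, 'utf8').toString('base64');
  return 'data:text/html;charset=utf-8;base64,' + b64;
}

const html = stage1Html(
  'https://example.com/stage2.js',
  'sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU='
);
console.log(toDataUrl(html));
```

Because the integrity hash is part of the encoded document, changing the expected stage #2 script necessarily changes the URL itself.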

Stage #2 itself is also simple. It is a way of invoking stage #3, where the actual content rendering occurs.

var contents = '… contents of stage #3 HTML document …';
document.write(contents);

This replaces the current document with new HTML from the string. The heavy lifting takes place after the third stage has loaded:

  • It will fetch additional javascript libraries, using SRI to guarantee that they cannot be tampered with.
  • In particular, we pull in an existing open-source library from 2017 to emulate SRI for images, since the NFT is an image. This polyfill library supports an alternative syntax for loading images, with the URL and expected SHA256 hash specified as proprietary HTML attributes.
  • Stage #3 also contains a reference to the actual NFT image. But this image is not loaded using the standard <img src=”…”> syntax in HTML; that would not be covered by SRI due to the problem of browser support discussed above.
  • Instead, we wait until the document has rendered and kick off a custom script that invokes the JS library to do a controlled image load, comparing the content retrieved by XMLHttpRequest against the integrity check to make sure the server returned our expected NFT.
  • If the server returned the correct image, it will be rendered. Otherwise a brusque modal dialog appears to inform the viewer that something is wrong.
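The core of that controlled load is a hash comparison over the retrieved bytes. The sketch below shows only that comparison step in Node.js, not the polyfill library itself or its actual API: the function name `matchesIntegrity` is hypothetical, and in the browser the bytes would come from the XMLHttpRequest response rather than a local buffer.

```javascript
const crypto = require('crypto');

// Check downloaded bytes against an SRI-style value of the form
// "sha256-<base64 digest>". Returns true only on an exact match.
function matchesIntegrity(bytes, integrity) {
  const dash = integrity.indexOf('-');
  const algorithm = integrity.slice(0, dash);
  const expected = integrity.slice(dash + 1);
  if (algorithm !== 'sha256') {
    throw new Error('only sha256 is handled in this sketch');
  }
  const actual = crypto.createHash('sha256').update(bytes).digest('base64');
  return actual === expected;
}

// Stand-in bytes for the NFT image and for a substituted image.
const original = Buffer.from('bytes of the expected image');
const tampered = Buffer.from('bytes of some other image');
const expectedIntegrity =
  'sha256-' + crypto.createHash('sha256').update(original).digest('base64');

console.log(matchesIntegrity(original, expectedIntegrity)); // true
console.log(matchesIntegrity(tampered, expectedIntegrity)); // false
```

A viewer would render the image only when the check returns true and show the error dialog otherwise.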

Putting it all together, here is a data URL encoding an immutable NFT:

data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw+CjxodG1sPgogIDxoZWFkPgogICAgPHNjcmlwdCBzcmM9Imh0dHBzOi8vd3d3LmlkZWVzZml4ZXMuaW8vaW1tdXRhYmxlL3N0YWdlMi5qcyIKCSAgICBpbnRlZ3JpdHk9InNoYTI1Ni1YSlF3UkFvZWtUa083eE85Y3ozaExrZFBDSzRxckJINDF5dlNSaXg4MmhVPSIKCSAgICBjcm9zc29yaWdpbj0iYW5vbnltb3VzIj4KICAgIDwvc2NyaXB0PgogIDwvaGVhZD4KPC9odG1sPgo=

We can also embed it on other webpages (such as NFT marketplaces and galleries) using an iframe, as in this example:

Embedded NFT viewer

Chrome does not allow navigating the top-level document to a data URL, requiring indirection through the iframe. In this case the viewer itself must be trusted, since it can cheat by pointing the iframe at a bogus URL instead of the correct scheme printed above. But such corruptions are only “local” since other honest viewers will continue to enforce the integrity check.

What happens if the server hosting the image were to replace our hypothetical motorcycle NFT by a different picture?

Linking to the image with a plain HTTPS URL will display the corrupted NFT:

But going through the immutable URL above will detect the tampering attempt and not render the image:

CP

Notes on the Grayscale ETF rejection

Deja vu for bitcoin spot ETFs

[Full disclosure: This blogger worked for a cryptocurrency exchange associated with a past Bitcoin ETF filing]

Few observers were surprised when the SEC rejected yet another bitcoin spot ETF filing in July, adding to a long line of failed attempts going back to the 2017 Winklevoss ETF decision. (Not to be confused with bitcoin futures ETFs, which have already been approved for trading.) Even the sponsor did not seem particularly optimistic about its odds of victory: Grayscale preemptively retained high-profile legal counsel in the week leading up to the decision, gearing for a protracted court battle.

One silver lining is that the SEC itself emphasizes the procedural nature of the decision, as opposed to being a judgment on whether bitcoin is a suitable investment:

“The Commission emphasizes that its disapproval of this proposed rule change, as modified by Amendment No. 1, does not rest on an evaluation of the relative investment quality of a product holding spot bitcoin versus a product holding CME bitcoin futures, or an assessment of whether bitcoin, or blockchain technology more generally, has utility or value as an innovation or an investment.”

In other words: this is not a reflection on the suitability of bitcoin as an asset class, or even the relative advantages of holding bitcoin directly compared to holding it indirectly via spot or futures ETFs. The SEC did not jump on the not-your-keys-not-your-bitcoin bandwagon and endorse self-custody. The ruling is strictly concerned with structural issues at play for this one particular proposed ETF. Comforting words for the bitcoin faithful but hardly the resounding endorsement the sponsor was looking for. Grayscale Trust has been under increasing pressure to convert from its current structure into an ETF. The recent reversal of its tracking error only added urgency to this filing, raising the stakes for the SEC decision: while GBTC NAV used to float at a comfortable premium above the underlying price, it is now trading significantly below spot prices.

The song remains the same

In an 86-page decision heavy on references, the Commission lays out its rationale for the rejection. Looking at this document more closely reveals an interesting mix of “recycled content” from past rejections of similar ETFs as well as some unique counter-arguments to claims advanced in this particular application. In case this distinguished heritage is not clear, there is footnote #11. Taking up most of the third page and spilling over into the next page, it presents a laundry list of past bitcoin ETF rejections: Winklevoss Bitcoin Trust, USBT, Wisdom Tree, Valkyrie Bitcoin Fund, Krypton, SkyBridge ETF, NYDIF Bitcoin ETF, GlobalX, ARK21, One River Carbon Neutral Bitcoin Trust, SolidX, Granite Shares, VanEck, the list goes on. The Grayscale decision hinges on a similar rationale: the applicant has not met its burden under the Exchange Act to demonstrate that the proposal is consistent with the requirements of section 6(b)(5), specifically that the venues where the underlying product (bitcoin) is traded are “designed to prevent fraudulent and manipulative acts and practices” and “to protect investors and the public interest.”

While the ruling cites prior events going as far back as 2017, it is also surprisingly current. One note cites recent work from Trail Of Bits on centralization of public blockchains. Another note cites a letter written in support of the Grayscale application dated the 21st, eight days before the timestamp of this document. The SEC spends the first half of the document disputing two claims by NYSE Arca, the sponsor behind the Grayscale ETF. (Here we will use Grayscale as shorthand to refer collectively to the trust and its sponsor, even though NYSE Arca is a distinct entity.)

  1. That NYSE Arca has entered into a comprehensive surveillance sharing agreement with a regulated market of significant size
  2. That other, alternative countermeasures against market manipulation are in place due to the unique properties of cryptocurrency

Own goals

Several commenters writing to the SEC in support of the Grayscale application argued that bitcoin markets were somehow inherently resistant to manipulation either due to their scale or some unusual transparency property of blockchains. (A curious argument, considering that trading activity itself is not reflected on chain and takes place within the internal, closed ledgers of centralized exchanges.) In a deft move, the Commission uses words straight out of NYSE Arca itself to refute those arguments. On the subject of whether bitcoin markets can be manipulated:

“NYSE Arca acknowledges in its proposal that “fraud and manipulation may exist and that [b]itcoin trading on any given exchange may be no more uniquely resistant to fraud and manipulation than other commodity markets.” NYSE Arca also states that “[b]itcoin is not itself inherently resistant to fraud and manipulation” and concedes that “the global exchange market for the trading of [b]itcoins” […] also “is not inherently resistant to fraud and manipulation.”

That is not exactly helping the case. To be fair, even without this own-goal the SEC had plenty of ammunition to cast doubt on the premise that spot price is immune to manipulation. Among others:

  • Tether, the gift that keeps on giving
  • Continued allegations of wash-trading on offshore, unregulated exchanges
  • Possibility of 51% attack or “hacking of the Bitcoin network”
    This constant refrain about a hypothetical flood of hash-power colluding to rewrite history seems out of place here. While 51% attacks have always been a theoretical possibility, the security in proof-of-work comes from the difficulty of assembling the required level of resources to carry out such an attack. Blithely asserting that bitcoin is subject to 51% attacks is tantamount to saying a Bond villain with a trillion dollars could corner the market for baseball cards. Similarly the SEC ruling cites a statistic about 100 wallets controlling ~15% of all bitcoin in circulation. As previously discussed here, such statistics from on-chain address distribution cannot be used to estimate the level of inequality in bitcoin ownership. One address does not necessarily equal one person or even one institutional investor, especially among addresses with highest balances that typically represent large omnibus wallets pooling funds.

Speaking of supporting letters, the SEC could not resist the temptation to nitpick some of these. Note #100 points out one letter from the Blockchain Association where the commenters claimed the CFTC has been exercising anti-manipulation and anti-fraud enforcement authority over the bitcoin futures market since 2014, a full three years before the CFTC began overseeing bitcoin futures.

Another example of an own-goal comes from Grayscale assertions concerning the “Index Price” and how the associated methodology for aggregating spot prices from multiple exchanges is resistant to manipulation. The SEC points out conflicting statements from the Registration Statement carrying a litany of caveats and qualifications:

“Moreover, NYSE Arca’s assertions that the Trust’s use of the Index helps make the Shares resistant to manipulation conflict with the Registration Statement. Specifically, the Registration Statement represents, among other things, that the market price of bitcoin may be subject to “[m]anipulative trading activity on bitcoin [trading platforms], which are largely unregulated,” and that, “[d]ue to the unregulated nature and lack of transparency surrounding the operations of bitcoin [trading platforms], they may experience fraud, security failures or operational problems, which may adversely affect the value of [b]itcoin and, consequently, the value of the Shares.”

Voluntary vs mandatory compliance

One interesting point made in the SEC ruling is that any mitigations implemented by “constituent platforms” (the four exchanges used for calculating the ETF price, namely: Coinbase Pro, Bitstamp, Kraken, and LMAX Digital) against market manipulation are entirely at their own discretion. These are not regulated platforms. They have no obligation to continue policing their order-books against suspicious trading activity and reporting bad actors to law enforcement:

“[…] these measures, unlike the Exchange Act’s requirements for national securities exchanges,117 are entirely voluntary and therefore have no binding force. The Constituent Platforms, including the platform operated by an affiliate of the Custodian, could change or cease to administer such measures at any time”

One counterpoint is that exchanges have very compelling incentives to maintain orderly markets since manipulative activity reduces investor confidence in the platform, resulting in loss of customers. On the other hand such profit/loss motivations do not carry the same weight as a binding regulation and may cut both ways. In good times it may well drive further investment in market surveillance to boost investor confidence in a bid to attract more risk-averse investors standing on the sidelines. But the same incentives could mean a troubled exchange facing crypto-winter will cut spending on compliance.

No exchange an island unto itself

While the Grayscale filing sings the praises of the Constituent Platforms and how robust they are against market manipulation, the SEC turns its attention to the rest of the bitcoin spot market. Rightly so— considering that the majority of spot bitcoin trading takes place on unregulated, off-shore exchanges outside these four platforms:

“NYSE Arca focuses its analysis on the attributes of the Constituent Platforms, as well as the Index methodology that calibrates the pricing input generated by the Constituent Platforms […] What the Exchange ignores, however, is that to the extent that trading on spot bitcoin platforms not directly used to calculate the Index Price affects prices on the Constituent Platforms, the activities on those other platforms, where various kinds of fraud and manipulation from a variety of sources may be present and persist, may affect whether the Index is resistant to manipulation. Importantly, the record does not demonstrate that these possible sources of fraud and manipulation in the broader spot bitcoin market do not affect the Constituent Platforms that represent a slice of the spot bitcoin market.”

This is spot on. Even if a particular index is designed to ignore signals from offshore exchanges on the theory that they are easier to manipulate, trading activity on those platforms can still affect “Constituent Platforms” as long as overlap exists in market participants. Suppose a market-maker simultaneously operates on one of the Constituent Platforms and one of the off-shore exchanges (a very common scenario, since such strategies are typically rooted in exploiting price discrepancy across multiple venues). In that case price manipulation taking place on the sketchy offshore platform will also impact prices on the regulated venue because market-making algorithms will continue to exploit the price difference until it is arbitraged away or until they run out of liquidity. The contagion of price manipulation cannot be confined to one venue in an efficient, interconnected market. (Ironically, trying to argue against this point by insisting that bitcoin markets are so disconnected from each other as to violate the Rule Of One Price is self-defeating. It would mean the bitcoin spot market is too inefficient and immature to serve as the basis for any ETF.)
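
The contagion mechanism can be illustrated with a toy model. The sketch below is a deliberate simplification, not a description of any real market-making algorithm: two venues start at the same price, a manipulator pushes up the offshore price, and a hypothetical arbitrageur (the function `arbitrage_until_converged`, with a made-up linear price impact) buys on the cheaper venue and sells on the dearer one until the gap closes.

```python
def arbitrage_until_converged(regulated, offshore, impact=0.25, tolerance=0.001):
    """Toy model: each round the arbitrageur closes part of the price gap.

    Buying on the cheaper venue lifts its price and selling on the
    dearer venue lowers it, both by `impact` times the current gap.
    All parameters are illustrative assumptions.
    """
    rounds = 0
    while abs(offshore - regulated) > tolerance and rounds < 1000:
        gap = offshore - regulated
        regulated += impact * gap   # one leg of the trade moves this venue toward the other
        offshore -= impact * gap    # the opposite leg moves the other venue back
        rounds += 1
    return regulated, offshore, rounds


# Both venues start at 100; manipulation pushes the offshore price to 110.
regulated_price, offshore_price, rounds = arbitrage_until_converged(100.0, 110.0)
print(f"regulated={regulated_price:.2f} offshore={offshore_price:.2f} rounds={rounds}")
```

In this toy setup the regulated venue ends up near the midpoint of the two starting prices, even though no manipulative order was ever placed on it.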

Tracking errors

The ruling also points out a subtle point about pricing: while the Grayscale Trust may use the reference index price to evaluate itself, there is no guarantee this is the same valuation the shares will trade at. (In fact there is another, third price that may drift from these two: the value at which Authorized Participants in the fund buy/sell bitcoin on the open market when they are creating/redeeming baskets. Recall that the reference price is a synthetic creation, not an actual quote from an actual exchange where you can execute trades at that price.) Again the SEC points to the historic track record of GBTC having significant tracking errors, trading as high as 142% (!!) over the underlying value of bitcoin holdings and at other times trading as low as 21% below the same benchmark. The source of these figures? The Registration Statement and 10-K filings from Grayscale. Converting GBTC into an ETF is expected to reduce these over/under tracking errors, but in making that case for approval, Grayscale unwittingly provided the SEC with even more ammunition to question the defense against price-manipulation.

Back to the futures ETF?

If the intrinsic properties of bitcoin or market-surveillance agreements— to the extent they even exist at all Constituent Platforms— are insufficient to deter price manipulation, what else could work? Here the Grayscale filing pinned its hopes on the existence of previously approved bitcoin futures ETFs such as the ProShares Bitcoin Strategy ETF which tracks CME futures. This argument rests on two premises:

  1. CME futures are traded on highly-regulated venues with well established market-surveillance and information sharing agreements, greatly reducing the risk of market manipulation
  2. Any successful manipulation of the proposed ETF must also manipulate the CME futures market in order to affect the price

That would appear to do the trick. No need for additional “surveillance-sharing agreements” or “market of significant size” within the spot market, when any manipulation attempts will be caught by the parallel mechanisms operating in the futures market.

That second argument did not fare well with the SEC. Leaving aside the relative volumes of the spot and futures markets, the main objection concerns lack of evidence around any causal relationship between prices in the two different markets. Here again Grayscale’s own words come back to haunt them: “… there does not appear to be a significant lead/lag relationship” between CME futures and the spot price. This is significant because if spot leads futures, a dishonest participant does not have to trade in the latter in order to influence the former. (On the contrary, their actions in spot markets will eventually also manipulate the futures market as a downstream effect.) Even the comment letters are not helping the cause, with one acknowledging there is “no clear winner” or even noting a bidirectional link where either market can move the other. In an interesting side-discussion the commission also clarifies what the expectation of futures “leading” the spot market means. Contrary to the straw-man version presented in another comment letter, the SEC criterion does not mean that futures always move first and the spot market responds in a perfectly predictable way afterwards. It is the existence of a statistical relationship that is sought, not a deterministic recipe for 100% reliable arbitrage between two markets.
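
One simple way to look for a statistical lead/lag relationship of this kind is a lagged correlation between returns in the two markets. The sketch below uses made-up return series and hypothetical helper names (`pearson`, `lagged_correlation`); it is only an illustration of the idea, not the methodology applied by the SEC, Grayscale or any commenter.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(leader, follower, lag):
    """Correlate leader returns at time t with follower returns at t + lag (lag >= 0)."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


# Made-up data: spot simply repeats the previous period's futures return.
futures = [0.010, -0.020, 0.015, 0.000, -0.010, 0.020, -0.005, 0.010]
spot = [0.0] + futures[:-1]

for lag in range(3):
    print(lag, round(lagged_correlation(futures, spot, lag), 3))
```

With this artificial data the correlation peaks at a lag of one period, which is what "futures lead spot" would look like in the cleanest possible case. Real return series are noisy, so any such peak would be a statistical tendency rather than a guarantee.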

False equivalences, take #2

Grayscale tried another variant of this strategy of arguing by comparison to already-approved futures ETFs. Suppose that instead of relying on futures ETFs as a bulwark against price manipulation, the premise is turned on its head and we ask whether the spot ETF is any easier to manipulate. Here Grayscale argued that if someone could manipulate the spot ETF, they could also target the futures ETF using the exact same mechanism since they rely on very similar price calculations. Ergo, GBTC is no more susceptible to such activity than an investment product previously approved by the SEC. Taking this one step further, disapproving GBTC while having approved the comparable futures ETF would constitute “arbitrary and capricious administrative action.”

On its face this is a compelling argument but the SEC is having none of it. First the order points out that the Bitcoin Reference Rate (BRR) used by CME futures serves a very different purpose:

“While the BRR is used to value the final cash settlement of CME bitcoin futures contracts, it is not generally used for daily cash settlement of such contracts, nor is it claimed to be used for any intra-day trading of such contracts. In addition, CME bitcoin futures ETFs/ETPs do not hold their CME bitcoin futures contracts to final cash settlement; rather, the contracts are rolled prior to their settlement dates. Moreover, the shares of CME bitcoin futures ETFs/ETPs trade in secondary markets, and there is no evidence in the record for this filing that such intra-day, secondary market trading prices are determined by the BRR.”

Also noted in passing: there are two additional spot exchanges (Gemini and ItBit) incorporated into the BRR not present in the GBTC reference price, further casting doubt on the assertion of “almost complete overlap.” Yet these are minor quibbles compared to what the SEC points to as the fundamental flaw in the GBTC filing:

“… the Commission’s consideration (and approval) of proposals to list and trade CME bitcoin futures ETPs, as well as the Commission’s consideration (and thus far, disapproval) of proposals to list and trade spot bitcoin ETPs, does not focus on an assessment of the overall risk of fraud and manipulation in the spot bitcoin or futures markets, or on the extent to which such risks are similar. Rather, the Commission’s focus has been consistently on whether the listing exchange has a comprehensive surveillance-sharing agreement with a regulated market of significant size related to the underlying bitcoin assets of the ETP under consideration, so that it would have the necessary ability to detect and deter manipulative activity.”

Even more telling is note #201: An ETF applicant was never required to demonstrate that cryptocurrency possesses some “unique resistance to manipulation” missing from other assets. Such unique properties could serve as an alternative to the original gold standard that the SEC seeks: “surveillance-sharing agreements with a market of significant size.” It is precisely the lack of such comprehensive market surveillance in the bitcoin spot market that has led GBTC on a wild-goose chase to identify some magic properties in blockchain assets rendering them intrinsically safe against manipulation techniques.

CP

Address ≠ person: the elusive Gini coefficient of cryptocurrencies

Estimating the distribution of digital assets from on-chain data is not straightforward

A false sense of transparency

The Gini coefficient of blockchains has long been a point of contention among defenders and detractors of cryptocurrency alike. Critics like to point to extreme levels of inequality based on the observed distribution of wealth among blockchain addresses. Far from having democratized access to finance or created a path for wealth accumulation for average investors, they point to these statistics as evidence that blockchains have only enabled another instance of capital concentration. Defenders downplay the significance of such inequality and hold that such disparities do not indicate any fundamental problems with the economics of cryptocurrency. Without picking sides in that ideological debate, this post outlines a different issue: the measures of alleged inequality calculated from blockchain observations are riddled with systemic errors.

Given the transparency of blockchains as a public ledger of addresses and associated balances, the Gini coefficient is very easy to compute in theory. Anyone can retrieve the list of addresses, sort them by associated balance and crunch the numbers. This methodology is the basis of an often-cited 2014 statistic comparing bitcoin to North Korea and more recent attention-grabbing headlines stating that bitcoin concentration “puts the US dollar to shame.” While blockchain statistics are very appealing in their universal accessibility, there are fundamental problems with attempting to characterize cryptocurrency distribution this way.
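
The naive calculation really is only a few lines. The following sketch assumes the list of address balances has already been pulled from a node or block explorer (that retrieval step is omitted) and applies the standard sorted-rank formula for the Gini coefficient. The function name `gini` and the sample balances are illustrative.

```python
def gini(balances):
    """Gini coefficient of non-negative balances (0 = equal, near 1 = concentrated)."""
    values = sorted(balances)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero balance")
    # Sorted-rank formula over x_1 <= ... <= x_n:
    #   G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    ranked = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * ranked / (n * total) - (n + 1) / n


# Made-up address balances in BTC, one entry per address.
sample_balances = [0.001, 0.01, 0.5, 2.0, 150.0]
print(round(gini(sample_balances), 3))
```

Note that the input is one number per address, which is exactly the unit-of-analysis assumption questioned in the rest of this post.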

Address ≠ person: omnibus wallets

The first problem is that the transparency afforded in blockchain data only applies at the level of addresses. All of the purported eye-opening measures of inequality (“0.01% of addresses control 27% of funds”) are based on distribution across addresses as the unit of analysis. But an address is not the same thing as a person.

One obvious problem involves omnibus wallets of cryptocurrency service providers, such as centralized exchanges and payment processors. [Full disclosure: This blogger worked at Gemini, a NYC-based exchange and custodian from 2014-2019] For operational reasons, it is more convenient for these companies to pool together funds belonging to different customers into a handful of addresses. These addresses do not correspond to any one person or even the parent corporate entity. The Binance cold-wallet address does not hold the funds of Binance, the exchange itself. Those assets belong to Binance customers, who are temporarily parking their funds at Binance to take advantage of trading opportunities or simply because they do not want to incur the headache of custodying their own funds.

While the companies responsible for these addresses do not voluntarily disclose them, in many cases they have been deanonymized thanks to voluntary sleuthing by users and labelled on blockchain explorers. A quick peek shows that they are indeed responsible for some of the largest concentrations of capital on chain, including four of the top ten accounts by bitcoin balance and similarly five of the top ten for Ethereum as of this writing.

Address ≠ person: smart-contracts

Ethereum in fact adds another twist that accounts for several other high-value accounts: there are smart-contracts holding funds from multiple sources as part of a distributed application or app. For example, the number one address by balance currently is the staking contract for Ethereum 2.0. This contract is designed to hold in escrow the 32 ETH required as a surety bond from each participant interested in participating in the next version of Ethereum validation using proof-of-stake. The second highest balance belongs to another smart-contract, this one for wrapped Ether or wETH which is a holding vehicle for converting the native currency ether into the ERC20 token format used in decentralized finance (“DeFi”) applications. Others in the top 25 correspond to specific DeFi applications such as the Compound lending protocol or the bridge to the Polygon network. None of these addresses are meaningful indicators of ownership by anyone. As such it is surprising that even recent studies on inequality are making meaningless statements such as: “The account with the highest balance in Ethereum contains over 4.16% of all Ethers.” (Depending on when the snapshot was taken, that would be either the Ethereum 2.0 staking contract, now the highest balance with > 7% of all ETH in existence, or the Wrapped Ether contract.) Spurious inclusion of such addresses in the study obviously inflates the Gini coefficient. But even their very existence distorts the picture in a way that cannot be remedied by merely excluding that data point. After all the funds at that address are real and belong to the thousands of individuals who opted into staking or decided to convert their ether into wrapped-ether for participating in DeFi venues. All of these funds would have to be withdrawn and redistributed back to their original wallets to accurately reflect ownership information that is currently hidden behind the contract.

Investors: retail, institutional and imaginary

On the other extreme, a single person can have multiple wallets, distributing their funds across multiple addresses. Interestingly enough this can skew the result in either direction. If a single investor with 1000BTC splits that sum equally among a thousand addresses, counting each one as a unique individual will create the appearance of capital distributed in more egalitarian terms. But it may also go in the other direction. Suppose an investor holding 1 bitcoin splits that balance unevenly across ten addresses: the “primary” wallet gets the lion’s share at 0.90BTC while all others split the remainder. While keeping the total balance constant, this rearrangement has created several phantom “cryptocurrency owners,” each holding a marginal amount of bitcoin consistent with the narrative of a high Gini coefficient.
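
Both effects can be checked numerically. The sketch below uses invented populations built around the two scenarios just described (a whale spreading 1000 BTC evenly, and small holders keeping 0.90 BTC in a primary wallet) and compares the Gini coefficient computed per owner against the one computed per address. The helper `gini` is the usual sorted-rank formula; the surrounding populations are assumptions made up for illustration.

```python
def gini(balances):
    """Gini coefficient via the sorted-rank formula (0 = perfectly equal)."""
    values = sorted(balances)
    n, total = len(values), sum(values)
    ranked = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2 * ranked / (n * total) - (n + 1) / n


# Scenario 1: one whale with 1000 BTC plus nine owners with 1 BTC each.
by_owner_1 = [1000.0] + [1.0] * 9
# The whale spreads the 1000 BTC evenly over a thousand addresses.
by_address_1 = [1.0] * 1000 + [1.0] * 9

# Scenario 2: ten owners with exactly 1 BTC each.
by_owner_2 = [1.0] * 10
# Each owner keeps 0.90 BTC in a primary wallet and spreads 0.10 over nine others.
by_address_2 = ([0.90] + [0.10 / 9] * 9) * 10

for label, owners, addresses in [
    ("whale splits evenly", by_owner_1, by_address_1),
    ("small holders split unevenly", by_owner_2, by_address_2),
]:
    print(f"{label}: by owner {gini(owners):.2f}, by address {gini(addresses):.2f}")
```

In the first scenario the per-address figure collapses from roughly 0.89 to zero, hiding real concentration. In the second it climbs from zero to roughly 0.80, manufacturing inequality that does not exist among the actual owners.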

A different conceptual problem is that even for addresses with a single owner, that owner may be an institutional investor such as a hedge-fund or asset manager. Once again, the naive assumption “one address equals one person” results in overestimating the Gini coefficient when the address represents ownership by hundreds or thousands of persons. (In the extreme case, once sovereign-wealth funds start allocating to cryptocurrency a single blockchain address could literally represent millions of citizens of a country as stakeholders.) It’s as if an economist tried to estimate average savings in the US by looking at the balance of every checking account at a bank, without distinguishing whether the account belongs to a multinational corporation or an ordinary citizen.

Getting the full picture

More subtly, looking at each blockchain in isolation does not paint an accurate picture of total cryptocurrency ownership overall. In traditional finance some amount of positive correlation is expected across different asset types. Investors holding stocks are also likely to have bonds as part of a balanced portfolio. But cryptocurrency has sharp ideological divides that may result in negative correlation where it matters most. If bitcoin maximalists frown upon the proliferation of dubious ICOs for unproven applications while Web3 junkies consider bitcoin the MySpace of cryptocurrency, there would be little overlap in ownership. In this hypothetical universe the correlation is negative: an investor holding BTC is less likely to hold ETH. In that scenario Bitcoin and Ethereum may both have high inequality when measured in isolation while the combined holdings of investors across both chains exhibit a more egalitarian distribution. It is possible to aggregate assets within a chain, by taking into account all tokens issued on that chain. For example a single notional balance in US dollars can be calculated for each ethereum address by taking into account all token balances for that address, maintained in the ERC20 smart-contract responsible for tracking that asset. But this does not work across chains. There is no reason to expect the correlation between different ERC20 holdings (arguably closer in spirit to each other as utility tokens for various definitions of “utility”) to hold between ethereum and bitcoin.
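
A two-investor toy example makes the negative-correlation scenario concrete. The holdings below are invented USD-equivalent figures, and the sketch assumes we somehow know which balances on each chain belong to the same investor, which is precisely the information raw on-chain data does not provide.

```python
def gini(balances):
    """Gini coefficient via the sorted-rank formula (0 = perfectly equal)."""
    values = sorted(balances)
    n, total = len(values), sum(values)
    ranked = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2 * ranked / (n * total) - (n + 1) / n


# Hypothetical USD-equivalent holdings per investor on each chain.
btc = {"alice": 9_000, "bob": 1_000}   # Alice is the bitcoin maximalist
eth = {"alice": 1_000, "bob": 9_000}   # Bob is the Web3 enthusiast

# Total holdings per investor across both chains.
combined = {name: btc[name] + eth[name] for name in btc}

print("BTC only:", round(gini(btc.values()), 2))
print("ETH only:", round(gini(eth.values()), 2))
print("combined:", round(gini(combined.values()), 2))
```

Each chain looks unequal when measured alone, while the combined holdings in this contrived case are perfectly equal.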

Better data: paging cryptocurrency exchanges

Is there a better way to estimate the Gini coefficient than this naive accounting by address? The short answer is yes but it relies on closed data-sets. Centralized cryptocurrency exchanges such as Binance are in a better position to measure inequality using their internal ledgers. While an omnibus account may appear as a handful of high-balance addresses to external observers, the exchange knows exactly how those totals are allocated to each customer. Most exchanges also perform some type of identity validation on customers to comply with KYC/AML regulations, so they can distinguish between individual and institutional investors. This allows excluding institutional investors but at the risk of introducing a different type of distortion. If high net-worth individuals are investing in cryptocurrency through institutional vehicles such as family-offices and hedge funds, focusing on individual investors will bias the Gini coefficient down by removing outliers from the dataset. Finally, exchanges have a comprehensive view into balances of their customers across all assets simultaneously so they can arrive at an accurate total across chains and even fiat equivalents. (If a customer is holding dollars or euros at a cryptocurrency exchange, should that number be included in their total balance? What if they are holding stable-coins?) These advantages can yield a more precise estimate on exactly how unequal cryptocurrency ownership is, modulo some caveats. If customers subscribe to the “not your keys, not your bitcoin” school of custody and withdraw all cryptocurrency to their own self-hosted wallet after every purchase, the exchange will underestimate their holdings. Similarly customers holding assets at multiple exchanges (for example holding bitcoin at both Binance and FTX) will result in both providers underestimating the balance.

Even with these limitations, getting an independent datapoint from a large-scale exchange would go a long way towards sanity-checking the naive estimates put forward based on raw blockchain data alone. It remains to be seen if any exchange will step up to the plate.

CP

Of Twitter bots, Sybil attacks and verified identities

Seeking a middle-ground for online privacy

The exact prevalence of bots has become the linchpin of Elon Musk’s attempt to bail out on the proposed acquisition of Twitter. Existence of bots is not disputed by either side; the only question is what percent of accounts these constitute. Twitter itself puts the figure around 5%, using a particular metric called “monetizable daily active users” or mDAU for calculating the ratio. Mr. Musk disputes that number and claims it is much higher, without citing any evidence despite having obtained access to raw data from Twitter for carrying out his own research.

Any discussion involving bots and fake accounts naturally leads to the question: why is Twitter not verifying all accounts to make sure they are actual humans? After all the company already has a concept of verified accounts sporting a blue badge, to signal that the account really belongs to the person it is claiming to be. This deceptively simple question leads into a tangle of complex trade-offs around exactly what verification can achieve and whether it would make any difference to the problem Twitter is trying to solve.

First we need to clarify what is meant by bot accounts. Suppose there is a magical way to perform identity verification online. While not 100% reliable, cryptocurrency exchanges and other online financial platforms are already relying on such solutions to stay on the right side of Know Your Customer (KYC) regulations. These include a mix of collecting information from the customer— such as the time-honored abuse of social security numbers for authentication— uploading copies of government-issued identity documents and cross-checking all this against information maintained by data brokers. None of this is free but suppose Twitter is willing to fork over a few dollars per customer on the theory that the resulting ecosystem will be much more friendly to advertisers. Will that eliminate bots?

The answer is clearly no, at least not according to the straightforward definition of bots. Among other things, nothing stops a legitimate person from going through ID verification and then transferring control of their account to a bot. There need not be any nefarious intent behind this move. For example, it could be a journalist who sets up the account to tweet links to their articles every time they publish a new one. In fact the definition of “bot” itself is ambiguous. If software is designed to queue up tweets from the author and publish them verbatim at specific future times, is that a bot? What if the software augments or edits human-authored content instead of publishing it as-is? Automation is not the problem per se. Having accounts that are controlled by software— even software that is generating content automatically without human intervention— may be perfectly benign.  The real questions are:

  1. Who is really behind this account?
  2. Why are they using automation to generate content?

Motivation is ultimately unknowable from the outside but the first question can be tracked down to a name, either a person or corporate entity. Until such time as we have sentient AI creating its own social-media accounts, there is going to be someone behind the curtain, accountable for all content spewing from that account. Identity verification can point to that person pulling the levers. (For now we disregard the very real possibility of verified accounts being taken over or even deliberately resold to another actor by the rightful owner.) But that knowledge alone is not particularly useful. What would Twitter do with the information that “nickelbackfan123” is controlled by John Smith of New York, NY? Short of instituting a totalitarian social credit system along the lines of China to gate access to social networks, there is no basis for turning away Mr. Smith or treating him differently than any other customer. Even if ID verification revealed that the customer is a known persona non grata to the US government (a fugitive on the FBI most-wanted list or an OFAC-sanctioned oligarch), Twitter has no positive obligation to participate in some collective punishment process by denying them an online presence. Social media presence is not a badge of civic integrity or proof of upstanding character, a conclusion entirely familiar to anyone who has spent time online.

But there is one scenario where Twitter can and should preemptively block account creation. Suppose this is not the first account but the 17th one Mr. Smith is creating? (Let’s posit that all the other accounts remain active, and this is not a case of starting over. After all in America we all stand for second-acts and personal reinvention.) If one person is simultaneously controlling dozens of accounts, the potential for abuse is high, especially when this link is not clear to followers. Looked at another way: there is arguably no issue with a known employee of the Russian intelligence agency GRU registering for a Twitter account and using their presence to push disinformation. The danger comes not from the lone nut-job yelling at the cloud (that is an inevitable part of American politics) but from that one person falsely amplifying their message using hundreds of seemingly independent sock-puppet accounts. In the context of information security, this is known as a “Sybil attack”: one actor masquerading as thousands of different actors in order to confuse or mislead systems where equal weight is given to every participant. That makes a compelling case for verified identities online: not stopping bad actors from creating an account, but stopping them from creating the second, third or perhaps the one-hundredth sock-puppet account.
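
Once every account is tied to a verified identity, spotting this pattern becomes a simple grouping exercise. The sketch below is purely hypothetical: the function name `flag_sybil_clusters`, the opaque identity IDs and the cap of five accounts are assumptions chosen for illustration and do not reflect any actual Twitter policy or data model.

```python
from collections import defaultdict


def flag_sybil_clusters(accounts, max_accounts_per_identity=5):
    """Group accounts by verified identity and return those above the cap.

    `accounts` maps a public handle to an opaque verified-identity ID.
    The default cap of 5 is an arbitrary placeholder, not a recommendation.
    """
    handles_by_identity = defaultdict(list)
    for handle, identity_id in accounts.items():
        handles_by_identity[identity_id].append(handle)
    return {
        identity_id: sorted(handles)
        for identity_id, handles in handles_by_identity.items()
        if len(handles) > max_accounts_per_identity
    }


# Seventeen sock-puppets behind one identity, two ordinary accounts behind another.
accounts = {f"patriot_{i}": "id-001" for i in range(17)}
accounts.update({"nickelbackfan123": "id-002", "fido_the_dog": "id-002"})

flagged = flag_sybil_clusters(accounts)
print({identity: len(handles) for identity, handles in flagged.items()})
```

The identity with two accounts passes untouched, while the cluster of seventeen is surfaced for review.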

There is no magic “safe” threshold for duplicate accounts; it varies from scenario to scenario. Insisting on a one-person-one-account policy is too restrictive and does not take into account— no pun intended— use of social media by companies, where one person may have to represent multiple brands in addition to maintaining their own personal presence. Even when restricting our attention to individuals, many prefer to maintain a separation between work and personal identities, with separate social media accounts for different facets of their life. Pet lovers often curate separate accounts for their favorite four-legged companions— often eclipsing their own “real” stream in popularity. If we contain multitudes, it is only fair that Twitter allow a multitude of accounts. In other cases, even two is too many. If someone is booted off the platform for violating terms of service, posting hate speech or threatening other participants, they should not be allowed to rejoin under another account. (Harder question: should all personal accounts associated with that person on the platform be shuttered? Does Fido the dog get to keep posting pictures if his companion just got booted for spreading election conspiracies under a different account?)

Beyond real-names

So far the discussion about verified identity focused only on the relationship between an online service such as Twitter and an individual or corporation registering for an account on that platform. But on social media platforms, the crucial connections run laterally, between different users of the platform as peers. It is one thing for Twitter to have some assurance about the real world identity connected to a user. What about other participants on the platform?

One does not have to look back too far to see a large scale experiment in answering that question in the affirmative and evaluating how well that turned out. Google Plus, the failed social networking experiment from Google designed to compete against Facebook, is today best remembered as the punchline to jokes, if it is remembered at all. But at the time of its launch, G+ was controversial for insisting on the use of “real names”. Of course the company had no way to enforce this at the time. Very few Google services interacted with real world identities, by requiring payment or interactions with existing financial institutions. (The use of a credit card suddenly allows for cross-checking names against those already verified by another institution such as a bank. While there is no requirement that the name on a credit card is identical to that appearing on government issued ID, it is a good proxy in most cases.) Absent such consistency checks, all that Google could do was insist that the same name be used across all services: if you are sending email as “John Smith” then your G+ name shall be John Smith. Given how ineffective this is at stopping users from fabricating names at the outset, there had to be a process for flagging accounts violating this rule. That policing function was naturally crowd-sourced to customers, with the expectation that G+ users would “snitch” on each other by escalating matters to customer support with a complaint when they spotted users with presumably fake names. While it is unclear if this half-baked implementation would have prevented G+ from turning into the cesspool of conspiracy theories and disinformation that Facebook evolved into, it certainly resulted in one predictable outcome: haphazard enforcement, with allegations of real-names violation used to harass individuals defending unpopular views.
In a sense G+ combined the worst of both worlds: weak, low-quality identity verification by the platform provider coupled with a requirement for consistency between this “verified” identity known to Google and outward projection visible to other users.

Yet one can also imagine alternative designs that decouple identity verification from the freedom to use pseudonyms or assumed nicknames. Twitter could be 100% confident that the person who signed up is a certain John Smith from New York City in the offline world, while still allowing that customer to operate under a different name as far as all other users are concerned. This affords a reasonable compromise between providing freedom of expressing identity while discouraging abuse: if Mr. Smith is booted from the platform for threatening speech under a pseudonym, he is not coming back under any other pseudonym. (There is also the additional deterrence factor at play: if the behavior warrants referral to law enforcement, the platform can provide meaningful leads on the identity of the perpetrator, instead of an IP address to chase down.)
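
As a rough sketch of what such a decoupled design could look like, the toy class below keeps the verified identity private while exposing only the pseudonym, and applies bans to the person rather than the handle. Everything here is hypothetical: the class `AccountRegistry`, its methods and the name-plus-document fingerprint are illustrative assumptions, and a plain hash of personal data is a placeholder rather than adequate privacy protection for a real system.

```python
import hashlib


class AccountRegistry:
    """Toy registry: verified identity stays private, pseudonyms are public."""

    def __init__(self):
        self._owner_of = {}   # pseudonym -> identity fingerprint (never shown to users)
        self._banned = set()  # fingerprints of banned identities

    @staticmethod
    def _fingerprint(legal_name, document_number):
        # Placeholder for whatever an identity-verification step would return.
        raw = f"{legal_name}|{document_number}".encode()
        return hashlib.sha256(raw).hexdigest()

    def register(self, pseudonym, legal_name, document_number):
        fingerprint = self._fingerprint(legal_name, document_number)
        if fingerprint in self._banned:
            return False  # same person, new pseudonym: still refused
        if pseudonym in self._owner_of:
            return False  # pseudonym already taken
        self._owner_of[pseudonym] = fingerprint
        return True

    def ban(self, pseudonym):
        """Ban the person behind a pseudonym, not just the pseudonym itself."""
        self._banned.add(self._owner_of[pseudonym])


registry = AccountRegistry()
registry.register("nickelbackfan123", "John Smith", "NY-0000000")
registry.ban("nickelbackfan123")
print(registry.register("totally_new_person", "John Smith", "NY-0000000"))  # False
```

Other users only ever see the pseudonym, yet the ban follows Mr. Smith to any new handle he attempts to register.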

This model still raises some thorny questions. What if John Smith deliberately adopts the name of another person in their online profile to mislead other participants? What if the target of impersonation is a major investor or political figure whose perceived opinions could influence others and impact markets? Even the definition of “impersonation” is unclear. If someone is publishing stock advice under the pseudonym “NotWarrenBuffett,” is that parody or deliberate attempt at market manipulation? But these are well-known problems for existing social media platforms. Twitter has developed the blue checkmark scheme to cope with celebrity impostors: accounts with the blue check have been verified to be accurately stating their identity while those without are… presumably suspect?

That leads to one of the unintended side-effects of ubiquitous identity verification. Discouraging the use of pseudonyms (because participants using a pseudonym are relegated to second-class citizenship on the platform compared to those using their legal name) may have a chilling effect on expression. This is less a consequence of verified identities and more about the impact of making the outcome of that process prominently visible: the blue badge on your profile. Today the majority of Twitter accounts are not verified. While the presence of a blue badge elevates trust in a handful of accounts, its absence is not perceived as casting doubt on the credibility of the speaker. This is not necessarily by design, but an artifact of the difficulty of doing robust verification at scale (just ask cryptocurrency exchanges), especially for a service reliant on advertising revenue, where there is no guarantee the sunk cost can be recouped over the lifetime of the customer. In a world where most users sport the verification badge by agreeing to include their legal name in a public profile, those dynamics will get inverted: not disclosing your true identity will be seen as suspect and reduce the initial credibility assigned to the speaker. Given the level of disinformation circulating online, that increased skepticism may not be a bad outcome.

CP

Logical access and the security theater of data-nativism

Data-center address as security guarantee

WSJ recently quoted a spokesman for Binance.US stating that all US customer data is stored on servers located in the US. The subtext of this remark is that by exclusion, customer information is not stored in China, an attempt to distance the company from concerns around the safety of customer information. Such new-found obsession with “data terroir” is a common interpretation of the data-sovereignty doctrine, which holds that information collected from citizens of a particular country must remain both geographically and legally subject to its privacy regulations. While the concept predates the Snowden revelations of 2013, it was given renewed urgency after disclosures of US surveillance programs leveraging massive data collections hoarded by private companies, with Google, Microsoft and Yahoo among those named as participants in the mysterious PRISM program of “upstream” collection. [Full disclosure: this blogger was a member of the Google security team from 2007-2013]

Data-sovereignty is a deceptively simple solution: If Google is forced to store private information of German citizens on servers physically located in Germany, the argument goes, then the NSA (or its counterparts in China, Russia or whichever foreign policy boogeyman looms large in the imagination on a given day) can not unilaterally seize that data without going through the legal framework mandated by German law. This comforting narrative makes no sense from a technology perspective. (It is doubtful it ever made sense in other terms either, including lawful access frameworks: the NSA is legally barred from conducting surveillance on US soil, so moving data out of the US into foreign countries amounts to declaring open season on those assets.) To explain why, we need to distinguish between two types of access: physical and logical.

One note about the hypotheticals explored here: the identity of the private companies hoarding sensitive customer information and the alleged boogeyman going after that stash varies according to the geopolitical flavor of the day. After Snowden, US tech giants were cast as either hapless victims or turncoat collaborators depending on your interpretation, while the NSA conveniently assumed the role of the arch-villain. For the purpose of this blog post we will use Binance/US and China as the stand-ins for these actors, with the full expectation that in a few years these examples will appear quite dated.

Physical access vs logical access

Imagine there is a server located inside a data-center in the middle of nowhere, as most datacenters are bound to be for proximity to cheap hydropower and low real-estate costs. This server is storing some important information you need to access. What are your options?

1. You can travel to the datacenter and walk up to the server directly. This is physical access. It unlocks some very convenient options. Server is stuck, not responding to requests? Press the power button and power-cycle it. Typical rack-mounted servers do not have a monitor, keyboard, mouse or any other peripherals attached for ease of use. But when you are standing next to the machine, you can connect anything you want. This allows getting an interactive shell and using it as a glorified workstation. One could even attach removable storage such as a USB thumb-drive for conveniently copying files. In a pinch, you could crack open the server chassis and pocket one of the disk drives to hoover up its contents. As an added bonus: if you walk out of the datacenter with that drive, the challenge of reading its contents can be tackled later from the comfort of your office. (Incidentally the redundancy in most servers these days means that they will continue ticking on as if nothing happened after the removal of the drive, since they are designed to tolerate failure of individual components and “hot-swapping” of storage.) But all of this flexibility comes at a high cost. First you have to travel to the middle of nowhere, which will likely involve a combination of flying and driving, then get past the access controls instituted by the DC operator. For the highest level of security in a tier-4 datacenter, that typically involves both an ID badge and biometrics such as palm scans for access to restricted areas. Incidentally the facility is covered with cameras everywhere, resulting in a permanent visual record of your presence, lest there be any doubt on what happened later.

2. Alternatively you can access the server remotely over a network using a widely deployed protocol such as SSH, RDP or IPMI. This is logical access. For all intents and purposes, the user experience is one of standing next to the machine staring at a console, minus the inconvenience of standing in the uncomfortable, noisy, over-air-conditioned, fluorescent-lit datacenter aisle. Your display shows exactly the same thing you would see if you were logged into the machine with a monitor attached, modulo some lag in the display due to the time it takes for the signal to travel over a network. You can type commands and run applications exactly as if you had jury-rigged that keyboard/mouse/monitor setup with physical access. Less obvious is that many actions that we typically associate with physical access can be done remotely. Need to connect an exotic USB gadget to the remote server? Being thousands of miles away from the USB port may look like a deal-breaker but it turns out modern operating systems have the ability to virtually “transport” USB devices over a network. USB forwarding has been supported by Windows Remote Desktop Protocol (RDP) for over a decade, while the usbip package provides a comparable solution on Linux. Need to power-on a server that has mysteriously shut down or reset one that has gotten wedged, not responding to network requests? There is a protocol for that too: IPMI. (IPMI runs on a different chip called the “baseboard management controller” or BMC located inside the server, so the server must still be connected to power and have a functioning network connection for its BMC, which happens to be the usual state of affairs in a data-center.) Need to tweak some firmware options or temporarily boot into a different operating system from a removable drive? IPMI makes that possible too.

The only prerequisite for having all these capabilities at your fingertips from anywhere in the world is the foresight to have configured the system for remote access ahead of time. Logical access controls define which services are available remotely (eg SSH vs IPMI), who is allowed to connect, what hoops they jump through in order to authenticate (there is likely going to be a VPN or Virtual Private Network at the front door) and finally what privileges these individuals attain once authenticated. The company running that server gets to define these rules. They are completely independent of the physical access rules enforced by the datacenter operator, which may or may not even be the same company. Those permitted to remotely access servers over a network could be a completely different set of individuals than those permitted to step inside the datacenter floor and walk up to that same server in real life.

Attack surface of logical access

Logical access is almost as powerful as physical access when it comes to accessing data while having the convenience of working from anywhere in the world. In some cases it is even more convenient. Let’s revisit the example from the previous section, of walking into a datacenter and physically extracting a couple of disk drives from a server, with the intention of reading their contents. (We assume the visitor resorted to this smash-and-grab option because they did not have the necessary credentials to log in to the server and access the same data the easy way even while they were standing right next to it.) There are scenarios where that problem is not straightforward, such as when disk encryption is used or the volumes are part of a RAID array that must be reconstructed in a particular manner. Another challenge is retrieving transient data that is only available in memory, never persisted to disk. There are ways to do forensic memory acquisition from live systems, but the outcome is a snapshot that requires painstaking work to locate the proverbial needle in the haystack of a full memory dump. By comparison, if one could log in to the server as a privileged user, with a few commands the running application could be reconfigured to start logging the additional information somewhere for easy retrieval.

There is another reason logical access beats physical access: it’s easier to hide. Logical access operates independently of physical access: there is no record of anyone getting on an airplane, driving up to the datacenter gates, pressing their palm on the biometric scanner or appearing on surveillance video wandering the aisles. The only audit trails are those implemented by the software running on those servers, easily subject to tampering once the uninvited visitors have full control over the system.

Data-nativism as security theater

This distinction between physical and logical access explains why the emphasis on datacenter location is a distraction. Situating servers in one location or another may influence physical access patterns but has no bearing on the far more important dimension of logical access. Revisiting the Binance/US example from the introduction to illustrate this, there are three threat models depending on the relationship between the company and alleged threat actor.

  1. Dishonest, outright colluding with the adversary to siphon US customer data
  2. Honest but helpless in the face of coercion from a foreign government to hand-over customer data
  3. Honest but clueless, unaware that an APT associated with a foreign nation has breached its infrastructure to collect customer data in an unauthorized manner

In the first case it is clear that the location of data-centers is irrelevant. Binance/US employees collectively have all necessary physical and logical access to copy whatever customer information is requested and turn it over to the authorities.

The second case is identical from a capability standpoint. Binance/US employees are still in a position to retrieve customer data from any system under their control, regardless of its geographic location. The only difference is a legal norm that such requests be channeled through US authorities, under an existing Mutual Legal Assistance Treaty (MLAT) agreement. If China seeks information from a US company, the theory goes, it will route the request through the DOJ, which is responsible for applying appropriate safeguards under the 4th amendment before forwarding the request to the eventual recipient. This is at best wishful thinking under the assumptions of our scenario: a rogue regime threatening private companies with retaliation if they do not comply with requests for access to customer information. Such threats are likely to bypass official diplomatic channels and be addressed to the target directly. (“It would be unfortunate if our regulators cracked down on your highly profitable cryptocurrency exchange.”) For-profit organizations on the receiving end of such threats will be disinclined to take a stand on principle or argue the nuances of due process. The relevant question is not whether data is hosted in a particular country of concern, but whether the company and/or its employees have significant ties to that country such that they could be coerced into releasing customer information through extra-judicial requests.

A direct attack on Binance infrastructure is one where geography would most likely come into play. Local jurisdiction certainly makes it easier to stage an all-out assault on a data-center and walk out with any desired piece of hardware. But as the preceding comparison of physical and logical access risks indicates, remote attacks using software exploits are a far more promising avenue of attack than kicking in the door. If the government of China wanted to seize information from Binance, doing so is extremely unlikely to involve a SWAT-style smash-and-grab raid. Such overt actions are impossible to conceal; data-center facilities are some of the most tightly controlled and carefully monitored locations on the planet. Even if the target is greatly motivated by PR concerns to conceal news of such raids, even limited knowledge of the incident breaks a cardinal rule of intelligence collection: not letting the adversary realize they are being surveilled. If nothing else, the company may think twice about placing additional infrastructure in the hostile country after the first raid. By comparison, pure digital attacks exploiting logical access can go undetected for a long time, even indefinitely depending on the relative level of sophistication between attacker vs defender. With the victim none the wiser, compromised systems continue running unimpeded, providing attackers an uninterrupted stream of intelligence.

Physical to logical escalation: attacker & defender view

This is not to say that location is irrelevant. Putting servers into hostile territory can amplify risks involving logical access. One of the more disturbing allegations from the Snowden disclosures involves Google getting sold out by Level3, the ISP hired to provide network service to Google data-centers. Since Google at the time relied on a very naive model of internal security, where traffic inside the perimeter was considered safe to transit without encryption, this would have given the NSA access to confidential information bouncing around the supposedly “trusted” internal network. Presumably a compliant ISP in China will be at least as willing to arrange for access to its customers’ private fiber connections as one located overseas. Other examples involve insider risks and more subtle acts of sabotage. For example the Soviet Union was able to hide listening devices within the structure of the US embassy in Moscow, not to mention backdoor typewriters sent for repair. Facilities located on foreign soil are more likely to have employees and contractors acting at the behest of local intelligence agencies. These agents need not even have any formal role that grants them access; recall the old adage that at 4AM the most privileged user on any computing system is the janitor.

One silver lining here is that risks involving pure physical access have become increasingly manageable with additional security technologies. Full-disk encryption means the janitor can walk away with a bundle of disk drives, but not read their contents. Encryption in transit means attackers tapping network connections will only observe ciphertext instead of the original data. Firmware controls such as secure boot and measured boot make it difficult to install rogue software undetected, while special-purpose hardware such as hardware security modules and TPMs prevent even authorized users from walking away with high-value cryptographic keys.

Confidential computing takes this model to its extreme conclusion. In this vision customers can run their applications on distant cloud service providers and process sensitive data, all the while being confident that the cloud provider can not peek into that data or tamper with application logic, even when that application is running on servers owned by that provider with firmware and hypervisors again in the control of the same untrusted party. This was not possible using vanilla infrastructure providers such as AWS or Azure. Only the introduction of new CPU-level isolation models such as Intel SGX enclaves or AMD SEV virtual machines has made it possible to ask whether trust in the integrity of a server can be decoupled from physical access. Neither has achieved clear market dominance, but both approaches point towards a future where customers can locate servers anywhere in the world (including hostile countries where local authorities are actively seeking to compromise those devices) and still achieve some confidence that software running on those machines continues to follow expected security properties. Incidentally, this is a very challenging threat-model. It is no wonder both Intel and AMD have stumbled in their initial attempts. SGX has been riddled with vulnerabilities. (In a potential sign of retreat, Intel is now following in AMD’s path with an SEV competitor called Trust Domain Extensions or TDX.) Earlier iterations of SEV have not fared any better. Still it is worth remembering that Intel and AMD are trying to solve a far more challenging security problem than the ones faced by companies who operate data-centers in hostile countries, as in the case of Apple and China. Apple is not hosting its services out of some AWS-style service managed by the CCP in a mysterious building. While a recent NYT investigation revealed Apple made accommodations for guanxi, the company retains extensive control over its operational environment.
Hardware configured by Apple is located inside a facility operated by Apple, managed by employees hand-picked by Apple, working according to rules laid down by Apple, monitored 24/7 by security systems overseen by Apple. That’s a far cry from trying to ascertain whether a blackbox in a remote Amazon AWS datacenter you can not see or touch— much less have any say in the initial configuration— is working as promised.

Beyond geography

Regulators dictating where citizens’ private information must be stored and companies defending their privacy record by stressing where customer data is not stored both share in the same flawed logic. Equating geography with data security reflects a fundamental misunderstanding of the threat model, focusing on physical access while neglecting the far more complex problems raised by the possibility of remote logical access to the same information from anywhere in the world.

CP

Flash loans and the democratization of market manipulation

Borrowed guns

Imagine an enterprising criminal out to rob a well-defended gold vault in the middle of nowhere. Unfortunately for his burgeoning career, he has neither the command of a private army of mercenaries nor any of the tactical gear required for the plan. Nor does our hypothetical crook at the beginning of a life of crime have the funds to acquire them yet. He could try buying those resources on credit, with the promise to pay the lender back with the proceeds from the successful heist. But most honest financial institutions have been getting gun-shy about lending to criminals and even the loan-sharks require some type of collateral, which, again, our man does not have.

Luckily the neighborhood aviation company is running a special: anyone can walk-in and rent an Apache AH-64 gunship for a very low price, no questions asked. But the offer comes with a few strings attached:

  • This bird is programmed to return to its original take-off point after an hour.
  • It can not refuel. You get exactly one tank of gas to work with.

Borrowers can take off and do whatever they want with the helicopter— including a detour to rob the gold vault— but must return to the designated landing area. If they run out of fuel and crash land in the middle of nowhere, they will have to walk away from the spoils and watch as the stolen loot is recovered by its rightful owners. The world reverts to its previous state, as if the heist never happened.

DeFi exploits in the wild

This is one perspective on the concept of flash loans in decentralized finance. Anyone can initiate an Ethereum transaction and borrow funds which must be paid back at the end of that transaction. There is no collateral or credit-check required because it is not possible for the borrower to default. The immutable logic of smart-contracts enforced by the blockchain guarantees this. If the loan is not paid back by the end of the transaction, the transaction “reverts”: it is still recorded on the blockchain and fees are paid to miners for their effort, but nothing else has changed. But if all goes well and the loan is paid, the changes that occurred within the span of that transaction (money changing hands, someone making a killing, someone else losing their shirt) are committed to the blockchain. Those possibilities are only limited by the maximum gas that can be consumed in a transaction, the virtual equivalent of the AH-64 fuel tank.
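
To make the all-or-nothing behavior concrete, here is a minimal Python sketch of a flash-loan transaction on a toy in-memory ledger. Everything in it is hypothetical (the ToyChain class, the account names, the flat gas fee that is simply deducted). It only models the snapshot-and-rollback semantics described above, not how the EVM or any real lending pool is implemented.

```python
class Revert(Exception):
    """Signals that the whole transaction must be rolled back."""


class ToyChain:
    """Hypothetical in-memory ledger, standing in for blockchain state."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, src, dst, amount):
        # Refuse overdrafts; the check happens before any balance changes.
        if self.balances[src] < amount:
            raise Revert(f"{src} cannot cover {amount}")
        self.balances[src] -= amount
        self.balances[dst] += amount

    def flash_loan_tx(self, pool, borrower, amount, strategy, gas_fee=1):
        """Run `strategy` with borrowed funds; return True if committed."""
        # The fee is charged up front and kept even if the transaction reverts.
        self.balances[borrower] -= gas_fee
        snapshot = dict(self.balances)
        try:
            self.transfer(pool, borrower, amount)   # lend
            strategy(self, borrower)                # arbitrary borrower logic
            self.transfer(borrower, pool, amount)   # repayment is mandatory
        except Revert:
            self.balances = snapshot                # as if nothing happened
            return False
        return True


chain = ToyChain({"pool": 1000, "alice": 5, "victim": 50})


def profitable(c, who):
    c.transfer("victim", who, 20)    # stand-in for a successful exploit


def losing(c, who):
    c.transfer(who, "victim", 100)   # stand-in for a trade gone wrong


committed = chain.flash_loan_tx("pool", "alice", 500, profitable)
reverted = not chain.flash_loan_tx("pool", "alice", 500, losing)
```

In this toy run the first transaction commits and the second reverts: the borrower keeps the gains from the first attempt and is out only the fee on the second, while the lending pool ends up exactly where it started in both cases.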

Not surprisingly, flash loans have been used for attacking DeFi exchanges and lending pools by manipulating the price signals those applications rely on. The attacks are complex and necessarily involve multiple DeFi contracts (exchanges such as Uniswap or lending pools such as Compound) and trading in/out of multiple assets. Here is a very simplified example of how such an attack can be executed:

  1. Flash-borrow a large amount of Ether
  2. Divide the ETH into two chunks of capital
  3. Convert the first chunk into token A, using a decentralized exchange. Now recall that DEXes do not have traditional order-books with ask/bid offers that can be matched when they cross. Instead they use automated market-makers (AMMs) which set the price based on the total amount of funds available on either side. More importantly, the liquidity available on these exchanges is often razor thin. It does not require a lot of capital to cause a massive change in price. The result of this large, single “buy” order to convert ETH → A is that the “price” of A goes way up on the decentralized exchange. In other words, there is massive slippage. This type of trade is normally a terrible idea: the buyer effectively overpaid for A when they could have gotten a much better deal if they traded on a centralized exchange. So how can an “attacker” make up lost ground if they are starting out with such a lousy trade?
  4. Convert the second chunk of ETH into asset A. The trick is using a different venue for this than #3. The goal is for this trade to execute at close to fair market price and avoid slippage.
  5. Time to visit the real victim, yet another DeFi application. There are specific criteria for selecting this target: it must be using the venue from step #3 as its price oracle. In other words, when the attacker tries to trade A or borrow using A as collateral on the target venue, that venue will rely on faulty price signals from #3 which has been artificially manipulated by the attacker. (Recall that everything is executing inside a single Ethereum transaction orchestrated by the attacker; no other trades that could interfere with this mispricing can occur.)
  6. This time the attacker has a favorable trade from A → B. The target venue is working with an overinflated price for A, because the last A-for-ETH transaction artificially inflated the price of A relative to ETH. The market maker is willing to swap/lend an outsized quantity of some other token B in exchange for a small amount of A. This is the crucial step. While the attacker lost money on the first chunk and ended up with a deficit in asset A, they aim for a killing on the second chunk, ending up with a surplus of asset B relative to the value of A exchanged.
  7. Time to pay back the flash loan. The attacker converts enough of their holdings in A and B back to ETH to cover the original loan, again using a venue where price indications are not distorted. The proceeds are used to close out the flash loan and complete the transaction successfully.
  8. Assuming the profit from B exceeds the losses on A, the attacker comes out ahead. (What if the math did not work out? No harm, no foul. The transaction will revert. So the attacker does not stand to lose any money beyond the Ethereum gas fees paid for the attempt.)
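
The steps above can be sketched with a toy simulation. It assumes a fee-less constant-product market maker (reserves satisfying x*y=k, one common AMM design) as the thin venue from step 3, a lending protocol that reads that pool's spot price as its oracle and has unlimited token B to lend, and made-up reserves, fair prices and collateral factor. None of the names or numbers come from an actual incident; the point is only to show how a money-losing trade on one venue can pay for itself on another.

```python
from dataclasses import dataclass


@dataclass
class ConstantProductPool:
    """Toy x*y=k market maker with no trading fee (an assumption)."""
    eth: float
    token_a: float

    def spot_price(self) -> float:
        """ETH per unit of A implied by the current reserves."""
        return self.eth / self.token_a

    def buy_a(self, eth_in: float) -> float:
        """Swap ETH for A while keeping eth * token_a constant."""
        k = self.eth * self.token_a
        new_eth = self.eth + eth_in
        new_a = k / new_eth
        a_out = self.token_a - new_a
        self.eth, self.token_a = new_eth, new_a
        return a_out


FAIR_PRICE_A = 1.0        # ETH per A on deep venues (assumed)
FAIR_PRICE_B = 1.0        # ETH per B on deep venues (assumed)
COLLATERAL_FACTOR = 0.75  # victim lends 75% of collateral value (assumed)


def simulate_attack(loan: float, manipulate_share: float) -> float:
    """Return the attacker's ETH profit; a negative value means the tx reverts."""
    thin_pool = ConstantProductPool(eth=1000.0, token_a=1000.0)
    chunk1 = loan * manipulate_share           # step 2: split the loan
    chunk2 = loan - chunk1
    a_held = thin_pool.buy_a(chunk1)           # step 3: lousy trade, moves price
    a_held += chunk2 / FAIR_PRICE_A            # step 4: fair trade elsewhere
    oracle_price = thin_pool.spot_price()      # step 5: victim trusts this
    collateral_value = a_held * oracle_price   # step 6: A is overvalued
    b_borrowed = collateral_value * COLLATERAL_FACTOR / FAIR_PRICE_B
    eth_back = b_borrowed * FAIR_PRICE_B       # step 7: sell B at fair price
    return eth_back - loan                     # step 8: what is left after repaying


profit_with_manipulation = simulate_attack(loan=2000.0, manipulate_share=0.5)
profit_without = simulate_attack(loan=2000.0, manipulate_share=0.0)
```

With these made-up parameters, spending half the loan on manipulation quadruples the oracle price and leaves the attacker ahead after repayment, while skipping the manipulation leaves a shortfall and the transaction would simply revert. Changing manipulate_share changes the outcome, so the way the loan is divided matters.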

This is a highly simplified view; actual attacks can be far more complicated. Any asset can be flash-loaned, so the starting point need not be ether. However the loan has to be paid back in kind, so the attacker is still on the hook for returning the identical amount. The exchange process may involve multiple hops such as ETH → A → B → C → … → ETH before the cycle is completed. For more concrete examples, see this 2021 paper or breakdown of the recent attack on CREAM which involved dozens of steps within a single Ethereum transaction. That paper also poses the question of whether attacks in the wild were “optimal” in how they divided up the total amount borrowed into two chunks. The surprising answer is they are far from optimal: in each case, a different allocation between different assets A and B would have resulted in a more profitable heist. The crooks left money on the table. (Incidentally, you have to wonder about the ethics of academic research that doubles as a handbook on committing more optimal robberies and leaving less money behind in the virtual vault.)

Root causes

With this background on how flash-loans are leveraged in recent attacks, we can revisit the original question: were flash loans the root cause? The answer is clearly no. Weak connection between prices on decentralized exchanges and the “real” market prices elsewhere is the real culprit. By definition blockchains are isolated systems: they can not interact with the outside world. A smart contract can not sidle up to a Bloomberg terminal and request a fresh quote on current commodity prices. It must rely on indirect indications, such as trusted pricing oracles maintained by others on-chain or observed actions of participants interacting with the contract when trading an asset. Multiple DeFi exploits have demonstrated that these signals are surprisingly easy to manipulate given enough capital. When taken in isolation, each such instance of manipulation looks self-defeating: the “attacker” gets the price of an asset completely out-of-whack on one particular exchange but only by making a lousy trade. Whatever distortion is achieved will be short lived, as other investors take note of the mispricing and jump in to quickly arbitrage away the difference. Why would any rational actor engage in this meaningless gesture? Because other venues rely on the same distorted price signal and create profit opportunities far exceeding the loss on the original trade. This is an intrinsic structural weakness for some (but not all) decentralized applications with flawed pricing signals.

From this perspective, flash-loans did not enable a new class of attacks that were impossible before. The sequence of actions depicted in the previous section could have skipped the first step (flash borrowing) and started out with an existing pool of capital already in the hands of the perpetrator. Even the most extreme case of the recent CREAM attack involved a $500MM USD flash loan. There are many hedge-funds and high net-worth individuals in possession of amounts in that neighborhood. Every one of them could have executed the exact same transaction without borrowing a single wei. Seen in this light, flash loans democratize the possibility of market manipulation.

This has parallels in a story related by Michael Lewis in Flash Boys. Goldman Sachs argued that high-frequency trading source code allegedly stolen by its one-time employee Aleynikov could be used for “unfair market manipulation.” To which Lewis effectively retorted: If such code exists, is the real problem that Aleynikov had possession of it? Is market manipulation “fair” when the same algorithm is wielded by Goldman? To the extent DeFi applications are built on flawed pricing signals, they are vulnerable to manipulation. Whether the manipulation is done with institutional capital on-hand or aided by no-questions-asked flash-loans seems irrelevant.

Deterrence at the margins

One counter-argument is that reputable market participants with large concentrations of capital are unlikely to attack smart-contracts regardless of profit opportunity, for fear of legal and reputational risks. This is complicated by the ambiguity of what qualifies as an attack. It is not clear that what happened to CREAM and others is a traditional “hack” in any sense. There were no logic bugs in the contract. There was no compromise of a secret key held by the CREAM team. Other smart-contracts such as the DAO or the Parity multi-sig wallet suffered massive losses due to logic flaws in their implementation. In both of those cases, the smart-contract had a glaring programming error such that its behavior diverged from its intended behavior, however informally specified that may have been. Compare these two cases:

  • In the case of Parity, the expectation is that only the owner of the wallet can withdraw funds from their wallet. If everyone in the world can take money out of your wallet, there is no ambiguity: the contract has failed to implement the intended policy. Anyone taking advantage of that flaw is exploiting a security vulnerability and committing theft.
  • In the case of CREAM the contract worked exactly as intended, using precisely the price signals it was expected to consume. But the designers did not look far enough ahead to understand how their creation would behave in extreme circumstances when those signals become wildly distorted. If the casino designs a game such that clever players can inflict massive losses on the house while playing by the rules, is it an “attack” to implement that strategy?

If this is not a breach in the traditional sense, one could at least hope that it qualifies as market manipulation. (Standard disclaimer: the author is not a lawyer and none of this should be construed as legal advice.) At least that categorization could serve as a deterrent for participants interested in staying on the right side of the law. But it is unclear how existing statutes for trading securities or commodities apply in the context of blockchain assets. While this post liberally uses the term “market manipulation,” not every instance of buying up large quantities of something to profit from the artificial scarcity is necessarily criminal. Not every scalper hoarding Hamilton tickets for resale merits an SEC investigation. Even if the perpetrators of these attacks were identified and prosecuted— unlikely given the relative anonymity of blockchain transactions— they may well rest their defense on the claim that “manipulation” is impossible when dealing with a system that is defined by immutable rules implemented in code.

On the other extreme, if we posit that what happened to CREAM constitutes criminal activity that falls under SEC or CFTC jurisdiction, some troubling questions are raised about the venues providing the flash-loans. Is there liability? Did they aid and abet theft? Returning to the opening hypothetical about the helicopter available for anyone to borrow: if that craft turned up as the get-away vehicle for an actual robbery, surely the owners would have some explaining to do. Were they aware that this customer intended to commit criminal activity? Did they conduct any due diligence? Saying that the business has a policy of not asking any questions reeks of willful blindness. Virtually all flash-loans on Ethereum today follow this model: since the loan is guaranteed to be repaid, the lender does not have to care about the creditworthiness of the borrower. But that narrow focus on avoiding defaults misses the negative externalities created by (temporarily) arming random people with large amounts of capital to wreak havoc on other blockchain applications. Did Maker aid and abet criminal activity in providing the half billion dollars in capital used to drain CREAM? In the same way that Aave is contemplating the creation of permissioned lending pools subject to Know-Your-Customer rules, flash-loan providers may need to revisit their strategy around doing business with anyone.

CP

Pre-theft attacks on Ethereum: stealing from the future

Among financial instruments cryptocurrency is unique in equating possession of funds to control over a secret cryptographic key. If you have the private key corresponding to a particular blockchain address, you have full control over funds at that address. In particular, you can sign a transaction to move those funds anywhere. This simple threat model helps the defenders prioritize their strategies and place great emphasis on key management: making sure your private keys do not fall into the wrong hands. This is where offline air-gapped “cold storage” designs, multi-signature or equivalent MPC techniques and specialized hardware security modules come into play, helping raise the bar against attacks.

But there is a more subtle aspect of blockchain design that can complicate “second-order threats” involving temporary access to private keys. It’s clear that an adversary need not have actual possession of private keys— in the sense of having the raw bits they can print on a piece of paper— in order to carry out a heist. If they can instruct a blackbox to sign a transaction sending funds to a new blockchain address controlled by the attacker, that will do just fine.

A parallel from the world of code signing comes from the 2012 Adobe breach. Most consumer operating systems including Windows implement a code-signing requirement for software vendors to digitally sign applications they publish. This is designed to help increase confidence in the authenticity of software and prevent malware from disguising itself as “Adobe Photoshop” for instance. Such keys must be carefully guarded and failure to do so has been leveraged in well-known attacks, most notably the joint US/Israel Stuxnet malware targeting the Iranian nuclear program, which used stolen private keys to digitally sign its components. In the case of the Adobe breach, the company had taken steps to secure its private keys by using a hardware security module. This prevented attackers from being able to walk away with a copy of those keys. Not surprisingly, it did not stop them from signing a few pieces of malicious code during the time they had access to the HSM.

Here is a more routine scenario involving access control. Consider Bob, an employee in good standing at a company that stores cryptocurrency. Because his role includes wallet management, Bob has access to the key management system to generate transactions. (Additional controls may exist to limit what transactions can be signed, but this makes no difference to the risk under consideration here.) Suppose Bob and his employer part ways in a less-than-friendly manner. It is clear that while he was employed, he could have signed and broadcast any permissible transaction. Let’s posit that his access has been properly revoked and he can no longer access signing infrastructure. Let’s also grant that all private keys were stored on HSMs in non-extractable fashion. Is there any reason to fear retaliation from Bob?

If Bob planned ahead, he could have signed some transactions and put them aside for future broadcast. Whether those transactions remain valid indefinitely depends on the specific blockchain protocol. In the case of Bitcoin, the answer is yes, unless some other transaction is first broadcast to spend the same unspent transaction output or “UTXO.” In fact Bitcoin can only set time limits in one direction: it is possible to time-lock funds such that a transaction is not valid until a certain time or block-height. It is not possible (yet, barring a hard fork) to create a transaction that is valid today but stops being valid at a future date, short of having a conflicting transaction that double-spends the inputs. This provides a straightforward, if expensive, way of mitigating risk from unknown signatures floating around: preemptively spend all existing UTXOs associated with keys the ex-employee had access to. These can even be “loopback” transactions, sending funds to the same address without generating new keys, as long as the transactions are unpredictable.**

In the case of Ethereum and specifically externally-owned accounts (EOA) the situation is more tricky. It turns out disgruntled employees can steal future funds that do not even exist at the time they are employed.

Ethereum signing recap

Ethereum requires a signed message to be broadcast in order to authorize a funds transfer or invoke a function on a contract. This message has several fields encoded using a scheme called “RLP.” For our purposes the interesting ones are:

  • Destination address
  • Amount being transferred
  • Value of the current nonce associated with the source address

Compared to Bitcoin transactions, this information is only loosely bound to current blockchain state. Creating a successful bitcoin transaction requires knowing the SHA256 hash of an existing UTXO on chain, which is a function of past state including all previous inputs that feed into that UTXO. For ethereum, only some vague knowledge about the state of the world is necessary. Walking through the three fields above:

  • Destination address is arbitrary and completely attacker-controlled.
  • The only constraint on amount is that it cannot exceed the total stored at that address. (Unlike Bitcoin, where spending a UTXO consumes all of the funds in it, an Ethereum transfer can move any portion of the balance. Any amount not included in the transfer stays at that address.)
  • Nonce is a counter that starts at zero and increments by one for every transaction originating from that address. The nonce included in the transaction must be exactly equal to current nonce on blockchain.

Pre-theft: stealing nonexistent funds

Note that the only reference to blockchain state is the nonce. There is no need to know the exact balance on that address, much less the sequence of previous transactions resulting in that total. That property makes it possible to steal funds that do not even exist on-chain yet, given only temporary access to the signing interface.

Returning to our hypothetical disgruntled employee Bob: suppose Bob knows that some Ethereum address will receive deposits in the future, even though its current balance is exactly zero. Bob can use his temporary access to sign a transaction for a future predicted value of the nonce and amount. For example, he can bet that by the time the counter reaches 100, there will be at least 5 ETH balance in this account and create a corresponding transaction to funnel that amount to a personal address. Now all he has to do is wait until the counter reaches 100 and broadcast the previously signed transaction. As long as the balance is at least 5 ETH, the transaction will move that amount into Bob’s possession.
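The scenario can be modeled with a toy account. This is a minimal sketch, not real Ethereum client logic: the names `SignedTransfer`, `Account` and `try_apply` are hypothetical, the signature itself and gas fees are left out, and only the two checks described above (matching nonce, sufficient balance) are enforced.

```python
from dataclasses import dataclass

WEI_PER_ETH = 10**18


@dataclass(frozen=True)
class SignedTransfer:
    """Hypothetical stand-in for a signed transaction (signature omitted)."""
    destination: str
    amount_wei: int
    nonce: int


@dataclass
class Account:
    """Toy model of an externally-owned account: just a balance and a nonce."""
    balance_wei: int = 0
    nonce: int = 0


def try_apply(tx: SignedTransfer, sender: Account) -> bool:
    """Apply tx if the nonce matches exactly and the balance covers the amount."""
    if tx.nonce != sender.nonce or tx.amount_wei > sender.balance_wei:
        return False
    sender.balance_wei -= tx.amount_wei
    sender.nonce += 1
    return True


# Bob signs while the account is empty and its nonce is still 0.
victim = Account()
presigned = SignedTransfer("0xBob", 5 * WEI_PER_ETH, nonce=100)
rejected_early = try_apply(presigned, victim)   # wrong nonce, no funds yet

# Long after Bob's access is revoked: 100 legitimate transactions have gone
# out and deposits have accumulated (assumed values for illustration).
victim.nonce = 100
victim.balance_wei = 7 * WEI_PER_ETH
accepted_later = try_apply(presigned, victim)   # the old signature now works
```

Nothing in the signed message ties it to the balance or history at signing time, which is why the same bytes go from invalid to valid without Bob touching the key again.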

Optimizing the heist

What if Bob guessed wrong and there is only 3 ETH? Not to worry: he can supply 2 ETH from his own funds, add that to the original pool and then withdraw using the existing transaction. This is a bizarre pattern as far as criminal activities go: the thief must make a donation to their victim before committing robbery.

Side note to Bob: you would want to execute these two steps as close as possible in time. Otherwise there is a risk that the 2 ETH “donation” gets processed but the counter increments past 100 due to intervening transactions, causing Bob to miss the attack window. Back-to-back execution is difficult to guarantee because the second step can not be implemented as a smart-contract invocation. (If that were possible, Bob could write a custom contract that attempts to execute both atomically and revert in case the withdrawal fails.) Only a miner colluding with Bob can guarantee that donation and theft transactions are executed back-to-back and before any other transaction involving the same address that may disrupt the nonce value.

In fact there is no reason for Bob to limit himself to just one signed transaction. To cover his bases, he can sign multiple TX for the same amount at different counter values in a given interval, for example spanning 100-110. This avoids any race conditions from Bob’s transaction being preempted by another in-flight transaction for the same counter value, originating with the authorized party.

Multiple signatures solve another nagging problem for Bob: leaving money on the table. Recall that Bob must guess at a particular amount to steal. If he guesses on the high-side, he faces the problem of having to supply some funds first. What if he guesses too low? Imagine if the balance was 50 ETH instead of 5 ETH. The transaction Bob prepared will only walk away with 10% of the total take possible compared to the optimal heist.

Bob can improve his odds by preparing a series of transactions with different nonce values and different amounts. Consider this sequence:

<100, 1 ETH>
<101, 2 ETH>
<102, 4 ETH>

<109, 512 ETH>

Assuming the final balance is 50 ETH, he will broadcast the first five transactions, netting a total of 31 ETH. The sixth one can not be broadcast as is, because at that point the remaining balance on the account is 19 ETH, which is lower than the 32 ETH withdrawal attempt. (Bob can use some of the proceeds from the initial batch to “loan” more funds into the victim address such that the total exceeds 32 ETH and only then broadcast the final transaction.) Even without risking the race-condition associated with lend-and-steal, this sequence is guaranteed to capture at least roughly 50% of available funds. In fact there is nothing magical about the factor of two appearing in the sequence above. At the cost of requiring additional signatures, one could prepare a series of transactions where the amounts increase by some other constant factor F > 1, with the guarantee that at least 1/F of total value sitting on that address can be captured directly.
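The arithmetic can be checked with a short sketch. The helper names `ladder` and `captured` are made up for illustration, amounts are whole ETH, gas fees are ignored, and it is assumed that no other transaction consumes the nonces Bob is targeting.

```python
def ladder(start_nonce: int, count: int, factor: int = 2, first_amount: int = 1):
    """Pre-signed <nonce, amount> pairs with amounts growing geometrically."""
    return [(start_nonce + i, first_amount * factor**i) for i in range(count)]


def captured(balance: int, txs) -> int:
    """Total taken by broadcasting txs in nonce order until one would overdraw."""
    total = 0
    for _nonce, amount in txs:
        if amount > balance - total:
            break  # this and all later (larger) transactions would fail
        total += amount
    return total


doubling = ladder(start_nonce=100, count=10)   # <100, 1> ... <109, 512>
take = captured(50, doubling)                  # 1 + 2 + 4 + 8 + 16
```

With integer factor F, the loop stops only when the remaining balance is smaller than the next amount, which is what bounds the uncaptured remainder to about (F - 1)/F of the total.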

Defender perspective: mitigations

Proving the existence of a signed transaction is relatively easy: broadcast it. (In fact there are zero-knowledge techniques from cryptography for convincing a verifier that you know such a signature without disclosing it.) But proving the non-existence of such a transaction is tricky. How can a custodian be confident that someone with access to the private key in the past did not sign pre-theft transactions? There are two sound approaches:

1. Throw in the towel, deprecate the address and start over by transferring the entire balance over to a newly generated key-pair. This is straightforward but highly disruptive in having to update all existing references to the previous blockchain address.
2. Use a smart-contract. While externally-owned Ethereum addresses have hard-coded logic for authorizing funds movement, a contract is free to make up its own rules. Instead of using an incrementing nonce which is highly predictable, the contract logic can dictate that an uncontrollable value such as the block-hash must be incorporated into the signed message. While the block-hash is deterministic in one sense— it is computed as a function of all transactions included in the block— an attacker has no way to control it indefinitely into the future short of controlling 100% of hash rate.

What about audit trails? In theory if the signing system has a perfect logging mechanism that dutifully records every use of the key including the message signed, one can be confident there are no other, unknown transactions floating around. In reality it is difficult to achieve this level of assurance. Even standard cryptographic hardware does not help. Typical HSM logs can reveal when someone performed cryptographic operations but not necessarily which key was involved, much less the exact message they signed. Some vendor extensions to the PKCS#11 standard include counters that increment each time a key is used. (Safenet HSMs implement this in firmware 7.0 and higher versions.) One can reconcile that counter value against an independent record of all transactions ever submitted for signing. This approach can flag discrepancies, but not necessarily resolve them conclusively. Suppose the HSM counter shows a key was used 10 times but only 9 signed transactions are known to exist. There could be an innocent explanation: some transaction among the nine got signed twice, due to a transient error that was silently resolved by retrying with the same message. Or it could be evidence of a pre-theft attack, where someone snuck in a tenth transaction outside the known set with intent to broadcast it in the future.

The missing feature is a more robust, tamper-resistant audit trail maintained internally by the HSM that incorporates hashes being signed. This need not be an append-only log in the traditional sense. For example the same logic used for “extending” PCRs in a TPM can be used to maintain a concise, constant size running tally of all messages ever signed with a given private key.
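A minimal sketch of such a running tally, modeled on the PCR extend operation (new value = hash of old value concatenated with the new measurement). The function names and the choice of SHA-256 are assumptions for illustration; no HSM is known to expose this interface.

```python
import hashlib


def extend(tally: bytes, signed_hash: bytes) -> bytes:
    """Fold one signed message hash into the constant-size tally."""
    return hashlib.sha256(tally + signed_hash).digest()


def replay(signed_hashes) -> bytes:
    """Recompute the tally an honest device would hold after signing these hashes in order."""
    tally = bytes(32)  # all-zero starting value, as with a freshly reset PCR
    for h in signed_hashes:
        tally = extend(tally, h)
    return tally


# The auditor's independent record of everything submitted for signing
# (placeholder hashes standing in for real transaction hashes).
known = [hashlib.sha256(f"tx-{i}".encode()).digest() for i in range(9)]
rogue = hashlib.sha256(b"pre-theft tx").digest()

hsm_tally_clean = replay(known)
hsm_tally_tampered = replay(known[:5] + [rogue] + known[5:])
```

If the tally reported by the device equals `replay(known)`, the known set accounts for every signature in order. A silent retry of a known message would still change the tally, but the auditor can test that innocent explanation by replaying with the duplicate inserted, something a bare use counter cannot distinguish from an unknown transaction.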

CP

** Segregated witness complicates this somewhat because the signatures are no longer included in the transaction hash. That removes one of the main sources of unpredictability from the UTXO, leaving only the amounts and mining fees.

Tricky accounting: cryptocurrency mining & energy use

Pinning down the true energy cost of mining

The staggering energy consumption and carbon emissions from Bitcoin mining finally graduated from Twitter pundits to the national political stage when Senator Warren weighed in with her opinion. Given the amount of ink spilled on this subject, there are plenty of eloquent defenses of the case on both sides. But there are also two common, flawed arguments that are frequently repeated, and it is these that we take up here.

Per-transaction arithmetic

The first flawed argument seeks to “prove” the inefficiency of cryptocurrencies by attempting to arrive at per-transaction costs with simple arithmetic. Take the total estimate for yearly energy consumption or implied carbon emissions (based on reasonable estimates of the energy generation mix; these figures are not controversial) and divide it by the number of transactions that have occurred on the Bitcoin blockchain during that time frame. This simple allocation of cost results in highly dramatic and quotable comparisons such as “the energy used for a single bitcoin transaction could power an average house for a month.”

Fail to scale?

Before discussing the problem with this line of reasoning, it is worth also pointing out where it is correct. The calculations do not reflect a temporary inefficiency due to under-utilization. Mining a block requires about the same energy regardless of how many transactions are included. In the worst case scenario a block can have just one lonely transaction: the so-called “coin-base” transaction that is always present and sends the newly minted block rewards to the winning miner. If one were to measure per-transaction costs for such a block, the wasted energy would be even more dramatic by three orders of magnitude. This is similar to the fuel-efficiency of a commercial jetliner: an airplane flying only its pilots with no passengers on-board still consumes almost as much fuel as if it were flying laden with passengers and cargo. Blocks were already full before the segregated witness change indirectly increased capacity. Even if additional changes double or triple the number of transactions that can be processed in a block, it will barely make a dent in the problem if the goal is viewed as reducing the energy of an individual transaction to levels comparable to credit-card networks. Packing twice as many people into a jetliner will not make it as efficient as a car for a short trip. (Layer 2 scaling solutions that aggregate a large number of off-chain payments into a single on-chain transaction could however result in more drastic gains.)
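The allocation being described is a single division, sketched below. Energy is normalized to one unit per block rather than quoting a real figure, and the count of 2,000 transactions for a full block is an illustrative assumption chosen only to match the “three orders of magnitude” remark.

```python
def energy_per_transaction(block_energy: float, tx_count: int) -> float:
    """Naive allocation: spread the block's energy evenly over its transactions."""
    if tx_count < 1:
        raise ValueError("every block contains at least the coinbase transaction")
    return block_energy / tx_count


BLOCK_ENERGY = 1.0       # normalized: energy to mine one block, full or empty
FULL_BLOCK_TXS = 2000    # illustrative assumption, not a measured value

empty_block = energy_per_transaction(BLOCK_ENERGY, 1)
full_block = energy_per_transaction(BLOCK_ENERGY, FULL_BLOCK_TXS)
doubled_capacity = energy_per_transaction(BLOCK_ENERGY, 2 * FULL_BLOCK_TXS)
```

Under this accounting, doubling block capacity only halves the per-transaction figure, the same modest gain as seating twice as many passengers on the jetliner.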

Incomplete attribution

The fundamental error in the per-transaction critique of bitcoin energy consumption is neglecting the other use-cases for a monetary system. To recap, money serves as:

  1. Unit of measure, e.g. for pricing assets
  2. Method of exchange— in other words, making payments
  3. Store of value

It is that final purpose that is being neglected when the utility of bitcoin is only measured in terms of payments. In fact, it is clear that most cryptocurrencies score atrociously on the first two use-cases. Denominating prices in a highly volatile asset results in taking on exchange risks; no wonder most merchants who claim to accept bitcoin are in fact doing so through a payment processor who immediately converts the incoming funds into fiat currency and credits the merchant in dollars. Ubiquitous, peer-to-peer payments may have been an early source of excitement around bitcoin, with utopian visions of disintermediating the Visa/MC/AmEx oligopoly or helping unbanked residents in developing countries get access to the modern economy with nothing more than a mobile wallet app required. That vision has yet to pan out. With the exception of underground markets, fiat currency remains the preferred method of payment despite all of its perceived shortcomings. That leaves the final scenario as the one cryptocurrency shines at: digital gold, an inflation hedge against the money-printer going out of control, or according to its detractors, a speculative asset class built around the greater-fool theory of asset valuation.

Accordingly the energy spent on mining can not be exclusively allocated to actual transactions, regardless of how many or few are occurring, or what fraction of those represent meaningful economic exchanges as opposed to shuffling funds around to erase their criminal provenance. A better question is whether the energy consumption and associated CO2 emissions are worth sustaining a new asset class whose market capitalization stood at over a trillion dollars at its peak. In this regard, bitcoin is more similar to a commodity such as gold or even a public company along the lines of Apple or Exxon-Mobil whose shares can be purchased for investment purposes. Each of these asset classes can serve as a store of value. Critics may object that Apple and Exxon actually provide “useful” services in addition to having shares you can invest in as a store of value. Yet the alleged utility of those services is in the eye of the beholder. Just as some question whether censorship resistant, peer-to-peer payments are useful outside the context of criminal activity, one could argue the “product” Exxon-Mobil manufactures is in fact a net negative for society. Whether the investment value XOM provides its current shareholders is worth the cost of emissions directly and indirectly attributable to its production activities is equally debatable.

Mining and scarcity

With the problem reframed as storing value instead of payments, bitcoin defenders have gone on the offensive by comparing its CO2 emissions to that of gold-mining. By one estimate, bitcoin mining uses 50% more energy than gold mining while producing about half the emissions due to greater share of renewables in the generation mix. Case closed? Not exactly, for several reasons.

  1. Gold has a market cap 10-20x that of bitcoin, with the wide-range owing to the volatility of bitcoin during the timeframes one may care to sample. For bitcoin to claim parity in carbon-efficiency as store of value, it would have to be not twice but at least 10 times as efficient.
  2. Gold mining, much like other industrial processes, becomes more efficient over time as improvements in technology allow the same amount of mining and processing to be carried out using fewer inputs, including energy. Bitcoin mining faces a similar competitive pressure for efficiency: every miner wants to maximize the number of tickets to the proof-of-work lottery they can purchase every second using one watt of energy. Those same dynamics do not necessarily apply to total energy consumption. If a miner is profitable at current energy costs and bitcoin prices, when the price of bitcoin doubles it will still be profitable using twice as much energy to continue mining. Granted gold mining has similar incentives in that if prices double, there will be an incentive to throw more inputs into the search for gold. But cryptocurrency prices have appreciated much faster than gold. Even by mildly optimistic projections, another 3-5x appreciation is expected. More importantly, the production of commodities is not controlled by a simple calculus linking energy inputs to profit. Doubling the hash-rate of a cryptocurrency mining pool doubles expected block rewards, plain and simple. Digging twice as many wells does not result in doubling oil-reserves, and neither does using twice as much cyanide to process gold ore yield twice the amount of gold.
  3. The final flaw in the comparison against gold mining is the flip-side of the per-transaction accounting. Cryptocurrency advocates frequently emphasize that mining is there to secure the network, to protect the value of existing cryptocurrency against 51% attacks, censorship and other legerdemain that could result from a single entity taking over a majority of hash-power. But the unspoken corollary of that assertion is that mining can not stop or decrease substantially without undermining those assets. That is in sharp contrast to commodities. If gold mining activity stopped overnight or De Beers announced no more diamonds are left to dig out of the ground, gold and diamonds would still be highly precious. (Arguably they would become even more valuable due to the scarcity implied by that news.) For Bitcoin to hold its value against inflation, mining must continue as a forever-war of pools consuming higher amounts of energy input to feed increasingly more efficient mining rigs to eke out a tiny advantage against competitors.

CP

Designing a duress PIN: covert channels for SSH (part V)

[continued from part IV]

Covert channels with ECDSA

ECDSA signatures are probabilistic, with a random nonce point chosen by the signer comprising half the signature. This potential for covert channels was known early on in the context of plain DSA over the integers, without the “EC” part— later elliptic curve adaptation of the scheme did not materially affect the existence of covert channels.

The core idea is to repeatedly try different nonces until the final signature satisfies some property. For example, suppose the goal is to convey the bit string “1011.” The signer chooses different random nonces and computes the corresponding half of the ECDSA signature. Next an HMAC is run on that result with a symmetric secret shared with the verifier. If HMAC outputs a result ending with the bit pattern “1011,” the signature can be released. Otherwise a new nonce is selected and the search continues. The verifier can extract the same bit pattern by repeating the HMAC calculation on the first half of the received signature.
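The search loop can be sketched as follows. To stay self-contained this does not perform any elliptic-curve math: 32 random bytes stand in for the first half of the signature that a real signer would compute from each fresh nonce. The function names and the use of HMAC-SHA256 are assumptions for illustration.

```python
import hashlib
import hmac
import secrets


def channel_bits(shared_key: bytes, r_bytes: bytes, n_bits: int) -> int:
    """Low n_bits of HMAC(shared_key, r): what signer and verifier both compute."""
    digest = hmac.new(shared_key, r_bytes, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << n_bits) - 1)


def grind(shared_key: bytes, payload: int, n_bits: int):
    """Retry until the HMAC of the candidate r ends in the payload bits."""
    attempts = 0
    while True:
        attempts += 1
        # STAND-IN: a real signer would pick a nonce k and derive r from k*G.
        r_bytes = secrets.token_bytes(32)
        if channel_bits(shared_key, r_bytes, n_bits) == payload:
            return r_bytes, attempts


key = secrets.token_bytes(32)              # secret shared with the verifier
r_bytes, attempts = grind(key, 0b1011, n_bits=4)
recovered = channel_bits(key, r_bytes, n_bits=4)   # verifier side
```

The verifier needs only the shared key and the first half of the signature; nothing about the signature's validity changes, so a party without the key sees an ordinary signature.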

Compared to PSS this trial-and-error approach is very inefficient. It does not operate in constant time. Instead we check random nonces until a predicate is true, with the probability decreasing exponentially in the amount of information being conveyed. Even signaling a single bit of information (was the duress PIN invoked?) will require 2 tries on average. That means signature times have effectively doubled on average and could get a lot worse if there is an unlucky streak of nonces failing our predicate. (Recall that the most expensive part of an ECDSA computation is the point-multiplication of the random nonce with the generator point of the curve. So we are repeating the one step that accounts for the majority of CPU cycles.) One approach is to avoid starting from scratch with a new nonce, and instead build incrementally on the previous result. For example we can repeatedly multiply the current point by 2 or add the generator point until the predicate reports true. Such incremental changes are much cheaper than doing an entire multiplication from scratch. On the other hand, these short-cuts reduce the entropy of the nonce which is critical for the security of ECDSA. Even small information leaks about a nonce aggregated over many signatures can be leveraged for recovering the private key.

There is another way to convey information with ECDSA signatures owing to their malleability property. Specifically if <r, s> is a valid ECDSA signature on a given message, so is <r, -s> where the “negative” value is taken modulo curve order. This looks promising as a special-case communication channel for exactly 1 bit: output either <r, +s> or <r, -s> depending on the least-significant bit of HMAC output and the true/false value we intend to convey.
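One way to make this concrete is sketched below, operating on the integer pair only (no signing or verification). It assumes the NIST P-256 curve order, treats “+s” and “-s” as the low and high halves of the range, and XORs that with an HMAC-derived mask bit; these conventions and all function names are assumptions, since the text does not fix them.

```python
import hashlib
import hmac
import secrets

# Order of the NIST P-256 group; s and (order - s) are both valid for the same r.
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def mask_bit(shared_key: bytes, r: int) -> bool:
    """Least-significant bit of HMAC(shared_key, r)."""
    digest = hmac.new(shared_key, r.to_bytes(32, "big"), hashlib.sha256).digest()
    return bool(digest[-1] & 1)


def is_high(s: int) -> bool:
    return s > P256_ORDER // 2


def encode(shared_key: bytes, r: int, s: int, flag: bool):
    """Pick s or its negation so that (high/low) XOR mask equals flag."""
    want_high = flag ^ mask_bit(shared_key, r)
    if is_high(s) != want_high:
        s = P256_ORDER - s
    return r, s


def decode(shared_key: bytes, r: int, s: int) -> bool:
    return is_high(s) ^ mask_bit(shared_key, r)


key = secrets.token_bytes(32)
r = secrets.randbelow(P256_ORDER - 1) + 1   # placeholder values, not a real signature
s = secrets.randbelow(P256_ORDER - 1) + 1
r_out, s_out = encode(key, r, s, flag=True)
signalled = decode(key, r_out, s_out)
```

The mask keeps the high/low choice from correlating visibly with the flag for anyone lacking the key, but negating `s_out` still flips what the verifier decodes.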

Minor problem: an adversary can easily disrupt this channel. After the card releases a signature, the adversary is free to tamper with the second half without invalidating it. This makes the channel unreliable. Assuming a perfect implementation without side-channel leaks, the adversary will have no way to know for certain whether a duress PIN has been used. But if they suspect so, they can tweak the signature and send it with the opposite sign to disrupt the signal. (Of course, if the card-holder had supplied their true PIN, the adversary will have raised the alarm on themselves by manipulating it.) No such games are possible with PSS: any modification to the signature output from the card will invalidate it. An adversary can always ask the card for another signature on the same message, but that does not help. As long as the duress PIN is being used, the card will continue to output more valid signatures tainted in exactly the same undetectable manner.

Determinism is in the eye of the beholder

The final type of key supported for SSH— EdDSA— makes for an interesting case. In principle EdDSA signatures are deterministic: signing the same message multiple times outputs the same signature. While there is still a unique nonce for each operation, this nonce is derived as a function of the message, guaranteeing determinism and reproducibility of results. Unlike ECDSA there is no freedom to leak information by playing games with the choice of random nonce.

The catch is that choice of nonce still looks random to external observers. They have no way to determine whether a blackbox signer (namely, the applet running on a smart-card) followed prescribed rules for computing the nonce or diverged from the protocol. (In fact such external verifiability is fundamentally incompatible with the security of EdDSA: if a verifier could predict what the nonce should be for a given message, they can recover the private key.) That creates some leeway for signaling a duress PIN. When a regular PIN is used, the applet follows the exact letter of the EdDSA specification. By contrast when a duress PIN is used, a different deterministic scheme is invoked. “Deterministic” being the operational keyword; otherwise the adversary can trivially detect that something is amiss by asking the card to sign the same message multiple times and observing different signatures. For that matter, if the adversary has ever witnessed an EdDSA signature on any message produced with the real PIN, they can detect duress PIN usage by asking for another signature on the same message and checking if results are identical.

It remains an open question how such a scheme can operate without side-channels (constant time and identical execution traces, regardless of which PIN is used) and without disclosing the private key. If we remove the latter requirement, there is a trivial solution. EdDSA uses a secret seed for deriving nonces from the message. Suppose the card application maintains two seeds, one private and one shared with the remote server. Ordinary PIN entry results in generation of nonces using the first one, while duress PIN entry switches to the latter. Since the server has a copy of the second seed, it can determine for any given signature which path was taken; the chance of a collision is negligible. A serious disadvantage to this scheme is that invoking the duress PIN also discloses the private key to the remote server. Recall that knowledge of the nonce used for a signature allows key recovery. As such it is only feasible for closed ecosystems where the disclosure of the private key has no adverse consequences beyond that one remote system.
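The two-seed idea can be sketched at the level of nonce derivation, following the Ed25519 convention of hashing a secret prefix together with the message and reducing modulo the group order. This is a simplification: a real server would recompute the curve point from its candidate nonce and compare it against the first half of the signature, whereas here the scalars are compared directly. The function names are hypothetical.

```python
import hashlib
import secrets

# Order of the Ed25519 base point.
ED25519_L = 2**252 + 27742317777372353535851937790883648493


def derive_nonce(seed_prefix: bytes, message: bytes) -> int:
    """Deterministic nonce: SHA-512(prefix || message) reduced mod the group order."""
    digest = hashlib.sha512(seed_prefix + message).digest()
    return int.from_bytes(digest, "little") % ED25519_L


private_prefix = secrets.token_bytes(32)   # known only to the card
shared_prefix = secrets.token_bytes(32)    # also provisioned on the server


def card_nonce(message: bytes, duress: bool) -> int:
    """Card side: same code path, different secret depending on the PIN used."""
    return derive_nonce(shared_prefix if duress else private_prefix, message)


def server_detects_duress(message: bytes, nonce_used: int) -> bool:
    """Server side (simplified): does the nonce match the shared-seed derivation?"""
    return nonce_used == derive_nonce(shared_prefix, message)


msg = b"ssh session challenge"
normal = server_detects_duress(msg, card_nonce(msg, duress=False))
alarm = server_detects_duress(msg, card_nonce(msg, duress=True))
```

Both paths stay deterministic per message, so repeated signing requests under the same PIN return identical results, but a server that can reproduce the nonce is also in a position to recover the signing key.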

CP