Digital River, Microsoft and code-signing failures

In the wake of the recent Adobe code-signing debacle, this is a good time to revisit other failure modes of code signing. Recently this blogger tried downloading an evaluation copy of Microsoft Office and noticed a strange warning dialog about the installer being signed by “Digital River.” (Granted, paying attention to such warnings– and declining to run the installer as a result– could make the author one of five people in the world who do.)

Who was Digital River, and why would software published by MSFT carry the signature of any corporate entity other than MSFT itself? From a pure authentication perspective, the situation looked indistinguishable from a man-in-the-middle attack: the code was downloaded straight from the official Office website, a page not served over SSL, so some nebulous attacker on the network could have observed the download request and clumsily substituted a Trojaned application, hoping the user would not notice the difference. Or perhaps the servers hosting the application had been breached and started distributing malware to unsuspecting users hoping for free copies of Office.

Cursory Googling revealed a more benign, mundane explanation that did not involve malfeasance: Digital River hosts the official online MSFT store, serving as the distribution channel for purchasing software via the direct-download model. (Top search results include a 2009 forum post from an irate customer titled “digital river does not deserve to be microsoft default agent.”) But the same search also turned up a disturbing presentation on the F-Secure website by Jarno Niemelä, dating back to 2010. The good news: it confirmed that Digital River does in fact handle software distribution for many publishers besides MSFT, and even digitally signs the applications on their behalf– that would explain the Authenticode dialog above. That alone does not make it safe to proceed past the dialog: all it means is that trust in the purported Office installer is only as good as the trust in all other software signed by DR. After all, any one of the other applications bearing an identical signature could have been substituted in its place, if the only criterion for establishing trust is that certificate. What else has DR signed? That is the really bad news: DR had been caught signing malware, as well as installers that were effectively open-ended: meta-installers designed to invoke other installers fetched from third-party URLs that DR had no control over.

Vouching for the integrity of applications one has no control over is at best extreme naivete, and at worst willful negligence under the guise of solving a problem for software publishers. There is no question that unsigned code is a user-experience problem: web browsers and anti-virus software present danger-Will-Robinson warnings when confronted with applications of unknown origin. That is by design. “Solving” that problem by having another company sign anything thrown its way undermines any security benefit of code authentication. These technologies are rooted in the principle that trust– or lack thereof– in software is derived from trust in the identity and brand of the publisher. When that identity is laundered by having some other entity such as Digital River put its own brand on the product without conducting due diligence, any semblance of accountability for the original author is removed.

Returning to the example that served as the jumping-off point for this post, Office derives its credibility from having been authored by MSFT– not by virtue of being distributed by Digital River. That same code carries exactly the same degree of trust regardless of its download location. In fact, being signed by Digital River subtracts from the credibility of the code, which is exactly the opposite of the intended effect. An up-and-coming software company with no brand recognition might benefit from using the service: after all, anything beats the unsigned-code warning. (But then again, Digital River is not exactly a household name either. The only reasons to prefer that over obtaining your own certificate could be the cost and difficulty of implementing code signing– just ask Adobe.) In the case of MSFT, there is a net loss of trust in the end product.
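When the signature inspires no confidence, one independent check available to a careful user is comparing the download against a digest the software author publishes out of band. A minimal sketch in Python– the function names and file paths are illustrative, not part of any real tool:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a downloaded installer in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published_digest(path: str, expected_hex: str) -> bool:
    """True when the local file matches the digest the publisher lists."""
    return sha256_of(path) == expected_hex.lower()
```

Unlike an Authenticode signature, a digest binds the download to one specific file, not to everything else ever signed with the same certificate.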

Luckily the Office 2013 preview is also available for download. It carries the expected MSFT signature.

CP

Bringing cloud identity to the PC (2/2)

With Windows 8 released to manufacturing and available for download from MSDN, this is a good time to complete the post on using cloud identity in a traditional PC operating system. As MSFT announced on the Building Windows blog almost a year ago, Windows 8 supports signing in with Windows Live ID, since rebranded as Microsoft ID. Instead of creating local accounts, users can now authenticate to a Windows 8 machine using their existing cloud account.

Of course such integration is far from novel; there are many examples of familiar consumer devices tightly integrated with a cloud authentication service, in some cases requiring users to authenticate with such an account just to set up the device in the first place:

  • iOS and its use of an Apple account on iPod/iPhone/iPad
  • Android and its integration with the Google accounts system. In fact Android has an extensible account-manager concept: it allows defining additional cloud identity providers by having installed applications act as account authenticators, which can be invoked by any other app. (Looked at another way, Android re-invented the SSPI model that Windows has supported since NT4, though never with quite the level of interchangeability its designers hoped for– no new ideas under the sun.)
  • More recently, Chrome OS and its similar integration with Google accounts

In all cases, this identity becomes an integral part of device functionality when accessing cloud-based features: for example, it is used to back up settings, migrate to a new device, download email and calendar entries, and make purchases in the respective app markets. This requires a level of integration between the OS and applications, such that after logging into the OS once, the user is automatically also logged into cloud services without having to explicitly type a password again. Without such automatic transfer of authentication state, the initial login would become pure window dressing that only grants access to local system resources. Luckily such seamless integration exists in Windows 8: after logging in, the mail application transparently downloads mail from Hotmail, SkyDrive can access saved files, Messenger can display presence information for contacts, and Internet Explorer can open web pages requiring Live ID as already authenticated. In fact, as long as the functionality is implemented as a standard SSP, it becomes available to third parties for creating apps that access user data stored in the MSFT cloud.

There are also differences. The first is that Windows supports local accounts, and the user may be upgrading a Windows 7 box– because nobody is running Vista– already configured with one. This introduces a requirement to retroactively associate an existing account with a cloud identity. Mobile devices started out with the assumption of cloud connectivity, and a clean slate to define their identity scheme. Second, the user experience is different: on mobile devices user authentication is rare for good reasons: phones have awful virtual keyboards that make typing plain English painful, much less a strong password containing a random mixture of symbols and digits. (While the Android screen-lock can be configured with a passphrase, this is logically not the same as the Google account password.) With Windows 8 and Chrome OS, even unlocking the screen locally can involve some type of authentication, making this ritual more frequent. That also creates a challenge in having to support offline mode: since the device may not have network connectivity at all times, it still has to authenticate the user’s cloud identity without the benefit of reaching the cloud.

Offline mode is not a new problem: similar issues existed for the bread-and-butter protocols Windows supported before (NTLM and Kerberos) and can be solved by locally caching password hashes, at the well-known risk of exposing those cached copies to brute-force attacks. But some credentials cannot be checked offline. An example is the one-time passcodes (OTP) used for Google 2-step verification: since these are meant to be dynamically generated each time, caching is not applicable and only Google knows what the next code in the sequence is. MSFT has a different concept called single-use codes for Live ID, which is not a secondary factor but a replacement for the password. It is unclear whether these still work for login in the connected state; they will likely not work in offline mode.
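The caching approach can be sketched as follows– a minimal illustration using PBKDF2, not the actual derivation Windows uses for cached domain credentials:

```python
import hashlib
import hmac
import os

def cache_credential(password: str, iterations: int = 100_000) -> dict:
    """At a successful online login, derive and store a slow hash of the
    cloud password so later logins can be verified offline."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return {"salt": salt, "iterations": iterations, "digest": digest}

def verify_offline(password: str, cached: dict) -> bool:
    """Check a login attempt against the cached verifier, no network needed."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), cached["salt"], cached["iterations"])
    return hmac.compare_digest(candidate, cached["digest"])
```

The slow, salted derivation is exactly what limits the brute-force risk mentioned above; note that nothing comparable is possible for an OTP, since the device has no way to predict the next code.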

Stepping back, such tight coupling between the OS and a particular cloud-identity provider also creates a natural “nudge” for users to favor cloud services authenticated by that identity, since the applications “just work” without additional setup. Consider the difference between having to sign in to a third-party email or instant-messaging service, versus going with the path of least resistance and using the built-in variant that is automatically signed in. Granted, most applications “solve” this problem with a strong bias for saving passwords (as well as annoying opt-out settings to launch automatically as soon as the user logs in). This may level the playing field for user experience at the expense of security: instead of refreshing credentials over time, they rely on a password or long-lived token to create the illusion of automatic sign-in. Of course in the case of Windows 8, those cached credentials are already at the mercy of Live ID if the user enables one of the highly touted features: synchronization of saved passwords across multiple machines signed in with the same Live ID, similar to Chrome synchronizing website passwords.


CP

NFC in US passports– verifying the random ID

One final post to conclude the series on reading data from recent US passports with an Android phone. In this post, we will look at the way the “unique ID” or UID emitted by the chip varies each time the chip is brought into the presence of an RF field.

Every NFC tag has a unique identifier that is burnt in at the factory and constant throughout the lifetime of the hardware. Contrary to mistaken impressions, this identifier is intended for anti-collision– distinguishing multiple tags when they are all present in the same RF field– rather than for security applications such as authenticating the tag. Devices such as the ProxMark can forge an arbitrary UID. There are even off-the-shelf counterfeit MIFARE tags that allow overwriting the UID while preserving the desired form factor.

While the UID falls short of being a reliable way to authenticate a particular tag, it is still problematic for privacy because it constitutes a persistent identifier that can be used for tracking. Each time the tag is scanned, it emits a constant value that permits correlation with previous times the tag was scanned– and this is true regardless of the higher-level transaction. For example, even a blank, unused tag completely devoid of data emits a UID. (Introducing application-level protocols on top of the basic NFC transport can only make privacy worse: for example, the standard contactless payment protocols transmit stable identifiers such as the credit card number that are far less privacy-friendly, because unlike the UID they can be correlated with many existing databases.)

This is where random UIDs enter the picture. Instead of emitting the same UID, the hardware can be configured to generate a different one on each activation– that is, each time the tag is brought into the range of an RF field. A specific range of four-byte UIDs, those starting with the byte 0x08, is reserved to designate such random UIDs and disambiguate them from fixed UID assignments. The US passport is an example of hardware using this feature.
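A reader app can recognize this convention with a one-line check; a sketch in Python, using made-up UID values rather than ones read from a real passport:

```python
def is_random_uid(uid: bytes) -> bool:
    """True when the UID is a 4-byte random ID: per ISO/IEC 14443-3,
    a first byte of 0x08 marks a randomly generated UID."""
    return len(uid) == 4 and uid[0] == 0x08

# Hypothetical values from two scans of the same tag: different UIDs,
# both carrying the 0x08 marker that designates a random ID.
print(is_random_uid(bytes.fromhex("08a3f29c")))  # True
print(is_random_uid(bytes.fromhex("08517bd0")))  # True
```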

Going back to our NFC TagInfo application, scanning the same passport twice– removing it from the RF field of the phone between the two scans– shows the UID indeed changing between reads.

CP


Case study on the perils of identity federation

This forceful critique (to put it mildly) of OpenID from a website/business owner’s perspective highlights one of the main leaps of faith involved in federation: taking a dependency on a third party for the well-being of your own business.

There is a lot going on in the debacle described in the original post. Some of it could be attributed to “implementation issues,” the vague catch-all category that is the equivalent of “pilot error,” which we fall back on to explain away incidents without attributing a systemic cause: JanRain randomly changing APIs without proper communication, Google changing the identifier returned, inconsistency between user profiles returned by different OpenID providers, etc. These are not supposed to happen– better change tracking could have prevented some of the bone-headed mistakes involved. Instability of the OpenID standard and the general lack of interoperability among implementations is an unfortunate outcome of a highly politicized standards process, the result of reluctantly bringing avowed enemies to the negotiating table. (Inexplicably the US government has decided to throw its weight behind this already-hobbled standard, by empowering the National Institutes of Health to work on a pilot program for federal adoption.) But again this is business as usual in trying to forge consensus for Internet standards, and not intrinsic to OpenID in particular or interoperability in general.

At the same time there are deeper issues at play, and these are inimical to any identity federation scheme. To quote the metaphor used by the original author:

[…] of all the failure points in your business – you really don’t want the door to be locked while you stand behind the counter waiting for business. No, let me rephrase that: you don’t want the door jammed shut, completely unopenable while your customers wait outside – irate that you won’t let them in.

Put simply, when users log in to your website using a third-party identity provider (“IDP”), your business is at the mercy of that provider. If they experience a service outage, users cannot log in to your website either. If they decide to experiment with a brand-new user interface that confuses half their users, your website loses traffic.

Some of the risks can be mitigated contractually. For example, the IDP could commit to a particular service-level agreement, say an expected uptime of 99.99%. But no IDP in existence is willing to shoulder the burden of full liability for losses incurred at relying-party sites. Your website can make a compelling case that the inability to authenticate users for an hour resulted in the loss of a thousand dollars, going by historic traffic patterns. The most you are likely to get out of the IDP is profuse, heartfelt apologies and at best a refund for that month. The incentives are highly asymmetric.

One could argue that specialization and economies of scale will compensate for this: JanRain is presumably handling authentication for thousands of websites, so they are in a position to invest in high-reliability infrastructure and maintain a strong security posture. In principle then, they are less likely to experience an outage (compared to what each relying party is capable of), less likely to get breached in an embarrassing manner as Gawker recently did, and more likely to respond to security incidents quickly in the worst-case scenario. On the other hand, as the probability of catastrophic failure decreases, the damage potential from that failure goes way up. An outage or breach at JanRain impacts not just the author of that blog post, but every other business using the OpenID interop service. More importantly, this is not a linear function of the number of users: scale attracts scrutiny, both from white-hat researchers and from black-hats looking to capitalize on a lucrative target.

The above scenario only considered unintentional outages. What about cases where service is withheld on purpose? Presumably the IDP is getting paid by the site for their service. What happens when it is time to renew the contract? What if negotiations with the IDP go south and they decide to hold your users “hostage,” refusing to authenticate them to your site until you agree to a higher price? If users are known only by their external identity, it is going to be very difficult to reestablish the link. The article quoted above describes the escape hatch required: collecting email addresses from users, so they can be authenticated independently, presumably by verifying their email. Of course that obviates one of the arguments for OpenID, namely that individual websites no longer have to worry about the complexity and cost of operating their own authentication system. It turns out this is exactly what the original post concluded, changing the site to nudge new users towards its in-house authentication system instead of promoting OpenID.

CP

Identity as externality: Trustbearer, CAC, eID

TrustBearer has become the first public demonstration of an idea this blogger first described in a ThinkWeek paper in 2006: identity management systems create positive externalities. Once built for one purpose, they are often easily extended, adopted or co-opted for completely different objectives. This pattern predates the Web, PKI and even the development of modern computing systems. The classic example is the social security number. Originally introduced by FDR’s New Deal-era Social Security Administration for the purpose of administering benefits, it has become the de facto identifier for everything from credit rating agencies to some badly designed online banking websites; Fidelity originally used the SSN as “username” but later changed the system to allow choosing nicknames. Driver’s licenses were introduced to control who can drive vehicles on public roads. When laws introduced a minimum drinking age and imposed penalties for serving minors, bars found them the natural choice for deciding who gets to order drinks. (A bartender in Seattle once declined to serve this blogger due to an expired driver’s license.)

Not all of these extensions are necessarily good ideas. In particular the re-purposing of the social security number from a simple identifier into a credential– something that proves identity, never intended in the original design– created the current identity-theft mess. In another example, RFID tags are a primitive identity management system designed for tracking inventory; the tag identifies the object it is attached to. But when the tags are not deactivated after goods are sold to consumers, they can be repurposed for surveillance: each tag emits a constant identifier that can be scanned by anyone with the appropriate transmitter and receiver, allowing tracking of individuals in physical space.

Occasionally unofficial extensions to an identity system provide unexpected benefits. Typically there is a very large upfront investment in deploying a system, driven by a well-defined objective. But once the system is built, adding one more person who can use it, or one more website which uses that system for authentication, has a small marginal cost. Take for example the Common Access Card or CAC, soon to be replaced by the PIV. These are both PKI systems managed by the Department of Defense for the purpose of controlling access to systems with national-security implications. But once the PKI deployment is operational and individuals have been issued their cards and smart-card readers, those credentials can be used for purposes completely unrelated to the defense sector. Case in point: TrustBearer’s OpenID service accepts CAC/PIV cards for authentication to any OpenID-enabled relying site. DoD certainly did not design the system for employees to check their personal email accounts or write blog comments in their spare time. But given that the smart-cards were already out there in the hands of users, it was a no-brainer for TrustBearer to accept these credentials for strong authentication. Any other website could have done the same: called “SSL client authentication,” the underlying functionality has been supported by web browsers and web servers in some fashion since the late 1990s. The user interface may be clunky because it is rarely seen outside the enterprise context, but all it takes is tweaking some settings in IIS or Apache. The Department of Defense created a positive externality for all websites.
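For Apache with mod_ssl, the settings in question amount to a handful of directives; a minimal sketch, assuming the relevant root certificates (e.g. the DoD roots) have been collected into a local PEM bundle– the file paths here are placeholders:

```
SSLEngine on
SSLCertificateFile    /etc/ssl/certs/server.crt
SSLCertificateKeyFile /etc/ssl/private/server.key
# CA certificates whose issued client certificates will be accepted
SSLCACertificateFile  /etc/ssl/certs/trusted-client-roots.pem
# Refuse the connection unless the browser presents a valid client certificate
SSLVerifyClient require
SSLVerifyDepth  3
```

With this in place the browser prompts the user to pick a certificate– from a smart-card, if one is present– and the server sees the verified subject of that certificate.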

Design matters, of course: some technologies are far more amenable to being re-purposed this way. For example, Kerberos is inherently a closed system: adding another relying party requires coordinating with the people in charge. Public-key infrastructure is open by design: once a digital certificate is issued, people can use it to authenticate anywhere. There are still gotchas: revocation checking imposes costs on the identity provider (adding another relying party is not a free lunch when it is hammering the system with revocation checks), or it may not work at all for an entity “outside” the official scope. Some newer protocols such as OCSP stapling address that by making freshness proofs portable. More important is the question of acceptable-use policy: just because the cryptography works out does not mean that the official owner of the identity system will approve the creative re-purposing.

That brings us to the European eID deployments. These are national ID systems, with the cards containing PKI credentials. Here is one case where a PKI-based system funded by taxpayer money is built with the express intent that anyone can use it for authentication to their service. (This is what governments do after all– they generate externalities, much to the chagrin of libertarians.) Not surprisingly eID cards are also accepted by TrustBearer– specifically the Belgian eID. This is an even greater externality because there are bound to be many more of them in existence even today, and the number will only grow as other EU governments make progress on their deployments. On the other hand, precedent for using eID online is scarce, and chances are most users lack the required card readers and drivers, while CAC/PIV users already use their cards regularly in a professional context.

cemp

New York Times badly confused on identity management

Goodbye Passwords is that rare misstep from the otherwise consistently solid Digital Domain section in the Sunday NYT: confused, misinformed and way off base. Among the several muddled arguments, four stand out:

1. Equating OpenID to passwords.

“OpenID offers, at best, a little convenience, and ignores the security vulnerability inherent in the process of typing a password into someone else’s Web site.”

Minor factual error: the password is not being typed into a random website. It is supposed to be provided only to the website where the identity was originally created, not the website where it is being used. But the general difficulty of determining whether one has indeed landed at the authentic site instead of a fraudulent replica– especially when the user has been sent there by the “someone else’s Web site” in question– leads to the standard critique of OpenID as increasing phishing risks.

Major factual error: OpenID is a federation standard, not a new user-authentication approach. It does not mandate passwords or any other scheme for verifying identity. The OpenID 2.0 specification is loud and clear on this point:

“Methods of identifying authorized end users and obtaining approval to return an OpenID Authentication assertion are beyond the scope of this specification.”

That means the identity provider can choose to use good old-fashioned passwords, smart-cards, biometrics or experimental approaches such as reading tea leaves to authenticate the user; OpenID is silent on this. In fact one of the more hyped extensions to the protocol, added at the urging of MSFT which has been desperately trying to promote CardSpace, is a way of signaling to websites that the user authenticated with credentials resistant to phishing– Infocards in the original vision that carved out this niche, but more generally strong authentication mechanisms such as PKI-capable smart-cards.

2. Narrow definition of single sign-on:

OpenID promotes “Single Sign-On”: with it, logging on to one OpenID Web site with one password will grant entrance during that session to all Web sites that accept OpenID credentials.

In the most general sense, single sign-on refers to one identity being valid for accessing multiple systems. This is in contrast to the current state of affairs on the web: most websites have their own notions of user identities, requiring users to create a new account. Each account is valid at exactly one website and not recognized anywhere else. Single sign-on (“federation” using the fashionable term) is about merging these disconnected islands of identity such that the scope of an identity can extend beyond that one site.

A quick peek at the Wikipedia entry would have hinted that SSO is not tied to passwords. So it comes as a surprise that a Microsoft architect is quoted criticizing SSO. CardSpace is an instance of single sign-on: the vision calls for one identity held by the user’s machine to be usable for logging into any number of websites. Inside the enterprise, Active Directory is single sign-on because it allows the same credentials to be used for accessing everything from logging into a workstation with the three-finger salute to accessing email or HR systems.

3. Misconception that “information card” is a generic term of art in identity management. Information card, or Infocard– the original name for the technology before it was rebranded as CardSpace– is a particular proposal that defines specific formats and protocols for identity management. Writing about “the information cards” makes about as much sense as writing about “the Facebooks” and “the Googles.” Each is a specific incarnation of a general concept: a social networking site, a search engine and an identity management protocol.

4. No hint of the history of strong authentication or alternatives. A reader may walk away from this article with the impression that no realistic alternatives to passwords existed until CardSpace magically burst on the scene. Basic fact-checking would have unearthed some not entirely obscure facts: the concept of digital certificates dates back to the 1970s, leveraging the same brew of “hard to break cryptography” whose virtues are extolled in the article. Since the late 1990s, digital certificates have been standardized as X.509, a stable and widely implemented format. It would be a small jump from there to realize that the SSL protocol universally used for securing communications online has provisions for users to verify their identity with digital certificates, and that many large organizations, including the United States Department of Defense, have been depending on this capability for years.

This is not to say that the article makes no good points. OpenID is a major distraction and duplication of effort precisely because it is a mediocre reinvention of the wheel, ignoring all the investments made towards deploying PKI on the web compliments of SSL, and muddying the waters one more time just when there was a fighting chance that the industry might converge on a standard (SAML, far from perfect as it may be) as the underlying format for identity assertions. But it is a non-sequitur to argue that OpenID is doomed because of its dependence on passwords and inherent problems with single sign-on.

cemp

Cherry-picking identity providers in the open eco-system

Recap from a story developing last week:

  • MSFT announced that it was accepting OpenIDs for the new HealthVault service, a cloud-based solution for managing health records. But not just any OpenID: only accounts issued by TrustBearer and VeriSign are accepted. Both companies offer two-factor authentication with portable hardware tokens.
  • The blog ConnectID objected to the restriction, claiming that it violates the spirit of “open” in OpenID. Why is the user not free to choose any identity he/she prefers to use?
  • MSFT’s identity architect fired back, joined by another blogger, both arguing that cherry-picking identity providers is fair game.

Underlying this exchange is a misunderstanding: agreement on protocols is necessary but not sufficient for identity federation. Accepting an identity issued by another company is a risk-management decision– or viewed more broadly, a business decision. The mere fact that the aspiring ID provider has successfully implemented some protocol, is compliant with this or that standard, or runs the most popular software package for authentication is not enough.

Authentication is a security-critical function. Getting it wrong leaves any resource protected by that system vulnerable. And if something does break, it will always be the service provider’s problem downstream, even when they are provably not at fault. Suppose that HealthVault accepted identities from Keys-Are-Us, a hypothetical incompetent OpenID provider operating out of a basement. This is an external dependency; when Keys-Are-Us makes an assertion about the identity of the user, HealthVault will accept that assertion at face value and provide access to controlled resources such as health records. This is essentially betting on the ability of this shady outfit to properly run an identity management system. If Keys-Are-Us experiences a security breach, and health records are accessed by unauthorized persons as a result, MSFT is still on the hook. Yes, in principle it was not their fault: Keys-Are-Us made the error. But try getting that message across to the media and blogosphere pouncing on the incident as another indication of everything that is wrong with the Internet. More importantly, by agreeing to accept identities from Keys-Are-Us, HealthVault is implicated in the risk-management decision.

Case in point: HealthVault accepts Windows Live ID, the identity management service operated by MSFT. (Full disclosure: this blogger worked on WLID security in a former life.) Because both of these organizations roll up to the same corporate entity, HealthVault designers have visibility into, and more importantly influence over, the risks of accepting these identities. Similarly the VeriSign and TrustBearer systems are known quantities, and their reliance on hardware tokens makes it possible to gauge the security assurance level in a way that is not possible for a random OpenID provider.

cemp

LifeLock: the plot thickens

(Follow-up from earlier post)

The past few weeks brought more developments in the story of LifeLock, the company that promises identity-theft protection and challenges would-be criminals with the social security number of its CEO. The New York Times published an article on May 24th covering this story. The overall tone of the article is fairly negative on the value proposition of this service:

“…a fraud alert is more like a burglar alarm. And if the alert repeatedly fires off false alarms, forcing creditors to constantly double-check the identities of LifeLock customers who have never been victims of fraud, it is possible that those credit issuers will pay less attention to them. Experian is so worried about this, along with other issues, that it has filed suit against LifeLock.”

Strangely the company has found a new ally in Bruce Schneier, who came out swinging in defense of LifeLock. Schneier portrays the issue purely as a conflict of business models between the triumvirate of credit reporting bureaus (Equifax, Experian and TransUnion) and LifeLock. Credit reporting agencies prefer that the process of completing a credit check and clearing an applicant be easy. LifeLock’s mission in life is to make that process as difficult as possible for the lender, in order to reduce the risk that the application is fraudulent.

“The reason lenders don’t routinely verify your identity before issuing you credit is that it takes time, costs money and is one more hurdle between you and another credit card. (Buy, buy, buy — it’s the American way.) So in the eyes of credit bureaus, LifeLock’s customers are inferior goods; selling their data isn’t as valuable. LifeLock also opts its customers out of pre-approved credit card offers, further making them less valuable in the eyes of credit bureaus.”

And later in the same approving vein: (links in the original)

“It’s pretty ironic of the credit bureaus to attack LifeLock on its marketing practices, since they know all about profiting from the fear of identity theft. Facta also forced the credit bureaus to give Americans a free credit report once a year upon request. Through deceptive marketing techniques, they’ve turned this requirement into a multimillion-dollar business.”

One point where everyone is in agreement is that the services are not worth it from a purely financial point of view. Most of the actions taken on behalf of subscribers by the commercial services can also be taken by individuals directly, for free; convenience is the main selling point. For example, anyone can request to have a fraud alert placed on their own credit file, but these alerts expire after 90 days.

The original Wired article covering allegations that the service does not work appears to have been removed. Not to worry: Kim Zetter (full disclosure– she is a friend) writing on the ThreatLevel blog has missile lock on the company. In a series of posts, she highlighted an original piece from the Phoenix New Times that surfaced questionable past connections of the co-founder. LifeLock announced in response that he was resigning from the company.

cemp

Making sense of identity management statistics

There are lies, statistics and identity management figures.

Are there a quarter billion OpenIDs? That would be the conclusion suggested by an announcement on the OpenID website two months ago. How many of those users have actually used the OpenID protocol even once when authenticating anywhere? For that matter, what percentage even know what an OpenID is? This has been a major problem with any identity system that spans multiple sites. Users at this point have been trained to lower their expectations and come to terms with islands of disconnected identity: each username/password works on one website only. Any system where users can authenticate to more than one relying party is confronted with the challenge of explaining this to users. (For example: “If you have a Hotmail or Messenger account, then you have a .Net Passport.”)

Does having CardSpace bits on 50% of desktops represent a tipping point for the technology to magically take off? By this logic, passwords ought to be about as archaic as the vinyl record, because nearly 100% of desktops have supported TLS client authentication and smart cards since 2000. Even if we disregard Firefox and its PKCS#11-based interface and focus only on IE running on Windows, that is over 80% of all consumer PCs. Why isn’t everyone authenticating with digital certificates, as the PKI vendors have prophesied for the past decade?

cemp

Scraping, or how to weaken authentication systems

The current issue of Wired is running an article on “scraping,” or harvesting data from other online services. It tries to paint a balanced picture of why large providers including Craigslist have been highly ambivalent about the practice: welcoming the increased attention/relevance, but also agonizing over the increased load on the system, as well as lost revenue opportunities when the data is monetized by a free-loader. (In the case of Craigslist, the website that mined and reformatted listings was shut out because it featured Google AdSense, violating the prohibition against commercial use of the data.)

One point the article glossed over is the distinction between scraping public vs. private data. Many websites do not require any type of authentication prior to retrieving data. Craigslist is an example: posting a classified may require login, but viewing the listings does not. By contrast, scraping address-book contacts from an email provider such as Hotmail is not possible unless authorized by the user. The way Facebook and other invasive websites accomplish this is by asking the user for their credentials and then logging in as that user behind the scenes to access personal data.

This is a very bad idea for many reasons, explained elsewhere as well, all of which boil down to the observation that sharing a credential with a 3rd party weakens the identity management system. Hotmail passwords (more precisely, Windows Live ID passwords, since that is the single sign-on solution used by MSFT properties) are intended only for WLID and the user. Having any other entity in possession of this information is nothing more than unnecessary attack surface. To pick on the Facebook example used in the article: did Facebook delete that credential after importing the user’s contacts from Live Mail/Yahoo/GMail etc.? Or did it save a copy for future scraping excursions? Did it make a well-intentioned attempt to delete it, but instead end up writing it to log files replicated around the world, visible for any employee to see?

There is no way to know, and that is the problem. In defense of Facebook, part of the problem is that the protocols required to “do the right thing” for security did not exist until recently. Importing contacts is an authorization problem: grant Facebook access to data stored about the user by a 3rd party such as Yahoo. There is a deceptively simple solution: give Facebook the password and it can “become” the user, accessing any information it needs. As well as information it did not need: contents of email messages, RSS feeds on the Live homepage, roaming favorites, Xbox Live account, travel itineraries at Expedia and, in the future, even personal files stored in the cloud. And it need not stop at simply importing information: it can also delete contacts, spam your friends with advertisements that appear to originate from you, or post enthusiastic, ghost-written endorsements of Facebook to your Spaces blog. The damage potential is open-ended by virtue of Passport/Live ID being a multi-site authentication system, making it the worst-case scenario if Facebook proves malicious or, more likely, incompetent, in keeping with Robert Heinlein’s principle. There is no reason to suspect Facebook is doing any of this, but there is no way to know either. Most online services do not expose transaction history to users; it’s not possible to check whether another entity capable of acting as your Doppelganger has been rummaging around your personal data.

In other words, sharing the password violates the principle of least privilege: it may solve the immediate problem, but it grants the 3rd party unchecked authority greatly exceeding what was justifiable. This confusion around authentication vs. authorization is everywhere. In order to authorize access, it is not necessary for the other party to be able to authenticate as you. (That is the end result of sharing the password, but also of other schemes such as constrained delegation, where a more limited type of impersonation occurs without the password getting shared.) OAuth is a new protocol designed to address this problem. It’s built around the idea of one service asking for permission from a user to access his/her data stored by another service. The data custodian remains responsible for the permissions and the UI for granting/revoking them, and the requesting site authenticates as itself instead of “cloaking” itself in user credentials. It remains to be seen whether OAuth will succeed in replacing proprietary solutions along the same lines.
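The difference between the two models can be made concrete with a toy simulation. All the names below (`MailProvider`, the `contacts:read` scope, the sample data) are invented for illustration; real OAuth adds request signing, nonces and a token-exchange dance, but the least-privilege contrast is the same:

```python
import secrets

class MailProvider:
    """Toy email provider contrasting password sharing with scoped tokens."""

    def __init__(self):
        self._password = "hunter2"
        self._contacts = ["bob@example.com", "carol@example.com"]
        self._inbox = ["private message"]
        self._tokens = {}  # token -> set of granted scopes

    # --- Anti-pattern: anyone holding the password "becomes" the user. ---
    def _check_password(self, password):
        if password != self._password:
            raise PermissionError("bad password")

    def read_contacts_with_password(self, password):
        self._check_password(password)
        return self._contacts

    def read_inbox_with_password(self, password):
        self._check_password(password)  # nothing limits this to contacts
        return self._inbox

    # --- OAuth-style: the user grants a token limited to named scopes. ---
    def grant(self, scopes):
        token = secrets.token_hex(8)
        self._tokens[token] = set(scopes)
        return token

    def read_contacts_with_token(self, token):
        if "contacts:read" not in self._tokens.get(token, ()):
            raise PermissionError("token lacks contacts:read")
        return self._contacts

    def read_inbox_with_token(self, token):
        if "inbox:read" not in self._tokens.get(token, ()):
            raise PermissionError("token lacks inbox:read")
        return self._inbox

provider = MailProvider()

# With the password, the importer can read contacts -- and everything else.
assert provider.read_contacts_with_password("hunter2")
assert provider.read_inbox_with_password("hunter2")

# With a scoped token, the importer gets contacts and nothing more, and the
# user can later revoke the grant without changing their password.
token = provider.grant(["contacts:read"])
assert provider.read_contacts_with_token(token)
try:
    provider.read_inbox_with_token(token)
except PermissionError:
    pass  # least privilege enforced
```

The scoped token is what the data custodian can audit and revoke; the shared password is what ends up in someone else’s log files.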

cemp