Scraping, or how to weaken authentication systems

The current issue of Wired is running an article on “scraping,” or recovering data from other online services. It tries to paint a balanced picture of why large providers including Craigslist have been highly ambivalent about the practice: welcoming the increased attention and relevance, but also agonizing over the added load on their systems and the lost revenue when the data is monetized by a free-loader. (In the case of Craigslist, the website that mined and reformatted listings was shut out because it featured Google AdSense, violating the prohibition against commercial use of the data.)

One point the article glossed over is the distinction between scraping public vs. private data. Many websites do not require any type of authentication prior to retrieving data. Craigslist is an example: posting a classified may require login but viewing the listings does not. By contrast, scraping address-book contacts from an email provider such as Hotmail is not possible unless authorized by the user. The way Facebook and other invasive websites accomplish this is by asking the user for their credentials and then logging in as that user behind the scenes to access personal data.

This is a very bad idea for reasons explained elsewhere as well, all of which boil down to the observation that sharing a credential with a 3rd party weakens the identity management system. Hotmail passwords (more precisely, Windows Live ID, since that is the single sign-on solution used by MSFT properties) are intended to be known only to WLID and the user. Having any other entity in possession of this information adds nothing but unnecessary attack surface. To pick on the Facebook example used in the article: did Facebook delete that credential after importing the user’s contacts from Live Mail/Yahoo/GMail etc.? Or did it save a copy for future scraping excursions? Did it make a well-intentioned attempt to delete it but end up writing it to log files replicated around the world, visible to any employee?

There is no way to know, and that is the problem. In defense of Facebook, part of the problem is that the protocols required to “do the right thing” for security did not exist until recently. Importing contacts is an authorization problem: grant Facebook access to data stored about the user by a 3rd party such as Yahoo. There is a deceptively simple solution: give Facebook the password and it can “become” the user, accessing any information it needs. As well as information it did not need: contents of email messages, RSS feeds on the Live homepage, roaming favorites, the XBox Live account, travel itineraries at Expedia and, in the future, even personal files stored in the cloud. And it need not stop at importing information: it can also delete contacts, spam your friends with advertisements that appear to originate from you, or post enthusiastic, ghost-written endorsements of Facebook to your Spaces blog. The damage potential is open-ended by virtue of Passport/Live ID being a multi-site authentication system, making this the worst-case scenario if Facebook proves malicious or, more likely, incompetent, in keeping with Robert Heinlein’s principle. There is no reason to suspect Facebook is doing any of this, but there is no way to know either. Most online services do not expose transaction history to users; it is not possible to check whether another entity capable of acting as your Doppelganger has been rummaging around your personal data.

In other words, sharing the password violates the principle of least privilege: it may solve the immediate problem, but it grants the 3rd party unchecked authority greatly exceeding what is justifiable. This confusion around authentication vs. authorization is everywhere. In order to authorize access, it is not necessary for the other party to be able to authenticate as you. (That is the end result of sharing the password, but also of other schemes such as constrained delegation, where a more limited form of impersonation occurs without the password being shared.) OAuth is a new protocol designed to address this problem. It is built around the idea of one service asking a user for permission to access his/her data stored by another service. The data custodian remains responsible for the permissions and the UI for granting and revoking them, and the requesting site authenticates as itself instead of “cloaking” itself in user credentials. It remains to be seen whether OAuth will succeed in replacing proprietary solutions along the same lines.
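
To make the contrast concrete, here is a minimal sketch of the three-legged flow an OAuth-style protocol would use for the contact-import scenario. The endpoint URLs, parameter names and the omission of request signing are simplifications for illustration, not any provider’s documented API.

    import requests

    PROVIDER = "https://contacts.example.com"   # hypothetical data custodian, e.g. a mail provider
    CONSUMER_KEY = "contact-importer"           # identifies the requesting site, never the user

    # 1. The consumer obtains a temporary request token, authenticating as itself.
    request_token = requests.post(f"{PROVIDER}/oauth/request_token",
                                  data={"oauth_consumer_key": CONSUMER_KEY}).json()

    # 2. The user is sent to the provider, signs in there (the password never leaves
    #    the provider) and approves one specific permission, e.g. "read contacts".
    print("Send user to:", f"{PROVIDER}/oauth/authorize?token={request_token['token']}")

    # 3. After approval the request token is exchanged for an access token scoped to
    #    that permission, which the user or the custodian can revoke at any time.
    access_token = requests.post(f"{PROVIDER}/oauth/access_token",
                                 data={"oauth_consumer_key": CONSUMER_KEY,
                                       "oauth_token": request_token["token"]}).json()

    # 4. Data access uses the scoped token, not the user's credential.
    contacts = requests.get(f"{PROVIDER}/contacts",
                            headers={"Authorization": f"OAuth {access_token['token']}"}).json()

The difference from password-sharing is step 2: the credential is only ever presented to the party that issued it, and the token handed back is limited to the contacts list and revocable.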

cemp

2007: Is the tide turning for green technologies?

A collection of disparate and unrelated headlines:

  • The House approves an energy bill boosting fuel-economy standards to 35 MPG by 2020, over loud protests from domestic and foreign manufacturers. Unthinkable until a few years ago, no other action by the legislature could have sent a stronger signal to Detroit that its influence and lobbying power are waning and its days are numbered.
  • Google unveils the cryptically named “RE<C” initiative, which stands for renewable energy cheaper than coal: in other words, making clean energy sources competitive with the cheapest and ecologically worst option, the abundant coal deposits that currently supply 50% of US electricity. This is the first time a company with significant resources has gone beyond the standard operating procedure of hand-wringing over the economic incentives favoring coal.
  • Formula 1 decides to go green, announcing a ban on further engine development in order to focus on realizing higher efficiencies. One example according to the Wired article: kinetic-energy recovery systems, which improve on the regenerative brakes found on today’s hybrids, are expected to appear in 2009. F1 racing may sound remote from everyday concerns, but its trickle-down effect is responsible for the ubiquity of antilock brakes and traction control, as well as more exotic options like clutchless manual transmissions.
  • Ferrari announces that the company will improve fuel economy 40% across its lineup. It is largely symbolic: while the cars routinely rank among the worst offenders on EPA lists every year, there are very few of them on the road and, as Ecogeek points out, they likely do not rack up many miles. Total savings will be negligible. But the fact that a company operating in a unique niche market with a captive audience, completely immune to mainstream trends, is still pursuing a greener image speaks volumes.
  • Living With Ed, a reality show focused on an ecologically minded actor and his more pragmatically inclined spouse, debuts on HGTV.
  • By a single-vote margin the Supreme Court decides that the EPA can in fact regulate carbon emissions from automobiles, sweeping aside “creative” interpretations of existing law as guaranteeing an inalienable right to pollute.
  • There are signs that price elasticity may exist after all when it comes to fuel: CNN/Money reports that drivers are cutting back as gas hovers around the magical $3 level. More mysteriously, gasoline prices at the pump are not keeping up with the stratospheric rise of light sweet crude by the barrel. In all previous price hikes, refiners were quick to dismiss allegations of price-gouging by arguing that the price at the station directly follows the underlying commodity price. Oil briefly hit three digits a barrel, yet gas prices barely moved because demand is soft; nobody is complaining or asking why they are not paying more. Still, it is too early to declare the end of the SUV era: as a former colleague pointed out, the suppressed demand may be temporary fallout from the credit crunch rather than a renewed price-sensitivity.

On the downside:

  • The climate-change meeting in Bali ends on a not-entirely-negative note, an improvement over the last time, when the United States threw a wrench into the Kyoto agreement by rejecting its provisions after first joining as a signatory. The resulting agreement has no teeth: a binding commitment for developed countries to cut emissions was dropped in favor of wishy-washy language about good intentions, best efforts, sunshine and clear skies.

cemp

Reflections on Comcast vs. SlingMedia

(Context: follow-up on an earlier post about Comcast traffic-shaping on upstream bandwidth and its impact on using a Slingbox for space-shifting.) Re-distributing content is a controversial subject in this day and age of copyright thugs, DMCA cease-and-desist letters and trigger-happy boutique litigation firms. The SlingBox incorporates some design features (read: reduced functionality) to mitigate legal risk. Only one person can stream content at a time, preventing rampant sharing of cable content. There is still the possibility of extending functionality to other users when accepting analog inputs, because one user can stream a channel to a computer while another person in front of the TV watches a different channel. The versions intended for digital cable and DVRs also allow simultaneous watching, but the remote and local user would have to fight over the channel. Of course there is the possibility of using a screen-sharing solution along the lines of VNC to “split” the video stream virtually; realistically, the bandwidth available to most users is not enough to generate a usable picture that way.

The parent company has managed to stay out of court thus far. News reports suggest that SlingMedia has been on the content industry’s radar from the start. In 2006 the company pre-emptively sponsored the EFF Pioneer Awards ceremony at the Computers, Freedom & Privacy conference in Washington, DC, a departure from the standard practice of start-ups ignoring digital-activism groups until they are staring at expensive litigation. SlingMedia’s recent acquisition by the satellite TV network EchoStar may have lent the firm newfound legitimacy, and perhaps additional resentment from cable operators.

From the point of view of cable and satellite providers, the Slingbox is a disruptive technology, representing potential loss of revenue because of its ability to space-shift content. Today that risk is minimal. Only customers with multiple residences or who travel frequently benefit from maintaining a single subscription and “slinging” the content over while on the road. Even that assumes they would otherwise be paying for the content twice; hotels often include an extensive channel lineup, and there is always the option of watching the big game at the neighborhood bar. Picture quality is often sub-par even for non-HD content. The viewing experience on a laptop or PC may not be acceptable to customers used to large-screen TVs, and hooking the laptop up to a television is not always an option. On the other hand, the commercial success of video on tiny devices like the iPod suggests that hurdle may be easier to clear.

When extrapolated to a higher-bandwidth future, the biggest disruption is that content becomes a commodity that can be moved around, breaking the assumption that one household equals one subscription. Granted, it is unlikely that the complex arrangements required to exploit these efficiencies will develop, e.g. 50 people sharing 30 subscriptions on the assumption that fewer than 60% are watching at any time, a FlexCar model for cable. But space-shifting already breaks a number of lucrative practices such as discriminatory pricing by market: charging more for the same cable package in one area because customers in that zip code are willing to pay more. Similarly, blackouts on Internet programming, such as the MLB.TV restrictions on local-market games, are not sustainable given the option of streaming the TV broadcast from a remote location.

Comcast, as a provider of TV, Internet and phone services, has two incentives for interfering with the Slingbox:

  • Upholding service quality for other subscribers. Streaming consumes significant upstream traffic, and the provider typically cannot sustain the advertised bandwidth for all users at all times. In this regard they are similar to banks: any one customer can empty their account, but every customer trying to do so at once would spell trouble. For this argument to hold, however, the traffic shaping must be applied indiscriminately to all upstream bandwidth regardless of protocol (a toy sketch of such an indiscriminate shaper follows this list). If YouTube uploads are exempt or get a higher quota, the argument breaks down: the network does not care whether congestion is caused by an Ingmar Bergman masterpiece or the kid next door.
  • Blocking reuse of the subscription at another location. The business case for this is unclear. The signal has already been paid for, and the remote location may not even be in a Comcast service area. Preventing the recipient from getting decent video quality would not necessarily bring in one more cable subscriber. The rational behavior is to block inbound Slingbox traffic for Comcast Internet customers, because that customer clearly can purchase the same content. But Comcast gets no direct benefit from restricting a video stream going to a Time-Warner customer, and the alternative, a gentlemen’s agreement between ISPs to block streaming into each other’s networks, would likely be considered illegal collusion.
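
For reference, the congestion-management rationale in the first bullet corresponds to something like the toy token-bucket shaper below: a per-subscriber cap on upstream bytes per second that is blind to what those bytes carry. The rate and burst numbers are made up for illustration.

    import time

    class TokenBucket:
        """Toy per-subscriber rate limiter: caps upstream bytes/sec regardless of protocol."""

        def __init__(self, rate_bytes_per_sec, burst_bytes):
            self.rate = rate_bytes_per_sec
            self.capacity = burst_bytes
            self.tokens = burst_bytes
            self.last = time.monotonic()

        def allow(self, packet_bytes):
            now = time.monotonic()
            # Refill tokens in proportion to elapsed time, up to the burst capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_bytes:
                self.tokens -= packet_bytes
                return True       # forward the packet
            return False          # delay or drop it; the decision never looks at the payload

    # Hypothetical numbers: roughly 350 kbps sustained with a modest burst allowance.
    shaper = TokenBucket(rate_bytes_per_sec=350_000 / 8, burst_bytes=64_000)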

cemp

Is Comcast throttling all upstream bandwidth?

Comcast Inc. may have cast a much wider net in its effort to bring customers in line, er, increase subscriber value. The Slingbox is set to become the latest example of collateral damage in the war against user content.

Quick recap: Slingbox is the generic name for a family of special-purpose devices that can stream TV content for remote viewing. In the same way that VCRs and DVRs allow time-shifting, watching a live broadcast at a different time, the Slingbox allows “space-shifting”: watching content at the same time from a different place than the physical location of the cable connection or satellite dish. The SlingPlayer application, available on Windows, Mac and smart-phones, allows connecting to the device from any Internet connection and streaming almost the same video and sound that one would see watching television in the comfort of the living room. “Almost” is the operative keyword, because video quality, or how closely the streamed content approximates the original, depends crucially on available bandwidth. That includes both the upstream bandwidth available on the connection where the SlingBox is located and the downstream bandwidth at the remote location where the traveling customer is trying to tune in to his local TV station. As noted earlier here, downstream bandwidth is usually abundant while upstream bandwidth is the scarce commodity and the expected bottleneck for scenarios involving streaming from home. The SlingBox FAQ notes that about 250-300 kbps is the minimum recommended bandwidth. That turns out to be an understatement on par with Vista minimum hardware requirements: in this blogger’s experience ~500 kbps is required to avoid compression artifacts, and closer to 800 kbps is called for when the signal is intended for display on a TV at standard watching distances instead of a tiny window on a laptop screen.

This is where the Comcast story comes in, only weeks after the company finally admitted to interfering with the operation of the BitTorrent protocol. Recent experiments with streaming content from a Slingbox attached to a residential Comcast broadband line suggest that the traffic-shaping may extend beyond peer-to-peer alone. SlingPlayer uses a sophisticated, adaptive algorithm to optimize image quality for the maximum available bandwidth on any given connection. It starts out by streaming a few frames at low quality, successively increasing the transmission rate until the channel is close to saturation or the client cannot keep up with the decompression.
When streaming over the wireless home network where the Slingbox is located, bandwidth peaks at 2-3 Mbps and the image quality is very good. In a more representative scenario, during 2006 a SlingBox A/V routinely delivered cable content from Florida, behind a Cox 9.0/1.5 Mbps broadband connection, hitting anywhere between 700-800 kbps sustained, good enough to watch on a 32″ TV. (Ironically, the downstream side of that connection in Chicago was Comcast.)
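
For the curious, the ramp-up behavior just described boils down to an additive-increase probe along the lines of the toy sketch below. This is an illustration of the general idea, not SlingMedia’s actual algorithm; the thresholds and step sizes are invented.

    def ramp_bitrate(measure_throughput_kbps, start_kbps=150, step_kbps=50, ceiling_kbps=3000):
        """Raise the target rate while the channel (and decoder) keep up, then settle."""
        target = start_kbps
        while target < ceiling_kbps:
            achieved = measure_throughput_kbps(target)   # stream briefly at `target`, report actual rate
            if achieved < 0.9 * target:                  # link or client can no longer keep up
                return achieved                          # settle just below the saturation point
            target += step_kbps                          # otherwise probe a little higher
        return ceiling_kbps

    # Example against a fake link that tops out near 700 kbps:
    print(ramp_bitrate(lambda target: min(target, 700)))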

It turns out Comcast is happier to go along with receiving content than serving it. Below are pictures of bandwidth usage when streaming from a SlingBox Solo on a Comcast 9.0/1.5Mbps connection in Philadelphia.

[Update: added second trace using perfmon– Jan 9, 2008]
As expected, the connection rate shows the initial gradual climb to roughly ~700 kbps. But after two minutes something very strange happens: it drops precipitously, shedding 50% of the bandwidth in a matter of seconds, and flat-lines at around 350 kbps.
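
For anyone wanting to reproduce the trace, the measurement is nothing fancy: sample the machine’s byte counters once a second while SlingPlayer streams. The sketch below uses the psutil package and assumes the stream arrives on an interface named "eth0"; perfmon on Windows exposes equivalent counters.

    import time
    import psutil

    IFACE = "eth0"   # adjust to the interface carrying the SlingPlayer stream
    prev = psutil.net_io_counters(pernic=True)[IFACE].bytes_recv
    while True:
        time.sleep(1)
        cur = psutil.net_io_counters(pernic=True)[IFACE].bytes_recv
        kbps = (cur - prev) * 8 / 1000
        print(f"{time.strftime('%H:%M:%S')}  {kbps:7.0f} kbps")
        prev = cur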

  • These results can be reproduced consistently, at different times of day, from a wide array of streaming locations: broadband at home in NYC, a corporate LAN in Silicon Valley, free hotel networks in San Francisco, even a 3G wireless modem. Without exception all of them exhibit the same jagged initial climb followed by a sharp drop and flat-line.
  • The flat-line is very suspicious: “organic” network traffic is subject to random perturbations due to effects of congestion along the way.
  • We can rule out the client side as the source of the problem, because the issue reproduces regardless of how the viewing side is connected to the Internet. It is unlikely to be a bug in SlingPlayer or a bad interaction with a particular operating system’s networking implementation, because it reproduces on the Windows, OS X and Mobile versions. Even allowing for the possibility that all of the cross-platform variants share the same code base, and are therefore susceptible to the same bug, there is the mysterious fact that this “bug” never occurs when SlingPlayer connects over the home network, never crossing any Comcast-controlled segment, where it easily hits multiple Mbps.
  • Disconnecting from the Slingbox and immediately reconnecting restores the initial spike of high bandwidth, so this is not a transient congestion issue either. That spike then follows the same pattern, eventually dropping off to a flat-line.
  • At this point the most plausible explanation is that Comcast has engaged in widespread traffic-shaping which downgrades available upstream bandwidth to a fraction of the stated value and, in particular, interferes with the operation of the Slingbox.

cemp

Windows randomness: the sky is not falling yet

A recent paper at CCS reported problems in the Windows 2000 random-number generator. The story made it to Slashdot and was later amplified in the blogosphere after MSFT confirmed that the same problem applied to XP. One lone voice of reason on Slashdot tried in vain to clear the air, while speculation continued on whether the entire edifice of Windows cryptography had been undermined. MSFT itself did not help the case by taking issue with the definition of “vulnerability” while still announcing a change to the functionality in XP SP3.

This blogger’s two cents worth of observations on the subject:

  • The most glaring problem with the paper is an unrealistic threat model. The attack requires complete access to the internal state of the random-number generator. In a typical setting the adversary can observe the output of a PRNG but cannot peek inside the black box to see what is going on. As such this work is closer in spirit to the side-channel attacks against OpenSSL or the x86 shared-cache problem, both of which presuppose that the adversary has additional visibility into the operation of the system.
  • In this case the authors assume a very powerful adversary: one who has exploited a remote-code-execution vulnerability to gain complete control of the application. (“Buffer overrun” is used as a proxy for this in the paper, although fewer of these vulnerabilities are exploitable for code execution owing to the proliferation of compiler and OS mitigations.) The problem is, once the attacker is running code with the privileges of the application using the PRNG, she has complete control and many options. There may be no reason to attack the PRNG at that point: she can directly read any keys lying around in memory, access plaintext encrypted or decrypted with those keys, and so on. This is equivalent to observing that once a burglar has broken into a house, the fact that the owner did not shred all the documents may be quite irrelevant if the same information can be obtained elsewhere in the residence.
  • Once the internal state of a PRNG is known, predicting future output is trivial until the generator is rekeyed or supplied with fresh entropy from a pool; no PRNG is secure against this. So the incremental risk presented by the attack applies only to the following scenario: a system is 0wned after it has generated, used and discarded key material using the PRNG, but before the PRNG state has been reinitialized. In that window the captured PRNG state allows recovering a key that would otherwise have been unreachable (the “forward security” property; a toy generator illustrating it appears after this list). Any earlier and the attack is irrelevant, because the keys generated with the PRNG are still in memory and can be read directly, without having to rewind PRNG state. Any later and the old PRNG state is lost irreversibly. That is a narrow window of opportunity, on top of successfully exploiting a remote-code-execution vulnerability.
  • A similar lack of perspective around system security continues into the discussion of isolation boundaries. There is an extended discussion of the benefits of kernel vs. user mode as if that were a meaningful security boundary. Code running as administrator can trivially obtain kernel privileges in all versions of Windows prior to Vista (by programmatically loading a device driver) and read the same PRNG state from the kernel. Conversely, a PRNG can run in user mode but in a different process such as lsass, which is also how key isolation works in Vista for private keys. In fact the user/kernel distinction does not even hold in Linux: root can directly read kernel memory.
  • For this reason, having separate processes each run their own PRNG can be good for security, contrary to the argument in the paper: compromising the state of one does not reveal information about any other process. For example, exploiting a buffer overrun in the IIS service does not reveal the PRNG state of the process that handles SSL negotiation, which is, surprisingly, not IIS. This is consistent with the isolation between accounts provided by the OS.
  • There is an estimate that 600 SSL handshakes are needed to refresh the client’s PRNG state, and a cavalier assertion that this number is unlikely to be reached in practice. In fact, for SSL servers under load (the highest-risk case) 600 connections are easily cycled within a matter of minutes. As for clients, a quick peek at SSL usage on the web shows that most large services do not use SSL session resumption, because servers are load-balanced and a client could end up at any one of hundreds of identical machines. So even logging into email once over SSL involves dozens of SSL handshakes from scratch, one for every object accessed (including images and other non-essential data embedded on the page), each exercising the PRNG.
  • The authors reverse-engineered W2K code and yet keep referring to it as the “world’s most popular RNG” even after citing the statistic that its market share is in the single digits. Due diligence would have suggested looking at XP, Server 2003 and Vista before making such claims. Vista in particular has two completely independent cryptography implementations: CAPI, which has existed since the earliest versions of NT, and Crypto Next Generation (CNG), new in Vista. Not only do they not share code for the underlying primitives, but even their interfaces are incompatible. In the end W2K3 and Vista proved not to be vulnerable.
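
To make the forward-security point concrete, here is a toy hash-ratcheting generator, emphatically not the Windows implementation. Capturing its state lets an attacker predict all future output until fresh entropy is mixed in, but because the state update is one-way, past output cannot be recovered; the W2K attack mattered precisely because its state update could be run backwards.

    import hashlib
    import os

    class RatchetPRNG:
        """Toy generator: one-way state update gives forward security, reseeding restores secrecy."""

        def __init__(self, seed=None):
            self.state = seed or os.urandom(32)

        def next_block(self):
            out = hashlib.sha256(b"out" + self.state).digest()              # value handed to callers
            self.state = hashlib.sha256(b"ratchet" + self.state).digest()   # one-way: cannot be inverted
            return out

        def reseed(self, entropy):
            # Mixing in fresh entropy cuts off an attacker who captured an earlier state.
            self.state = hashlib.sha256(self.state + entropy).digest()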

Bottom line: rumors of the complete breakdown of CAPI may have been slightly exaggerated.

cemp

Bandwidth asymmetry in US broadband (2/2)

Here is a recap of the challenges associated with remotely accessing computers at home behind a typical broadband connection:

  1. The operating system. Most common versions of Windows have few built-in features for acting as a server. Remote Desktop is the only one that works out of the box, and even that is of limited value because of licensing stipulations: a remote user will log out the interactive one on XP Professional and all Vista editions. Only the server SKUs, rarely found on end-user machines, support concurrent logon sessions. Even that single remote connection is unavailable on XP Home edition, where 3rd-party VNC solutions are the only way to access the machine remotely. IIS is available as an optional add-on for most SKUs. Linux and Mac OS X are better in this respect out of the box, since they have traditionally been used in both client and server roles. But none of these amounts to an easy-to-use, secure remote-access/sharing solution for novice users.
  2. Firewall interference. Not only does the OS lack server-side applications, it gets in the way of others with the default firewall configuration. The personal firewall is an important security feature introduced in Windows XP and significantly strengthened in XP SP2. Its deployment coincided with the rise of botnets, when reports were circulating that a Windows machine attached to an always-on broadband connection would be 0wned in a matter of minutes. That reality informed the decision to block most inbound ports. (Fortunately applications adjusted: after installation they silently opened the ports they needed by changing firewall settings, a trick that stopped working when Vista introduced the largely inane UAC feature.)
  3. Home networking configuration. The standard configuration for most home networks puts a wireless router in the mix, behind the cable/DSL modem. This means the PCs are not directly exposed at the Internet egress: good for security, more hoops to jump through when reaching the system remotely. Routers typically have a built-in web UI which can be used to set up port forwarding. Assuming users managed to get past the first two hurdles, this is where they could stumble. The number of routers that support UPnP is an encouraging sign here, as that protocol can be used to dynamically open externally facing ports (a minimal sketch follows this list).
  4. ISPs. Finally, the biggest obstacles are the Internet service providers themselves. For all the advances in infrastructure, upstream bandwidth remains a scarce commodity. For example, here in Manhattan the standard Time-Warner package provides 10 Mbps downstream and 512 kbps upstream, a factor of 20x. In the most “equitable” scenario, our previous provider in Central Florida offered 9 Mbps down/1.5 Mbps up. On top of the constrained bandwidth there is port-blocking, often couched in security language intended to confuse users. Blocking port 25, for example, has certainly helped stem the tide of spam originating from zombies, but it also prevents users from hosting their own email server at home. Similarly, inbound port 80 is often blocked to preempt web servers operating out of the basement, at least without shelling out for the “business class” subscription from the ISP.
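
For what it is worth, the UPnP mechanism mentioned in item 3 is simple enough to drive from a script. The sketch below uses the miniupnpc Python bindings, assuming they are installed and the router has UPnP enabled; the port numbers and description are arbitrary examples.

    import miniupnpc

    upnp = miniupnpc.UPnP()
    upnp.discoverdelay = 200    # milliseconds to wait for responses on the LAN
    upnp.discover()             # find UPnP devices
    upnp.selectigd()            # pick the Internet Gateway Device, i.e. the router

    # Forward external TCP port 8080 on the router to port 8080 on this machine.
    upnp.addportmapping(8080, "TCP", upnp.lanaddr, 8080, "home server sketch", "")
    print("Reachable (in principle) at", upnp.externalipaddress(), "port 8080")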

The result of these policies is a double standard imposed on broadband subscribers. They are expected to consume content originating elsewhere: copious amounts of bandwidth are available for that, and ISPs are falling over themselves trying to provide exclusive content in an attempt to move up the value chain. But customers are discouraged from participating in the distribution of content, even from accessing their own resources remotely.

cemp

Windows Live ID ships identity-linking

It is great to see that the Windows Live ID service recently went live with the “linked identities” feature. (Full disclosure: this blogger worked on the security review for the design.) Linked IDs were introduced to deal with the problem of juggling multiple identities. It is well known that, due to the lack of interoperability between web service providers, users end up registering for multiple accounts: one for Google, one for Yahoo, one for MSN/Windows Live and so on. This is a necessity because services available to one ID, such as instant messaging a particular group of friends, are not available to the others. Recent steps towards limited interoperability are encouraging and may decrease the need for that proliferation in the long term.

Less frequently acknowledged is the notion of personas, where users create multiple identities with the same provider. In this case the issue is not missing functionality or fragmented networks, but the desire to maintain separation between aspects of one’s online activities. theodoregeisel@hotmail.com may have exactly the same capabilities as drseuss@hotmail.com, but the user in this case presumably made a conscious decision to keep them distinct, and may even want to discourage contacts from discovering the correlation between the two. Less contrived examples are keeping different accounts for personal and work use, or interacting with casual acquaintances versus expressing an alter ego in the presence of good friends.

The challenge for these users is managing the multiple accounts. Web authentication systems are typically built around a single identity being expressed at a time. This is often mistakenly ascribed to a limitation of web browsers, namely the existence of a single global “cookie jar” where the cookies corresponding to authentication state are kept. That is not true, as evidenced by the linking feature itself and, for that matter, by Google being able to sustain an enterprise ID and a consumer ID at the same time. The result is the user constantly logging in and out of accounts in order to manage both. Aside from being frustrating, this breaks the convenience features built into the authentication system, which generally assume a single account. For example, the various implementations of “keep me signed in” / “remember me” work for only one account: logging out of that account and signing in with another clears the saved credential. (Actually it is more complicated: passwords can also be remembered by client-side software, including the web browser, and these are generally capable of remembering multiple credentials. Smart clients are not limited to the one-user rule, and even for web scenarios there is an exception with Windows Live ID login in Internet Explorer when the helper ActiveX control and BHO are installed.)
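
The cookie-jar point is easy to demonstrate: nothing in the browser prevents two differently named session cookies from coexisting for the same domain, as the snippet below shows with made-up cookie names. Which of them the server honors, and whether it recognizes more than one signed-in identity at all, is entirely a server-side design choice.

    from http.cookies import SimpleCookie

    jar = SimpleCookie()
    jar["auth_theodoregeisel"] = "session-token-for-first-persona"
    jar["auth_drseuss"] = "session-token-for-second-persona"

    # Both cookies travel back on every request to the issuing domain.
    cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
    print("Cookie:", cookie_header)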

Linked identities provide an effective solution to this problem. The user proves ownership of both identities by entering the password for each on the same Account page, which creates a permanent association between the two. From that point on, when logged in as one account the user can quickly switch to the other using a menu in the upper-right corner of the shared banner that appears across the top of most Live services. No logout, no additional credential prompts. The linking operation is symmetric, more than two accounts can be linked, and the links can be revoked by the user at any time. The feature can be experienced first-hand at the Hotmail website by all existing users. Congratulations to the team on this milestone.

cemp

Bandwidth asymmetry in the US broadband market (1/2)

Back in the 1990s, pundits speaking of the “information super-highway” liked to contrast its interactive nature with TV, emphasizing how much better off we were going to be because the new medium works two ways. TV was old-school, making us passive recipients of content, our expressive powers limited to choosing one pre-packaged experience over another. On the Internet everybody was going to be a participant, creating content.

The prediction proved correct to some extent, as evidenced by the popularity of user-supplied content in Web 2.0, whether it takes the form of rambling blogs, blurry photographs named DSC001 on Flickr or, more recently, the fifteen-minutes-of-fame video on YouTube. But in this world, contributing to the proliferation of content noise out there still requires help from another well-financed entity: the blogging site, the photo-sharing website and so on.

For the most part users are not running their own servers at home. There is technology available for this, often open-source/free and to varying degrees usable by novice end-users. There are also good reasons for using a professional hosting service: it benefits from economies of scale and ease of management, and gives users a host of features, including 24/7 reliability, backups and the like, that would be difficult to implement at home. For one-to-many sharing, where the user is publishing “public” content intended for a large number of people, it makes sense to upload it to a central distribution point. For private content it is not as clear-cut. If your tax returns are stored on a home PC and the goal is to work on them from a different location, a direct connection to the machine is the straightforward solution; the popular GoToMyPC app is one of the commercial offerings that emerged in response to this demand. In principle the file-access scenario has an equivalent hosted solution, where you upload your files to a service in the cloud such as Windows Live Drive. But it is easy to craft scenarios where that is not true: if the home PC has an expensive application such as PhotoShop installed locally, the only way to use that software from afar is remote access. Similarly the disruptive technology in the SlingBox, which streams cable/TV/DVR content over the Internet, requires direct connectivity to the appliance, using it as a server hosted at home. Last year Maxtor debuted the Fusion, a new external drive with networking support and a built-in capability for sharing files over the Internet using links in email messages.
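
To underline how low the software barrier actually is: the Python standard library alone will serve a directory over HTTP, as in the sketch below (port 8080 is an arbitrary choice). The hard part, as the next paragraph and the follow-up post argue, is being reachable at all through the firewall, the router and the ISP.

    from http.server import HTTPServer, SimpleHTTPRequestHandler

    # Serve the current directory on all interfaces; useful only to clients that can
    # actually reach this machine through the firewall, NAT router and ISP policies.
    HTTPServer(("0.0.0.0", 8080), SimpleHTTPRequestHandler).serve_forever()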

This is where the triad of OS developers, networking-equipment vendors and ISP business models conspires to make life very difficult for consumers.

(continued)

cemp

Real-estate agents: deceptive practices even in strong markets

Combine two ingredients:

1. The real estate business is not exactly known for transparency and integrity. In spite of strict regulations, such as legal obligations to disclose known defects and to record all transactions in public records, deceptive advertising, distorted perceptions and a Ponzi-scheme mentality remain the hallmarks of the industry. (Some of the subtle ways an agent works against the interests of the client, pressuring sellers to accept lower bids and buyers to bid higher, were chronicled in Freakonomics.)

2. New York metropolitan area real estate remains one of the few islands of stability and uninterrupted irrational exuberance in the midst of a sobering, country-wide correction following a decade-long, unsustainable bubble in housing prices. Manhattan remains strictly a seller’s market, including in rentals.

It’s no surprise that brokers resort to questionable practices trying to move units. This also explains why Craigslist, that venerable free resource, has been rendered completely useless for Manhattan, flooded by hundreds of bogus listings for non-existent apartments meant for bait-and-switch scams, and otherwise useless, content-free classifieds explaining IN ALL CAPS why this apartment will not be on the market very long. It goes to show that sometimes “free” is not a good thing: charging people to place ads would go a long way towards quality control and a better signal-to-noise ratio.

Consider the following blurb from a contract that must be signed before brokers are willing to show apartments:

“You understand that the commission charged by [brokerage firm] for the aforesaid services is 15% percent of the first year’s rent … payable to [firm] only if you rent in a building or complex shown to you by [firm] within 120 days of such showing.”

This contains an ambiguous case: broker Bob shows unit #123 in a building, which does not work out. Later broker Alice from a different firm shows apartment #456 in the same building, which the customer decides to take. Is Bob owed any commission? From the blurb above, the answer seems to be in the affirmative; in this case “Bob” continued to insist that it was not. In fact it is very much in the interest of the brokerage firm to have this over-reaching clause. It is perfectly fair game to insist that a customer utilizing the services of an agent should properly compensate the firm. On the other hand, by extending the claim to include all units, effectively “tainting” the building for four months, the company achieves a lock-in effect. Bob would also insist that this is not an exclusivity agreement, which is strictly speaking correct: it does not rule out working with another broker, it only creates strong economic incentives against doing so for the same building.

The pragmatic solution, which worked in this case: different brokers for each neighborhood. This makes sense anyway, because real estate remains a very old-fashioned business where personal connections matter, and it is unlikely that the same person has developed strong networks in all areas.

cemp

Crossing the line on privacy: Facebook story

It was a case of conventional wisdom at odds with itself.

The information security community has long maintained a very glib outlook on privacy: on the one hand embracing such enablers of paranoia as Tor, offshore data havens and untraceable ecash, on the other hand griping about the indifference and cavalier attitude most users have towards their own personal information. The failure of privacy-enhancing technologies to break into the mainstream has a consistent history, from PGP to the failure of Zero Knowledge Networks to commercialize its network.

At the same time, Facebook was the new poster child for Web 2.0 applications: the social network threatening to take over MySpace, flush with cash after recently inking a lucrative advertising deal with MSFT, having sat in the middle of a bidding war against Google. It could do no wrong, and certainly not in such a trivial area as privacy. Scalability, performance, features: that is what makes or breaks social networks, as Friendster found out the hard way.

It turned out users did care about privacy after all. Long before the popular outcry, critics such as Cory Doctorow were writing blistering reviews of the Facebook business model, referring to its view of users as “bait for the real customers, advertising networks.” In this case it did not take very long for popular sentiment to catch up. The Beacon feature crossed the line from dubious monetization strategy into outright abuse of customer data. At its core Beacon was a data-linking scheme: Facebook partnered with several prominent ecommerce merchants, including Amazon, Blockbuster and Fandango, to access the transaction history of users at these external sites. That data stream, which included purchase history, was incorporated into the user’s feed, visible to other users. (A challenge, considering that there is no shared identity spanning these sites; the email address would have been the only link, which is good enough for advertising purposes.) In effect, every time the user bought anything at one of these merchants, they became an unwitting walking billboard, advertising to other users what they purchased and where.
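
Whatever Facebook’s actual plumbing looked like, the “data linking” idea described above amounts to joining two record sets on a normalized email address. The toy sketch below uses invented records purely to illustrate the pattern, not Facebook’s implementation.

    def normalize(email):
        return email.strip().lower()

    # Invented records standing in for a social-profile store and a merchant's purchase log.
    social_profiles = {normalize("Jane.Doe@example.com"): {"user_id": 42, "name": "Jane Doe"}}
    merchant_purchases = [{"email": "jane.doe@EXAMPLE.com", "item": "DVD box set"}]

    for purchase in merchant_purchases:
        profile = social_profiles.get(normalize(purchase["email"]))
        if profile:
            # The step users objected to: the purchase becomes a story in the feed
            # unless they opt out (later changed to opt-in).
            print(f"{profile['name']} bought {purchase['item']}")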

Great value proposition for merchants, on the face of it: through viral marketing, friends can be inspired to click the link and visit the same merchant to purchase the identical item, a case of keeping-up-with-the-Joneses played out on a social network. Meanwhile those users particularly drawn to cataloging their material possessions online would have the data stream generated automatically. At least that must have been the elevator pitch in some PowerPoint presentation that inspired this scheme. One minor detail: viral marketing depends on willing participants who are impressed with a product and voluntarily rave about it to their contacts. Creating the appearance that users implicitly endorse everything they have bought is a non-starter, and forcing that endorsement to be carried out in a very public way demonstrated complete disregard for privacy.

A group of users 50K strong petitioned, more bad PR followed, and eventually Facebook changed the feature from opt-out to opt-in. This is a very unusual and perhaps encouraging demand for privacy: even in the original flawed design users had the option to disable the involuntary enrollment in the advertising program, but they stuck to the principle that meaningful consent must exist before people unwittingly become part of a dubious business plan with no clear value proposition for them. The storm is not over yet: a CNET article reports that the EFF and CDT are planning to file complaints with the FTC, while Fortune/CNN is running a piece arguing that mismanaged PR and disregard for privacy are seriously damaging the company’s future prospects. Next up: damage-control time.

cemp