Saved by third-party cookies: when phishing campaigns make mistakes


(Or, that rare instance when third-party cookies actually helped improve security.)

Third-party cookies have earned a bad reputation for enabling widespread tracking and advertising-driven surveillance online, and that reputation is entirely justified. After multiple half-hearted attempts to deprecate them by leveraging its browser monopoly with Chrome, even Google eventually threw in the towel. This post does not challenge that narrative, but it does describe an unusual incident in which third-party cookies actually helped protect consumers from an ongoing phishing attack. While this phishing campaign occurred in real life, we will refer to the site in question as Acme, after the fictional company in the Wile E. Coyote cartoons.

Recap on phishing

Phishing involves creating a look-alike replica of a legitimate website in order to trick customers into disclosing sensitive information. Common targets are passwords, which can be used to log in to the real website and impersonate the user. But phishing can also go directly after personally identifiable information such as credit-card or social-security numbers, since those are often monetizable on their own. The open nature of the web makes it trivial to clone the visual appearance of websites for this purpose. One can simply download every image, stylesheet and video from that site, stash the same content on a different server controlled by the attacker and steer users into visiting this latter copy. (While this sounds sinister, there are even legitimate use cases for it, such as mirroring websites to reduce load or protect them from denial-of-service attacks.)

An important point to remember is that visuals can be deceptive: the bogus site may be rendered pixel-for-pixel identical inside the browser view. The address bar is one of the only reliable clues to the provenance of the content, because it is part of the user interface controlled 100% by the browser. It cannot be manipulated by the attacker. But good luck spotting the difference between login.acme.com, loginacme.com, acnne.com or any of the dozen other surprising ways to confuse the unwary. Names that look “close” at first glance can be controlled by completely unrelated entities, thanks to the way DNS works.

What makes phishing so challenging to combat is that the legitimate website is completely out of the picture at the crucial moment the attack is going on. The customer is unwittingly interacting with a website 100% controlled by an adversary out to get them, while under the false impression that they are dealing with a trusted service they have been using for years. Much as the real site may want to jump in with a scary warning dialog to stop its customer from making a crucial error of judgment, there is no opportunity for such interventions. Recall that the customer is only interacting with the replica. All content the user sees is sourced from the malicious site, even if it happens to be a copy of content taken from the legitimate one.

Rookie mistake

Unless, that is, the attacker makes a mistake. That is what happened with the crooks targeting Acme: they failed to clone all of the content and create a self-contained replica. One of the Acme security team members noticed that the malicious site continued to reference content hosted on the real one. Every time a user visited the phishing site, their web browser was also fetching content from the authentic Acme website. That astute observation paved the way for an effective intervention.

In this particular phishing campaign, the errant reference back to the original site was for a single image used to display a logo. That is not much to work with. On the plus side, Acme could detect when the image was being embedded from a different website, thanks to the HTTP Referer [sic] header. (Incidentally, referrer policies today can interfere with that capability.) But this particular image consumed precious little screen real estate, measuring only a few pixels across. One could return an altered image (skull and crossbones, mushroom cloud or something equally garish), but it is very unlikely users would notice. Even loyal customers who have the Acme logo committed to memory might not think twice if a different image appears. Instead of questioning the authenticity of the site, they would attribute that quirk to a routine bug or misguided UI experiment.
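The Referer check described above can be sketched in a few lines of Python. This is a minimal illustration rather than Acme's actual logic: the hostnames in LEGITIMATE_HOSTS and the function name is_foreign_embed are assumptions made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts that are expected to embed the logo.
LEGITIMATE_HOSTS = {"acme.com", "www.acme.com", "login.acme.com"}


def is_foreign_embed(referer):
    """Return True when the Referer header names a host outside the allowlist.

    A missing or unparseable Referer yields False: its absence proves nothing,
    because privacy settings and referrer policies can strip the header.
    """
    if not referer:
        return False
    try:
        host = urlsplit(referer).hostname
    except ValueError:
        return False
    if host is None:
        return False
    return host not in LEGITIMATE_HOSTS
```

A request for the logo whose Referer is https://loginacme.com/signin would be flagged, while one arriving from https://login.acme.com/ would not. Exact hostname matching is deliberate here, since look-alike names are precisely what the check needs to catch.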

It is not possible to influence the rest of the page by returning a corrupt image or some other type of content, because the browser treats whatever comes back strictly as image data. For example, it is not possible to prevent the page from loading or to redirect it back to the legitimate login page. Similarly, one cannot return some other type of content such as JavaScript to interrupt the phishing attempt or warn users. (Aside: if the crooks had made the same mistake by sourcing a JavaScript file from Acme, such radical interventions would have been possible.)

Remember me: cookies & authentication

To recap, this is the situation Acme faces:

  1. There is an active phishing campaign in the wild.
  2. A game of whack-a-mole ensues: the Acme security team keeps reporting each site to browser vendors, hosting companies and Cloudflare. (Crooks are fond of using Cloudflare because it acts as a proxy sitting in front of the malicious site, disguising its true origin and making it difficult for defenders to block it reliably.) The attacker responds by changing domain names and resurfacing the exact same phishing page under a different domain name.
  3. Every time a customer visits a phishing page, Acme can observe the attack happening in real time because its servers receive a request.
  4. Despite being in the loop, Acme cannot meaningfully disrupt the phishing page. It has limited influence over the content displayed to the customer.

Here is the saving grace: when customers reach out to the legitimate Acme website in step #3 as part of the phishing page, their web browser sends along all cookies. Some of those cookies contain information that uniquely identifies the specific customer. That means Acme finds out not only that some customer visited a phishing site, but it can find out exactly which customer did. In fact there were at least two such cookies:

  • “Remember me” cookie used to store email address and expedite future logins by filling in the username field in login forms.
  • Authentication cookies set after login. Interestingly, even expired cookies are useful for this purpose. Suppose Acme only allowed login sessions to last for 24 hours. After the clock runs out, the customer must reauthenticate by providing their password or MFA again. Authentication cookies would have embedded timestamps reflecting that restriction. In keeping with the policy of requiring “fresh” credentials, after 24 hours that cookie would no longer be sufficient for authenticating the user. But for the purpose of identifying which user is being phished, it works just fine. (Technical note: “expired” here refers to the application logic. The HTTP standard itself defines an expiration time for cookies, after which the browser deletes the cookie. A cookie that expires in that sense would not be of much use, since it becomes invisible to the server.)
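To make the distinction between "too old to authenticate" and "still good enough to identify" concrete, here is a minimal sketch of a signed cookie with an embedded timestamp. The token layout, SECRET_KEY and every function name are assumptions for illustration; the post does not describe how Acme actually formats its cookies.

```python
import base64
import hashlib
import hmac

# Hypothetical server-side secret and session lifetime.
SECRET_KEY = b"hypothetical-server-side-secret"
SESSION_LIFETIME_SECONDS = 24 * 60 * 60


def _sign(payload):
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def issue_token(customer_id, issued_at):
    """Build a cookie value of the form base64(customer|timestamp).signature."""
    payload = f"{customer_id}|{int(issued_at)}".encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii") + "." + _sign(payload)


def parse_token(token):
    """Return (customer_id, issued_at) if the signature checks out, else None."""
    try:
        encoded, tag = token.split(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode("ascii"))
    except ValueError:
        return None
    if not hmac.compare_digest(_sign(payload).encode("ascii"), tag.encode("utf-8")):
        return None
    customer_id, _, issued_at = payload.decode("utf-8").rpartition("|")
    return customer_id, int(issued_at)


def is_valid_session(token, now):
    """Authentication: the signature must verify AND the token must be fresh."""
    parsed = parse_token(token)
    return parsed is not None and now - parsed[1] < SESSION_LIFETIME_SECONDS


def identify_customer(token):
    """Identification: the signature must verify; the token's age is ignored."""
    parsed = parse_token(token)
    return parsed[0] if parsed else None
```

The only difference between the two checks is the freshness test. A token issued 25 hours ago fails is_valid_session but still yields a customer from identify_customer, which is all that is needed to learn who landed on the phishing page.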

This gives Acme a lucky break to protect customers from the ongoing phishing attack. Recall that Acme can detect when an incoming request for the image is associated with the phishing page. Whenever that happens, Acme can use the accompanying cookies to look up exactly which customer has stumbled onto the malicious site. To be clear, Acme cannot determine conclusively whether the customer actually fell for phishing and disclosed their credentials. (There is a fighting chance the customer notices something off about the page after visiting it and stops short of giving away their password. Unfortunately that cannot be inferred remotely.) As such, Acme must operate on the worst-case assumption that phishing will succeed and place preemptive restrictions on the account, such as temporarily suspending logins or restricting dangerous actions. That way, even if the customer does disclose their credentials and the crooks turn around to “cash in” those credentials by logging into the genuine Acme website, they will be thwarted from achieving their objective.
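The worst-case policy can be expressed as a small piece of bookkeeping. The sketch below is purely illustrative: the AccountGuard class, its method names and the set of restricted actions are invented for this example, and a real system would persist this state and define a process for lifting the restrictions.

```python
# Hypothetical set of actions blocked once a customer is seen on a phishing page.
RESTRICTED_AFTER_EXPOSURE = {"login", "withdraw", "change_email"}


class AccountGuard:
    """Tracks customers observed on a phishing page and restricts their accounts."""

    def __init__(self):
        self._exposed = {}  # customer_id -> time the exposure was first observed

    def mark_exposed(self, customer_id, seen_at):
        # Keep the earliest sighting; repeat visits do not reset the clock.
        self._exposed.setdefault(customer_id, seen_at)

    def clear(self, customer_id):
        # Called once the customer has, for example, reset their password.
        self._exposed.pop(customer_id, None)

    def is_allowed(self, customer_id, action):
        exposed = customer_id in self._exposed
        return not (exposed and action in RESTRICTED_AFTER_EXPOSURE)
```

Note that mark_exposed is triggered by the visit alone, not by any evidence that credentials were actually entered, which mirrors the worst-case assumption described above.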

Third-party stigma for cookies

There is one caveat to the availability of cookies required to identify the affected customer: those cookies are now being replayed in a third-party context. Recall that the first-party vs third-party distinction is all about context: whether a resource (such as an image) being fetched for inclusion on a webpage is coming from the same site as the overall owner of the page, called the “top-level document.” When an image is fetched as part of a routine visit to the Acme website, it is a first-party request because the image is hosted on the same site as the top-level document. But when it is being retrieved by following a reference from the malicious replica, it becomes a third-party request.

Would existing Acme cookies get replayed in that situation and reveal the identity of the potential phishing victim? The answer depends on multiple factors:

  1. Choice of browser. At the time of this incident, most popular web browsers freely replayed cookies in third-party contexts, with two notable exceptions:
    • Safari suppressed cookies when making these requests.
    • Internet Explorer: As the lone browser implementing P3P, IE will automatically “leash” cookies if they are set without an associated privacy policy: cookies will be accepted, but only replayed in first-party contexts.
  2. User overrides to browser settings. While the preceding item describes the default behavior of each browser, users can modify these settings to make them more or less stringent.
  3. Use of the “SameSite” attribute. About 15 years after IE6 inflicted P3P and cookie management on websites, a proposed update to the HTTP cookie specification finally emerged to standardize and generalize its leashing concept. But the script was flipped: instead of browsers making unilateral decisions to protect user privacy, website owners would declare whether their cookies should be made available in third-party contexts. (One can imagine which way advertising networks, crucially dependent on third-party cookie usage for their ubiquitous surveillance model, leaned on that decision.)
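The effect of the SameSite attribute on a request like the one in this incident can be modeled as a simple decision function. This is a simplified sketch of the attribute's semantics for cross-site subresource requests only; the function name and the lax_by_default flag are assumptions for illustration, and the model ignores browser-specific blocking such as the Safari and IE behavior listed above.

```python
def sent_on_cross_site_image(samesite, lax_by_default=False):
    """Decide whether a cookie accompanies an image request from another site.

    samesite: the cookie's SameSite value ("Strict", "Lax", "None"), or
              Python None when the attribute is absent.
    lax_by_default: True models browsers that treat an absent attribute as
              Lax; False models the older behavior of no restriction.
    """
    if samesite is None:
        effective = "lax" if lax_by_default else "none"
    else:
        effective = samesite.lower()
    if effective not in {"strict", "lax", "none"}:
        raise ValueError(f"unknown SameSite value: {samesite!r}")
    # Both Strict and Lax withhold cookies from cross-site subresource
    # requests; Lax only relaxes the rule for top-level navigations.
    return effective == "none"
```

Under the older behavior, a cookie with no SameSite attribute is sent along with the cross-site image request, which is the situation Acme benefited from. Had the cookies been marked Lax or Strict, the request from the phishing page would have arrived without them.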

Luckily for Acme, the relevant cookies here were not restricted by the SameSite attribute. As for browser distribution, IE was irrelevant: its market share was in the single digits and primarily restricted to enterprise users in managed IT environments, a far cry from the target audience for Acme. Safari, on the other hand, did have a non-negligible share, especially among mobile clients, since Apple did not allow independent browser engines on iOS at the time and even Chrome had to be built on top of WebKit there. (That restriction would only be lifted in 2024, when the EU reset Apple’s expectations around monopolistic behavior.)

In the end, it was possible to identify and take protective action for the vast majority of customers known to have visited the malicious website. More importantly, intelligence gathered from one interaction is frequently useful in protecting other customers, even when the latter are not directly identified as being targeted by an attack. For example, an adversary often has access to only a handful of IPv4 addresses when attempting to cash in stolen credentials. When an IP address is observed attempting to impersonate a known phishing victim, every other login from that IP can be treated with higher suspicion. Detection mechanisms can be invaluable even with less than 100% coverage.
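The way suspicion spreads from a known victim to an IP address, and from there to other logins, can be sketched as follows. The LoginRiskTracker class and its three labels are hypothetical. As a simplification, any login attempt against a known victim's account taints the source address, even though in practice it could be the real customer trying to get back in.

```python
import ipaddress


class LoginRiskTracker:
    """Propagates suspicion from known phishing victims to source IP addresses."""

    def __init__(self, known_victims):
        self._victims = set(known_victims)
        self._tainted_ips = set()

    def assess(self, customer_id, ip):
        """Classify a login attempt as "victim", "suspicious" or "normal"."""
        addr = ipaddress.ip_address(ip)  # validates and normalizes the address
        if customer_id in self._victims:
            # Someone is trying to use a phished account: remember the address.
            self._tainted_ips.add(addr)
            return "victim"
        if addr in self._tainted_ips:
            # Not a known victim, but same address as an earlier attempt.
            return "suspicious"
        return "normal"
```

Once 203.0.113.7 has been seen logging in as a known victim, a later login from that address to an account never observed on the phishing page is labeled "suspicious". That is how partial detection coverage ends up protecting customers who were never identified directly.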

Verdict on third-party cookies 

Does this prove third-party cookies have some redeeming virtue and the web will be less safe when (or at this rate, if) they are fully deprecated? No. This anecdote is more the exception proving the rule. The advertising industry has been rallying in defense of third-party cookies for over a decade, spinning increasingly desperate and far-fetched scenarios. From the alleged death of “free” content (more accurately, ad-supported content where the real product being peddled is the audience's eyeballs for ad networks) to allegedly reduced capability for detecting fraud and malicious activity online, predictions of doom have been a constant part of the narrative.

To be clear: this incident does not in any way provide more ammunition for such thinly-veiled attempts at defending a fundamentally broken business model. For starters, there is nothing intrinsic to phishing attacks that requires help from third-party cookies for detection. The crooks behind this particular campaign made an elementary mistake: they left a reference to the original site when cloning the content for their malicious replica. There is no rule that says other crooks are required to follow suit. While defense strategies such as canary tokens are built on the optimistic assumption that attackers will make that mistake, new web standards have made it increasingly easy to avoid. For example, Content Security Policy (CSP) allows a website to precisely delineate which other websites can be contacted for fetching embedded resources. It would have been a trivial step for crooks to add CSP headers and prevent any accidental references back to the original domain, neutralizing any JavaScript logic lurking in there to alert defenders.
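For readers unfamiliar with CSP, the policy is just a response header listing the permitted sources for each resource type. The helper below is a minimal sketch of how such a header value is assembled; the function name build_csp and the particular directives chosen are assumptions for illustration.

```python
def build_csp(directives):
    """Join a mapping of directive -> list of sources into a CSP header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


# A policy that only permits resources from the page's own origin.
policy = build_csp({
    "default-src": ["'self'"],
    "img-src": ["'self'", "data:"],
})
header = ("Content-Security-Policy", policy)
```

A page served with this header causes the browser to refuse to load images from any other host, so a leftover reference to a logo on a different domain would never result in a request to that domain.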

Ultimately, the only robust solution for phishing is using authentication schemes that are not vulnerable to phishing. Defense and enterprise sectors have always had the option of deploying PKI with smart cards for their employees. More recently, consumer-oriented services have gained convenient access to a (greatly watered-down) version of that capability with passkeys. The jury is still out on whether passkeys will gain traction or remain consigned to a niche audience owing to the morass of confusing, incompatible implementations.

CP
