Web Tracking: What You Should Know About Your Privacy Online

Web tracking, the practice of websites identifying users and collecting data about their browsing behavior, has become ubiquitous in the modern Internet age. Nearly every site you visit is tracking your activity in some way. While this enables beneficial features like personalized recommendations and targeted advertising, it has also led to growing concerns about online privacy and how our personal data is being collected and used, often without our knowledge or control.

As a full-stack developer who has implemented tracking for various clients, I've seen firsthand how these technologies work under the hood. In this in-depth guide, we'll explore the world of web tracking from a technical perspective, examining the methods that companies and advertisers use to monitor our online activity, and discussing what you can do as a user and a developer to promote better privacy practices.

The Mechanisms of Web Tracking

At its core, web tracking relies on collecting unique identifiers and behavioral data points about individual users as they browse the web. These data points are tied together into profiles that are then used to target content and ads at the user.

Some key tracking techniques include:

Cookies

HTTP cookies are small pieces of data that a site stores in the browser to identify the user and maintain state as they move between pages. Here's an example of how a web server might set a cookie in Node.js using Express:

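const express = require('express');
const app = express();

app.get('/example', (req, res) => {
  // maxAge is in milliseconds: 900000 ms = 15 minutes
  res.cookie('user_id', '12345', { maxAge: 900000, httpOnly: true });
  res.send('Cookie has been set');
});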

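This sets a cookie named user_id with the value 12345 that expires after 15 minutes (maxAge is given in milliseconds). The httpOnly flag hides the cookie from client-side JavaScript such as document.cookie. The browser still sends it to the server with every matching request, which is what lets the server recognize a returning visitor.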
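To make the round trip concrete, here is a minimal sketch in plain JavaScript with no framework. The helper names buildSetCookie and parseCookieHeader are my own for illustration, not part of Express or Node. They only approximate what the Set-Cookie response header and the Cookie request header look like on the wire, and a real application should rely on a vetted cookie library instead.

```javascript
// Hypothetical helpers for illustration only.

// Build the value of a Set-Cookie response header.
function buildSetCookie(name, value, { maxAgeSeconds, httpOnly = false, secure = false } = {}) {
  const parts = [`${encodeURIComponent(name)}=${encodeURIComponent(value)}`];
  if (maxAgeSeconds !== undefined) parts.push(`Max-Age=${maxAgeSeconds}`);
  if (httpOnly) parts.push('HttpOnly');
  if (secure) parts.push('Secure');
  return parts.join('; ');
}

// Parse the Cookie request header the browser sends back on later requests.
function parseCookieHeader(header = '') {
  const cookies = {};
  for (const pair of header.split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue; // skip empty or malformed segments
    const name = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    cookies[decodeURIComponent(name)] = decodeURIComponent(value);
  }
  return cookies;
}

// The header uses seconds, so 15 minutes is 900 here rather than 900000.
const setCookie = buildSetCookie('user_id', '12345', { maxAgeSeconds: 900, httpOnly: true });
const returning = parseCookieHeader('user_id=12345; theme=dark');
console.log(setCookie);
console.log(returning.user_id);
```

Once the server can read user_id back on every request, it can tie each page view to the same visitor, which is the basic building block of cookie-based tracking.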

Browser Fingerprinting

Browser fingerprinting involves collecting various configuration details about the user's browser and device to create a unique identifier, even without cookies. Some common fingerprinting signals include:

  • User agent string
  • Screen resolution
  • Installed fonts
  • Plugin details
  • WebGL fingerprint
  • Hardware specs

Researchers have found that a combination of these signals can uniquely identify users with a high degree of accuracy. JavaScript libraries like FingerprintJS (formerly Fingerprintjs2) make it relatively easy to implement sophisticated fingerprinting.
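The core idea can be sketched in a few lines: gather signals, serialize them in a stable order, and hash the result. This is a simplified illustration, not how FingerprintJS or any particular library works internally. The signal values below are made up, and the fingerprint function is a hypothetical helper. It runs in Node so that it is self-contained, whereas a real tracker would read the values from browser APIs.

```javascript
const crypto = require('crypto');

// Hypothetical signal values; in a browser these would come from APIs such as
// navigator.userAgent and screen.width / screen.height.
const signals = {
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0',
  screen: '1920x1080',
  fonts: ['Arial', 'DejaVu Sans', 'Noto Serif'],
  plugins: ['PDF Viewer'],
};

// Serialize the signals in a stable key order, then hash them into one identifier.
function fingerprint(input) {
  const canonical = Object.keys(input)
    .sort()
    .map((key) => `${key}=${JSON.stringify(input[key])}`)
    .join('|');
  return crypto.createHash('sha256').update(canonical).digest('hex');
}

const id = fingerprint(signals);
const idAfterChange = fingerprint({ ...signals, screen: '1366x768' });
console.log(id === fingerprint(signals)); // same signals give the same identifier
console.log(id === idAfterChange);        // one changed signal gives a different identifier
```

Note the trade-off this exposes: an exact hash breaks as soon as any single signal changes, which is one reason production fingerprinting tends to be more elaborate than this sketch.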

Tracking Pixels and Beacons

Tracking pixels are tiny (often 1×1) transparent images embedded into web pages and emails. When loaded, they send a request to their host server indicating that the surrounding content was viewed. Similarly, tracking beacons are bits of JavaScript that send an HTTP request to a server, usually with query parameters containing behavioral data.

Here's an example of a tracking pixel implemented in HTML:

<img src="https://tracking.example.com/pixel.gif?user_id=12345&event=page_view" style="display:none" />
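On the receiving end, the server only has to read the query string and return an image. The following is a rough sketch of what such an endpoint might look like using Node's built-in modules. The function names parsePixelHit and handlePixel and the port number are assumptions of mine, and a real collector would write to a datastore rather than the console.

```javascript
// A 1x1 transparent GIF, base64-encoded.
const PIXEL = Buffer.from('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7', 'base64');

// Extract the tracking data carried in the pixel URL's query string.
function parsePixelHit(requestUrl) {
  const url = new URL(requestUrl, 'https://tracking.example.com');
  return {
    userId: url.searchParams.get('user_id'),
    event: url.searchParams.get('event'),
  };
}

// Request handler: record the hit, then answer with the image bytes.
function handlePixel(req, res) {
  const hit = parsePixelHit(req.url);
  console.log('pixel hit', hit, req.headers['user-agent']);
  // no-store asks the browser not to cache, so every view triggers a new request.
  res.writeHead(200, { 'Content-Type': 'image/gif', 'Cache-Control': 'no-store' });
  res.end(PIXEL);
}

// To serve it (port 3000 is an arbitrary choice):
// require('http').createServer(handlePixel).listen(3000);
const hit = parsePixelHit('/pixel.gif?user_id=12345&event=page_view');
console.log(hit);
```

Besides the explicit query parameters, each request also carries the visitor's IP address and headers such as User-Agent, so even a bare image load reveals more than the URL suggests.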

Mobile and Cross-Device Tracking

On mobile devices, trackers can access additional data points like unique advertising identifiers (e.g. iOS IDFA or Android AAID), geolocation, and sensor data. Techniques like ultrasonic cross-device tracking can even associate multiple devices with the same user by emitting inaudible audio signals that nearby devices pick up.

The Scale of Tracking

To illustrate just how widespread tracking has become, let's look at some statistics:

  • A 2020 study by Ghostery found that 79% of websites globally contain at least one third-party tracker. The average website was found to contain 10 trackers.

[Figure: Tracking prevalence statistics. Source: Ghostery]

  • A 2019 paper by researchers at Google found that over 60% of all page loads on the web contained at least one third-party tracker from Google itself (e.g. Google Analytics, Google Ads). Facebook had trackers on 25% of all page loads.

[Figure: Google tracking prevalence. Source: Google Research]

  • The IAB Tech Lab, the standards body for the digital advertising industry, maintains a public list of hundreds of registered third-party tracking and data companies in its Global Vendor List. The self-reported purposes of these companies illustrate the sheer scope of tracking:

[Figure: Tracking purposes. Source: IAB Europe]

Tracking and Personally Identifiable Information

One of the most pervasive myths about web tracking is that it is entirely anonymous. In reality, the wealth of data points collected by trackers can be combined to form unique profiles that can often be tied back to individual identities.

A 2019 study published in Nature Communications found that 99.98% of Americans could be re-identified using only 15 demographic attributes. When location data was added, that rose to 99.99%, which is essentially everyone. As Princeton researcher Arvind Narayanan put it, "the very idea of anonymity doesn't really apply on the modern Internet."

[Figure: Re-identification chart. Source: Nature]
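The intuition behind re-identification is easy to demonstrate on a toy scale: each extra attribute splits the population into smaller groups until most people sit in a group of one. The sketch below simply counts unique attribute combinations in a made-up dataset. It is not the statistical method used in the study, and both the records and the uniqueFraction helper are hypothetical.

```javascript
// Hypothetical toy records; no real data.
const records = [
  { zip: '02139', birthYear: 1985, gender: 'F' },
  { zip: '02139', birthYear: 1985, gender: 'M' },
  { zip: '02139', birthYear: 1990, gender: 'F' },
  { zip: '94110', birthYear: 1985, gender: 'F' },
  { zip: '94110', birthYear: 1990, gender: 'M' },
];

// Fraction of records whose combination of the given attributes is unique.
function uniqueFraction(rows, attributes) {
  const keyOf = (row) => attributes.map((name) => row[name]).join('|');
  const counts = new Map();
  for (const row of rows) {
    counts.set(keyOf(row), (counts.get(keyOf(row)) || 0) + 1);
  }
  const unique = rows.filter((row) => counts.get(keyOf(row)) === 1).length;
  return unique / rows.length;
}

const byZip = uniqueFraction(records, ['zip']);
const byZipAndYear = uniqueFraction(records, ['zip', 'birthYear']);
const byAll = uniqueFraction(records, ['zip', 'birthYear', 'gender']);
console.log(byZip, byZipAndYear, byAll);
```

In this toy dataset nobody is unique by ZIP code alone, but everyone is unique once birth year and gender are added, even though no name appears anywhere in the records.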

Even if trackers don't have your literal name attached, your behavioral profile is often distinctive enough to identify you. And that data can be synced with datasets from data brokers that do include PII. A Pew Research study found that 79% of Americans are concerned about how their data is being used by companies.

[Figure: Pew research chart. Source: Pew Research Center]

The Responsibility of Web Developers

As web developers, we play a crucial role in shaping the privacy landscape of the internet. Every time we add a third-party tracker to a site, we are making a decision to expose our users' data to another company. We need to think critically about whether the benefits are truly worth the privacy trade-off.

When implementing tracking, some best practices include:

  1. Only include trackers for clear and justifiable purposes. Avoid throwing in every analytics and advertising script just because they're available.

  2. Disclose all tracking to users in a clear and accessible privacy policy. Explain what data is collected, by whom, and for what purposes.

  3. Give users control wherever possible, like the ability to opt out of non-essential trackers or delete their data.

  4. Implement secure coding practices to prevent accidental data leaks. Use HTTPS everywhere, set the Secure and HttpOnly attributes on cookies, and avoid inadvertently passing identifiers in URLs.
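Practices 1 and 3 in particular translate directly into code. Here is a minimal sketch of consent-gated loading, assuming a simple registry in which every tracker declares its purpose. The tracker names, the allowedTrackers function, and the decision to treat a Global Privacy Control signal (the Sec-GPC request header) as an advertising opt-out are all illustrative assumptions, not legal guidance.

```javascript
// Hypothetical tracker registry; names and purposes are made up for illustration.
const trackers = [
  { name: 'session-cookie', purpose: 'essential' },
  { name: 'page-analytics', purpose: 'analytics' },
  { name: 'ad-retargeting', purpose: 'advertising' },
];

// Decide which trackers may load for this request.
// consent: purposes the user opted into; headers: lowercased request headers.
function allowedTrackers(registry, consent, headers = {}) {
  // Assumption: a Sec-GPC: 1 header is treated as opting out of advertising.
  const gpc = headers['sec-gpc'] === '1';
  return registry
    .filter((t) => t.purpose === 'essential' || consent.includes(t.purpose))
    .filter((t) => !(gpc && t.purpose === 'advertising'))
    .map((t) => t.name);
}

const noConsent = allowedTrackers(trackers, []);
const analyticsOnly = allowedTrackers(trackers, ['analytics']);
const gpcUser = allowedTrackers(trackers, ['analytics', 'advertising'], { 'sec-gpc': '1' });
console.log(noConsent, analyticsOnly, gpcUser);
```

The useful property of this design is that non-essential trackers are off by default: a tracker with no matching consent never loads, so forgetting to ask fails safe.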

The regulatory landscape around online tracking is also evolving rapidly, with laws like the GDPR in Europe and the CCPA in California imposing new requirements around user consent and data handling. As developers, it's our responsibility to stay on top of these regulations and ensure that our sites and apps are compliant.

The Future of Web Tracking

As users become more aware of tracking and demand better privacy protections, the industry is slowly starting to shift. All major browsers now include some form of tracking prevention, from blocking third-party cookies to hiding identifying browser features. Apple has made privacy a key selling point of its products, introducing features like Intelligent Tracking Prevention in Safari and App Tracking Transparency on iOS.

However, the cat-and-mouse game of tracking is far from over. As browsers clamp down on traditional third-party cookies, the ad tech industry is developing new methods like browser fingerprinting, CNAME cloaking, and server-side tracking to circumvent these protections. Emerging technologies like machine learning and internet-of-things devices are also creating new frontiers for tracking.
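CNAME cloaking is worth a closer look because it is simple to describe in code. A site points one of its own subdomains at a tracker through a DNS CNAME record, so requests look first-party to the browser. The sketch below shows one way a detector might flag this, given CNAME targets that have already been resolved. The domain names, the tracker list, and the isCloaked helper are hypothetical, and real blocklists are far larger.

```javascript
// Hypothetical list of tracker domains.
const KNOWN_TRACKER_DOMAINS = ['tracker.example.net'];

// True if `host` equals `domain` or is a subdomain of it.
function isUnder(host, domain) {
  return host === domain || host.endsWith(`.${domain}`);
}

// Flag a subdomain of the site whose CNAME target belongs to a known tracker.
function isCloaked(hostname, cnameTargets, siteDomain) {
  if (!isUnder(hostname, siteDomain)) return false; // only first-party hosts can be cloaked
  return cnameTargets.some((target) =>
    KNOWN_TRACKER_DOMAINS.some((domain) => isUnder(target, domain)));
}

// The CNAME targets would normally come from a DNS lookup, for example
// require('dns').promises.resolveCname('metrics.shop.example').
const cloaked = isCloaked('metrics.shop.example', ['abc123.tracker.example.net'], 'shop.example');
const benign = isCloaked('cdn.shop.example', ['edge.cdn.example.org'], 'shop.example');
console.log(cloaked, benign);
```

This also shows why the technique is hard to counter from inside a web page: the check depends on DNS data that page scripts cannot see.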

Ultimately, stemming the tide of invasive web tracking will likely require a combination of technical solutions, user education, and strong privacy regulations. As developers, we have a key role to play in advocating for user privacy and implementing ethical data practices in our work.

In the words of technologist and EFF special advisor Cory Doctorow:

"Privacy" is a word that gets used a lot but means different things to different people. When I advocate for privacy, I'm not arguing for the right to keep secrets: I believe that transparency is essential for free and fair markets and societies. What I am advocating for is privacy from surveillance: the right not to have your activities tracked and monitored by companies and governments who have no business snooping on you. And as a web developer, you have a special obligation to safeguard the privacy of your users against snoops, spies and other bad actors.

By keeping privacy at the forefront of our work, we can help build a web that respects and protects the rights of users. It won't be easy, but it's a responsibility we must embrace if we want the internet to remain an open and empowering force for good.
