2023.05.04

Ethical Understanding

Last updated 2023.05.09

Talking about tracking, telemetry, and analytics on the web can be a tricky business. Companies that have built monopolies around online surveillance in the service of selling targeted ads have put regular people on the defensive whenever they hear these words, and omnipresent cookie banners remind people of it everywhere they go. What if we could talk about collecting data in a new way, the way that social scientists talk about it? Could we create the knowledge and understanding we want while respecting our users as guests on our sites?

What Websites Want

When we run a website, especially a marketing site for a product, there are a few essential things we want to know that help us make better decisions about where to spend our time and money.

Where is our traffic coming from?

From a marketing perspective, not only do we want to understand referring sites, but we want to understand which campaigns are delivering traffic and which are not. This lets us know which communities to focus our efforts on, which ad campaigns are generating click-throughs, and if affiliate programs are doing what we hope.
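As a minimal sketch of how this kind of attribution could work without identifying anyone: the example below assumes campaigns are tagged with the widely used `utm_campaign` query parameter (the article does not name a tagging scheme), and the function name `traffic_source` is hypothetical. It keeps only the referring hostname rather than the full referring URL.

```python
from urllib.parse import urlparse, parse_qs


def traffic_source(page_url: str, referrer: str = "") -> dict:
    """Reduce a page view to a coarse, non-identifying traffic source."""
    query = parse_qs(urlparse(page_url).query)
    # Assumption: campaigns are tagged with ?utm_campaign=...
    campaign = query.get("utm_campaign", [None])[0]
    # Keep only the hostname; paths and query strings can leak personal details.
    host = urlparse(referrer).hostname if referrer else None
    return {"referrer_host": host, "campaign": campaign}
```

Counting page views by `referrer_host` and `campaign` is enough to see which communities and campaigns are sending traffic, with no per-person record involved.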

What parts of the site do people find valuable?

Sure, we think we know which features and content are the most useful, but real people will always surprise us. When we remove our own biases towards what's meaningful and see where people are choosing to spend their time, we can learn more about what people actually need.

How do people move through our site?

Is article recirculation working? How about cross-posts and back-links? Is a certain section of your product always sending people scrambling for the documentation? Is there a hidden gem that is getting lost in the shuffle? The web is a dynamic, non-linear medium, and how people engage with what's there can make a real difference.

Where do people leave?

Bounce rates can tell us a lot about what people are looking for, and whether we've managed to keep their attention. The places where we lose our audience (which pages, and which sections of pages) can give us valuable hints about what we should try next.

What's driving the outcomes we care about?

At the end of the day, we have results we care about. How can we know that what we did helped achieve those results?

What People Want

As people on the internet, we want some very different things. Above all else, we want to be treated with dignity and respect, not to have our attention and habits treated as commodities sold to the highest bidder.

Do not track any personally identifying information about me.

Connecting online behaviors and actions to individual human beings is not only an invasion of privacy, but can result in meaningful harm for those human beings.

Do not track me across domains.

Connecting an individual's patterns from Site A to Site B can result in a personally identifying profile of that person, and it significantly raises the risk of harm if the data is mishandled. Cross-domain tracking increases risks dramatically, usually with little benefit beyond selling ads.

Do not release my data to a third party.

Having our visits to a given website counted is usually seen as benign, and we tend not to mind. But as soon as that information becomes valuable enough to sell, the data collectors cross a line.

Do not use my data for ads.

As with third-party data, people are tired of being grist for the mill of unending surveillance capitalism.

Tell me what's being collected.

People want to know what you know about them, and being transparent with them is a way to build trust and remain on the right side of ethical frameworks.

Allow me to opt out.

A big part of treating people with dignity is respecting them when they say no. Making it as easy as possible for people to opt out of tracking, and to have any data regarding them deleted without fuss, is essential.

Finding The Solution

None of these interests, those of the website or those of the people, need to be in conflict. It’s perfectly possible to create an internet that lets websites understand themselves and that treats people ethically, with dignity and respect.

Data Privacy Laws

To protect their citizens, a number of countries and states have created data privacy laws that regulate how companies can collect data about them. The two big ones are the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).

GDPR

The GDPR was passed by the European Union to protect users' data and privacy and to create transparency around data collection practices. It requires that people be able to understand what data is being collected and to adjust or delete that data, and it limits the legal bases under which websites may collect that data. The GDPR is a good tool for understanding legal and ethical obligations around consent.

CCPA

The CCPA has a lot of the same provisions as the GDPR, especially when it comes to transparency and the right to opt out. It also focuses on reselling of data, anti-bias and non-discrimination, and additional protections for minors. The CCPA is a good starting point for thinking about ethical considerations around collecting data from vulnerable and minoritized groups.

Other Privacy Laws

Other privacy laws include the DPA in France (users must provide free, informed, specific and unequivocal consent), PIPEDA in Canada (website and app operators must get “express consent” for the installation of certain “computer programs”), and the APP in Australia (an APP entity must not collect sensitive information about an individual [with exceptions]).

This is something that states are taking seriously, and navigating these laws can be a minefield.

Data Privacy Frameworks

Alongside the actual laws that govern data collection practices, there are documents that provide guidance to states and companies alike. These data privacy frameworks, while not legally binding or enforceable, make statements about how the internet should be.

ePrivacy Directive

The ePrivacy Directive is the biggest presence in this space. As an official document of the European Union, the Directive is meant to help member states craft their own legislation, like the GDPR. The ePrivacy Directive is both somewhat dated (being published in 20??, with an update several years overdue) and incredibly conservative.

Under the ePrivacy Directive, explicit user consent is required to store any data that originates from someone’s computer beyond what is strictly necessary to deliver the service they’ve requested. Under the narrowest reading, even most server monitoring and observability tools are in violation of this directive.

While the Directive is itself not a law, its strict defense of people’s right to privacy should make us think twice. Do we really need that geolocation and browser version?

EFF Data Collection Best Practices

The Electronic Frontier Foundation also provides guidance for collecting user data. They themselves use Matomo in a very locked-down, customized setup.

The key takeaways from their guidance are to collect only the data you actually need and to understand the worst-case scenario around that data.
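One way to make "collect only what you need" concrete is an explicit allowlist applied before anything is stored. The sketch below is illustrative only: the field names in `ALLOWED_FIELDS` and the function `minimize` are hypothetical and are not taken from the EFF's guidance or from any particular analytics tool.

```python
# Hypothetical allowlist: the only fields this site has decided it needs.
ALLOWED_FIELDS = frozenset({"path", "referrer_host", "campaign"})


def minimize(event: dict) -> dict:
    """Drop every field that is not explicitly allowlisted."""
    return {key: value for key, value in event.items() if key in ALLOWED_FIELDS}
```

With an allowlist, a new field such as an IP address or a user agent string is discarded unless someone deliberately decides it is needed, which also keeps the worst-case scenario for a leak small.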

Data Ethics

Of course, the internet is not the only place where data gets collected. The biomedical and social sciences have been considering this problem for far longer than the tech industry has, and they have their own regulations and ethical frameworks that can guide our thinking.

The Institutional Review Board

Any university that wants to conduct research on human subjects must do so with the oversight of an Institutional Review Board (IRB). Researchers submit proposals for what data they want to collect, why, and to what ends. The IRB’s job is to consider the ethical implications of that research and approve or deny it.

The IRB has rigorous considerations for any sensitive or minoritized group. Research with children, unhoused populations, or other groups that could be actively harmed is given serious attention and scrutiny.

Compared to social science research institutions, the internet is the Wild West. While nothing like the IRB is likely to ever exist for product analytics, it’s worth learning from a system that works hard to protect everyday people.

The Belmont Principles

Since the late 1970s, medical research in the US has been governed by ethical principles laid out in the Belmont Report. These principles are respect for persons, beneficence, and justice.

In short, the Belmont Principles state that we must respect our participants' decisions, protect them from harm, and share with them the benefits of our research. Each of these principles can apply to our own data collection projects.

Anti-racism & Anti-fascism

Today, any system we work to create should take the position of being anti-racist and anti-fascist by default. It’s not enough to try to avoid perpetuating racism and fascism. We should always be looking to understand how what we build can work to create a more just and more equitable internet.

Putting It All Together

It’s entirely possible to understand your website or product in a way that aligns with regulatory requirements, ethical best practices, and people’s expectations.

Your analytics solution should be collecting no data by default, disclosing what data you do collect, allowing anyone to opt out and have their data deleted, being intentional about what you collect and why, and taking that data’s security seriously.

This is the approach that Pushbroom takes — a privacy-first stance on data collection for a better internet.
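To illustrate just the opt-out part of that checklist, here is a small sketch of a server-side check that runs before any event is recorded. It is not Pushbroom's implementation. It assumes the site honors the `DNT` and `Sec-GPC` request headers (neither is named in this article) plus a hypothetical site-level opt-out flag, here passed in as `opt_out_cookie`.

```python
def should_collect(headers: dict, opt_out_cookie: bool = False) -> bool:
    """Return True only when the visitor has not signalled an opt-out."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if opt_out_cookie:
        # Hypothetical site-level opt-out set by the visitor.
        return False
    # Assumption: treat Global Privacy Control and Do Not Track as opt-outs.
    if normalized.get("sec-gpc") == "1" or normalized.get("dnt") == "1":
        return False
    return True
```

Putting the check in front of every write means that a "no" from the visitor results in nothing being stored at all, rather than data being stored and filtered out later.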

References

  1. https://gdpr-info.eu/
  2. https://oag.ca.gov/privacy/ccpa
  3. https://link.springer.com/article/10.1007/s11948-022-00380-7
  4. https://www.eff.org/pages/online-privacy-nonprofits
  5. https://www.eff.org/deeplinks/2019/06/effs-recommendations-consumer-data-privacy-laws