I am writing my master's thesis on datafication, algorithmic transparency and the governance of algorithms. I will use an empirical investigation of my datafied self as the background for raising and investigating three questions: why we need algorithmic transparency and governance of algorithms, what the current debates are, and how both can be achieved given the specific nature of algorithms and platforms.
I will investigate my datafied self, that is, how platforms, apps and search engines datafy me through their practices, in two ways: with automated tools and through self-observation. Adopting a technographic approach, I will also look into the discrepancy between my algorithmic identity, together with the database of intentions ascribed to me by algorithms, and my own sense of individual identity at this point in time: my data selfie vs. the real me.
As we spend the majority of our online time on our smartphones, switching from app to app, I wanted to find out who has access to our apps’ data, what kind of data these actors can access, what kind of data they gather and are interested in, and who else is granted access to a particular app’s data.
So I started with the first phase: looking into who tracks me through the apps installed on my phone and what permissions each app asks for. This will enable me to investigate who is collecting what kind of data on me (not just the apps, but the companies behind them) and with whom it is shared (third-party access), and, by looking at the “permissions”, to assess what kinds of data and metadata these apps have access to.
As this is just the first step, the next task is to investigate the trackers by category, as that will give additional insight into the scope of access and the kinds of data involved (just think of analytics, for example).
The first visualisation shows the trackers per app. I was surprised by some of the findings: for example, that Twitter has only Google trackers, by the number of trackers in some apps (Duolingo, for example), and by apps that had publicly announced that they would restrict third-party data sharing (like Bumble). However, as not all trackers are “created equal”, my next step is to code them by type (analytics, advertising) and investigate which types are most represented in the dataset and in each app.
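The coding step described above could be sketched roughly as follows. This is only a minimal illustration in Python: the app names, tracker names and the category assigned to each tracker are hypothetical placeholders, not the findings of this post, and the coding scheme itself would have to be built by hand.

```python
from collections import Counter

# Hypothetical sample: app name -> trackers detected in that app.
# Placeholder values only, not the actual dataset.
app_trackers = {
    "app_a": ["tracker_x", "tracker_y"],
    "app_b": ["tracker_x", "tracker_z"],
}

# Hypothetical manual coding scheme: tracker name -> type.
tracker_category = {
    "tracker_x": "analytics",
    "tracker_y": "advertising",
}


def categories_per_app(app_trackers, tracker_category):
    """Count tracker types for each app; unknown trackers are marked 'uncoded'."""
    return {
        app: Counter(tracker_category.get(t, "uncoded") for t in trackers)
        for app, trackers in app_trackers.items()
    }


def categories_overall(app_trackers, tracker_category):
    """Sum the per-app counts to see which types dominate the whole dataset."""
    total = Counter()
    for counts in categories_per_app(app_trackers, tracker_category).values():
        total.update(counts)
    return total


per_app = categories_per_app(app_trackers, tracker_category)
overall = categories_overall(app_trackers, tracker_category)
```

Keeping an explicit "uncoded" bucket makes it visible how many trackers still need to be classified before any comparison between types is meaningful.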
The same data, visualised per tracker rather than per app, gives a somewhat different overview. In the visualisation below, one can see how widely particular trackers (and the companies behind them) are present in the dataset.
The dominance of Google and Facebook trackers is not surprising; it mirrors the general state of web tracking, where these two players dominate the tracker network (according to DuckDuckGo’s Privacy Essentials extension, Google is found on 43% of the websites I’ve visited, and Facebook on 23%).
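The per-tracker view discussed above can be derived from the same app-to-tracker mapping by inverting it. The sketch below assumes that mapping is available as a Python dictionary; the names in it are hypothetical placeholders rather than the real data.

```python
from collections import defaultdict

# Hypothetical sample: app name -> trackers detected in that app.
sample = {
    "app_a": ["tracker_x", "tracker_y"],
    "app_b": ["tracker_x"],
}


def apps_per_tracker(app_trackers):
    """Invert the mapping: tracker name -> set of apps it appears in."""
    index = defaultdict(set)
    for app, trackers in app_trackers.items():
        for tracker in trackers:
            index[tracker].add(app)
    return dict(index)


def tracker_share(app_trackers):
    """Share of apps containing each tracker, most widespread first."""
    n_apps = len(app_trackers)
    if n_apps == 0:
        return {}
    shares = {
        tracker: len(apps) / n_apps
        for tracker, apps in apps_per_tracker(app_trackers).items()
    }
    # Sort by descending share, then by name for a stable order.
    return dict(sorted(shares.items(), key=lambda kv: (-kv[1], kv[0])))


shares = tracker_share(sample)
```

Using a set per tracker means a tracker listed twice for one app is still counted once, so the share reflects presence in apps rather than the raw number of detections.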
What is left to be done is an analysis of the permissions each app asks for. Some of these permissions are common and necessary, but many are what the GDPR would call “excessive”: not necessary for the app to function and provide a smooth service. This is a work in progress, and this post will be updated soon.
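One possible way to approach the permissions analysis is to compare what each app requests against a hand-made baseline of what it plausibly needs. The sketch below is an assumption about how that could be done, not a finished method: the app name and both permission sets are hypothetical, and deciding what counts as "needed" remains a manual judgment that code cannot make.

```python
# Hypothetical baseline: permissions judged necessary for the app's core
# function. This is a manual, subjective assessment, not a legal test.
needed = {
    "example_messenger": {
        "android.permission.INTERNET",
        "android.permission.CAMERA",
    },
}

# Hypothetical list of permissions the app actually requests.
requested = {
    "example_messenger": {
        "android.permission.INTERNET",
        "android.permission.CAMERA",
        "android.permission.READ_CONTACTS",
        "android.permission.ACCESS_FINE_LOCATION",
    },
}


def permissions_to_review(requested, needed):
    """Return, per app, the requested permissions outside the baseline."""
    return {
        app: sorted(perms - needed.get(app, set()))
        for app, perms in requested.items()
    }


review = permissions_to_review(requested, needed)
```

The result is a list of permissions to examine more closely, not a verdict: a permission outside the baseline may still be justified by a feature the baseline overlooked.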
Published November 9, 2018.