Winter semester 2018/2019, BA/MA Product Design
Algorithmic Bias
Team:
Freie Universität Berlin: Eva-Maria Herbst, Nina Matthias, Eduard Gette
Kunsthochschule Weißensee: Cara Celine Schlenzig, Maximilian Blazek
Concept Outline:
The exponential increase in computing power and available data has led to drastic improvements in the field of machine learning. Because they are remarkably effective, these powerful tools are deployed more and more often in fields previously reserved for human beings.
While most people readily understand how this could increase productivity, only a few truly understand the problems that arise when algorithmic systems are used in human domains. One of the biggest issues is the implicit introduction of human bias into the system, either through the people who create the system or through the nature of the data used for training.
One striking example of how biased data leads to human bias manifesting in algorithmic systems is the ML systems used in the U.S.A. that are intended to help judges determine the risk levels of offenders. Unfortunately, they showed a strong bias against African Americans, since this group was overrepresented in the data used for training.
In our project, we tried to visually demonstrate the existence and the effect of bias in a very well-known set of word embeddings learned by the GloVe algorithm, showing that algorithms can indeed be seen as mirrors of our society.
Use Case:
What we came up with is an interactive mirror in which you see a reflection of the algorithmic assumptions made based on your input.
The user stands in front of this “mirror”, which is connected to an input device that is accessible to the visitor and is used to enter a few pieces of information about themselves.
At first the user can still see themselves, but as soon as the user finishes the input, they are overlaid with a 3D model that mimics their movements.
The 3D model, chosen from a pool of hundreds of pre-stored 3D models, is picked based on the “similarity” between the user’s input text and a text description of the corresponding 3D model.
The prototype:
In our prototype the user is prompted to enter three characteristics and three hobbies to describe themselves, meaning the input consists of six words.
Each of those words is looked up in our GloVe word embeddings dataset to find its vector representation. The average of these vectors then represents the user’s input in the GloVe space.
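The averaging step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `average_embedding`, the toy three-dimensional vectors, and the choices to lowercase the input and skip words without an embedding are all assumptions made for the example.

```python
import numpy as np

# Toy 3-dimensional stand-in for the real GloVe table (values are made up).
TOY_EMBEDDINGS = {
    "curious": np.array([0.2, 0.7, 0.1]),
    "calm": np.array([0.4, 0.5, 0.3]),
    "hiking": np.array([0.9, 0.1, 0.2]),
}


def average_embedding(words, embeddings):
    """Return the mean vector of all input words found in `embeddings`.

    Assumption: words are lowercased and unknown words are skipped.
    """
    vectors = [embeddings[w.lower()] for w in words if w.lower() in embeddings]
    if not vectors:
        raise ValueError("none of the input words has an embedding")
    return np.mean(vectors, axis=0)


# "unknownword" has no embedding here and is ignored.
user_vector = average_embedding(
    ["Curious", "calm", "hiking", "unknownword"], TOY_EMBEDDINGS
)
```

In the prototype the list would hold the six words entered by the user, and the dictionary would hold the full GloVe vocabulary.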
By GloVe space we mean a set of vectors, each representing a word in natural language, that was learned by an algorithm called “GloVe” on a text corpus consisting of Wikipedia articles and the Gigaword 5 corpus. The “goal” of the algorithm was to learn which words “belong together”; therefore, words that the algorithm considers similar are very close to each other in the vector space.
As already mentioned, we store a set of tags describing each 3D model together with the model itself. We can therefore calculate a representation in the GloVe space for each 3D model in the same way as for the user.
Finally, by calculating a metric called cosine similarity, we can assign and show the user the model most similar to their input. At the current state of the prototype, the model does not yet move together with the user.
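Pre-trained GloVe vectors are commonly distributed as plain text files in which each line holds a word followed by its vector components, separated by spaces. The sketch below shows one way such a file could be read into a dictionary. The function name `load_glove` and the tiny in-memory sample with made-up values are assumptions for illustration; how the prototype actually loads its data is not described here.

```python
import io

import numpy as np


def load_glove(lines):
    """Parse GloVe-style text lines ("word v1 v2 ...") into {word: vector}."""
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        embeddings[word] = np.array([float(x) for x in parts[1:]])
    return embeddings


# Tiny in-memory sample standing in for a real file (values are made up).
sample = io.StringIO("cat 0.1 0.2 0.3\ndog 0.1 0.25 0.3\ncar 0.9 -0.4 0.0\n")
glove = load_glove(sample)
```

With a real vectors file, the same function could be called on an open file handle instead of the in-memory sample.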
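The matching step can be sketched like this. Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it compares direction rather than magnitude. The model names, the toy vectors, and the helper names `cosine_similarity` and `most_similar_model` are hypothetical and only serve the example.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_similar_model(user_vector, model_vectors):
    """Return the name of the model whose vector is closest in direction."""
    return max(
        model_vectors,
        key=lambda name: cosine_similarity(user_vector, model_vectors[name]),
    )


# Hypothetical averaged tag vectors for three stored 3D models.
MODEL_VECTORS = {
    "model_a": np.array([1.0, 0.0, 0.0]),
    "model_b": np.array([0.0, 1.0, 0.0]),
    "model_c": np.array([0.6, 0.8, 0.0]),
}

user_vector = np.array([0.5, 0.9, 0.1])
best = most_similar_model(user_vector, MODEL_VECTORS)
```

Because both the user input and the model tags are averaged into a single vector each, one comparison per stored model is enough to pick the match.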
Outlook:
The prototype was created with a product in mind that could be used in an exhibition setting related to topics such as ethics in machine learning.
We therefore designed the interaction to be short and concise and to require little to no prior knowledge of the field to understand the message being conveyed. Through this interaction we want to raise awareness of the careless use of algorithmic systems in socially impactful domains, since it can potentially cause an unintentional downward spiral that solidifies existing biases.
What we imagine for the product in the future, besides the missing functionality described in the use case, is dynamic generation of 3D models instead of pre-defined models, e.g. through a Generative Adversarial Network (GAN). The reasons are the much greater variety and the far better match to the exact input of the user.
Video URL:
GitHub:
github.com/TransparencyTango/tangorithm