AI meets human language – Explorative perspectives on Word Embeddings in NLP by Anna-Lena Keith

Nowadays, the term artificial intelligence is familiar to almost everybody. “Intelligent” devices and algorithms are everywhere, for example as digital assistants or chatbots. The notion that an algorithm can handle human language, a very complex construct, has always baffled me. After all, it is just code written by someone, plus some very complicated maths. Can a machine truly “learn”? Can it “understand” human language?

When I started my bachelor project as a design student, I knew next to nothing about how machine learning algorithms work. I wanted to explore what “learning” means for a machine, and whether it in any way resembles what humans do. My dataset consists of so-called “word embeddings”: each word is assigned a multidimensional vector. By comparing the vectors in multidimensional space, you also compare the relationships of those words to each other.
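
To make this comparison concrete, here is a minimal sketch in plain JavaScript (the language behind p5.js). The three-dimensional vectors and the helper names are made up for illustration; real Word2Vec embeddings typically have many more dimensions. Cosine similarity is one common way to compare such vectors, and is not necessarily the measure used in this project.

```javascript
// Toy embeddings: invented numbers, NOT taken from the project's dataset.
const embeddings = {
  king: [0.9, 0.8, 0.1],
  queen: [0.8, 0.9, 0.1],
  apple: [0.1, 0.2, 0.9],
};

// Dot product of two vectors of equal length.
function dot(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Euclidean length of a vector.
function norm(a) {
  return Math.sqrt(dot(a, a));
}

// Cosine similarity: close to 1 when two vectors point in a similar
// direction, close to 0 when they are unrelated.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  return dot(a, b) / (norm(a) * norm(b));
}

// Find the word whose vector is most similar to the given word's vector.
function mostSimilar(word, table) {
  let bestWord = null;
  let bestScore = -Infinity;
  for (const [other, vector] of Object.entries(table)) {
    if (other === word) continue; // skip the word itself
    const score = cosineSimilarity(table[word], vector);
    if (score > bestScore) {
      bestScore = score;
      bestWord = other;
    }
  }
  return bestWord;
}

console.log(cosineSimilarity(embeddings.king, embeddings.queen));
console.log(cosineSimilarity(embeddings.king, embeddings.apple));
console.log(mostSimilar("king", embeddings));
```

With these toy numbers, “king” scores much higher against “queen” than against “apple”, which is the kind of relationship the vectors are meant to capture.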

In my understanding, word embeddings show an AI's perspective on human language, its “vocabulary”, if you will. By visualising this dataset, I gained a better understanding of the possibilities and limitations of AI. I also came to see that it is only maths after all, but a powerful tool nonetheless.

I was in an unusual position: someone who started with almost zero knowledge about a technical topic and tried to understand it over the span of just a few months. My bachelor project documents this journey. My word embedding dataset was created by the machine learning algorithm Word2Vec, using TED talk transcripts as training data. Programming with p5.js, I designed multiple interactive data visualisations. The work is rounded off with illustrated explanations and excerpts from my research. It is aimed at people without a technical background, but also at experts who are interested in a beginner's perspective on the topic.