Schäfer: Bachelor Thesis - Random Feature Transformer for classification

by Jörg Schäfer

We offer the following bachelor thesis in our research group:

Introduction into the topic

Transformer models have reshaped the landscape of natural language processing (NLP) with their groundbreaking architecture, consistently delivering state-of-the-art performance across a spectrum of language tasks. Serving as the cornerstone of Large Language Models (LLMs) such as ChatGPT and Google Bard, transformers have become indispensable tools in the NLP domain. Moreover, their utility extends far beyond NLP: transformers are nowadays also used for classification tasks in other domains, such as image classification and time series classification.

Task:

The heart of the transformer architecture lies in its learnable embedding matrices, namely the query, key, and value matrices W_Q, W_K, and W_V, which evolve during training. A pertinent research question emerges:

What is the efficacy of transformers when we forego learning these matrices and opt for fixed random alternatives instead?

This deviation represents a random feature approach. To empirically evaluate this, we intend to compare the performance between a vanilla transformer model, such as the Annotated Transformer, and its counterpart utilizing fixed random embedding matrices. This investigation aims to elucidate the importance of learned embeddings in transformer models and the potential consequences of employing random features in place of learned ones.
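
For orientation, a minimal PyTorch sketch of the random feature idea is given below: the query, key, and value projections are drawn once at construction time and kept fixed (registered as buffers, so the optimizer never touches them). All class and variable names here are illustrative and not part of the thesis specification.

```python
import torch
import torch.nn as nn


class RandomFeatureAttention(nn.Module):
    """Single-head self-attention whose query/key/value projections are
    fixed random matrices rather than learned parameters.
    (Sketch only; class and variable names are illustrative.)"""

    def __init__(self, d_model: int):
        super().__init__()
        # Draw the projection matrices once and freeze them: register_buffer
        # keeps them out of the optimizer while still moving with .to(device).
        self.register_buffer("w_q", torch.randn(d_model, d_model) / d_model ** 0.5)
        self.register_buffer("w_k", torch.randn(d_model, d_model) / d_model ** 0.5)
        self.register_buffer("w_v", torch.randn(d_model, d_model) / d_model ** 0.5)
        self.scale = d_model ** 0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = x @ self.w_q, x @ self.w_k, x @ self.w_v
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.scale, dim=-1)
        return attn @ v
```

In the vanilla baseline the same projections would be ordinary nn.Linear layers whose weights are updated by the optimizer, so the comparison isolates the effect of actually learning these matrices.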

Expectation:

  • Implement a transformer architecture with random embedding matrices; you can start from an existing codebase such as the Annotated Transformer
  • Find a “good” (i.e., commonly used) classification task and compare the performance of the vanilla transformer with its random feature counterpart (a minimal evaluation sketch follows this list)
  • Incorporate visual representations of your findings within your thesis
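
For the comparison step, an evaluation sketch (again PyTorch, with purely hypothetical names such as vanilla_model, random_feature_model, and test_loader that would come from your own pipeline) could look as follows; applying the same function to both models ensures that only the treatment of the embedding matrices differs.

```python
import torch
import torch.nn as nn


def accuracy(model: nn.Module, loader) -> float:
    """Classification accuracy of `model` on a DataLoader that yields
    (inputs, labels) batches."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            preds = model(inputs).argmax(dim=-1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total


# Hypothetical usage: the two models and the test loader come from your
# own training pipeline.
# print("vanilla:         ", accuracy(vanilla_model, test_loader))
# print("random features: ", accuracy(random_feature_model, test_loader))
```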

Skills Required:

  • Basic understanding of Machine Learning
  • Experience in Python
  • Programming or modifying ML pipelines (PyTorch or TensorFlow)
  • Profound interest or expertise in the attention mechanism

Supervisor: Marius Lotz (mailto:marius.lotz@fb2.fra-uas.de)