Bachelorthesis: Attention under different similarity scores
We offer the following bachelor thesis in our research group:
Introduction
In recent years, self-attention architectures have become widely used across many domains. At the core of these architectures lies the self-attention layer, which computes a similarity score using the formula:
$$\mathrm{ATT}(x) \sim \mathrm{softmax}\left(xW^Q (W^K)^T x^T\right) = \mathrm{softmax}\left(QK^T\right), \quad \text{with } Q = xW^Q,\; K = xW^K$$
However, the rationale behind this particular similarity score remains somewhat opaque. Hence, we aim to explore whether alternative similarity scoring mechanisms could offer improvements, tailored to specific fields of interest. These fields may include, but are not limited to:
- Time Series Classification
- Time Series Prediction
- Time Series Regression
By investigating alternative similarity scores, we seek to enhance the effectiveness and applicability of self-attention architectures within these specialized domains.
Task
We will provide the general machine learning model. You will implement the different similarity scores (simple functions) within the given framework, then train and test the model on the provided datasets.
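To give a feel for the scope of the task, the sketch below shows a single-head self-attention layer where the similarity score is a swappable function. It is a minimal illustration in NumPy (the actual framework uses PyTorch, and the function names here are hypothetical, not part of the provided codebase); the two example scores are the standard scaled dot product and, as one possible alternative, a negative squared Euclidean distance.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax along the given axis
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_score(q, k):
    # standard similarity: Q K^T, scaled by sqrt(d_k)
    return q @ k.T / np.sqrt(k.shape[-1])

def neg_euclidean_score(q, k):
    # one possible alternative: negative squared Euclidean distance
    return -((q[:, None, :] - k[None, :, :]) ** 2).sum(axis=-1)

def attention(x, W_q, W_k, W_v, score=dot_product_score):
    # x: (seq_len, d_model); W_*: (d_model, d_k) projection matrices
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    weights = softmax(score(q, k))   # (seq_len, seq_len), rows sum to 1
    return weights @ v               # (seq_len, d_k)
```

Swapping the similarity score then amounts to passing a different function, e.g. `attention(x, W_q, W_k, W_v, score=neg_euclidean_score)`, which is essentially the kind of change the thesis would explore systematically.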
Expectation
- Adapt the machine learning pipeline to accommodate the selected models
- Train and test the model
- Perform rigorous statistical evaluation of the models' performance.
Skills required
- Basic background on ML models
- Able to modify ML pipelines (PyTorch)
- Basic background in Statistics
Advantages
- Engaged in a topic of high interest within our research group
- Possibility of inclusion in a resulting research paper
Supervisor: [Marius Lotz](mailto:marius.lotz@fb2.fra-uas.de)