Bachelorthesis: Attention under different similarity scores
We offer the following bachelor thesis in our research group:
Introduction
In recent years, self-attention architectures have become widely used across many domains. At the core of these architectures lies the self-attention layer, which computes a similarity score using the formula:
$$\mathrm{ATT}(x) \sim \mathrm{softmax}\left(xW^Q (W^K)^T x^T\right) = \mathrm{softmax}\left(QK^T\right), \quad \text{with } Q = xW^Q,\; K = xW^K$$
However, the rationale behind this particular similarity score remains somewhat opaque. Hence, we aim to explore whether alternative similarity scoring mechanisms could offer improvements, tailored to specific fields of interest. These fields may include, but are not limited to:
- Time Series Classification
- Time Series Prediction
- Time Series Regression
By investigating alternative similarity scores, we seek to enhance the effectiveness and applicability of self-attention architectures within these specialized domains.
Task
We will provide the general machine learning model. You will implement the different similarity scores (simple functions) within the given framework, then train and test the model on the provided datasets.
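To give a feel for the scope of the task, the sketch below shows a single-head self-attention layer where the similarity score is a swappable function. It is a minimal illustration in NumPy (the actual framework uses PyTorch, and the function names here are hypothetical, not part of the provided codebase); the two example scores are the standard scaled dot product and, as one possible alternative, a negative squared Euclidean distance.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax along the given axis
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_score(q, k):
    # standard similarity: Q K^T, scaled by sqrt(d_k)
    return q @ k.T / np.sqrt(k.shape[-1])

def neg_euclidean_score(q, k):
    # one possible alternative: negative squared Euclidean distance
    return -((q[:, None, :] - k[None, :, :]) ** 2).sum(axis=-1)

def attention(x, W_q, W_k, W_v, score=dot_product_score):
    # x: (seq_len, d_model); W_*: (d_model, d_k) projection matrices
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    weights = softmax(score(q, k))   # (seq_len, seq_len), rows sum to 1
    return weights @ v               # (seq_len, d_k)
```

Swapping the similarity score then amounts to passing a different function, e.g. `attention(x, W_q, W_k, W_v, score=neg_euclidean_score)`, which is essentially the kind of change the thesis would explore systematically.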
Expectation
- Adapt the machine learning pipeline to accommodate the selected models
- Train and test the model
- Perform rigorous statistical evaluation of the models' performance.
Skills required
- Basic background on ML models
- Able to modify ML pipelines (PyTorch)
- Basic background in Statistics
Advantages
- Engaged in a topic of high interest within our research group
- Possibility of inclusion in a resulting research paper
Supervisor: [Marius Lotz](mailto:marius.lotz@fb2.fra-uas.de)