Anthony Richardson, M. Sc.
Contact
anthony.richardson@uni-bremen.de
Enrique-Schmidt-Str. 5
Cartesium, 3.043
28359 Bremen
ORCID: 0009-0002-0524-8294
Google Scholar: Anthony Richardson
LinkedIn: anthony-richardson
Open Thesis Topics
Master Thesis: Enhanced Video Question-Answering on Unpaired Video Data
The goal of this Master Thesis is to create a machine learning model for Video Question-Answering that can leverage large amounts of unpaired video data through unsupervised learning.
More information can be found here.
Master Thesis: Online Hyperparameter Adaptation via Learned Controllers
The goal of this Master Thesis is to pair a machine learning model with a lightweight controller that dynamically adjusts the training hyperparameters over the course of training.
More information can be found here.
Teaching
Seminar: Bremen Big Data Challenge
Master Project: ConVRge
Course: Advanced Machine Learning
Publications
CogniFuse and Multimodal Deformers: A Unified Approach for Benchmarking and Modeling Biosignal Fusion
Full paper: [pdf]
Authors: Anthony Richardson, Michael Beetz, Tanja Schultz, Felix Putze
Conference: 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2025)
Abstract: Continuously monitored physiological signals carry rich information about the human body and the biological processes happening within. Extracting this information from casually collected biosignal data in activities of daily living holds great potential for real-time monitoring of physical and mental states, but comes with increased difficulty due to the influence of noise and artifacts. Thus, we create CogniFuse, the first publicly available multi-task benchmark for multimodal biosignal fusion in such challenging environments. For many biosignals, especially electrophysiological signals, the information contained in different frequency bands plays a significant role in analyzing the physiological states of the body. Therefore, we introduce a group of novel fusion models, called Multimodal Deformers, that capture multi-level power features as well as long- and short-term temporal dependencies in multimodal biosignal data. In particular, our proposed Multi-Channel Deformer achieves the highest average benchmark score, outperforming all comparison models. To ensure full transparency and reproducibility, and to support future research on multimodal biosignal fusion, all code and data are made publicly available.
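The Deformer architecture itself is not reproduced here, but the role that frequency-band power features play in multimodal biosignal fusion can be sketched in a few lines. The signals, band boundaries, and fusion-by-concatenation below are illustrative assumptions for a minimal example, not the paper's model:

```python
import numpy as np

def band_power_features(signal, fs, bands):
    """Mean power of `signal` in each frequency band, via a plain FFT periodogram."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    return np.array([power[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in bands])

def fuse_modalities(signals, fs, bands):
    """Simple late fusion: band-power features per modality, then concatenation."""
    return np.concatenate([band_power_features(s, fs, bands) for s in signals])

# Two hypothetical biosignal channels, sampled at 100 Hz (placeholder noise data)
rng = np.random.default_rng(0)
eeg = rng.standard_normal(1000)
ppg = rng.standard_normal(1000)
bands = [(1, 4), (4, 8), (8, 13), (13, 30)]  # delta/theta/alpha/beta-style bands
fused = fuse_modalities([eeg, ppg], fs=100, bands=bands)
print(fused.shape)  # 2 modalities x 4 bands -> (8,)
```

A downstream classifier would then operate on the fused feature vector; the Deformers in the paper instead learn such multi-level power features jointly with temporal dependencies.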
Motion Diffusion Autoencoders: Enabling Attribute Manipulation in Human Motion Demonstrated on Karate Techniques
Full paper: [pdf]
Authors: Anthony Richardson, Felix Putze
Conference: 27th International Conference on Multimodal Interaction (ICMI '25)
Abstract: Attribute manipulation deals with the problem of changing individual attributes of a data point or a time series while leaving all other aspects unaffected. This work focuses on the domain of human motion, more precisely on karate movement patterns. To the best of our knowledge, it presents the first success at manipulating attributes of human motion data. One of the key requirements for achieving attribute manipulation on human motion is a suitable pose representation. Therefore, we design a novel continuous, rotation-based pose representation that enables the disentanglement of the human skeleton and the motion trajectory, while still allowing an accurate reconstruction of the original anatomy. The core idea of the manipulation approach is to use a transformer encoder for discovering high-level semantics, and a diffusion probabilistic model for modeling the remaining stochastic variations. We show that the embedding space obtained from the transformer encoder is semantically meaningful and linear. This enables the manipulation of high-level attributes by discovering their linear direction of change in the semantic embedding space and moving the embedding along said direction. All code and data are made publicly available.
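The "find a linear direction, move along it" idea can be illustrated without the paper's transformer encoder or diffusion decoder. The sketch below uses a mean-difference direction on synthetic embeddings; the dimensionality, data, and direction estimator are assumptions for illustration, not the method used in the paper:

```python
import numpy as np

def attribute_direction(embeddings, labels):
    """Estimate a linear attribute direction as the difference of class means."""
    pos = embeddings[labels == 1].mean(axis=0)
    neg = embeddings[labels == 0].mean(axis=0)
    d = pos - neg
    return d / np.linalg.norm(d)

def manipulate(z, direction, strength):
    """Move an embedding along the attribute direction by `strength`."""
    return z + strength * direction

# Hypothetical 16-d semantic embeddings where the attribute shifts axis 0
rng = np.random.default_rng(1)
neg = rng.standard_normal((50, 16))
pos = neg + 3.0 * np.eye(16)[0]
emb = np.vstack([neg, pos])
y = np.array([0] * 50 + [1] * 50)

d = attribute_direction(emb, y)          # recovers the first coordinate axis
z_edit = manipulate(emb[0], d, strength=3.0)
```

In the paper, the edited embedding would then be decoded back into a motion sequence by the diffusion model, so that only the targeted attribute changes.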
Behaviour-based detection of Transient Visual Interaction Obstacles with Convolutional Neural Networks and Cognitive User Simulation
Full paper: [pdf]
Authors: Anthony Mendil, Mazen Salous, Felix Putze
Conference: IEEE International Conference on Systems, Man, and Cybernetics (SMC 2021)
Abstract: The performance of humans interacting with computers can be impaired by several obstacles. Such obstacles are called Human-Computer Interaction (HCI) obstacles. In this paper, we present an approach for detecting a transient visual HCI obstacle, the glare effect, from logged user behaviour during system use. The glare effect describes a scenario in which sunlight shines onto the display, resulting in less distinguishable colors. For the detection of this obstacle, one-dimensional and two-dimensional convolutional neural networks (1D and 2D convnets) are utilized. The 1D convnet decides based on temporal sequences, while the 2D convnet uses synthetic images created from those sequences. In order to increase the available training data, a cognitive user simulator is used that implements a generative optimization algorithm to simulate behavioural data. Four ensemble-based systems are implemented, one each for 5, 10, 15 and 20 game rounds. The first two are based on 1D and the other two on 2D convnets. Each system consists of multiple models voting for the final prediction. The accuracies of these systems, in ascending order of rounds, are 72.5%, 82.5%, 80% and 85%.
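The abstract does not specify the exact voting rule, so the ensemble step is sketched here with a plain majority vote over binary model outputs; the model outputs and tie-breaking rule are illustrative assumptions:

```python
import numpy as np

def majority_vote(predictions):
    """Ensemble prediction: the label that most models voted for."""
    predictions = np.asarray(predictions)   # shape: (n_models, n_samples)
    votes = predictions.sum(axis=0)         # number of models predicting 1
    # Ties (possible with an even number of models) break toward 1 here
    return (votes * 2 >= predictions.shape[0]).astype(int)

# Three hypothetical convnet outputs on four samples (1 = glare detected)
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [1, 0, 0, 1]
print(majority_vote([model_a, model_b, model_c]))  # -> [1 0 0 0]
```

Each of the four round-count systems in the paper combines multiple such convnets this way before reporting its final accuracy.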