Moritz Schaller · Research Engineer · London / Vienna

Artificial Intelligence & Machine Learning Researcher
Moritz, 22
Chapter I · Groundwork
§ 01 — What I do

AI & machine learning research

AI & Data Science (3+ years' experience)

  • Computer Vision: 92%
  • Natural Language Processing: 85%
  • Generative AI: 90%
  • Reinforcement Learning: 88%
  • Deep Reinforcement Learning: 85%
  • RLHF: 82%
  • Probability & Statistics: 85%
  • Data Analytics & Visualization: 80%
  • MLOps: 90%
  • Distributed DNN Training: 85%
  • Multi-Agent Models: 85%
  • Neural Networks & HPC: 88%

Programming & Frameworks (3+ years' experience)

  • Python: 95%
  • PyTorch: 92%
  • TensorFlow: 88%
  • NumPy & Pandas: 93%
  • R: 80%
  • C & C++: 90%
  • C#: 90%
  • SQL: 95%
  • MLflow / Weights & Biases: 87%

Computing (2+ years' experience)

  • Docker: 90%
  • Microsoft Azure: 80%
  • Amazon Web Services: 75%
  • Virtualization: 75%

Frontend (2+ years' experience)

  • HTML: 90%
  • CSS: 85%
  • JavaScript: 75%
  • Angular: 85%
  • React Native: 85%

Backend (2+ years' experience)

  • Python (Flask, FastAPI): 95%
  • Node.js: 80%

Misc (2+ years' experience)

  • Git: 90%
  • Linux: 90%
  • DevOps: 75%
  • Software Development: 85%
  • Leadership: 95%
  • Communication: 90%
  • Time Management: 98%
§ 02 — The journey

Experience & education

Professional

Machine Learning Engineer

Eviden Austria (ATOS)
Oct 2024 - Present

Data Science Engineer

Smart Digital
Jan 2023 - Jul 2024

Freelancer

Upwork
Apr 2022 - Dec 2023

Paramedic

Rotes Kreuz Österreich
Jul 2021 - Sep 2024
§ 03 — Research

Research & thesis

MSc Thesis · In progress

Foundational Model for Neurosurgical Video Intelligence

University College London · 2025-2026

Training a foundation model on 100+ hours of unlabelled neurosurgical video — learning instrument detection, pose, tissue segmentation, and 3D reconstruction directly from footage.

BSc Thesis · 2024-2025

Automatic LLM Personalisation with Human Feedback

UAS Vienna · 2024-2025

Enhanced a commercial RAG system for data privacy and automatic user personalisation by training a local LLM with a custom user-embedding layer via RLHF, so that response style adapts to individual personas.

  • 90% improvement in aligning responses to user personas
  • 1.36 s reduction in query time via KV caching
  • 100% data privacy achieved by hosting a local LLM

User embedding visualization

PCA plot showing the learned embeddings for the 'lawyer' (user_1) and 'social media analyst' (user_2). Their separation demonstrates the model successfully captured their distinct preferences.

I enhanced a commercial RAG system to address data privacy risks and enable automatic user personalisation. I developed a local LLM with a custom user-embedding layer that learns individual preferences from star ratings. The final system adapted its response style to different user personas, making a lawyer persona 90% more likely to receive a detailed, formal response.

The Problem

The existing system sent user data to third-party APIs, which was unacceptable for clients with sensitive information. The system also had no memory for user preferences between sessions, causing user frustration.

Technical Solution

System Architecture

Architecture diagram showing the RAG system, user feedback loop, local LLM, and custom user-embedding layer integration.

Architecture

My solution uses a representation-learning approach. I subclassed LlamaForCausalLM to introduce a custom user-embedding layer, which concatenates a learned user-specific vector with the standard word embeddings. A single model can therefore serve all users, which avoids the high cost of training a separate model for each user.
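A framework-free sketch of the idea, not the thesis code: it assumes the user vector is concatenated along the sequence axis, i.e. prepended as one extra position in front of the token embeddings. The class name, table sizes, and initialisation scale are all hypothetical, and in the real model these tables would be trainable tensors inside the Llama subclass.

```python
import random


class UserConditionedEmbedding:
    """Toy embedding layer: looks up token vectors and prepends one user vector."""

    def __init__(self, vocab_size, num_users, dim, seed=0):
        rng = random.Random(seed)

        def table(rows):
            # Small random initial values stand in for learned parameters.
            return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rows)]

        self.token_table = table(vocab_size)  # shared by every user
        self.user_table = table(num_users)    # one row per user (assumed layout)

    def __call__(self, user_id, token_ids):
        user_vec = self.user_table[user_id]
        token_vecs = [self.token_table[t] for t in token_ids]
        # Concatenate along the sequence axis: [user, tok_1, ..., tok_n].
        return [user_vec] + token_vecs


layer = UserConditionedEmbedding(vocab_size=100, num_users=2, dim=8)
prompt = [5, 17, 42]                # hypothetical token ids
lawyer = layer(0, prompt)           # user_1
analyst = layer(1, prompt)          # user_2
print(len(lawyer), len(lawyer[0]))  # 4 8
```

Both users share the token table, so only the first position differs between `lawyer` and `analyst`. That one vector is the per-user state, which is why a single model can serve every persona.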

Training and Optimisation

The model was trained using Reinforcement Learning from Human Feedback (RLHF). I used Group Relative Policy Optimization (GRPO) because its low memory footprint was essential on a single 16 GB Tesla T4 GPU: unlike PPO, it does not require a separate value (critic) model. To make training feasible, I also used LoRA and 8-bit precision. Key-Value (KV) caching reduced response times by 1.36 seconds per query.
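GRPO's memory saving comes from how it computes advantages: several responses are sampled for the same prompt, and each reward is normalised against the mean and standard deviation of its own group, so no learned baseline is needed. The sketch below shows only that step. The linear mapping from star ratings to rewards is a hypothetical choice for illustration, not the scaling used in the thesis.

```python
from statistics import mean, pstdev


def stars_to_reward(stars):
    """Map a 1-5 star rating to a reward in [-1, 1] (hypothetical scaling)."""
    return (stars - 3) / 2


def group_relative_advantages(rewards, eps=1e-6):
    """Normalise each reward against its own group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, each rated by the same user.
ratings = [5, 2, 4, 1]
rewards = [stars_to_reward(s) for s in ratings]
advantages = group_relative_advantages(rewards)
print([round(a, 2) for a in advantages])  # [1.26, -0.63, 0.63, -1.26]
```

Responses rated above the group average get a positive advantage and are reinforced; those below are pushed down. If every response in a group receives the same rating, all advantages are zero and that group contributes no learning signal.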

Results

Prompt: What is a projectile?
Lawyer Persona

A projectile is a body that is thrown into the air or projected through space by the action of a propelling force.

Social Media Analyst

A projectile is a body that is shot, flung, or thrown into the air.

Key Learnings

  • Hyperparameter Tuning: RLHF training is very sensitive to hyperparameters. A high learning rate caused model collapse, after which the model output only the letter "e".
  • Hardware Constraints: Optimisation techniques like GRPO, LoRA, and KV caching were not optional. They were essential to successfully train and run the model on a single 16GB GPU.
  • Dataset Selection: The choice of dataset is important. The short answers in the SQuAD dataset were not suitable for training a model on style and verbosity. The QuAC dataset was more effective.
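
The KV caching mentioned under Hardware Constraints can be illustrated with a toy single-head decoder. This is a simplified sketch, not the thesis implementation: `project` is a hypothetical stand-in for the learned key and value projections, queries are left unprojected, and the call counter exists only to make the saved work visible.

```python
import math

calls = {"project": 0}


def project(vec, factor):
    """Stand-in for a learned key/value projection; counts how often it runs."""
    calls["project"] += 1
    return [factor * x for x in vec]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over all keys and values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def decode_without_cache(tokens):
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, 0.5) for x in prefix]    # recomputed at every step
        values = [project(x, 2.0) for x in prefix]
        outputs.append(attend(tokens[t], keys, values))
    return outputs


def decode_with_cache(tokens):
    keys, values, outputs = [], [], []
    for x in tokens:
        keys.append(project(x, 0.5))    # only the new token is projected
        values.append(project(x, 2.0))
        outputs.append(attend(x, keys, values))
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
calls["project"] = 0
slow = decode_without_cache(tokens)
slow_calls = calls["project"]
calls["project"] = 0
fast = decode_with_cache(tokens)
fast_calls = calls["project"]
print(slow == fast, slow_calls, fast_calls)  # True 20 8
```

Both decoders produce identical outputs, but the cached version projects each token once instead of once per later step. The gap grows quadratically with sequence length, which is where the per-query time saving comes from.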
§ 04 — Selected work

A few projects

Finance AI · Double DQN

MarketVision — multi-modal trading

Trading agent using CNNs to fuse visual, textual, and trend-based market signals.

CV · Railways

Smart Digital — Railway Object Detection

Geospatial monitoring system using a dual-camera setup and autonomous drones.

Distributed Systems

PaperLess

Scalable, searchable document system built from independent Docker containers.

Geo · CV

Geo-AI — country detection

ResNet + CLIP ensemble that geolocates images at country level.

Code Analysis

UMLify

Automated UML diagram generation from source code using static analysis.

Interactive Maps

TourPlanner

Map-based itinerary builder with real-time suggestions and route visualization.

CV · Edge

AgeLens AI

Real-time age estimation from facial images — reliability-focused deep learning.

CV · Multi-Agent

Camel-Detect

Ranking system using multiple specialized models for wildlife auctions.

CV · Industrial

Eviden — Pipe Damage Detection

CNN-based monitoring system identifying regions of interest in industrial video.

NLP · RAG

RAG system — context-aware

Retrieval-augmented generation using word embeddings for minimal hallucinations.

NLP · LSTM

JazzGenius

LSTM-powered music generation trained on jazz corpora to create original melodies.

NN · Voice

VoiceGuard

Intelligent trigger word detection using robust background noise training.

Industrial CV

Cable Length Detection

Mathematical image processing for calculating cable car wire spans.

Deep RL · PyTorch

Cathsim — endovascular navigation agent

Learning to thread a catheter through anatomy with DQN & imitation learning.

§ 05 — Get in touch

Let's talk.

Have something to build, train, or scale?

I'm open to research collaborations, freelance engagements, and full-time roles — especially where medical imaging, foundation models, or RL are involved.