The Benchmarks for EEG Transfer Learning (BEETL) competition aims to stimulate the development of transfer- and meta-learning algorithms for EEG data, a prime example of what makes working with biosignal data hard. BEETL acts as a much-needed benchmark for domain adaptation algorithms and provides a real-world stimulus goal for transfer- and meta-learning developments in both academia and industry.
Given the multitude of different EEG-based applications that exist, we offer two specific challenges: Task 1 is a cross-subject sleep stage decoding challenge, reflecting the need for transfer learning in clinical diagnostics, and Task 2 is a cross-dataset motor imagery decoding challenge, reflecting the need for transfer learning in human interfacing.
Task 1 (see Fig. 1) is in the field of medical diagnostics and specifically targets automatic sleep stage annotation from sleep EEG data. In clinical diagnostics, EEG data is still analysed "by hand" (the gold standard), and reliably characterising sleep state requires integrating many sensory modalities (polysomnography), including EEG. This manual inspection and interpretation of EEGs is time-consuming, since recordings may last hours or days, and expensive, as it requires highly trained experts. High-performance automated analysis of EEGs can therefore reduce the time to diagnosis and enhance real-time applications by flagging sections of the signal that need further review. We provide a data set from adult subjects (80 sessions, around 40 subjects, age 22-65) as a training base; sleep stage annotation then has to be transferred to two older age groups (65-80 and 80+), for each of which 5 subjects' worth of data are provided. Task 1 is an essential use case for the development of ready-to-use medical diagnostics built on a standard, large user base that must then be transferred to many clinically relevant subpopulations, for each of which only a few subjects' worth of data can be collected. The purpose of Task 1 is to find the best-performing across-subject transfer learning algorithms when hardware and experimental setup are preserved. As an added challenge, the transfer has to work on user groups with well-documented systematic EEG differences during sleep (elderly and very elderly subjects).
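As a minimal illustration of this setting (pre-train on a large standard base, then adapt with only a few subjects' worth of target data), the sketch below fine-tunes a linear classifier on synthetic features. All data, names and dimensions here are hypothetical stand-ins, not competition code; the real task uses sleep-EEG recordings.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins for extracted sleep-EEG features: a large "adult"
# training base and a small "target group" sample with shifted statistics.
n_feat = 16
X_adult = rng.normal(0.0, 1.0, (2000, n_feat))
y_adult = (X_adult[:, 0] + 0.5 * X_adult[:, 1] > 0).astype(int)
X_target = rng.normal(0.3, 1.2, (100, n_feat))  # systematic distribution shift
y_target = (X_target[:, 0] + 0.5 * X_target[:, 1] > 0).astype(int)

scaler = StandardScaler().fit(X_adult)
clf = SGDClassifier(random_state=0)

# 1) Pre-train on the large adult base.
clf.fit(scaler.transform(X_adult), y_adult)

# 2) Fine-tune with a few passes over the small target sample, continuing
#    from the pre-trained weights instead of training from scratch.
for _ in range(5):
    clf.partial_fit(scaler.transform(X_target), y_target)

acc = clf.score(scaler.transform(X_target), y_target)
print(f"target-group accuracy after fine-tuning: {acc:.2f}")
```

The point of the sketch is the two-stage training loop, not the model class; competition entries would replace the linear classifier and synthetic features with their own architecture.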
Task 2 (see Fig. 2 for illustration) is a motor imagery challenge that goes to the heart of the problem with current BCI systems: motor imagery training data is exhausting for subjects to record and has historically been difficult to use in a cross-subject manner for a variety of reasons. In the past we have provided (and organised) several motor imagery data sets for BCI challenges through our MOABB (Mother of All BCI Benchmarks, see https://github.com/NeuroTechX/moabb) database to test the generalisation performance of algorithms on each data set. These have not previously been used to assess learning across data sets -- which generally shows negative transfer and which we believe should be addressable by the development of novel transfer learning algorithms. Classification accuracy on a new test set will be the standard by which different algorithms are compared; the test set is an unpublished data set that we collected for this purpose and that will be added to MOABB after the competition ends.
We hope that this competition will stimulate substantial development of transfer learning algorithms suitable for real-world, noisy, multi-dimensional time-series data. Crucially, while we expect to see solutions that work well on Task 1 or Task 2 alone, through the competition design we hope to see general transfer learning architectures emerge that are strong in both tasks. Ultimately, the success of EEG-based technology, as of any technology for human augmentation, will be judged by user uptake and by the improvements in users' quality of life. This requires being able to use, reuse and adjust EEG-based systems quickly and without expert technical support. While we focus here on EEG, transfer learning on biosignal data is relevant for all areas of biomedical engineering, healthcare and consumer technologies where we want solutions that work on new groups of users and are easy and fast to get going for each new user.
In EEG BCI, but also in many other domains of healthcare and human interfacing, several transfer problems are superimposed or intersecting. Domain adaptation is required in a number of situations; we focus on the two main limiting factors:
First, transferring from one subject to another, where differences may arise from the specific fitting of the sensors, but also from differences in state of mind or physiological state, and from differences in anatomy and disposition between subjects (including length of hair and body fat). Solving this domain adaptation task would dramatically improve the ability of human interfaces to be handed to new users quickly and with minimal effort. Therefore, the first task of this competition is a sleep stage classification task across different subjects. Sleep stage classification represents a cluster of real-world diagnosis problems in which a large amount of EEG data is relatively easy to acquire from medical records. One property of these data sets is that they usually contain hundreds to thousands of subjects, collected with the same standard (without cross-dataset variability). We would like to test the cross-subject transferability of algorithms when they are trained on a single data set with enough subjects. Many algorithms for EEG transfer learning have been proposed in the past, with much recent progress based on deep learning. This task provides a common platform to compare the performance of these algorithms from the literature.
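One widely used baseline from this literature is Euclidean alignment (He and Wu, 2020), which whitens each subject's trials by that subject's average spatial covariance so that data from different subjects become more directly comparable. The sketch below demonstrates the idea on synthetic arrays; it is illustrative only and not part of the competition code.

```python
import numpy as np

def euclidean_alignment(trials):
    """Whiten one subject's trials (n_trials, n_channels, n_samples) so that
    their average spatial covariance becomes the identity matrix."""
    covs = np.stack([t @ t.T / t.shape[1] for t in trials])
    ref = covs.mean(axis=0)
    # Inverse matrix square root of the reference covariance.
    vals, vecs = np.linalg.eigh(ref)
    ref_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return np.stack([ref_inv_sqrt @ t for t in trials])

# Two synthetic "subjects" whose recordings differ in overall amplitude;
# after alignment, each subject's average covariance is the identity,
# so the two subjects' data live on a common scale.
rng = np.random.default_rng(0)
subject_a = rng.normal(0.0, 1.0, (20, 8, 250))
subject_b = rng.normal(0.0, 3.0, (20, 8, 250))
aligned_a = euclidean_alignment(subject_a)
aligned_b = euclidean_alignment(subject_b)
mean_cov_a = np.mean([t @ t.T / t.shape[1] for t in aligned_a], axis=0)
print(np.allclose(mean_cov_a, np.eye(8)))
```

Because the reference matrix is estimated per subject, the same transformation can be computed for a new user from unlabelled trials alone, which is what makes this family of methods attractive for cross-subject transfer.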
Second, transferring between different data sets and studies. Here differences arise because, even for the same task (e.g. EEG sleep stage analysis), there are differences in how the data were collected, due to different protocols and handling of subjects, as well as differences in the specific interfacing hardware (e.g. different manufacturers, different numbers of electrodes, etc.). Solving this transfer challenge would allow combining dozens of existing and future large data sets into supersets approaching the scale of ImageNet, but for bioneural signals. The second task in this competition is therefore a 3-way motor imagery classification challenge (left-hand motor imagery, right-hand motor imagery and 'reject'). It represents a cluster of EEG decoding problems in which data from subjects are not easy to acquire. Motor imagery collection is high-cost: subjects tire easily after one or two hours of movement imagination. A public motor imagery data set typically contains only several subjects, and different public data sets may have different EEG distributions, owing to different imagery strategies, feedback conditions, amplifiers and other hardware. Therefore, if large data sets are to be used in such tasks, transfer learning algorithms must handle not only cross-subject but also cross-dataset variability. We would like to test both the cross-subject and cross-dataset transferability of algorithms trained on multiple data sets. Currently, most studies focus on transfer learning within a single data set; there is limited work on cross-dataset transfer learning and no systematic experimental comparison of such methods in the literature. This task provides a common platform to test whether current transfer learning algorithms can be extended to utilise big data across both subjects and data sets.
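Before any cross-dataset learning can happen, recordings from different hardware at least have to be mapped onto a shared channel subset and a common sampling rate. The hypothetical helper below sketches this harmonisation step on synthetic arrays; the channel names, sampling rates and the function itself are assumptions for illustration, not competition code.

```python
import numpy as np
from math import gcd
from scipy.signal import resample_poly

def harmonise(trials, channels, keep, sfreq, target_sfreq):
    """Reduce trials (n_trials, n_channels, n_samples) from one data set to
    a shared channel subset and a common sampling rate, so that recordings
    from different hardware can be pooled."""
    idx = [channels.index(ch) for ch in keep]
    sub = trials[:, idx, :]
    g = gcd(target_sfreq, sfreq)
    # Rational-factor polyphase resampling along the time axis.
    return resample_poly(sub, target_sfreq // g, sfreq // g, axis=-1)

# Two synthetic "data sets": 3-second trials recorded with different
# montages and sampling rates (all names and rates are made up here).
rng = np.random.default_rng(0)
ds1 = rng.normal(size=(10, 4, 768))   # 4 channels at 256 Hz
ds2 = rng.normal(size=(10, 3, 750))   # 3 channels at 250 Hz
common = ["C3", "Cz", "C4"]
pooled = np.concatenate([
    harmonise(ds1, ["C3", "Cz", "C4", "Pz"], common, 256, 128),
    harmonise(ds2, ["C3", "Cz", "C4"], common, 250, 128),
])
print(pooled.shape)
```

Such resampling and channel selection only removes the most superficial hardware differences; the remaining distribution shift between data sets is exactly what the transfer learning algorithms in this task must address.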