Sleep is essential to human wellbeing, both physically and mentally, which makes sleep research a major discipline in most countries. With the help of supercomputing, a team at Aarhus University (AU), Denmark, has helped researchers make better use of the vast amounts of international sleep data.
Most of us know that sleeping well is not just a question of lying down with closed eyes for a given number of hours. Sleep consists of various phases, of which the deepest stages are the most important. To measure sleep quality accurately, researchers have developed scoring models that are applied in clinical trials. However, there is no single universal model, and researchers in different countries or institutions do not necessarily score sleep in the same way.
The Danish team has applied neural network machine learning to real sleep data, thereby offering a method for researchers to load data from future trials. More data leads to better sleep scoring models and more accurate interpretations of sleep data because neural networks become increasingly skilled at automatically reading sleep stages correctly.
The current “gold standard” in sleep scoring requires an expert in the field to determine the sleep stage of the individual sleeper every 30 seconds through a night’s recordings aided by a manual. To Assistant Professor Kaare Mikkelsen, Biomedical Technology, Department of Electrical and Computer Engineering at AU, this seemed possible to automate. He also knew the task would not be straightforward because of differences in culture and measuring devices, etc. Further, it was obvious that powerful computing would be required.
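As a rough illustration of what scoring "every 30 seconds" means in practice, the sketch below cuts a recording into fixed 30-second epochs, each of which would receive one sleep stage label. The function name, the 100 Hz sampling rate, and the 8-hour duration are assumptions chosen for the example; they are not taken from the project.

```python
def split_into_epochs(samples, sampling_rate_hz, epoch_seconds=30):
    """Cut a one-channel recording into consecutive fixed-length epochs.

    Any incomplete epoch at the end of the recording is dropped.
    """
    samples_per_epoch = int(sampling_rate_hz * epoch_seconds)
    n_epochs = len(samples) // samples_per_epoch
    return [
        samples[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for i in range(n_epochs)
    ]


# Hypothetical night: 8 hours of signal sampled at 100 Hz (assumed values).
sampling_rate_hz = 100
recording = [0.0] * (8 * 3600 * sampling_rate_hz)

epochs = split_into_epochs(recording, sampling_rate_hz)
print(len(epochs))  # number of 30-second epochs an expert would have to score
```

Under these assumptions a single night yields several hundred epochs, which gives a sense of why manual scoring across thousands of recordings is so time-consuming.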
Using the LUMI supercomputer was suggested by Associate Dean Brian Vinther, AU. LUMI is operated by CSC, the national research and education network (NREN) of Finland, and Danish researchers may apply for computing time through the Danish NREN, DeiC.
Two AU students with the necessary experience in machine learning, Andreas Larsen Engholm and Jesper Strøm, were assigned to the project. Their task was to train neural networks to perform sleep scoring on 21 datasets based on 20,000 polysomnography (PSG) recordings.
The project started on AU’s own data infrastructure. However, during the preparation phase, when only one fifth of the data had been uploaded, they received an error message stating that the data took up too much space. This accelerated the move to LUMI.
“We downloaded all the data directly from the internet to LUMI using scripts we developed ourselves. Then we had to start preprocessing, which involves processing the data so that all our 21 different datasets were in the format we wanted, which is the format the sleep scoring model was originally designed for,” says Andreas Larsen Engholm.
The project benefitted from the DeiC program “Sandbox”, which assists Danish researchers who are keen to get into supercomputing.
“Throughout the project, we’ve had great support from DeiC, which had connections to LUMI. We were granted additional time in Sandbox multiple times because we kept realizing that we needed more,” says Jesper Strøm.
In total, the project received 3,500 GPU hours on LUMI. If a single GPU had done the work, it would have taken about 145 days, longer than the entire 4-month thesis period.
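The figure of 145 days follows from dividing the GPU hours by 24, assuming one GPU running around the clock. A minimal sketch of the arithmetic is shown here; the 30-day month used for the thesis period is an assumption made for the comparison.

```python
GPU_HOURS = 3500          # total allocation reported in the article
HOURS_PER_DAY = 24
THESIS_DAYS = 4 * 30      # assumption: a 4-month thesis at roughly 30 days per month


def single_gpu_days(gpu_hours):
    """Days one GPU would need if it ran continuously, 24 hours a day."""
    return gpu_hours / HOURS_PER_DAY


days = single_gpu_days(GPU_HOURS)
print(f"Single-GPU runtime: {days:.1f} days")
print(f"Exceeds thesis period: {days > THESIS_DAYS}")
```

Spreading the same workload over several GPUs in parallel is what brings the wall-clock time within the thesis period.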
“In reality, we probably would have abandoned the project if we didn’t have access to LUMI. We would have had to move data back and forth because there wasn’t enough space, making it very inconvenient,” explains Jesper Strøm.
The datasets came from various places around the world and were in different folder structures and file formats.
“The major work of normalizing all the data for our models was actually what took the longest time, and that preprocessing pipeline is now accessible to other researchers and students, making it much easier to load dataset number 22. We emphasized finding a sustainable, scalable solution that could be used by others in the future,” says Jesper Strøm.
The text is inspired by the article “Big Sleep Data: LUMI supercomputer trains neural networks for sleep research” by Marie Charlotte Søbye on the DeiC website.