This TV channel will cater to a niche audience only. Nevertheless, its long-term impact will be huge.
The European Spallation Source (ESS), a EUR 1.843 billion Big Science infrastructure under construction in Lund, Sweden, opens for scientific research in 2023. It will be the world’s most powerful neutron source for research, and will allow scientists in biology, chemistry, engineering, and physics to understand the basic structures of materials in far greater detail than before.
A wide range of areas will benefit from the knowledge produced in Lund, from aerospace and telecommunications to medicine, energy, transportation, and even heritage science, for instance metallurgical analyses of 10th-century Viking swords using non-invasive neutron techniques.
Part of the huge ESS setup is an on-demand streaming channel, offered through the ESS high-performance datacentre. Although the expected start of the ESS scientific user programme in 2023 is still some way off, the datacentre team is already preparing the e-infrastructure. Strong connectivity and processing power are needed to reach a speed that will allow ESS to stream the raw data from the experiments, process it, and return meaningful and scientifically valid data to its users. The goal is to do this as close to real time as possible. The visual data will be made available as a continuously streamed channel, both on-site in Lund and elsewhere. Individual scientists monitoring the experiments can subscribe and “tune in” frame by frame (in this case, a frame is about 71 milliseconds, the interval between pulses of the 14 Hz ESS neutron beam).
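The frame length quoted above follows directly from the pulse rate. As a minimal sketch (the function names are illustrative and not part of any ESS software), the arithmetic looks like this:

```python
# Illustrative arithmetic only; the 14 Hz pulse rate is taken from the article.
PULSE_RATE_HZ = 14


def frame_period_ms(pulse_rate_hz: float) -> float:
    """Return the time between consecutive pulses, in milliseconds."""
    return 1000.0 / pulse_rate_hz


def frames_in(seconds: float, pulse_rate_hz: float = PULSE_RATE_HZ) -> int:
    """Return the number of whole frames delivered in a time window."""
    return int(seconds * pulse_rate_hz)


if __name__ == "__main__":
    print(f"{frame_period_ms(PULSE_RATE_HZ):.1f} ms per frame")  # 71.4 ms per frame
    print(frames_in(60), "frames per minute")  # 840 frames per minute
```

A scientist who tunes in for one minute would therefore see 840 frames, each covering one neutron pulse.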
As mentioned above, it will take a while before the ESS TV channel begins broadcasting. The ESS is still under construction, and so is its Data Management & Software Centre, DMSC. But from the moment the first neutrons produced by the ESS are registered on a detector, the raw experiment data will flow from Lund to the ESS datacentre in Copenhagen, Denmark, and from there to the ESS scientific user community throughout the world.
Local and regional research & education networks are deeply involved in securing the powerful connectivity needed to process, store and transport these large amounts of data.
The ESS instruments and the ESS datacentre are located a few tens of kilometres apart. While the ESS proton accelerator, the target (where the neutrons are generated), and the 15 neutron instruments around it are situated in Lund, the ESS datacentre is located in Copenhagen, right across the narrow Öresund strait dividing Denmark and Sweden.
The research & education networks of the two countries, Danish DeiC and Swedish SUNET, together with pan-Nordic NORDUnet, are collaborating on the ESS e-infrastructure. SUNET is providing two connections to ESS in Lund, while DeiC is providing two connections to ESS DMSC in Copenhagen. Furthermore, NORDUnet is providing a managed solution that ensures international connectivity for ESS and a redundant VPN connecting the two sites in one LAN.
It is no small task to process and manage the data produced by the ESS, and it has to be done in a way that maximises the facility’s research potential, extracting the maximum amount of scientific information from the data, as well as making the data as accessible as possible.
Jonathan Taylor, Head of the ESS Data Management & Software Centre, explains:
“All data coming from the ESS is timestamped. We are the first facility that has the synchronous time stamps injected right at the beginning, when the data is created. We have a split site, with all the neutron instruments placed in Lund, and in Lund there is a server room where all the data acquisition takes place. When this timestamped data is generated into files it is copied across to the Copenhagen data centre, and stored in accordance with our data policy. This gives users access to their data after they’ve done their experiments, for them to be able to continue their data analysis. For that we have the hardware and the data treatment and data analysis software plus some expertise to allow the data to be well used.”
“The research & education networks have been very important to us. The original plan was that we would do everything ourselves, including buying our own dark fibre. We soon realized that was not a good idea, and now we collaborate with people with a high level of network expertise. That has been truly rewarding, and I think we’ll rely more and more on that side of things, not just the ESS, but all of these kinds of facilities, because the data is simply too big to fit on a laptop. You have to access it over a network, so the network becomes really critical, and increasingly so, as the data keeps getting bigger and bigger.”
“Although the ESS user programme won’t begin before 2023, our data acquisition system and streaming of the neutron events is already in place. It is based on the Apache Kafka open source software for making a network stream of data available. To run Apache Kafka you need high-performance networks, and it is fast, scalable and widely used. For instance it is the underlying stack for running Netflix and LinkedIn.”
“According to the trend towards Open Science, we want to make ESS data openly available from a single source, and in a way that aligns with the EU FAIR data principles, making data findable, accessible, interoperable and reusable.”
“A lot of Big Science facilities have made all their data open. But the real challenge is that even if you can find the data file, you can’t do very much with it unless you have all the metadata that describes what is in the data file. We spend a lot of effort on this, doing data management. Partly we are aided by the data acquisition scheme of the ESS, where the experimental configuration of the instrument gets put into the data file, together with metadata about why and how this particular measurement was made. You have to connect that information to the data itself. Otherwise it is less useful when it becomes open.”
“Another aspect of Open Data is that our users not only use the ESS. They’ll use a lot of other neutron and photon facilities, because that is the way research is done these days. So, what is really needed is for all of these facilities to have some commonality and standardization. We are quite lucky with neutrons and photons that we have a standard data format called NeXus, which is quite well advanced, when you compare our domain to other areas of science. Another issue then is standardizing on metadata, and that is a challenge. In that regard we have the European Open Science Cloud initiative, and we’re involved in a number of EOSC projects on developing a solution for that and to ensure that the policies are in place to allow FAIR and Open Data to work across different facilities.”
By 2023, the ESS is expected to produce 3-5 petabytes of data per year, rising to 7-11 PB over the following years. When going into operation the ESS Data Management & Software Centre is expected to have 60-70 employees.
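To give a feel for what these volumes mean for the network, the sketch below converts a yearly data volume into an average sustained bit rate. It rests on two simplifying assumptions that are not from the article: decimal petabytes (10^15 bytes) and data produced evenly across a 365-day year. Real experiments are bursty, so peak rates would be higher than these averages.

```python
# Back-of-the-envelope estimate; constants below are stated assumptions.
BYTES_PER_PB = 10**15            # assumption: decimal petabyte
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year, uniform output


def average_gbit_per_s(petabytes_per_year: float) -> float:
    """Return the average sustained rate in Gbit/s for a yearly volume."""
    bits_per_year = petabytes_per_year * BYTES_PER_PB * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    # Yearly volumes quoted in the article: 3-5 PB at first, later 7-11 PB.
    for pb in (3, 5, 7, 11):
        print(f"{pb:>2} PB/year is about {average_gbit_per_s(pb):.2f} Gbit/s on average")
```

Under these assumptions, 3 to 5 PB per year corresponds to an average of roughly 0.8 to 1.3 Gbit/s flowing between the sites, before any allowance for bursts, replication, or user downloads.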