The DNA sequencing of thousands of organisms has revolutionised the life sciences, unleashing a volume of data that is doubling every five months. Access to this information allows vital analysis to be completed, cost-effectively, in minutes rather than years. The challenge is to store, manage and integrate the skyrocketing volume and variety of biological information produced in life sciences research, and to make it available to the global scientific community.

The home for big data in biology

As Europe’s primary centre for looking after biological and biomedical data from research groups around the world, the European Bioinformatics Institute (EBI) near Cambridge in the UK is leading the revolution in biological information sharing. Part of the European Molecular Biology Laboratory (EMBL), the EBI collects, analyses, archives and then distributes this data globally for further research.

It collaborates with hundreds of partners around the world to manage terabytes of information from a multitude of life sciences projects. As biological research becomes increasingly collaborative, the reliable, efficient and secure sharing of ever larger datasets, from genome research for example, is crucial to many biomedical research projects.

Seamless, reliable and secure data sharing

Working in partnership, the GÉANT and Janet networks provide seamless, high-performance links between EMBL-EBI in Cambridge and scientists throughout the world, enabling real-time, global access to the world’s largest collection of molecular databases. The results of large-scale, ground-breaking initiatives such as the 1000 Genomes Project, an effort to map human genetic variation worldwide as completely as possible, can now be distributed and made available for analysis quickly.

Novel applications

Traditionally, EMBL-EBI served data to biological researchers. Now, however, the field is moving towards concrete applications in the biochemical and agricultural areas. Increasingly, data are shared with hospitals and clinics as well as agricultural researchers. These data will empower the discovery of new drugs, therapies, diagnostics and agrochemicals, and new ways to tackle pests.

Dr Ewan Birney, Director of EMBL-EBI and Senior Scientist:

“Biology, and in particular genome sequencing, is producing incredible amounts of new experimental data every day, which is a goldmine for researchers but presents a new set of engineering challenges for the life sciences. The data we’re generating is approaching the scale of CERN’s Large Hadron Collider – only one order of magnitude smaller. Now more than ever, we need robust networks, scalable storage and fast compute so that we can realise the opportunities these data provide.”

Published: 02/2017

For more information please contact the contributor/s: