Large-scale experiments and simulations in science produce ever-increasing amounts of data. The path from data and information to findings and knowledge, however, also requires a new quality of storage and analysis options. The Helmholtz Association is now taking on a pioneering role in the permanent, secure, and usable storage of data. To manage big data in science, it has established the Helmholtz Data Federation (HDF), coordinated by KIT. Within the next five years, about EUR 49.5 million will be invested in multi-thematic data centers and modern data management.
“A viable data infrastructure is the backbone of Germany as a research location,” the President of KIT, Professor Holger Hanselka, emphasizes. “To master the big challenges of energy, mobility, and information, we have to be capable of rapidly turning big data into smart data. At KIT, the research university in the Helmholtz Association, we pool the competencies necessary for this purpose.”
“The Helmholtz Centres are prepared to preserve research data in suitable data infrastructures in the long term and to make them as open as possible for later use by science and society,” says Professor Otmar D. Wiestler, President of the Helmholtz Association.
“Germany’s leading data centers join the Helmholtz Data Federation in order to store the flows of research data from various scientific disciplines in an ordered manner, to interconnect them with each other, and to make them available for joint use,” Professor Achim Streit of KIT, coordinator of the HDF, points out. “The HDF might serve as a blueprint for data-intensive research in Germany and Europe, an open harbor for access to and turnover of research data.”
The HDF is a central element of the recently adopted position paper of the Helmholtz Association on the handling of research data, which is entitled “Die Ressource Information besser nutzbar machen!” (Improving the usability of information resources). Thanks to its secure federation structure and the setup of multi-thematic data centers, the HDF will enable data-intensive science communities to make their scientific data visible, to share their data while retaining data sovereignty, to use them across disciplines, and to archive them reliably. The federation is based on three key elements: innovative software for research data management, excellent user support, and state-of-the-art storage and analysis hardware. In the medium term, the partners plan to invest in storage systems of double-digit petabyte capacity and in tens of thousands of processor cores for data analysis and management. By 2021, a total of EUR 49.5 million is planned to be financed from the strategic development funds of the Helmholtz Association.
The HDF partners in the first phase are six centers covering five research fields of the Helmholtz Association: Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research (Earth and Environment), Deutsches Elektronen-Synchrotron DESY and GSI Helmholtz Centre for Heavy Ion Research (both research field Matter), German Cancer Research Centre (Health), Forschungszentrum Jülich, and Karlsruhe Institute of Technology (both Energy, Key Technologies, Matter, Earth and Environment). The HDF represents the nucleus of a national research data infrastructure across science organizations, which is open to users in the whole German science community. International connections will make it compatible with the future European Open Science Cloud (EOSC).
KIT already operates several infrastructures for big data. The Smart Data Innovation Lab (SDIL) provides a Germany-wide research platform with the latest analysis functions for companies. The Smart Data Solution Center Baden-Württemberg (SDSC) supports small and medium-sized enterprises of the region in accessing smart data technologies. The GridKa data center is part of the worldwide distributed network for the European particle accelerator center CERN. With the Large-Scale Data Facility (LSDF) for science in Baden-Württemberg and the Large-Scale Data Management and Analysis Initiative (LSDMA) of the Helmholtz Association, KIT has already established the basis for coordinating the HDF. In addition, KIT’s informatics institutes study analysis methods, evaluation algorithms, and data security.
Research data projects of KIT:
http://www.smart-data-solution-center.de/ (in German only)
http://www.gridka.de/welcome-en.html
https://www.scc.kit.edu/forschung/lsdf.php
http://www.helmholtz-lsdma.de/
Being “The University in the Helmholtz Association”, KIT creates and imparts knowledge for society and the environment. Its objective is to make significant contributions to the global challenges in the fields of energy, mobility, and information. For this, about 10,000 employees cooperate in a broad range of disciplines in natural sciences, engineering sciences, economics, and the humanities and social sciences. KIT prepares its 22,800 students for responsible tasks in society, industry, and science by offering research-based study programs. Innovation efforts at KIT build a bridge between important scientific findings and their application for the benefit of society, economic prosperity, and the preservation of our natural basis of life. KIT is one of the German universities of excellence.
 
                