In close cooperation with the Institute of Toxicology and Genetics (ITG) and the Institute for Applied Computer Science (IAI), the Steinbuch Centre for Computing (SCC) of Karlsruhe Institute of Technology (KIT) is developing a novel large-scale data facility (LSDF) for the storage of scientific data. Over the next few years, the facility will be equipped with several petabytes of disk and tape storage and will then be made available to systems biologists worldwide.
Within the project “Reconstruction of the Embryonal Development of Zebra Fish Using High-throughput Microscopy”, ITG produces about 300,000 images within 36 hours, which corresponds to 2 to 3 terabytes of data. “To meet our requirements concerning the storage and administration of large data volumes, SCC with its vast expertise in this field is the ideal partner”, emphasizes Professor Uwe Strähle, Head of the ITG.
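To put these figures in perspective, the following short Python sketch derives the average image size and the sustained data rate of one imaging run. It uses only the numbers quoted above (300,000 images, 36 hours, 2 to 3 terabytes); the decimal unit convention (1 TB = 10^12 bytes) and the function name `run_stats` are assumptions made for illustration, not part of the project.

```python
# Back-of-the-envelope data rates for one imaging run.
# Figures from the text: 300,000 images, 36 hours, 2 to 3 terabytes.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).

IMAGES = 300_000
HOURS = 36
TB = 10**12
MB = 10**6


def run_stats(total_bytes: int) -> dict:
    """Return average image size and sustained rates for one run."""
    seconds = HOURS * 3600
    return {
        "mb_per_image": total_bytes / IMAGES / MB,
        "mb_per_second": total_bytes / seconds / MB,
        "images_per_second": IMAGES / seconds,
    }


# Lower and upper bound of the quoted data volume.
low = run_stats(2 * TB)
high = run_stats(3 * TB)

for label, stats in (("2 TB", low), ("3 TB", high)):
    print(
        f"{label}: {stats['mb_per_image']:.1f} MB/image, "
        f"{stats['mb_per_second']:.1f} MB/s, "
        f"{stats['images_per_second']:.2f} images/s"
    )
```

Under these assumptions, a run averages roughly 7 to 10 MB per image and a sustained rate of about 15 to 23 MB per second, at a little over two images per second.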
Setting up the LSDF infrastructure involves more than supplying large data storage and computing resources. It also creates new research needs that will be met by the SCC, the Institute of Data Processing and Electronics (IPE), and the Institute for Applied Computer Science (IAI).
Research and development focus on secure, high-performance access to the facility, automated workflows for transferring data to various storage classes, long-term archiving that preserves data integrity, data analysis, and the development of complex image processing algorithms and LSDF interfaces. “We consider LSDF a promising field of research for the SCC in the long term. With LSDF, we wish to support not only systems biology, but also other sciences”, says Professor Wilfried Juling, Managing Director of the SCC.
ITG identifies and characterizes molecules that control cell behavior, using the freshwater fish species zebra fish and medaka as animal models. For this purpose, ITG operates one of the largest experimental facilities for keeping and breeding these fish. There are plans to expand this facility into a European resource center. Linking this resource center to the LSDF opens up entirely new possibilities for investigating evolutionary mechanisms and will be unique worldwide. By integrating the LSDF and using large data sets, the aim is to generate models that simulate organ development and regeneration. “In the long term, we want to generate virtual embryos and organs by computer simulation in order to better understand nature,” say ITG Professors Strähle and Wittbrodt.
SCC has ten years of experience in managing large data volumes. Within the Worldwide Large Hadron Collider Computing Grid (WLCG) project of CERN in Geneva, SCC operates the Tier 1 computing center GridKa, which receives data directly from CERN, and has thereby assumed the role of a leading data provider for high-energy physics experiments. At the moment, 10 petabytes of disk and tape storage are available.
As “The Research University in the Helmholtz Association”, KIT creates and imparts knowledge for society and the environment. Its objective is to make significant contributions to meeting the global challenges in the fields of energy, mobility, and information. To this end, about 9,300 employees work together across a broad range of disciplines in the natural sciences, engineering sciences, economics, and the humanities and social sciences. KIT prepares its 25,100 students for responsible tasks in society, industry, and science by offering research-based study programs. Innovation efforts at KIT build a bridge between important scientific findings and their application for the benefit of society, economic prosperity, and the preservation of our natural basis of life.