Doris

Doris – high performance measurement & computation

The high-performance measuring and computing (HPMC) group within NFDI4Ing had a prosperous year and is actively working on all of its measures. The interdisciplinary, interinstitutional team is fully staffed and has reached its first important milestones.

Tool development:

To create ontology-conforming metadata as a natural by-product of research, DORIS recently released a metadata crawler. It automatically extracts metadata from simulations or experiments conducted on HPMC systems (crawler: gitlab.lrz.de/nfdi4ing/crawler). To make the best use of the automated metadata extraction, an HPMC sub-ontology is being developed in consultation with the special interest group “Metadata and Ontologies” and with pilot users from the HPMC community. Because of their size (hundreds of TB or even PB), HPMC data are currently immobile and cannot be copied to local workstations. New concepts for data storage, findability, access and transfer therefore have to be implemented and evaluated. As a first test, a searchable and indexable interface between local storage and a front end has been provided ([TUM Workbench](https://www.ub.tum.de/workbench) ↔ LRZ DSS storage). The storage is accessible through the local virtual research environment, and an individual access management system can be implemented. To support the scientific community and to gather feedback for further improvement, a users’ guide has been published.
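To illustrate the general idea of automated metadata extraction, the following is a minimal sketch in Python. It is not the DORIS crawler and does not use its interface. The function names `file_record` and `crawl`, the record fields and the optional checksum are assumptions chosen for illustration. The sketch walks a result directory and collects generic file-level metadata that could later be mapped onto ontology terms.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def file_record(path: Path, root: Path, with_checksum: bool = True) -> dict:
    """Describe one file with generic metadata fields (hypothetical schema)."""
    stat = path.stat()
    record = {
        "relative_path": path.relative_to(root).as_posix(),
        "size_bytes": stat.st_size,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
    if with_checksum:
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large files are never loaded at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        record["sha256"] = digest.hexdigest()
    return record


def crawl(root: Path, with_checksum: bool = True) -> list[dict]:
    """Walk `root` recursively and return one record per regular file."""
    return [
        file_record(path, root, with_checksum)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]


if __name__ == "__main__":
    # Demonstration on a throwaway directory instead of real HPMC data.
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        (run_dir / "case01").mkdir()
        (run_dir / "case01" / "output.dat").write_bytes(b"example")
        print(json.dumps(crawl(run_dir), indent=2))
```

For data sets in the TB to PB range, computing checksums for every file would be expensive, which is why the sketch allows switching them off with `with_checksum=False`.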

Applications:

The (post-)processing of archetypal research data generated in the HPMC community is usually executed on HPC systems. Within DORIS, custom-tailored software solutions that can be transferred to other systems are developed and tested. To evaluate the feasibility of containers (such as Docker) for reproducibility on HPC systems, DORIS successfully submitted several project applications at all GCS centres. Test runs on SuperMUC/LRZ and JUWELS/JSC have already been conducted. Because performance and efficiency are currently low, DORIS will improve its methods and then develop and share best-practice recommendations for the reproducibility of simulations on HPC systems.

Obstacles:

The standard frameworks for project applications at HPC facilities do not account for RDM projects, and some systems or software cannot be accessed by non-institutional users. These obstacles could be resolved through pilot users and project members at GCS facilities or associated institutions. More generally, the strong dependence on third-party service providers (computing centres, software developers), their policies and their infrastructure is a major obstacle. Networking and an inclusive, interinstitutional approach are therefore indispensable for success.

Community outreach:

The full benefit of the NFDI effort for HPMC with very large data lies in enabling third-party research groups to use previously inaccessible data. To transfer the knowledge and results gained and to promote existing and newly developed tools, DORIS provides community-based training and workshops in coordination with the special interest group “Basic RDM Training”. First benefits were shown to the community in a local (Munich-based) workshop and will be consolidated in a nationwide workshop in April 2022. DORIS contributed to several meetings and conferences and continually works to increase acceptance of NFDI4Ing by promoting its outcomes and supporting the scientific community. A user-based requirements assessment workshop on the protection of legitimate expectations (“Vertrauensschutz”) for data has been held, and a generic template for an HPMC data management plan has been published.

To spark future scientists’ interest in RDM and to enable them to apply RDM skills during their undergraduate studies, a new module on RDM in engineering sciences at TU Munich has been proposed and approved. The lecture will premiere in the summer semester of 2022 with internal and external lecturers and could be extended in modular fashion to other universities or research domains.

[Doris has 5 items on her scientific dissemination list 2021]