22–25 Sept 2025
GW1 Uni Bremen
Europe/Berlin timezone

The OpenData portal for DESY, HIFIS, NFDI and EOSC

25 Sept 2025, 09:00
20m
Room B0080 (GW1, Uni Bremen)


Talk (10 min + 5 min) Joint session

Speaker

Tim Wetzel (DESY-IT (Research and Innovation in Scientific Computing))

Description

The DESY research infrastructure has historically supported a wide variety of sciences, such as high-energy and astroparticle physics, dark matter research, physics with photons and structural biology. Most of these domains generate large amounts of valuable data, which are handled according to domain-specific policies that take embargo periods and license restrictions into account. However, a significant fraction of this data is expected to become “Open Data”, a requirement often imposed by funding agencies. To support its scientific communities in producing and using open data, DESY-IT is developing and deploying central services that make open data sets easily findable, browsable and viewable. In addition, mechanisms will be provided to analyse data for the long tail of science, which is not covered by large e-Infrastructures.

Following the principles of Open and FAIR data, we provide a metadata catalogue to make the data findable. Accessibility is addressed through federated user accounts via eduGAIN, which will enable community members to use their institutional accounts for data access. Interoperability of the datasets is ensured by using community-approved data formats such as HDF5, specifically NeXus, ORSO and openPMD, wherever possible. Finally, providing the technical and scientific metadata makes the open data sets reusable for subsequent analyses and research.

Our setup consists of three components: the metadata catalogue SciCat, the storage system dCache and a JupyterHub. Scientific data is placed in a designated directory on dCache together with its metadata and is then ingested into SciCat, where it becomes available for access and download. At the same time, the dataset is accessible from within the JupyterHub, allowing data exploration without downloading large data volumes to one's own computer.
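The ingestion step described above (data plus metadata in one directory, picked up and turned into a catalogue entry) can be illustrated with a minimal Python sketch. It is not the actual DESY ingestion code and does not use the SciCat API. The sidecar file name `metadata.json`, the function `build_catalogue_entry` and all keys of the resulting record are assumptions made purely for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sidecar file name; the abstract does not state the real convention.
METADATA_FILE = "metadata.json"


def build_catalogue_entry(dataset_dir: Path) -> dict:
    """Combine sidecar metadata with a file listing for one dataset directory.

    The returned keys are illustrative only and do not follow the SciCat schema.
    """
    metadata = json.loads((dataset_dir / METADATA_FILE).read_text(encoding="utf-8"))
    files = [
        {"path": p.relative_to(dataset_dir).as_posix(), "size": p.stat().st_size}
        for p in sorted(dataset_dir.rglob("*"))
        if p.is_file() and p.name != METADATA_FILE
    ]
    return {
        "source_folder": str(dataset_dir),
        "metadata": metadata,
        "files": files,
        "total_size": sum(f["size"] for f in files),
    }


def demo() -> dict:
    """Create a throwaway dataset directory and build its catalogue entry."""
    with tempfile.TemporaryDirectory() as tmp:
        dataset_dir = Path(tmp) / "example_dataset"
        dataset_dir.mkdir()
        # Stand-in for a real HDF5/NeXus file: 16 placeholder bytes.
        (dataset_dir / "scan_001.h5").write_bytes(b"\x00" * 16)
        (dataset_dir / METADATA_FILE).write_text(
            json.dumps({"title": "Example scan", "license": "CC-BY-4.0"}),
            encoding="utf-8",
        )
        return build_catalogue_entry(dataset_dir)


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

In a real deployment the resulting record would be submitted to the metadata catalogue, while the data files stay on the storage system and are opened in place from a notebook.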

During the talk, we will present the architecture of the system, its individual components and their interplay. The system is also publicly available for testing.

Agree to streaming: yes
Agree to internal publication of recording: yes

Authors

Dr Armando Bermudez Martinez (DESY-IT (Research and Innovation in Scientific Computing)); Dr Patrick Fuhrmann (DESY-IT (Research and Innovation in Scientific Computing)); Tim Wetzel (DESY-IT (Research and Innovation in Scientific Computing))

Co-authors

Dr Johannes Reppin (DESY-IT (Information Fabrics)); Dr Regina Kwee-Hinzmann (DESY-IT (Information Fabrics)); Uwe Jandt (DESY)
