Every day, scientists across many fields generate vast amounts of data whose value extends beyond its original research purpose, including potential applications in AI models. Searching through this data, however, remains a significant challenge for researchers, scientists, and students because it is scattered, isolated in separate domains, and has no unified search interface.
DoubleCloud has become a corporate partner of The University of Utah’s Scientific Computing and Imaging (SCI) Institute in its groundbreaking pilot project, the National Science Data Fabric (NSDF). The Multi-Federated National Science Data Fabric Catalog is being developed to bridge gaps and streamline access to and storage of the extensive scientific data generated and processed by leading scientific organizations and laboratories.
The catalog already contains around 70 repositories, with over 1.5 billion records totalling more than 75 petabytes of data. The data ranges from geologic and geophysical research databases to NASA image datasets.
The NSDF-Catalog is designed to achieve several related goals within a flexible microservice, including:
Coordinating data movement and replication from source repositories within the NSDF federation
Creating a registry of existing scientific data for the development of next-generation cyberinfrastructure
Providing a set of data discovery tools for transdisciplinary research
“At DoubleCloud, we believe that democratizing access to scientific data is crucial for innovation. We are proud to collaborate with the National Science Data Fabric initiative to develop the NSDF catalog. Being part of this project means for us that we can help researchers from various fields to find and access the data they need. It means new discoveries and scientific advancements. And if DoubleCloud can help in this journey, we will do so,” said Natalia Shuliak, COO of DoubleCloud.