Accessible, well-maintained, and efficiently operated data resources are critical enablers of modern biomedical research. Data resources, through good data management practices, are the key to data and knowledge discovery, integration, and data reuse, as outlined by the FAIR Data Principles (Findable, Accessible, Interoperable, and Reusable digital objects). In order to sustain a healthy and productive data resource ecosystem, it is critical to ensure that each component: (a) delivers scientific impact to the communities that it serves; (b) employs and promotes good data management practices and efficient operation for quality and services; (c) engages with the user community and continuously addresses its needs; and (d) supports a process for data life-cycle analysis, long-term preservation, and trustworthy governance.
In order to better support such a modern data resource ecosystem, NIH makes a distinction between data repositories and knowledgebases. While each activity is important for advancing biomedical research, data repositories and knowledgebases can have unique functions, metrics for success, and sustainability needs. Biomedical data repositories accept submission of relevant data from the community to store, organize, validate, archive, preserve, and distribute the data, in compliance with the FAIR Data Principles. Biomedical knowledgebases, on the other hand, extract, accumulate, organize, annotate, and link the growing body of information that is related to and relies on core datasets. The funding announcement for biomedical knowledgebases is at PAR-23-078.
This funding opportunity announcement supports the biomedical data repositories that are important to the mission of the NIH Institutes and Centers participating in this announcement. The evaluation of the repositories will emphasize their utility and impact, quality of data and services and efficiency of operations, community needs and engagement, and trustworthiness of stewardship and governance.
Biomedical data repositories under this announcement should have the primary function to ingest, archive, preserve, manage, distribute, and make accessible the data related to a particular system or systems. Support for data curation must be limited to that which improves the efficiency and accessibility of data ingestion, management, and use and reuse by the user communities. Support for software and tool development must be limited to that which provides essential functions or significantly increases the efficiency of operation of the repository.