NSF Data Management and Sharing Policy

Digital repositories for scientific data

NSF was an active participant in the development of the National Science and Technology Council document Desirable Characteristics of Data Repositories for Federally Funded Research and fully endorses this starting point for understanding basic properties of such repositories, though development of additional federal guidance in this area is necessary.

In the 2021 implementation of public access for datasets in NSF-PAR, the agency was intent on neither undermining nor replacing long-established disciplinary repositories. Rather, NSF-PAR’s approach to indexing metadata — including the use of persistent identifiers, or PIDs, assigned to research datasets — leverages a federated approach whereby research data reside in appropriate repositories and NSF-PAR serves as a point of discovery.

NSF-PAR FAQ

Desirable characteristics of data repositories

Free and easy access: The repository provides broad, equitable, and maximally open access to datasets and their metadata free of charge in a timely manner after submission.

Clear use guidance: The repository ensures datasets are accompanied by documentation describing terms of dataset access and use.

Risk management: The repository has documented capabilities for ensuring that...safeguards are employed to comply with applicable confidentiality, risk management, and continuous monitoring requirements for sensitive data.

Retention policy: The repository provides documentation on policies for data retention.

Long-term organizational sustainability: The repository has a plan for long-term management of data, including maintaining integrity, authenticity, and availability of datasets.

Unique persistent identifiers: The repository assigns a dataset a citable, unique persistent identifier (PID or DPI), such as a digital object identifier (DOI), to support data discovery, reporting, and research assessment.

Metadata: The repository ensures datasets are accompanied by metadata to enable discovery, reuse, and citation of datasets.

Curation and quality assurance: The repository provides or facilitates expert curation and quality assurance to improve the accuracy and integrity of datasets and metadata.

Broad and measured reuse: The repository ensures datasets are accompanied by metadata that describe terms of reuse and provides the ability to measure attribution, citation, and reuse of data.

Common format: The repository allows datasets and metadata to be accessed, downloaded, or exported from the repository.

Provenance: The repository has mechanisms in place to record the origin, chain of custody, version control, and any other modifications to submitted datasets and metadata.

Authentication: The repository has technical capabilities that facilitate associating submitter PIDs with those assigned to their deposited digital objects, such as datasets.

Long-term technical sustainability: The repository has a plan for long-term management of data.

Security and integrity: The repository has documented measures in place to meet well-established cybersecurity criteria for preventing unauthorized access to, modification of, or release of data.

The Desirable Characteristics of Data Repositories for Federally Funded Research

Does a generalist repository fit? Your sharable data can go in OSF!