Data Repository Selection: Federal data

This guide contains information on how to compare and select repositories for research data.

Changes in federal policies

Most federal research policy changes enacted since January 2025 have concerned funding and grant applications. However, research data has also been affected, with over 3,000 datasets removed from federal repositories. Most removed datasets come from the Department of Energy, the National Oceanic and Atmospheric Administration, and the Environmental Protection Agency, but the Centers for Disease Control and Prevention (CDC) and the National Institutes of Health (NIH) were also affected. Removed datasets tend to address topics such as climate change and clean energy, gender-based and sexual violence, racial discrimination, LGBTQ+ populations, and mass shootings. In the health sciences, datasets on COVID-19, opioid addiction, healthcare disparities impacting marginalized groups, and reproductive care have been particularly affected. Many of the removed CDC datasets were longstanding surveys that provided researchers across fields with information about socioeconomic status, vaccination rates, behavioral risks, and other demographic characteristics. Some data that was initially removed has since been restored, sometimes in an edited or altered form.

KFF tracks healthcare policy news and has covered the removal of data and other federal changes in detail since January 25, 2025. Their February 2025 article, "A Look at Federal Health Data Taken Offline," is a good introduction to recent changes to data repositories and federal datasets.

In July 2025, the NIH released a five-year strategic plan for data science. The plan includes increased support for the new NIH Data Management and Sharing Policy, more data science training opportunities, and progress towards a federated data system that will integrate disparate databases. You can read more about the 2025-2030 Strategic Plan for Data Science on the NIH website.

Where to find removed federal data

Many organizations are working to preserve federal data that has been removed. The End-of-Term Project works to preserve websites, databases, and information every time a presidential administration changes. The Internet Archive has archived CDC datasets uploaded before January 28th, 2025, and Git for LSIT at UCSB has archived many NIH datasets.

To see how a federal website you use, or are interested in using, has changed in response to federal policy changes, you can compare versions from before and after January 2025 with GovDiff.com.

For more information on how to find removed federal data, read our MSK Library blog post on the availability of federal data, or visit the Data Rescue Project, which compiles data preservation efforts, tracks data removal, and provides resources for researchers. 

Uploading to federal data repositories

You may have noticed that many of the federal repositories and resources on this Library Guide have the following banner at the top of their webpages:

A blue banner that reads "The repositories on this page are currently being reviewed for potential updates to comply with Administrative directives and agency priorities."

Researchers may wonder whether they should still submit data to federal repositories, given the risk that it could later be removed or restricted. Currently, there are no overarching changes to federal data repository submission policies, so we recommend that researchers continue to deposit in federal repositories; indeed, some publishers or funders might require that you do. However, we do encourage you to keep monitoring the state of federal websites and policies. Volunteering for or contributing to efforts such as the Data Rescue Project, the Data Liberation Project, or Datahoarding.org will help ensure the preservation of valuable resources in the future.