Sharing De-identified Data for Publication


As part of a push to improve research reproducibility, an increasing number of health science funders and publishers are asking researchers to share the de-identified data underlying their research. NIH already has a genomic data sharing requirement and recently announced a new requirement for data management and sharing plans that will go into effect in January 2023. This page explains how to plan for data sharing so that you can meet these requirements while also following UCSF guidance on privacy and data security.

Note that sharing P4 data types, including identified PHI/PII/RHI data, does not fall under this process; researchers interested in sharing this kind of data will need to follow the guidance on this page.

If you have questions about sharing models or algorithms, please reach out to Industry Contracts at [email protected]

Project Planning + Grant Writing Stage

  1. Check your sponsor agreements for any guidance or restriction on data sharing.
  2. Write a data management plan with information about how your data will be stored, organized, and shared.
  3. Include data sharing language in your IRB paperwork and consent forms.
    1. UCSF consent form templates include appropriate sample language.
  4. Research candidate data repositories, paying attention to recommended data formats, access restrictions, costs, and submission timelines.
  5. Plan for data sharing costs in your grants, including data curation work and de-identification charges, which can be substantial.

Data Submission Stage

  1. Select a data repository appropriate to your data and area of research. Note that if you are working with sensitive data, you should consider a restricted-access repository.
    1. The UCSF Library can help you select a relevant data repository.  
  2. Prepare your data and documentation.
    1. Prepare a de-identified version of your dataset following appropriate de-identification methods.
      1. CTSI’s data de-identification service can provide advice on de-identification and connect you with third-party de-identification validation services.
      2. The UCSF Privacy Office is another resource for questions about HIPAA and data de-identification.
    2. Gather all necessary data documentation and format your data to meet the standards of your repository, using open file formats whenever possible.
  3. Upload your de-identified data to your selected data repository.
    1. If data use agreements or data transfer agreements are required by your data repository, work with UCSF Industry Contracts to evaluate and sign these forms.
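
As a rough illustration of the data preparation step above, the following Python sketch removes a set of direct-identifier columns from tabular records and writes the result as CSV, an open file format. The column names, the sample records, and the helper functions are hypothetical and exist only for this example. Dropping columns is only one part of de-identification and does not by itself satisfy HIPAA requirements, so the resources listed above should still be consulted.

```python
import csv
import io

# Hypothetical column names; a real list would come from your own data dictionary.
DIRECT_IDENTIFIERS = {"name", "mrn", "date_of_birth", "email"}


def drop_identifier_columns(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows without the listed direct-identifier columns."""
    return [
        {key: value for key, value in row.items() if key not in identifiers}
        for row in rows
    ]


def write_csv(rows, handle):
    """Write the rows to an open file handle as plain-text CSV."""
    if not rows:
        return
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample records, used only to show the shape of the data.
records = [
    {"name": "Example Person", "mrn": "000000", "age_group": "40-49", "outcome": "1"},
    {"name": "Another Person", "mrn": "000001", "age_group": "50-59", "outcome": "0"},
]

shared = drop_identifier_columns(records)

# An in-memory buffer stands in for the file you would upload to a repository.
buffer = io.StringIO()
write_csv(shared, buffer)
print(buffer.getvalue())
```

In practice the same approach could write to a file on disk instead of an in-memory buffer, and the remaining columns would still need review for indirect identifiers such as dates and rare values.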


Contact [email protected] or [email protected] for help and guidance.