UCSF Clinical Data

UCSF electronic health record data: What's available for research?

UCSF has identified and de-identified data available for research.

* Identified data: requires an IRB-approved protocol and typically requires funding to work with centralized data experts, who extract the data on your behalf. FAQs & Learn more!

* De-identified (DeID) data: does not require IRB approval and is self-serve via SQL server or point-and-click tools. Learn more!


Many access paths, one point of entry

Need help or guidance? Do you need data for your research with identifiers like birth dates or medical record numbers? Want to make sure you are compliant with regulations? Your first hour of consultation is free. Make sure you head in the right direction.


About the UCSF electronic health record (EHR) data:

  • APeX data dating back to 2012
  • STOR data dating back to 1988
  • Benioff Children's Hospital (BCH) Oakland data dating from March 2020 (with additional select historical data) 
  • Images
  • Clinical notes

Plus additional data, such as:

>> COVID-19 specific data for research is also available


There’s a big difference between "identified" and "de-identified (DeID)" data.







Comparing de-identified data

De-identified Clinical Data Warehouse


Information Commons AWS Cluster

Learn more & access:  De-ID CDW Knowledge Base (login req'd)

Learn more and access (login req'd)
Additional data, including:
  • Financial data
  • Utilization data
  • Historical STOR data

Data based on DeID CDW

Machine-redacted Clinical notes


  • Images**
  • Concepts extracted from notes
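Access via SQL server (De-identified Clinical Data Warehouse) or in the cloud on AWS (Information Commons)
Suited for high-speed queries & data mining
De-identified Clinical Data Warehouse: large files; requires analytics tool skills for queries
Information Commons: Berkeley Spark based; requires SQL, Python, or R skills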

Includes de-identified data from APeX:

  • Demographics
  • Encounters
  • Diagnosis
  • Medications
  • Labs
  • Procedures
  • Flowsheets
  • Vital status from CA Death Registry
  • Refreshed monthly
Does not require IRB approval
Point & click interface available

** Requires IRB approval currently, but "certified" de-identified versions are coming.
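For readers new to self-serve SQL access, the sketch below shows roughly what a cohort query over de-identified data might look like. It is only an illustration: the table names (`demographics`, `diagnosis`), the column names, and the sample rows are hypothetical and do not reflect the actual DeID CDW schema, and an in-memory SQLite database stands in for the real SQL server.

```python
import sqlite3

# Hypothetical, simplified stand-ins for de-identified tables.
# The real DeID CDW schema, table names, and column names will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demographics (patient_key INTEGER PRIMARY KEY, birth_year INTEGER, sex TEXT);
CREATE TABLE diagnosis (patient_key INTEGER, icd10_code TEXT, encounter_year INTEGER);
INSERT INTO demographics VALUES (1, 1950, 'F'), (2, 1985, 'M'), (3, 1962, 'F');
INSERT INTO diagnosis VALUES
    (1, 'E11.9', 2015), (1, 'I10', 2016),
    (2, 'J45.909', 2019),
    (3, 'E11.65', 2020), (3, 'E11.9', 2021);
""")


def type2_diabetes_cohort(connection):
    """Return (patient_key, birth_year, first_dx_year) for patients with any ICD-10 E11.* code."""
    query = """
        SELECT d.patient_key, d.birth_year, MIN(dx.encounter_year) AS first_dx_year
        FROM demographics AS d
        JOIN diagnosis AS dx ON dx.patient_key = d.patient_key
        WHERE dx.icd10_code LIKE 'E11%'
        GROUP BY d.patient_key, d.birth_year
        ORDER BY d.patient_key
    """
    return connection.execute(query).fetchall()


cohort = type2_diabetes_cohort(conn)
for row in cohort:
    print(row)
```

The same pattern (join demographics to a clinical domain, filter on a code list, aggregate per patient) carries over to other domains such as medications or labs, with the actual table and column names taken from the De-ID CDW Knowledge Base.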

First-time user? Request Data Access for Research

Not sure what option is best for your project? Request a free brief consultation for advice.

Already using the DeID CDW or DeID OMOP? Join the active User Group! 


Request identified clinical data; you need a consultation

Identified data is provided by consultation only. The first hour of your consultation is free!

  • Clarity - closest data to APeX; clinical notes available
  • Clinical Data Warehouse (CDW) - concise, pulls common data in Clarity into one field
  • OMOP - uses a national common data model on data derived from PCORnet pSCANNER
    Data is further from its original state, so some information may be lost

The consultant will help you define a data specification. The APeX Pick List and/or ZSFG Pick List (Large Excel files via UCSF Box) are helpful tools for this work - see more information below.

Working with clinical data? Preparation is key.

Be ready with adequate computing capabilities and tools for:

Use the APeX Pick List or the ZSFG Pick List (Large Excel files via UCSF Box) to identify variables for your research and to define your cohort.

  • Diagnoses
  • Meds
  • Labs
  • Procedures
  • Flowsheet
  • Departments
  • Smart Data Elements