UCSF Clinical Data

Many access paths, one point of entry

Need help or guidance? Do you need data with identifiers like birth dates or medical record numbers? Want to make sure you are compliant with regulations? Your first hour of consultation is free. Make sure you head in the right direction.

UCSF electronic medical record data: What's available for research?

UCSF electronic medical record (EMR) data:

  • APeX data dating back to 2012
  • STOR data dating back to 1988
  • Images
  • Clinical notes

Plus additional data, such as:

  • Geocoded address data
  • CA Death Registry data 
  • ZSFG and other Department of Public Health data
  • UC Health data (EMR data from UC Davis, UC Irvine, UCLA, UCSD, UCSF and many others) - patient counts available via ACT Network


There’s a big difference between "identified" and "de-identified" data. And a lot of acronyms!

Comparing de-identified data

Research Data Browser (RDB)

De-identified Clinical Data Warehouse


Information Commons AWS Cluster

Includes de-identified data from APeX:

  • Demographics
  • Encounters
  • Diagnosis
  • Medications
  • Labs
  • Procedures
  • Flowsheets
  • Vital status from CA Death Registry
  • Refreshed monthly
Does not require IRB approval
Point & click interface available
Will be replaced by De-ID CDW and decommissioned in 2020*

Additional data, including:
  • Financial data
  • Utilization data
  • Historical STOR data

Data based on RDB data

De-ID CDW coming soon 


  • Images**
  • Clinical notes**
  • Concepts extracted from notes**
Flat files available (large file size; analytics tool skills needed for queries)
Berkeley Spark based
Access via SQL server
In cloud (AWS)
Ability to use your own preferred programming language with flat files
Need SQL, Python or R skills
Useful for getting patient counts
Suited for high speed queries & data mining

** Requires IRB approval currently, but de-identified versions are coming.

First-time user? Request access to Research Data and Tools

Not sure what option is best for your project? Request a free brief consultation for advice.

Request identified clinical data; you need a consultation

Identified data is provided by consultation only. The first hour of your consultation is free!

  • Clarity - closest data to APeX; clinical notes available
  • Clinical Data Warehouse (CDW) - concise, pulls common data in Clarity into one field
  • OMOP - uses a national common data model on data derived from PCORnet pSCANNER
    Data is further from original state and there is potential to lose information

The consultant will help you define a data specification. The APeX Pick List (Excel 26MB, download from UCSF Box) is a helpful tool for this work - see more information below.

Working with clinical data? Preparation is key.

Be ready with adequate computing capabilities and tools for:

Use the APeX Pick List (Excel 26MB via UCSF Box) to identify variables for your research and to define your cohort.

  • Diagnoses
  • Meds
  • Labs
  • Procedures
  • Flowsheet
  • Departments
  • Smart Data Elements