Data Sets

UCSF Clinical Data

Research access to UCSF electronic medical record data (APeX) - Research Data Browser (RDB), Clinical Data Warehouse (CDW), and more.

  • Summary statistics
  • Generation of condition-specific patient populations for study recruitment / chart review
  • Outcomes research using historical data in hospital databases

Information Commons

Clinical data at scale with very high performance, in an environment suited to pattern recognition and machine learning. This high-performance compute cluster on AWS offers:

  • Access to de-identified structured EHR data; additional data sets coming soon, including de-identified clinical notes and images
  • Spark analytics engine that enables fast data queries via Spark SQL, machine learning via Spark MLlib, and R via SparkR
  • Query data using PatientExploreR  
Free for UCSF Community

VA Data Core Consultation

Access a central data repository with health information from the electronic medical records of over 9 million US Veterans.

  • Provides consultation regarding available data
  • Facilitates necessary paperwork, approvals and regulatory compliance
  • Assists with identifying, extracting, and merging variables of interest
Hourly Recharge, first hour free per project

OptumLabs Data Warehouse (OLDW)

UCOP is partnering with OptumLabs, a collaborative center for research and innovation.

  • Access 160 million de-identified records spanning claims and clinical information to conduct population-level investigations
  • Annual opportunity for funded projects - sign up for notifications

Large Health Dataset Inventory

Data repository infrastructure for accelerated access to and use of local and national health datasets.


Self-service online tool allows researchers to:

  • Describe, upload and share research data using any file format
  • Search for and download research data sets