Data Management

Data Systems Services

The Department of Epidemiology & Biostatistics provides data collection, cleaning, and storage services to research investigators.

  • Cloud computing and server/desktop virtualization, hosted within the UCSF network and compliant with NIST-mandated security protocols
  • Customized programming and data services
  • Customized databases for outcome ascertainment studies

DMPTool

An online application that helps researchers create data management plans.

  • Meets funder requirements
  • Quick-start guide
  • General data management guidance

Eureka Research Platform

For online, mobile, and hybrid research studies, use Eureka, the cloud-based, HIPAA-compliant research platform created at UCSF and approved by UCSF IT Security. Key features include:

  • Mobile and web app study experience for fully remote or hybrid data collection
  • Rapid remote recruitment from 400,000+ already-engaged participants or your own new participants
  • Electronic / remote consent process, questionnaires, and other patient-generated data
  • Wearable and mobile device data collection
  • EHR integration via FHIR
  • Coordinator data entry (e.g., eCRFs)
  • Flexible automated reminders for optimized engagement and follow-up
  • Study management portal with raw data and customizable reports on demand with flexible access permissions
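
As an illustration of what "EHR integration via FHIR" involves at the data level, the sketch below parses a FHIR Patient resource and extracts a few fields. It is not Eureka's API, which this page does not describe. The sample record and the `summarize_patient` helper are hypothetical; only the field names (`resourceType`, `name`, `given`, `family`, `gender`, `birthDate`) come from the FHIR Patient resource definition.

```python
import json

# Hypothetical FHIR Patient resource; every value here is invented for illustration.
SAMPLE_PATIENT_JSON = """
{
  "resourceType": "Patient",
  "id": "example-001",
  "name": [{"use": "official", "family": "Rivera", "given": ["Alex", "J."]}],
  "gender": "female",
  "birthDate": "1980-04-12"
}
"""


def summarize_patient(resource: dict) -> dict:
    """Reduce a FHIR Patient resource to a flat dict (hypothetical helper)."""
    if resource.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")

    # A Patient may carry several names; prefer the one marked "official".
    names = resource.get("name", [])
    fallback = names[0] if names else {}
    chosen = next((n for n in names if n.get("use") == "official"), fallback)
    full_name = " ".join(chosen.get("given", []) + [chosen.get("family", "")]).strip()

    # birthDate may be YYYY, YYYY-MM, or YYYY-MM-DD; the year is always first.
    birth = resource.get("birthDate")
    return {
        "id": resource.get("id"),
        "name": full_name,
        "gender": resource.get("gender"),
        "birth_year": int(birth[:4]) if birth else None,
    }


patient = summarize_patient(json.loads(SAMPLE_PATIENT_JSON))
print(patient)
```

In a real study, the JSON would come from an EHR's FHIR endpoint under the appropriate approvals rather than from an inline string.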

Information Commons

Clinical data at scale with very high performance, in an environment suited to pattern recognition and machine learning. This high-performance compute cluster on AWS offers:

  • Access to de-identified structured EHR data; additional data sets coming soon, including de-identified clinical notes and images
  • Spark analytics engine that enables fast data queries via Spark SQL, machine learning via Spark MLlib, and R via SparkR
  • Query data using PatientExploreR  
Free for UCSF Community

Library Data Science Initiative

Workshops, programs and expertise/office hours in:

  • Finding, Managing & Sharing Data
  • Statistics, Bioinformatics and Genomics
  • Programming in R, Python and more
  • Data Visualization with Tableau

NLP@UCSF

UCSF's NLP community curates knowledge as participants experiment, learn and implement NLP tools in clinical and biomedical research projects.

  • Slack channel and regular meetups
  • Recommended tools for textual analysis of clinical notes

Research Analysis Environment (RAE)

Formerly known as MyResearch, RAE offers secure hosting for sensitive data with web-based management and collaboration tools, including support for large study-specific datasets and AWS resource options.

  • View, manipulate, and save data entirely in a protected environment without storing files on personal computers
  • Free access to research software applications, such as SAS, Stata, SPSS, R, TreeAge, ATLAS.ti, MS Office, and MATLAB (see full list)
  • Collaboration tools, such as SharePoint, facilitate the conduct of multi-site research studies
Free up to 10GB/month for UCSF PIs

Research Electronic Data Capture (REDCap)

Web-based, secure, HIPAA-compliant electronic data capture and storage for research studies.

  • Develop data entry forms and surveys
  • Data validation
  • Database reports
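
To make "data validation" concrete, the following is a minimal sketch of the kind of checks a data entry form applies: required fields, format patterns, numeric ranges, and dates. It does not use or reproduce REDCap's own validation, which is configured inside REDCap itself. The field names, rules, and the `validate_record` helper are all hypothetical.

```python
import re
from datetime import date

# Hypothetical validation rules for an illustrative study form.
RULES = {
    "participant_id": {"required": True, "pattern": r"P\d{4}"},
    "age_years": {"required": True, "min": 18, "max": 110},
    "visit_date": {"required": True, "type": "date"},
    "email": {"required": False, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
}


def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors = []
    for field, rule in RULES.items():
        value = record.get(field)
        if value is None or value == "":
            if rule.get("required"):
                errors.append(f"{field}: required value is missing")
            continue  # nothing further to check on an empty field

        text = str(value)
        if "pattern" in rule and not re.fullmatch(rule["pattern"], text):
            errors.append(f"{field}: '{text}' does not match the expected format")

        if "min" in rule or "max" in rule:
            try:
                number = float(text)
            except ValueError:
                errors.append(f"{field}: '{text}' is not a number")
                continue
            if "min" in rule and number < rule["min"]:
                errors.append(f"{field}: {text} is below the minimum {rule['min']}")
            if "max" in rule and number > rule["max"]:
                errors.append(f"{field}: {text} is above the maximum {rule['max']}")

        if rule.get("type") == "date":
            try:
                date.fromisoformat(text)
            except ValueError:
                errors.append(f"{field}: '{text}' is not a valid ISO 8601 date")
    return errors


good = {"participant_id": "P0042", "age_years": "37", "visit_date": "2024-01-31"}
bad = {"participant_id": "42", "age_years": "12", "visit_date": "2024-02-30", "email": "x"}
print(validate_record(good))
for problem in validate_record(bad):
    print(problem)
```

Returning every problem at once, rather than stopping at the first, mirrors how form tools typically flag every invalid field for the person entering data.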

San Francisco Coordinating Center

Combines scientific expertise with broad experience in managing multi-center studies, and offers access to a network of high-quality, experienced clinical centers.

  • Study design, coordination and implementation
  • Measurement selection
  • Protocol development
  • Database design
  • Research study data collection via fax
  • Data quality control