Deidentified Clinical Data Repository (DCDR)

Clinical data may contain Protected Health Information (PHI), whose analysis requires numerous authorizations and security measures. Impermissible use or disclosure of PHI is subject to public notification (when more than 500 subjects are affected) and possible penalties. HIPAA policies allow limited reuse of unaltered PHI but provide greater flexibility for analysis of “limited datasets” or “fully anonymized datasets.” Generating these datasets requires a high level of expertise.

Our team can help anonymize PHI or transform it into limited datasets, using a pipeline of state-of-the-art public and commercial software and following the guidelines set forth by HHS (http://www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-id...):

  • Safe Harbor: deletion of the 18 identifiers
  • Expert Method: obfuscation of dates by generating per-patient random offsets
  • Expert Method: narrative de-identification algorithms (de-idata, Harvard Scrubber)
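
To illustrate the date-obfuscation item above, here is a minimal sketch of per-patient random date offsets. It is not our production pipeline: the class name DateShifter, the max_days window, and the in-memory offset table are all assumptions made for this example. The idea is that each patient receives one random offset that is applied to every date in that patient's record, so the intervals between events (for example, length of stay) are preserved while the real calendar dates are hidden.

```python
import secrets
from datetime import date, timedelta


class DateShifter:
    """Hypothetical helper: shifts all dates of a patient by one random offset."""

    def __init__(self, max_days: int = 365):
        # Assumed window: offsets are drawn from [-max_days, +max_days].
        self.max_days = max_days
        # Maps patient ID -> offset in days. This table is itself sensitive
        # and would need to be stored securely or destroyed after use.
        self._offsets: dict[str, int] = {}

    def offset_for(self, patient_id: str) -> int:
        """Return the patient's offset, drawing it once on first use."""
        if patient_id not in self._offsets:
            # randbelow(n) returns an integer in [0, n); recentre it on zero.
            # Note: an offset of 0 is possible with this simple scheme.
            draw = secrets.randbelow(2 * self.max_days + 1)
            self._offsets[patient_id] = draw - self.max_days
        return self._offsets[patient_id]

    def shift(self, patient_id: str, original: date) -> date:
        """Apply the patient's offset to a single date."""
        return original + timedelta(days=self.offset_for(patient_id))


if __name__ == "__main__":
    shifter = DateShifter(max_days=365)
    admitted = date(2020, 1, 10)
    discharged = date(2020, 1, 15)
    shifted_admit = shifter.shift("patient-001", admitted)
    shifted_discharge = shifter.shift("patient-001", discharged)
    # The length of stay is unchanged even though both dates moved.
    print(shifted_discharge - shifted_admit == discharged - admitted)
```

Whether a given offset window is adequate for a particular dataset is a judgment for the expert performing the de-identification; the sketch only shows the mechanics.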