Alchemist / Critical Care

Documentation on the data exchange format for HIC Critical Care and the Alchemist ingestion pipeline.

Pseudonymisation

Working notes on how to give each site access to the data whilst minimising both exposure to direct identifiers and the risk of re-identification via indirect identifiers.

Technical options considered:

  • Keep a single centralised schema and use views to present anonymised identifiers.
  • Maintain an anonymised copy of the data (this simplifies the management of permissions but takes more space and may introduce synchronisation errors).
  • Provide column- and table-level permissions, hiding identifiable data.
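The first option can be sketched as follows. This is a minimal illustration only, using an in-memory SQLite database so that it runs anywhere; the mapping table `person_pseudonym`, the view `person_pseudonymised` and the sample values are hypothetical names chosen for this example, not part of the Alchemist schema. SQLite has no permission system, so in a production database read access would presumably be granted on the view alone.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Source table holding identifiable data (subset of OMOP person columns).
    CREATE TABLE person (
        person_id           INTEGER PRIMARY KEY,
        person_source_value TEXT,
        year_of_birth       INTEGER,
        gender_concept_id   INTEGER
    );

    -- Hypothetical mapping from the original id to a generated pseudonym.
    CREATE TABLE person_pseudonym (
        person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
        pseudo_id INTEGER UNIQUE NOT NULL
    );

    -- The view exposes the pseudonym in place of person_id and omits
    -- person_source_value altogether.
    CREATE VIEW person_pseudonymised AS
    SELECT m.pseudo_id         AS person_id,
           p.year_of_birth     AS year_of_birth,
           p.gender_concept_id AS gender_concept_id
    FROM person AS p
    JOIN person_pseudonym AS m ON m.person_id = p.person_id;
    """
)

# Illustrative values only.
conn.execute("INSERT INTO person VALUES (1001, 'MRN-0001', 1950, 8507)")
conn.execute("INSERT INTO person_pseudonym VALUES (1001, 1)")

cursor = conn.execute("SELECT * FROM person_pseudonymised")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

Because the view is computed from the single centralised schema, there is no second copy to keep in sync; the trade-off is that permissions on the underlying tables must be managed carefully.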

These notes will be built up bundle by bundle.

OMOP Person bundle

Hide the following items:

  • person_source_value : this is very likely to be the local MRN (Medical Record Number).

Replace the following items:

  • person_id : in Silver, we will assume that the id is potentially identifying (i.e. it is possible but unlikely that sites chose to use the local hospital number). This value will be replaced by a generated number when transferring data into Gold.

Do not hide the following items (on the basis that these are not direct identifiers):

  • Columns relating to the patient’s date of birth.
  • Columns relating to gender.
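The rules for this bundle could be applied during the Silver to Gold transfer along the following lines. This is a sketch under stated assumptions: rows are represented as plain dictionaries, and the function names `new_gold_id` and `to_gold`, the `id_map` argument and the choice of a random positive 31-bit integer are all hypothetical; the notes above only say that `person_id` is replaced by "a generated number".

```python
import secrets

# Columns hidden in Gold for the OMOP Person bundle.
HIDDEN_COLUMNS = {"person_source_value"}


def new_gold_id(used):
    """Return a random positive integer that is not already in `used`."""
    while True:
        candidate = secrets.randbelow(2**31 - 1) + 1
        if candidate not in used:
            return candidate


def to_gold(silver_rows, id_map):
    """Copy Silver person rows to Gold form.

    `id_map` maps Silver person_id -> Gold person_id and is updated in
    place, so the same person always receives the same Gold id.
    """
    used = set(id_map.values())
    gold_rows = []
    for row in silver_rows:
        silver_id = row["person_id"]
        if silver_id not in id_map:
            id_map[silver_id] = new_gold_id(used)
            used.add(id_map[silver_id])
        # Drop hidden columns; date-of-birth and gender columns pass through.
        gold = {k: v for k, v in row.items() if k not in HIDDEN_COLUMNS}
        gold["person_id"] = id_map[silver_id]
        gold_rows.append(gold)
    return gold_rows
```

The mapping itself is re-identifying and would need to be stored with the same protection as the Silver data.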

Data Visibility

| Data Category | Fields | Description | Bundle | Silver | Gold (Pseudo anonymised) | Bespoke Release |
|---|---|---|---|---|---|---|
| Episode Descriptor | person_id | ID unique to the data set | OMOP Person | Yes | Yes | Yes |
| Episode Descriptor | person_source_value | ID unique at the patient's site | OMOP Person | Yes | No | No |
| Direct identifier | NHS number | Only used for Data Linkage | HIC Person | Yes | No | No |
| Identifying | year_of_birth | | OMOP Person | Yes | Yes | Provided as age at admission |
| Direct identifier | birth_date_time | | OMOP Person | Yes | Provided as age at admission | Provided as age at admission |
| Direct identifier | death_date | | HIC Person | Yes | Revert to 01/01/2020 maintaining $\Delta$ with date of admission | Revert to 01/01/2020 maintaining $\Delta$ with date of admission |
| Identifying | Post Code | Hospital, GP, Person | HIC Person | Yes | Max 2 inbound or transformation to deprivation index | Deprivation Index Only |
| Date time | visit_start_date | Hospital, ICU, Ward | HIC Person | Yes | Only if directly required by research question | Revert to 01/01/2020 maintaining seasonal cadence |
| Date time | visit_end_date | Hospital, ICU, Ward | HIC Person | Yes | Only if directly required by research question | Revert to 01/01/2020 maintaining seasonal cadence |
| Date time | Multiple possibilities | Tests, interventions, results | All Bundles | Yes | Only if directly required by research question | Revert to 01/01/2020 maintaining seasonal cadence |
| Sensitive | Comorbidities | Sensitive or related diagnosis | Diagnoses | Yes | Only if directly required by research question | No |
| Sensitive | Diagnosis | Sensitive or related diagnosis | Diagnoses | Yes | Only if directly required by research question | No |
| Sensitive | Drugs | Sensitive or related diagnosis related | Drugs Basics | Yes | Only if directly required by research question | No |
| Sensitive | Test Results | Sensitive or related diagnosis related | Pathology Basics | Yes | Only if directly required by research question | No |
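
Two of the transformations named in the table, "Provided as age at admission" and "Revert to 01/01/2020 maintaining Δ with date of admission", can be illustrated as below. This is a sketch of one possible reading: the function names are hypothetical, the anchor is assumed to be midnight on 1 January 2020, and age is assumed to mean completed years. The "maintaining seasonal cadence" variant is not covered here, because the notes do not define it.

```python
from datetime import datetime

# Assumed anchor: the admission date is moved to midnight on 1 January 2020.
ANCHOR = datetime(2020, 1, 1)


def shift_to_anchor(event, admission, anchor=ANCHOR):
    """Re-express `event` relative to the anchor, keeping its offset
    (the delta) from the admission date unchanged."""
    return anchor + (event - admission)


def age_at_admission(birth, admission):
    """Age in completed years on the day of admission."""
    had_birthday = (admission.month, admission.day) >= (birth.month, birth.day)
    return admission.year - birth.year - (0 if had_birthday else 1)


# Illustrative values only.
admission = datetime(2017, 3, 14, 9, 30)
death = datetime(2017, 3, 20, 9, 30)  # six days after admission
birth = datetime(1950, 6, 1)

shifted_death = shift_to_anchor(death, admission)   # six days after the anchor
age = age_at_admission(birth, admission)            # birthday not yet reached in 2017
```

Shifting every date for a patient by the same offset preserves intervals such as length of stay while removing the real calendar dates.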