Concept
The Standardized Vocabularies contain records, or Concepts, that uniquely identify each fundamental unit of meaning used to express clinical information in all domain tables of the CDM. Concepts are derived from vocabularies, which represent clinical information across a domain (e.g. conditions, drugs, procedures) through the use of codes and associated descriptions. Some Concepts are designated Standard Concepts, meaning these Concepts can be used as normative expressions of a clinical entity within the OMOP Common Data Model and within standardized analytics. Each Standard Concept belongs to one domain, which defines the location where the Concept would be expected to occur within data tables of the CDM.
Concepts can represent broad categories (like “Cardiovascular disease”), detailed clinical elements (“Myocardial infarction of the anterolateral wall”) or modifying characteristics and attributes that define Concepts at various levels of detail (severity of a disease, associated morphology, etc.).
Records in the Standardized Vocabularies tables are derived from national or international vocabularies such as SNOMED-CT, RxNorm, and LOINC, or are custom Concepts defined to cover various aspects of observational data analysis. For a detailed description of these vocabularies, their use in the OMOP CDM, and their relationships to each other, please refer to the specifications.
Field | Required | Type | FK Table | Description |
---|---|---|---|---|
concept_id | Yes | bigint | | A unique identifier for each Concept across all domains. |
concept_name | Yes | varchar(255) | | An unambiguous, meaningful and descriptive name for the Concept. |
domain_id | Yes | varchar(20) | DOMAIN | A foreign key to the Domain the Concept belongs to. |
vocabulary_id | Yes | varchar(20) | VOCABULARY | A foreign key to the Vocabulary record indicating from which source the Concept has been adapted. |
concept_class_id | Yes | varchar(20) | CONCEPT_CLASS | The attribute or Concept Class of the Concept. Examples are “Clinical Drug”, “Ingredient”, “Clinical Finding”, etc. |
standard_concept | No | varchar(1) | | This flag determines whether a Concept is a Standard Concept, i.e. is used in the data, a Classification Concept, or a non-standard Source Concept. The allowable values are “S” (Standard Concept) and “C” (Classification Concept); otherwise the content is NULL. |
concept_code | Yes | varchar(50) | | The concept code represents the identifier of the Concept in the source vocabulary, such as SNOMED-CT concept IDs, RxNorm RXCUIs, etc. Note that concept codes are not unique across vocabularies. |
valid_start_date | Yes | date | | The date when the Concept was first recorded. The default value is “1-Jan-1970”, meaning the Concept has no (known) date of inception. |
valid_end_date | Yes | date | | The date when the Concept became invalid because it was deleted or superseded (updated) by a new Concept. The default value is “31-Dec-2099”, meaning the Concept is valid until it becomes deprecated. |
invalid_reason | No | varchar(1) | | Reason the Concept was invalidated. Possible values are “D” (deleted), “U” (replaced with an update) or NULL when valid_end_date has the default value. |
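
As a rough illustration of the fields above, a CONCEPT record could be modeled in Python as follows. This is only a sketch: the class name `Concept`, the `kind()` helper and the example values are hypothetical and not part of the CDM, and the field length limits (e.g. varchar(255)) are not enforced.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Default lifespan values named in the field descriptions.
DEFAULT_VALID_START = date(1970, 1, 1)
DEFAULT_VALID_END = date(2099, 12, 31)


@dataclass(frozen=True)
class Concept:
    """Hypothetical in-memory mirror of one CONCEPT row."""
    concept_id: int
    concept_name: str
    domain_id: str
    vocabulary_id: str
    concept_class_id: str
    concept_code: str
    standard_concept: Optional[str] = None   # "S", "C" or None (NULL)
    valid_start_date: date = DEFAULT_VALID_START
    valid_end_date: date = DEFAULT_VALID_END
    invalid_reason: Optional[str] = None     # "D", "U" or None (NULL)

    def __post_init__(self) -> None:
        # Reject flag values outside the ones listed in the table.
        if self.standard_concept not in ("S", "C", None):
            raise ValueError(f"bad standard_concept: {self.standard_concept!r}")
        if self.invalid_reason not in ("D", "U", None):
            raise ValueError(f"bad invalid_reason: {self.invalid_reason!r}")

    def kind(self) -> str:
        """Translate the standard_concept flag into a readable label."""
        if self.standard_concept == "S":
            return "standard"
        if self.standard_concept == "C":
            return "classification"
        return "source"  # NULL flag: non-standard Source Concept


# Made-up values for illustration only; not a real vocabulary entry.
example = Concept(
    concept_id=1,
    concept_name="Example condition",
    domain_id="Condition",
    vocabulary_id="SNOMED",
    concept_class_id="Clinical Finding",
    concept_code="X1",
    standard_concept="S",
)
```

Calling `example.kind()` returns the label for the flag, and the date fields fall back to the documented defaults when they are not supplied.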
Conventions
Concepts in the Common Data Model are derived from a number of public or proprietary terminologies such as SNOMED-CT and RxNorm, or are custom generated to standardize aspects of observational data. Both types of Concepts are integrated based on the following rules:
- All Concepts are maintained centrally by the CDM and Vocabularies Working Group. Additional concepts can be added, as needed, upon request.
- For all Concepts, whether they are custom generated or adopted from published terminologies, a unique numeric identifier concept_id is assigned and used as the key to link all observational data to the corresponding Concept reference data.
- The concept_id of a Concept is persistent, i.e. stays the same for the same Concept between releases of the Standardized Vocabularies.
- A descriptive name for each Concept is stored as the Concept Name as part of the CONCEPT table. Additional names and descriptions for the Concept are stored as Synonyms in the CONCEPT_SYNONYM table.
- Each Concept is assigned to a Domain. For Standard Concepts, this is always a single Domain. Source Concepts can be composite or coordinated entities, and therefore can belong to more than one Domain. The domain_id field of the record contains the abbreviation of the Domain, or Domain combination. Please refer to the Standardized Vocabularies specification for details of the Domain Assignment.
- For details of the Vocabularies adopted for use in the OMOP CDM refer to the Standardized Vocabularies specification.
- Concept Class designations are attributes of Concepts. Each Vocabulary has its own set of permissible Concept Classes, although the same Concept Class can be used by more than one Vocabulary. Depending on the Vocabulary, the Concept Class may categorize Concepts vertically (parallel) or horizontally (hierarchically). See the specification of each vocabulary for details.
- Concept Class attributes should not be confused with Classification Concepts. These are separate Concepts that have a hierarchical relationship to Standard Concepts or each other, while Concept Classes are unique Vocabulary-specific attributes for each Concept.
- For Concepts inherited from published terminologies, the source code is retained in the concept_code field and can be used to reference the source vocabulary.
- Standard Concepts (designated as “S” in the standard_concept field) may appear in CDM tables in all *_concept_id fields, whereas Classification Concepts (“C”) should not appear in the CDM data, but participate in the construction of the CONCEPT_ANCESTOR table and can be used to identify Descendants that may appear in the data. Non-standard Concepts can only appear in *_source_concept_id fields and are not used in the CONCEPT_ANCESTOR table. Please refer to the Standardized Vocabularies specifications for details of the Standard Concept designation.
- All logical data elements associated with the various CDM tables (usually in the <domain>_type_concept_id fields) are called Type Concepts, including defining characteristics, qualifying attributes, etc. They are also stored as Concepts in the CONCEPT table. Since they are generated by OMOP, there is no meaningful concept_code.
- The lifespan of a Concept is recorded through its valid_start_date, valid_end_date and invalid_reason fields. This allows Concepts to correctly reflect the point in time at which they were defined. Usually, Concepts get deprecated if their meaning was deemed ambiguous, a duplication of another Concept, or in need of revision for scientific reasons. For example, drug ingredients get updated when different salt or isomer variants enter the market. Usually, drugs taken off the market do not cause a deprecation by the terminology vendor. Since observational data are valid with respect to the time they are recorded, it is key for the Standardized Vocabularies to provide even obsolete codes and maintain their relationships to other current Concepts.
- Concepts without a known instantiated date are assigned a valid_start_date of “1-Jan-1970”.
- Concepts that are not invalid are assigned a valid_end_date of “31-Dec-2099”.
- Deprecated Concepts (with a valid_end_date before the release date of the Standardized Vocabularies) will have an invalid_reason value of “D” (deprecated without successor) or “U” (updated). The updated Concepts have a record in the CONCEPT_RELATIONSHIP table indicating their active replacement Concept.
- Values for Concept IDs generated as part of Standardized Vocabularies will be reserved from 0 to 2,000,000,000. Above this range, Concept IDs are available for local use and are guaranteed not to clash with future releases of the Standardized Vocabularies.
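
Several of the conventions above lend themselves to simple mechanical checks. The following Python sketch shows one possible reading of three of them: the reserved Concept ID range, the fields in which each kind of Concept may appear, and the consistency of invalid_reason with valid_end_date. The function names are hypothetical, and the field check matches on name suffixes only, which is an assumption made for illustration rather than a rule stated by the specification.

```python
from datetime import date
from typing import Optional

# Upper bound of the range reserved for the Standardized Vocabularies.
RESERVED_MAX_CONCEPT_ID = 2_000_000_000
DEFAULT_VALID_END = date(2099, 12, 31)


def is_local_concept_id(concept_id: int) -> bool:
    """True if the ID lies above the reserved range (available for local use)."""
    return concept_id > RESERVED_MAX_CONCEPT_ID


def may_appear_in(standard_concept: Optional[str], field_name: str) -> bool:
    """Check a Concept's flag against the CDM field it is stored in.

    Assumption: fields are recognized purely by their name suffix.
    """
    if not field_name.endswith("_concept_id"):
        return False  # not a concept reference field at all
    if standard_concept == "S":
        return True   # Standard Concepts: all *_concept_id fields
    if standard_concept == "C":
        return False  # Classification Concepts should not appear in data
    # Non-standard (NULL flag): only *_source_concept_id fields
    return field_name.endswith("_source_concept_id")


def lifespan_is_consistent(valid_end_date: date,
                           invalid_reason: Optional[str]) -> bool:
    """invalid_reason is NULL for the default end date, else "D" or "U"."""
    if valid_end_date == DEFAULT_VALID_END:
        return invalid_reason is None
    return invalid_reason in ("D", "U")
```

For instance, `may_appear_in(None, "condition_concept_id")` is false under this reading, because a non-standard Concept belongs in the corresponding source field instead.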