Initial commit
284
skills/pyhealth/references/medical_coding.md
Normal file
@@ -0,0 +1,284 @@
# PyHealth Medical Code Translation

## Overview

Healthcare data uses multiple coding systems and standards. PyHealth's MedCode module enables translation and mapping between medical coding systems through ontology lookups and cross-system mappings.

## Core Classes

### InnerMap
Handles within-system ontology lookups and hierarchical navigation.

**Key Capabilities:**
- Code lookup with attributes (names, descriptions)
- Ancestor/descendant hierarchy traversal
- Code standardization and conversion
- Parent-child relationship navigation
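
To make the hierarchy traversal idea concrete, here is a minimal sketch that uses a hand-written parent table for a small fragment of ICD-9-CM. The `PARENT` table and both helper functions are illustrative assumptions; they are not how PyHealth stores or walks its ontologies.

```python
from typing import Dict, List

# Toy parent-pointer table (assumption): child code -> parent code.
PARENT: Dict[str, str] = {
    "428.0": "428",
    "428.1": "428",
    "428": "420-429",
    "420-429": "390-459",
}


def get_ancestors(code: str) -> List[str]:
    """Walk parent pointers upward, nearest ancestor first."""
    ancestors = []
    while code in PARENT:
        code = PARENT[code]
        ancestors.append(code)
    return ancestors


def get_descendants(code: str) -> List[str]:
    """Collect every code whose ancestor chain passes through `code`."""
    return sorted(c for c in PARENT if code in get_ancestors(c))


print(get_ancestors("428.0"))
print(get_descendants("428"))
```

Ancestor lookup follows one chain upward, while descendant lookup has to scan the table, which is why libraries typically precompute or index the hierarchy.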

### CrossMap
Manages cross-system mappings between different coding standards.

**Key Capabilities:**
- Translation between coding systems
- Many-to-many relationship handling
- Hierarchical level specification (for medications)
- Bidirectional mapping support
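
The many-to-many and bidirectional ideas above can be pictured with plain dictionaries. The sketch below uses placeholder codes (`A1`, `X`, and so on) and hypothetical helpers `translate` and `invert`; it illustrates the data shape only and makes no claim about CrossMap's internals.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical source -> target table; the codes are placeholders.
FORWARD: Dict[str, List[str]] = {
    "A1": ["X"],
    "A2": ["X", "Y"],  # one source code with several targets
    "A3": [],          # a source code with no translation
}


def translate(mapping: Dict[str, List[str]], code: str) -> List[str]:
    """Return all targets for `code`, or an empty list when unmapped."""
    return mapping.get(code, [])


def invert(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build the reverse table so lookups work in the other direction."""
    reverse = defaultdict(list)
    for source, targets in mapping.items():
        for target in targets:
            reverse[target].append(source)
    return dict(reverse)


REVERSE = invert(FORWARD)
print(translate(FORWARD, "A2"))
print(translate(REVERSE, "X"))
```

Returning a list in every case, including the empty one, keeps calling code uniform regardless of how many targets a code has.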

## Supported Coding Systems

### Diagnosis Codes

**ICD-9-CM (International Classification of Diseases, 9th Revision, Clinical Modification)**
- Legacy diagnosis coding system
- Hierarchical structure with 3-5 digit codes
- Used in US healthcare until the ICD-10 transition on October 1, 2015
- Usage: `from pyhealth.medcode import InnerMap`
  - `icd9_map = InnerMap.load("ICD9CM")`

**ICD-10-CM (International Classification of Diseases, 10th Revision, Clinical Modification)**
- Current US diagnosis coding standard
- Alphanumeric codes (3-7 characters)
- More granular than ICD-9
- Usage: `from pyhealth.medcode import InnerMap`
  - `icd10_map = InnerMap.load("ICD10CM")`
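
When ICD-9 and ICD-10 codes are mixed in one column, a quick shape check can help route each code to the right map. The sketch below is an assumption-laden heuristic: the helper name `looks_like_icd10cm` is hypothetical, and the pattern only tests the 3-7 character alphanumeric shape, not whether the code actually exists.

```python
import re

# Letter, digit, alphanumeric, then up to four more alphanumerics,
# with an optional dot after the third character.
_ICD10CM_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.?[0-9A-Z]{1,4})?$")


def looks_like_icd10cm(code: str) -> bool:
    """Return True when `code` has the general shape of an ICD-10-CM code."""
    return bool(_ICD10CM_SHAPE.match(code.strip().upper()))


for candidate in ["I50.9", "S72.001A", "428.0"]:
    print(candidate, looks_like_icd10cm(candidate))
```

Because ICD-9-CM diagnosis codes start with a digit (or `E`/`V` followed by digits only), this check separates most of them from ICD-10-CM, but ambiguous inputs should still be resolved from the source data's metadata.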

**CCSCM (Clinical Classifications Software for ICD-CM)**
- Groups ICD codes into clinically meaningful categories
- Reduces dimensionality for analysis
- Single-level and multi-level hierarchies
- Usage: `from pyhealth.medcode import CrossMap`
  - `icd_to_ccs = CrossMap.load("ICD9CM", "CCSCM")`

### Procedure Codes

**ICD-9-PROC (ICD-9 Procedure Codes)**
- Inpatient procedure classification
- 3-4 digit numeric codes
- Legacy system (pre-2015)
- Usage: `from pyhealth.medcode import InnerMap`
  - `icd9proc_map = InnerMap.load("ICD9PROC")`

**ICD-10-PROC (ICD-10 Procedure Coding System)**
- Current procedural coding standard
- 7-character alphanumeric codes
- More detailed than ICD-9-PROC
- Usage: `from pyhealth.medcode import InnerMap`
  - `icd10proc_map = InnerMap.load("ICD10PROC")`

**CCSPROC (Clinical Classifications Software for Procedures)**
- Groups procedure codes into categories
- Simplifies procedure analysis
- Usage: `from pyhealth.medcode import CrossMap`
  - `proc_to_ccs = CrossMap.load("ICD9PROC", "CCSPROC")`

### Medication Codes

**NDC (National Drug Code)**
- US FDA drug identification system
- 10 or 11-digit codes
- Product-level specificity (manufacturer, strength, package)
- Usage: `from pyhealth.medcode import InnerMap`
  - `ndc_map = InnerMap.load("NDC")`
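
Because NDCs circulate in both 10-digit hyphenated and 11-digit forms, it often helps to normalize them before any lookup. The sketch below zero-pads the short segment of a hyphenated code to reach the 5-4-2 layout. The helper name `to_ndc11` is hypothetical, and whether a given mapping table expects hyphens is an assumption to verify against your data.

```python
# Accepted hyphenated layouts: (labeler, product, package) segment lengths.
VALID_SHAPES = {(4, 4, 2), (5, 3, 2), (5, 4, 1), (5, 4, 2)}


def to_ndc11(ndc: str) -> str:
    """Convert a hyphenated NDC to the 11-digit 5-4-2 form without hyphens."""
    parts = ndc.strip().split("-")
    shape = tuple(len(p) for p in parts)
    if shape not in VALID_SHAPES or not all(p.isdigit() for p in parts):
        raise ValueError(f"unrecognized NDC format: {ndc!r}")
    labeler, product, package = parts
    # Pad whichever segment is short with leading zeros.
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


print(to_ndc11("0074-3799-13"))
```

A 10-digit NDC with the hyphens already removed cannot be converted reliably, since the segment boundaries are no longer known; the function therefore rejects such input instead of guessing.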

**RxNorm**
- Standardized drug terminology
- Normalized drug names and relationships
- Links multiple drug vocabularies
- Usage: `from pyhealth.medcode import CrossMap`
  - `ndc_to_rxnorm = CrossMap.load("NDC", "RXNORM")`

**ATC (Anatomical Therapeutic Chemical Classification)**
- WHO drug classification system
- 5-level hierarchy:
  - **Level 1**: Anatomical main group (1 letter)
  - **Level 2**: Therapeutic subgroup (2 digits)
  - **Level 3**: Pharmacological subgroup (1 letter)
  - **Level 4**: Chemical subgroup (1 letter)
  - **Level 5**: Chemical substance (2 digits)
- Example: "C03CA01" = Furosemide
  - C = Cardiovascular system
  - C03 = Diuretics
  - C03C = High-ceiling diuretics
  - C03CA = Sulfonamides
  - C03CA01 = Furosemide

**Usage:**
```python
from pyhealth.medcode import CrossMap
ndc_to_atc = CrossMap.load("NDC", "ATC")
# The ATC level is passed through target_kwargs rather than as a direct keyword
atc_codes = ndc_to_atc.map("00074-3799-13", target_kwargs={"level": 3})  # Get ATC level 3
```
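
Since each ATC level is a fixed-length prefix of the full code, the five levels can be recovered by slicing. The following is a minimal sketch of that idea; `ATC_LEVEL_LENGTH`, `atc_at_level`, and `atc_levels` are hypothetical names used for illustration, not PyHealth functions.

```python
from typing import List

# Cumulative prefix length for each ATC level (1 letter, +2 digits, +1, +1, +2).
ATC_LEVEL_LENGTH = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def atc_at_level(code: str, level: int) -> str:
    """Truncate an ATC code to the requested level."""
    length = ATC_LEVEL_LENGTH[level]
    if len(code) < length:
        raise ValueError(f"{code!r} is shorter than ATC level {level}")
    return code[:length]


def atc_levels(code: str) -> List[str]:
    """Return every level prefix available for `code`, coarsest first."""
    return [code[:n] for n in ATC_LEVEL_LENGTH.values() if len(code) >= n]


print(atc_levels("C03CA01"))
print(atc_at_level("C03CA01", 3))
```

For the furosemide example, this yields the same prefixes listed in the hierarchy breakdown: `C`, `C03`, `C03C`, `C03CA`, and `C03CA01`.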

## Common Operations

### InnerMap Operations

**1. Code Lookup**
```python
from pyhealth.medcode import InnerMap

icd9_map = InnerMap.load("ICD9CM")
info = icd9_map.lookup("428.0")  # Heart failure
# Returns: name, description, additional attributes
```

**2. Ancestor Traversal**
```python
# Get all parent codes in hierarchy
ancestors = icd9_map.get_ancestors("428.0")
# Returns: ["428", "420-429", "390-459"]
```

**3. Descendant Traversal**
```python
# Get all child codes
descendants = icd9_map.get_descendants("428")
# Returns: ["428.0", "428.1", "428.2", ...]
```

**4. Code Standardization**
```python
# Normalize code format
standard_code = icd9_map.standardize("4280")  # Returns "428.0"
```
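
To show what standardization of ICD-9-CM codes typically involves, here is a minimal sketch of the usual dot-placement rule: the decimal point goes after the third character, or after the fourth for `E` codes. The function `standardize_icd9cm` is a hypothetical stand-in written for illustration and is not guaranteed to match PyHealth's own implementation.

```python
def standardize_icd9cm(code: str) -> str:
    """Insert the decimal point into a dotless ICD-9-CM diagnosis code."""
    code = code.strip().upper()
    if "." in code:
        return code  # already in dotted form
    split = 4 if code.startswith("E") else 3
    if len(code) <= split:
        return code  # category-level code, no decimal part
    return code[:split] + "." + code[split:]


for raw in ["4280", "428", "E8490", "V1582"]:
    print(raw, "->", standardize_icd9cm(raw))
```

The rule is purely positional, so an input such as `42800` becomes `428.00` rather than `428.0`; trailing digits are never dropped.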

### CrossMap Operations

**1. Direct Translation**
```python
from pyhealth.medcode import CrossMap

# ICD-9-CM to CCS
icd_to_ccs = CrossMap.load("ICD9CM", "CCSCM")
ccs_codes = icd_to_ccs.map("82101")  # ICD-9-CM 821.01: closed fracture of shaft of femur
# Returns a list of CCS category codes for this diagnosis
```

**2. Hierarchical Drug Mapping**
```python
# NDC to ATC at different levels
ndc_to_atc = CrossMap.load("NDC", "ATC")

# Get specific ATC level (passed through target_kwargs)
atc_level_1 = ndc_to_atc.map("00074-3799-13", target_kwargs={"level": 1})  # Anatomical group
atc_level_3 = ndc_to_atc.map("00074-3799-13", target_kwargs={"level": 3})  # Pharmacological
atc_level_5 = ndc_to_atc.map("00074-3799-13", target_kwargs={"level": 5})  # Chemical substance
```

**3. Bidirectional Mapping**
```python
# Map in either direction
rxnorm_to_ndc = CrossMap.load("RXNORM", "NDC")
ndc_codes = rxnorm_to_ndc.map("197381")  # Get all NDC codes for RxNorm
```

## Workflow Examples

### Example 1: Standardize and Group Diagnoses
```python
from pyhealth.medcode import InnerMap, CrossMap

# Load maps
icd9_map = InnerMap.load("ICD9CM")
icd_to_ccs = CrossMap.load("ICD9CM", "CCSCM")

# Process diagnosis codes recorded with and without the decimal point
raw_codes = ["4280", "428.0"]

standardized = [icd9_map.standardize(code) for code in raw_codes]
# Both become "428.0"

ccs_categories = []
for code in standardized:
    targets = icd_to_ccs.map(code)
    if targets:  # map() returns a list; skip codes with no CCS mapping
        ccs_categories.append(targets[0])
# Both map to CCS category "108" (Heart failure)
```

### Example 2: Drug Classification Analysis
```python
from pyhealth.medcode import CrossMap

# Map NDC to ATC for drug class analysis
ndc_to_atc = CrossMap.load("NDC", "ATC")

patient_drugs = ["00074-3799-13", "00074-7286-01", "00456-0765-01"]

# Get therapeutic subgroups (ATC level 2)
drug_classes = []
for ndc in patient_drugs:
    atc_codes = ndc_to_atc.map(ndc, target_kwargs={"level": 2})
    if atc_codes:
        drug_classes.append(atc_codes[0])

# Analyze drug class distribution
```
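
The final step in Example 2 is left as a comment. One simple way to summarize the distribution is `collections.Counter`, sketched below. The `drug_classes` values here are hypothetical ATC level-2 codes standing in for real mapping output.

```python
from collections import Counter

# Hypothetical ATC level-2 codes, standing in for the mapped drug_classes list.
drug_classes = ["C03", "C03", "A10", "N02"]

counts = Counter(drug_classes)
total = sum(counts.values())

# Share of prescriptions per therapeutic subgroup, most common first.
shares = {atc: n / total for atc, n in counts.most_common()}

for atc, share in shares.items():
    print(f"{atc}: {counts[atc]} drugs ({share:.0%})")
```

Counting at level 2 keeps the number of categories small; counting at a deeper level gives finer groups at the cost of sparser counts.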

### Example 3: ICD-9 to ICD-10 Migration
```python
from pyhealth.medcode import CrossMap

# Load ICD-9 to ICD-10 mapping
icd9_to_icd10 = CrossMap.load("ICD9CM", "ICD10CM")

# Convert historical ICD-9 codes
icd9_code = "428.0"
icd10_codes = icd9_to_icd10.map(icd9_code)
# Returns a list, because one ICD-9 code can map to several ICD-10 codes (e.g. "I50.9")

# Handle one-to-many mappings
for icd10_code in icd10_codes:
    print(f"ICD-9 {icd9_code} -> ICD-10 {icd10_code}")
```

## Integration with Datasets

Code mappings can be applied directly to events loaded from PyHealth datasets:

```python
from pyhealth.datasets import MIMIC4Dataset
from pyhealth.medcode import CrossMap

# Load dataset
dataset = MIMIC4Dataset(root="/path/to/data")

# Load code mapping
icd_to_ccs = CrossMap.load("ICD10CM", "CCSCM")

# Process patient diagnoses
for patient in dataset.iter_patients():
    for visit in patient.visits:
        diagnosis_events = [e for e in visit.events if e.vocabulary == "ICD10CM"]

        for event in diagnosis_events:
            ccs_codes = icd_to_ccs.map(event.code)
            print(f"Diagnosis {event.code} -> CCS {ccs_codes}")
```

## Use Cases

### Clinical Research
- Standardize diagnoses across different coding systems
- Group related conditions for cohort identification
- Harmonize multi-site studies with different standards

### Drug Safety Analysis
- Classify medications by therapeutic class
- Identify drug-drug interactions at class level
- Analyze polypharmacy patterns

### Healthcare Analytics
- Reduce diagnosis/procedure dimensionality
- Create meaningful clinical categories
- Enable longitudinal analysis across coding system changes

### Machine Learning
- Create consistent feature representations
- Handle vocabulary mismatch in training/test data
- Generate hierarchical embeddings

## Best Practices

1. **Always standardize codes** before mapping to ensure consistent format
2. **Handle one-to-many mappings** appropriately (some codes map to multiple targets)
3. **Specify ATC level** explicitly when mapping drugs to avoid ambiguity
4. **Use CCS categories** to reduce diagnosis/procedure dimensionality
5. **Validate mappings** as some codes may not have direct translations
6. **Document code versions** (ICD-9 vs ICD-10) to maintain data provenance
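
Several of these practices (standardize first, keep all targets, track codes without a translation) can be combined in one small wrapper. The sketch below is a hypothetical helper, `map_codes`, that accepts any standardizer and mapper callables; the `fake_standardize` and `fake_map` functions and their table are stand-ins so the example runs without PyHealth.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def map_codes(
    codes: Iterable[str],
    standardize: Callable[[str], str],
    mapper: Callable[[str], List[str]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Standardize then map each code; report codes with no translation."""
    mapped: Dict[str, List[str]] = {}
    unmapped: List[str] = []
    for raw in codes:
        targets = mapper(standardize(raw))
        if targets:
            mapped[raw] = list(targets)  # keep every target, not just the first
        else:
            unmapped.append(raw)
    return mapped, unmapped


# Stand-ins (assumptions) for a real standardizer and mapping table.
_FAKE_TABLE = {"428.0": ["108"]}


def fake_standardize(code: str) -> str:
    return code if "." in code or len(code) <= 3 else code[:3] + "." + code[3:]


def fake_map(code: str) -> List[str]:
    return _FAKE_TABLE.get(code, [])


mapped, unmapped = map_codes(["4280", "428.0", "99999"], fake_standardize, fake_map)
print(mapped)
print(unmapped)
```

With PyHealth loaded, the same wrapper could be called with `icd9_map.standardize` and `icd_to_ccs.map` from the examples above in place of the stand-ins. Logging the `unmapped` list alongside the code system version supports the validation and provenance points.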