Initial commit
This commit is contained in:
472
skills/fda-database/references/other.md
Normal file
472
skills/fda-database/references/other.md
Normal file
@@ -0,0 +1,472 @@
|
||||
# FDA Other Databases - Substances and NSDE
|
||||
|
||||
This reference covers FDA substance-related and other specialized API endpoints accessible through openFDA.
|
||||
|
||||
## Overview
|
||||
|
||||
The FDA maintains additional databases for substance-level information that is precise to the molecular level. These databases support regulatory activities across drugs, biologics, devices, foods, and cosmetics.
|
||||
|
||||
## Available Endpoints
|
||||
|
||||
### 1. Substance Data
|
||||
|
||||
**Endpoint**: `https://api.fda.gov/other/substance.json`
|
||||
|
||||
**Purpose**: Access substance information that is precise to the molecular level for internal and external use. This includes information about active pharmaceutical ingredients, excipients, and other substances used in FDA-regulated products.
|
||||
|
||||
**Data Source**: FDA Global Substance Registration System (GSRS)
|
||||
|
||||
**Key Fields**:
|
||||
- `uuid` - Unique substance identifier (UUID)
|
||||
- `approvalID` - FDA Unique Ingredient Identifier (UNII)
|
||||
- `approved` - Approval date
|
||||
- `substanceClass` - Type of substance (chemical, protein, nucleic acid, polymer, etc.)
|
||||
- `names` - Array of substance names
|
||||
- `names.name` - Name text
|
||||
- `names.type` - Name type (systematic, brand, common, etc.)
|
||||
- `names.preferred` - Whether preferred name
|
||||
- `codes` - Array of substance codes
|
||||
- `codes.code` - Code value
|
||||
- `codes.codeSystem` - Code system (CAS, ECHA, EINECS, etc.)
|
||||
- `codes.type` - Code type
|
||||
- `relationships` - Array of substance relationships
|
||||
- `relationships.type` - Relationship type (ACTIVE MOIETY, METABOLITE, IMPURITY, etc.)
|
||||
- `relationships.relatedSubstance` - Related substance reference
|
||||
- `moieties` - Molecular moieties
|
||||
- `properties` - Array of physicochemical properties
|
||||
- `properties.name` - Property name
|
||||
- `properties.value` - Property value
|
||||
- `properties.propertyType` - Property type
|
||||
- `structure` - Chemical structure information
|
||||
- `structure.smiles` - SMILES notation
|
||||
- `structure.inchi` - InChI string
|
||||
- `structure.inchiKey` - InChI key
|
||||
- `structure.formula` - Molecular formula
|
||||
- `structure.molecularWeight` - Molecular weight
|
||||
- `modifications` - Structural modifications (for proteins, etc.)
|
||||
- `protein` - Protein-specific information
|
||||
- `protein.subunits` - Protein subunits
|
||||
- `protein.sequenceType` - Sequence type
|
||||
- `nucleicAcid` - Nucleic acid information
|
||||
- `nucleicAcid.subunits` - Sequence subunits
|
||||
- `polymer` - Polymer information
|
||||
- `mixture` - Mixture components
|
||||
- `mixture.components` - Component substances
|
||||
- `tags` - Substance tags
|
||||
- `references` - Literature references
|
||||
|
||||
**Substance Classes**:
|
||||
- **Chemical** - Small molecules with defined chemical structure
|
||||
- **Protein** - Proteins and peptides
|
||||
- **Nucleic Acid** - DNA, RNA, oligonucleotides
|
||||
- **Polymer** - Polymeric substances
|
||||
- **Structurally Diverse** - Complex mixtures, botanicals
|
||||
- **Mixture** - Defined mixtures
|
||||
- **Concept** - Abstract concepts (e.g., groups)
|
||||
|
||||
**Common Use Cases**:
|
||||
- Active ingredient identification
|
||||
- Molecular structure lookup
|
||||
- UNII code resolution
|
||||
- Chemical identifier mapping (CAS to UNII, etc.)
|
||||
- Substance relationship analysis
|
||||
- Excipient identification
|
||||
- Botanical substance information
|
||||
- Protein and biologic characterization
|
||||
|
||||
**Example Queries**:
|
||||
```python
|
||||
import requests
|
||||
|
||||
api_key = "YOUR_API_KEY"
|
||||
url = "https://api.fda.gov/other/substance.json"
|
||||
|
||||
# Look up substance by UNII code
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "approvalID:R16CO5Y76E", # Aspirin UNII
|
||||
"limit": 1
|
||||
}
|
||||
|
||||
response = requests.get(url, params=params)
|
||||
data = response.json()
|
||||
```
|
||||
|
||||
```python
|
||||
# Search by substance name
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "names.name:acetaminophen",
|
||||
"limit": 5
|
||||
}
|
||||
```
|
||||
|
||||
```python
|
||||
# Find substances by CAS number
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "codes.code:50-78-2", # Aspirin CAS
|
||||
"limit": 1
|
||||
}
|
||||
```
|
||||
|
||||
```python
|
||||
# Get chemical substances only
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "substanceClass:chemical",
|
||||
"limit": 100
|
||||
}
|
||||
```
|
||||
|
||||
```python
|
||||
# Search by molecular formula
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "structure.formula:C8H9NO2", # Acetaminophen
|
||||
"limit": 10
|
||||
}
|
||||
```
|
||||
|
||||
```python
|
||||
# Find protein substances
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "substanceClass:protein",
|
||||
"limit": 50
|
||||
}
|
||||
```
|
||||
|
||||
### 2. NSDE (National Substance Database Entry)
|
||||
|
||||
**Endpoint**: `https://api.fda.gov/other/nsde.json`
|
||||
|
||||
**Purpose**: Access historical substance data from legacy National Drug Code (NDC) directory entries. This endpoint provides substance information as it appears in historical drug product listings.
|
||||
|
||||
**Note**: This database is primarily for historical reference. For current substance information, use the Substance Data endpoint.
|
||||
|
||||
**Key Fields**:
|
||||
- `proprietary_name` - Product proprietary name
|
||||
- `nonproprietary_name` - Nonproprietary name
|
||||
- `dosage_form` - Dosage form
|
||||
- `route` - Route of administration
|
||||
- `company_name` - Company name
|
||||
- `substance_name` - Substance name
|
||||
- `active_numerator_strength` - Active ingredient strength (numerator)
|
||||
- `active_ingred_unit` - Active ingredient unit
|
||||
- `pharm_classes` - Pharmacological classes
|
||||
- `dea_schedule` - DEA controlled substance schedule
|
||||
|
||||
**Common Use Cases**:
|
||||
- Historical drug formulation research
|
||||
- Legacy system integration
|
||||
- Historical substance name mapping
|
||||
- Pharmaceutical history research
|
||||
|
||||
**Example Queries**:
|
||||
```python
|
||||
# Search by substance name
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "substance_name:ibuprofen",
|
||||
"limit": 20
|
||||
}
|
||||
|
||||
response = requests.get("https://api.fda.gov/other/nsde.json", params=params)
|
||||
```
|
||||
|
||||
```python
|
||||
# Find controlled substances by DEA schedule
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": "dea_schedule:CII",
|
||||
"limit": 50
|
||||
}
|
||||
```
|
||||
|
||||
## Integration Tips
|
||||
|
||||
### UNII to CAS Mapping
|
||||
|
||||
```python
|
||||
def get_substance_identifiers(unii, api_key):
|
||||
"""
|
||||
Get all identifiers for a substance given its UNII code.
|
||||
|
||||
Args:
|
||||
unii: FDA Unique Ingredient Identifier
|
||||
api_key: FDA API key
|
||||
|
||||
Returns:
|
||||
Dictionary with substance identifiers
|
||||
"""
|
||||
import requests
|
||||
|
||||
url = "https://api.fda.gov/other/substance.json"
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": f"approvalID:{unii}",
|
||||
"limit": 1
|
||||
}
|
||||
|
||||
response = requests.get(url, params=params)
|
||||
data = response.json()
|
||||
|
||||
if "results" not in data or len(data["results"]) == 0:
|
||||
return None
|
||||
|
||||
substance = data["results"][0]
|
||||
|
||||
identifiers = {
|
||||
"unii": substance.get("approvalID"),
|
||||
"uuid": substance.get("uuid"),
|
||||
"preferred_name": None,
|
||||
"cas_numbers": [],
|
||||
"other_codes": {}
|
||||
}
|
||||
|
||||
# Extract names
|
||||
if "names" in substance:
|
||||
for name in substance["names"]:
|
||||
if name.get("preferred"):
|
||||
identifiers["preferred_name"] = name.get("name")
|
||||
break
|
||||
if not identifiers["preferred_name"] and len(substance["names"]) > 0:
|
||||
identifiers["preferred_name"] = substance["names"][0].get("name")
|
||||
|
||||
# Extract codes
|
||||
if "codes" in substance:
|
||||
for code in substance["codes"]:
|
||||
code_system = code.get("codeSystem", "").upper()
|
||||
code_value = code.get("code")
|
||||
|
||||
if "CAS" in code_system:
|
||||
identifiers["cas_numbers"].append(code_value)
|
||||
else:
|
||||
if code_system not in identifiers["other_codes"]:
|
||||
identifiers["other_codes"][code_system] = []
|
||||
identifiers["other_codes"][code_system].append(code_value)
|
||||
|
||||
return identifiers
|
||||
```
|
||||
|
||||
### Chemical Structure Lookup
|
||||
|
||||
```python
|
||||
def get_chemical_structure(substance_name, api_key):
|
||||
"""
|
||||
Get chemical structure information for a substance.
|
||||
|
||||
Args:
|
||||
substance_name: Name of the substance
|
||||
api_key: FDA API key
|
||||
|
||||
Returns:
|
||||
Dictionary with structure information
|
||||
"""
|
||||
import requests
|
||||
|
||||
url = "https://api.fda.gov/other/substance.json"
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": f"names.name:{substance_name}",
|
||||
"limit": 1
|
||||
}
|
||||
|
||||
response = requests.get(url, params=params)
|
||||
data = response.json()
|
||||
|
||||
if "results" not in data or len(data["results"]) == 0:
|
||||
return None
|
||||
|
||||
substance = data["results"][0]
|
||||
|
||||
if "structure" not in substance:
|
||||
return None
|
||||
|
||||
structure = substance["structure"]
|
||||
|
||||
return {
|
||||
"smiles": structure.get("smiles"),
|
||||
"inchi": structure.get("inchi"),
|
||||
"inchi_key": structure.get("inchiKey"),
|
||||
"formula": structure.get("formula"),
|
||||
"molecular_weight": structure.get("molecularWeight"),
|
||||
"substance_class": substance.get("substanceClass")
|
||||
}
|
||||
```
|
||||
|
||||
### Substance Relationship Mapping
|
||||
|
||||
```python
|
||||
def get_substance_relationships(unii, api_key):
|
||||
"""
|
||||
Get all related substances (metabolites, active moieties, etc.).
|
||||
|
||||
Args:
|
||||
unii: FDA Unique Ingredient Identifier
|
||||
api_key: FDA API key
|
||||
|
||||
Returns:
|
||||
Dictionary organizing relationships by type
|
||||
"""
|
||||
import requests
|
||||
|
||||
url = "https://api.fda.gov/other/substance.json"
|
||||
params = {
|
||||
"api_key": api_key,
|
||||
"search": f"approvalID:{unii}",
|
||||
"limit": 1
|
||||
}
|
||||
|
||||
response = requests.get(url, params=params)
|
||||
data = response.json()
|
||||
|
||||
if "results" not in data or len(data["results"]) == 0:
|
||||
return None
|
||||
|
||||
substance = data["results"][0]
|
||||
|
||||
relationships = {}
|
||||
|
||||
if "relationships" in substance:
|
||||
for rel in substance["relationships"]:
|
||||
rel_type = rel.get("type")
|
||||
if rel_type not in relationships:
|
||||
relationships[rel_type] = []
|
||||
|
||||
related = {
|
||||
"uuid": rel.get("relatedSubstance", {}).get("uuid"),
|
||||
"unii": rel.get("relatedSubstance", {}).get("approvalID"),
|
||||
"name": rel.get("relatedSubstance", {}).get("refPname")
|
||||
}
|
||||
relationships[rel_type].append(related)
|
||||
|
||||
return relationships
|
||||
```
|
||||
|
||||
### Active Ingredient Extraction
|
||||
|
||||
```python
|
||||
def find_active_ingredients_by_product(product_name, api_key):
|
||||
"""
|
||||
Find active ingredients in a drug product.
|
||||
|
||||
Args:
|
||||
product_name: Drug product name
|
||||
api_key: FDA API key
|
||||
|
||||
Returns:
|
||||
List of active ingredient UNIIs and names
|
||||
"""
|
||||
import requests
|
||||
|
||||
# First search drug label database
|
||||
label_url = "https://api.fda.gov/drug/label.json"
|
||||
label_params = {
|
||||
"api_key": api_key,
|
||||
"search": f"openfda.brand_name:{product_name}",
|
||||
"limit": 1
|
||||
}
|
||||
|
||||
response = requests.get(label_url, params=label_params)
|
||||
data = response.json()
|
||||
|
||||
if "results" not in data or len(data["results"]) == 0:
|
||||
return None
|
||||
|
||||
label = data["results"][0]
|
||||
|
||||
# Extract UNIIs from openfda section
|
||||
active_ingredients = []
|
||||
|
||||
if "openfda" in label:
|
||||
openfda = label["openfda"]
|
||||
|
||||
# Get UNIIs
|
||||
unii_list = openfda.get("unii", [])
|
||||
generic_names = openfda.get("generic_name", [])
|
||||
|
||||
for i, unii in enumerate(unii_list):
|
||||
ingredient = {"unii": unii}
|
||||
if i < len(generic_names):
|
||||
ingredient["name"] = generic_names[i]
|
||||
|
||||
# Get additional substance info
|
||||
substance_info = get_substance_identifiers(unii, api_key)
|
||||
if substance_info:
|
||||
ingredient.update(substance_info)
|
||||
|
||||
active_ingredients.append(ingredient)
|
||||
|
||||
return active_ingredients
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Use UNII as primary identifier** - Most consistent across FDA databases
|
||||
2. **Map between identifier systems** - CAS, UNII, InChI Key for cross-referencing
|
||||
3. **Handle substance variations** - Different salt forms, hydrates have different UNIIs
|
||||
4. **Check substance class** - Different classes have different data structures
|
||||
5. **Validate chemical structures** - SMILES and InChI should be verified
|
||||
6. **Consider substance relationships** - Active moiety vs. salt form matters
|
||||
7. **Use preferred names** - More consistent than trade names
|
||||
8. **Cache substance data** - Substance information changes infrequently
|
||||
9. **Cross-reference with other endpoints** - Link substances to drugs/products
|
||||
10. **Handle mixture components** - Complex products have multiple components
|
||||
|
||||
## UNII System
|
||||
|
||||
The FDA Unique Ingredient Identifier (UNII) system provides:
|
||||
- **Unique identifiers** - Each substance gets one UNII
|
||||
- **Substance specificity** - Different forms (salts, hydrates) get different UNIIs
|
||||
- **Global recognition** - Used internationally
|
||||
- **Stability** - UNIIs don't change once assigned
|
||||
- **Free access** - No licensing required
|
||||
|
||||
**UNII Format**: 10-character alphanumeric code (e.g., `R16CO5Y76E`)
|
||||
|
||||
## Substance Classes Explained
|
||||
|
||||
### Chemical
|
||||
- Traditional small molecule drugs
|
||||
- Have defined molecular structure
|
||||
- Include organic and inorganic compounds
|
||||
- SMILES, InChI, molecular formula available
|
||||
|
||||
### Protein
|
||||
- Polypeptides and proteins
|
||||
- Sequence information available
|
||||
- May have post-translational modifications
|
||||
- Includes antibodies, enzymes, hormones
|
||||
|
||||
### Nucleic Acid
|
||||
- DNA and RNA sequences
|
||||
- Oligonucleotides
|
||||
- Antisense, siRNA, mRNA
|
||||
- Sequence data available
|
||||
|
||||
### Polymer
|
||||
- Synthetic and natural polymers
|
||||
- Structural repeat units
|
||||
- Molecular weight distributions
|
||||
- Used as excipients and active ingredients
|
||||
|
||||
### Structurally Diverse
|
||||
- Complex natural products
|
||||
- Botanical extracts
|
||||
- Materials without single molecular structure
|
||||
- Characterized by source and composition
|
||||
|
||||
### Mixture
|
||||
- Defined combinations of substances
|
||||
- Fixed or variable composition
|
||||
- Each component trackable
|
||||
|
||||
## Additional Resources
|
||||
|
||||
- FDA Substance Registration System: https://fdasis.nlm.nih.gov/srs/
|
||||
- UNII Search: https://precision.fda.gov/uniisearch
|
||||
- OpenFDA Other APIs: https://open.fda.gov/apis/other/
|
||||
- API Basics: See `api_basics.md` in this references directory
|
||||
- Python examples: See `scripts/fda_substance_query.py`
|
||||
Reference in New Issue
Block a user