# UniProt ID Mapping Databases Complete list of databases supported by the UniProt ID Mapping service. Use these database names when calling the ID mapping API. ## Retrieving Database List Programmatically ```python import requests response = requests.get("https://rest.uniprot.org/configure/idmapping/fields") databases = response.json() ``` ## UniProt Databases ### UniProtKB - `UniProtKB_AC-ID` - UniProt accession and ID - `UniProtKB` - UniProt Knowledgebase - `UniProtKB-Swiss-Prot` - Reviewed (Swiss-Prot) - `UniProtKB-TrEMBL` - Unreviewed (TrEMBL) - `UniParc` - UniProt Archive - `UniRef50` - UniRef 50% identity clusters - `UniRef90` - UniRef 90% identity clusters - `UniRef100` - UniRef 100% identity clusters ## Sequence Databases ### Nucleotide Sequence - `EMBL` - EMBL/GenBank/DDBJ - `EMBL-CDS` - EMBL coding sequences - `RefSeq_Nucleotide` - RefSeq nucleotide sequences - `CCDS` - Consensus CDS ### Protein Sequence - `RefSeq_Protein` - RefSeq protein sequences - `PIR` - Protein Information Resource ## Gene Databases - `GeneID` - Entrez Gene - `Gene_Name` - Gene name - `Gene_Synonym` - Gene synonym - `Gene_OrderedLocusName` - Ordered locus name - `Gene_ORFName` - ORF name ## Genome Databases ### General - `Ensembl` - Ensembl - `EnsemblGenomes` - Ensembl Genomes - `EnsemblGenomes_PRO` - Ensembl Genomes protein - `EnsemblGenomes_TRS` - Ensembl Genomes transcript - `Ensembl_PRO` - Ensembl protein - `Ensembl_TRS` - Ensembl transcript ### Organism-Specific - `KEGG` - KEGG Genes - `PATRIC` - PATRIC - `UCSC` - UCSC Genome Browser - `VectorBase` - VectorBase - `WBParaSite` - WormBase ParaSite ## Structure Databases - `PDB` - Protein Data Bank - `AlphaFoldDB` - AlphaFold Database - `BMRB` - Biological Magnetic Resonance Data Bank - `PDBsum` - PDB summary - `SASBDB` - Small Angle Scattering Biological Data Bank - `SMR` - SWISS-MODEL Repository ## Protein Family and Domain Databases - `InterPro` - InterPro - `Pfam` - Pfam protein families - `PROSITE` - PROSITE - `SMART` - SMART domains - `CDD` - Conserved Domain Database - `HAMAP` - HAMAP - `PANTHER` - PANTHER - `PRINTS` - PRINTS - `ProDom` - ProDom - `SFLD` - Structure-Function Linkage Database - `SUPFAM` - SUPERFAMILY - `TIGRFAMs` - TIGRFAMs ## Organism-Specific Databases ### Model Organisms - `MGI` - Mouse Genome Informatics - `RGD` - Rat Genome Database - `FlyBase` - FlyBase (Drosophila) - `WormBase` - WormBase (C. elegans) - `Xenbase` - Xenbase (Xenopus) - `ZFIN` - Zebrafish Information Network - `dictyBase` - dictyBase (Dictyostelium) - `EcoGene` - EcoGene (E. coli) - `SGD` - Saccharomyces Genome Database (yeast) - `PomBase` - PomBase (S. pombe) - `TAIR` - The Arabidopsis Information Resource ### Human-Specific - `HGNC` - HUGO Gene Nomenclature Committee - `CCDS` - Consensus Coding Sequence Database ## Pathway Databases - `Reactome` - Reactome - `BioCyc` - BioCyc - `PlantReactome` - Plant Reactome - `SIGNOR` - SIGNOR - `SignaLink` - SignaLink ## Enzyme and Metabolism - `EC` - Enzyme Commission number - `BRENDA` - BRENDA enzyme database - `SABIO-RK` - SABIO-RK (biochemical reactions) - `MetaCyc` - MetaCyc ## Disease and Phenotype Databases - `OMIM` - Online Mendelian Inheritance in Man - `MIM` - MIM (same as OMIM) - `OrphaNet` - Orphanet (rare diseases) - `DisGeNET` - DisGeNET - `MalaCards` - MalaCards - `CTD` - Comparative Toxicogenomics Database - `OpenTargets` - Open Targets ## Drug and Chemical Databases - `ChEMBL` - ChEMBL - `DrugBank` - DrugBank - `DrugCentral` - DrugCentral - `GuidetoPHARMACOLOGY` - Guide to Pharmacology - `SwissLipids` - SwissLipids ## Gene Expression Databases - `Bgee` - Bgee gene expression - `ExpressionAtlas` - Expression Atlas - `Genevisible` - Genevisible - `CleanEx` - CleanEx ## Proteomics Databases - `PRIDE` - PRIDE proteomics - `PeptideAtlas` - PeptideAtlas - `ProteomicsDB` - ProteomicsDB - `CPTAC` - CPTAC - `jPOST` - jPOST - `MassIVE` - MassIVE - `MaxQB` - MaxQB - `PaxDb` - PaxDb - `TopDownProteomics` - Top Down Proteomics ## Protein-Protein Interaction - `STRING` - STRING - `BioGRID` - BioGRID - `IntAct` - IntAct - `MINT` - MINT - `DIP` - Database of Interacting Proteins - `ComplexPortal` - Complex Portal ## Ontologies - `GO` - Gene Ontology - `GeneTree` - Ensembl GeneTree - `HOGENOM` - HOGENOM - `HOVERGEN` - HOVERGEN - `KO` - KEGG Orthology - `OMA` - OMA orthology - `OrthoDB` - OrthoDB - `TreeFam` - TreeFam ## Other Specialized Databases ### Glycosylation - `GlyConnect` - GlyConnect - `GlyGen` - GlyGen ### Protein Modifications - `PhosphoSitePlus` - PhosphoSitePlus - `iPTMnet` - iPTMnet ### Antibodies - `Antibodypedia` - Antibodypedia - `DNASU` - DNASU ### Protein Localization - `COMPARTMENTS` - COMPARTMENTS - `NeXtProt` - NeXtProt (human proteins) ### Evolution and Phylogeny - `eggNOG` - eggNOG - `GeneTree` - Ensembl GeneTree - `InParanoid` - InParanoid ### Technical Resources - `PRO` - Protein Ontology - `GenomeRNAi` - GenomeRNAi - `PubMed` - PubMed literature references ## Common Mapping Scenarios ### Example 1: UniProt to PDB ```python from_db = "UniProtKB_AC-ID" to_db = "PDB" ids = ["P01308", "P04637"] ``` ### Example 2: Gene Name to UniProt ```python from_db = "Gene_Name" to_db = "UniProtKB" ids = ["BRCA1", "TP53", "INSR"] ``` ### Example 3: UniProt to Ensembl ```python from_db = "UniProtKB_AC-ID" to_db = "Ensembl" ids = ["P12345"] ``` ### Example 4: RefSeq to UniProt ```python from_db = "RefSeq_Protein" to_db = "UniProtKB" ids = ["NP_000207.1"] ``` ### Example 5: UniProt to GO Terms ```python from_db = "UniProtKB_AC-ID" to_db = "GO" ids = ["P01308"] ``` ## Usage Notes 1. **Database names are case-sensitive**: Use exact names as listed 2. **Many-to-many mappings**: One ID may map to multiple target IDs 3. **Failed mappings**: Some IDs may not have mappings; check the `failedIds` field in results 4. **Batch size limit**: Maximum 100,000 IDs per job 5. **Result expiration**: Results are stored for 7 days 6. **Bidirectional mapping**: Most databases support mapping in both directions ## API Endpoints ### Get available databases ``` GET https://rest.uniprot.org/configure/idmapping/fields ``` ### Submit mapping job ``` POST https://rest.uniprot.org/idmapping/run Content-Type: application/x-www-form-urlencoded from={from_db}&to={to_db}&ids={comma_separated_ids} ``` ### Check job status ``` GET https://rest.uniprot.org/idmapping/status/{jobId} ``` ### Get results ``` GET https://rest.uniprot.org/idmapping/results/{jobId} ``` ## Resources - ID Mapping tool: https://www.uniprot.org/id-mapping - API documentation: https://www.uniprot.org/help/id_mapping - Programmatic access: https://www.uniprot.org/help/api_idmapping