UniProt API Fields Reference
Complete list of available fields for customizing UniProt API queries. Use these fields with the fields parameter to retrieve only the data you need.
Usage
Add the fields parameter to your query:
https://rest.uniprot.org/uniprotkb/search?query=insulin&fields=accession,gene_names,organism_name,length
Multiple fields are comma-separated. No spaces after commas.
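The comma-joining rule above can be sketched in Python. This is a minimal illustration, not part of the UniProt tooling: the helper name build_search_url is hypothetical, and it only assembles the URL string without sending a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://rest.uniprot.org/uniprotkb/search"

def build_search_url(query, fields):
    """Return a search URL whose fields parameter is comma-joined with no spaces.

    build_search_url is a hypothetical helper written for this example.
    """
    # Strip stray whitespace and drop empty names so no spaces follow the commas.
    cleaned = [f.strip() for f in fields if f.strip()]
    params = {"query": query, "fields": ",".join(cleaned)}
    # safe="," keeps the commas literal instead of percent-encoding them.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"

url = build_search_url("insulin", ["accession", " gene_names", "organism_name", "length"])
print(url)
```

Building the parameter from a list makes it harder to slip a space in by hand, and urlencode takes care of escaping the query text.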
Core Fields
Identification
- accession - Primary accession number (e.g., P12345)
- id - Entry name (e.g., INSR_HUMAN)
- uniprotkb_id - Same as id
- entryType - REVIEWED (Swiss-Prot) or UNREVIEWED (TrEMBL)
Protein Names
- protein_name - Recommended and alternative protein names
- gene_names - Gene name(s)
- gene_primary - Primary gene name
- gene_synonym - Gene synonyms
- gene_oln - Ordered locus names
- gene_orf - ORF names
Organism Information
- organism_name - Organism scientific name
- organism_id - NCBI taxonomy identifier
- lineage - Taxonomic lineage
- virus_hosts - Virus host organisms (for viral proteins)
Sequence Information
- sequence - Amino acid sequence
- length - Sequence length
- mass - Molecular mass (Daltons)
- fragment - Whether entry is a fragment
- checksum - Sequence CRC64 checksum
Annotation Fields
Function and Biology
- cc_function - Function description
- cc_catalytic_activity - Catalytic activity
- cc_activity_regulation - Activity regulation
- cc_pathway - Metabolic pathway information
- cc_cofactor - Cofactor information
Interaction and Localization
- cc_interaction - Protein-protein interactions
- cc_subunit - Subunit structure
- cc_subcellular_location - Subcellular location
- cc_tissue_specificity - Tissue specificity
- cc_developmental_stage - Developmental stage expression
Disease and Phenotype
- cc_disease - Disease associations
- cc_disruption_phenotype - Disruption phenotype
- cc_allergen - Allergen information
- cc_toxic_dose - Toxic dose information
Post-translational Modifications
- cc_ptm - Post-translational modifications
- cc_mass_spectrometry - Mass spectrometry data
Other Comments
- cc_alternative_products - Alternative products (isoforms)
- cc_polymorphism - Polymorphism information
- cc_rna_editing - RNA editing
- cc_caution - Caution notes
- cc_miscellaneous - Miscellaneous information
- cc_similarity - Sequence similarities
- cc_sequence_caution - Sequence caution
- cc_web_resource - Web resources
Feature Fields (ft_)
Molecular Processing
- ft_signal - Signal peptide
- ft_transit - Transit peptide
- ft_init_met - Initiator methionine
- ft_propep - Propeptide
- ft_chain - Chain (mature protein)
- ft_peptide - Peptide
Regions and Sites
- ft_domain - Domain
- ft_repeat - Repeat
- ft_ca_bind - Calcium binding
- ft_zn_fing - Zinc finger
- ft_dna_bind - DNA binding
- ft_np_bind - Nucleotide binding
- ft_region - Region of interest
- ft_coiled - Coiled coil
- ft_motif - Short sequence motif
- ft_compbias - Compositional bias
Sites and Modifications
- ft_act_site - Active site
- ft_metal - Metal binding
- ft_binding - Binding site
- ft_site - Site
- ft_mod_res - Modified residue
- ft_lipid - Lipidation
- ft_carbohyd - Glycosylation
- ft_disulfid - Disulfide bond
- ft_crosslnk - Cross-link
Structural Features
- ft_helix - Helix
- ft_strand - Beta strand
- ft_turn - Turn
- ft_transmem - Transmembrane region
- ft_intramem - Intramembrane region
- ft_topo_dom - Topological domain
Variation and Conflict
- ft_variant - Natural variant
- ft_var_seq - Alternative sequence
- ft_mutagen - Mutagenesis
- ft_unsure - Unsure residue
- ft_conflict - Sequence conflict
- ft_non_cons - Non-consecutive residues
- ft_non_ter - Non-terminal residue
- ft_non_std - Non-standard residue
Gene Ontology (GO)
- go - All GO terms
- go_p - Biological process
- go_c - Cellular component
- go_f - Molecular function
- go_id - GO term identifiers
Cross-References (xref_)
Sequence Databases
- xref_embl - EMBL/GenBank/DDBJ
- xref_refseq - RefSeq
- xref_ccds - CCDS
- xref_pir - PIR
3D Structure Databases
- xref_pdb - Protein Data Bank
- xref_pcddb - PCD database
- xref_alphafolddb - AlphaFold database
- xref_smr - SWISS-MODEL Repository
Protein Family/Domain Databases
- xref_interpro - InterPro
- xref_pfam - Pfam
- xref_prosite - PROSITE
- xref_smart - SMART
Genome Databases
- xref_ensembl - Ensembl
- xref_ensemblgenomes - Ensembl Genomes
- xref_geneid - Entrez Gene
- xref_kegg - KEGG
Organism-Specific Databases
- xref_mgi - MGI (mouse)
- xref_rgd - RGD (rat)
- xref_flybase - FlyBase (fly)
- xref_wormbase - WormBase (worm)
- xref_xenbase - Xenbase (frog)
- xref_zfin - ZFIN (zebrafish)
Pathway Databases
- xref_reactome - Reactome
- xref_signor - SIGNOR
- xref_signalink - SignaLink
Disease Databases
- xref_disgenet - DisGeNET
- xref_malacards - MalaCards
- xref_omim - OMIM
- xref_orphanet - Orphanet
Drug Databases
- xref_chembl - ChEMBL
- xref_drugbank - DrugBank
- xref_guidetopharmacology - Guide to Pharmacology
Expression Databases
- xref_bgee - Bgee
- xref_expressionatlas - Expression Atlas
- xref_genevisible - Genevisible
Metadata Fields
Dates
- date_created - Entry creation date
- date_modified - Last modification date
- date_sequence_modified - Last sequence modification date
Evidence and Quality
- annotation_score - Annotation score (1-5)
- protein_existence - Protein existence level
- reviewed - Whether entry is reviewed (Swiss-Prot)
Literature
- lit_pubmed_id - PubMed identifiers
- lit_doi - DOI identifiers
Proteomics
- proteome - Proteome identifier
- tools - Tools used for annotation
Retrieving Available Fields Programmatically
Use the configuration endpoint to get all available fields:
curl https://rest.uniprot.org/configure/uniprotkb/result-fields
Or in Python:
import requests
response = requests.get("https://rest.uniprot.org/configure/uniprotkb/result-fields")
fields = response.json()
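Once the configuration response is parsed, the field names still have to be pulled out of it. The sketch below assumes the JSON is a list of groups, each holding a "fields" list whose items carry a "name" key. That shape is an assumption to check against the live response, and the sample data is hand-written, not real API output.

```python
def extract_field_names(groups):
    """Collect the 'name' of every field across all groups.

    Assumes each group is a dict with a 'fields' list of dicts; groups or
    fields that lack the expected keys are skipped rather than raising.
    """
    names = []
    for group in groups:
        for field in group.get("fields", []):
            name = field.get("name")
            if name:
                names.append(name)
    return names

# Hand-written sample in the assumed shape (illustrative only).
sample = [
    {"groupName": "Names & Taxonomy",
     "fields": [{"label": "Entry", "name": "accession"},
                {"label": "Entry Name", "name": "id"}]},
    {"groupName": "Sequences",
     "fields": [{"label": "Length", "name": "length"}]},
]
names = extract_field_names(sample)
print(names)
```

With a live response, the same function could be called on the parsed JSON to get a flat list for checking a fields parameter before sending a query.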
Common Field Combinations
Basic protein information
fields=accession,id,protein_name,gene_names,organism_name,length
Sequence and structure
fields=accession,sequence,length,mass,xref_pdb,xref_alphafolddb
Functional annotation
fields=accession,protein_name,cc_function,cc_catalytic_activity,cc_pathway,go
Disease information
fields=accession,protein_name,gene_names,cc_disease,xref_omim,xref_malacards
Expression patterns
fields=accession,gene_names,cc_tissue_specificity,cc_developmental_stage,xref_bgee
Complete annotation
fields=accession,id,protein_name,gene_names,organism_name,sequence,length,cc_*,ft_*,go,xref_pdb
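If a pattern such as cc_* is not accepted in a given context, it can be expanded on the client before the request is built. The sketch below is one way to do that, assuming a set of known field names is already available (for example from the configuration endpoint). The function expand_fields and the small known set are hypothetical.

```python
def expand_fields(requested, known_fields):
    """Replace entries like 'cc_*' with every known field sharing that prefix.

    Exact names pass through unchanged; duplicates are dropped while the
    order of first appearance is kept.
    """
    expanded = []
    for item in requested:
        if item.endswith("*"):
            prefix = item[:-1]
            # Sort matches so the output is deterministic.
            matches = sorted(f for f in known_fields if f.startswith(prefix))
        else:
            matches = [item]
        for name in matches:
            if name not in expanded:
                expanded.append(name)
    return expanded

# Tiny illustrative subset of field names, not the full list.
known = {"accession", "cc_function", "cc_disease", "ft_domain", "ft_signal", "go"}
fields = expand_fields(["accession", "cc_*", "ft_*", "go"], known)
print(",".join(fields))
```

Expanding explicitly also shows exactly which columns were requested, which helps when comparing results across runs.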
Notes
- Wildcards: Some fields support wildcards (e.g., cc_* for all comment fields, ft_* for all features)
- Performance: Requesting fewer fields improves response time and reduces bandwidth
- Format dependency: Some fields may be formatted differently depending on output format (JSON vs TSV)
- Null values: Fields without data may be omitted from response (JSON) or empty (TSV)
- Arrays vs strings: In JSON format, many fields return arrays of objects rather than simple strings
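The note on null values matters most when parsing TSV output, where a missing value shows up as an empty cell. A minimal sketch using only the standard library follows. The column headers and the row are hand-written for illustration and are not real API output, and parse_tsv is a hypothetical helper.

```python
import csv
import io

def parse_tsv(text):
    """Parse TSV text into a list of dicts, mapping empty cells to None."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {column: (value if value else None) for column, value in row.items()}
        for row in reader
    ]

# Hand-written sample: the second column is empty for this row.
sample_tsv = "Entry\tGene Names\tLength\nP12345\t\t430\n"
rows = parse_tsv(sample_tsv)
print(rows[0])
```

For JSON output, the equivalent precaution is to read optional keys with dict.get and a default, since a field without data may be absent altogether.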
Resources
- Interactive field explorer: https://www.uniprot.org/api-documentation
- API fields endpoint: https://rest.uniprot.org/configure/uniprotkb/result-fields
- Return fields documentation: https://www.uniprot.org/help/return_fields