zhongwei/gh-k-dense-ai-claude-scientific-skills-scientific-skills

Fork 0

Files

Zhongwei Li f0bd18fb4e Initial commit

2025-11-30 08:30:10 +08:00

16 KiB

Raw Permalink Blame History

Metabolomics Workbench REST API Reference

Base URL

All API requests use the following base URL:

https://www.metabolomicsworkbench.org/rest/

API Structure

The REST API follows a consistent URL pattern:

/context/input_item/input_value/output_item/output_format

context: The type of resource to access (study, compound, refmet, metstat, gene, protein, moverz)
input_item: The type of identifier or search parameter
input_value: The specific value to search for
output_item: What data to return (e.g., all, name, summary)
output_format: json or txt (json is default if omitted)

Output Formats

json: Machine-readable JSON format (default)
txt: Tab-delimited text format for human readability

Context 1: Compound

Retrieve metabolite structure and identification data.

Input Items

Input Item	Description	Example
`regno`	Metabolomics Workbench registry number	11
`pubchem_cid`	PubChem Compound ID	5281365
`inchi_key`	International Chemical Identifier Key	WQZGKKKJIJFFOK-GASJEMHNSA-N
`formula`	Molecular formula	C6H12O6
`lm_id`	LIPID MAPS ID	LM...
`hmdb_id`	Human Metabolome Database ID	HMDB0000122
`kegg_id`	KEGG Compound ID	C00031

Output Items

Output Item	Description
`all`	All available compound data
`classification`	Compound classification
`regno`	Registry number
`formula`	Molecular formula
`exactmass`	Exact mass
`inchi_key`	InChI Key
`name`	Common name
`sys_name`	Systematic name
`smiles`	SMILES notation
`lm_id`	LIPID MAPS ID
`pubchem_cid`	PubChem CID
`hmdb_id`	HMDB ID
`kegg_id`	KEGG ID
`chebi_id`	ChEBI ID
`metacyc_id`	MetaCyc ID
`molfile`	MOL file structure
`png`	PNG image of structure

Example Requests

# Get all compound data by PubChem CID
curl "https://www.metabolomicsworkbench.org/rest/compound/pubchem_cid/5281365/all/json"

# Get compound name by registry number
curl "https://www.metabolomicsworkbench.org/rest/compound/regno/11/name/json"

# Download structure as PNG
curl "https://www.metabolomicsworkbench.org/rest/compound/regno/11/png" -o structure.png

# Get compound by KEGG ID
curl "https://www.metabolomicsworkbench.org/rest/compound/kegg_id/C00031/all/json"

# Get compound by molecular formula
curl "https://www.metabolomicsworkbench.org/rest/compound/formula/C6H12O6/all/json"

Context 2: Study

Access metabolomics research study metadata and experimental results.

Input Items

Input Item	Description	Example
`study_id`	Study identifier	ST000001
`analysis_id`	Analysis identifier	AN000001
`study_title`	Keywords in study title	diabetes
`institute`	Institute name	UCSD
`last_name`	Investigator last name	Smith
`metabolite_id`	Metabolite registry number	11
`refmet_name`	RefMet standardized name	Glucose
`kegg_id`	KEGG compound ID	C00031

Output Items

Output Item	Description
`summary`	Study overview and metadata
`factors`	Experimental factors and design
`analysis`	Analysis methods and parameters
`metabolites`	List of measured metabolites
`data`	Complete experimental data
`mwtab`	Complete study in mwTab format
`number_of_metabolites`	Count of metabolites measured
`species`	Organism species
`disease`	Disease studied
`source`	Sample source/tissue type
`untarg_studies`	Untargeted study information
`untarg_factors`	Untargeted study factors
`untarg_data`	Untargeted experimental data
`datatable`	Formatted data table
`available`	List available studies (use with ST as input_value)

Example Requests

# List all publicly available studies
curl "https://www.metabolomicsworkbench.org/rest/study/study_id/ST/available/json"

# Get study summary
curl "https://www.metabolomicsworkbench.org/rest/study/study_id/ST000001/summary/json"

# Get experimental data
curl "https://www.metabolomicsworkbench.org/rest/study/study_id/ST000001/data/json"

# Get study factors
curl "https://www.metabolomicsworkbench.org/rest/study/study_id/ST000001/factors/json"

# Find studies containing a specific metabolite
curl "https://www.metabolomicsworkbench.org/rest/study/refmet_name/Tyrosine/summary/json"

# Search studies by investigator
curl "https://www.metabolomicsworkbench.org/rest/study/last_name/Smith/summary/json"

# Download complete study in mwTab format
curl "https://www.metabolomicsworkbench.org/rest/study/study_id/ST000001/mwtab/txt"

Context 3: RefMet

Query the standardized metabolite nomenclature database with hierarchical classification.

Input Items

Input Item	Description	Example
`name`	Metabolite name	glucose
`inchi_key`	InChI Key	WQZGKKKJIJFFOK-GASJEMHNSA-N
`pubchem_cid`	PubChem CID	5793
`exactmass`	Exact mass	180.0634
`formula`	Molecular formula	C6H12O6
`super_class`	Super class name	Organic compounds
`main_class`	Main class name	Carbohydrates
`sub_class`	Sub class name	Monosaccharides
`match`	Name matching/standardization	citrate
`refmet_id`	RefMet identifier	12345
`all`	Retrieve all RefMet entries	(no value needed)

Output Items

Output Item	Description
`all`	All available RefMet data
`name`	Standardized RefMet name
`inchi_key`	InChI Key
`pubchem_cid`	PubChem CID
`exactmass`	Exact mass
`formula`	Molecular formula
`sys_name`	Systematic name
`super_class`	Super class classification
`main_class`	Main class classification
`sub_class`	Sub class classification
`refmet_id`	RefMet identifier

Example Requests

# Standardize a metabolite name
curl "https://www.metabolomicsworkbench.org/rest/refmet/match/citrate/name/json"

# Get all RefMet data for a metabolite
curl "https://www.metabolomicsworkbench.org/rest/refmet/name/Glucose/all/json"

# Query by molecular formula
curl "https://www.metabolomicsworkbench.org/rest/refmet/formula/C6H12O6/all/json"

# Get all metabolites in a main class
curl "https://www.metabolomicsworkbench.org/rest/refmet/main_class/Fatty%20Acids/all/json"

# Query by exact mass
curl "https://www.metabolomicsworkbench.org/rest/refmet/exactmass/180.0634/all/json"

# Download complete RefMet database
curl "https://www.metabolomicsworkbench.org/rest/refmet/all/json"

RefMet Classification Hierarchy

RefMet provides four-level structural resolution:

Super Class: Broadest categorization (e.g., "Organic compounds", "Lipids")
Main Class: Major biochemical categories (e.g., "Fatty Acids", "Carbohydrates")
Sub Class: More specific groupings (e.g., "Monosaccharides", "Amino acids")
Individual Metabolite: Specific compound with standardized name

Context 4: MetStat

Filter studies by analytical and biological parameters using semicolon-delimited format.

Format

/metstat/ANALYSIS_TYPE;POLARITY;CHROMATOGRAPHY;SPECIES;SAMPLE_SOURCE;DISEASE;KEGG_ID;REFMET_NAME

Parameters

Position	Parameter	Options
1	Analysis Type	LCMS, GCMS, NMR, MS, ICPMS
2	Polarity	POSITIVE, NEGATIVE
3	Chromatography	HILIC, RP (Reverse Phase), GC, IC
4	Species	Human, Mouse, Rat, etc.
5	Sample Source	Blood, Plasma, Serum, Urine, Liver, etc.
6	Disease	Diabetes, Cancer, Alzheimer, etc.
7	KEGG ID	C00031, etc.
8	RefMet Name	Glucose, Tyrosine, etc.

Note: Use empty positions (consecutive semicolons) to skip parameters. All parameters are optional.

Example Requests

# Human blood diabetes studies with LC-MS HILIC positive mode
curl "https://www.metabolomicsworkbench.org/rest/metstat/LCMS;POSITIVE;HILIC;Human;Blood;Diabetes/json"

# All human blood studies containing tyrosine
curl "https://www.metabolomicsworkbench.org/rest/metstat/;;;Human;Blood;;;Tyrosine/json"

# All GC-MS studies regardless of other parameters
curl "https://www.metabolomicsworkbench.org/rest/metstat/GCMS;;;;;;/json"

# Mouse liver studies
curl "https://www.metabolomicsworkbench.org/rest/metstat/;;;Mouse;Liver;;/json"

# All studies measuring glucose
curl "https://www.metabolomicsworkbench.org/rest/metstat/;;;;;;;Glucose/json"

Context 5: Moverz

Perform mass spectrometry precursor ion searches by m/z value.

Format for m/z Search

/moverz/DATABASE/mass/adduct/tolerance/format

DATABASE: MB (Metabolomics Workbench), LIPIDS, REFMET
mass: m/z value (e.g., 635.52)
adduct: Ion adduct type (see table below)
tolerance: Mass tolerance in Daltons (e.g., 0.5)
format: json or txt

Format for Exact Mass Calculation

/moverz/exactmass/metabolite_name/adduct/format

Ion Adduct Types

Positive Mode Adducts

Adduct	Description	Example Use
`M+H`	Protonated molecule	Most common positive ESI
`M+Na`	Sodium adduct	Common in ESI
`M+K`	Potassium adduct	Less common ESI
`M+NH4`	Ammonium adduct	Common with ammonium salts
`M+2H`	Doubly protonated	Multiply charged ions
`M+H-H2O`	Dehydrated protonated	Loss of water
`M+2Na-H`	Disodium minus hydrogen	Multiple sodium
`M+CH3OH+H`	Methanol adduct	Methanol in mobile phase
`M+ACN+H`	Acetonitrile adduct	ACN in mobile phase
`M+ACN+Na`	ACN + sodium	ACN and sodium

Negative Mode Adducts

Adduct	Description	Example Use
`M-H`	Deprotonated molecule	Most common negative ESI
`M+Cl`	Chloride adduct	Chlorinated mobile phases
`M+FA-H`	Formate adduct	Formic acid in mobile phase
`M+HAc-H`	Acetate adduct	Acetic acid in mobile phase
`M-H-H2O`	Deprotonated minus water	Water loss
`M-2H`	Doubly deprotonated	Multiply charged ions
`M+Na-2H`	Sodium minus two protons	Mixed charge states

Uncharged

Adduct	Description
`M`	Uncharged molecule

Example Requests

# Search for compounds with m/z 635.52 (M+H) in MB database
curl "https://www.metabolomicsworkbench.org/rest/moverz/MB/635.52/M+H/0.5/json"

# Search in RefMet with negative mode
curl "https://www.metabolomicsworkbench.org/rest/moverz/REFMET/200.15/M-H/0.3/json"

# Search lipids database
curl "https://www.metabolomicsworkbench.org/rest/moverz/LIPIDS/760.59/M+Na/0.5/json"

# Calculate exact mass for known metabolite
curl "https://www.metabolomicsworkbench.org/rest/moverz/exactmass/PC(34:1)/M+H/json"

# High-resolution MS search (tight tolerance)
curl "https://www.metabolomicsworkbench.org/rest/moverz/MB/180.0634/M+H/0.01/json"

Context 6: Gene

Access gene information from the Metabolome Gene/Protein (MGP) database.

Input Items

Input Item	Description	Example
`mgp_id`	MGP database ID	MGP001
`gene_id`	NCBI Gene ID	31
`gene_name`	Full gene name	acetyl-CoA carboxylase
`gene_symbol`	Gene symbol	ACACA
`taxid`	Taxonomy ID	9606 (human)

Output Items

Output Item	Description
`all`	All gene information
`mgp_id`	MGP identifier
`gene_id`	NCBI Gene ID
`gene_name`	Full gene name
`gene_symbol`	Gene symbol
`gene_synonyms`	Alternative names
`alt_names`	Alternative nomenclature
`chromosome`	Chromosomal location
`map_location`	Genetic map position
`summary`	Gene description
`taxid`	Taxonomy ID
`species`	Species short name
`species_long`	Full species name

Example Requests

# Get gene information by symbol
curl "https://www.metabolomicsworkbench.org/rest/gene/gene_symbol/ACACA/all/json"

# Get gene by NCBI Gene ID
curl "https://www.metabolomicsworkbench.org/rest/gene/gene_id/31/all/json"

# Search by gene name
curl "https://www.metabolomicsworkbench.org/rest/gene/gene_name/carboxylase/summary/json"

Context 7: Protein

Retrieve protein sequence and annotation data.

Input Items

Input Item	Description	Example
`mgp_id`	MGP database ID	MGP001
`gene_id`	NCBI Gene ID	31
`gene_name`	Gene name	acetyl-CoA carboxylase
`gene_symbol`	Gene symbol	ACACA
`taxid`	Taxonomy ID	9606
`mrna_id`	mRNA identifier	NM_001093.3
`refseq_id`	RefSeq ID	NP_001084
`protein_gi`	GenInfo Identifier	4557237
`uniprot_id`	UniProt ID	Q13085
`protein_entry`	Protein entry name	ACACA_HUMAN
`protein_name`	Protein name	Acetyl-CoA carboxylase

Output Items

Output Item	Description
`all`	All protein information
`mgp_id`	MGP identifier
`gene_id`	NCBI Gene ID
`gene_name`	Gene name
`gene_symbol`	Gene symbol
`taxid`	Taxonomy ID
`species`	Species short name
`species_long`	Full species name
`mrna_id`	mRNA identifier
`refseq_id`	RefSeq protein ID
`protein_gi`	GenInfo Identifier
`uniprot_id`	UniProt accession
`protein_entry`	Protein entry name
`protein_name`	Full protein name
`seqlength`	Sequence length
`seq`	Amino acid sequence
`is_identical_to`	Identical sequences

Example Requests

# Get protein information by UniProt ID
curl "https://www.metabolomicsworkbench.org/rest/protein/uniprot_id/Q13085/all/json"

# Get protein by gene symbol
curl "https://www.metabolomicsworkbench.org/rest/protein/gene_symbol/ACACA/all/json"

# Get protein sequence
curl "https://www.metabolomicsworkbench.org/rest/protein/uniprot_id/Q13085/seq/json"

# Search by RefSeq ID
curl "https://www.metabolomicsworkbench.org/rest/protein/refseq_id/NP_001084/all/json"

Error Handling

The API returns appropriate HTTP status codes:

200 OK: Successful request
400 Bad Request: Invalid parameters or malformed request
404 Not Found: Resource not found
500 Internal Server Error: Server-side error

When no results are found, the API typically returns an empty array or object rather than an error code.

Rate Limiting

As of 2025, the Metabolomics Workbench REST API does not enforce strict rate limits for reasonable use. However, best practices include:

Implementing delays between bulk requests
Caching frequently accessed reference data
Using appropriate batch sizes for large-scale queries

Additional Resources

Interactive REST URL Creator: https://www.metabolomicsworkbench.org/tools/mw_rest.php
Official API Specification: https://www.metabolomicsworkbench.org/tools/MWRestAPIv1.1.pdf
Python Library: mwtab package for Python users
R Package: metabolomicsWorkbenchR (Bioconductor)
Julia Package: MetabolomicsWorkbenchAPI.jl

Python Example: Complete Workflow

import requests
import json

# 1. Standardize metabolite name using RefMet
metabolite = "citrate"
response = requests.get(f'https://www.metabolomicsworkbench.org/rest/refmet/match/{metabolite}/name/json')
standardized_name = response.json()['name']

# 2. Search for studies containing this metabolite
response = requests.get(f'https://www.metabolomicsworkbench.org/rest/study/refmet_name/{standardized_name}/summary/json')
studies = response.json()

# 3. Get detailed data from a specific study
study_id = studies[0]['study_id']
response = requests.get(f'https://www.metabolomicsworkbench.org/rest/study/study_id/{study_id}/data/json')
data = response.json()

# 4. Perform m/z search for compound identification
mz_value = 180.06
response = requests.get(f'https://www.metabolomicsworkbench.org/rest/moverz/MB/{mz_value}/M+H/0.5/json')
matches = response.json()

# 5. Get compound structure
regno = matches[0]['regno']
response = requests.get(f'https://www.metabolomicsworkbench.org/rest/compound/regno/{regno}/png')
with open('structure.png', 'wb') as f:
    f.write(response.content)

16 KiB Raw Permalink Blame History

Metabolomics Workbench REST API Reference

Base URL

API Structure

Output Formats

Context 1: Compound

Input Items

Output Items

Example Requests

Context 2: Study

Input Items

Output Items

Example Requests

Context 3: RefMet

Input Items

Output Items

Example Requests

RefMet Classification Hierarchy

Context 4: MetStat

Format

Parameters

Example Requests

Context 5: Moverz

Format for m/z Search

Format for Exact Mass Calculation

Ion Adduct Types

Positive Mode Adducts

Negative Mode Adducts

Uncharged

Example Requests

Context 6: Gene

Input Items

Output Items

Example Requests

Context 7: Protein

Input Items

Output Items

Example Requests

Error Handling

Rate Limiting

Additional Resources

Python Example: Complete Workflow

16 KiB

Raw Permalink Blame History