7.0 KiB
Resolve Endpoint - Entity Identification
Purpose
The Resolve API identifies Data Commons IDs (DCIDs) for entities in the knowledge graph. DCIDs are required for most queries in the Data Commons API, so resolution is typically the first step in any workflow.
Key Capabilities
The endpoint currently supports place entities only and allows resolution through multiple methods:
- By name: Search using descriptive terms like "San Francisco, CA"
- By Wikidata ID: Lookup using external identifiers (e.g., "Q30" for USA)
- By coordinates: Locate places via latitude/longitude
- By relation expressions: Advanced searches using synthetic attributes
Available Methods
1. fetch()
General resolution using relation expressions—most flexible method.
Parameters:
nodes: List of search terms or identifiersproperty: Property to search (e.g., "name", "wikidataId")
Example Usage:
from datacommons_client import DataCommonsClient
client = DataCommonsClient()
# Resolve by name
response = client.resolve.fetch(
nodes=["California", "Texas"],
property="name"
)
2. fetch_dcids_by_name()
Name-based lookup with optional type filtering—most commonly used method.
Parameters:
names: List of place names to resolveentity_type: Optional type filter (e.g., "City", "State", "County")
Returns: ResolveResponse object with candidates for each name
Example Usage:
# Basic name resolution
response = client.resolve.fetch_dcids_by_name(
names=["San Francisco, CA", "Los Angeles"]
)
# With type filtering
response = client.resolve.fetch_dcids_by_name(
names=["San Francisco"],
entity_type="City"
)
# Access results
for name, result in response.to_dict().items():
print(f"{name}: {result['candidates']}")
3. fetch_dcids_by_wikidata_id()
Wikidata ID resolution for entities with known Wikidata identifiers.
Parameters:
wikidata_ids: List of Wikidata IDs (e.g., "Q30", "Q99")
Example Usage:
# Resolve Wikidata IDs
response = client.resolve.fetch_dcids_by_wikidata_id(
wikidata_ids=["Q30", "Q99"] # USA and California
)
4. fetch_dcid_by_coordinates()
Geographic coordinate lookup to find the place at specific lat/long coordinates.
Parameters:
latitude: Latitude coordinatelongitude: Longitude coordinate
Returns: Single DCID string for the place at those coordinates
Example Usage:
# Find place at coordinates
dcid = client.resolve.fetch_dcid_by_coordinates(
latitude=37.7749,
longitude=-122.4194
)
# Returns DCID for San Francisco
Response Structure
All methods (except fetch_dcid_by_coordinates) return a ResolveResponse object containing:
- node: The search term provided
- candidates: List of matching DCIDs with optional metadata
- Each candidate may include
dominantTypefield for disambiguation
- Each candidate may include
- Helper methods:
to_dict(): Full response as dictionaryto_json(): JSON string formatto_flat_dict(): Simplified format with just DCIDs
Example Response:
response = client.resolve.fetch_dcids_by_name(names=["Springfield"])
# May return multiple candidates since many cities named Springfield exist
# {
# "Springfield": {
# "candidates": [
# {"dcid": "geoId/1767000", "dominantType": "City"}, # Springfield, IL
# {"dcid": "geoId/2567000", "dominantType": "City"}, # Springfield, MA
# ...
# ]
# }
# }
Common Use Cases
Use Case 1: Resolve Place Names Before Querying
Most workflows start by resolving names to DCIDs:
# Step 1: Resolve names
resolve_response = client.resolve.fetch_dcids_by_name(
names=["California", "Texas"]
)
# Step 2: Extract DCIDs
dcids = []
for name, result in resolve_response.to_dict().items():
if result["candidates"]:
dcids.append(result["candidates"][0]["dcid"])
# Step 3: Query data using DCIDs
data_response = client.observation.fetch(
variable_dcids=["Count_Person"],
entity_dcids=dcids,
date="latest"
)
Use Case 2: Handle Ambiguous Names
When multiple candidates exist, use dominantType or be more specific:
# Ambiguous name
response = client.resolve.fetch_dcids_by_name(names=["Springfield"])
candidates = response.to_dict()["Springfield"]["candidates"]
# Filter by type or choose based on context
city_candidates = [c for c in candidates if c.get("dominantType") == "City"]
# Or be more specific in the query
response = client.resolve.fetch_dcids_by_name(
names=["Springfield, Illinois"],
entity_type="City"
)
Use Case 3: Batch Resolution
Resolve multiple entities efficiently:
places = [
"San Francisco, CA",
"Los Angeles, CA",
"San Diego, CA",
"Sacramento, CA"
]
response = client.resolve.fetch_dcids_by_name(names=places)
# Build mapping of name to DCID
name_to_dcid = {}
for name, result in response.to_dict().items():
if result["candidates"]:
name_to_dcid[name] = result["candidates"][0]["dcid"]
Use Case 4: Coordinate-Based Queries
Find the administrative place for a location:
# User provides coordinates, find the place
latitude, longitude = 37.7749, -122.4194
dcid = client.resolve.fetch_dcid_by_coordinates(
latitude=latitude,
longitude=longitude
)
# Now query data for that place
response = client.observation.fetch(
variable_dcids=["Count_Person", "MedianIncome_Household"],
entity_dcids=[dcid],
date="latest"
)
Use Case 5: External ID Integration
When working with external datasets that use Wikidata IDs:
# External dataset has Wikidata IDs
wikidata_ids = ["Q30", "Q99", "Q1384"] # USA, California, New York
# Convert to Data Commons DCIDs
response = client.resolve.fetch_dcids_by_wikidata_id(
wikidata_ids=wikidata_ids
)
# Extract DCIDs for further queries
dcids = []
for wid, result in response.to_dict().items():
if result["candidates"]:
dcids.append(result["candidates"][0]["dcid"])
Important Limitations
-
Place entities only: The Resolve API currently supports only place entities (countries, states, cities, counties, etc.). For other entity types, DCIDs must be obtained through other means (e.g., Node API exploration).
-
Cannot resolve linked entity properties: For queries involving relationships like
containedInPlace, use the Node API instead. -
Ambiguity handling: When multiple candidates exist, the API returns all matches. The application must decide which is correct based on context or additional filtering.
Best Practices
- Always resolve names first: Never assume DCID format—always use the Resolve API
- Cache resolutions: If querying the same places repeatedly, cache name→DCID mappings
- Handle ambiguity: Check for multiple candidates and use
entity_typefiltering or more specific names - Validate results: Always check that
candidateslist is not empty before accessing DCIDs - Use appropriate method:
- Names →
fetch_dcids_by_name() - Coordinates →
fetch_dcid_by_coordinates() - Wikidata IDs →
fetch_dcids_by_wikidata_id()
- Names →