Files
2025-11-30 08:30:10 +08:00

1.6 KiB

GeoPandas Data Structures

GeoSeries

A GeoSeries is a vector where each entry is a set of shapes corresponding to one observation (similar to a pandas Series but with geometric data).

import geopandas as gpd
from shapely.geometry import Point, Polygon

# Create a GeoSeries from geometries
points = gpd.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])

# Access geometric properties
points.area
points.length
points.bounds

GeoDataFrame

A GeoDataFrame is a tabular data structure that contains a GeoSeries (similar to a pandas DataFrame but with geographic data).

# Create from dictionary
gdf = gpd.GeoDataFrame({
    'name': ['Point A', 'Point B'],
    'value': [100, 200],
    'geometry': [Point(1, 1), Point(2, 2)]
})

# Create from pandas DataFrame with coordinates
import pandas as pd
df = pd.DataFrame({'x': [1, 2, 3], 'y': [1, 2, 3], 'name': ['A', 'B', 'C']})
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.x, df.y))

Key Properties

  • geometry: The active geometry column (can have multiple geometry columns)
  • crs: Coordinate reference system
  • bounds: Bounding box of all geometries
  • total_bounds: Overall bounding box

Setting Active Geometry

When a GeoDataFrame has multiple geometry columns:

# Set active geometry column
gdf = gdf.set_geometry('other_geom_column')

# Check active geometry column
gdf.geometry.name

Indexing and Selection

Use standard pandas indexing with spatial data:

# Select by label
gdf.loc[0]

# Boolean indexing
large_areas = gdf[gdf.area > 100]

# Select columns
gdf[['name', 'geometry']]