# Seaborn Function Reference

This document provides a comprehensive reference for all major seaborn functions, organized by category.

## Relational Plots

### scatterplot()

**Purpose:** Create a scatter plot with points representing individual observations.

**Key Parameters:**
- `data` - DataFrame, array, or dict of arrays
- `x, y` - Variables for x and y axes
- `hue` - Grouping variable for color encoding
- `size` - Grouping variable for size encoding
- `style` - Grouping variable for marker style
- `palette` - Color palette name or list
- `hue_order` - Order for categorical hue levels
- `hue_norm` - Normalization for numeric hue (tuple or Normalize object)
- `sizes` - Size range for size encoding (tuple or dict)
- `size_order` - Order for categorical size levels
- `size_norm` - Normalization for numeric size
- `markers` - Marker style(s) (string, list, or dict)
- `style_order` - Order for categorical style levels
- `legend` - How to draw legend: "auto", "brief", "full", or False
- `ax` - Matplotlib axes to plot on

**Example:**
```python
sns.scatterplot(data=df, x='height', y='weight',
                hue='gender', size='age', style='smoker',
                palette='Set2', sizes=(20, 200))
```

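The snippet above assumes an existing `df`. As a minimal self-contained sketch, the following builds a small synthetic DataFrame (the column names and values are made up for illustration) and shows how `hue` and `size` map columns to color and marker area:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data; every column name here is hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height": rng.normal(170, 10, 60),
    "weight": rng.normal(70, 12, 60),
    "gender": rng.choice(["F", "M"], 60),
    "age": rng.integers(20, 70, 60),
})

# hue maps a categorical column to color; size maps a numeric column
# to marker area, rescaled into the (20, 200) range given by sizes.
ax = sns.scatterplot(data=df, x="height", y="weight",
                     hue="gender", size="age", sizes=(20, 200))
ax.set(xlabel="Height (cm)", ylabel="Weight (kg)")
```

Because `scatterplot()` is an axes-level function, it returns the Matplotlib `Axes`, which can then be customized with ordinary Matplotlib calls.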
### lineplot()

**Purpose:** Draw a line plot with automatic aggregation and confidence intervals for repeated measures.

**Key Parameters:**
- `data` - DataFrame, array, or dict of arrays
- `x, y` - Variables for x and y axes
- `hue` - Grouping variable for color encoding
- `size` - Grouping variable for line width
- `style` - Grouping variable for line style (dashes)
- `units` - Grouping variable for sampling units (no aggregation within units)
- `estimator` - Function for aggregating across observations (default: mean)
- `errorbar` - Method for error bars: "sd", "se", "pi", ("ci", level), ("pi", level), or None
- `n_boot` - Number of bootstrap iterations for CI computation
- `seed` - Random seed for reproducible bootstrapping
- `sort` - Sort data before plotting
- `err_style` - "band" or "bars" for error representation
- `err_kws` - Additional parameters for error representation
- `markers` - Marker style(s) for emphasizing data points
- `dashes` - Dash style(s) for lines
- `legend` - How to draw legend
- `ax` - Matplotlib axes to plot on

**Example:**
```python
sns.lineplot(data=timeseries, x='time', y='signal',
             hue='condition', style='subject',
             errorbar=('ci', 95), markers=True)
```

### relplot()

**Purpose:** Figure-level interface for drawing relational plots (scatter or line) onto a FacetGrid.

**Key Parameters:**
All parameters from `scatterplot()` and `lineplot()` except `ax` (figure-level functions create their own figure), plus:
- `kind` - "scatter" or "line"
- `col` - Categorical variable for column facets
- `row` - Categorical variable for row facets
- `col_wrap` - Wrap columns after this many columns
- `col_order` - Order for column facet levels
- `row_order` - Order for row facet levels
- `height` - Height of each facet in inches
- `aspect` - Aspect ratio (width = height * aspect)
- `facet_kws` - Additional parameters for FacetGrid

**Example:**
```python
sns.relplot(data=df, x='time', y='measurement',
            hue='treatment', style='batch',
            col='cell_line', row='timepoint',
            kind='line', height=3, aspect=1.5)
```

## Distribution Plots

### histplot()

**Purpose:** Plot univariate or bivariate histograms with flexible binning.

**Key Parameters:**
- `data` - DataFrame, array, or dict
- `x, y` - Variables (y optional for bivariate)
- `hue` - Grouping variable
- `weights` - Variable for weighting observations
- `stat` - Aggregate statistic: "count", "frequency", "probability", "percent", "density"
- `bins` - Number of bins, bin edges, or method ("auto", "fd", "doane", "scott", "stone", "rice", "sturges", "sqrt")
- `binwidth` - Width of bins (overrides bins)
- `binrange` - Range for binning (tuple)
- `discrete` - Treat x as discrete (centers bars on values)
- `cumulative` - Compute cumulative distribution
- `common_bins` - Use same bins for all hue levels
- `common_norm` - Normalize across hue levels
- `multiple` - How to handle hue: "layer", "dodge", "stack", "fill"
- `element` - Visual element: "bars", "step", "poly"
- `fill` - Fill bars/elements
- `shrink` - Scale bar width (for multiple="dodge")
- `kde` - Overlay KDE estimate
- `kde_kws` - Parameters for KDE
- `line_kws` - Parameters for the KDE line (when `kde=True`)
- `thresh` - Minimum count threshold for bins (bivariate)
- `pthresh` - Minimum probability threshold (bivariate)
- `pmax` - Maximum probability for color scaling (bivariate)
- `log_scale` - Log scale for axis (bool or base)
- `legend` - Whether to show legend
- `ax` - Matplotlib axes

**Example:**
```python
sns.histplot(data=df, x='measurement', hue='condition',
             stat='density', bins=30, kde=True,
             multiple='layer', alpha=0.5)
```

### kdeplot()

**Purpose:** Plot univariate or bivariate kernel density estimates.

**Key Parameters:**
- `data` - DataFrame, array, or dict
- `x, y` - Variables (y optional for bivariate)
- `hue` - Grouping variable
- `weights` - Variable for weighting observations
- `palette` - Color palette
- `hue_order` - Order for hue levels
- `hue_norm` - Normalization for numeric hue
- `multiple` - How to handle hue: "layer", "stack", "fill"
- `common_norm` - Normalize across hue levels
- `common_grid` - Use same grid for all hue levels
- `cumulative` - Compute cumulative distribution
- `bw_method` - Method for bandwidth: "scott", "silverman", or scalar
- `bw_adjust` - Bandwidth multiplier (higher = smoother)
- `log_scale` - Log scale for axis
- `levels` - Number or values for contour levels (bivariate)
- `thresh` - Minimum density threshold for contours
- `gridsize` - Grid resolution
- `cut` - Extension beyond data extremes (in bandwidth units)
- `clip` - Data range for curve (tuple)
- `fill` - Fill area under curve/contours
- `legend` - Whether to show legend
- `ax` - Matplotlib axes

**Example:**
```python
# Univariate
sns.kdeplot(data=df, x='measurement', hue='condition',
            fill=True, common_norm=False, bw_adjust=1.5)

# Bivariate
sns.kdeplot(data=df, x='var1', y='var2',
            fill=True, levels=10, thresh=0.05)
```

### ecdfplot()

**Purpose:** Plot empirical cumulative distribution functions.

**Key Parameters:**
- `data` - DataFrame, array, or dict
- `x, y` - Variables (specify one)
- `hue` - Grouping variable
- `weights` - Variable for weighting observations
- `stat` - "proportion" or "count"
- `complementary` - Plot complementary CDF (1 - ECDF)
- `palette` - Color palette
- `hue_order` - Order for hue levels
- `hue_norm` - Normalization for numeric hue
- `log_scale` - Log scale for axis
- `legend` - Whether to show legend
- `ax` - Matplotlib axes

**Example:**
```python
sns.ecdfplot(data=df, x='response_time', hue='treatment',
             stat='proportion', complementary=False)
```

### rugplot()

**Purpose:** Plot tick marks showing individual observations along an axis.

**Key Parameters:**
- `data` - DataFrame, array, or dict
- `x, y` - Variable (specify one or both)
- `hue` - Grouping variable
- `height` - Height of ticks (proportion of axis)
- `expand_margins` - Add margin space for rug
- `palette` - Color palette
- `hue_order` - Order for hue levels
- `hue_norm` - Normalization for numeric hue
- `legend` - Whether to show legend
- `ax` - Matplotlib axes

**Example:**
```python
sns.rugplot(data=df, x='value', hue='category', height=0.05)
```

### displot()

**Purpose:** Figure-level interface for drawing distribution plots onto a FacetGrid.

**Key Parameters:**
All parameters from `histplot()`, `kdeplot()`, and `ecdfplot()` except `ax`, plus:
- `kind` - "hist", "kde", "ecdf"
- `rug` - Add a rug plot of individual observations to each facet
- `rug_kws` - Parameters for rug plot
- `col` - Categorical variable for column facets
- `row` - Categorical variable for row facets
- `col_wrap` - Wrap columns
- `col_order` - Order for column facets
- `row_order` - Order for row facets
- `height` - Height of each facet
- `aspect` - Aspect ratio
- `facet_kws` - Additional parameters for FacetGrid

**Example:**
```python
sns.displot(data=df, x='measurement', hue='treatment',
            col='timepoint', kind='kde', fill=True,
            height=3, aspect=1.5, rug=True)
```

### jointplot()

**Purpose:** Draw a bivariate plot with marginal univariate plots.

**Key Parameters:**
- `data` - DataFrame
- `x, y` - Variables for x and y axes
- `hue` - Grouping variable (supported for `kind` "scatter", "kde", and "hist")
- `kind` - "scatter", "kde", "hist", "hex", "reg", "resid"
- `height` - Size of the figure (square)
- `ratio` - Ratio of joint to marginal axes
- `space` - Space between joint and marginal axes
- `dropna` - Drop missing values
- `xlim, ylim` - Axis limits (tuples)
- `marginal_ticks` - Show ticks on marginal axes
- `joint_kws` - Parameters for joint plot
- `marginal_kws` - Parameters for marginal plots
- `hue_order` - Order for hue levels
- `palette` - Color palette

**Example:**
```python
sns.jointplot(data=df, x='var1', y='var2', hue='group',
              kind='scatter', height=6, ratio=4,
              joint_kws={'alpha': 0.5})
```

### pairplot()

**Purpose:** Plot pairwise relationships in a dataset.

**Key Parameters:**
- `data` - DataFrame
- `hue` - Grouping variable for color encoding
- `hue_order` - Order for hue levels
- `palette` - Color palette
- `vars` - Variables to plot (default: all numeric)
- `x_vars, y_vars` - Variables for x and y axes (non-square grid)
- `kind` - "scatter", "kde", "hist", "reg"
- `diag_kind` - "auto", "hist", "kde", None
- `markers` - Marker style(s)
- `height` - Height of each facet
- `aspect` - Aspect ratio
- `corner` - Plot only lower triangle
- `dropna` - Drop missing values
- `plot_kws` - Parameters for non-diagonal plots
- `diag_kws` - Parameters for diagonal plots
- `grid_kws` - Parameters for PairGrid

**Example:**
```python
sns.pairplot(data=df, hue='species', palette='Set2',
             vars=['sepal_length', 'sepal_width', 'petal_length'],
             corner=True, height=2.5)
```

## Categorical Plots

### stripplot()

**Purpose:** Draw a categorical scatterplot with jittered points.

**Key Parameters:**
- `data` - DataFrame, array, or dict
- `x, y` - Variables (one categorical, one continuous)
- `hue` - Grouping variable
- `order` - Order for categorical levels
- `hue_order` - Order for hue levels
- `jitter` - Amount of jitter: True, float, or False
- `dodge` - Separate hue levels side-by-side
- `orient` - "v" or "h" (usually inferred)
- `color` - Single color for all elements
- `palette` - Color palette
- `size` - Marker size
- `edgecolor` - Marker edge color
- `linewidth` - Marker edge width
- `native_scale` - Use numeric scale for categorical axis
- `formatter` - Formatter for categorical axis
- `legend` - Whether to show legend
- `ax` - Matplotlib axes

**Example:**
```python
sns.stripplot(data=df, x='day', y='total_bill',
              hue='sex', dodge=True, jitter=0.2)
```

### swarmplot()

**Purpose:** Draw a categorical scatterplot with non-overlapping points.

**Key Parameters:**
Same as `stripplot()`, except:
- No `jitter` parameter
- `size` - Marker size (important for avoiding overlap)
- `warn_thresh` - Proportion of points that may overlap before a warning is raised (default: 0.05)

**Note:** Computationally intensive for large datasets. Use `stripplot()` for more than about 1000 points.

**Example:**
```python
sns.swarmplot(data=df, x='day', y='total_bill',
              hue='time', dodge=True, size=5)
```

### boxplot()

**Purpose:** Draw a box plot showing quartiles and outliers.

**Key Parameters:**
- `data` - DataFrame, array, or dict
- `x, y` - Variables (one categorical, one continuous)
- `hue` - Grouping variable
- `order` - Order for categorical levels
- `hue_order` - Order for hue levels
- `orient` - "v" or "h"
- `color` - Single color for boxes
- `palette` - Color palette
- `saturation` - Color saturation intensity
- `width` - Width of boxes
- `dodge` - Separate hue levels side-by-side
- `fliersize` - Size of outlier markers
- `linewidth` - Box line width
- `whis` - IQR multiplier for whiskers (default: 1.5)
- `notch` - Draw notched boxes
- `showcaps` - Show whisker caps
- `showmeans` - Show mean value
- `meanprops` - Properties for mean marker
- `boxprops` - Properties for boxes
- `whiskerprops` - Properties for whiskers
- `capprops` - Properties for caps
- `flierprops` - Properties for outliers
- `medianprops` - Properties for median line
- `native_scale` - Use numeric scale
- `formatter` - Formatter for categorical axis
- `legend` - Whether to show legend
- `ax` - Matplotlib axes

**Example:**
```python
sns.boxplot(data=df, x='day', y='total_bill',
            hue='smoker', palette='Set3',
            showmeans=True, notch=True)
```

### violinplot()

**Purpose:** Draw a violin plot combining boxplot and KDE.

**Key Parameters:**
Shares the data, grouping, and color parameters of `boxplot()`, plus:
- `bw_method` - KDE bandwidth method
- `bw_adjust` - KDE bandwidth multiplier
- `cut` - KDE extension beyond extremes
- `density_norm` - "area", "count", "width"
- `inner` - "box", "quart", "point", "stick", None
- `split` - Split violins for hue comparison
- `scale` - Older name for `density_norm` (deprecated in seaborn 0.13)
- `scale_hue` - Scale across hue levels (deprecated in seaborn 0.13 in favor of `common_norm`)
- `gridsize` - KDE grid resolution

**Example:**
```python
sns.violinplot(data=df, x='day', y='total_bill',
               hue='sex', split=True, inner='quart',
               palette='muted')
```

### boxenplot()

**Purpose:** Draw an enhanced box plot for larger datasets, showing more quantiles.

**Key Parameters:**
Same as `boxplot()`, plus:
- `k_depth` - "tukey", "proportion", "trustworthy", "full", or int
- `outlier_prop` - Proportion of data as outliers
- `trust_alpha` - Alpha for trustworthy depth
- `showfliers` - Show outlier points

**Example:**
```python
sns.boxenplot(data=df, x='day', y='total_bill',
              hue='time', palette='Set2')
```

### barplot()

**Purpose:** Draw a bar plot with error bars showing statistical estimates.

**Key Parameters:**
- `data` - DataFrame, array, or dict
- `x, y` - Variables (one categorical, one continuous)
- `hue` - Grouping variable
- `order` - Order for categorical levels
- `hue_order` - Order for hue levels
- `estimator` - Aggregation function or its name (default: mean)
- `errorbar` - Error representation: "sd", "se", "pi", ("ci", level), ("pi", level), or None
- `n_boot` - Bootstrap iterations
- `seed` - Random seed
- `units` - Identifier for sampling units
- `weights` - Observation weights
- `orient` - "v" or "h"
- `color` - Single bar color
- `palette` - Color palette
- `saturation` - Color saturation
- `width` - Bar width
- `dodge` - Separate hue levels side-by-side
- `errcolor` - Error bar color (deprecated in seaborn 0.13; use `err_kws={'color': ...}`)
- `errwidth` - Error bar line width (deprecated in seaborn 0.13; use `err_kws={'linewidth': ...}`)
- `capsize` - Error bar cap width
- `native_scale` - Use numeric scale
- `formatter` - Formatter for categorical axis
- `legend` - Whether to show legend
- `ax` - Matplotlib axes

**Example:**
```python
sns.barplot(data=df, x='day', y='total_bill',
            hue='sex', estimator='median',
            errorbar=('ci', 95), capsize=0.1)
```

### countplot()

**Purpose:** Show counts of observations in each categorical bin.

**Key Parameters:**
Same as `barplot()`, but:
- Only specify one of x or y (the categorical variable)
- No estimator or errorbar (shows counts)
- `stat` - "count" or "percent"

**Example:**
```python
sns.countplot(data=df, x='day', hue='time',
              palette='pastel', dodge=True)
```

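A short sketch of the counting behavior, with an invented single-column table: the bar heights are simply the number of rows per category.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical observations: one row per visit.
visits = pd.DataFrame({"day": ["Thu", "Fri", "Fri", "Sat", "Sat", "Sat"]})

# Only the categorical variable is given; the other axis shows the count.
ax = sns.countplot(data=visits, x="day", order=["Thu", "Fri", "Sat"])

counts = [int(bar.get_height()) for bar in ax.patches]
expected = visits["day"].value_counts()
```

The same numbers are available without plotting from `visits["day"].value_counts()`, which is a handy sanity check.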
### pointplot()

**Purpose:** Show point estimates and confidence intervals with connecting lines.

**Key Parameters:**
Same as `barplot()`, plus:
- `markers` - Marker style(s)
- `linestyles` - Line style(s)
- `scale` - Scale for markers (deprecated in seaborn 0.13; use `markersize` and `linewidth`)
- `join` - Connect points with lines (deprecated in seaborn 0.13; use `linestyle='none'` to hide the lines)
- `capsize` - Error bar cap width

**Example:**
```python
sns.pointplot(data=df, x='time', y='total_bill',
              hue='sex', markers=['o', 's'],
              linestyles=['-', '--'], capsize=0.1)
```

### catplot()

**Purpose:** Figure-level interface for drawing categorical plots onto a FacetGrid.

**Key Parameters:**
All parameters from the categorical plots except `ax`, plus:
- `kind` - "strip", "swarm", "box", "violin", "boxen", "bar", "point", "count"
- `col` - Categorical variable for column facets
- `row` - Categorical variable for row facets
- `col_wrap` - Wrap columns
- `col_order` - Order for column facets
- `row_order` - Order for row facets
- `height` - Height of each facet
- `aspect` - Aspect ratio
- `sharex, sharey` - Share axes across facets
- `legend` - Whether to show legend
- `legend_out` - Place legend outside figure
- `facet_kws` - Additional FacetGrid parameters

**Example:**
```python
sns.catplot(data=df, x='day', y='total_bill',
            hue='smoker', col='time',
            kind='violin', split=True,
            height=4, aspect=0.8)
```

## Regression Plots

### regplot()

**Purpose:** Plot data and a linear regression fit.

**Key Parameters:**
- `data` - DataFrame
- `x, y` - Variables or data vectors
- `x_estimator` - Apply estimator to x bins
- `x_bins` - Bin x for estimator
- `x_ci` - CI for binned estimates
- `scatter` - Show scatter points
- `fit_reg` - Plot regression line
- `ci` - CI for regression estimate (int or None)
- `n_boot` - Bootstrap iterations for CI
- `units` - Identifier for sampling units
- `seed` - Random seed
- `order` - Polynomial regression order
- `logistic` - Fit logistic regression
- `lowess` - Fit lowess smoother
- `robust` - Fit robust regression
- `logx` - Log-transform x
- `x_partial, y_partial` - Partial regression (regress out variables)
- `truncate` - Limit regression line to data range
- `dropna` - Drop missing values
- `x_jitter, y_jitter` - Add jitter to data
- `label` - Label for legend
- `color` - Color for all elements
- `marker` - Marker style
- `scatter_kws` - Parameters for scatter
- `line_kws` - Parameters for regression line
- `ax` - Matplotlib axes

**Note:** `order` > 1, `logistic`, `lowess`, `robust`, and `logx` are mutually exclusive; enable at most one per call.

**Example:**
```python
sns.regplot(data=df, x='total_bill', y='tip',
            order=2, ci=95,
            scatter_kws={'alpha': 0.5})
```

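As a hedged sketch of what `order=2` does, the following fits a quadratic to synthetic data (the generating curve and noise level are arbitrary choices) and compares the drawn line with a direct `numpy.polyfit` fit. `ci=None` skips the bootstrap, so the call is fast:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(6)
x = np.linspace(0, 10, 50)
df = pd.DataFrame({"x": x, "y": 0.5 * x**2 - x + rng.normal(0, 1, x.size)})

# order=2 fits a second-degree polynomial instead of a straight line.
ax = sns.regplot(data=df, x="x", y="y", order=2, ci=None,
                 scatter_kws={"alpha": 0.5})

# The fitted curve is the only Line2D on the axes (points are a collection).
fit_x, fit_y = ax.lines[0].get_data()
reference = np.polyval(np.polyfit(df["x"], df["y"], 2), fit_x)
```

`regplot()` does not return the fitted coefficients; when the numbers themselves are needed, fit the model separately (for example with `numpy.polyfit` as above) and use seaborn only for the picture.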
### lmplot()

**Purpose:** Figure-level interface for drawing regression plots onto a FacetGrid.

**Key Parameters:**
All parameters from `regplot()` except `ax`, plus:
- `hue` - Grouping variable
- `col` - Column facets
- `row` - Row facets
- `palette` - Color palette
- `col_wrap` - Wrap columns
- `height` - Facet height
- `aspect` - Aspect ratio
- `markers` - Marker style(s)
- `sharex, sharey` - Share axes (in recent seaborn versions, pass these through `facet_kws` instead)
- `hue_order` - Order for hue levels
- `col_order` - Order for column facets
- `row_order` - Order for row facets
- `legend` - Whether to show legend
- `legend_out` - Place legend outside (in recent seaborn versions, pass this through `facet_kws` instead)
- `facet_kws` - FacetGrid parameters

**Example:**
```python
sns.lmplot(data=df, x='total_bill', y='tip',
           hue='smoker', col='time', row='sex',
           height=3, aspect=1.2, ci=None)
```

### residplot()

**Purpose:** Plot residuals of a regression.

**Key Parameters:**
Accepts a subset of the `regplot()` parameters (`data`, `x`, `y`, `x_partial`, `y_partial`, `order`, `robust`, `dropna`, `label`, `color`, `scatter_kws`, `line_kws`, `ax`), and:
- Always plots residuals (y - predicted) vs x
- Adds horizontal line at y=0
- `lowess` - Fit lowess smoother to residuals

**Example:**
```python
sns.residplot(data=df, x='x', y='y', lowess=True,
              scatter_kws={'alpha': 0.5})
```

## Matrix Plots

### heatmap()

**Purpose:** Plot rectangular data as a color-encoded matrix.

**Key Parameters:**
- `data` - 2D array-like data
- `vmin, vmax` - Anchor values for colormap
- `cmap` - Colormap name or object
- `center` - Value at colormap center
- `robust` - Use robust quantiles for colormap range
- `annot` - Annotate cells: True, False, or array
- `fmt` - Format string for annotations (e.g., ".2f")
- `annot_kws` - Parameters for annotations
- `linewidths` - Width of cell borders
- `linecolor` - Color of cell borders
- `cbar` - Draw colorbar
- `cbar_kws` - Colorbar parameters
- `cbar_ax` - Axes for colorbar
- `square` - Force square cells
- `xticklabels, yticklabels` - Tick labels (True, False, int, or list)
- `mask` - Boolean array; cells where it is True are not drawn
- `ax` - Matplotlib axes

**Example:**
```python
import numpy as np

# Correlation matrix of the numeric columns
corr = df.corr(numeric_only=True)
# Hide the upper triangle (and diagonal) so each pair appears once
mask = np.triu(np.ones_like(corr, dtype=bool))
sns.heatmap(corr, mask=mask, annot=True, fmt='.2f',
            cmap='coolwarm', center=0, square=True,
            linewidths=1, cbar_kws={'shrink': 0.8})
```

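The interaction between `mask` and `annot` can be checked with a small self-contained sketch. The data are random and the column names arbitrary; the point is that masked cells receive neither color nor annotation text:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(7)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("abcd"))

corr = df.corr()
# True marks cells to hide: the upper triangle plus the diagonal.
mask = np.triu(np.ones_like(corr, dtype=bool))

ax = sns.heatmap(corr, mask=mask, annot=True, fmt=".2f",
                 cmap="coolwarm", center=0, vmin=-1, vmax=1, square=True)

# One Text object per visible cell: a 4x4 matrix has 6 cells below the diagonal.
n_labels = len(ax.texts)
visible_cells = int((~mask).sum())
```

Fixing `vmin=-1, vmax=1` together with `center=0` keeps the color scale symmetric, which makes correlation heatmaps comparable across datasets.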
### clustermap()

**Purpose:** Plot a hierarchically-clustered heatmap.

**Key Parameters:**
All parameters from `heatmap()`, plus:
- `pivot_kws` - Parameters for pivoting (if needed)
- `method` - Linkage method: "single", "complete", "average", "weighted", "centroid", "median", "ward"
- `metric` - Distance metric for clustering
- `standard_scale` - Standardize data: 0 (rows), 1 (columns), or None
- `z_score` - Z-score normalize data: 0 (rows), 1 (columns), or None
- `row_cluster, col_cluster` - Cluster rows/columns
- `row_linkage, col_linkage` - Precomputed linkage matrices
- `row_colors, col_colors` - Additional color annotations
- `dendrogram_ratio` - Ratio of dendrogram to heatmap
- `colors_ratio` - Ratio of color annotations to heatmap
- `cbar_pos` - Colorbar position (tuple: left, bottom, width, height)
- `tree_kws` - Parameters for dendrogram
- `figsize` - Figure size

**Example:**
```python
sns.clustermap(data, method='average', metric='euclidean',
               z_score=0, cmap='viridis',
               row_colors=row_colors, col_colors=col_colors,
               figsize=(12, 12), dendrogram_ratio=0.1)
```

## Multi-Plot Grids

### FacetGrid

**Purpose:** Multi-plot grid for plotting conditional relationships.

**Initialization:**
```python
g = sns.FacetGrid(data, row=None, col=None, hue=None,
                  col_wrap=None, sharex=True, sharey=True,
                  height=3, aspect=1, palette=None,
                  row_order=None, col_order=None, hue_order=None,
                  hue_kws=None, dropna=False, legend_out=True,
                  despine=True, margin_titles=False,
                  xlim=None, ylim=None, subplot_kws=None,
                  gridspec_kws=None)
```

**Methods:**
- `map(func, *args, **kwargs)` - Apply function to each facet, passing the named columns as positional arrays
- `map_dataframe(func, *args, **kwargs)` - Like `map()`, but passes each facet's subset as the `data` keyword argument
- `set_axis_labels(x_var, y_var)` - Set axis labels
- `set_titles(template, **kwargs)` - Set subplot titles
- `set(**kwargs)` - Set attributes on all axes
- `add_legend(legend_data, title, label_order, **kwargs)` - Add legend
- `savefig(*args, **kwargs)` - Save figure

**Example:**
```python
g = sns.FacetGrid(df, col='time', row='sex', hue='smoker',
                  height=3, aspect=1.5, margin_titles=True)
g.map(sns.scatterplot, 'total_bill', 'tip', alpha=0.7)
g.add_legend()
g.set_axis_labels('Total Bill ($)', 'Tip ($)')
# With margin_titles=True, row and column titles are set separately
g.set_titles(col_template='{col_name}', row_template='{row_name}')
```

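The difference between `map()` and `map_dataframe()` is easiest to see in a runnable sketch. The data below are synthetic and only mimic the column names used above; `map_dataframe()` lets the plotting function receive keyword arguments such as `x=`, `y=`, and `hue=`:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(8)
# Hypothetical tips-like table with every time/sex/smoker combination present.
df = pd.MultiIndex.from_product(
    [["Lunch", "Dinner"], ["Female", "Male"], ["No", "Yes"], range(10)],
    names=["time", "sex", "smoker", "party"],
).to_frame(index=False)
df["total_bill"] = rng.uniform(10, 50, len(df))
df["tip"] = 0.15 * df["total_bill"] + rng.normal(0, 1, len(df))

g = sns.FacetGrid(df, col="time", row="sex", height=2.5, margin_titles=True)
# Each facet's subset is passed as data=...; x, y, and hue name its columns.
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip", hue="smoker")
g.add_legend()
g.set_axis_labels("Total bill", "Tip")
g.set_titles(col_template="{col_name}", row_template="{row_name}")
```

For the common cases, the figure-level functions (`relplot()`, `displot()`, `catplot()`, `lmplot()`) build an equivalent grid in a single call; using `FacetGrid` directly is mainly useful with custom plotting functions.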
### PairGrid

**Purpose:** Grid for plotting pairwise relationships in a dataset.

**Initialization:**
```python
g = sns.PairGrid(data, hue=None, vars=None,
                 x_vars=None, y_vars=None,
                 hue_order=None, palette=None,
                 hue_kws=None, corner=False,
                 diag_sharey=True, height=2.5,
                 aspect=1, layout_pad=0.5,
                 despine=True, dropna=False)
```

**Methods:**
- `map(func, **kwargs)` - Apply function to all subplots
- `map_diag(func, **kwargs)` - Apply to diagonal
- `map_offdiag(func, **kwargs)` - Apply to off-diagonal
- `map_upper(func, **kwargs)` - Apply to upper triangle
- `map_lower(func, **kwargs)` - Apply to lower triangle
- `add_legend(legend_data, **kwargs)` - Add legend
- `savefig(*args, **kwargs)` - Save figure

**Example:**
```python
# corner=True is omitted here: it removes the upper triangle,
# so map_upper() would have nothing to draw on.
g = sns.PairGrid(df, hue='species', vars=['a', 'b', 'c', 'd'],
                 height=2.5)
g.map_upper(sns.scatterplot, alpha=0.5)
g.map_lower(sns.kdeplot)
g.map_diag(sns.histplot, kde=True)
g.add_legend()
```

### JointGrid

**Purpose:** Grid for bivariate plot with marginal univariate plots.

**Initialization:**
```python
g = sns.JointGrid(data=None, x=None, y=None, hue=None,
                  height=6, ratio=5, space=0.2,
                  dropna=False, xlim=None, ylim=None,
                  marginal_ticks=False, hue_order=None,
                  palette=None)
```

**Methods:**
- `plot(joint_func, marginal_func, **kwargs)` - Plot both joint and marginals
- `plot_joint(func, **kwargs)` - Plot joint distribution
- `plot_marginals(func, **kwargs)` - Plot marginal distributions
- `refline(x, y, **kwargs)` - Add reference line
- `set_axis_labels(xlabel, ylabel, **kwargs)` - Set axis labels
- `savefig(*args, **kwargs)` - Save figure

**Example:**
```python
g = sns.JointGrid(data=df, x='x', y='y', hue='group',
                  height=6, ratio=5, space=0.2)
g.plot_joint(sns.scatterplot, alpha=0.5)
g.plot_marginals(sns.histplot, kde=True)
g.set_axis_labels('Variable X', 'Variable Y')
```