# NetworkX Graph Basics

## Graph Types

NetworkX supports four main graph classes:

### Graph (Undirected)
```python
import networkx as nx
G = nx.Graph()
```
- Undirected graph with at most one edge between any pair of nodes
- No parallel edges; self-loops are allowed
- Edges are bidirectional: (u, v) and (v, u) are the same edge

### DiGraph (Directed)
```python
G = nx.DiGraph()
```
- Directed graphs with one-way connections
- Edge direction matters: (u, v) ≠ (v, u)
- Used for modeling directed relationships

### MultiGraph (Undirected Multi-edge)
```python
G = nx.MultiGraph()
```
- Allows multiple (parallel) edges between the same pair of nodes, each identified by an edge key
- Useful when two nodes can be related in several ways at once

### MultiDiGraph (Directed Multi-edge)
```python
G = nx.MultiDiGraph()
```
- Directed graph with multiple edges between nodes
- Combines features of DiGraph and MultiGraph
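
The practical difference between the four classes shows up when the same edge is added more than once. The following is a minimal sketch (the node names `"a"` and `"b"` and the `edge_counts` dictionary are illustrative only) that adds the same three edges to each class and compares the resulting edge counts:

```python
import networkx as nx

# Add the same three edges to each class and record how many are kept.
edge_counts = {}
for graph_class in (nx.Graph, nx.DiGraph, nx.MultiGraph, nx.MultiDiGraph):
    G = graph_class()
    G.add_edge("a", "b")
    G.add_edge("a", "b")  # Duplicate: merged in simple graphs, kept in multigraphs
    G.add_edge("b", "a")  # Reverse direction: a new edge only in directed classes
    edge_counts[graph_class.__name__] = G.number_of_edges()

print(edge_counts)
# {'Graph': 1, 'DiGraph': 2, 'MultiGraph': 3, 'MultiDiGraph': 3}
```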

## Creating and Adding Nodes

### Single Node Addition
```python
G.add_node(1)
G.add_node("protein_A")
G.add_node((0, 1))  # Nodes can be any hashable type, e.g. a coordinate tuple
```

### Bulk Node Addition
```python
G.add_nodes_from([2, 3, 4])
G.add_nodes_from(range(100, 110))
```

### Nodes with Attributes
```python
G.add_node(1, time='5pm', color='red')
G.add_nodes_from([
    (4, {"color": "red"}),
    (5, {"color": "blue", "weight": 1.5})
])
```

### Important Node Properties
- Nodes can be any hashable Python object except `None`: strings, tuples, numbers, custom objects
- Node attributes stored as key-value pairs
- Use meaningful node identifiers for clarity
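
To illustrate the hashability requirement, here is a small sketch (node values chosen purely for illustration) showing that immutable objects work as nodes while a mutable list is rejected:

```python
import networkx as nx

G = nx.Graph()
G.add_node("protein_A")        # String
G.add_node((0, 1))             # Tuple, e.g. a grid coordinate
G.add_node(frozenset({1, 2}))  # frozenset is immutable and hashable

# A list is mutable and therefore unhashable, so it cannot be a node.
list_rejected = False
try:
    G.add_node([0, 1])
except TypeError:
    list_rejected = True

print(G.number_of_nodes(), list_rejected)  # 3 True
```

If a list-like identifier is needed, converting it to a tuple first is the usual workaround.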

## Creating and Adding Edges

### Single Edge Addition
```python
G.add_edge(1, 2)
G.add_edge('gene_A', 'gene_B')
```

### Bulk Edge Addition
```python
G.add_edges_from([(1, 2), (1, 3), (2, 4)])
edge_list = [(4, 5), (5, 6)]  # Any iterable of (u, v) or (u, v, attr_dict) tuples
G.add_edges_from(edge_list)
```

### Edges with Attributes
```python
G.add_edge(1, 2, weight=4.7, relation='interacts')
G.add_edges_from([
    (1, 2, {'weight': 4.7}),
    (2, 3, {'weight': 8.2, 'color': 'blue'})
])
```

### Adding from Edge List with Attributes
```python
# Build a new graph from a pandas DataFrame (one row per edge)
import pandas as pd
df = pd.DataFrame({'source': [1, 2], 'target': [2, 3], 'weight': [4.7, 8.2]})
G = nx.from_pandas_edgelist(df, 'source', 'target', edge_attr='weight')
```

## Examining Graph Structure

### Basic Properties
```python
# Get collections
G.nodes              # NodeView of all nodes
G.edges              # EdgeView of all edges
G.adj                # AdjacencyView for neighbor relationships

# Count elements
G.number_of_nodes()  # Total node count
G.number_of_edges()  # Total edge count
len(G)              # Number of nodes (shorthand)

# Degree information
G.degree()          # DegreeView of all node degrees
G.degree(1)         # Degree of specific node
list(G.degree())    # List of (node, degree) pairs
```

### Checking Existence
```python
# Check if node exists
1 in G              # Returns True/False
G.has_node(1)

# Check if edge exists
G.has_edge(1, 2)
```

### Accessing Neighbors
```python
# Get neighbors of node 1
list(G.neighbors(1))
list(G[1])          # Dictionary-like access

# For directed graphs (DiGraph, MultiDiGraph)
list(G.predecessors(1))  # Nodes with an edge pointing into node 1
list(G.successors(1))    # Nodes that node 1 points to (same as neighbors)
```
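
As a concrete illustration of the directed case, the sketch below builds a small `DiGraph` (the edges are made up for the example) and compares predecessors, successors, and neighbors of one node:

```python
import networkx as nx

# Edges: 1 -> 2, 3 -> 2, 2 -> 4
D = nx.DiGraph([(1, 2), (3, 2), (2, 4)])

preds = sorted(D.predecessors(2))  # Sources of edges into node 2
succs = sorted(D.successors(2))    # Targets of edges out of node 2
nbrs = sorted(D.neighbors(2))      # For a DiGraph, identical to successors

print(preds, succs, nbrs)                  # [1, 3] [4] [4]
print(D.in_degree(2), D.out_degree(2))     # 2 1
```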

### Iterating Over Elements
```python
# Iterate over nodes
for node in G.nodes:
    print(node, G.nodes[node])  # Access node attributes

# Iterate over edges
for u, v in G.edges:
    print(u, v, G[u][v])  # Access edge attributes

# Iterate with attributes
for node, attrs in G.nodes(data=True):
    print(node, attrs)

for u, v, attrs in G.edges(data=True):
    print(u, v, attrs)
```
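
When only one attribute is of interest, `G.edges(data=...)` also accepts an attribute name plus a `default` for edges that lack it. A short sketch, assuming a hypothetical `weight` attribute that is missing on one edge:

```python
import networkx as nx

G = nx.Graph()
G.add_edge(1, 2, weight=4.7)
G.add_edge(2, 3)  # No weight attribute set on this edge

# Yields (u, v, value) triples; edges without the attribute get the default.
weighted_edges = list(G.edges(data="weight", default=1.0))
print(weighted_edges)  # [(1, 2, 4.7), (2, 3, 1.0)]
```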

## Modifying Graphs

### Removing Elements
```python
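# Remove single node (also removes incident edges);
# raises NetworkXError if the node is not in the graph
G.remove_node(1)

# Remove multiple nodes (nodes not in the graph are silently ignored)
G.remove_nodes_from([1, 2, 3])

# Remove edges (remove_edge raises NetworkXError if the edge is missing;
# remove_edges_from silently ignores missing edges)
G.remove_edge(1, 2)
G.remove_edges_from([(1, 2), (2, 3)])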
```

### Clearing Graph
```python
G.clear()           # Remove all nodes and edges
G.clear_edges()     # Remove only edges, keep nodes
```

## Attributes and Metadata

### Graph-Level Attributes
```python
G.graph['name'] = 'Social Network'
G.graph['date'] = '2025-01-15'
print(G.graph)
```

### Node Attributes
```python
# Set at creation
G.add_node(1, time='5pm', weight=0.5)

# Set after creation
G.nodes[1]['time'] = '6pm'
nx.set_node_attributes(G, {1: 'red', 2: 'blue'}, 'color')

# Get attributes
G.nodes[1]
G.nodes[1]['time']
nx.get_node_attributes(G, 'color')
```

### Edge Attributes
```python
# Set at creation
G.add_edge(1, 2, weight=4.7, color='red')

# Set after creation
G[1][2]['weight'] = 5.0
nx.set_edge_attributes(G, {(1, 2): 10.5}, 'weight')

# Get attributes
G[1][2]
G[1][2]['weight']
G.edges[1, 2]
nx.get_edge_attributes(G, 'weight')
```
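
The `G[u][v]` access shown above applies to `Graph` and `DiGraph`. In a `MultiGraph` or `MultiDiGraph` there is one more level, the edge key, because several edges can connect the same pair. A minimal sketch with illustrative weights:

```python
import networkx as nx

M = nx.MultiGraph()
k0 = M.add_edge(1, 2, weight=4.7)  # add_edge returns the new edge's key
k1 = M.add_edge(1, 2, weight=8.2)  # Second parallel edge gets its own key

print(k0, k1)                        # 0 1
print(M[1][2][k1]["weight"])         # 8.2
print(M.edges[1, 2, k0]["weight"])   # 4.7

# Iterate over parallel edges together with their keys
keyed = list(M.edges(keys=True, data="weight"))
print(keyed)  # [(1, 2, 0, 4.7), (1, 2, 1, 8.2)]
```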

## Subgraphs and Views

### Subgraph Creation
```python
# Create node-induced subgraph from a node list
nodes_subset = [1, 2, 3, 4]
H = G.subgraph(nodes_subset)  # Read-only view that reflects later changes to G

# Create independent, editable copy
H = G.subgraph(nodes_subset).copy()

# Edge-induced subgraph (also a read-only view)
edge_subset = [(1, 2), (2, 3)]
H = G.edge_subgraph(edge_subset)
```
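
The difference between a subgraph view and a copy is easiest to see by modifying the original graph afterwards. The sketch below uses `nx.path_graph(5)` as a stand-in for real data:

```python
import networkx as nx

G = nx.path_graph(5)  # 0-1-2-3-4
view = G.subgraph([0, 1, 2])
copy = G.subgraph([0, 1, 2]).copy()

G.remove_edge(0, 1)  # Change the original after creating both

print(view.has_edge(0, 1))  # False: the view tracks G
print(copy.has_edge(0, 1))  # True: the copy is independent

# Views are frozen, so structural edits on them raise NetworkXError.
view_is_read_only = False
try:
    view.add_edge(0, 2)
except nx.NetworkXError:
    view_is_read_only = True
print(view_is_read_only)  # True
```

A view avoids duplicating data, which helps with large graphs; a copy is the safer choice when the subgraph will be edited or must stay stable while the original changes.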

### Graph Views
```python
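# Reversed graph (for directed graphs)
G_reversed = G.reverse()                 # Returns an independent copy by default
G_reversed_view = G.reverse(copy=False)  # Read-only view of G

# Convert between directed/undirected (both return copies by default;
# pass as_view=True for a read-only view)
G_undirected = G.to_undirected()
G_directed = G.to_directed()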
```
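
As a quick check of the copy-versus-view behavior of `reverse()`, the following sketch (edges chosen for illustration) adds an edge to the original graph after reversing it both ways:

```python
import networkx as nx

D = nx.DiGraph([(1, 2)])
reversed_copy = D.reverse()            # Default copy=True: independent graph
reversed_view = D.reverse(copy=False)  # Read-only view backed by D

D.add_edge(2, 3)  # Modify the original afterwards

print(reversed_view.has_edge(3, 2))  # True: the view sees the new edge
print(reversed_copy.has_edge(3, 2))  # False: the copy does not
```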

## Graph Information and Diagnostics

### Basic Information
```python
print(G)            # Summary: graph type, node and edge counts (nx.info() was removed in NetworkX 3.0)

# Density (ratio of actual edges to possible edges)
nx.density(G)

# Check if graph is directed
G.is_directed()

# Check if graph is multigraph
G.is_multigraph()
```

### Connectivity Checks
```python
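# For undirected graphs (raise NetworkXNotImplemented on directed graphs)
nx.is_connected(G)
nx.number_connected_components(G)

# For directed graphs (raise NetworkXNotImplemented on undirected graphs)
nx.is_strongly_connected(G)
nx.is_weakly_connected(G)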
```
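
A small sketch of these checks on two toy graphs (the edges are illustrative), including what happens when an undirected-only function is called on a directed graph:

```python
import networkx as nx

# Undirected graph with two separate components: {1, 2} and {3, 4}
U = nx.Graph([(1, 2), (3, 4)])
print(nx.is_connected(U))                 # False
print(nx.number_connected_components(U))  # 2

# Directed path 1 -> 2 -> 3: connected only if direction is ignored
D = nx.DiGraph([(1, 2), (2, 3)])
print(nx.is_strongly_connected(D))  # False: no path from 3 back to 1
print(nx.is_weakly_connected(D))    # True

# is_connected is defined for undirected graphs only
rejected_directed = False
try:
    nx.is_connected(D)
except nx.NetworkXNotImplemented:
    rejected_directed = True
print(rejected_directed)  # True
```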

## Important Considerations

### Floating Point Precision
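Once edge weights or other numeric attributes are floating point numbers, any result computed from them is approximate. Small rounding errors can change the outcome of comparisons, particularly in minimum/maximum computations such as choosing between two paths of nearly equal weight. Where exact results matter, consider using integer weights, for example by scaling.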
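
The effect can be reproduced with three edges. In this sketch (weights chosen deliberately to trigger rounding), a two-hop route and a direct edge that are equal on paper compare as unequal in floating point:

```python
import math
import networkx as nx

G = nx.Graph()
G.add_edge("a", "b", weight=0.1)
G.add_edge("b", "c", weight=0.2)
G.add_edge("a", "c", weight=0.3)

two_hop = G["a"]["b"]["weight"] + G["b"]["c"]["weight"]
direct = G["a"]["c"]["weight"]

print(two_hop == direct)              # False: 0.1 + 0.2 is 0.30000000000000004
print(math.isclose(two_hop, direct))  # True: compare with a tolerance instead

# One possible workaround: store weights as scaled integers (here, tenths)
scaled = {(u, v): round(w * 10) for u, v, w in G.edges(data="weight")}
print(scaled[("a", "b")] + scaled[("b", "c")] == scaled[("a", "c")])  # True
```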

### Memory Considerations
NetworkX keeps the entire graph in memory, and every run of a script must build or load it again. For large datasets, this can cause slow startup and high memory use. Consider:
- Storing graphs in a format that is fast to reload (for example, pickle for graphs containing arbitrary Python objects)
- Loading only the subgraphs you need
- Using a graph database for very large networks
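
As one way to apply the first suggestion, a graph can be round-tripped through the standard-library `pickle` module. This is a minimal sketch; the file name `graph.pickle` and the temporary directory are placeholders for a real storage location:

```python
import os
import pickle
import tempfile

import networkx as nx

G = nx.path_graph(4)
G.graph["name"] = "demo"
G.nodes[0]["label"] = ("any", "hashable", "or", "picklable", "value")

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "graph.pickle")  # Placeholder file name

    # Save the graph, including graph, node, and edge attributes
    with open(path, "wb") as f:
        pickle.dump(G, f, protocol=pickle.HIGHEST_PROTOCOL)

    # Load it back in a later run
    with open(path, "rb") as f:
        G_loaded = pickle.load(f)

print(G_loaded.graph["name"], G_loaded.number_of_edges())  # demo 3
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.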

### Node and Edge Removal Behavior
When a node is removed, all edges incident with that node are automatically removed as well.
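
A short sketch of this behavior on a triangle graph (nodes chosen for illustration), which also contrasts `remove_node` with the more forgiving `remove_nodes_from`:

```python
import networkx as nx

G = nx.Graph([(1, 2), (1, 3), (2, 3)])
G.remove_node(1)

print(list(G.nodes))  # [2, 3]
print(list(G.edges))  # [(2, 3)]: both edges touching node 1 are gone

# Removing a node that is no longer present raises NetworkXError ...
missing_node_raises = False
try:
    G.remove_node(1)
except nx.NetworkXError:
    missing_node_raises = True

# ... whereas remove_nodes_from skips nodes that are not in the graph.
G.remove_nodes_from([1, 99])
print(missing_node_raises, G.number_of_nodes())  # True 2
```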