# Collecting Render from Virtual Cameras
## Overview
This tutorial covers two methods for capturing rendered images from virtual cameras in Vuer, with a focus on the recommended `ondemand` approach.
## Methods Overview
### Method 1: Frame/Time Mode (Legacy)
Uses event handlers to collect rendered images. Only supported in `stream='frame'` or `stream='time'` mode.
**Limitations:**
- Less backend control
- Continuous rendering even when not needed
- Higher resource usage
### Method 2: OnDemand Mode (Recommended)
Uses the synchronous `grab_render` RPC, awaited from the session's async context. Only available in `stream='ondemand'` mode.
**Advantages:**
- Superior backend control
- Renders only when explicitly requested
- Lower computational overhead
- Support for depth rendering
## Complete Example: OnDemand Mode
```python
import asyncio

import numpy as np
from PIL import Image

from vuer import Vuer, VuerSession
from vuer.schemas import Box, CameraView, DefaultScene, Scene, Urdf

app = Vuer()


@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up the scene
    session.set @ Scene(
        DefaultScene(),
        # Add some objects to render
        Box(
            args=[1, 1, 1],
            position=[0, 0.5, 0],
            color="red",
            key="box",
        ),
        Urdf(
            src="/static/robot.urdf",
            position=[2, 0, 0],
            key="robot",
        ),
        # Configure camera with ondemand streaming
        CameraView(
            key="ego",
            fov=50,
            width=640,
            height=480,
            position=[0, 2, 5],
            rotation=[0, 0, 0],
            stream="ondemand",
            renderDepth=True,  # Enable depth rendering
            near=0.1,
            far=100,
        ),
    )

    # Wait for the scene to initialize
    await asyncio.sleep(0.5)

    # Capture renders from different positions
    for i in range(10):
        # Move the camera along a circular path
        x = 5 * np.cos(i * 0.2)
        z = 5 * np.sin(i * 0.2)
        session.update @ CameraView(
            key="ego",
            position=[x, 2, z],
            rotation=[0, i * 0.2, 0],
        )

        # Small delay so the camera update reaches the client
        await asyncio.sleep(0.1)

        # Grab the render
        result = await session.grab_render(downsample=1, key="ego")
        if result:
            # Process the RGB image
            rgb_data = result.get("rgb")
            if rgb_data:
                # Convert the raw bytes to a numpy array
                img_array = np.frombuffer(rgb_data, dtype=np.uint8)
                img_array = img_array.reshape((480, 640, 3))

                # Save the image
                img = Image.fromarray(img_array)
                img.save(f"render_{i:03d}.png")
                print(f"Saved render_{i:03d}.png")

            # Process the depth map
            depth_data = result.get("depth")
            if depth_data:
                depth_array = np.frombuffer(depth_data, dtype=np.float32)
                depth_array = depth_array.reshape((480, 640))

                # Normalize to [0, 1] before converting to 8-bit;
                # raw depth values are not bounded by 1.0
                depth_norm = (depth_array - depth_array.min()) / (
                    np.ptp(depth_array) + 1e-6
                )
                depth_img = Image.fromarray((depth_norm * 255).astype(np.uint8))
                depth_img.save(f"depth_{i:03d}.png")
                print(f"Saved depth_{i:03d}.png")

    print("Finished capturing renders")

    # Keep the session alive
    while True:
        await asyncio.sleep(1.0)
```
## Key API: `grab_render()`
```python
result = await session.grab_render(downsample=1, key="ego")
```
### Parameters
- **downsample**: Downsample factor (1 = no downsampling, 2 = half resolution)
- **key**: Camera key to capture from
### Returns
Dictionary containing:
- **rgb**: RGB image data as bytes
- **depth**: Depth map data as float32 array (if `renderDepth=True`)
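When `downsample` is greater than 1, the returned buffers shrink by that factor per axis, so the reshape dimensions must shrink too. A minimal sketch inside the session handler, assuming the raw-byte layout described above:

```python
import numpy as np

# Half-resolution grab: a 640x480 camera returns a 320x240 buffer.
downsample = 2
width, height = 640 // downsample, 480 // downsample

result = await session.grab_render(downsample=downsample, key="ego")
rgb = np.frombuffer(result["rgb"], dtype=np.uint8).reshape((height, width, 3))
```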
## Depth Rendering
Enable depth map capture by setting `renderDepth=True`:
```python
CameraView(
    key="ego",
    renderDepth=True,
    stream="ondemand",
    # ... other parameters
)
```
**Benefits:**
- Captures depth without changing object materials
- Available since 2024 update
- Minimal computational overhead
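Once you have the float32 depth map, you can unproject it into a camera-space point cloud using intrinsics derived from the camera's `fov`. A minimal sketch, assuming a vertical field of view in degrees and metric depth along the view axis (verify against your renderer's depth convention, which may instead be normalized):

```python
import numpy as np


def depth_to_points(depth: np.ndarray, fov_deg: float) -> np.ndarray:
    """Unproject an (H, W) depth map into an (H*W, 3) camera-space cloud."""
    h, w = depth.shape
    # Focal length in pixels, derived from the vertical field of view
    fy = (h / 2) / np.tan(np.deg2rad(fov_deg) / 2)
    fx = fy  # assume square pixels
    cx, cy = w / 2, h / 2

    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = -(v - cy) / fy * depth  # flip so +y points up
    z = -depth  # camera looks down -z
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


points = depth_to_points(depth_array, fov_deg=50)  # depth_array from the example above
```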
## Legacy Method: Event Handler
For `stream='frame'` or `stream='time'` mode:
```python
async def handle_camera_view(event, session):
    """Handle CAMERA_VIEW events."""
    if event.key != "ego":
        return

    # Access the rendered image
    image_data = event.value.get("image")

    # Process the image data
    print(f"Received image: {len(image_data)} bytes")


app.add_handler("CAMERA_VIEW", handle_camera_view)
```
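The image payload arrives as encoded bytes rather than a raw pixel buffer. A minimal decoding sketch, assuming a JPEG- or PNG-encoded frame (the exact encoding depends on the client's stream settings):

```python
from io import BytesIO

import numpy as np
from PIL import Image

# Decode the encoded frame bytes into an RGB array.
img = Image.open(BytesIO(image_data)).convert("RGB")
frame = np.asarray(img)
print(frame.shape)  # e.g. (480, 640, 3)
```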
## Multi-Camera Capture
Capture from multiple cameras:
```python
@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up multiple cameras
    session.set @ Scene(
        DefaultScene(),
        CameraView(
            key="front-camera",
            position=[0, 1, 5],
            stream="ondemand",
            width=640,
            height=480,
        ),
        CameraView(
            key="top-camera",
            position=[0, 10, 0],
            rotation=[-1.57, 0, 0],
            stream="ondemand",
            width=640,
            height=480,
        ),
    )

    await asyncio.sleep(0.5)

    # Capture from both cameras
    front_render = await session.grab_render(key="front-camera")
    top_render = await session.grab_render(key="top-camera")

    # Process renders...
```
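The two grabs above run back to back; if the client can service overlapping requests, they can also be issued concurrently. A sketch, assuming concurrent `grab_render` calls are safe in your setup:

```python
# Issue both grabs at once instead of sequentially.
front_render, top_render = await asyncio.gather(
    session.grab_render(key="front-camera"),
    session.grab_render(key="top-camera"),
)
```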
## Best Practices
1. **Use ondemand mode** - More efficient for programmatic rendering
2. **Enable depth rendering** - Get depth maps without material changes
3. **Add small delays** - Wait for camera updates before grabbing
4. **Set appropriate resolution** - Balance quality and performance
5. **Use downsampling** - Reduce data size when full resolution isn't needed (see the sketch below)
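Practices 3-5 combine naturally into a small helper. A sketch, assuming the session API shown above; the `settle` and `downsample` defaults are illustrative, not library values:

```python
async def capture(session, key: str, downsample: int = 2, settle: float = 0.1):
    """Wait for pending camera updates to settle, then grab a render."""
    await asyncio.sleep(settle)  # let camera updates propagate to the client
    return await session.grab_render(downsample=downsample, key=key)
```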
## Performance Considerations
The ondemand approach:
- Minimizes computational overhead
- Only renders when explicitly requested
- Ideal for resource-constrained applications
- Perfect for dataset generation and batch processing
## Source
Documentation: https://docs.vuer.ai/en/latest/tutorials/camera/grab_render_virtual_camera.html