Collecting Render from Virtual Cameras
Overview
This tutorial covers two methods for capturing rendered images from virtual cameras in Vuer, with a focus on the recommended ondemand approach.
Methods Overview
Method 1: Frame/Time Mode (Legacy)
Uses event handlers to collect rendered images. Only supported in stream='frame' or stream='time' mode.
Limitations:
- Less backend control
- Continuous rendering even when not needed
- Higher resource usage
Method 2: OnDemand Mode (Recommended)
Uses the grab_render RPC, an awaited request/response call that returns the render directly. Only available in stream='ondemand' mode.
Advantages:
- Superior backend control
- Renders only when explicitly requested
- Lower computational overhead
- Support for depth rendering
Complete Example: OnDemand Mode
import asyncio

import numpy as np
from PIL import Image

from vuer import Vuer, VuerSession
from vuer.schemas import Box, CameraView, DefaultScene, Scene, Urdf

app = Vuer()


@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up the scene
    session.set @ Scene(
        DefaultScene(),
        # Add some objects to render
        Box(
            args=[1, 1, 1],
            position=[0, 0.5, 0],
            color="red",
            key="box",
        ),
        Urdf(
            src="/static/robot.urdf",
            position=[2, 0, 0],
            key="robot",
        ),
        # Configure the camera with on-demand streaming
        CameraView(
            key="ego",
            fov=50,
            width=640,
            height=480,
            position=[0, 2, 5],
            rotation=[0, 0, 0],
            stream="ondemand",
            renderDepth=True,  # Enable depth rendering
            near=0.1,
            far=100,
        ),
    )

    # Wait for the scene to initialize
    await asyncio.sleep(0.5)

    # Capture renders from different positions
    for i in range(10):
        # Move the camera along a circle around the origin
        x = 5 * np.cos(i * 0.2)
        z = 5 * np.sin(i * 0.2)
        session.update @ CameraView(
            key="ego",
            position=[x, 2, z],
            rotation=[0, i * 0.2, 0],
        )

        # Small delay so the camera update reaches the client
        await asyncio.sleep(0.1)

        # Grab the render (an awaited RPC round-trip to the client)
        result = await session.grab_render(downsample=1, key="ego")
        if result:
            # Process the RGB image
            rgb_data = result.get("rgb")
            if rgb_data:
                # Convert the raw bytes to a numpy array
                img_array = np.frombuffer(rgb_data, dtype=np.uint8)
                img_array = img_array.reshape((480, 640, 3))

                # Save the image
                img = Image.fromarray(img_array)
                img.save(f"render_{i:03d}.png")
                print(f"Saved render_{i:03d}.png")

            # Process the depth map
            depth_data = result.get("depth")
            if depth_data:
                depth_array = np.frombuffer(depth_data, dtype=np.float32)
                depth_array = depth_array.reshape((480, 640))

                # Normalize to [0, 255] before saving as an 8-bit preview,
                # so metric depth values beyond 1.0 do not overflow
                d_min, d_max = depth_array.min(), depth_array.max()
                depth_norm = (depth_array - d_min) / max(d_max - d_min, 1e-6)
                depth_img = Image.fromarray((depth_norm * 255).astype(np.uint8))
                depth_img.save(f"depth_{i:03d}.png")
                print(f"Saved depth_{i:03d}.png")

    print("Finished capturing renders")

    # Keep the session alive
    while True:
        await asyncio.sleep(1.0)


app.run()
Key API: grab_render()
result = await session.grab_render(downsample=1, key="ego")
Parameters
- downsample: Downsample factor (1 = no downsampling, 2 = half resolution)
- key: Camera key to capture from
Returns
Dictionary containing:
- rgb: RGB image data as bytes
- depth: Depth map data as a float32 buffer (only included when renderDepth=True)
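The downsample factor scales the returned resolution, so the reshape step must use the downsampled width and height. A minimal sketch with a hypothetical unpack_rgb helper, assuming (as described above) that the rgb payload is raw, uncompressed RGB bytes at the downsampled resolution:

```python
import numpy as np


def unpack_rgb(rgb_bytes: bytes, width: int, height: int, downsample: int = 1) -> np.ndarray:
    """Reshape a raw RGB byte buffer into an (H, W, 3) array,
    accounting for the grab_render downsample factor."""
    h, w = height // downsample, width // downsample
    return np.frombuffer(rgb_bytes, dtype=np.uint8).reshape((h, w, 3))


# Synthetic bytes standing in for a grab_render payload:
# a 640x480 camera grabbed with downsample=2 yields 320x240 pixels.
fake_payload = bytes(240 * 320 * 3)
img = unpack_rgb(fake_payload, width=640, height=480, downsample=2)
print(img.shape)  # (240, 320, 3)
```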
Depth Rendering
Enable depth map capture by setting renderDepth=True:
CameraView(
    key="ego",
    renderDepth=True,
    stream="ondemand",
    # ... other parameters
)
Benefits:
- Captures depth without changing object materials
- Available since the 2024 update
- Minimal computational overhead
Legacy Method: Event Handler
For stream='frame' or stream='time' mode:
@app.add_handler("CAMERA_VIEW")
async def handle_camera_view(event, session):
    """Handle CAMERA_VIEW events."""
    if event.key != "ego":
        return

    # Access the rendered image bytes
    image_data = event.value.get("image")

    # Process the image data
    print(f"Received image: {len(image_data)} bytes")
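In frame/time mode the image payload typically arrives as encoded bytes rather than a raw pixel buffer. A minimal decoding sketch, assuming a JPEG/PNG payload; decode_frame is a hypothetical helper, not part of the Vuer API:

```python
import io

import numpy as np
from PIL import Image


def decode_frame(image_bytes: bytes) -> np.ndarray:
    """Decode an encoded frame payload (e.g. JPEG or PNG) into an RGB array."""
    return np.asarray(Image.open(io.BytesIO(image_bytes)).convert("RGB"))


# Self-test with a synthetic in-memory PNG standing in for an event payload:
buf = io.BytesIO()
Image.new("RGB", (640, 480), "red").save(buf, format="PNG")
frame = decode_frame(buf.getvalue())
print(frame.shape)  # (480, 640, 3)
```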
Multi-Camera Capture
Capture from multiple cameras:
@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up multiple cameras
    session.set @ Scene(
        DefaultScene(),
        CameraView(
            key="front-camera",
            position=[0, 1, 5],
            stream="ondemand",
            width=640,
            height=480,
        ),
        CameraView(
            key="top-camera",
            position=[0, 10, 0],
            rotation=[-1.57, 0, 0],
            stream="ondemand",
            width=640,
            height=480,
        ),
    )
    await asyncio.sleep(0.5)

    # Capture from both cameras
    front_render = await session.grab_render(key="front-camera")
    top_render = await session.grab_render(key="top-camera")
    # Process renders...
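Since grab_render is awaited, the two captures above run back to back; with many cameras they can instead be issued concurrently via asyncio.gather. A runnable sketch of the pattern, with a stand-in coroutine (fake_grab_render) in place of the real session.grab_render:

```python
import asyncio


# Stand-in for session.grab_render so the pattern runs on its own;
# in a real session, gather the actual grab_render coroutines instead.
async def fake_grab_render(key: str) -> dict:
    await asyncio.sleep(0.01)  # simulate the RPC round-trip
    return {"camera": key, "rgb": b""}


async def capture_all(keys):
    # Issue all grabs concurrently instead of awaiting them one by one
    renders = await asyncio.gather(*(fake_grab_render(k) for k in keys))
    return dict(zip(keys, renders))


results = asyncio.run(capture_all(["front-camera", "top-camera"]))
print(sorted(results))  # ['front-camera', 'top-camera']
```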
Best Practices
- Use ondemand mode - More efficient for programmatic rendering
- Enable depth rendering - Get depth maps without material changes
- Add small delays - Wait for camera updates before grabbing
- Set appropriate resolution - Balance quality and performance
- Use downsampling - Reduce data size when full resolution isn't needed
Performance Considerations
The ondemand approach:
- Minimizes computational overhead
- Only renders when explicitly requested
- Ideal for resource-constrained applications
- Perfect for dataset generation and batch processing
Source
Documentation: https://docs.vuer.ai/en/latest/tutorials/camera/grab_render_virtual_camera.html