
Collecting Renders from Virtual Cameras

Overview

This tutorial covers two methods for capturing rendered images from virtual cameras in Vuer, with a focus on the recommended ondemand approach.

Methods Overview

Method 1: Frame/Time Mode (Legacy)

Uses event handlers to collect rendered images. Only supported in stream='frame' or stream='time' mode.

Limitations:

  • Less backend control
  • Continuous rendering even when not needed
  • Higher resource usage

Method 2: OnDemand Mode (Recommended)

Uses the request-response grab_render RPC API: each awaited call returns a single render. Only available in stream='ondemand' mode.

Advantages:

  • Superior backend control
  • Renders only when explicitly requested
  • Lower computational overhead
  • Support for depth rendering

Complete Example: OnDemand Mode

import asyncio
import numpy as np
from vuer import Vuer, VuerSession
from vuer.schemas import CameraView, DefaultScene, Box, Urdf
from PIL import Image

app = Vuer()

@app.spawn(start=True)  # start=True launches the server, so no separate app.run() is needed
async def main(session: VuerSession):
    # Set up the scene; DefaultScene provides default lighting and a ground grid
    session.set @ DefaultScene(

        # Add some objects to render
        Box(
            args=[1, 1, 1],
            position=[0, 0.5, 0],
            color="red",
            key="box",
        ),

        Urdf(
            src="/static/robot.urdf",
            position=[2, 0, 0],
            key="robot",
        ),

        # Configure camera with ondemand streaming
        CameraView(
            key="ego",
            fov=50,
            width=640,
            height=480,
            position=[0, 2, 5],
            rotation=[0, 0, 0],
            stream="ondemand",
            renderDepth=True,  # Enable depth rendering
            near=0.1,
            far=100,
        ),
    )

    # Wait for scene to initialize
    await asyncio.sleep(0.5)

    # Capture renders from different positions
    for i in range(10):
        # Update camera position
        x = 5 * np.cos(i * 0.2)
        z = 5 * np.sin(i * 0.2)

        session.update @ CameraView(
            key="ego",
            position=[x, 2, z],
            rotation=[0, i * 0.2, 0],
        )

        # Small delay for camera update
        await asyncio.sleep(0.1)

        # Grab the render
        result = await session.grab_render(downsample=1, key="ego")

        if result:
            # Process RGB image
            rgb_data = result.get("rgb")
            if rgb_data:
                # Convert to numpy array; shape is (height, width, 3),
                # matching the CameraView resolution configured above
                img_array = np.frombuffer(rgb_data, dtype=np.uint8)
                img_array = img_array.reshape((480, 640, 3))

                # Save image
                img = Image.fromarray(img_array)
                img.save(f"render_{i:03d}.png")
                print(f"Saved render_{i:03d}.png")

            # Process depth map
            depth_data = result.get("depth")
            if depth_data:
                depth_array = np.frombuffer(depth_data, dtype=np.float32)
                depth_array = depth_array.reshape((480, 640))

                # Save depth map, normalized by its max for visualization
                # (works whether depth is metric or already in 0-1)
                depth_img = Image.fromarray(
                    (depth_array / depth_array.max() * 255).astype(np.uint8)
                )
                depth_img.save(f"depth_{i:03d}.png")
                print(f"Saved depth_{i:03d}.png")

    print("Finished capturing renders")

    # Keep session alive
    while True:
        await asyncio.sleep(1.0)


Key API: grab_render()

result = await session.grab_render(downsample=1, key="ego")

Parameters

  • downsample: Downsample factor (1 = no downsampling, 2 = half resolution)
  • key: Camera key to capture from

Returns

Dictionary containing:

  • rgb: RGB image data as bytes
  • depth: Depth map data as float32 array (if renderDepth=True)
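
For example, assuming the rgb payload is raw, row-major RGB bytes and the depth payload is raw float32 bytes (as described above), a small decoding helper might look like the sketch below. The function name and its defaults are illustrative, not part of the Vuer API; note that the buffer dimensions shrink by the downsample factor:

import numpy as np

def decode_render(result, width=640, height=480, downsample=1):
    """Decode a grab_render result dict into numpy arrays (sketch)."""
    w, h = width // downsample, height // downsample

    rgb = None
    if result.get("rgb"):
        rgb = np.frombuffer(result["rgb"], dtype=np.uint8).reshape((h, w, 3))

    depth = None
    if result.get("depth"):
        depth = np.frombuffer(result["depth"], dtype=np.float32).reshape((h, w))

    return rgb, depth

rgb, depth = decode_render(
    await session.grab_render(downsample=2, key="ego"), downsample=2
)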

Depth Rendering

Enable depth map capture by setting renderDepth=True:

CameraView(
    key="ego",
    renderDepth=True,
    stream="ondemand",
    # ... other parameters
)

Benefits:

  • Captures depth without changing object materials
  • Available since 2024 update
  • Minimal computational overhead
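
If the captured depth values are metric distances along the camera's view axis (an assumption; the Returns section above only specifies float32 data), a pinhole back-projection using the camera's fov converts a depth map into a camera-space point cloud. A minimal sketch, not part of the Vuer API:

import numpy as np

def depth_to_points(depth, fov_deg=50.0):
    """Back-project a (H, W) metric depth map into camera-space points.

    Assumes a pinhole model with square pixels, where fov_deg is the
    vertical field of view (matching the CameraView fov above).
    """
    h, w = depth.shape
    # Focal length in pixels, derived from the vertical field of view
    f = (h / 2.0) / np.tan(np.deg2rad(fov_deg) / 2.0)

    # Pixel grid centered on the principal point
    u, v = np.meshgrid(np.arange(w) - w / 2.0, np.arange(h) - h / 2.0)

    # Unproject: x right, y down, z forward (image convention)
    return np.stack([depth * u / f, depth * v / f, depth], axis=-1)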

Legacy Method: Event Handler

For stream='frame' or stream='time' mode:

@app.add_handler("CAMERA_VIEW")
async def handle_camera_view(event, session):
    """Handle CAMERA_VIEW events emitted for each streamed frame."""
    if event.key != "ego":
        return

    # Access rendered image
    image_data = event.value.get("image")

    # Process image data
    print(f"Received image: {len(image_data)} bytes")

Multi-Camera Capture

Capture from multiple cameras:

@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up multiple cameras; DefaultScene provides lighting and a ground grid
    session.set @ DefaultScene(

        CameraView(
            key="front-camera",
            position=[0, 1, 5],
            stream="ondemand",
            width=640,
            height=480,
        ),

        CameraView(
            key="top-camera",
            position=[0, 10, 0],
            rotation=[-1.57, 0, 0],
            stream="ondemand",
            width=640,
            height=480,
        ),
    )

    await asyncio.sleep(0.5)

    # Capture from both cameras
    front_render = await session.grab_render(key="front-camera")
    top_render = await session.grab_render(key="top-camera")

    # Process renders...
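
Awaiting grab_render once per camera serializes the captures. If the backend can service overlapping requests (an assumption worth verifying for your setup), asyncio.gather issues them concurrently:

# Sketch: capture both cameras concurrently instead of sequentially.
# Assumes grab_render requests may overlap in flight.
front_render, top_render = await asyncio.gather(
    session.grab_render(key="front-camera"),
    session.grab_render(key="top-camera"),
)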

Best Practices

  1. Use ondemand mode - More efficient for programmatic rendering
  2. Enable depth rendering - Get depth maps without material changes
  3. Add small delays - Wait for camera updates before grabbing (see the helper sketch after this list)
  4. Set appropriate resolution - Balance quality and performance
  5. Use downsampling - Reduce data size when full resolution isn't needed
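
A hypothetical helper that bundles practices 1, 3, and 5, following the move-wait-grab pattern from the complete example above (the name capture_frame and the settle_time parameter are illustrative, not Vuer API):

async def capture_frame(session, key, position, rotation,
                        settle_time=0.1, downsample=1):
    """Move a camera, wait for the update to apply, then grab the render."""
    session.update @ CameraView(key=key, position=position, rotation=rotation)
    await asyncio.sleep(settle_time)  # practice 3: let the camera update land
    return await session.grab_render(downsample=downsample, key=key)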

Performance Considerations

The ondemand approach:

  • Minimizes computational overhead
  • Only renders when explicitly requested
  • Ideal for resource-constrained applications
  • Perfect for dataset generation and batch processing

Source

Documentation: https://docs.vuer.ai/en/latest/tutorials/camera/grab_render_virtual_camera.html