Initial commit

2025-11-30 09:05:02 +08:00
commit 265175ed82
23 changed files with 3329 additions and 0 deletions
--- a/docs/tutorials/camera/grab-render-virtual-camera.md
+++ b/docs/tutorials/camera/grab-render-virtual-camera.md
@@ -0,0 +1,235 @@
+# Collecting Render from Virtual Cameras
+
+## Overview
+This tutorial covers two methods for capturing rendered images from virtual cameras in Vuer, with a focus on the recommended ondemand approach.
+
+## Methods Overview
+
+### Method 1: Frame/Time Mode (Legacy)
+Uses event handlers to collect rendered images. Only supported in `stream='frame'` or `stream='time'` mode.
+
+**Limitations:**
+- Less backend control
+- Continuous rendering even when not needed
+- Higher resource usage
+
+### Method 2: OnDemand Mode (Recommended)
+Uses the request/response `grab_render` RPC API: each call requests a single render and, once awaited, returns the result. Only available in `stream='ondemand'` mode.
+
+**Advantages:**
+- Superior backend control
+- Renders only when explicitly requested
+- Lower computational overhead
+- Support for depth rendering
+
+## Complete Example: OnDemand Mode
+
+```python
+import asyncio
+import numpy as np
+from vuer import Vuer, VuerSession
+from vuer.schemas import Scene, CameraView, DefaultScene, Box, Urdf
+from PIL import Image
+
+app = Vuer()
+
+@app.spawn(start=True)
+async def main(session: VuerSession):
+    # Set up the scene
+    session.set @ Scene(
+        DefaultScene(),
+
+        # Add some objects to render
+        Box(
+            args=[1, 1, 1],
+            position=[0, 0.5, 0],
+            color="red",
+            key="box",
+        ),
+
+        Urdf(
+            src="/static/robot.urdf",
+            position=[2, 0, 0],
+            key="robot",
+        ),
+
+        # Configure camera with ondemand streaming
+        CameraView(
+            key="ego",
+            fov=50,
+            width=640,
+            height=480,
+            position=[0, 2, 5],
+            rotation=[0, 0, 0],
+            stream="ondemand",
+            renderDepth=True,  # Enable depth rendering
+            near=0.1,
+            far=100,
+        ),
+    )
+
+    # Wait for scene to initialize
+    await asyncio.sleep(0.5)
+
+    # Capture renders from different positions
+    for i in range(10):
+        # Update camera position
+        x = 5 * np.cos(i * 0.2)
+        z = 5 * np.sin(i * 0.2)
+
+        session.update @ CameraView(
+            key="ego",
+            position=[x, 2, z],
+            rotation=[0, i * 0.2, 0],
+        )
+
+        # Small delay for camera update
+        await asyncio.sleep(0.1)
+
+        # Grab the render (grab_render is a coroutine, so it must be awaited)
+        result = await session.grab_render(downsample=1, key="ego")
+
+        if result:
+            # Process RGB image
+            rgb_data = result.get("rgb")
+            if rgb_data:
+                # Convert to a (height, width, 3) array; this shape matches
+                # the 640x480 camera only when downsample=1
+                img_array = np.frombuffer(rgb_data, dtype=np.uint8)
+                img_array = img_array.reshape((480, 640, 3))
+
+                # Save image
+                img = Image.fromarray(img_array)
+                img.save(f"render_{i:03d}.png")
+                print(f"Saved render_{i:03d}.png")
+
+            # Process depth map
+            depth_data = result.get("depth")
+            if depth_data:
+                depth_array = np.frombuffer(depth_data, dtype=np.float32)
+                depth_array = depth_array.reshape((480, 640))
+
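+                # Save depth map as 8-bit grayscale; clipping assumes depth
+                # values lie in [0, 1] and prevents uint8 wraparound otherwise
+                depth_img = Image.fromarray(
+                    (np.clip(depth_array, 0.0, 1.0) * 255).astype(np.uint8)
+                )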
+                depth_img.save(f"depth_{i:03d}.png")
+                print(f"Saved depth_{i:03d}.png")
+
+    print("Finished capturing renders")
+
+    # Keep session alive
+    while True:
+        await asyncio.sleep(1.0)
+
+app.run()
+```
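+
+The camera path computed inside the loop above can be factored into a small helper. The sketch below is plain NumPy and does not touch the Vuer API; `orbit_positions` is a hypothetical name, and its default radius, height, and angular step simply mirror the constants used in the example.
+
+```python
+import numpy as np
+
+
+def orbit_positions(n_frames=10, radius=5.0, height=2.0, step=0.2):
+    """Return an (n_frames, 3) array of [x, y, z] positions on a circle.
+
+    Hypothetical helper: the defaults copy the constants from the
+    example above (radius 5, height 2, 0.2 rad per frame).
+    """
+    angles = np.arange(n_frames) * step
+    x = radius * np.cos(angles)
+    z = radius * np.sin(angles)
+    y = np.full(n_frames, height)
+    return np.stack([x, y, z], axis=1)
+
+
+positions = orbit_positions()
+print(positions.shape)
+```
+
+Each row can be passed as `position=row.tolist()` in the `session.update @ CameraView(...)` call.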
+
+## Key API: `grab_render()`
+
+```python
+# Call from inside an async function; grab_render must be awaited
+result = await session.grab_render(downsample=1, key="ego")
+```
+
+### Parameters
+- **downsample**: Downsample factor (1 = no downsampling, 2 = half resolution)
+- **key**: Camera key to capture from
+
+### Returns
+Dictionary containing:
+- **rgb**: RGB image data as bytes (read as one `uint8` per channel in the example above)
+- **depth**: Depth map data as bytes holding `float32` values (present only if `renderDepth=True`)
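+
+Because the reshape step depends on both the camera resolution and the `downsample` factor, it can help to validate the buffer length before reshaping. The following is a minimal sketch, assuming the `rgb` payload is a raw, unencoded `uint8` buffer as the example above treats it, and assuming the downsampled size is the integer quotient of the camera size. `decode_rgb` is a hypothetical helper, not part of Vuer.
+
+```python
+import numpy as np
+
+
+def decode_rgb(buf, width, height, downsample=1):
+    """Reshape a raw RGB byte buffer into an (h, w, 3) uint8 array.
+
+    Assumption: the buffer is unencoded, row-major, 3 bytes per pixel,
+    and the rendered size is (height // downsample, width // downsample).
+    """
+    h, w = height // downsample, width // downsample
+    expected = h * w * 3
+    if len(buf) != expected:
+        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
+    return np.frombuffer(buf, dtype=np.uint8).reshape(h, w, 3)
+
+
+# Stand-in buffer for a 640x480 camera grabbed with downsample=2
+buf = bytes(240 * 320 * 3)
+img = decode_rgb(buf, width=640, height=480, downsample=2)
+print(img.shape)
+```
+
+A length mismatch raises `ValueError` rather than producing a silently misshapen image.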
+
+## Depth Rendering
+
+Enable depth map capture by setting `renderDepth=True`:
+
+```python
+CameraView(
+    key="ego",
+    renderDepth=True,
+    stream="ondemand",
+    # ... other parameters
+)
+```
+
+**Benefits:**
+- Captures depth without changing object materials
+- Available since 2024 update
+- Minimal computational overhead
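+
+The tutorial does not state the range of the returned depth values. If they are not already normalized, a min-max rescale gives a usable 8-bit preview. The sketch below is plain NumPy; `depth_to_uint8` is a hypothetical helper, and mapping non-finite pixels to white is an arbitrary choice made for illustration.
+
+```python
+import numpy as np
+
+
+def depth_to_uint8(depth):
+    """Min-max normalize a depth map to 0..255 for visualization."""
+    depth = np.asarray(depth, dtype=np.float32)
+    finite = np.isfinite(depth)
+    if not finite.any():
+        return np.zeros(depth.shape, dtype=np.uint8)
+    lo = depth[finite].min()
+    hi = depth[finite].max()
+    scale = hi - lo if hi > lo else 1.0  # avoid division by zero
+    # Non-finite pixels (e.g. background at infinity) map to 1.0 (white)
+    norm = np.where(finite, (depth - lo) / scale, 1.0)
+    return (np.clip(norm, 0.0, 1.0) * 255).astype(np.uint8)
+
+
+preview = depth_to_uint8([[0.5, 1.5], [2.5, np.inf]])
+print(preview)
+```
+
+The result is only a preview: the rescale discards absolute depth, so keep the original `float32` array when metric values matter.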
+
+## Legacy Method: Event Handler
+
+In `stream='frame'` or `stream='time'` mode, register a handler for `CAMERA_VIEW` events:
+
+```python
+async def handle_camera_view(event, session):
+    """Handle CAMERA_VIEW events"""
+    if event.key != "ego":
+        return
+
+    # Access rendered image (skip events that carry no image payload)
+    image_data = event.value.get("image")
+    if image_data is None:
+        return
+
+    # Process image data
+    print(f"Received image: {len(image_data)} bytes")
+
+app.add_handler("CAMERA_VIEW", handle_camera_view)
+```
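+
+Since the legacy modes render continuously, a handler that stores frames may want a bounded buffer so memory does not grow without limit. The sketch below illustrates the idea with a `collections.deque`. `collect_frame` and the `SimpleNamespace` event are hypothetical stand-ins that only mimic the `event.key` and `event.value` fields used in the handler above; no Vuer API is called.
+
+```python
+from collections import deque
+from types import SimpleNamespace
+
+# Keep at most 100 frames; the oldest is dropped once the buffer is full
+frames = deque(maxlen=100)
+
+
+def collect_frame(event, camera_key="ego"):
+    """Store the image payload of a matching event; return True if stored."""
+    if event.key != camera_key:
+        return False
+    image_data = event.value.get("image")
+    if not image_data:
+        return False
+    frames.append(image_data)
+    return True
+
+
+# Stand-in event shaped like the one the handler above receives
+fake_event = SimpleNamespace(key="ego", value={"image": b"\x00" * 12})
+collect_frame(fake_event)
+print(len(frames))
+```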
+
+## Multi-Camera Capture
+
+Give each camera its own `key`, then call `grab_render` once per key:
+
+```python
+@app.spawn(start=True)
+async def main(session: VuerSession):
+    # Set up multiple cameras
+    session.set @ Scene(
+        DefaultScene(),
+
+        CameraView(
+            key="front-camera",
+            position=[0, 1, 5],
+            stream="ondemand",
+            width=640,
+            height=480,
+        ),
+
+        CameraView(
+            key="top-camera",
+            position=[0, 10, 0],
+            rotation=[-1.57, 0, 0],
+            stream="ondemand",
+            width=640,
+            height=480,
+        ),
+    )
+
+    await asyncio.sleep(0.5)
+
+    # Capture from both cameras
+    front_render = await session.grab_render(key="front-camera")
+    top_render = await session.grab_render(key="top-camera")
+
+    # Process renders...
+```
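+
+With more than two cameras, the per-key calls can be collected in a loop. The following is a sketch under stated assumptions: `grab_all` is a hypothetical helper, and `FakeSession` is a stand-in that only imitates an awaitable `grab_render(downsample=..., key=...)` so the snippet runs without a Vuer server. Cameras are grabbed one at a time, because the tutorial does not say whether concurrent requests are supported.
+
+```python
+import asyncio
+
+
+async def grab_all(session, keys, downsample=1):
+    """Grab one render per camera key, sequentially, into a dict."""
+    renders = {}
+    for key in keys:
+        renders[key] = await session.grab_render(downsample=downsample, key=key)
+    return renders
+
+
+class FakeSession:
+    """Stand-in for a session object; NOT the Vuer implementation."""
+
+    async def grab_render(self, downsample=1, key="DEFAULT"):
+        return {"rgb": f"{key}@{downsample}".encode()}
+
+
+renders = asyncio.run(grab_all(FakeSession(), ["front-camera", "top-camera"]))
+print(sorted(renders))
+```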
+
+## Best Practices
+
+1. **Use ondemand mode** - More efficient for programmatic rendering
+2. **Enable depth rendering** - Get depth maps without material changes
+3. **Add small delays** - Wait for camera updates before grabbing
+4. **Set appropriate resolution** - Balance quality and performance
+5. **Use downsampling** - Reduce data size when full resolution isn't needed
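+
+To make the resolution and downsampling trade-off concrete, the payload size per grab can be estimated with simple arithmetic. This sketch assumes raw, uncompressed buffers (3 bytes per RGB pixel, 4 bytes per `float32` depth pixel), as the example code in this tutorial treats them; `frame_bytes` is a hypothetical helper.
+
+```python
+def frame_bytes(width, height, downsample=1, depth=False):
+    """Estimate raw bytes per grab for one camera."""
+    w, h = width // downsample, height // downsample
+    rgb = w * h * 3                      # uint8 RGB
+    depth_bytes = w * h * 4 if depth else 0  # float32 depth
+    return rgb + depth_bytes
+
+
+full = frame_bytes(640, 480)
+half_with_depth = frame_bytes(640, 480, downsample=2, depth=True)
+print(full)             # 921600
+print(half_with_depth)  # 537600
+```
+
+Under these assumptions, halving the resolution cuts each buffer to a quarter of its size.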
+
+## Performance Considerations
+
+The ondemand approach:
+- Minimizes computational overhead
+- Only renders when explicitly requested
+- Ideal for resource-constrained applications
+- Perfect for dataset generation and batch processing
+
+## Source
+
+Documentation: https://docs.vuer.ai/en/latest/tutorials/camera/grab_render_virtual_camera.html