Initial commit

Zhongwei Li
2025-11-30 09:05:02 +08:00
commit 265175ed82
23 changed files with 3329 additions and 0 deletions

@@ -0,0 +1,235 @@
# Collecting Render from Virtual Cameras
## Overview
This tutorial covers two methods for capturing rendered images from virtual cameras in Vuer, with a focus on the recommended `ondemand` approach.
## Methods Overview
### Method 1: Frame/Time Mode (Legacy)
Uses event handlers to collect rendered images. Only supported in `stream='frame'` or `stream='time'` mode.
**Limitations:**
- Less backend control
- Continuous rendering even when not needed
- Higher resource usage
### Method 2: OnDemand Mode (Recommended)
Uses the `grab_render` RPC, which waits for the render and returns it directly to the caller. Only available in `stream='ondemand'` mode.
**Advantages:**
- Superior backend control
- Renders only when explicitly requested
- Lower computational overhead
- Support for depth rendering
## Complete Example: OnDemand Mode
```python
import asyncio

import numpy as np
from PIL import Image

from vuer import Vuer, VuerSession
from vuer.schemas import Box, CameraView, DefaultScene, Scene, Urdf

app = Vuer()


@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up the scene
    session.set @ Scene(
        DefaultScene(),
        # Add some objects to render
        Box(
            args=[1, 1, 1],
            position=[0, 0.5, 0],
            color="red",
            key="box",
        ),
        Urdf(
            src="/static/robot.urdf",
            position=[2, 0, 0],
            key="robot",
        ),
        # Configure the camera with ondemand streaming
        CameraView(
            key="ego",
            fov=50,
            width=640,
            height=480,
            position=[0, 2, 5],
            rotation=[0, 0, 0],
            stream="ondemand",
            renderDepth=True,  # Enable depth rendering
            near=0.1,
            far=100,
        ),
    )
    # Wait for the scene to initialize
    await asyncio.sleep(0.5)

    # Capture renders from different positions along a circular orbit
    for i in range(10):
        # Update the camera position
        x = 5 * np.cos(i * 0.2)
        z = 5 * np.sin(i * 0.2)
        session.update @ CameraView(
            key="ego",
            position=[x, 2, z],
            rotation=[0, i * 0.2, 0],
        )
        # Small delay so the camera update reaches the client
        await asyncio.sleep(0.1)

        # Grab the render; grab_render is a coroutine, so await it
        result = await session.grab_render(downsample=1, key="ego")
        if result:
            # Process the RGB image
            rgb_data = result.get("rgb")
            if rgb_data:
                # Convert the raw bytes to a (height, width, 3) array
                img_array = np.frombuffer(rgb_data, dtype=np.uint8)
                img_array = img_array.reshape((480, 640, 3))
                # Save the image
                img = Image.fromarray(img_array)
                img.save(f"render_{i:03d}.png")
                print(f"Saved render_{i:03d}.png")

            # Process the depth map
            depth_data = result.get("depth")
            if depth_data:
                depth_array = np.frombuffer(depth_data, dtype=np.float32)
                depth_array = depth_array.reshape((480, 640))
                # Normalize to [0, 1] before saving as an 8-bit image;
                # raw depth values can exceed 1
                depth_norm = depth_array / max(float(depth_array.max()), 1e-6)
                depth_img = Image.fromarray((depth_norm * 255).astype(np.uint8))
                depth_img.save(f"depth_{i:03d}.png")
                print(f"Saved depth_{i:03d}.png")

    print("Finished capturing renders")
    # Keep the session alive
    while True:
        await asyncio.sleep(1.0)


app.run()
```
## Key API: `grab_render()`
```python
result = await session.grab_render(downsample=1, key="ego")
```
### Parameters
- **downsample**: Downsample factor (1 = no downsampling, 2 = half resolution; see the sketch below)
- **key**: Camera key to capture from
### Returns
Dictionary containing:
- **rgb**: RGB image data as bytes
- **depth**: Depth map data as float32 array (if `renderDepth=True`)
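For example, `downsample=2` returns the render at half the configured resolution in each dimension. A minimal sketch, assuming the `rgb` payload remains raw bytes that scale by the same factor as in the complete example above:
```python
import numpy as np

factor = 2  # half resolution in each dimension
result = await session.grab_render(downsample=factor, key="ego")
if result and result.get("rgb"):
    # A 640x480 camera downsampled by 2 -> 320x240 (assumed scaling)
    img_array = np.frombuffer(result["rgb"], dtype=np.uint8).reshape(
        (480 // factor, 640 // factor, 3)
    )
```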
## Depth Rendering
Enable depth map capture by setting `renderDepth=True`:
```python
CameraView(
    key="ego",
    renderDepth=True,
    stream="ondemand",
    # ... other parameters
)
```
**Benefits** (a depth-processing sketch follows this list):
- Captures depth without changing object materials
- Available since the 2024 update
- Minimal computational overhead
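For downstream use, the raw float32 buffer is often more useful than the normalized PNG saved in the complete example. A minimal sketch of turning it into a displayable map, assuming the buffer holds per-pixel depth shaped `(height, width)`:
```python
import numpy as np

# Assumes result["depth"] is a float32 buffer from a 640x480 camera.
depth = np.frombuffer(result["depth"], dtype=np.float32).reshape((480, 640))

# Clip to the camera's near/far range and normalize to [0, 1] for display
near, far = 0.1, 100.0
depth_vis = (np.clip(depth, near, far) - near) / (far - near)
```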
## Legacy Method: Event Handler
For `stream='frame'` or `stream='time'` mode:
```python
async def handle_camera_view(event, session):
    """Handle CAMERA_VIEW events"""
    if event.key != "ego":
        return
    # Access the rendered image
    image_data = event.value.get("image")
    # Process the image data
    if image_data:
        print(f"Received image: {len(image_data)} bytes")


app.add_handler("CAMERA_VIEW", handle_camera_view)
```
## Multi-Camera Capture
Capture from multiple cameras:
```python
@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up multiple cameras
    session.set @ Scene(
        DefaultScene(),
        CameraView(
            key="front-camera",
            position=[0, 1, 5],
            stream="ondemand",
            width=640,
            height=480,
        ),
        CameraView(
            key="top-camera",
            position=[0, 10, 0],
            rotation=[-1.57, 0, 0],  # look straight down
            stream="ondemand",
            width=640,
            height=480,
        ),
    )
    await asyncio.sleep(0.5)

    # Capture from both cameras
    front_render = await session.grab_render(key="front-camera")
    top_render = await session.grab_render(key="top-camera")
    # Process renders...
```
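The two `grab_render` calls above run back to back. Since each call is a coroutine, a sketch of issuing them concurrently with `asyncio.gather` follows; that the renders can safely overlap is an assumption here, not a documented guarantee:
```python
import asyncio

# Hedged sketch: request both renders at once instead of sequentially.
front_render, top_render = await asyncio.gather(
    session.grab_render(key="front-camera"),
    session.grab_render(key="top-camera"),
)
```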
## Best Practices
1. **Use ondemand mode** - More efficient for programmatic rendering
2. **Enable depth rendering** - Get depth maps without material changes
3. **Add small delays** - Wait for camera updates before grabbing
4. **Set appropriate resolution** - Balance quality and performance
5. **Use downsampling** - Reduce data size when full resolution isn't needed
## Performance Considerations
The ondemand approach:
- Minimizes computational overhead
- Only renders when explicitly requested
- Ideal for resource-constrained applications
- Perfect for dataset generation and batch processing
## Source
Documentation: https://docs.vuer.ai/en/latest/tutorials/camera/grab_render_virtual_camera.html

@@ -0,0 +1,212 @@
# Manipulating Camera Pose in Vuer
## Overview
This tutorial demonstrates how to programmatically control virtual camera positions and orientations in the Vuer framework, and how to track user-initiated camera movements.
## Key Concepts
### CameraView Component
Virtual cameras in Vuer are controlled through the `CameraView` component; its main parameters are listed below, with a minimal configuration sketched after the list:
- **fov**: Field of view in degrees
- **width, height**: Resolution in pixels
- **position**: Camera position `[x, y, z]`
- **rotation**: Camera rotation `[x, y, z]`
- **matrix**: 4x4 transformation matrix (alternative to position/rotation)
- **stream**: Streaming mode (`"time"`, `"frame"`, or `"ondemand"`)
- **fps**: Frame rate for streaming
- **near, far**: Clipping planes
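A minimal configuration touching each parameter above; the values are illustrative, not defaults. Note that `matrix` replaces `position` and `rotation` when both representations are available:
```python
from vuer.schemas import CameraView

camera = CameraView(
    key="ego",           # unique identifier for events and updates
    fov=50,              # field of view in degrees
    width=320,           # resolution in pixels
    height=240,
    position=[0, 2, 5],  # [x, y, z]
    rotation=[0, 0, 0],  # Euler angles in radians
    stream="time",       # or "frame" / "ondemand"
    fps=30,
    near=0.1,            # clipping planes
    far=100,
)
```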
## Complete Example
```python
import asyncio
import pickle

from vuer import Vuer, VuerSession
from vuer.events import ClientEvent
from vuer.schemas import CameraView, DefaultScene, Scene, Urdf
from ml_logger import ML_Logger

# Initialize the logger
logger = ML_Logger(root=".", prefix="assets")

# Load pre-recorded camera matrices
with open("assets/camera_movement.pkl", "rb") as f:
    data = pickle.load(f)
matrices = [item["matrix"] for item in data]

app = Vuer()


# Event handler to track camera movements
async def track_movement(event: ClientEvent, sess: VuerSession):
    """Capture camera movement events"""
    if event.key != "ego":
        return
    logger.log(**event.value, flush=True, silent=True)
    print(f"Camera moved: {event.value['position']}")


app.add_handler("CAMERA_MOVE", track_movement)


@app.spawn(start=True)
async def main(proxy: VuerSession):
    # Set up the scene
    proxy.set @ Scene(
        DefaultScene(),
        # Add a robot for reference
        Urdf(
            src="/static/robot.urdf",
            position=[0, 0, 0],
            key="robot",
        ),
    )
    # Animate the camera through the recorded poses
    for matrix in matrices:
        proxy.update @ [
            CameraView(
                key="ego",
                fov=50,
                width=320,
                height=240,
                matrix=matrix,
                stream="time",
                fps=30,
                near=0.1,
                far=100,
            ),
        ]
        await asyncio.sleep(0.033)  # 30 FPS

    # Keep the session alive
    while True:
        await asyncio.sleep(1.0)


app.run()
```
## Dynamic Camera Control Methods
### Method 1: Using Transformation Matrix
```python
session.update @ CameraView(
    key="ego",
    matrix=[
        [1, 0, 0, x],
        [0, 1, 0, y],
        [0, 0, 1, z],
        [0, 0, 0, 1],
    ],
)
```
### Method 2: Using Position and Rotation
```python
session.update @ CameraView(
    key="ego",
    position=[x, y, z],
    rotation=[rx, ry, rz],  # Euler angles in radians
)
```
### Method 3: Animated Camera Path
```python
import math

radius = 5
for i in range(360):
    theta = math.radians(i)
    # Circular orbit around the scene origin
    x = radius * math.cos(theta)
    z = radius * math.sin(theta)
    session.update @ CameraView(
        key="ego",
        position=[x, 2, z],
        rotation=[0, theta, 0],
    )
    await asyncio.sleep(0.033)  # 30 FPS
```
## Replaying Recorded Movements
Load and replay pre-recorded camera movements:
```python
import pickle

# Load the recorded movements
with open("assets/camera_movement.pkl", "rb") as f:
    movements = pickle.load(f)

# Replay the movements
for movement in movements:
    session.update @ CameraView(
        key="ego",
        matrix=movement["matrix"],
        fov=50,
        width=320,
        height=240,
    )
    await asyncio.sleep(0.033)  # 30 FPS
```
## Event Handling
Track user-initiated camera movements:
```python
async def track_movement(event: ClientEvent, sess: VuerSession):
    """Log user camera movements"""
    if event.key != "ego":
        return
    # Access the camera data
    position = event.value.get("position")
    rotation = event.value.get("rotation")
    matrix = event.value.get("matrix")
    print(f"Position: {position}")
    print(f"Rotation: {rotation}")
    # Save to the logger
    logger.log(**event.value, flush=True, silent=True)


app.add_handler("CAMERA_MOVE", track_movement)
```
## Streaming Modes
### "time" Mode
Continuous streaming at specified FPS:
```python
CameraView(stream="time", fps=30)
```
### "frame" Mode
Stream individual frames on demand.
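A minimal sketch mirroring the `time`-mode snippet above; like `time` mode, `frame` mode delivers its renders through a `CAMERA_VIEW` event handler:
```python
CameraView(key="ego", stream="frame")
```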
### "ondemand" Mode
Only render when explicitly requested (most efficient):
```python
CameraView(stream="ondemand")
```
## Best Practices
1. **Use matrices for complex movements** - More precise than position/rotation
2. **Track user movements** - Enable interactive camera control
3. **Set appropriate FPS** - Balance smoothness and performance
4. **Use clipping planes** - Optimize rendering with near/far settings
5. **Use ondemand mode** - Save resources when continuous streaming isn't needed
## Source
Documentation: https://docs.vuer.ai/en/latest/tutorials/camera/move_camera.html

@@ -0,0 +1,157 @@
# Recording Camera Movements in Vuer
## Overview
This tutorial demonstrates how to capture user camera movements in a Vuer application and save them to a file for later programmatic control.
## Purpose
Record camera movements to produce a camera movement file (`assets/camera_movement.pkl`) that can be used to:
- Replay camera movements
- Control camera movements programmatically
- Analyze user navigation patterns
## Complete Example
```python
import asyncio
import os

from vuer import Vuer, VuerSession
from vuer.events import ClientEvent
from vuer.schemas import CameraView, DefaultScene, Scene
from ml_logger import ML_Logger

# Initialize the logger
logger = ML_Logger(root=os.getcwd(), prefix="assets")

app = Vuer()


# Event handler for camera movements
async def track_movement(event: ClientEvent, sess: VuerSession):
    """Capture and log camera movement events"""
    if event.key != "ego":
        return
    print("camera moved", event.value["matrix"])
    # Save the camera data to a file
    logger.log(**event.value, flush=True, file="camera_movement.pkl")


# Register the event handler
app.add_handler("CAMERA_MOVE", track_movement)


@app.spawn(start=True)
async def main(session: VuerSession):
    # Set up the scene
    session.set @ Scene(
        DefaultScene(),
        # Configure the camera view
        CameraView(
            fov=50,
            width=320,
            height=240,
            position=[0, 2, 5],
            rotation=[0, 0, 0],
            key="ego",
        ),
    )
    # Keep the session alive
    while True:
        await asyncio.sleep(1.0)


app.run()
```
## Key Components
### 1. Event Handler Setup
Create an event listener for camera movement events:
```python
async def track_movement(event: ClientEvent, sess: VuerSession):
    if event.key != "ego":
        return
    print("camera moved", event.value["matrix"])
```
The handler:
- Filters out events from other cameras (returns early when `event.key != "ego"`)
- Accesses movement data via `event.value["matrix"]`
- Can process or log the camera transformation matrix
### 2. Initialize Logger
Uses ML-Logger to persist camera data to disk:
```python
from ml_logger import ML_Logger
logger = ML_Logger(root=os.getcwd(), prefix="assets")
```
### 3. Register Handler
Connect the handler to the app:
```python
app.add_handler("CAMERA_MOVE", track_movement)
```
### 4. Configure Camera View
The scene includes a `CameraView` component with:
- **fov**: Field of view in degrees
- **width, height**: Resolution
- **position**: Initial camera position `[x, y, z]`
- **rotation**: Initial camera rotation `[x, y, z]`
- **key**: Unique identifier (used to filter events)
## Saving Camera Data
Camera movement data is saved using:
```python
logger.log(**event.value, flush=True, file="camera_movement.pkl")
```
This creates a persistent record in `assets/camera_movement.pkl`.
## Data Format
The `event.value` dictionary typically contains the following fields, unpacked in the sketch after this list:
- **matrix**: 4x4 transformation matrix
- **position**: Camera position `[x, y, z]`
- **rotation**: Camera rotation (quaternion or Euler angles)
- **timestamp**: Event timestamp
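A sketch that unpacks these fields inside a hypothetical `inspect_movement` handler; the optional fields use `.get` since the exact payload can vary:
```python
# Hypothetical handler for illustration; register it like track_movement.
async def inspect_movement(event: ClientEvent, sess: VuerSession):
    if event.key != "ego":
        return
    matrix = event.value["matrix"]            # 4x4 transformation matrix
    position = event.value.get("position")    # [x, y, z]
    rotation = event.value.get("rotation")    # quaternion or Euler angles
    timestamp = event.value.get("timestamp")  # event timestamp
    print(timestamp, position, rotation)
```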
## Usage in Subsequent Tutorials
The recorded camera movements can be loaded and replayed:
```python
import pickle

# Load the recorded movements
with open("assets/camera_movement.pkl", "rb") as f:
    movements = pickle.load(f)

# Replay the movements
for movement in movements:
    session.update @ CameraView(
        key="ego",
        matrix=movement["matrix"],
    )
    await asyncio.sleep(0.033)  # 30 FPS
```
## Installation Requirements
```bash
pip install ml-logger
```
## Source
Documentation: https://docs.vuer.ai/en/latest/tutorials/camera/record_camera_movement.html