Initial commit

references/code-execution-patterns.md (new file, 481 lines)

# Code Execution Patterns

Complete guide to using code execution with Google Gemini API for computational tasks, data analysis, and problem-solving.

---

## What is Code Execution?

Code Execution allows Gemini models to generate and execute Python code to solve problems requiring computation, enabling the model to:
- Perform precise mathematical calculations
- Analyze data with pandas/numpy
- Generate charts and visualizations
- Implement algorithms
- Process files and data structures

---

## How It Works

1. **Model receives prompt** requiring computation
2. **Model generates Python code** to solve the problem
3. **Code executes in sandbox** (secure, isolated environment)
4. **Results return to model** for incorporation into the response
5. **Model explains results** in natural language

---

## Enabling Code Execution

### Basic Setup (SDK)

```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // Or gemini-2.5-pro
  contents: 'Calculate the sum of first 50 prime numbers',
  config: {
    tools: [{ codeExecution: {} }] // Enable code execution
  }
});
```

### Basic Setup (Fetch)

```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      tools: [{ code_execution: {} }],
      contents: [{ parts: [{ text: 'Calculate...' }] }]
    }),
  }
);
```

---

## Available Python Packages

### Standard Library
- `math`, `statistics`, `random`
- `datetime`, `time`, `calendar`
- `json`, `csv`, `re`
- `collections`, `itertools`, `functools`

### Data Science
- `numpy` - numerical computing
- `pandas` - data analysis and manipulation
- `scipy` - scientific computing

### Visualization
- `matplotlib` - plotting and charts
- `seaborn` - statistical visualization

**Note**: This is a **limited sandbox environment** - not all PyPI packages are available.

---

## Response Structure

### Parsing Code Execution Results

```typescript
for (const part of response.candidates[0].content.parts) {
  // Inline text
  if (part.text) {
    console.log('Text:', part.text);
  }

  // Generated code
  if (part.executableCode) {
    console.log('Language:', part.executableCode.language); // "PYTHON"
    console.log('Code:', part.executableCode.code);
  }

  // Execution results
  if (part.codeExecutionResult) {
    console.log('Outcome:', part.codeExecutionResult.outcome); // "OUTCOME_OK" or "OUTCOME_FAILED"
    console.log('Output:', part.codeExecutionResult.output);
  }
}
```

### Example Response

```json
{
  "candidates": [{
    "content": {
      "parts": [
        { "text": "I'll calculate that for you." },
        {
          "executableCode": {
            "language": "PYTHON",
            "code": "primes = []\nnum = 2\nwhile len(primes) < 50:\n    if is_prime(num):\n        primes.append(num)\n    num += 1\nprint(sum(primes))"
          }
        },
        {
          "codeExecutionResult": {
            "outcome": "OUTCOME_OK",
            "output": "5117\n"
          }
        },
        { "text": "The sum is 5117." }
      ]
    }
  }]
}
```

---

## Common Patterns

### 1. Mathematical Calculations

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Calculate the 100th Fibonacci number',
  config: { tools: [{ codeExecution: {} }] }
});
```

**Prompting Tip**: Use phrases like "generate and run code" or "calculate using code" to explicitly request code execution.

### 2. Data Analysis

```typescript
const prompt = `
Analyze this sales data:

month,revenue,customers
Jan,50000,120
Feb,62000,145
Mar,58000,138

Calculate:
1. Total revenue
2. Average revenue per customer
3. Month-over-month growth rate

Use pandas or numpy for analysis.
`;

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt,
  config: { tools: [{ codeExecution: {} }] }
});
```

### 3. Chart Generation

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Create a bar chart showing prime number distribution by last digit (0-9) for primes under 100',
  config: { tools: [{ codeExecution: {} }] }
});
```

**Note**: Generated charts are typically returned as separate inline-image parts (`part.inlineData`, base64-encoded) rather than as text in `codeExecutionResult.output`, so check both when extracting results.
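
A minimal extraction sketch, under the assumption that charts arrive as `inlineData` parts (the file-name scheme is illustrative):

```typescript
import fs from 'fs';

// Save any inline images (e.g. matplotlib charts) found among the response parts.
let imageIndex = 0;
for (const part of response.candidates[0].content.parts) {
  if (part.inlineData?.data) {
    const bytes = Buffer.from(part.inlineData.data, 'base64');
    fs.writeFileSync(`chart-${imageIndex++}.png`, bytes);
  }
}
```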

### 4. Algorithm Implementation

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Implement quicksort and sort this list: [64, 34, 25, 12, 22, 11, 90]. Show the sorted result.',
  config: { tools: [{ codeExecution: {} }] }
});
```

### 5. File Processing (In-Memory)

```typescript
const csvData = `name,age,city
Alice,30,NYC
Bob,25,LA
Charlie,35,Chicago`;

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `Parse this CSV data and calculate average age:\n\n${csvData}`,
  config: { tools: [{ codeExecution: {} }] }
});
```

---

## Chat with Code Execution

### Multi-Turn Computational Conversations

```typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: { tools: [{ codeExecution: {} }] }
});

// First turn
let response = await chat.sendMessage({ message: 'I have a data analysis question' });
console.log(response.text);

// Second turn (will use code execution)
response = await chat.sendMessage({
  message: `
Calculate statistics for: [12, 15, 18, 22, 25, 28, 30]
- Mean
- Median
- Standard deviation
`
});

for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Code:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('Results:', part.codeExecutionResult.output);
}
```

---

## Error Handling

### Checking Execution Outcome

```typescript
for (const part of response.candidates[0].content.parts) {
  if (part.codeExecutionResult) {
    if (part.codeExecutionResult.outcome === 'OUTCOME_OK') {
      console.log('✅ Success:', part.codeExecutionResult.output);
    } else if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
      console.error('❌ Execution failed:', part.codeExecutionResult.output);
    }
  }
}
```

### Common Execution Errors

**Timeout**:
```
Error: Execution timed out after 30 seconds
```
**Solution**: Simplify the computation or reduce the data size.

**Import Error**:
```
ModuleNotFoundError: No module named 'requests'
```
**Solution**: Use only available packages (numpy, pandas, matplotlib, seaborn, scipy).

**Syntax Error**:
```
SyntaxError: invalid syntax
```
**Solution**: The model generated invalid code - rephrase the prompt or regenerate; a retry helper is sketched below.
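
A minimal retry sketch, assuming one corrective attempt is worth making after an `OUTCOME_FAILED` run (`runWithRetry` is an illustrative name, not an SDK API):

```typescript
async function runWithRetry(prompt: string, maxAttempts = 2) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await ai.models.generateContent({
      model: 'gemini-2.5-flash',
      contents: attempt === 1
        ? prompt
        : `${prompt}\n\nThe previous attempt failed to execute. Generate corrected code.`,
      config: { tools: [{ codeExecution: {} }] }
    });

    // Succeed unless some execution part reports a failure
    const failed = response.candidates[0].content.parts.some(
      p => p.codeExecutionResult?.outcome === 'OUTCOME_FAILED'
    );
    if (!failed) return response;
  }
  throw new Error('Code execution failed after retries');
}
```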

---

## Best Practices

### ✅ Do

1. **Be Explicit**: Use phrases like "generate and run code" to trigger code execution
2. **Provide Data**: Include data directly in the prompt for analysis
3. **Specify Output**: Ask for specific calculations or metrics
4. **Use Available Packages**: Stick to numpy, pandas, matplotlib, scipy
5. **Check Outcome**: Always verify `outcome === 'OUTCOME_OK'`

### ❌ Don't

1. **Network Access**: Code cannot make HTTP requests
2. **File System**: No persistent file storage between executions
3. **Long Computations**: Timeout limits apply (~30 seconds)
4. **External Dependencies**: Can't install new packages
5. **State Persistence**: Each execution is isolated (no global state)

---

## Limitations

### Sandbox Restrictions

- **No Network Access**: Cannot call external APIs
- **No File I/O**: Cannot read/write to disk (in-memory only)
- **Limited Packages**: Only pre-installed packages available
- **Execution Timeout**: ~30 seconds maximum
- **No State**: Each execution is independent

### Supported Models

✅ **Works with**:
- `gemini-2.5-pro`
- `gemini-2.5-flash`

❌ **Does NOT work with**:
- `gemini-2.5-flash-lite` (no code execution support)
- Gemini 1.5 models (use Gemini 2.5)

---

## Advanced Patterns

### Iterative Analysis

```typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: { tools: [{ codeExecution: {} }] }
});

// Step 1: Initial analysis
let response = await chat.sendMessage({ message: 'Analyze data: [10, 20, 30, 40, 50]' });

// Step 2: Follow-up based on results
response = await chat.sendMessage({ message: 'Now calculate the variance' });

// Step 3: Visualization
response = await chat.sendMessage({ message: 'Create a histogram of this data' });
```

### Combining with Function Calling

```typescript
const weatherFunction = {
  name: 'get_current_weather',
  description: 'Get weather for a city',
  parametersJsonSchema: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city']
  }
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Get weather for NYC, LA, Chicago. Calculate the average temperature.',
  config: {
    tools: [
      { functionDeclarations: [weatherFunction] },
      { codeExecution: {} }
    ]
  }
});

// Model will:
// 1. Call get_current_weather for each city
// 2. Generate code to calculate the average
// 3. Return the result
```

### Data Transformation Pipeline

```typescript
const prompt = `
Transform this data:
Input: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Pipeline:
1. Filter odd numbers
2. Square each number
3. Calculate sum
4. Return result

Use code to process.
`;

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt,
  config: { tools: [{ codeExecution: {} }] }
});
```

---

## Optimization Tips

### 1. Clear Instructions

**❌ Vague**:
```typescript
contents: 'Analyze this data'
```

**✅ Specific**:
```typescript
contents: 'Calculate mean, median, and standard deviation for: [12, 15, 18, 22, 25]'
```

### 2. Provide Complete Data

```typescript
const csvData = `...complete dataset...`;
const prompt = `Analyze this CSV data:\n\n${csvData}\n\nCalculate total revenue.`;
```

### 3. Request Code Explicitly

```typescript
contents: 'Generate and run code to calculate the factorial of 20'
```

### 4. Handle Large Datasets

For large data, consider:
- Sampling (analyze a subset)
- Aggregation (group by categories)
- Pagination (process in chunks)

---

## Troubleshooting

### Code Not Executing

**Symptom**: Response has text but no `executableCode`

**Causes**:
1. Code execution not enabled (`tools: [{ codeExecution: {} }]`)
2. Model decided code wasn't necessary
3. Using `gemini-2.5-flash-lite` (doesn't support code execution)

**Solution**: Be explicit in the prompt: "Use code to calculate..."

### Timeout Errors

**Symptom**: `OUTCOME_FAILED` with a timeout message

**Causes**: Computation too complex or data too large

**Solution**:
- Simplify the algorithm
- Reduce the data size
- Use a more efficient approach

### Import Errors

**Symptom**: `ModuleNotFoundError`

**Causes**: Trying to import an unavailable package

**Solution**: Use only available packages (numpy, pandas, matplotlib, seaborn, scipy)

---

## References

- Official Docs: https://ai.google.dev/gemini-api/docs/code-execution
- Templates: See `code-execution.ts` for working examples
- Available Packages: See "Available Python Packages" section above

references/context-caching-guide.md (new file, 373 lines)

# Context Caching Guide

Complete guide to using context caching with Google Gemini API to reduce costs by up to 90%.

---

## What is Context Caching?

Context caching allows you to cache frequently used content (system instructions, large documents, videos) and reuse it across multiple requests, significantly reducing token costs and improving latency.

---

## How It Works

1. **Create a cache** with your repeated content (documents, videos, system instructions)
2. **Set TTL** (time-to-live) for cache expiration
3. **Reference the cache** in subsequent API calls
4. **Pay less** - cached tokens cost ~90% less than regular input tokens

---

## Benefits

### Cost Savings
- **Cached input tokens**: ~90% cheaper than regular tokens
- **Output tokens**: Same price (not cached)
- **Example**: 100K token document cached → ~10K token cost equivalent

### Performance
- **Reduced latency**: Cached content is preprocessed
- **Faster responses**: No need to reprocess large context
- **Consistent results**: Same context every time

### Use Cases
- Large documents analyzed repeatedly
- Long system instructions used across sessions
- Video/audio files queried multiple times
- Consistent conversation context

---

## Cache Creation

### Basic Cache (SDK)

```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash-001', // Must use explicit version!
  config: {
    displayName: 'my-cache',
    systemInstruction: 'You are a helpful assistant.',
    contents: 'Large document content here...',
    ttl: '3600s', // 1 hour
  }
});
```

### Cache with Expiration Time

```typescript
// Set a specific expiration time as an RFC 3339 timestamp
const expirationTime = new Date(Date.now() + 2 * 60 * 60 * 1000); // 2 hours from now

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash-001',
  config: {
    displayName: 'my-cache',
    contents: documentText,
    expireTime: expirationTime.toISOString(), // Use expireTime instead of ttl
  }
});
```

---

## TTL (Time-To-Live) Guidelines

### Recommended TTL Values

| Use Case | TTL | Reason |
|----------|-----|--------|
| Quick analysis session | 300s (5 min) | Short-lived tasks |
| Extended conversation | 3600s (1 hour) | Standard session length |
| Daily batch processing | 86400s (24 hours) | Reuse across the day |
| Long-term analysis | 604800s (7 days) | Maximum allowed |

### TTL vs Expiration Time

**TTL (time-to-live)**:
- Relative duration from cache creation
- Format: `"3600s"` (string with 's' suffix)
- Easy for session-based caching

**Expiration Time**:
- Absolute timestamp
- RFC 3339 / ISO 8601 string (e.g. from `Date.prototype.toISOString()`)
- Precise control over cache lifetime
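
For example, the same two-hour lifetime expressed both ways (pass exactly one of the two fields):

```typescript
const ttl = `${2 * 60 * 60}s`; // relative: "7200s"
const expireTime = new Date(Date.now() + 2 * 60 * 60 * 1000).toISOString(); // absolute RFC 3339

// Use either config: { ..., ttl } or config: { ..., expireTime }
```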

---

## Using a Cache

### Generate Content with Cache (SDK)

```typescript
// Use the cache name as the model parameter
const response = await ai.models.generateContent({
  model: cache.name, // Use cache.name, not the original model name
  contents: 'Summarize the document'
});

console.log(response.text);
```

### Multiple Queries with Same Cache

```typescript
const queries = [
  'What are the key points?',
  'Who are the main characters?',
  'What is the conclusion?'
];

for (const query of queries) {
  const response = await ai.models.generateContent({
    model: cache.name,
    contents: query
  });
  console.log(`Q: ${query}`);
  console.log(`A: ${response.text}\n`);
}
```

---

## Cache Management

### Update Cache TTL

```typescript
// Extend the cache lifetime before it expires
await ai.caches.update({
  name: cache.name,
  config: {
    ttl: '7200s' // Extend to 2 hours
  }
});
```

### List All Caches

```typescript
const pager = await ai.caches.list();
for await (const cache of pager) {
  console.log(`${cache.displayName}: ${cache.name}`);
  console.log(`Expires: ${cache.expireTime}`);
}
```

### Delete Cache

```typescript
// Delete when no longer needed
await ai.caches.delete({ name: cache.name });
```

---

## Advanced Use Cases

### Caching Video Files

```typescript
// 1. Upload the video
let videoFile = await ai.files.upload({
  file: './video.mp4'
});

// 2. Wait for processing
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}

// 3. Create a cache with the video
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash-001',
  config: {
    displayName: 'video-cache',
    systemInstruction: 'Analyze this video.',
    contents: [videoFile],
    ttl: '600s'
  }
});

// 4. Query the video multiple times
const response1 = await ai.models.generateContent({
  model: cache.name,
  contents: 'What happens in the first minute?'
});

const response2 = await ai.models.generateContent({
  model: cache.name,
  contents: 'Who are the main people?'
});
```

### Caching with System Instructions

```typescript
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash-001',
  config: {
    displayName: 'legal-expert-cache',
    systemInstruction: `
      You are a legal expert specializing in contract law.
      Always cite relevant sections when making claims.
      Use clear, professional language.
    `,
    contents: largeContractDocument,
    ttl: '3600s'
  }
});

// The system instruction is part of the cached context
const response = await ai.models.generateContent({
  model: cache.name,
  contents: 'Is this contract enforceable?'
});
```

---

## Important Notes

### Model Version Requirement

**⚠️ You MUST use explicit version suffixes when creating caches:**

```typescript
// ✅ CORRECT
model: 'gemini-2.5-flash-001'

// ❌ WRONG (will fail)
model: 'gemini-2.5-flash'
```

### Cache Expiration

- Caches are **automatically deleted** after the TTL expires
- **Cannot recover** expired caches - they must be recreated
- Update the TTL **before expiration** to extend the lifetime

### Cost Calculation

```
Regular request: 100,000 input tokens = 100K token cost

With caching (after cache creation):
- Cached tokens: 100,000 × 0.1 (90% discount) = 10K equivalent cost
- New tokens: 1,000 × 1.0 = 1K cost
- Total: 11K equivalent (~89% savings)
```
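
The same arithmetic as a small helper (the 0.1 multiplier approximates the discount; it is not an exact published rate):

```typescript
function cachedCostEquivalent(cachedTokens: number, newTokens: number): number {
  const CACHED_MULTIPLIER = 0.1; // ~90% discount on cached input tokens
  return cachedTokens * CACHED_MULTIPLIER + newTokens;
}

cachedCostEquivalent(100_000, 1_000); // => 11000 token-equivalents (~89% savings)
```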

### Limitations

- Maximum TTL: 7 days (604800s)
- Cache creation costs the same as regular tokens (first time only)
- Subsequent uses get the ~90% discount
- Only input tokens are cached (output tokens are never cached)

---

## Best Practices

### When to Use Caching

✅ **Good Use Cases:**
- Large documents queried repeatedly (legal docs, research papers)
- Video/audio files analyzed with different questions
- Long system instructions used across many requests
- Consistent context in multi-turn conversations

❌ **Bad Use Cases:**
- Single-use content (no benefit)
- Frequently changing content
- Short content (<1000 tokens) - minimal savings
- Content used only once per day (the cache might expire first)

### Optimization Tips

1. **Cache Early**: Create the cache at session start
2. **Extend TTL**: Update before expiration if still needed
3. **Monitor Usage**: Track how often the cache is reused
4. **Clean Up**: Delete unused caches to avoid clutter
5. **Combine Features**: Use caching with code execution or grounding for powerful workflows

### Cache Naming

Use a descriptive `displayName` for easy identification:

```typescript
// ✅ Good names
displayName: 'financial-report-2024-q3'
displayName: 'legal-contract-acme-corp'
displayName: 'video-analysis-project-x'

// ❌ Vague names
displayName: 'cache1'
displayName: 'test'
```

---

## Troubleshooting

### "Invalid model name" Error

**Problem**: Using `gemini-2.5-flash` instead of `gemini-2.5-flash-001`

**Solution**: Always use an explicit version suffix:

```typescript
model: 'gemini-2.5-flash-001' // Correct
```

### Cache Expired Error

**Problem**: Trying to use a cache after its TTL expired

**Solution**: Check expiration before use, or extend the TTL proactively:

```typescript
let cache = await ai.caches.get({ name: cacheName });
if (new Date(cache.expireTime) < new Date()) {
  // Cache expired, recreate it
  cache = await ai.caches.create({ ... });
}
```

### High Costs Despite Caching

**Problem**: Creating a new cache for each request

**Solution**: Reuse the same cache across multiple requests:

```typescript
// ❌ Wrong - creates a new cache each time
for (const query of queries) {
  const cache = await ai.caches.create({ ... }); // Expensive!
  const response = await ai.models.generateContent({ model: cache.name, ... });
}

// ✅ Correct - create once, use many times
const cache = await ai.caches.create({ ... }); // Create once
for (const query of queries) {
  const response = await ai.models.generateContent({ model: cache.name, ... });
}
```

---

## References

- Official Docs: https://ai.google.dev/gemini-api/docs/caching
- Cost Optimization: See "Cost Optimization" in main SKILL.md
- Templates: See `context-caching.ts` for working examples

references/function-calling-patterns.md (new file, 59 lines)

# Function Calling Patterns

Complete guide to implementing function calling (tool use) with the Gemini API.

---

## Basic Pattern

1. Define function declarations
2. Send the request with tools
3. Check if the model wants to call functions
4. Execute the functions
5. Send the results back to the model
6. Get the final response (sketched end-to-end below)
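
A minimal sketch of the six steps using `@google/genai`; the hard-coded weather result stands in for a real implementation, and rebuilding the conversation by hand is one common pattern, not the only one:

```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// 1. Define the function declaration
const weatherFunction = {
  name: 'get_current_weather',
  description: 'Get weather for a city',
  parametersJsonSchema: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city']
  }
};

// 2. Send the request with tools
const prompt = 'What is the weather in Paris?';
const first = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt,
  config: { tools: [{ functionDeclarations: [weatherFunction] }] }
});

// 3. Check whether the model requested a call
const call = first.functionCalls?.[0];
if (call) {
  // 4. Execute the function (hard-coded stand-in for a real lookup)
  const result = { temperature: 18, unit: 'C' };

  // 5. Send the result back to the model
  const second = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [
      { role: 'user', parts: [{ text: prompt }] },
      { role: 'model', parts: [{ functionCall: call }] },
      { role: 'user', parts: [{ functionResponse: { name: call.name, response: result } }] }
    ],
    config: { tools: [{ functionDeclarations: [weatherFunction] }] }
  });

  // 6. Final natural-language response
  console.log(second.text);
}
```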

---

## Function Declaration Schema

```typescript
{
  name: string,              // Function name (no spaces)
  description: string,       // What the function does
  parametersJsonSchema: {    // Subset of OpenAPI schema
    type: 'object',
    properties: {
      [paramName]: {
        type: string,        // 'string' | 'number' | 'boolean' | 'array' | 'object'
        description: string, // Parameter description
        enum?: string[]      // Optional: allowed values
      }
    },
    required: string[]       // Required parameter names
  }
}
```

---

## Calling Modes

- **AUTO** (default): Model decides when to call
- **ANY**: Force at least one function call
- **NONE**: Disable function calling

A configuration sketch follows below.
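
The mode is set via `toolConfig` alongside the tools; a sketch reusing the `weatherFunction` declaration from the example above:

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather in Paris?',
  config: {
    tools: [{ functionDeclarations: [weatherFunction] }],
    toolConfig: {
      functionCallingConfig: {
        mode: 'ANY',                                  // force at least one call
        allowedFunctionNames: ['get_current_weather'] // optional allowlist
      }
    }
  }
});
```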

---

## Parallel vs Compositional

**Parallel**: Independent functions run simultaneously.

**Compositional**: Sequential dependencies (A → B → C).

Gemini automatically detects which pattern to use.

---

## Official Docs

https://ai.google.dev/gemini-api/docs/function-calling

references/generation-config.md (new file, 57 lines)

# Generation Configuration Reference

Complete reference for all generation parameters.

---

## All Parameters

```typescript
config: {
  temperature: number,       // 0.0-2.0 (default: 1.0)
  topP: number,              // 0.0-1.0 (default: 0.95)
  topK: number,              // 1-100+ (default: 40)
  maxOutputTokens: number,   // 1-65536
  stopSequences: string[],   // Stop at these strings
  responseMimeType: string,  // 'text/plain' | 'application/json'
  candidateCount: number,    // Usually 1
  thinkingConfig: {
    thinkingBudget: number   // Max thinking tokens
  }
}
```

---

## Parameter Guidelines

### temperature
- **0.0**: Deterministic, focused
- **1.0**: Balanced (default)
- **2.0**: Very creative, random

### topP (nucleus sampling)
- **0.95**: Default, good balance
- Lower = more focused

### topK
- **40**: Default
- Higher = more diversity

### maxOutputTokens
- Always set this to prevent excessive generation
- Max: 65,536 tokens

---

## Use Cases

- **Factual tasks**: temperature=0.0, topP=0.8
- **Creative tasks**: temperature=1.2, topP=0.95
- **Code generation**: temperature=0.3, topP=0.9
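
For instance, the factual-task profile plugged into a request (model name and prompt are illustrative):

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'List the planets of the solar system in order from the Sun.',
  config: {
    temperature: 0.0, // deterministic
    topP: 0.8,        // focused sampling
    maxOutputTokens: 256
  }
});
```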

---

## Official Docs

https://ai.google.dev/gemini-api/docs/models/generative-models#model-parameters

references/grounding-guide.md (new file, 602 lines)

# Grounding with Google Search Guide

Complete guide to using grounding with Google Search to connect Gemini models to real-time web information, reducing hallucinations and providing verifiable, up-to-date responses.

---

## What is Grounding?

Grounding connects the Gemini model to Google Search, allowing it to:
- Access real-time information beyond the training cutoff
- Reduce hallucinations with fact-checked web sources
- Provide citations and source URLs
- Answer questions about current events
- Verify information against the web

---

## How It Works

1. **Model receives query** (e.g., "Who won Euro 2024?")
2. **Model determines** if current information is needed
3. **Performs Google Search** automatically
4. **Processes search results** (web pages, snippets)
5. **Incorporates findings** into the response
6. **Provides citations** with source URLs

---

## Two Grounding APIs

### 1. Google Search (`googleSearch`) - Recommended for Gemini 2.5

**Simple, automatic grounding**:

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Who won the euro 2024?',
  config: {
    tools: [{ googleSearch: {} }]
  }
});
```

**Features**:
- Simple configuration (empty object)
- Automatic search when the model needs current info
- Available on all Gemini 2.5 models
- Recommended for new projects

### 2. Google Search Retrieval (`googleSearchRetrieval`) - Legacy for Gemini 1.5

**Dynamic threshold control**:

```typescript
import { DynamicRetrievalConfigMode } from '@google/genai';

const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: 'Who won the euro 2024?',
  config: {
    tools: [{
      googleSearchRetrieval: {
        dynamicRetrievalConfig: {
          mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
          dynamicThreshold: 0.7 // Search only when the predicted benefit score >= 0.7
        }
      }
    }]
  }
});
```

**Features**:
- Control when searches happen via a threshold
- Used with Gemini 1.5 models
- More configuration options

**Recommendation**: Use `googleSearch` for Gemini 2.5 models (simpler and newer).

---

## Basic Usage

### SDK Approach (Gemini 2.5)

```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What are the latest developments in AI?',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

console.log(response.text);

// Check if grounding was used
if (response.candidates[0].groundingMetadata) {
  console.log('✓ Search performed');
  console.log('Sources:', response.candidates[0].groundingMetadata.groundingChunks);
} else {
  console.log('✓ Answered from model knowledge');
}
```

### Fetch Approach (Cloudflare Workers)

```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'What are the latest developments in AI?' }] }],
      tools: [{ google_search: {} }]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

---

## Grounding Metadata

### Structure

```typescript
{
  groundingMetadata: {
    // Search queries the model issued
    webSearchQueries: [
      "euro 2024 winner"
    ],

    // Web sources retrieved
    groundingChunks: [
      {
        web: {
          uri: "https://example.com/euro-2024",
          title: "UEFA Euro 2024 Results"
        }
      }
    ],

    // Which response segments are supported by which sources
    groundingSupports: [
      {
        segment: { startIndex: 42, endIndex: 47, text: "Spain" },
        groundingChunkIndices: [0]
      }
    ],

    // Rendered Google Search suggestions for attribution
    searchEntryPoint: { renderedContent: "<html>...</html>" }
  }
}
```

### Accessing Metadata

```typescript
const metadata = response.candidates[0].groundingMetadata;
if (metadata) {
  // Display sources
  console.log('Sources:');
  metadata.groundingChunks?.forEach((chunk, i) => {
    console.log(`${i + 1}. ${chunk.web?.title}`);
    console.log(`   ${chunk.web?.uri}`);
  });

  // Display supported segments (inline citations)
  console.log('\nCitations:');
  metadata.groundingSupports?.forEach((support) => {
    const { startIndex, endIndex } = support.segment;
    console.log(`Position ${startIndex}-${endIndex}: chunks [${support.groundingChunkIndices}]`);
  });
}
```
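
A common follow-up is weaving the supports back into the answer as inline `[n]` markers; a minimal sketch assuming the structure above (it also assumes the segment offsets align with JavaScript string indices, which may not hold for non-ASCII text):

```typescript
// Insert [n] markers into the response text, working backwards so
// earlier offsets stay valid as the string grows.
function addCitations(text: string, metadata: any): string {
  const supports = [...(metadata.groundingSupports ?? [])]
    .sort((a, b) => b.segment.endIndex - a.segment.endIndex);

  let cited = text;
  for (const support of supports) {
    const refs = support.groundingChunkIndices
      .map((i: number) => `[${i + 1}]`)
      .join('');
    cited = cited.slice(0, support.segment.endIndex) + refs +
            cited.slice(support.segment.endIndex);
  }
  return cited;
}
```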

---

## When to Use Grounding

### ✅ Good Use Cases

**Current Events**:
```typescript
'What happened in the news today?'
'Who won the latest sports championship?'
'What are the current stock prices?'
```

**Recent Developments**:
```typescript
'What are the latest AI breakthroughs?'
'What are recent changes in climate policy?'
```

**Fact-Checking**:
```typescript
'Is this claim true: [claim]?'
'What does the latest research say about [topic]?'
```

**Real-Time Data**:
```typescript
'What is the current weather in Tokyo?'
"What are today's cryptocurrency prices?"
```

### ❌ Not Recommended For

**General Knowledge**:
```typescript
'What is the capital of France?' // Model knows this
'How does photosynthesis work?' // Stable knowledge
```

**Mathematical Calculations**:
```typescript
'What is 15 * 27?' // Use code execution instead
```

**Creative Tasks**:
```typescript
'Write a poem about autumn' // No search needed
```

**Code Generation**:
```typescript
'Write a sorting algorithm' // Internal reasoning sufficient
```

---

## Chat with Grounding

### Multi-Turn Conversations

```typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

// First question
let response = await chat.sendMessage({
  message: 'What are the latest quantum computing developments?'
});
console.log(response.text);

// Display sources
if (response.candidates[0].groundingMetadata) {
  const sources = response.candidates[0].groundingMetadata.groundingChunks || [];
  console.log(`\nSources: ${sources.length} web pages`);
  sources.forEach(s => console.log(`- ${s.web?.title}: ${s.web?.uri}`));
}

// Follow-up question
response = await chat.sendMessage({
  message: 'Which company made the biggest breakthrough?'
});
console.log('\n' + response.text);
```

---

## Combining with Other Features

### Grounding + Function Calling

```typescript
const weatherFunction = {
  name: 'get_current_weather',
  description: 'Get weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: { type: 'string', description: 'City name' }
    },
    required: ['location']
  }
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather like in the capital of the country that won Euro 2024?',
  config: {
    tools: [
      { googleSearch: {} },                        // For finding the Euro 2024 winner
      { functionDeclarations: [weatherFunction] }  // For the weather lookup
    ]
  }
});

// Model will:
// 1. Use Google Search to find the Euro 2024 winner (Spain)
// 2. Call get_current_weather with the capital (Madrid)
// 3. Combine both results in the response
```

### Grounding + Code Execution

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Find the current stock prices for AAPL, GOOGL, MSFT and calculate their average',
  config: {
    tools: [
      { googleSearch: {} },  // For current stock prices
      { codeExecution: {} }  // For averaging
    ]
  }
});

// Model will:
// 1. Search for current stock prices
// 2. Generate code to calculate the average
// 3. Execute the code with the found prices
// 4. Return the result with citations
```

---

## Checking Grounding Usage

### Determine if Search Was Performed

```typescript
const queries = [
  'What is 2+2?',                    // Should NOT use search
  'What happened in the news today?' // Should use search
];

for (const query of queries) {
  const response = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: query,
    config: { tools: [{ googleSearch: {} }] }
  });

  console.log(`Query: ${query}`);
  console.log(`Search used: ${response.candidates[0].groundingMetadata ? 'YES' : 'NO'}`);
  console.log();
}
```

**Output**:
```
Query: What is 2+2?
Search used: NO

Query: What happened in the news today?
Search used: YES
```

---

## Dynamic Retrieval (Gemini 1.5)

### Threshold-Based Grounding

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: 'Who won the euro 2024?',
  config: {
    tools: [{
      googleSearchRetrieval: {
        dynamicRetrievalConfig: {
          mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
          dynamicThreshold: 0.7 // Search only when the predicted benefit score >= 0.7
        }
      }
    }]
  }
});

if (!response.candidates[0].groundingMetadata) {
  console.log('Model answered from knowledge (score < 0.7)');
} else {
  console.log('Search performed (score >= 0.7)');
}
```

**How It Works**:
- The model scores how much the query would benefit from grounding
- If the score >= threshold → performs a search
- If the score < threshold → answers from internal knowledge

**Threshold Values**:
- `0.0`: Always search
- `0.3`: Default
- `0.7`: Search only for queries very likely to benefit
- `1.0`: Never search (always use internal knowledge)

---

## Best Practices

### ✅ Do

1. **Check Metadata**: Always verify if grounding was used
   ```typescript
   if (response.candidates[0].groundingMetadata) { ... }
   ```

2. **Display Citations**: Show sources to users for transparency
   ```typescript
   metadata.groundingChunks?.forEach(chunk => {
     console.log(`Source: ${chunk.web?.title} (${chunk.web?.uri})`);
   });
   ```

3. **Use Specific Queries**: Better search results with clear questions
   ```typescript
   // ✅ Good: "What are Microsoft's Q3 2024 earnings?"
   // ❌ Vague: "Tell me about Microsoft"
   ```

4. **Combine Features**: Use with function calling/code execution for powerful workflows

5. **Handle Missing Metadata**: Not all queries trigger a search
   ```typescript
   const sources = response.candidates[0].groundingMetadata?.groundingChunks || [];
   ```

### ❌ Don't

1. **Don't Assume Search Always Happens**: The model decides when to search
2. **Don't Ignore Citations**: They're crucial for fact-checking
3. **Don't Use for Stable Knowledge**: A waste of resources for unchanging facts
4. **Don't Expect Perfect Coverage**: Not all information is on the web

---

## Cost and Performance

### Cost Considerations

- **Added Latency**: A search typically takes 1-3 seconds
- **Token Costs**: Retrieved content counts as input tokens
- **Rate Limits**: Subject to API rate limits

### Optimization

**Use Dynamic Threshold** (Gemini 1.5):
```typescript
dynamicThreshold: 0.7 // Higher threshold = fewer searches; lower = more searches
```

**Cache Grounding Results** (if appropriate):
```typescript
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash-001',
  config: {
    displayName: 'grounding-cache',
    tools: [{ googleSearch: {} }],
    contents: 'Initial query that triggers search...',
    ttl: '3600s'
  }
});
// Subsequent queries reuse the cached grounding results
```

---

## Troubleshooting

### Grounding Not Working

**Symptom**: No `groundingMetadata` in the response

**Causes**:
1. Grounding not enabled: `tools: [{ googleSearch: {} }]`
2. Model decided a search wasn't needed (query answerable from knowledge)
3. Google Cloud project not configured (grounding requires GCP)

**Solution**:
- Verify the `tools` configuration
- Use queries requiring current information
- Set up a Google Cloud project

### Poor Search Quality

**Symptom**: Irrelevant sources or wrong information

**Causes**:
- Vague query
- Ambiguous search terms
- Recent events not yet indexed

**Solution**:
- Make queries more specific
- Include context in the prompt
- Inspect the search queries in the metadata

### Citations Missing

**Symptom**: `groundingMetadata` present but no `groundingSupports`

**Explanation**: Supports are **inline references** - they may be absent when the model doesn't tie specific response segments to sources.

**Solution**: Check `groundingChunks` instead for the full source list

---

## Important Requirements

### Google Cloud Project

**⚠️ Grounding requires a Google Cloud project, not just an API key.**

**Setup**:
1. Create a Google Cloud project
2. Enable the Generative Language API
3. Configure billing
4. Use an API key from that project

**Error if Missing**:
```
Error: Grounding requires Google Cloud project configuration
```

### Model Support

**✅ Supported**:
- All Gemini 2.5 models (`googleSearch`)
- All Gemini 1.5 models (`googleSearchRetrieval`)

**❌ Not Supported**:
- Gemini 1.0 models

---

## Examples

### News Summary

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: "Summarize today's top 3 technology news headlines",
  config: { tools: [{ googleSearch: {} }] }
});

console.log(response.text);

const metadata = response.candidates[0].groundingMetadata;
metadata?.groundingChunks?.forEach((chunk, i) => {
  console.log(`${i + 1}. ${chunk.web?.title}: ${chunk.web?.uri}`);
});
```

### Fact Verification

```typescript
const claim = "The Earth is flat";

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `Is this claim true: "${claim}"? Use reliable sources to verify.`,
  config: { tools: [{ googleSearch: {} }] }
});

console.log(response.text);
```

### Market Research

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What are the current trends in electric vehicle adoption in 2024?',
  config: { tools: [{ googleSearch: {} }] }
});

console.log(response.text);

console.log('\nSources:');
const metadata = response.candidates[0].groundingMetadata;
metadata?.groundingChunks?.forEach(chunk => {
  console.log(`- ${chunk.web?.title}`);
});
```

---

## References

- Official Docs: https://ai.google.dev/gemini-api/docs/grounding
- Google Search Docs: https://ai.google.dev/gemini-api/docs/google-search
- Templates: See `grounding-search.ts` for working examples
- Combined Features: See `combined-advanced.ts` for integration patterns

references/models-guide.md (new file, 289 lines)

# Gemini Models Guide (2025)

**Last Updated**: 2025-11-19 (Gemini 3 preview release)

---

## Gemini 3 Series (Preview - November 2025)

### gemini-3-pro-preview

**Model ID**: `gemini-3-pro-preview`

**Status**: 🆕 Preview release (November 18, 2025)

**Context Windows**:
- Input: TBD (documentation pending)
- Output: TBD (documentation pending)

**Description**: Google's newest and most intelligent AI model, with state-of-the-art reasoning and multimodal understanding. Outperforms Gemini 2.5 Pro on every major AI benchmark.

**Best For**:
- The most complex reasoning tasks
- Advanced multimodal analysis (images, videos, PDFs, audio)
- Benchmark-critical applications
- Cutting-edge projects requiring the latest capabilities
- Tasks requiring the absolute best quality

**Features**:
- ✅ Enhanced multimodal understanding
- ✅ Function calling
- ✅ Streaming
- ✅ System instructions
- ✅ JSON mode
- TBD Thinking mode (documentation pending)

**Knowledge Cutoff**: TBD

**Pricing**: Preview pricing (likely higher than 2.5 Pro)

**⚠️ Preview Status**: Use for evaluation and testing. Consider `gemini-2.5-pro` for production-critical decisions until Gemini 3 reaches stable general availability.

**New Capabilities**:
- Record-breaking benchmark performance
- Enhanced generative UI responses
- Advanced coding capabilities (Google Antigravity integration)
- State-of-the-art multimodal understanding

---

## Current Production Models (Gemini 2.5 - Stable)

### gemini-2.5-pro

**Model ID**: `gemini-2.5-pro`

**Context Windows**:
- Input: 1,048,576 tokens (NOT 2M!)
- Output: 65,536 tokens

**Description**: State-of-the-art thinking model capable of reasoning over complex problems in code, math, and STEM.

**Best For**:
- Complex reasoning tasks
- Advanced code generation and optimization
- Mathematical problem-solving
- Multi-step logical analysis
- STEM applications

**Features**:
- ✅ Thinking mode (enabled by default)
- ✅ Function calling
- ✅ Multimodal (text, images, video, audio, PDFs)
- ✅ Streaming
- ✅ System instructions
- ✅ JSON mode

**Knowledge Cutoff**: January 2025

**Pricing**: Higher cost; use for tasks requiring the best quality

---

### gemini-2.5-flash

**Model ID**: `gemini-2.5-flash`

**Context Windows**:
- Input: 1,048,576 tokens
- Output: 65,536 tokens

**Description**: Best price-performance model for large-scale processing, low-latency, and high-volume tasks.

**Best For**:
- General-purpose AI applications
- High-volume API calls
- Agentic workflows
- Cost-sensitive applications
- Production workloads

**Features**:
- ✅ Thinking mode (enabled by default)
- ✅ Function calling
- ✅ Multimodal (text, images, video, audio, PDFs)
- ✅ Streaming
- ✅ System instructions
- ✅ JSON mode

**Knowledge Cutoff**: January 2025

**Pricing**: Best price-performance ratio

**⭐ Recommended**: This is the default choice for most applications

---

### gemini-2.5-flash-lite

**Model ID**: `gemini-2.5-flash-lite`

**Context Windows**:
- Input: 1,048,576 tokens
- Output: 65,536 tokens

**Description**: Most cost-efficient and fastest 2.5 model, optimized for high throughput.

**Best For**:
- High-throughput applications
- Simple text generation
- Cost-critical use cases
- Speed-prioritized workloads

**Features**:
- ✅ Thinking mode (enabled by default)
- ❌ **NO function calling** (critical limitation!)
- ✅ Multimodal (text, images, video, audio, PDFs)
- ✅ Streaming
- �✅ System instructions
- ✅ JSON mode

**Knowledge Cutoff**: January 2025

**Pricing**: Lowest cost

**⚠️ Important**: Flash-Lite does NOT support function calling! Use Flash or Pro if you need tool use.

---

## Model Comparison Matrix

| Feature | Pro | Flash | Flash-Lite |
|---------|-----|-------|------------|
| **Thinking Mode** | ✅ Default ON | ✅ Default ON | ✅ Default ON |
| **Function Calling** | ✅ Yes | ✅ Yes | ❌ **NO** |
| **Multimodal** | ✅ Full | ✅ Full | ✅ Full |
| **Streaming** | ✅ Yes | ✅ Yes | ✅ Yes |
| **Input Tokens** | 1,048,576 | 1,048,576 | 1,048,576 |
| **Output Tokens** | 65,536 | 65,536 | 65,536 |
| **Reasoning Quality** | Best | Good | Basic |
| **Speed** | Moderate | Fast | Fastest |
| **Cost** | Highest | Medium | Lowest |

---

## Previous Generation Models (Still Available)

### Gemini 2.0 Flash

**Model ID**: `gemini-2.0-flash`

**Context**: 1M input / 65K output tokens

**Status**: Previous generation; 2.5 Flash recommended instead

### Gemini 1.5 Pro

**Model ID**: `gemini-1.5-pro`

**Context**: 2M input tokens (the ONLY model with 2M!)

**Status**: Older model; 2.5 models recommended

---

## Context Window Clarification

**⚠️ CRITICAL CORRECTION**:

**ACCURATE**: Gemini 2.5 models support **1,048,576 input tokens** (approximately 1 million)

**INACCURATE**: Claiming Gemini 2.5 has a 2M-token context window

**WHY THIS MATTERS**:
- Gemini 1.5 Pro (older model) had 2M tokens
- Gemini 2.5 models (current) have ~1M tokens
- This is a common mistake that causes confusion!

**This skill prevents this error by providing accurate information.**

---

## Model Selection Guide

### Use gemini-2.5-pro When:
- ✅ Complex reasoning required (math, logic, STEM)
- ✅ Advanced code generation and optimization
- ✅ Multi-step problem-solving
- ✅ Quality is more important than cost
- ✅ Tasks require maximum capability

### Use gemini-2.5-flash When:
- ✅ General-purpose AI applications
- ✅ High-volume production workloads
- ✅ Function calling required
- ✅ Agentic workflows
- ✅ A good balance of cost and quality is needed
- ⭐ **Recommended default choice**

### Use gemini-2.5-flash-lite When:
- ✅ Simple text generation only
- ✅ No function calling needed
- ✅ High throughput required
- ✅ Cost is the primary concern
- ⚠️ **Only if you don't need function calling!** (a small helper encoding these rules is sketched below)
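
A tiny helper encoding the three rules above (illustrative only; the profile fields are assumptions, not SDK types):

```typescript
type TaskProfile = {
  needsFunctionCalling: boolean;
  complexReasoning: boolean;
  costCritical: boolean;
};

function pickModel(task: TaskProfile): string {
  if (task.complexReasoning) return 'gemini-2.5-pro';  // quality first
  if (task.costCritical && !task.needsFunctionCalling) {
    return 'gemini-2.5-flash-lite';                    // cheapest, but no tools
  }
  return 'gemini-2.5-flash';                           // recommended default
}
```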

---

## Common Mistakes

### ❌ Mistake 1: Using Wrong Model Name
```typescript
// WRONG - old model name
model: 'gemini-1.5-pro'

// CORRECT - current model
model: 'gemini-2.5-flash'
```

### ❌ Mistake 2: Claiming 2M Context for 2.5 Models
```typescript
// WRONG ASSUMPTION
// "Gemini 2.5 has a 2M token context window"

// CORRECT
// Gemini 2.5 has 1,048,576 input tokens
// Only Gemini 1.5 Pro (older) had 2M
```

### ❌ Mistake 3: Using Flash-Lite for Function Calling
```typescript
// WRONG - Flash-Lite doesn't support function calling!
model: 'gemini-2.5-flash-lite',
config: {
  tools: [{ functionDeclarations: [...] }] // This will FAIL
}

// CORRECT
model: 'gemini-2.5-flash', // or gemini-2.5-pro
config: {
  tools: [{ functionDeclarations: [...] }]
}
```

---

## Rate Limits (Free vs Paid)

### Free Tier
- **15 RPM** (requests per minute)
- **1M TPM** (tokens per minute)
- **1,500 RPD** (requests per day)

### Paid Tier
- **360 RPM**
- **4M TPM**
- Unlimited daily requests

**Tip**: Monitor your usage and implement rate limiting to stay within quotas; a minimal client-side limiter is sketched below.
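
A hedged sketch of such a limiter for the free tier's 15 RPM (the API offers no such helper; this is purely illustrative):

```typescript
const MIN_INTERVAL_MS = 60_000 / 15; // 15 requests per minute
let lastRequest = 0;

async function rateLimited<T>(fn: () => Promise<T>): Promise<T> {
  // Wait until at least MIN_INTERVAL_MS has passed since the last call
  const wait = lastRequest + MIN_INTERVAL_MS - Date.now();
  if (wait > 0) await new Promise(resolve => setTimeout(resolve, wait));
  lastRequest = Date.now();
  return fn();
}
```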

---

## Official Documentation

- **Models Overview**: https://ai.google.dev/gemini-api/docs/models
- **Gemini 2.5 Announcement**: https://developers.googleblog.com/en/gemini-2-5-thinking-model-updates/
- **Pricing**: https://ai.google.dev/pricing

---

**Production Tip**: Always use gemini-2.5-flash as your default unless you specifically need Pro's advanced reasoning or want to minimize cost with Flash-Lite (and don't need function calling).

references/multimodal-guide.md (new file, 58 lines)

# Multimodal Guide

Complete guide to using images, video, audio, and PDFs with the Gemini API.

---

## Supported Formats

### Images
- JPEG, PNG, WebP, HEIC, HEIF
- Max size: 20MB

### Video
- MP4, MPEG, MOV, AVI, FLV, MPG, WebM, WMV
- Max size: 2GB
- Max length (inline): 2 minutes

### Audio
- MP3, WAV, FLAC, AAC, OGG, OPUS
- Max size: 20MB

### PDFs
- Max size: 30MB
- Text-based PDFs work best

---

## Usage Pattern

```typescript
contents: [
  {
    parts: [
      { text: 'Your question' },
      {
        inlineData: {
          data: base64EncodedData,
          mimeType: 'image/jpeg' // or video/mp4, audio/mp3, application/pdf
        }
      }
    ]
  }
]
```
||||
|
||||
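Putting the pattern together, a minimal end-to-end sketch for Node.js; `photo.jpg` is a placeholder path, and the client setup matches the other guides:

```typescript
import { readFileSync } from 'node:fs';
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Read a local image and base64-encode it for inlineData
const base64Image = readFileSync('photo.jpg').toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Describe this image in one sentence.' },
        { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
      ]
    }
  ]
});

console.log(response.text);
```
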

---

## Best Practices

- Use specific, detailed prompts
- Combine multiple modalities in one request
- For large files (total request size over 20MB), use the File API (Phase 2); see the sketch below

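The File API flow is upload first, then reference the file by URI instead of inlining bytes. A hedged sketch; `ai.files.upload` and the `fileData` part shape are per the `@google/genai` docs, so verify against your SDK version:

```typescript
// Upload once via the File API, then reference by URI (sketch).
const file = await ai.files.upload({ file: 'long-video.mp4' }); // placeholder path

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Summarize this video.' },
        { fileData: { fileUri: file.uri, mimeType: file.mimeType } }
      ]
    }
  ]
});

console.log(response.text);
```
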

---

## Official Docs

https://ai.google.dev/gemini-api/docs/vision

235
references/sdk-migration-guide.md
Normal file
@@ -0,0 +1,235 @@

# SDK Migration Guide

**From**: `@google/generative-ai` (DEPRECATED)
**To**: `@google/genai` (CURRENT)

**Deadline**: November 30, 2025 (deprecated SDK sunset)

---

## Why Migrate?

The `@google/generative-ai` SDK is deprecated and will stop receiving updates on **November 30, 2025**.

The new `@google/genai` SDK:
- ✅ Works with both Gemini API and Vertex AI
- ✅ Supports Gemini 2.0+ features
- ✅ Better TypeScript support
- ✅ Unified API across platforms
- ✅ Active development and updates

---

## Migration Steps

### 1. Update Package

```bash
# Remove deprecated SDK
npm uninstall @google/generative-ai

# Install current SDK
npm install @google/genai@1.27.0
```

### 2. Update Imports

**Old (DEPRECATED)**:
```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
```

**New (CURRENT)**:
```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey });
// No need to get model separately
```

### 3. Update API Calls

**Old**:
```typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
```

**New**:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;
```

### 4. Update Streaming

**Old**:
```typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
```

**New**:
```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}
```

### 5. Update Chat

**Old**:
```typescript
const chat = model.startChat({
  history: []
});
const result = await chat.sendMessage(message);
const response = await result.response;
console.log(response.text());
```

**New**:
```typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  history: []
});
const response = await chat.sendMessage({ message });
console.log(response.text);
```


---

## Complete Before/After Example

### Before (Deprecated SDK)

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });

// Generate
const result = await model.generateContent('Hello');
const response = await result.response;
console.log(response.text());

// Stream
const streamResult = await model.generateContentStream('Write a story');
for await (const chunk of streamResult.stream) {
  console.log(chunk.text());
}

// Chat
const chat = model.startChat();
const chatResult = await chat.sendMessage('Hi');
const chatResponse = await chatResult.response;
console.log(chatResponse.text());
```

### After (Current SDK)

```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Generate
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Hello'
});
console.log(response.text);

// Stream
const streamResponse = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a story'
});
for await (const chunk of streamResponse) {
  console.log(chunk.text);
}

// Chat
const chat = ai.chats.create({ model: 'gemini-2.5-flash' });
const chatResponse = await chat.sendMessage({ message: 'Hi' });
console.log(chatResponse.text);
```


---

## Key Differences

| Aspect | Old SDK | New SDK |
|--------|---------|---------|
| Package | `@google/generative-ai` | `@google/genai` |
| Class | `GoogleGenerativeAI` | `GoogleGenAI` |
| Model Init | `genAI.getGenerativeModel()` | Specify in each call |
| Text Access | `response.text()` (method) | `response.text` (property) |
| Stream Iteration | `result.stream` | Direct iteration |
| Chat Creation | `model.startChat()` | `ai.chats.create()` |

---

## Troubleshooting

### Error: "Cannot find module '@google/generative-ai'"

**Cause**: Old import statement after migration

**Solution**: Update all imports to `@google/genai`

### Error: "Property 'text' does not exist"

**Cause**: Using `response.text()` (method) instead of `response.text` (property)

**Solution**: Remove parentheses: `response.text` not `response.text()`

### Error: "generateContent is not a function"

**Cause**: Trying to call methods on old model object

**Solution**: Use `ai.models.generateContent()` directly

---

## Automated Migration Script

```bash
# Find all files using old SDK
rg "@google/generative-ai" --type ts

# Replace import statements
find . -name "*.ts" -exec sed -i 's/@google\/generative-ai/@google\/genai/g' {} +

# Replace class name
find . -name "*.ts" -exec sed -i 's/GoogleGenerativeAI/GoogleGenAI/g' {} +
```

**⚠️ Note**: This script handles imports but NOT API changes. Manual review required!


---

## Official Resources

- **Migration Guide**: https://ai.google.dev/gemini-api/docs/migrate-to-genai
- **New SDK Docs**: https://github.com/googleapis/js-genai
- **Deprecated SDK**: https://github.com/google-gemini/deprecated-generative-ai-js

---

**Deadline Reminder**: November 30, 2025 - Deprecated SDK sunset

81
references/streaming-patterns.md
Normal file
@@ -0,0 +1,81 @@

# Streaming Patterns

Complete guide to implementing streaming with Gemini API.

---

## SDK Approach (Async Iteration)

```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a story'
});

for await (const chunk of response) {
  process.stdout.write(chunk.text);
}
```

**Pros**: Simple, automatic parsing
**Cons**: Requires Node.js or compatible runtime


---

## Fetch Approach (SSE Parsing)

```typescript
const response = await fetch(
  // alt=sse is required to get Server-Sent Events instead of a JSON array
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse',
  { /* ... */ }
);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;

    const payload = line.slice(6);
    if (payload === '[DONE]') continue; // skip end-of-stream marker

    const data = JSON.parse(payload);
    const text = data.candidates[0]?.content?.parts[0]?.text;
    if (text) process.stdout.write(text);
  }
}
```

**Pros**: Works in any environment
**Cons**: Manual SSE parsing required


---

## SSE Format

```
data: {"candidates":[{"content":{"parts":[{"text":"Hello"}]}}]}
data: {"candidates":[{"content":{"parts":[{"text":" world"}]}}]}
data: [DONE]
```

---

## Best Practices

- Always use the `streamGenerateContent` endpoint (with `?alt=sse` for SSE)
- Handle incomplete chunks in buffer
- Skip empty lines and `[DONE]` markers
- Use streaming for better UX on long responses

---

## Official Docs

https://ai.google.dev/gemini-api/docs/streaming

59
references/thinking-mode-guide.md
Normal file
@@ -0,0 +1,59 @@

# Thinking Mode Guide

Complete guide to thinking mode in Gemini 2.5 models.

---

## What is Thinking Mode?

Gemini 2.5 models "think" internally before responding, improving accuracy on complex tasks.

**Key Points**:
- ✅ Always enabled on 2.5 models (cannot disable)
- ✅ Hidden (you don't see the thinking process)
- ✅ Configurable thinking budget
- ✅ Improves reasoning quality

---

## Configuration

```typescript
config: {
  thinkingConfig: {
    thinkingBudget: 8192 // Max tokens for internal reasoning
  }
}
```

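In a full request, `thinkingConfig` sits alongside `model` and `contents`. A minimal sketch:

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-pro',
  contents: 'Prove that the sum of two odd numbers is even.',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192 // max tokens for internal reasoning
    }
  }
});

console.log(response.text); // final answer only; the thinking itself is not returned
```
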

---

## When to Increase Budget

✅ Complex math/logic problems
✅ Multi-step reasoning
✅ Code optimization
✅ Detailed analysis

---

## When Default is Fine

⏺️ Simple questions
⏺️ Creative writing
⏺️ Translation
⏺️ Summarization

---

## Model Comparison

- **gemini-2.5-pro**: Best for complex reasoning
- **gemini-2.5-flash**: Good balance
- **gemini-2.5-flash-lite**: Basic thinking

---

## Official Docs

https://ai.google.dev/gemini-api/docs/thinking

304
references/top-errors.md
Normal file
@@ -0,0 +1,304 @@

# Top Errors and Solutions

22 common Gemini API errors with solutions (Phase 1 + Phase 2).

---

## 1. Using Deprecated SDK

**Error**: `Cannot find module '@google/generative-ai'`

**Cause**: Using old SDK after migration

**Solution**: Install `@google/genai` instead

---

## 2. Wrong Context Window Claims

**Error**: Input exceeds model capacity

**Cause**: Assuming 2M tokens for Gemini 2.5

**Solution**: Gemini 2.5 has 1,048,576 input tokens (NOT 2M!)

---

## 3. Model Not Found

**Error**: `models/gemini-3.0-flash is not found`

**Cause**: Wrong model name

**Solution**: Use: `gemini-2.5-pro`, `gemini-2.5-flash`, or `gemini-2.5-flash-lite`

---

## 4. Function Calling on Flash-Lite

**Error**: Function calling not working

**Cause**: Flash-Lite doesn't support function calling

**Solution**: Use `gemini-2.5-flash` or `gemini-2.5-pro`

---

## 5. Invalid API Key (401)

**Error**: `API key not valid`

**Cause**: Missing or wrong `GEMINI_API_KEY`

**Solution**: Set environment variable correctly

---

## 6. Rate Limit Exceeded (429)

**Error**: `Resource has been exhausted`

**Cause**: Too many requests

**Solution**: Implement exponential backoff

---

## 7. Streaming Parse Errors

**Error**: Invalid JSON in SSE stream

**Cause**: Incomplete chunk parsing

**Solution**: Use buffer to handle partial chunks

---

## 8. Multimodal Format Errors

**Error**: Invalid base64 or MIME type

**Cause**: Wrong image encoding

**Solution**: Use correct base64 encoding and MIME type

---

## 9. Context Length Exceeded

**Error**: `Request payload size exceeds the limit`

**Cause**: Input too large

**Solution**: Reduce input size (max 1,048,576 tokens)

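**Sketch**: count tokens before sending; assumes the `@google/genai` client, and `longInput` is a placeholder:

```typescript
const { totalTokens } = await ai.models.countTokens({
  model: 'gemini-2.5-flash',
  contents: longInput
});

if (totalTokens > 1_048_576) {
  // Trim, chunk, or summarize the input before calling generateContent
}
```
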

---

## 10. Chat Not Working with Fetch

**Error**: No chat helper available

**Cause**: Chat helpers are SDK-only

**Solution**: Manually manage conversation history or use SDK

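**Sketch**: manual history management just means resending the prior turns on every request:

```typescript
// Maintain the conversation yourself and resend it each turn
const history: Array<{ role: 'user' | 'model'; parts: { text: string }[] }> = [];

async function send(userText: string): Promise<string> {
  history.push({ role: 'user', parts: [{ text: userText }] });

  const res = await fetch(
    'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent',
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-goog-api-key': process.env.GEMINI_API_KEY!,
      },
      body: JSON.stringify({ contents: history }),
    }
  );

  const data = await res.json();
  const reply = data.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
  history.push({ role: 'model', parts: [{ text: reply }] });
  return reply;
}
```
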

---

## 11. Thinking Mode Not Supported

**Error**: Trying to disable thinking mode

**Cause**: Thinking mode is always enabled on 2.5 models

**Solution**: You can only configure the budget, not disable thinking

---

## 12. Parameter Conflicts

**Error**: Unsupported parameters

**Cause**: Using wrong config options

**Solution**: Use only supported parameters (see generation-config.md)

---

## 13. System Instruction Placement

**Error**: System instruction not working

**Cause**: Placed inside the contents array

**Solution**: Place it at the top level, not in contents

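**Sketch**: correct placement in a raw request body; the SDK accepts the same via `config.systemInstruction`:

```typescript
// ✅ Correct - system instruction at the top level of the request
body: JSON.stringify({
  systemInstruction: { parts: [{ text: 'You are a terse assistant.' }] },
  contents: [{ parts: [{ text: 'Hello' }] }]
})

// ❌ Wrong - a 'system' role inside contents is not supported
// contents: [{ role: 'system', parts: [...] }]
```
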

---

## 14. Token Counting Errors

**Error**: Unexpected token usage

**Cause**: Multimodal inputs use more tokens

**Solution**: Images/video/audio count toward token limit

---

## 15. Parallel Function Call Errors

**Error**: Functions not executing in parallel

**Cause**: Dependencies between functions

**Solution**: Gemini auto-detects; ensure functions are independent

---

## Phase 2 Errors

### 16. Invalid Model Version for Caching

**Error**: `Invalid model name for caching`

**Cause**: Using `gemini-2.5-flash` instead of `gemini-2.5-flash-001`

**Solution**: Must use explicit version suffix when creating caches

```typescript
// ✅ Correct
model: 'gemini-2.5-flash-001'

// ❌ Wrong
model: 'gemini-2.5-flash'
```

**Source**: https://ai.google.dev/gemini-api/docs/caching

---

### 17. Cache Expired or Not Found

**Error**: `Cache not found` or `Cache expired`

**Cause**: Trying to use cache after TTL expiration

**Solution**: Check expiration before use or recreate cache

```typescript
let cache = await ai.caches.get({ name: cacheName });
if (new Date(cache.expireTime) < new Date()) {
  // Recreate cache
  cache = await ai.caches.create({ ... });
}
```


---

### 18. Cannot Update Expired Cache TTL

**Error**: `Cannot update expired cache`

**Cause**: Trying to extend TTL after the cache has already expired

**Solution**: Update TTL before expiration or create a new cache

```typescript
// Update TTL before expiration
await ai.caches.update({
  name: cache.name,
  config: { ttl: '7200s' }
});
```

---

### 19. Code Execution Timeout

**Error**: `Execution timed out after 30 seconds` with `OUTCOME_FAILED`

**Cause**: Python code taking too long to execute

**Solution**: Simplify computation or reduce data size

```typescript
// Check outcome before using results
if (part.codeExecutionResult?.outcome === 'OUTCOME_FAILED') {
  console.error('Execution failed:', part.codeExecutionResult.output);
}
```

**Source**: https://ai.google.dev/gemini-api/docs/code-execution

---

### 20. Python Package Not Available

**Error**: `ModuleNotFoundError: No module named 'requests'`

**Cause**: Trying to import a package that isn't in the sandbox

**Solution**: Use only available packages (numpy, pandas, matplotlib, seaborn, scipy)

**Available Packages**:
- Standard library: math, statistics, json, csv, datetime
- Data science: numpy, pandas, scipy
- Visualization: matplotlib, seaborn

---

### 21. Code Execution on Flash-Lite

**Error**: Code execution not working

**Cause**: `gemini-2.5-flash-lite` doesn't support code execution

**Solution**: Use `gemini-2.5-flash` or `gemini-2.5-pro`

```typescript
// ✅ Correct
model: 'gemini-2.5-flash' // Supports code execution

// ❌ Wrong
model: 'gemini-2.5-flash-lite' // NO code execution support
```

---

### 22. Grounding Requires Google Cloud Project

**Error**: `Grounding requires Google Cloud project configuration`

**Cause**: Using an API key not associated with a GCP project

**Solution**: Set up a Google Cloud project and enable the Generative Language API

**Steps**:
1. Create Google Cloud project
2. Enable Generative Language API
3. Configure billing
4. Use API key from that project

**Source**: https://ai.google.dev/gemini-api/docs/grounding

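**Sketch**: once the project is configured, grounding is enabled with a tool entry; assumes the `@google/genai` client, and the tool name should be verified for your model version:

```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What changed in the latest Node.js LTS release?',
  config: {
    tools: [{ googleSearch: {} }] // Google Search grounding
  }
});

// Grounded responses carry metadata about the sources used
console.log(response.candidates?.[0]?.groundingMetadata);
```
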

---

## Quick Debugging Checklist

### Phase 1 (Core)
- [ ] Using @google/genai (NOT @google/generative-ai)
- [ ] Model name is gemini-2.5-pro/flash/flash-lite
- [ ] API key is set correctly
- [ ] Input under 1,048,576 tokens
- [ ] Not using Flash-Lite for function calling
- [ ] System instruction at top level
- [ ] Streaming endpoint is streamGenerateContent
- [ ] MIME types are correct for multimodal

### Phase 2 (Advanced)
- [ ] Caching: Using explicit model version (e.g., gemini-2.5-flash-001)
- [ ] Caching: Cache not expired (check expireTime)
- [ ] Code Execution: Not using Flash-Lite
- [ ] Code Execution: Using only available Python packages
- [ ] Grounding: Google Cloud project configured
- [ ] Grounding: Checking groundingMetadata for search results