Initial commit

Zhongwei Li
2025-11-29 17:59:39 +08:00
commit 0b993003eb
9 changed files with 2987 additions and 0 deletions

# Ollama Node.js API Reference
This reference provides comprehensive examples for integrating Ollama into Node.js projects using the official `ollama` npm package.
**IMPORTANT**: Always use streaming responses for a better user experience.
## Table of Contents
1. [Package Setup](#package-setup)
2. [Installation & Setup](#installation--setup)
3. [Verifying Ollama Connection](#verifying-ollama-connection)
4. [Model Selection](#model-selection)
5. [Generate API (Text Completion)](#generate-api-text-completion)
6. [Chat API (Conversational)](#chat-api-conversational)
7. [Embeddings](#embeddings)
8. [Error Handling](#error-handling)
## Package Setup
### ES Modules (package.json)
When creating Node.js scripts for users, always use ES modules. Create a `package.json` with:
```json
{
"type": "module",
"dependencies": {
"ollama": "^0.5.0"
}
}
```
This allows using modern `import` syntax instead of `require`.
### Running Scripts
```bash
# Install dependencies
npm install
# Run script
node script.js
```
## Installation & Setup
### Installation
```bash
npm install ollama
```
### Import
```javascript
import { Ollama } from 'ollama';
```
### Configuration
**IMPORTANT**: Always ask users for their Ollama URL. Do not assume localhost.
```javascript
import { Ollama } from 'ollama';
// Create client with custom URL
const ollama = new Ollama({ host: 'http://localhost:11434' });
// Or for remote Ollama instance
// const ollama = new Ollama({ host: 'http://192.168.1.100:11434' });
```
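A common pattern is to let an environment variable supply the host and only fall back to a default after the user has had a chance to configure it; `OLLAMA_HOST` is used here because it is the variable the Ollama CLI itself reads, but treat this sketch as one option, not the package's required setup:
```javascript
import { Ollama } from 'ollama';
// Prefer an explicitly configured host; ask the user rather than silently
// assuming localhost when nothing is set.
const host = process.env.OLLAMA_HOST ?? 'http://localhost:11434';
const ollama = new Ollama({ host });
console.log(`Using Ollama at ${host}`);
```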
## Verifying Ollama Connection
### Check Connection (Development)
During development, verify Ollama is running and check available models using curl:
```bash
# Check Ollama is running and get version
curl http://localhost:11434/api/version
# List available models
curl http://localhost:11434/api/tags
```
### Check Ollama Version (Node.js)
```javascript
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function checkOllama() {
try {
// Simple way to verify connection
const models = await ollama.list();
console.log('✓ Connected to Ollama');
console.log(` Available models: ${models.models.length}`);
return true;
} catch (error) {
console.log(`✗ Failed to connect to Ollama: ${error.message}`);
return false;
}
}
// Usage
await checkOllama();
```
## Model Selection
**IMPORTANT**: Always ask users which model they want to use. Don't assume a default.
### Listing Available Models
```javascript
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function listAvailableModels() {
const { models } = await ollama.list();
return models.map(m => m.name);
}
// Usage - show available models to user
const available = await listAvailableModels();
console.log('Available models:');
available.forEach(model => {
console.log(` - ${model}`);
});
```
### Finding Models
If the user doesn't have a model installed or wants to use a different one:
- **Browse models**: Direct them to https://ollama.com/search
- **Popular choices**: llama3.2, llama3.1, mistral, phi3, qwen2.5
- **Specialized models**: codellama (coding), llava (vision), nomic-embed-text (embeddings)
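If the chosen model is not installed yet, the npm package can also pull it programmatically instead of shelling out to `ollama pull`. A minimal sketch, assuming the user has confirmed the download first; progress chunks carry at least a `status` string, though other fields may vary by library version:
```javascript
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function pullModel(model) {
  // Stream download progress so the user sees something is happening
  const progress = await ollama.pull({ model: model, stream: true });
  for await (const chunk of progress) {
    process.stdout.write(`\r${chunk.status}        `);
  }
  process.stdout.write('\n');
}
// Usage (only after the user has agreed to the download)
// await pullModel('llama3.2');
```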
### Model Selection Flow
```javascript
async function selectModel() {
const available = await listAvailableModels();
if (available.length === 0) {
console.log('No models installed!');
console.log('Visit https://ollama.com/search to find models');
console.log('Then run: ollama pull <model-name>');
return null;
}
console.log('Available models:');
available.forEach((model, i) => {
console.log(` ${i + 1}. ${model}`);
});
// In practice, you'd ask the user to choose
return available[0]; // Default to first available
}
```
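The comment above leaves the actual prompt to the reader. A minimal sketch of an interactive picker, assuming Node 17+ for `node:readline/promises`; the prompt text and numbering are illustrative:
```javascript
import readline from 'node:readline/promises';
import { stdin as input, stdout as output } from 'node:process';
async function promptForModel(available) {
  const rl = readline.createInterface({ input, output });
  available.forEach((model, i) => console.log(`  ${i + 1}. ${model}`));
  const answer = await rl.question('Choose a model by number: ');
  rl.close();
  const index = Number.parseInt(answer, 10) - 1;
  // Fall back to the first model if the input is not a valid choice
  return available[index] ?? available[0];
}
```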
## Generate API (Text Completion)
### Streaming Text Generation
```javascript
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function generateStream(prompt, model = 'llama3.2') {
const response = await ollama.generate({
model: model,
prompt: prompt,
stream: true
});
for await (const chunk of response) {
process.stdout.write(chunk.response);
}
}
// Usage
process.stdout.write('Response: ');
await generateStream('Why is the sky blue?', 'llama3.2');
process.stdout.write('\n');
```
### With Options (Temperature, Top-P, etc.)
```javascript
async function generateWithOptions(prompt, model = 'llama3.2') {
const response = await ollama.generate({
model: model,
prompt: prompt,
stream: true,
options: {
temperature: 0.7,
top_p: 0.9,
top_k: 40,
num_predict: 100 // Max tokens
}
});
for await (const chunk of response) {
process.stdout.write(chunk.response);
}
}
// Usage
process.stdout.write('Response: ');
await generateWithOptions('Write a haiku about programming');
process.stdout.write('\n');
```
## Chat API (Conversational)
### Streaming Chat
```javascript
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function chatStream(messages, model = 'llama3.2') {
/*
* Chat with a model using conversation history with streaming.
*
* Args:
* messages: Array of message objects with 'role' and 'content'
* role can be 'system', 'user', or 'assistant'
*/
const response = await ollama.chat({
model: model,
messages: messages,
stream: true
});
for await (const chunk of response) {
process.stdout.write(chunk.message.content);
}
}
// Usage
const messages = [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'What is the capital of France?' }
];
process.stdout.write('Response: ');
await chatStream(messages);
process.stdout.write('\n');
```
### Multi-turn Conversation
```javascript
import * as readline from 'readline';
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function conversationLoop(model = 'llama3.2') {
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout
});
const messages = [
{ role: 'system', content: 'You are a helpful assistant.' }
];
const askQuestion = () => {
rl.question('\nYou: ', async (input) => {
if (input.toLowerCase() === 'exit' || input.toLowerCase() === 'quit') {
rl.close();
return;
}
// Add user message
messages.push({ role: 'user', content: input });
// Stream response
process.stdout.write('Assistant: ');
let fullResponse = '';
const response = await ollama.chat({
model: model,
messages: messages,
stream: true
});
for await (const chunk of response) {
const content = chunk.message.content;
process.stdout.write(content);
fullResponse += content;
}
process.stdout.write('\n');
// Add assistant response to history
messages.push({ role: 'assistant', content: fullResponse });
askQuestion();
});
};
askQuestion();
}
// Usage
await conversationLoop();
```
## Embeddings
### Generate Embeddings
```javascript
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function getEmbeddings(text, model = 'nomic-embed-text') {
/*
* Generate embeddings for text.
*
* Note: Use an embedding-specific model like 'nomic-embed-text'
* Regular models can generate embeddings, but dedicated models work better.
*/
const response = await ollama.embeddings({
model: model,
prompt: text
});
return response.embedding;
}
// Usage
const embedding = await getEmbeddings('Hello, world!');
console.log(`Embedding dimension: ${embedding.length}`);
console.log(`First 5 values: ${embedding.slice(0, 5)}`);
```
### Semantic Similarity
```javascript
function cosineSimilarity(vec1, vec2) {
const dotProduct = vec1.reduce((sum, val, i) => sum + val * vec2[i], 0);
const magnitude1 = Math.sqrt(vec1.reduce((sum, val) => sum + val * val, 0));
const magnitude2 = Math.sqrt(vec2.reduce((sum, val) => sum + val * val, 0));
return dotProduct / (magnitude1 * magnitude2);
}
// Usage
const text1 = 'The cat sat on the mat';
const text2 = 'A feline rested on a rug';
const text3 = 'JavaScript is a programming language';
const emb1 = await getEmbeddings(text1);
const emb2 = await getEmbeddings(text2);
const emb3 = await getEmbeddings(text3);
console.log(`Similarity 1-2: ${cosineSimilarity(emb1, emb2).toFixed(3)}`); // High
console.log(`Similarity 1-3: ${cosineSimilarity(emb1, emb3).toFixed(3)}`); // Low
```
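Building on `getEmbeddings` and `cosineSimilarity` above, a small retrieval helper shows the typical use: rank candidate documents against a query. The documents and query below are illustrative only.
```javascript
async function mostSimilar(query, documents) {
  // Embed the query once, then score each candidate against it
  const queryEmbedding = await getEmbeddings(query);
  let best = { document: null, score: -Infinity };
  for (const document of documents) {
    const score = cosineSimilarity(queryEmbedding, await getEmbeddings(document));
    if (score > best.score) {
      best = { document, score };
    }
  }
  return best;
}
// Usage
const docs = ['The cat sat on the mat', 'JavaScript is a programming language'];
const { document, score } = await mostSimilar('a pet resting indoors', docs);
console.log(`Best match: "${document}" (${score.toFixed(3)})`);
```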
## Error Handling
### Comprehensive Error Handling
```javascript
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function* safeGenerateStream(prompt, model = 'llama3.2') {
try {
const response = await ollama.generate({
model: model,
prompt: prompt,
stream: true
});
for await (const chunk of response) {
yield chunk.response;
}
} catch (error) {
// Model not found or other API errors
if (error.message.toLowerCase().includes('not found')) {
console.log(`\n✗ Model '${model}' not found`);
console.log(` Run: ollama pull ${model}`);
console.log(` Or browse models at: https://ollama.com/search`);
    } else if (error.code === 'ECONNREFUSED' || error.cause?.code === 'ECONNREFUSED') {
console.log('\n✗ Connection failed. Is Ollama running?');
console.log(' Start Ollama with: ollama serve');
} else {
console.log(`\n✗ Unexpected error: ${error.message}`);
}
}
}
// Usage
process.stdout.write('Response: ');
for await (const token of safeGenerateStream('Hello, world!', 'llama3.2')) {
process.stdout.write(token);
}
process.stdout.write('\n');
```
### Checking Model Availability
```javascript
async function ensureModelAvailable(model) {
try {
const { models } = await ollama.list();
const modelNames = models.map(m => m.name);
if (!modelNames.includes(model)) {
console.log(`Model '${model}' not available locally`);
console.log(`Available models: ${modelNames.join(', ')}`);
console.log(`\nTo download: ollama pull ${model}`);
console.log(`Browse models: https://ollama.com/search`);
return false;
}
return true;
} catch (error) {
console.log(`Failed to check models: ${error.message}`);
return false;
}
}
// Usage
if (await ensureModelAvailable('llama3.2')) {
// Proceed with using the model
}
```
## Best Practices
1. **Always Use Streaming**: Stream responses for better user experience
2. **Ask About Models**: Don't assume a default model; ask users which model they want to use
3. **Verify Connection**: Check Ollama connection during development with curl
4. **Error Handling**: Handle model not found and connection errors gracefully
5. **Context Management**: Manage conversation history to avoid token limits (see the sketch after this list)
6. **Model Selection**: Direct users to https://ollama.com/search to find models
7. **Custom Hosts**: Always ask users for their Ollama URL, don't assume localhost
8. **ES Modules**: Use `"type": "module"` in package.json for modern import syntax
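For point 5, one simple approach is to cap how many turns are kept while preserving the system prompt. This is a sketch only; the 20-message cap is an arbitrary illustration, not a library feature:
```javascript
// Keep the system prompt plus only the most recent turns
function trimHistory(messages, maxMessages = 20) {
  const systemMessages = messages.filter(m => m.role === 'system');
  const rest = messages.filter(m => m.role !== 'system');
  return [...systemMessages, ...rest.slice(-maxMessages)];
}
// Usage: trim before each request in a long-running conversation
// messages = trimHistory(messages);
```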
## Complete Example Script
```javascript
// script.js
import { Ollama } from 'ollama';
const ollama = new Ollama();
async function main() {
const model = 'llama3.2';
// Check connection
try {
await ollama.list();
} catch (error) {
console.log(`Error: Cannot connect to Ollama - ${error.message}`);
console.log('Make sure Ollama is running: ollama serve');
return;
}
// Stream a response
console.log('Asking about JavaScript...\n');
const response = await ollama.generate({
model: model,
prompt: 'Explain JavaScript in one sentence',
stream: true
});
process.stdout.write('Response: ');
for await (const chunk of response) {
process.stdout.write(chunk.response);
}
process.stdout.write('\n');
}
main();
```
### package.json
```json
{
"type": "module",
"dependencies": {
"ollama": "^0.5.0"
}
}
```
### Running
```bash
npm install
node script.js
```

# Ollama Python API Reference
This reference provides comprehensive examples for integrating Ollama into Python projects using the official `ollama` Python library.
**IMPORTANT**: Always use streaming responses for a better user experience.
## Table of Contents
1. [Installation & Setup](#installation--setup)
2. [Verifying Ollama Connection](#verifying-ollama-connection)
3. [Model Selection](#model-selection)
4. [Generate API (Text Completion)](#generate-api-text-completion)
5. [Chat API (Conversational)](#chat-api-conversational)
6. [Embeddings](#embeddings)
7. [Error Handling](#error-handling)
8. [PEP 723 Inline Script Metadata](#pep-723-inline-script-metadata)
## Installation & Setup
### Installation
```bash
pip install ollama
```
### Import
```python
import ollama
```
### Configuration
**IMPORTANT**: Always ask users for their Ollama URL. Do not assume localhost.
```python
# Create client with custom URL
client = ollama.Client(host='http://localhost:11434')
# Or for remote Ollama instance
# client = ollama.Client(host='http://192.168.1.100:11434')
```
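A common pattern is to let an environment variable supply the host and prompt the user when it is unset; `OLLAMA_HOST` is used here because it is the variable the Ollama CLI itself reads, but treat this sketch as one option, not the library's required setup:
```python
import os
import ollama
def make_client():
    """Build a client from OLLAMA_HOST, prompting the user if it is unset."""
    host = os.environ.get("OLLAMA_HOST")
    if not host:
        host = input("Ollama URL (e.g. http://localhost:11434): ").strip()
    return ollama.Client(host=host)
client = make_client()
```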
## Verifying Ollama Connection
### Check Connection (Development)
During development, verify Ollama is running and check available models using curl:
```bash
# Check Ollama is running and get version
curl http://localhost:11434/api/version
# List available models
curl http://localhost:11434/api/tags
```
### Check Ollama Version (Python)
```python
import ollama
def check_ollama():
"""Check if Ollama is running."""
try:
# Simple way to verify connection
models = ollama.list()
print(f"✓ Connected to Ollama")
print(f" Available models: {len(models.get('models', []))}")
return True
except Exception as e:
print(f"✗ Failed to connect to Ollama: {e}")
return False
# Usage
check_ollama()
```
## Model Selection
**IMPORTANT**: Always ask users which model they want to use. Don't assume a default.
### Listing Available Models
```python
import ollama
def list_available_models():
"""List all locally installed models."""
models = ollama.list()
return [model['name'] for model in models.get('models', [])]
# Usage - show available models to user
available = list_available_models()
print("Available models:")
for model in available:
print(f" - {model}")
```
### Finding Models
If the user doesn't have a model installed or wants to use a different one:
- **Browse models**: Direct them to https://ollama.com/search
- **Popular choices**: llama3.2, llama3.1, mistral, phi3, qwen2.5
- **Specialized models**: codellama (coding), llava (vision), nomic-embed-text (embeddings)
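If the chosen model is not installed yet, the Python library can also pull it instead of shelling out to `ollama pull`. A minimal sketch, assuming the user has confirmed the download first; progress chunks carry at least a 'status' field, though other fields may vary by library version:
```python
import ollama
def pull_model(model):
    """Download a model, printing streamed progress updates."""
    for chunk in ollama.pull(model=model, stream=True):
        print(f"\r{chunk['status']}        ", end="", flush=True)
    print()
# Usage (only after the user has agreed to the download)
# pull_model("llama3.2")
```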
### Model Selection Flow
```python
def select_model():
"""Interactive model selection."""
available = list_available_models()
if not available:
print("No models installed!")
print("Visit https://ollama.com/search to find models")
print("Then run: ollama pull <model-name>")
return None
print("Available models:")
for i, model in enumerate(available, 1):
print(f" {i}. {model}")
# In practice, you'd ask the user to choose
return available[0] # Default to first available
```
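The comment above leaves the actual prompt to the reader. A minimal sketch of an interactive picker; the prompt text and numbering are illustrative:
```python
def prompt_for_model(available):
    """Ask the user to pick one of the locally installed models."""
    for i, model in enumerate(available, 1):
        print(f"  {i}. {model}")
    choice = input("Choose a model by number: ").strip()
    try:
        return available[int(choice) - 1]
    except (ValueError, IndexError):
        # Fall back to the first model if the input is not a valid choice
        return available[0]
```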
## Generate API (Text Completion)
### Streaming Text Generation
```python
import ollama
def generate_stream(prompt, model="llama3.2"):
"""Generate text with streaming (yields tokens as they arrive)."""
stream = ollama.generate(
model=model,
prompt=prompt,
stream=True
)
for chunk in stream:
yield chunk['response']
# Usage
print("Response: ", end="", flush=True)
for token in generate_stream("Why is the sky blue?", model="llama3.2"):
print(token, end="", flush=True)
print()
```
### With Options (Temperature, Top-P, etc.)
```python
def generate_with_options(prompt, model="llama3.2"):
"""Generate with custom sampling parameters."""
stream = ollama.generate(
model=model,
prompt=prompt,
stream=True,
options={
'temperature': 0.7,
'top_p': 0.9,
'top_k': 40,
'num_predict': 100 # Max tokens
}
)
for chunk in stream:
yield chunk['response']
# Usage
print("Response: ", end="", flush=True)
for token in generate_with_options("Write a haiku about programming"):
print(token, end="", flush=True)
print()
```
## Chat API (Conversational)
### Streaming Chat
```python
import ollama
def chat_stream(messages, model="llama3.2"):
"""
Chat with a model using conversation history with streaming.
Args:
messages: List of message dicts with 'role' and 'content'
role can be 'system', 'user', or 'assistant'
"""
stream = ollama.chat(
model=model,
messages=messages,
stream=True
)
for chunk in stream:
yield chunk['message']['content']
# Usage
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"}
]
print("Response: ", end="", flush=True)
for token in chat_stream(messages):
print(token, end="", flush=True)
print()
```
### Multi-turn Conversation
```python
def conversation_loop(model="llama3.2"):
"""Interactive chat loop with streaming responses."""
messages = [
{"role": "system", "content": "You are a helpful assistant."}
]
while True:
user_input = input("\nYou: ")
if user_input.lower() in ['exit', 'quit']:
break
# Add user message
messages.append({"role": "user", "content": user_input})
# Stream response
print("Assistant: ", end="", flush=True)
full_response = ""
for token in chat_stream(messages, model):
print(token, end="", flush=True)
full_response += token
print()
# Add assistant response to history
messages.append({"role": "assistant", "content": full_response})
# Usage
conversation_loop()
```
## Embeddings
### Generate Embeddings
```python
import ollama
def get_embeddings(text, model="nomic-embed-text"):
"""
Generate embeddings for text.
Note: Use an embedding-specific model like 'nomic-embed-text'
Regular models can generate embeddings, but dedicated models work better.
"""
response = ollama.embeddings(
model=model,
prompt=text
)
return response['embedding']
# Usage
embedding = get_embeddings("Hello, world!")
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
```
### Semantic Similarity
```python
import math
def cosine_similarity(vec1, vec2):
"""Calculate cosine similarity between two vectors."""
dot_product = sum(a * b for a, b in zip(vec1, vec2))
magnitude1 = math.sqrt(sum(a * a for a in vec1))
magnitude2 = math.sqrt(sum(b * b for b in vec2))
return dot_product / (magnitude1 * magnitude2)
# Usage
text1 = "The cat sat on the mat"
text2 = "A feline rested on a rug"
text3 = "Python is a programming language"
emb1 = get_embeddings(text1)
emb2 = get_embeddings(text2)
emb3 = get_embeddings(text3)
print(f"Similarity 1-2: {cosine_similarity(emb1, emb2):.3f}") # High
print(f"Similarity 1-3: {cosine_similarity(emb1, emb3):.3f}") # Low
```
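Building on `get_embeddings` and `cosine_similarity` above, a small retrieval helper shows the typical use: rank candidate documents against a query. The documents and query below are illustrative only.
```python
def most_similar(query, documents):
    """Return the (document, score) pair closest to the query."""
    query_embedding = get_embeddings(query)
    scored = [
        (doc, cosine_similarity(query_embedding, get_embeddings(doc)))
        for doc in documents
    ]
    return max(scored, key=lambda pair: pair[1])
# Usage
docs = ["The cat sat on the mat", "Python is a programming language"]
best_doc, best_score = most_similar("a pet resting indoors", docs)
print(f'Best match: "{best_doc}" ({best_score:.3f})')
```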
## Error Handling
### Comprehensive Error Handling
```python
import ollama
def safe_generate_stream(prompt, model="llama3.2"):
"""Generate with comprehensive error handling."""
try:
stream = ollama.generate(
model=model,
prompt=prompt,
stream=True
)
for chunk in stream:
yield chunk['response']
except ollama.ResponseError as e:
# Model not found or other API errors
if "not found" in str(e).lower():
print(f"\n✗ Model '{model}' not found")
print(f" Run: ollama pull {model}")
print(f" Or browse models at: https://ollama.com/search")
else:
print(f"\n✗ API Error: {e}")
except ConnectionError:
print("\n✗ Connection failed. Is Ollama running?")
print(" Start Ollama with: ollama serve")
except Exception as e:
print(f"\n✗ Unexpected error: {e}")
# Usage
print("Response: ", end="", flush=True)
for token in safe_generate_stream("Hello, world!", model="llama3.2"):
print(token, end="", flush=True)
print()
```
### Checking Model Availability
```python
def ensure_model_available(model):
"""Check if model is available, provide guidance if not."""
try:
available = ollama.list()
model_names = [m['name'] for m in available.get('models', [])]
if model not in model_names:
print(f"Model '{model}' not available locally")
print(f"Available models: {', '.join(model_names)}")
print(f"\nTo download: ollama pull {model}")
print(f"Browse models: https://ollama.com/search")
return False
return True
except Exception as e:
print(f"Failed to check models: {e}")
return False
# Usage
if ensure_model_available("llama3.2"):
# Proceed with using the model
pass
```
## Best Practices
1. **Always Use Streaming**: Stream responses for better user experience
2. **Ask About Models**: Don't assume a default model; ask users which model they want to use
3. **Verify Connection**: Check Ollama connection during development with curl
4. **Error Handling**: Handle model not found and connection errors gracefully
5. **Context Management**: Manage conversation history to avoid token limits (see the sketch after this list)
6. **Model Selection**: Direct users to https://ollama.com/search to find models
7. **Custom Hosts**: Always ask users for their Ollama URL, don't assume localhost
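For point 5, one simple approach is to cap how many turns are kept while preserving the system prompt. This is a sketch only; the 20-message cap is an arbitrary illustration, not a library feature:
```python
def trim_history(messages, max_messages=20):
    """Keep the system prompt plus only the most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]
# Usage: trim before each request in a long-running conversation
# messages = trim_history(messages)
```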
## PEP 723 Inline Script Metadata
When creating standalone Python scripts for users, always include inline script metadata at the top of the file using PEP 723 format. This allows tools like `uv` and `pipx` to automatically manage dependencies.
### Format
```python
# /// script
# requires-python = ">=3.8"
# dependencies = [
# "ollama>=0.1.0",
# ]
# ///
import ollama
# Your code here
```
### Running Scripts
Users can run scripts with PEP 723 metadata using:
```bash
# Using uv (recommended)
uv run script.py
# Using pipx
pipx run script.py
# Traditional approach
pip install ollama
python script.py
```
### Complete Example Script
```python
# /// script
# requires-python = ">=3.8"
# dependencies = [
# "ollama>=0.1.0",
# ]
# ///
import ollama
def main():
"""Simple streaming chat example."""
model = "llama3.2"
# Check connection
try:
ollama.list()
except Exception as e:
print(f"Error: Cannot connect to Ollama - {e}")
print("Make sure Ollama is running: ollama serve")
return
# Stream a response
print("Asking about Python...\n")
stream = ollama.generate(
model=model,
prompt="Explain Python in one sentence",
stream=True
)
print("Response: ", end="", flush=True)
for chunk in stream:
print(chunk['response'], end="", flush=True)
print()
if __name__ == "__main__":
main()
```