Initial commit

2025-11-30 08:50:46 +08:00
commit 34273b642f
9 changed files with 9081 additions and 0 deletions
--- a/skills/ollama/SKILL.md
+++ b/skills/ollama/SKILL.md
@@ -0,0 +1,470 @@
+---
+name: ollama
+description: Ollama API Documentation
+---
+
+# Ollama Skill
+
+Comprehensive assistance with Ollama development - the local AI model runtime for running and interacting with large language models programmatically.
+
+## When to Use This Skill
+
+This skill should be triggered when:
+- Running local AI models with Ollama
+- Building applications that interact with Ollama's API
+- Implementing chat completions, embeddings, or streaming responses
+- Setting up Ollama authentication or cloud models
+- Configuring Ollama server (environment variables, ports, proxies)
+- Using Ollama with OpenAI-compatible libraries
+- Troubleshooting Ollama installations or GPU compatibility
+- Implementing tool calling, structured outputs, or vision capabilities
+- Working with Ollama in Docker or behind proxies
+- Creating, copying, pushing, or managing Ollama models
+
+## Quick Reference
+
+### 1. Basic Chat Completion (cURL)
+
+Generate a simple chat response:
+
+```bash
+curl http://localhost:11434/api/chat -d '{
+  "model": "gemma3",
+  "messages": [
+    {
+      "role": "user",
+      "content": "Why is the sky blue?"
+    }
+  ]
+}'
+```
+
+### 2. Simple Text Generation (cURL)
+
+Generate a text response from a prompt:
+
+```bash
+curl http://localhost:11434/api/generate -d '{
+  "model": "gemma3",
+  "prompt": "Why is the sky blue?"
+}'
+```
+
+### 3. Python Chat with OpenAI Library
+
+Use Ollama with the OpenAI Python library:
+
+```python
+from openai import OpenAI
+
+client = OpenAI(
+    base_url='http://localhost:11434/v1/',
+    api_key='ollama',  # required but ignored
+)
+
+chat_completion = client.chat.completions.create(
+    messages=[
+        {
+            'role': 'user',
+            'content': 'Say this is a test',
+        }
+    ],
+    model='llama3.2',
+)
+```
+
+### 4. Vision Model (Image Analysis)
+
+Ask questions about images:
+
+```python
+from openai import OpenAI
+
+client = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")
+
+response = client.chat.completions.create(
+    model="llava",
+    messages=[
+        {
+            "role": "user",
+            "content": [
+                {"type": "text", "text": "What's in this image?"},
+                {
+                    "type": "image_url",
+                    "image_url": "data:image/png;base64,iVBORw0KG...",
+                },
+            ],
+        }
+    ],
+    max_tokens=300,
+)
+```
+
+### 5. Generate Embeddings
+
+Create vector embeddings for text:
+
+```python
+client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
+
+embeddings = client.embeddings.create(
+    model="all-minilm",
+    input=["why is the sky blue?", "why is the grass green?"],
+)
+```
+
+### 6. Structured Outputs (JSON Schema)
+
+Get structured JSON responses:
+
+```python
+from pydantic import BaseModel
+from openai import OpenAI
+
+client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
+
+class FriendInfo(BaseModel):
+    name: str
+    age: int
+    is_available: bool
+
+class FriendList(BaseModel):
+    friends: list[FriendInfo]
+
+completion = client.beta.chat.completions.parse(
+    temperature=0,
+    model="llama3.1:8b",
+    messages=[
+        {"role": "user", "content": "Return a list of friends in JSON format"}
+    ],
+    response_format=FriendList,
+)
+
+friends_response = completion.choices[0].message
+if friends_response.parsed:
+    print(friends_response.parsed)
+```
+
+### 7. JavaScript/TypeScript Chat
+
+Use Ollama with the OpenAI JavaScript library:
+
+```javascript
+import OpenAI from "openai";
+
+const openai = new OpenAI({
+  baseURL: "http://localhost:11434/v1/",
+  apiKey: "ollama",  // required but ignored
+});
+
+const chatCompletion = await openai.chat.completions.create({
+  messages: [{ role: "user", content: "Say this is a test" }],
+  model: "llama3.2",
+});
+```
+
+### 8. Authentication for Cloud Models
+
+Sign in to use cloud models:
+
+```bash
+# Sign in from CLI
+ollama signin
+
+# Then use cloud models
+ollama run gpt-oss:120b-cloud
+```
+
+Or use API keys for direct cloud access:
+
+```bash
+export OLLAMA_API_KEY=your_api_key
+
+curl https://ollama.com/api/generate \
+  -H "Authorization: Bearer $OLLAMA_API_KEY" \
+  -d '{
+    "model": "gpt-oss:120b",
+    "prompt": "Why is the sky blue?",
+    "stream": false
+  }'
+```
+
+### 9. Configure Ollama Server
+
+Set environment variables for server configuration:
+
+**macOS:**
+```bash
+# Set environment variable
+launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
+
+# Restart Ollama application
+```
+
+**Linux (systemd):**
+```bash
+# Edit service
+systemctl edit ollama.service
+
+# Add under [Service]
+Environment="OLLAMA_HOST=0.0.0.0:11434"
+
+# Reload and restart
+systemctl daemon-reload
+systemctl restart ollama
+```
+
+**Windows:**
+```
+1. Quit Ollama from task bar
+2. Search "environment variables" in Settings
+3. Edit or create OLLAMA_HOST variable
+4. Set value: 0.0.0.0:11434
+5. Restart Ollama from Start menu
+```
+
+### 10. Check Model GPU Loading
+
+Verify if your model is using GPU:
+
+```bash
+ollama ps
+```
+
+Output shows:
+- `100% GPU` - Fully loaded on GPU
+- `100% CPU` - Fully loaded in system memory
+- `48%/52% CPU/GPU` - Split between both
+
+## Key Concepts
+
+### Base URLs
+
+- **Local API (default)**: `http://localhost:11434/api`
+- **Cloud API**: `https://ollama.com/api`
+- **OpenAI Compatible**: `/v1/` endpoints for OpenAI libraries
+
+### Authentication
+
+- **Local**: No authentication required for `http://localhost:11434`
+- **Cloud Models**: Requires signing in (`ollama signin`) or API key
+- **API Keys**: For programmatic access to `https://ollama.com/api`
+
+### Models
+
+- **Local Models**: Run on your machine (e.g., `gemma3`, `llama3.2`, `qwen3`)
+- **Cloud Models**: Suffix `-cloud` (e.g., `gpt-oss:120b-cloud`, `qwen3-coder:480b-cloud`)
+- **Vision Models**: Support image inputs (e.g., `llava`)
+
+### Common Environment Variables
+
+- `OLLAMA_HOST` - Change bind address (default: `127.0.0.1:11434`)
+- `OLLAMA_CONTEXT_LENGTH` - Context window size (default: `2048` tokens)
+- `OLLAMA_MODELS` - Model storage directory
+- `OLLAMA_ORIGINS` - Allow additional web origins for CORS
+- `HTTPS_PROXY` - Proxy server for model downloads
+
+### Error Handling
+
+**Status Codes:**
+- `200` - Success
+- `400` - Bad Request (invalid parameters)
+- `404` - Not Found (model doesn't exist)
+- `429` - Too Many Requests (rate limit)
+- `500` - Internal Server Error
+- `502` - Bad Gateway (cloud model unreachable)
+
+**Error Format:**
+```json
+{
+  "error": "the model failed to generate a response"
+}
+```
+
+### Streaming vs Non-Streaming
+
+- **Streaming** (default): Returns response chunks as JSON objects (NDJSON)
+- **Non-Streaming**: Set `"stream": false` to get complete response in one object
+
+## Reference Files
+
+This skill includes comprehensive documentation in `references/`:
+
+- **llms-txt.md** - Complete API reference covering:
+  - All API endpoints (`/api/generate`, `/api/chat`, `/api/embed`, etc.)
+  - Authentication methods (signin, API keys)
+  - Error handling and status codes
+  - OpenAI compatibility layer
+  - Cloud models usage
+  - Streaming responses
+  - Configuration and environment variables
+
+- **llms.md** - Documentation index listing all available topics:
+  - API reference (version, model details, chat, generate, embeddings)
+  - Capabilities (embeddings, streaming, structured outputs, tool calling, vision)
+  - CLI reference
+  - Cloud integration
+  - Platform-specific guides (Linux, macOS, Windows, Docker)
+  - IDE integrations (VS Code, JetBrains, Xcode, Zed, Cline)
+
+Use the reference files when you need:
+- Detailed API parameter specifications
+- Complete endpoint documentation
+- Advanced configuration options
+- Platform-specific setup instructions
+- Integration guides for specific tools
+
+## Working with This Skill
+
+### For Beginners
+
+Start with these common patterns:
+1. **Simple generation**: Use `/api/generate` endpoint with a prompt
+2. **Chat interface**: Use `/api/chat` with messages array
+3. **OpenAI compatibility**: Use OpenAI libraries with `base_url='http://localhost:11434/v1/'`
+4. **Check GPU usage**: Run `ollama ps` to verify model loading
+
+Read `llms-txt.md` section on "Introduction" and "Quickstart" for foundational concepts.
+
+### For Intermediate Users
+
+Focus on:
+- **Embeddings** for semantic search and RAG applications
+- **Structured outputs** with JSON schema validation
+- **Vision models** for image analysis
+- **Streaming** for real-time response generation
+- **Authentication** for cloud models
+
+Check the specific API endpoints in `llms-txt.md` for detailed parameter options.
+
+### For Advanced Users
+
+Explore:
+- **Tool calling** for function execution
+- **Custom model creation** with Modelfiles
+- **Server configuration** with environment variables
+- **Proxy setup** for network-restricted environments
+- **Docker deployment** with custom configurations
+- **Performance optimization** with GPU settings
+
+Refer to platform-specific sections in `llms.md` and configuration details in `llms-txt.md`.
+
+### Common Use Cases
+
+**Building a chatbot:**
+1. Use `/api/chat` endpoint
+2. Maintain message history in your application
+3. Stream responses for better UX
+4. Handle errors gracefully
+
+**Creating embeddings for search:**
+1. Use `/api/embed` endpoint
+2. Store embeddings in vector database
+3. Perform similarity search
+4. Implement RAG (Retrieval Augmented Generation)
+
+**Running behind a firewall:**
+1. Set `HTTPS_PROXY` environment variable
+2. Configure proxy in Docker if containerized
+3. Ensure certificates are trusted
+
+**Using cloud models:**
+1. Run `ollama signin` once
+2. Pull cloud models with `-cloud` suffix
+3. Use same API endpoints as local models
+
+## Troubleshooting
+
+### Model Not Loading on GPU
+
+**Check:**
+```bash
+ollama ps
+```
+
+**Solutions:**
+- Verify GPU compatibility in documentation
+- Check CUDA/ROCm installation
+- Review available VRAM
+- Try smaller model variants
+
+### Cannot Access Ollama Remotely
+
+**Problem:** Ollama only accessible from localhost
+
+**Solution:**
+```bash
+# Set OLLAMA_HOST to bind to all interfaces
+export OLLAMA_HOST="0.0.0.0:11434"
+```
+
+See "How do I configure Ollama server?" in `llms-txt.md` for platform-specific instructions.
+
+### Proxy Issues
+
+**Problem:** Cannot download models behind proxy
+
+**Solution:**
+```bash
+# Set proxy (HTTPS only, not HTTP)
+export HTTPS_PROXY=https://proxy.example.com
+
+# Restart Ollama
+```
+
+See "How do I use Ollama behind a proxy?" in `llms-txt.md`.
+
+### CORS Errors in Browser
+
+**Problem:** Browser extension or web app cannot access Ollama
+
+**Solution:**
+```bash
+# Allow specific origins
+export OLLAMA_ORIGINS="chrome-extension://*,moz-extension://*"
+```
+
+See "How can I allow additional web origins?" in `llms-txt.md`.
+
+## Resources
+
+### Official Documentation
+- Main docs: https://docs.ollama.com
+- API Reference: https://docs.ollama.com/api
+- Model Library: https://ollama.com/models
+
+### Official Libraries
+- Python: https://github.com/ollama/ollama-python
+- JavaScript: https://github.com/ollama/ollama-js
+
+### Community
+- GitHub: https://github.com/ollama/ollama
+- Community Libraries: See GitHub README for full list
+
+## Notes
+
+- This skill was generated from official Ollama documentation
+- All examples are tested and working with Ollama's API
+- Code samples include proper language detection for syntax highlighting
+- Reference files preserve structure from official docs with working links
+- OpenAI compatibility means most OpenAI code works with minimal changes
+
+## Quick Command Reference
+
+```bash
+# CLI Commands
+ollama signin                    # Sign in to ollama.com
+ollama run gemma3               # Run a model interactively
+ollama pull gemma3              # Download a model
+ollama ps                       # List running models
+ollama list                     # List installed models
+
+# Check API Status
+curl http://localhost:11434/api/version
+
+# Environment Variables (Common)
+export OLLAMA_HOST="0.0.0.0:11434"
+export OLLAMA_CONTEXT_LENGTH=8192
+export OLLAMA_ORIGINS="*"
+export HTTPS_PROXY="https://proxy.example.com"
+```