Initial commit
This commit is contained in:
533
skills/mcp-vector-search-embed-guide/SKILL.md
Normal file
533
skills/mcp-vector-search-embed-guide/SKILL.md
Normal file
@@ -0,0 +1,533 @@
|
||||
---
|
||||
description: Guide for embedding mcp-vector-search as a library in MCP servers
|
||||
---
|
||||
|
||||
# mcp-vector-search Library Integration Guide
|
||||
|
||||
Comprehensive guide for embedding mcp-vector-search within your own MCP server to provide semantic search capabilities over bundled documentation or resources.
|
||||
|
||||
## Overview
|
||||
|
||||
mcp-vector-search can be used as a library to add semantic search to custom MCP servers. This enables you to bundle documentation with your server and provide semantic search without requiring users to configure vector search separately.
|
||||
|
||||
**Use cases:**
|
||||
- Bundle documentation with your MCP server
|
||||
- Provide semantic search over application-specific knowledge bases
|
||||
- Combine vector search with your own custom tools
|
||||
- Deliver turnkey MCP servers with integrated search
|
||||
|
||||
## Adding the Dependency
|
||||
|
||||
Add mcp-vector-search to your `deps.edn`:
|
||||
|
||||
```clojure
|
||||
{:deps
|
||||
{org.hugoduncan/mcp-vector-search
|
||||
{:git/url "https://github.com/hugoduncan/mcp-vector-search"
|
||||
:git/sha "LATEST_SHA_HERE"}}}
|
||||
```
|
||||
|
||||
**Important**: Replace `LATEST_SHA_HERE` with the latest commit SHA from the repository. You can find this by visiting the repository or running:
|
||||
|
||||
```bash
|
||||
git ls-remote https://github.com/hugoduncan/mcp-vector-search HEAD
|
||||
```
|
||||
|
||||
## Classpath vs Filesystem Sources
|
||||
|
||||
mcp-vector-search supports two types of source specifications:
|
||||
|
||||
### Filesystem Sources
|
||||
|
||||
Use the `:path` key with absolute paths:
|
||||
|
||||
```clojure
|
||||
{:sources [{:path "/Users/me/docs/**/*.md"}]}
|
||||
```
|
||||
|
||||
**Characteristics:**
|
||||
- Absolute paths starting with `/`
|
||||
- Files read from the filesystem
|
||||
- Support file watching for automatic re-indexing
|
||||
- Used for development and user-specific content
|
||||
|
||||
### Classpath Sources
|
||||
|
||||
Use the `:class-path` key with relative paths:
|
||||
|
||||
```clojure
|
||||
{:sources [{:class-path "docs/**/*.md"}]}
|
||||
```
|
||||
|
||||
**Characteristics:**
|
||||
- Relative paths without leading `/`
|
||||
- Resources discovered from classpath (JARs, resource directories)
|
||||
- No file watching (read-only resources)
|
||||
- Used for bundled, library-embedded content
|
||||
|
||||
**When to use each:**
|
||||
- **Filesystem**: Development, user-specific content, files that change
|
||||
- **Classpath**: Production, bundled resources, static documentation
|
||||
|
||||
You can mix both types in a single configuration to combine bundled resources with user-specific content.
|
||||
|
||||
## Resource Bundling
|
||||
|
||||
### Directory Structure
|
||||
|
||||
Organize your project to bundle resources:
|
||||
|
||||
```
|
||||
my-mcp-server/
|
||||
├── resources/
|
||||
│ ├── .mcp-vector-search/
|
||||
│ │ └── config.edn # Configuration for bundled resources
|
||||
│ └── docs/
|
||||
│ ├── guide.md
|
||||
│ ├── api/
|
||||
│ │ └── reference.md
|
||||
│ └── examples/
|
||||
│ └── usage.md
|
||||
├── src/
|
||||
│ └── my_server/
|
||||
│ └── main.clj
|
||||
└── deps.edn
|
||||
```
|
||||
|
||||
**Key points:**
|
||||
- Place documentation in `resources/` directory
|
||||
- Configuration goes in `resources/.mcp-vector-search/config.edn`
|
||||
- Resources will be bundled in the JAR when you package your server
|
||||
- Standard Clojure project structure with `resources/` on the classpath
|
||||
|
||||
### Configuration for Classpath Resources
|
||||
|
||||
Create `resources/.mcp-vector-search/config.edn`:
|
||||
|
||||
```clojure
|
||||
{:description "Search my-server documentation"
|
||||
:sources [{:class-path "docs/**/*.md"
|
||||
:name "Documentation"
|
||||
:type "docs"}]}
|
||||
```
|
||||
|
||||
**Configuration rules:**
|
||||
- Use `:class-path` instead of `:path`
|
||||
- Paths are relative (no leading `/`)
|
||||
- All path spec features work: globs (`**/*.md`), named captures (`(?<category>[^/]+)`)
|
||||
- Metadata extraction works identically
|
||||
- File watching (`:watch?`) has no effect on classpath sources
|
||||
|
||||
### Starting the Server
|
||||
|
||||
In your MCP server's main namespace:
|
||||
|
||||
```clojure
|
||||
(ns my-server.main
|
||||
(:require
|
||||
[mcp-vector-search.main :as vector-search]))
|
||||
|
||||
(defn -main []
|
||||
;; Start vector search with classpath config
|
||||
;; The config at resources/.mcp-vector-search/config.edn
|
||||
;; will be found automatically on the classpath
|
||||
(vector-search/start)
|
||||
|
||||
;; Your server code here
|
||||
...)
|
||||
```
|
||||
|
||||
The server will:
|
||||
1. Search for `.mcp-vector-search/config.edn` on the classpath
|
||||
2. Discover and index resources from `:class-path` sources
|
||||
3. Provide the search tool via MCP
|
||||
|
||||
## Configuration Loading Order
|
||||
|
||||
When using mcp-vector-search as a library, configuration is loaded from the first location found:
|
||||
|
||||
1. **Classpath**: `resources/.mcp-vector-search/config.edn` (bundled in JAR)
|
||||
2. **Project directory**: `./.mcp-vector-search/config.edn` (runtime override)
|
||||
3. **Home directory**: `~/.mcp-vector-search/config.edn` (user-specific)
|
||||
|
||||
This loading order enables:
|
||||
- **Bundling defaults**: Ship default configuration in your JAR
|
||||
- **Project overrides**: Provide project-specific config during development
|
||||
- **User customization**: Allow users to customize via home directory
|
||||
|
||||
**Example workflow:**
|
||||
|
||||
Development with project override:
|
||||
```bash
|
||||
# Create project-specific config during development
|
||||
mkdir -p .mcp-vector-search
|
||||
cat > .mcp-vector-search/config.edn <<EOF
|
||||
{:description "Development search with local files"
|
||||
:watch? true
|
||||
:sources [{:path "/Users/me/project/docs/**/*.md"}]}
|
||||
EOF
|
||||
```
|
||||
|
||||
Production with bundled config:
|
||||
```bash
|
||||
# Build JAR - includes resources/.mcp-vector-search/config.edn
|
||||
clojure -T:build uber
|
||||
# Run - uses bundled config
|
||||
java -jar target/my-server.jar
|
||||
```
|
||||
|
||||
## JAR Packaging
|
||||
|
||||
### Build Configuration
|
||||
|
||||
Ensure `resources/` is in your `:paths` in `deps.edn`:
|
||||
|
||||
```clojure
|
||||
{:paths ["src" "resources"]
|
||||
:deps {org.clojure/clojure {:mvn/version "1.12.3"}
|
||||
org.hugoduncan/mcp-vector-search
|
||||
{:git/url "https://github.com/hugoduncan/mcp-vector-search"
|
||||
:git/sha "LATEST_SHA_HERE"}}
|
||||
:aliases
|
||||
{:build {:deps {io.github.clojure/tools.build {:mvn/version "0.9.6"}}
|
||||
:ns-default build}}}
|
||||
```
|
||||
|
||||
### Build Script
|
||||
|
||||
Create `build.clj`:
|
||||
|
||||
```clojure
|
||||
(ns build
|
||||
(:require [clojure.tools.build.api :as b]))
|
||||
|
||||
(def class-dir "target/classes")
|
||||
(def uber-file "target/my-server.jar")
|
||||
|
||||
(defn uber [_]
|
||||
(b/copy-dir {:src-dirs ["src" "resources"]
|
||||
:target-dir class-dir})
|
||||
(b/compile-clj {:basis (b/create-basis)
|
||||
:class-dir class-dir})
|
||||
(b/uber {:class-dir class-dir
|
||||
:uber-file uber-file
|
||||
:basis (b/create-basis)
|
||||
:main 'my-server.main}))
|
||||
```
|
||||
|
||||
**Important**: Include both `"src"` and `"resources"` in `:src-dirs` to bundle all resources.
|
||||
|
||||
### Building and Running
|
||||
|
||||
```bash
|
||||
# Build the JAR
|
||||
clojure -T:build uber
|
||||
|
||||
# Verify resources are included
|
||||
jar tf target/my-server.jar | grep -E "(docs|.mcp-vector-search)"
|
||||
|
||||
# Run the server
|
||||
java -jar target/my-server.jar
|
||||
```
|
||||
|
||||
Users can now run your MCP server with semantic search over your bundled documentation, without any configuration needed.
|
||||
|
||||
## Example Configurations
|
||||
|
||||
### Bundled API Documentation
|
||||
|
||||
```clojure
|
||||
;; resources/.mcp-vector-search/config.edn
|
||||
{:description "Search API documentation and examples"
|
||||
:sources [{:class-path "api-docs/**/*.md"
|
||||
:category "api"}
|
||||
{:class-path "examples/**/*.clj"
|
||||
:ingest :namespace-doc
|
||||
:category "examples"}]}
|
||||
```
|
||||
|
||||
### Mixed Classpath and Filesystem
|
||||
|
||||
Combine bundled resources with filesystem sources:
|
||||
|
||||
```clojure
|
||||
{:description "Search bundled docs and local project files"
|
||||
:sources [
|
||||
;; Bundled documentation (from JAR)
|
||||
{:class-path "lib-docs/**/*.md"
|
||||
:source "library"
|
||||
:type "docs"}
|
||||
|
||||
;; Local project documentation (filesystem)
|
||||
{:path "/Users/me/project/docs/**/*.md"
|
||||
:source "project"
|
||||
:type "docs"}
|
||||
|
||||
;; Local source code (filesystem)
|
||||
{:path "/Users/me/project/src/**/*.clj"
|
||||
:ingest :namespace-doc
|
||||
:source "project"
|
||||
:type "source"}]}
|
||||
```
|
||||
|
||||
### Multi-Language Documentation
|
||||
|
||||
Extract language metadata from paths:
|
||||
|
||||
```clojure
|
||||
{:description "Search multi-language documentation"
|
||||
:sources [{:class-path "docs/(?<lang>en|es|fr)/**/*.md"
|
||||
:type "documentation"}]}
|
||||
```
|
||||
|
||||
For resource `docs/en/guide.md`:
|
||||
- Metadata: `{:type "documentation", :lang "en"}`
|
||||
|
||||
### Code Analysis for API Discovery
|
||||
|
||||
```clojure
|
||||
{:description "Search bundled API documentation"
|
||||
:sources [{:class-path "src/**/*.clj"
|
||||
:ingest :code-analysis
|
||||
:visibility :public-only
|
||||
:category "api"}]}
|
||||
```
|
||||
|
||||
## Complete Example
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
my-mcp-server/
|
||||
├── resources/
|
||||
│ ├── .mcp-vector-search/
|
||||
│ │ └── config.edn
|
||||
│ └── docs/
|
||||
│ ├── README.md
|
||||
│ └── api/
|
||||
│ └── reference.md
|
||||
├── src/
|
||||
│ └── my_server/
|
||||
│ └── main.clj
|
||||
├── deps.edn
|
||||
└── build.clj
|
||||
```
|
||||
|
||||
### Configuration File
|
||||
|
||||
```clojure
|
||||
;; resources/.mcp-vector-search/config.edn
|
||||
{:description "Search my-server documentation"
|
||||
:sources [{:class-path "docs/**/*.md"
|
||||
:project "my-server"
|
||||
:type "documentation"}]}
|
||||
```
|
||||
|
||||
### Main Server Code
|
||||
|
||||
```clojure
|
||||
;; src/my_server/main.clj
|
||||
(ns my-server.main
|
||||
(:require
|
||||
[mcp-vector-search.main :as vector-search]
|
||||
[mcp-vector-search.server :as mcp-server]))
|
||||
|
||||
(defn -main []
|
||||
;; Start vector search (loads classpath config automatically)
|
||||
(vector-search/start)
|
||||
|
||||
;; Add your custom tools here
|
||||
;; ...
|
||||
|
||||
;; Start MCP server
|
||||
(mcp-server/start!))
|
||||
```
|
||||
|
||||
### Dependencies
|
||||
|
||||
```clojure
|
||||
;; deps.edn
|
||||
{:paths ["src" "resources"]
|
||||
:deps {org.clojure/clojure {:mvn/version "1.12.3"}
|
||||
org.hugoduncan/mcp-vector-search
|
||||
{:git/url "https://github.com/hugoduncan/mcp-vector-search"
|
||||
:git/sha "LATEST_SHA_HERE"}}
|
||||
:aliases
|
||||
{:build {:deps {io.github.clojure/tools.build {:mvn/version "0.9.6"}}
|
||||
:ns-default build}}}
|
||||
```
|
||||
|
||||
### Build Configuration
|
||||
|
||||
```clojure
|
||||
;; build.clj
|
||||
(ns build
|
||||
(:require [clojure.tools.build.api :as b]))
|
||||
|
||||
(def class-dir "target/classes")
|
||||
(def uber-file "target/my-server.jar")
|
||||
|
||||
(defn uber [_]
|
||||
(b/copy-dir {:src-dirs ["src" "resources"]
|
||||
:target-dir class-dir})
|
||||
(b/compile-clj {:basis (b/create-basis)
|
||||
:class-dir class-dir})
|
||||
(b/uber {:class-dir class-dir
|
||||
:uber-file uber-file
|
||||
:basis (b/create-basis)
|
||||
:main 'my-server.main}))
|
||||
```
|
||||
|
||||
## Limitations
|
||||
|
||||
### No File Watching for Classpath Sources
|
||||
|
||||
Classpath sources do not support file watching:
|
||||
- Resources in JARs are read-only
|
||||
- Classpath resources are indexed once at startup
|
||||
- The `:watch?` flag has no effect on `:class-path` sources
|
||||
|
||||
**Workaround for development**: Use filesystem sources instead:
|
||||
|
||||
```clojure
|
||||
;; Development config (local file) - with watching
|
||||
{:watch? true
|
||||
:sources [{:path "/Users/me/project/resources/docs/**/*.md"}]}
|
||||
|
||||
;; Production config (bundled) - no watching needed
|
||||
{:sources [{:class-path "docs/**/*.md"}]}
|
||||
```
|
||||
|
||||
### Resource Discovery Performance
|
||||
|
||||
Classpath resource discovery walks the entire classpath:
|
||||
- Use specific base paths to limit scope (e.g., `docs/**/*.md` not `**/*.md`)
|
||||
- Initial indexing may take longer than filesystem sources
|
||||
- Avoid overly broad patterns for large classpaths
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Configuration Not Found
|
||||
|
||||
**Error**: `No configuration file found`
|
||||
|
||||
**Check:**
|
||||
1. File is in `resources/.mcp-vector-search/config.edn`
|
||||
2. `resources/` is on the classpath (should be automatic in `:paths`)
|
||||
3. File is valid EDN syntax
|
||||
4. No typos in directory or file name
|
||||
|
||||
**Verify config is on classpath:**
|
||||
```clojure
|
||||
(require '[clojure.java.io :as io])
|
||||
(io/resource ".mcp-vector-search/config.edn") ;; Should return URL
|
||||
```
|
||||
|
||||
### Resources Not Indexed
|
||||
|
||||
**Error**: `No documents indexed from classpath sources`
|
||||
|
||||
**Check:**
|
||||
1. Resources exist in `resources/` directory
|
||||
2. Path patterns match your directory structure
|
||||
3. No leading `/` in `:class-path` values (should be relative)
|
||||
4. Resources are included in the JAR (for packaged servers)
|
||||
|
||||
**Verify resources are on classpath:**
|
||||
```clojure
|
||||
(require '[clojure.java.io :as io])
|
||||
(io/resource "docs/guide.md") ;; Should return URL
|
||||
```
|
||||
|
||||
### JAR Build Issues
|
||||
|
||||
**Error**: Resources missing from JAR
|
||||
|
||||
**Check:**
|
||||
|
||||
1. Ensure `resources/` is in `:paths` in `deps.edn`:
|
||||
```clojure
|
||||
{:paths ["src" "resources"]
|
||||
:deps {...}}
|
||||
```
|
||||
|
||||
2. Ensure `resources/` is in `:src-dirs` in `build.clj`:
|
||||
```clojure
|
||||
(b/copy-dir {:src-dirs ["src" "resources"]
|
||||
:target-dir class-dir})
|
||||
```
|
||||
|
||||
3. Verify the JAR contains resources:
|
||||
```bash
|
||||
jar tf my-server.jar | grep docs
|
||||
jar tf my-server.jar | grep .mcp-vector-search
|
||||
```
|
||||
|
||||
### Path Pattern Issues
|
||||
|
||||
**Error**: Some resources not matched
|
||||
|
||||
**Common mistakes:**
|
||||
- Using leading `/` with `:class-path` (should be relative)
|
||||
- Forgetting `**` for recursive matching
|
||||
- Regex special characters not escaped in captures
|
||||
|
||||
**Test patterns:**
|
||||
```clojure
|
||||
;; Wrong - leading slash
|
||||
{:class-path "/docs/**/*.md"}
|
||||
|
||||
;; Correct - relative path
|
||||
{:class-path "docs/**/*.md"}
|
||||
|
||||
;; Wrong - single glob doesn't recurse
|
||||
{:class-path "docs/*.md"} ;; only matches docs/file.md
|
||||
|
||||
;; Correct - recursive glob
|
||||
{:class-path "docs/**/*.md"} ;; matches docs/api/file.md
|
||||
```
|
||||
|
||||
## Tips and Best Practices
|
||||
|
||||
**Resource organization:**
|
||||
- Place all bundled content in `resources/` directory
|
||||
- Use clear directory structure (e.g., `docs/`, `api/`, `examples/`)
|
||||
- Include `.mcp-vector-search/config.edn` in `resources/`
|
||||
- Keep resources focused on bundled content
|
||||
|
||||
**Configuration strategy:**
|
||||
- Ship sensible defaults in `resources/.mcp-vector-search/config.edn`
|
||||
- Use relative paths with `:class-path` for bundled content
|
||||
- Document how users can override via project or home directory config
|
||||
- Provide example configs in your server's README
|
||||
|
||||
**Path specifications:**
|
||||
- Use specific base paths (e.g., `docs/**/*.md` not `**/*.md`)
|
||||
- Leverage named captures for metadata extraction
|
||||
- Test path patterns during development
|
||||
- Avoid overly broad patterns for performance
|
||||
|
||||
**Build process:**
|
||||
- Always include `"resources"` in both `:paths` and `:src-dirs`
|
||||
- Verify JAR contents after building
|
||||
- Test the JAR in a clean environment
|
||||
- Document build requirements in your README
|
||||
|
||||
**Development workflow:**
|
||||
- Use filesystem sources with `:watch?` during development
|
||||
- Switch to classpath sources for production builds
|
||||
- Keep development and production configs separate
|
||||
- Test with both config types before releasing
|
||||
|
||||
**Testing:**
|
||||
- Verify config loading from classpath
|
||||
- Check resource discovery and indexing
|
||||
- Test search functionality with bundled resources
|
||||
- Validate JAR packaging and distribution
|
||||
|
||||
## See Also
|
||||
|
||||
- **doc/using.md** - Complete configuration reference for all strategies and options
|
||||
- **doc/path-spec.md** - Formal syntax for path specifications and captures
|
||||
- **doc/about.md** - Project purpose and use cases
|
||||
- **README.md** - Quick start guide for mcp-vector-search
|
||||
Reference in New Issue
Block a user