query - vector query
The query() method performs a vector similarity search and returns the documents most similar to the query vector.
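As a rough mental model, a vector similarity search ranks the stored vectors by their distance to the query vector and returns the closest ones. The sketch below does this by brute force in plain Python. The squared Euclidean metric, the brute_force_query helper, and the sample vectors are assumptions made for illustration; the metric and index that pyseekdb actually uses depend on how the collection is configured.

```python
from typing import Dict, List, Tuple


def squared_l2(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_query(
    stored: Dict[str, List[float]],
    query_vector: List[float],
    n_results: int = 10,
) -> List[Tuple[str, float]]:
    """Return the n_results closest (id, distance) pairs, nearest first."""
    scored = [(doc_id, squared_l2(vec, query_vector)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:n_results]


# Hypothetical stored vectors, keyed by ID (not real collection data)
stored_vectors = {
    "vec1": [1.0, 2.0, 3.0],
    "vec2": [1.0, 2.0, 4.0],
    "vec3": [9.0, 9.0, 9.0],
}

nearest = brute_force_query(stored_vectors, [1.0, 2.0, 3.0], n_results=2)
print(nearest)  # [('vec1', 0.0), ('vec2', 1.0)]
```

A real vector database avoids the full scan with an index, but the shape of the answer is the same: IDs paired with distances, nearest first.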
:::info
This interface is only available when using the Client. For more information about the Client, see Client.
:::
Prerequisites
- You have installed pyseekdb. For more information about how to install pyseekdb, see Get Started.
- You have connected to the database. For more information about how to connect to the database, see Client.
- You have created a collection and inserted data. For more information about how to create a collection and insert data, see create_collection - Create a collection and add - Insert data.
Request parameters
query()
| Parameter | Value type | Required | Description | Example value |
|---|---|---|---|---|
| query_embeddings | List[float] or List[List[float]] | Yes, unless query_texts is provided | A single vector, or a list of vectors for a batch query. If provided, it is used directly and the embedding_function is ignored. If not provided, query_texts must be provided and the collection must have an embedding_function. | [1.0, 2.0, 3.0] |
| query_texts | str or List[str] | No | A single text, or a list of texts for a batch query. The collection's embedding_function converts the texts to query vectors, so the collection must have an embedding_function. If not provided, query_embeddings must be provided. | ["my query text"] |
| n_results | int | No | The number of similar results to return. Defaults to 10. | 3 |
| where | dict | No | Metadata filter conditions. | {"category": {"$eq": "AI"}} |
| where_document | dict | No | Document filter conditions. | {"$contains": "machine"} |
| include | List[str] | No | The fields to include in the result. Supported values: "documents", "metadatas", "embeddings". | ["documents", "metadatas", "embeddings"] |
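Both query_embeddings and query_texts accept either a single item or a list of items. One way to reason about this is that a single item behaves like a batch of one, which is consistent with the results being nested one list per query. The as_batch helper below is a hypothetical sketch of that normalization, not pyseekdb code.

```python
from typing import List, Union

Vector = List[float]


def as_batch(query_embeddings: Union[Vector, List[Vector]]) -> List[Vector]:
    """Wrap a single vector in a list so callers always handle a batch."""
    if not query_embeddings:
        raise ValueError("query_embeddings must not be empty")
    first = query_embeddings[0]
    if isinstance(first, (int, float)):
        # A flat list of numbers is one vector: make it a batch of one
        return [list(query_embeddings)]
    # Already a list of vectors: copy each one
    return [list(vec) for vec in query_embeddings]


single = as_batch([1.0, 2.0, 3.0])
batch = as_batch([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
print(len(single), len(batch))  # 1 2
```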
:::info
The embedding_function used is associated with the collection (set during create_collection() or get_collection()). You cannot override it for each operation.
:::
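To make the where and where_document parameters more concrete, the following sketch evaluates the filter shapes used on this page against an in-memory record. It covers only the $eq, $gte, and $contains operators that appear in the examples. The matches_where and matches_where_document helpers are hypothetical, and two behaviors are assumptions rather than documented facts: conditions on several fields must all hold, and $contains is a case-sensitive substring test.

```python
from typing import Any, Dict


def matches_where(metadata: Dict[str, Any], where: Dict[str, Dict[str, Any]]) -> bool:
    """Check metadata against {"field": {"$op": value}} conditions (all must hold)."""
    for field, condition in where.items():
        if field not in metadata:
            return False
        value = metadata[field]
        for op, expected in condition.items():
            if op == "$eq":
                ok = value == expected
            elif op == "$gte":
                ok = value >= expected
            else:
                raise ValueError(f"operator not covered by this sketch: {op}")
            if not ok:
                return False
    return True


def matches_where_document(document: str, where_document: Dict[str, str]) -> bool:
    """Check document text against a {"$contains": substring} condition."""
    return where_document["$contains"] in document


# Hypothetical record, not real collection data
record = {
    "document": "An intro to machine learning",
    "metadata": {"category": "AI", "score": 95},
}

combined = matches_where(record["metadata"], {"category": {"$eq": "AI"}, "score": {"$gte": 90}})
too_strict = matches_where(record["metadata"], {"score": {"$gte": 99}})
text_hit = matches_where_document(record["document"], {"$contains": "machine"})
print(combined, too_strict, text_hit)  # True False True
```

In an actual query the filtering happens inside the database, and only records that pass the filters are ranked by similarity.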
Request example
```python
import pyseekdb

# Create a client
client = pyseekdb.Client()
collection = client.get_collection("my_collection")
collection1 = client.get_collection("my_collection1")

# Basic vector similarity query (embedding_function not used)
results = collection.query(
    query_embeddings=[1.0, 2.0, 3.0],
    n_results=3
)

# Iterate over results
for i in range(len(results["ids"][0])):
    print(f"ID: {results['ids'][0][i]}, Distance: {results['distances'][0][i]}")
    if results.get("documents"):
        print(f"Document: {results['documents'][0][i]}")
    if results.get("metadatas"):
        print(f"Metadata: {results['metadatas'][0][i]}")

# Query by texts - vectors auto-generated by embedding_function
# Requires: collection must have embedding_function set
results = collection1.query(
    query_texts=["my query text"],
    n_results=10
)
# The collection's embedding_function will automatically convert query_texts to query_embeddings

# Query by multiple texts (batch query)
results = collection1.query(
    query_texts=["query text 1", "query text 2"],
    n_results=5
)
# Returns dict with lists of lists, one list per query text
for i in range(len(results["ids"])):
    print(f"Query {i}: {len(results['ids'][i])} results")

# Query with metadata filter (using query_texts)
results = collection1.query(
    query_texts=["AI research"],
    where={"category": {"$eq": "AI"}},
    n_results=5
)

# Query with comparison operator (using query_texts)
results = collection1.query(
    query_texts=["machine learning"],
    where={"score": {"$gte": 90}},
    n_results=5
)

# Query with document filter (using query_texts)
results = collection1.query(
    query_texts=["neural networks"],
    where_document={"$contains": "machine learning"},
    n_results=5
)

# Query with combined filters (using query_texts)
results = collection1.query(
    query_texts=["AI research"],
    where={"category": {"$eq": "AI"}, "score": {"$gte": 90}},
    where_document={"$contains": "machine"},
    n_results=5
)

# Query with multiple vectors (batch query)
results = collection.query(
    query_embeddings=[[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]],
    n_results=2
)
# Returns dict with lists of lists, one list per query vector
for i in range(len(results["ids"])):
    print(f"Query {i}: {len(results['ids'][i])} results")

# Query with specific fields
results = collection.query(
    query_embeddings=[1.0, 2.0, 3.0],
    include=["documents", "metadatas", "embeddings"],
    n_results=3
)
```
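Because every field in the result is a list of lists (one inner list per query), it can be convenient to flatten the result into one row per hit before further processing. The flatten_results helper below is a hypothetical sketch that works on any dict of that shape. The sample dict is hand-written from the values shown on this page and is not produced by a live query.

```python
from typing import Any, Dict, List


def flatten_results(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn the nested query result into one flat row per (query, hit) pair."""
    optional_fields = (
        ("distances", "distance"),
        ("documents", "document"),
        ("metadatas", "metadata"),
    )
    rows = []
    for query_index, ids in enumerate(results["ids"]):
        for hit_index, doc_id in enumerate(ids):
            row = {"query": query_index, "rank": hit_index, "id": doc_id}
            # Optional fields are copied only when present in the result
            for field, key in optional_fields:
                values = results.get(field)
                if values:
                    row[key] = values[query_index][hit_index]
            rows.append(row)
    return rows


# Hand-written sample in the nested shape described on this page
sample = {
    "ids": [["vec1", "vec2"]],
    "distances": [[0.0, 0.025368153802923787]],
    "documents": [[None, None]],
    "metadatas": [[{}, {}]],
}

rows = flatten_results(sample)
for row in rows:
    print(row)
```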
Return parameters
| Parameter | Value type | Required | Description | Example value |
|---|---|---|---|---|
| ids | List[List[str]] | Yes | The IDs of the matched records, one inner list per query. | item1 |
| embeddings | List[List[List[float]]] | No | The vectors of the matched records. | [0.1, 0.2, 0.3] |
| documents | List[List[str]] | No | The document text of the matched records. | "Document text" |
| metadatas | List[List[Dict]] | No | The metadata of the matched records. | {"category": "AI"} |
| distances | List[List[float]] | No | The distance between each matched record and the query vector. | 0.025368153802923787 |
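A common way to consume ids and distances together is to keep only the closest hit per query and discard it when it is too far away. The sketch below assumes that a smaller distance means a closer match, which fits the 0.0 distance reported for an identical vector in the return example. The best_hits helper, the max_distance threshold, and the sample values are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


def best_hits(
    results: Dict[str, Any], max_distance: float
) -> List[Optional[Tuple[str, float]]]:
    """For each query, return (id, distance) of the closest hit, or None."""
    best: List[Optional[Tuple[str, float]]] = []
    for ids, distances in zip(results["ids"], results["distances"]):
        if len(ids) != len(distances):
            raise ValueError("ids and distances must be aligned")
        # Sort explicitly so the helper does not rely on result ordering
        pairs = sorted(zip(ids, distances), key=lambda pair: pair[1])
        if pairs and pairs[0][1] <= max_distance:
            best.append(pairs[0])
        else:
            best.append(None)
    return best


# Hand-written sample for two queries (not real output)
sample = {
    "ids": [["item1", "item2"], ["item3"]],
    "distances": [[0.2, 0.05], [0.9]],
}

top = best_hits(sample, max_distance=0.5)
print(top)  # [('item2', 0.05), None]
```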
Return example
```text
ID: vec1, Distance: 0.0
Document: None
Metadata: {}
ID: vec2, Distance: 0.025368153802923787
Document: None
Metadata: {}
Query 0: 4 results
Query 1: 4 results
Query 0: 2 results
Query 1: 2 results
```