Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 08:44:54 +08:00
commit eb309b7b59
133 changed files with 21979 additions and 0 deletions

View File

@@ -0,0 +1,16 @@
---
slug: /dml-overview-of-api
---
# DML operations
DML (Data Manipulation Language) operations allow you to insert, update, and delete data in a collection.
For DML operations, you can use the following APIs.
| API | Description | Documentation |
|---|---|---|
| `add()` | Inserts a new record into a collection. | [Documentation](200.add-data-of-api.md) |
| `update()` | Updates an existing record in a collection. |[Documentation](300.update-data-of-api.md)|
| `upsert()` | Inserts a new record or updates an existing record. |[Documentation](400.upsert-data-of-api.md)|
| `delete()` | Deletes a record from a collection.|[Documentation](500.delete-data-of-api.md)|

View File

@@ -0,0 +1,117 @@
---
slug: /add-data-of-api
---
# add - Insert data
The `add()` method inserts new data into a collection. If a record with the same ID already exists, an error is returned.
:::info
This API is only available when using a Client. For more information about the Client, see [Client](../50.client.md).
:::
## Prerequisites
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
* You have connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
* If you are using seekdb or OceanBase Database in client mode, make sure that the user to which you are connected has the `INSERT` privilege on the table to be operated. For more information about how to view the privileges of the current user, see [View user privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If you do not have the required privilege, contact the administrator to grant you the privilege. For more information about how to directly grant a privilege, see [Directly grant a privilege](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
## Request parameters
```python
add(
ids=ids,
embeddings=embeddings,
documents=documents,
metadatas=metadatas
)
```
|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`ids`|string or List[str]|Yes|The ID of the data to be inserted. You can specify a single ID or an array of IDs.|item1|
|`embeddings`|List[float] or List[List[float]]|No|The vector or vectors of the data to be inserted. If you specify this parameter, the value of `embedding_function` is ignored. If you do not specify this parameter, you must specify `documents`, and the `collection` must have an `embedding_function`.|[0.1, 0.2, 0.3]|
|`documents`|string or List[str]|No|The document or documents to be inserted. If you do not specify `vectors`, `documents` will be converted to vectors using the `embedding_function` of the `collection`.|"This is a document"|
|`metadatas`|dict or List[dict]|No|The metadata or metadata list of the data to be inserted. |`{"category": "AI", "score": 95}`|
:::info
The `embedding_function` associated with the collection is set during `create_collection()` or `get_collection()`. You cannot override it for each operation.
:::
## Request example
```python
import pyseekdb
from pyseekdb import DefaultEmbeddingFunction, HNSWConfiguration
# Create a client
client = pyseekdb.Client()
collection = client.create_collection(
name="my_collection",
configuration=HNSWConfiguration(dimension=3, distance='cosine'),
embedding_function=None
)
# Add single item
collection.add(
ids="item1",
embeddings=[0.1, 0.2, 0.3],
documents="This is a document",
metadatas={"category": "AI", "score": 95}
)
# Add multiple items
collection.add(
ids=["item4", "item2", "item3"],
embeddings=[
[0.1, 0.2, 0.4],
[0.4, 0.5, 0.6],
[0.7, 0.8, 0.9]
],
documents=[
"Document 1",
"Document 2",
"Document 3"
],
metadatas=[
{"category": "AI", "score": 95},
{"category": "ML", "score": 88},
{"category": "DL", "score": 92}
]
)
# Add with only embeddings
collection.add(
ids=["vec1", "vec2"],
embeddings=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
)
collection1 = client.create_collection(
name="my_collection1"
)
# Add with only documents - embeddings auto-generated by embedding_function
# Requires: collection must have embedding_function set
collection1.add(
ids=["doc1", "doc2"],
documents=["Text document 1", "Text document 2"],
metadatas=[{"tag": "A"}, {"tag": "B"}]
)
```
## Response parameters
None
## References
* [Update data](300.update-data-of-api.md)
* [Update or insert data](400.upsert-data-of-api.md)
* [Delete data](500.delete-data-of-api.md)

View File

@@ -0,0 +1,88 @@
---
slug: /update-data-of-api
---
# update - Update data
The `update()` method is used to update existing records in a collection. The record must exist, otherwise an error will be raised.
:::info
This API is only available when using a Client. For more information about the Client, see [Client](../50.client.md).
:::
## Prerequisites
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
* You have connected to the database. For more information about how to connect, see [Client](../50.client.md).
* If you are using seekdb in client mode or OceanBase Database, make sure that the user to which you have connected has the `UPDATE` privilege on the table to be operated. For more information about how to view the privileges of the current user, see [View User Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If you do not have this privilege, contact the administrator to grant it to you. For more information about how to directly grant privileges, see [Directly Grant Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
## Request parameters
```python
update(
ids=ids,
embeddings=embeddings,
documents=documents,
metadatas=metadatas
)
```
|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`ids`|string or List[str]|Yes|The ID to be modified. It can be a single ID or an array of IDs.|item1|
|`embeddings`|List[float] or List[List[float]]|No|The new vectors. If provided, they will be used directly (ignoring `embedding_function`). If not provided, you can provide `documents` to automatically generate vectors.|[[0.9, 0.8, 0.7], [0.6, 0.5, 0.4]]|
|`documents`|string or List[str]|No|The new documents. If `vectors` are not provided, `documents` will be converted to vectors using the collection's `embedding_function`.|"New document text"|
|`metadatas`|dict or List[dict]|No|The new metadata.|`{"category": "AI"}`|
:::info
You can update only the `metadatas`. The `embedding_function` used must be associated with the collection.
:::
## Request example
```python
import pyseekdb
# Create a client
client = pyseekdb.Client()
collection = client.get_collection("my_collection")
collection1 = client.get_collection("my_collection1")
# Update single item
collection.update(
ids="item1",
metadatas={"category": "AI", "score": 98} # Update metadata only
)
# Update multiple items
collection.update(
ids=["item1", "item2"],
embeddings=[[0.9, 0.8, 0.7], [0.6, 0.5, 0.4]], # Update embeddings
documents=["Updated document 1", "Updated document 2"] # Update documents
)
# Update with documents only - embeddings auto-generated by embedding_function
# Requires: collection must have embedding_function set
collection1.update(
ids="doc1",
documents="New document text", # Embeddings will be auto-generated
metadatas={"category": "AI"}
)
```
## Response parameters
None
## References
* [Insert data](200.add-data-of-api.md)
* [Update or insert data](400.upsert-data-of-api.md)
* [Delete data](500.delete-data-of-api.md)

View File

@@ -0,0 +1,93 @@
---
slug: /upsert-data-of-api
---
# upsert - Update or insert data
The `upsert()` method is used to insert new records or update existing records. If a record with the given ID already exists, it will be updated; otherwise, a new record will be inserted.
:::info
This API is only available when using a Client connection. For more information about the Client, see [Client](../50.client.md).
:::
## Prerequisites
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
* You have connected to the database. For more information about how to connect, see [Client](../50.client.md).
* If you are using seekdb or OceanBase Database in client mode, ensure that the connected user has the `INSERT` and `UPDATE` privileges on the target table. For more information about how to view the current user privileges, see [View user privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have the required privileges, contact the administrator to grant them. For more information about how to directly grant privileges, see [Directly grant privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
## Request parameters
```python
Upsert(
ids=ids,
embeddings=embeddings,
documents=documents,
metadatas=metadatas
)
```
|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`ids`|string or List[str]|Yes|The ID to be added or modified. It can be a single ID or an array of IDs.|item1|
|`embeddings`|List[float] or List[List[float]]|No|The vectors. If provided, they will be used directly (ignoring `embedding_function`). If not provided, you can provide `documents` to automatically generate vectors.|[0.1, 0.2, 0.3]|
|`documents`|string or List[str]|No|The documents. If `vectors` are not provided, `documents` will be converted to vectors using the collection's `embedding_function`.|"Document text"|
|`metadatas`|dict or List[dict]|No|The metadata. |`{"category": "AI"}`|
## Request example
```python
import pyseekdb
# Create a client
client = pyseekdb.Client()
collection = client.get_collection("my_collection")
collection1 = client.get_collection("my_collection1")
# Upsert single item (insert or update)
collection.upsert(
ids="item1",
embeddings=[0.1, 0.2, 0.3],
documents="Document text",
metadatas={"category": "AI", "score": 95}
)
# Upsert multiple items
collection.upsert(
ids=["item1", "item2", "item3"],
embeddings=[
[0.1, 0.2, 0.3],
[0.4, 0.5, 0.6],
[0.7, 0.8, 0.9]
],
documents=["Doc 1", "Doc 2", "Doc 3"],
metadatas=[
{"category": "AI"},
{"category": "ML"},
{"category": "DL"}
]
)
# Upsert with documents only - embeddings auto-generated by embedding_function
# Requires: collection must have embedding_function set
collection1.upsert(
ids=["item1", "item2"],
documents=["Document 1", "Document 2"],
metadatas=[{"category": "AI"}, {"category": "ML"}]
)
```
## Response parameters
None
## References
* [Insert data](200.add-data-of-api.md)
* [Update data](300.update-data-of-api.md)
* [Delete data](400.upsert-data-of-api.md)

View File

@@ -0,0 +1,87 @@
---
slug: /delete-data-of-api
---
# delete - Delete data
`delete()` is used to delete records from a collection. You can delete records by ID, metadata filter, or document filter.
:::info
This API is only available when you are connected to the database using a Client. For more information about the Client, see [Client](../50.client.md).
:::
## Prerequisites
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
* You are connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
* If you are using seekdb or OceanBase Database in client mode, make sure that the user to whom you are connected has the `DELETE` privilege on the table to be operated. For more information about how to view the privileges of the current user, see [View user privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If you do not have this privilege, contact the administrator to grant it to you. For more information about how to directly grant privileges, see [Directly grant privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
## Request parameters
```python
Upsert(
ids=ids,
embeddings=embeddings,
documents=documents,
metadatas=metadatas
)
```
|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`ids`|string or List[str]|Optional|The ID of the record to be deleted. You can specify a single ID or an array of IDs.|item1|
|`where`|dict|Optional|The metadata filter.|`{"category": {"$eq": "AI"}}`|
|`where_document`|dict|Optional|The document filter.|`{"$contains": "obsolete"}`|
:::info
At least one of the `id`, `where`, or `where_document` parameters must be specified.
:::
## Request examples
```python
import pyseekdb
# Create a client
client = pyseekdb.Client()
collection = client.get_collection("my_collection")
# Delete by IDs
collection.delete(ids=["item1", "item2", "item3"])
# Delete by single ID
collection.delete(ids="item1")
# Delete by metadata filter
collection.delete(where={"category": {"$eq": "AI"}})
# Delete by comparison operator
collection.delete(where={"score": {"$lt": 50}})
# Delete by document filter
collection.delete(where_document={"$contains": "obsolete"})
# Delete with combined filters
collection.delete(
where={"category": {"$eq": "AI"}},
where_document={"$contains": "deprecated"}
)
```
## Response parameters
None
## References
* [Insert data](200.add-data-of-api.md)
* [Update data](300.update-data-of-api.md)
* [Update or insert data](400.upsert-data-of-api.md)