Initial commit
@@ -0,0 +1,66 @@
---
slug: /api-overview
---

# API Reference

You can work with seekdb through the APIs described in this topic.

## APIs

The following APIs are supported.

### Database

:::info
You can use this API only when you connect to seekdb by using the `AdminClient`. For more information about the `AdminClient`, see [Admin Client](../50.apis/100.admin-client.md).
:::

| API | Description | Documentation |
|---|---|---|
| `create_database()` | Creates a database. | [Documentation](110.database/200.create-database-of-api.md) |
| `get_database()` | Retrieves a specified database. | [Documentation](110.database/300.get-database-of-api.md) |
| `list_databases()` | Retrieves a list of databases in an instance. | [Documentation](110.database/400.list-database-of-api.md) |
| `delete_database()` | Deletes a specified database. | [Documentation](110.database/500.delete-database-of-api.md) |

### Collection

:::info
You can use this API only when you connect to seekdb by using the `Client`. For more information about the `Client`, see [Client](../50.apis/50.client.md).
:::

| API | Description | Documentation |
|---|---|---|
| `create_collection()` | Creates a collection. | [Documentation](200.collection/100.create-collection-of-api.md) |
| `get_collection()` | Retrieves a specified collection. | [Documentation](200.collection/200.get-collection-of-api.md) |
| `get_or_create_collection()` | Gets a collection, or creates it first if it does not exist in the database. | [Documentation](200.collection/250.get-or-create-collection-of-api.md) |
| `list_collections()` | Retrieves the list of collections in a database. | [Documentation](200.collection/300.list-collection-of-api.md) |
| `count_collection()` | Counts the number of collections in a database. | [Documentation](200.collection/350.count-collection-of-api.md) |
| `delete_collection()` | Deletes a specified collection. | [Documentation](200.collection/400.delete-collection-of-api.md) |

### DML

:::info
You can use this API only when you connect to seekdb by using the `Client`. For more information about the `Client`, see [Client](../50.apis/50.client.md).
:::

| API | Description | Documentation |
|---|---|---|
| `add()` | Inserts a new record into a collection. | [Documentation](300.dml/200.add-data-of-api.md) |
| `update()` | Updates an existing record in a collection. | [Documentation](300.dml/300.update-data-of-api.md) |
| `upsert()` | Inserts a new record or updates an existing record. | [Documentation](300.dml/400.upsert-data-of-api.md) |
| `delete()` | Deletes a record from a collection. | [Documentation](300.dml/500.delete-data-of-api.md) |

### DQL

:::info
You can use this API only when you connect to seekdb by using the `Client`. For more information about the `Client`, see [Client](../50.apis/50.client.md).
:::

| API | Description | Documentation |
|---|---|---|
| `query()` | Performs vector similarity search. | [Documentation](400.dql/200.query-interfaces-of-api.md) |
| `get()` | Retrieves data from a collection by ID, document, or metadata (non-vector query). | [Documentation](400.dql/300.get-interfaces-of-api.md) |
| `hybrid_search()` | Combines full-text search and vector similarity search, and ranks the results. | [Documentation](400.dql/400.hybrid-search-of-api.md) |
@@ -0,0 +1,93 @@
---
slug: /admin-client
---

# Admin Client

`AdminClient` provides database management operations. It connects to the database in the same way as `Client`, but supports only database management operations.

## Connect to an embedded seekdb instance

Connect to a local embedded seekdb instance by using `AdminClient`.

```python
import pyseekdb

# Embedded mode - Database management
admin = pyseekdb.AdminClient(path="./seekdb")
```

Parameter description:

| Parameter | Value Type | Required | Description | Example Value |
| --- | --- | --- | --- | --- |
| `path` | string | No | The path of the seekdb data directory. seekdb stores database files in this directory and loads them when it starts. | `./seekdb` |

## Connect to a remote server

Connect to a remote server by using `AdminClient`. The remote server can be a server mode seekdb instance or an OceanBase Database instance.

:::tip

Before you connect to a remote server, make sure that you have deployed a server mode seekdb instance or an OceanBase Database instance.<br/>For information about how to deploy a server mode seekdb instance, see [Overview](../../../400.guides/400.deploy/50.deploy-overview.md).<br/>For information about how to deploy an OceanBase Database instance, see [Overview](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003976427).

:::

Example: Connect to a server mode seekdb instance

```python
import pyseekdb

# Remote server mode - Database management
admin = pyseekdb.AdminClient(
    host="127.0.0.1",
    port=2881,
    user="root",
    password=""  # If empty, the password is read from the SEEKDB_PASSWORD environment variable
)
```

Parameter description:

| Parameter | Value Type | Required | Description | Example Value |
| --- | --- | --- | --- | --- |
| `host` | string | Yes | The IP address of the server where the instance resides. | `127.0.0.1` |
| `port` | int | Yes | The port of the instance. The default value is 2881. | `2881` |
| `user` | string | Yes | The username. The default value is root. | `root` |
| `password` | string | Yes | The password corresponding to the username. If you do not specify `password` or specify an empty string, the system retrieves the password from the `SEEKDB_PASSWORD` environment variable. | |

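
The password fallback described in the table can be pictured with a small sketch. It is not pyseekdb's implementation: `resolve_password` is a hypothetical helper, and the only behavior taken from this page is that a missing or empty `password` falls back to the `SEEKDB_PASSWORD` environment variable. Returning an empty string when the variable is unset is an assumption.

```python
import os


def resolve_password(password=None, env=None):
    """Return the explicit password, or fall back to SEEKDB_PASSWORD.

    Hypothetical helper for illustration; pyseekdb may resolve the
    password differently internally.
    """
    env = os.environ if env is None else env
    if password:  # a non-empty explicit value always wins
        return password
    # Assumption: an unset variable yields an empty password.
    return env.get("SEEKDB_PASSWORD", "")


# An explicit value is used as-is; an empty value triggers the fallback.
explicit = resolve_password("secret", env={"SEEKDB_PASSWORD": "from-env"})
fallback = resolve_password("", env={"SEEKDB_PASSWORD": "from-env"})
```

Keeping the password in the environment variable rather than in source code avoids committing credentials together with the connection settings.
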
Example: Connect to an OceanBase Database instance

```python
import pyseekdb

# Remote server mode - Database management
admin = pyseekdb.AdminClient(
    host="127.0.0.1",
    port=2881,
    tenant="test",
    user="root",
    password=""  # If empty, the password is read from the SEEKDB_PASSWORD environment variable
)
```

Parameter description:

| Parameter | Value Type | Required | Description | Example Value |
| --- | --- | --- | --- | --- |
| `host` | string | Yes | The IP address of the server where the database resides. | `127.0.0.1` |
| `port` | int | Yes | The port of the OceanBase Database instance. The default value is 2881. | `2881` |
| `tenant` | string | No | The name of the tenant. This parameter is not required for a server mode seekdb instance, but is required for an OceanBase Database instance. The default value is sys. | `test` |
| `user` | string | Yes | The username corresponding to the tenant. The default value is root. | `root` |
| `password` | string | Yes | The password corresponding to the username. If you do not specify `password` or specify an empty string, the system retrieves the password from the `SEEKDB_PASSWORD` environment variable. | |

## APIs supported when you use AdminClient to connect to a database

The following APIs are supported when you use `AdminClient` to connect to a database.

| API | Description | Documentation Link |
| --- | --- | --- |
| `create_database` | Creates a new database. | [Documentation](110.database/200.create-database-of-api.md) |
| `get_database` | Queries a specified database. | [Documentation](110.database/300.get-database-of-api.md) |
| `list_databases` | Lists all databases. | [Documentation](110.database/400.list-database-of-api.md) |
| `delete_database` | Deletes a specified database. | [Documentation](110.database/500.delete-database-of-api.md) |
@@ -0,0 +1,16 @@
---
slug: /database-overview-of-api
---

# Database Management

A database contains tables, indexes, and metadata of database objects. You can create, query, and delete databases as needed.

The following APIs are available for database operations.

| API | Description | Documentation |
|---|---|---|
| `create_database()` | Creates a database. | [Documentation](200.create-database-of-api.md) |
| `get_database()` | Gets a specified database. | [Documentation](300.get-database-of-api.md) |
| `list_databases()` | Gets the list of databases in the instance. | [Documentation](400.list-database-of-api.md) |
| `delete_database()` | Deletes a specified database. | [Documentation](500.delete-database-of-api.md) |
@@ -0,0 +1,76 @@
---
slug: /create-database-of-api
---

# create_database - Create a database

The `create_database()` method creates a new database.

:::info
* You can use this API only when you connect to the database by using `AdminClient`. For more information about `AdminClient`, see [Admin Client](../100.admin-client.md).

* Currently, `create_database` does not let you specify database properties. The database is created with the default property values. To create a database with specific properties, use SQL instead. For more information about how to create a database by using SQL, see [Create a database](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003977077).
:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You are connected to the database. For more information about how to connect to the database, see [Admin Client](../100.admin-client.md).

* If you are using server mode of seekdb or OceanBase Database, make sure that the connected user has the `CREATE` privilege. For more information about how to check the privileges of the current user, see [View user privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have this privilege, contact the administrator to grant it. For more information about how to directly grant privileges, see [Directly grant privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).

## Limitations

* In a seekdb instance or OceanBase Database, the name of each database must be globally unique.

* The maximum length of a database name is 128 characters.

* The name can contain only uppercase and lowercase letters, digits, underscores, dollar signs, and Chinese characters.

* Avoid using reserved keywords as database names.

  For more information about reserved keywords, see [Reserved keywords](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003976774).
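
The length and character rules above can be checked on the client side before `create_database()` is called. The following is a minimal sketch, not part of pyseekdb: `is_valid_database_name` is a hypothetical helper, "Chinese characters" is approximated by the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), the length is counted in characters, and reserved keywords and uniqueness are not checked. The server remains the authority on whether a name is accepted.

```python
import re

MAX_DB_NAME_LENGTH = 128

# Letters, digits, underscores, dollar signs, and (as an assumption)
# CJK Unified Ideographs standing in for "Chinese characters".
_DB_NAME_PATTERN = re.compile(r"[A-Za-z0-9_$\u4e00-\u9fff]+")


def is_valid_database_name(name: str) -> bool:
    """Check the documented length and character limitations only."""
    if not name or len(name) > MAX_DB_NAME_LENGTH:
        return False
    return _DB_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["my_database", "app$db1", "my-database", "a" * 129]:
    print(candidate[:20], is_valid_database_name(candidate))
```

A pre-check like this only catches obvious mistakes early; a name that passes can still be rejected, for example because it already exists.
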

## Recommendations

* We recommend that you give the database a meaningful name that reflects its purpose and content. For example, you can use `Application Identifier_Sub-application name (optional)_db` as the database name.

* We recommend that you create the database and related users as the root user and grant only the necessary privileges, to keep the database secure and controllable.

* You can create a database whose name consists only of digits by enclosing the name in backticks (`` ` ``), but this is not recommended. Such names have no clear meaning, and every query must enclose them in backticks (`` ` ``), which adds unnecessary complexity and confusion.
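
The recommended naming pattern can be applied consistently with a small helper. This is only a sketch: `build_database_name` is a hypothetical function, and the identifiers `crm` and `billing` are made-up examples.

```python
def build_database_name(app_id, sub_app=None):
    """Join the parts as <application identifier>_<sub-application>_db.

    The sub-application part is optional and is skipped when not given.
    """
    parts = [app_id]
    if sub_app:
        parts.append(sub_app)
    parts.append("db")
    return "_".join(parts)


print(build_database_name("crm"))             # crm_db
print(build_database_name("crm", "billing"))  # crm_billing_db
```
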

## Request parameters

```python
create_database(name, tenant=DEFAULT_TENANT)
```

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the database to be created. |`my_database`|
|`tenant`|string|No<ul><li>When using embedded seekdb or server mode of seekdb, this parameter is not required.</li><li>When using OceanBase Database, this parameter is required.</li></ul>|The tenant to which the database belongs. |`test_tenant`|

## Request example

```python
import pyseekdb

# Embedded mode
admin = pyseekdb.AdminClient(path="./seekdb")

# Create database
admin.create_database("my_database")
```

## Response parameters

None

## References

* [Get a specific database](300.get-database-of-api.md)
* [Delete a database](500.delete-database-of-api.md)
* [List databases](400.list-database-of-api.md)
@@ -0,0 +1,65 @@
---
slug: /get-database-of-api
---

# get_database - Get the specified database

The `get_database()` method returns information about a specified database.

:::info

You can use this method only when you connect to the database by using the `AdminClient`. For more information about the `AdminClient`, see [Admin Client](../100.admin-client.md).

:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You have connected to the database. For more information about how to connect to the database, see [Admin Client](../100.admin-client.md).

## Request parameters

```python
get_database(name, tenant=DEFAULT_TENANT)
```

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the database to be queried. |`my_database`|
|`tenant`|string|No<ul><li>When you use embedded seekdb or server mode seekdb, you do not need to specify this parameter.</li><li>When you use OceanBase Database, you must specify this parameter.</li></ul>|The tenant to which the database belongs. |`test_tenant`|

## Request example

```python
import pyseekdb

# Embedded mode
admin = pyseekdb.AdminClient(path="./seekdb")

# Get database
db = admin.get_database("my_database")
print(f"Database: {db.name}, Charset: {db.charset}, collation:{db.collation}, metadata:{db.metadata}")
```

## Response parameters

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the queried database. |`my_database`|
|`tenant`|string|No<br/>When you use embedded seekdb or server mode seekdb, this field is not returned. |The tenant to which the queried database belongs. |`test_tenant`|
|`charset`|string|No|The character set used by the queried database. |`utf8mb4`|
|`collation`|string|No|The collation used by the queried database. |`utf8mb4_general_ci`|
|`metadata`|dict|No|Reserved field. |`{}`|

## Response example

```python
Database: my_database, Charset: utf8mb4, collation:utf8mb4_general_ci, metadata:{}
```

## References

* [Create a database](200.create-database-of-api.md)
* [Delete a database](500.delete-database-of-api.md)
* [Get the database list](400.list-database-of-api.md)
@@ -0,0 +1,70 @@
---
slug: /list-database-of-api
---

# list_databases - Get the database list

The `list_databases()` method retrieves the list of databases in the instance.

:::info

This API is available only when you connect by using the `AdminClient`. For more information about the `AdminClient`, see [Admin Client](../100.admin-client.md).

:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You have connected to the database. For more information about how to connect to the database, see [Admin Client](../100.admin-client.md).

## Request parameters

```python
list_databases(limit=None, offset=None, tenant=DEFAULT_TENANT)
```

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`limit`|int|No|The maximum number of databases to return. |`2`|
|`offset`|int|No|The number of databases to skip. |`3`|
|`tenant`|string|No<ul><li>When using embedded seekdb or server mode seekdb, this parameter is not required.</li><li>When using OceanBase Database, this parameter is required. The default value is `sys`.</li></ul>|The tenant to which the queried database belongs. |`test_tenant`|
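
How `limit` and `offset` combine can be pictured with plain list slicing. The sketch below is not the pyseekdb implementation: `paginate` is a hypothetical helper, the database names are made up, and it assumes SQL-style semantics in which `offset` entries are skipped first and at most `limit` entries are then returned.

```python
def paginate(items, limit=None, offset=None):
    """Skip `offset` items, then return at most `limit` items."""
    start = offset or 0
    end = None if limit is None else start + limit
    return list(items)[start:end]


names = ["db1", "db2", "db3", "db4", "db5", "db6"]

# Skip the first 3 names, then take at most 2.
print(paginate(names, limit=2, offset=3))  # ['db4', 'db5']
```

With both parameters left at `None`, the sketch returns every item, which mirrors calling `list_databases()` without arguments.
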

## Request example

```python
import pyseekdb

# Embedded mode
admin = pyseekdb.AdminClient(path="./seekdb")

# List at most 2 databases, skipping the first 3
databases = admin.list_databases(limit=2, offset=3)
for db in databases:
    print(f"Database: {db.name}, Charset: {db.charset}, collation:{db.collation}, metadata:{db.metadata}")
```

## Response parameters

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the queried database. |`my_database`|
|`tenant`|string|No<br/>When using embedded seekdb or server mode seekdb, this field is not returned. |The tenant to which the queried database belongs. |`test_tenant`|
|`charset`|string|No|The character set of the queried database. |`utf8mb4`|
|`collation`|string|No|The collation of the queried database. |`utf8mb4_general_ci`|
|`metadata`|dict|No|Reserved field. No data is returned. |`{}`|

## Response example

```python
Database: test, Charset: utf8mb4, collation:utf8mb4_general_ci, metadata:{}
Database: my_database, Charset: utf8mb4, collation:utf8mb4_general_ci, metadata:{}
```

## References

* [Create a database](200.create-database-of-api.md)
* [Delete a database](500.delete-database-of-api.md)
* [Get a specific database](300.get-database-of-api.md)
@@ -0,0 +1,54 @@
---
slug: /delete-database-of-api
---

# delete_database - Delete a database

The `delete_database()` method deletes a specified database.

:::info

This method is available only when you connect by using the `AdminClient`. For more information about the `AdminClient`, see [Admin Client](../100.admin-client.md).

:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You have connected to the database. For more information about how to connect to the database, see [Admin Client](../100.admin-client.md).

* If you are using server mode of seekdb or OceanBase Database, ensure that the user has the `DROP` privilege. For more information about how to view the privileges of the current user, see [View User Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have the privilege, contact the administrator to grant the privilege. For more information about how to directly grant privileges, see [Directly Grant Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).

## Request parameters

```python
delete_database(name, tenant=DEFAULT_TENANT)
```

|Parameter|Type|Required|Description|Example Value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the database to be deleted. |`my_database`|
|`tenant`|string|No<ul><li>If you are using embedded seekdb or server mode of seekdb, you do not need to specify this parameter.</li><li>If you are using OceanBase Database, this parameter is required. The default value is `sys`.</li></ul>|The tenant to which the database belongs. |`test_tenant`|

## Request example

```python
import pyseekdb

# Embedded mode
admin = pyseekdb.AdminClient(path="./seekdb")

# Delete database
admin.delete_database("my_database")
```

## Response parameters

None

## References

* [Create a database](200.create-database-of-api.md)
* [Get a specific database](300.get-database-of-api.md)
* [Obtain a database list](400.list-database-of-api.md)
@@ -0,0 +1,93 @@
---
slug: /create-collection-of-api
---

# create_collection - Create a collection

The `create_collection()` method creates a new collection. A collection corresponds to a table in the database.

:::info

This API is available only when you connect to the database by using `Client`. For more information about `Client`, see [Client](../50.client.md).

:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You are connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).

* If you are using seekdb in server mode or OceanBase Database, make sure that the user has the `CREATE` privilege. For more information about how to view the privileges of the current user, see [View user privileges](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001971368). If the user does not have the privilege, contact the administrator to grant it. For more information about how to directly grant privileges, see [Directly grant privileges](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974754).

## Define the table name

When creating a table, you must first define its name. The following requirements apply when defining the table name:

* In seekdb, each table name must be unique within the database.

* The table name cannot exceed 64 characters.

* We recommend that you give the table a meaningful name instead of using generic names such as t1 or table1. For more information about table naming conventions, see [Table naming conventions](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003977289).

## Request parameters

```python
create_collection(name=name, configuration=configuration, embedding_function=embedding_function)
```

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the collection to be created. |`my_collection`|
|`configuration`|HNSWConfiguration|No|The index configuration, which specifies the dimension and distance metric. If not provided, the default values `dimension=384` and `distance='cosine'` are used. If set to `None`, the dimension is calculated from the `embedding_function` value. |`HNSWConfiguration(dimension=384, distance='cosine')`|
|`embedding_function`|EmbeddingFunction|No|The function that converts data into vectors. If not provided, `DefaultEmbeddingFunction()` (384 dimensions) is used. If set to `None`, the collection has no embedding function and you must provide vectors yourself. If provided, its dimension must match `configuration.dimension`. |`DefaultEmbeddingFunction()`|

:::info

When you provide `embedding_function`, the system automatically determines the vector dimension by calling this function. If you also provide `configuration.dimension`, it must match the dimension of `embedding_function`. Otherwise, a `ValueError` is raised.

:::
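
The dimension check described in the note can be sketched in plain Python. This is an illustration of the rule, not pyseekdb's internal code: `resolve_dimension` and `toy_embed` are hypothetical names, and the sketch assumes that an embedding function maps a list of texts to a list of vectors.

```python
from typing import Callable, List, Optional, Sequence

Embedder = Callable[[Sequence[str]], List[List[float]]]


def resolve_dimension(embed: Embedder, configured_dimension: Optional[int] = None) -> int:
    """Probe the embedding function once and compare with the configured dimension."""
    actual = len(embed(["dimension probe"])[0])
    if configured_dimension is not None and configured_dimension != actual:
        raise ValueError(
            f"configuration.dimension={configured_dimension} does not match "
            f"the embedding function dimension {actual}"
        )
    return actual


def toy_embed(texts: Sequence[str]) -> List[List[float]]:
    """Hypothetical 3-dimensional embedding, for illustration only."""
    return [[float(len(text)), 1.0, 0.0] for text in texts]


print(resolve_dimension(toy_embed))     # 3
print(resolve_dimension(toy_embed, 3))  # 3
```

Calling `resolve_dimension(toy_embed, 384)` raises `ValueError`, which corresponds to passing a `configuration.dimension` that disagrees with the embedding function.
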

## Request example

```python
import pyseekdb
from pyseekdb import DefaultEmbeddingFunction, HNSWConfiguration

# Create a client
client = pyseekdb.Client()

# Create a collection with default embedding function (auto-calculates dimension)
collection = client.create_collection(
    name="my_collection"
)

# Create a collection with custom embedding function
ef = UserDefinedEmbeddingFunction()  # Placeholder: define your own embedding function (see section 6)
config = HNSWConfiguration(dimension=384, distance='cosine')  # Must match EF dimension
collection = client.create_collection(
    name="my_collection2",
    configuration=config,
    embedding_function=ef
)

# Create a collection without embedding function (vectors must be provided manually)
collection = client.create_collection(
    name="my_collection3",
    configuration=HNSWConfiguration(dimension=384, distance='cosine'),
    embedding_function=None  # Explicitly disable embedding function
)
```

## Response parameters

None

## References

* [Query a collection](200.get-collection-of-api.md)
* [Create or query a collection](250.get-or-create-collection-of-api.md)
* [Get a collection list](300.list-collection-of-api.md)
* [Count the number of collections](350.count-collection-of-api.md)
* [Delete a collection](400.delete-collection-of-api.md)
@@ -0,0 +1,89 @@
---
slug: /get-collection-of-api
---

# get_collection - Get a collection

The `get_collection()` method retrieves a specified collection.

:::info

This API is available only when you connect by using `Client`. For more information about `Client`, see [Client](../50.client.md).

:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You have connected to the database. For more information about how to connect, see [Client](../50.client.md).

* The collection you want to retrieve exists. If the collection does not exist, an error is returned.

## Request parameters

```python
client.get_collection(name, configuration=configuration, embedding_function=embedding_function)
```

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the collection to retrieve. |`my_collection`|
|`configuration`|HNSWConfiguration|No|The index configuration, which specifies the dimension and distance metric. If not provided, the default values `dimension=384` and `distance='cosine'` are used. If set to `None`, the dimension is calculated from the `embedding_function` value. |`HNSWConfiguration(dimension=384, distance='cosine')`|
|`embedding_function`|EmbeddingFunction|No|The function used to convert text to vectors. If not provided, `DefaultEmbeddingFunction()` (384 dimensions) is used. If set to `None`, the collection has no embedding function. If provided, its dimension must match `configuration.dimension`. |`DefaultEmbeddingFunction()`|

:::info

When vectors are not provided for documents or texts, the embedding function set here is used for all operations on this collection, including `add`, `upsert`, `update`, `query`, and `hybrid_search`.

:::

## Request example

```python
import pyseekdb

# Create a client
client = pyseekdb.Client()

# Get an existing collection (uses default embedding function if collection doesn't have one)
collection = client.get_collection("my_collection")
print(f"Collection: {collection.name}, dimension: {collection.dimension}, embedding_function:{collection.embedding_function}, distance:{collection.distance}, metadata:{collection.metadata}")

# Get collection with specific embedding function
ef = UserDefinedEmbeddingFunction()  # Placeholder: define your own embedding function (see section 6)
collection = client.get_collection("my_collection", embedding_function=ef)
print(f"Collection: {collection.name}, dimension: {collection.dimension}, embedding_function:{collection.embedding_function}, distance:{collection.distance}, metadata:{collection.metadata}")

# Get collection without embedding function
collection = client.get_collection("my_collection", embedding_function=None)

# Check if collection exists
if client.has_collection("my_collection"):
    collection = client.get_collection("my_collection")
    print(f"Collection: {collection.name}, dimension: {collection.dimension}, embedding_function:{collection.embedding_function}, distance:{collection.distance}, metadata:{collection.metadata}")
```

## Response parameters

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the retrieved collection. |`my_collection`|
|`dimension`|int|No|The vector dimension of the collection. |`384`|
|`embedding_function`|EmbeddingFunction|No|The embedding function associated with the collection. |`DefaultEmbeddingFunction(model_name='all-MiniLM-L6-v2')`|
|`distance`|string|No|The distance metric of the collection. |`cosine`|
|`metadata`|dict|No|Reserved field. Currently, no data is returned. |`{}`|

## Response example

```python
Collection: my_collection, dimension: 384, embedding_function:DefaultEmbeddingFunction(model_name='all-MiniLM-L6-v2'), distance:cosine, metadata:{}
Collection: my_collection1, dimension: 384, embedding_function:DefaultEmbeddingFunction(model_name='all-MiniLM-L6-v2'), distance:cosine, metadata:{}
```

## References

* [Create a collection](100.create-collection-of-api.md)
* [Create or query a collection](250.get-or-create-collection-of-api.md)
* [Get a list of collections](300.list-collection-of-api.md)
* [Count the number of collections](350.count-collection-of-api.md)
* [Delete a collection](400.delete-collection-of-api.md)
@@ -0,0 +1,79 @@
---
slug: /get-or-create-collection-of-api
---

# get_or_create_collection - Create or query a collection

The `get_or_create_collection()` method creates or queries a collection. If the collection does not exist in the database, it is created. If it already exists, the existing collection is returned.

:::info

This API is available only when you connect by using `Client`. For more information about `Client`, see [Client](../50.client.md).

:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You have connected to the database. For more information about how to connect, see [Client](../50.client.md).
|
||||
|
||||
* If you are using seekdb in server mode or OceanBase Database, ensure that the connected user has the `CREATE` privilege. For more information about how to check the privileges of the current user, see [Check User Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have this privilege, contact the administrator to grant it. For more information about how to directly grant privileges, see [Directly Grant Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
|
||||
|
||||
## Define a collection name

When creating a collection, you need to define its name. The following requirements must be met:

* In seekdb, each collection name must be unique within the database.
* The collection name must be no longer than 64 characters.
* Use meaningful names for collections instead of generic names like t1 or table1. For more information about naming conventions, see [Table Naming Conventions](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003977289).
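
The naming requirements above (at most 64 characters, unique within the database) can be checked before calling the API. The following is a minimal sketch in plain Python. `check_collection_name` and `MAX_NAME_LENGTH` are hypothetical names used only for illustration and are not part of pyseekdb.

```python
MAX_NAME_LENGTH = 64  # limit stated in the requirements above


def check_collection_name(name, existing_names):
    """Hypothetical helper: validate the length and report whether the name is taken.

    Returns True if `name` already exists (get_or_create_collection would then
    return the existing collection) and False if it is free.
    """
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(
            f"name is {len(name)} characters long; the limit is {MAX_NAME_LENGTH}"
        )
    return name in existing_names


existing = {"my_collection"}
print(check_collection_name("my_collection", existing))   # name already exists
print(check_collection_name("my_collection4", existing))  # name is free
```

In practice, the set of existing names could be built from the `name` attribute of the objects returned by `list_collections()`.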
|
||||
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
get_or_create_collection(name=name, configuration=configuration, embedding_function=embedding_function)
|
||||
```
|
||||
|
||||
|Parameter|Value Type|Required|Description|Example Value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the collection to be created or queried.|my_collection|
|`configuration`|HNSWConfiguration|No|The index configuration, which contains the dimension and the distance metric. If not provided, the default value `dimension=384, distance='cosine'` is used. If set to `None`, the dimension is calculated from the `embedding_function` value.|HNSWConfiguration(dimension=384, distance='cosine')|
|`embedding_function`|EmbeddingFunction|No|The function used to convert documents to vectors. If not provided, `DefaultEmbeddingFunction()` (384 dimensions) is used. If set to `None`, the collection has no embedding functionality. If an embedding function is provided, its dimension is calculated automatically and must match `configuration.dimension`.|DefaultEmbeddingFunction()|
|
||||
|
||||
:::info
|
||||
|
||||
When `embedding_function` is provided, the system automatically calculates the vector dimension by calling the function. If `configuration.dimension` is also provided, it must match the dimension of `embedding_function`; otherwise, a `ValueError` is raised.
|
||||
|
||||
:::
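
The dimension rule in the note above can be sketched in plain Python. This is an illustration of the described behavior, not pyseekdb's implementation. `resolve_dimension` and `toy_embedding` are hypothetical, and the sketch assumes that an embedding function is a callable that takes a list of texts and returns one vector per text.

```python
def toy_embedding(texts):
    """Hypothetical 3-dimensional embedding function, for illustration only."""
    return [[float(len(text)), 0.0, 1.0] for text in texts]


def resolve_dimension(embedding_function=None, configured_dimension=None):
    """Return the vector dimension a collection would end up with."""
    if embedding_function is None:
        # No embedding functionality: only the configuration can supply a dimension.
        return configured_dimension
    # Calculate the dimension by calling the function on a probe text.
    actual = len(embedding_function(["probe"])[0])
    if configured_dimension is not None and configured_dimension != actual:
        raise ValueError(
            f"configuration.dimension={configured_dimension} does not match "
            f"the embedding function dimension {actual}"
        )
    return actual


print(resolve_dimension(toy_embedding))       # dimension taken from the function
print(resolve_dimension(toy_embedding, 3))    # matching configuration is accepted
try:
    resolve_dimension(toy_embedding, 384)     # mismatch
except ValueError as exc:
    print(f"ValueError: {exc}")
```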
|
||||
|
||||
## Request example
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
from pyseekdb import DefaultEmbeddingFunction, HNSWConfiguration
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
# Get or create collection (creates if doesn't exist)
|
||||
collection = client.get_or_create_collection(
|
||||
name="my_collection4",
|
||||
configuration=HNSWConfiguration(dimension=384, distance='cosine'),
|
||||
embedding_function=DefaultEmbeddingFunction()
|
||||
)
|
||||
```
|
||||
|
||||
## Response parameters
|
||||
|
||||
None
|
||||
|
||||
## References
|
||||
|
||||
* [Create a collection](100.create-collection-of-api.md)
|
||||
* [Query a collection](200.get-collection-of-api.md)
|
||||
* [Get a list of collections](300.list-collection-of-api.md)
|
||||
* [Count collections](350.count-collection-of-api.md)
|
||||
* [Delete a collection](400.delete-collection-of-api.md)
|
||||
@@ -0,0 +1,65 @@
|
||||
---
|
||||
slug: /list-collection-of-api
|
||||
---
|
||||
|
||||
|
||||
# list_collections - Get a list of collections
|
||||
|
||||
The `list_collections()` API is used to obtain all collections.
|
||||
|
||||
:::info
|
||||
|
||||
This API is supported only when you use a Client. For more information about the Client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You have connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
client.list_collections()
|
||||
```
|
||||
|
||||
## Request example
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
# List all collections
|
||||
collections = client.list_collections()
|
||||
for coll in collections:
|
||||
print(f"Collection: {coll.name}, Dimension: {coll.dimension}, embedding_function: {coll.embedding_function}, distance: {coll.distance}, metadata: {coll.metadata}")
|
||||
```
|
||||
|
||||
## Response parameters
|
||||
|
||||
|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`name`|string|Yes|The name of the queried collection.|my_collection|
|`dimension`|int|No|The vector dimension of the collection.|384|
|`embedding_function`|EmbeddingFunction|No|The embedding function associated with the collection.|DefaultEmbeddingFunction(model_name='all-MiniLM-L6-v2')|
|`distance`|string|No|The distance metric of the collection.|cosine|
|`metadata`|dict|No|Reserved field. No data is returned.|{}|
|
||||
|
||||
## Response example
|
||||
|
||||
```python
Collection: my_collection, Dimension: 384, embedding_function: DefaultEmbeddingFunction(model_name='all-MiniLM-L6-v2'), distance: cosine, metadata: {}
```
|
||||
|
||||
## References
|
||||
|
||||
* [Create a collection](100.create-collection-of-api.md)
|
||||
* [Query a collection](200.get-collection-of-api.md)
|
||||
* [Create or query a collection](250.get-or-create-collection-of-api.md)
|
||||
* [Count collections](350.count-collection-of-api.md)
|
||||
* [Delete a collection](400.delete-collection-of-api.md)
|
||||
@@ -0,0 +1,56 @@
|
||||
---
|
||||
slug: /count-collection-of-api
|
||||
---
|
||||
|
||||
# count_collection - Count the number of collections
|
||||
|
||||
The `count_collection()` method is used to count the number of collections in the database.
|
||||
|
||||
:::info
|
||||
|
||||
This API is only available when you are connected to the database using a Client. For more information about the Client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You are connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
client.count_collection()
|
||||
```
|
||||
|
||||
## Request example
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
# Count collections in database
|
||||
collection_count = client.count_collection()
|
||||
print(f"Database has {collection_count} collections")
|
||||
```
|
||||
|
||||
## Return parameters
|
||||
|
||||
None
|
||||
|
||||
## Return example
|
||||
|
||||
```python
|
||||
Database has 1 collections
|
||||
```
|
||||
|
||||
## Related operations
|
||||
|
||||
* [Create a collection](100.create-collection-of-api.md)
|
||||
* [Query a collection](200.get-collection-of-api.md)
|
||||
* [Create or query a collection](250.get-or-create-collection-of-api.md)
|
||||
* [Get a collection list](300.list-collection-of-api.md)
|
||||
* [Delete a collection](400.delete-collection-of-api.md)
|
||||
@@ -0,0 +1,55 @@
|
||||
---
|
||||
slug: /delete-collection-of-api
|
||||
---
|
||||
|
||||
# delete_collection - Delete a collection

The `delete_collection()` method is used to delete a specified collection.
|
||||
|
||||
:::info
|
||||
|
||||
This API is only available when you are connected to the database using a client. For more information about the client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You are connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
|
||||
|
||||
* The collection you want to delete exists. If the collection does not exist, an error is returned.
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
client.delete_collection(name)
|
||||
```
|
||||
|
||||
|Parameter|Type|Required|Description|Example value|
|
||||
|---|---|---|---|---|
|
||||
|`name`|string|Yes|The name of the collection to be deleted. |my_collection|
|
||||
|
||||
## Request example
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
# Delete a collection
|
||||
client.delete_collection("my_collection")
|
||||
```
|
||||
|
||||
## Response parameters
|
||||
|
||||
None
|
||||
|
||||
## References
|
||||
|
||||
* [Create a collection](100.create-collection-of-api.md)
|
||||
* [Query a collection](200.get-collection-of-api.md)
|
||||
* [Create or query a collection](250.get-or-create-collection-of-api.md)
|
||||
* [Get a collection list](300.list-collection-of-api.md)
|
||||
* [Count the number of collections](350.count-collection-of-api.md)
|
||||
@@ -0,0 +1,18 @@
|
||||
---
|
||||
slug: /collection-overview-of-api
|
||||
---
|
||||
|
||||
# Manage collections
|
||||
|
||||
In pyseekdb, a collection is a set similar to a table in a database. You can create, query, and delete collections.
|
||||
|
||||
The following APIs are supported for managing collections.
|
||||
|
||||
| API interface | Description | Documentation |
|
||||
|---|---|---|
|
||||
| `create_collection()` | Creates a collection. | [Documentation](100.create-collection-of-api.md) |
|
||||
| `get_collection()` | Gets a specified collection. |[Documentation](200.get-collection-of-api.md)|
|
||||
| `get_or_create_collection()` | Creates or queries a collection. If the collection does not exist in the database, it is created. If the collection exists, the corresponding result is obtained. |[Documentation](250.get-or-create-collection-of-api.md)|
|
||||
| `list_collections()` | Gets the collection list of a database. |[Documentation](300.list-collection-of-api.md)|
|
||||
| `count_collection()` | Counts the number of collections in a database. |[Documentation](350.count-collection-of-api.md)|
|
||||
| `delete_collection()` | Deletes a specified collection.|[Documentation](400.delete-collection-of-api.md)|
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
slug: /dml-overview-of-api
|
||||
---
|
||||
|
||||
# DML operations
|
||||
|
||||
DML (Data Manipulation Language) operations allow you to insert, update, and delete data in a collection.
|
||||
|
||||
For DML operations, you can use the following APIs.
|
||||
|
||||
| API | Description | Documentation |
|
||||
|---|---|---|
|
||||
| `add()` | Inserts a new record into a collection. | [Documentation](200.add-data-of-api.md) |
|
||||
| `update()` | Updates an existing record in a collection. |[Documentation](300.update-data-of-api.md)|
|
||||
| `upsert()` | Inserts a new record or updates an existing record. |[Documentation](400.upsert-data-of-api.md)|
|
||||
| `delete()` | Deletes a record from a collection.|[Documentation](500.delete-data-of-api.md)|
|
||||
@@ -0,0 +1,117 @@
|
||||
---
|
||||
slug: /add-data-of-api
|
||||
---
|
||||
|
||||
# add - Insert data
|
||||
|
||||
The `add()` method inserts new data into a collection. If a record with the same ID already exists, an error is returned.
|
||||
|
||||
:::info
|
||||
|
||||
This API is only available when using a Client. For more information about the Client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You have connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
|
||||
|
||||
* If you are using seekdb or OceanBase Database in client mode, make sure that the connected user has the `INSERT` privilege on the target table. For more information about how to view the privileges of the current user, see [View user privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have the required privilege, contact the administrator to grant it. For more information about how to directly grant a privilege, see [Directly grant a privilege](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
add(
|
||||
ids=ids,
|
||||
embeddings=embeddings,
|
||||
documents=documents,
|
||||
metadatas=metadatas
|
||||
)
|
||||
```
|
||||
|
||||
|Parameter|Type|Required|Description|Example value|
|
||||
|---|---|---|---|---|
|
||||
|`ids`|string or List[str]|Yes|The ID of the data to be inserted. You can specify a single ID or an array of IDs.|item1|
|
||||
|`embeddings`|List[float] or List[List[float]]|No|The vector or vectors of the data to be inserted. If you specify this parameter, the value of `embedding_function` is ignored. If you do not specify this parameter, you must specify `documents`, and the `collection` must have an `embedding_function`.|[0.1, 0.2, 0.3]|
|
||||
|`documents`|string or List[str]|No|The document or documents to be inserted. If you do not specify `embeddings`, `documents` are converted to vectors using the `embedding_function` of the `collection`.|"This is a document"|
|
||||
|`metadatas`|dict or List[dict]|No|The metadata or metadata list of the data to be inserted. |`{"category": "AI", "score": 95}`|
|
||||
|
||||
:::info
|
||||
|
||||
The `embedding_function` associated with the collection is set during `create_collection()` or `get_collection()`. You cannot override it for each operation.
|
||||
|
||||
:::
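
The interaction between `embeddings`, `documents`, and the collection's `embedding_function` described in the table above can be sketched as follows. This is a plain-Python illustration of the documented rule, not pyseekdb code. `resolve_embeddings` and `toy_embedding` are hypothetical names, and for brevity the sketch assumes list inputs.

```python
def toy_embedding(texts):
    """Hypothetical 3-dimensional embedding function, for illustration only."""
    return [[float(len(text)), 0.0, 1.0] for text in texts]


def resolve_embeddings(embeddings=None, documents=None, embedding_function=None):
    """Decide which vectors would be stored for an add() call."""
    if embeddings is not None:
        # Explicit vectors win; the embedding function is ignored.
        return embeddings
    if documents is None:
        raise ValueError("specify embeddings or documents")
    if embedding_function is None:
        raise ValueError(
            "documents were given without embeddings, "
            "but the collection has no embedding_function"
        )
    # Documents are converted to vectors by the collection's embedding function.
    return embedding_function(documents)


# Explicit embeddings are used as-is, even if an embedding function exists.
print(resolve_embeddings(embeddings=[[0.1, 0.2, 0.3]], embedding_function=toy_embedding))
# Documents only: vectors are generated.
print(resolve_embeddings(documents=["Text document 1"], embedding_function=toy_embedding))
```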
|
||||
|
||||
## Request example
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
from pyseekdb import DefaultEmbeddingFunction, HNSWConfiguration
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
collection = client.create_collection(
|
||||
name="my_collection",
|
||||
configuration=HNSWConfiguration(dimension=3, distance='cosine'),
|
||||
embedding_function=None
|
||||
)
|
||||
|
||||
# Add single item
|
||||
collection.add(
|
||||
ids="item1",
|
||||
embeddings=[0.1, 0.2, 0.3],
|
||||
documents="This is a document",
|
||||
metadatas={"category": "AI", "score": 95}
|
||||
)
|
||||
|
||||
# Add multiple items
|
||||
collection.add(
|
||||
ids=["item4", "item2", "item3"],
|
||||
embeddings=[
|
||||
[0.1, 0.2, 0.4],
|
||||
[0.4, 0.5, 0.6],
|
||||
[0.7, 0.8, 0.9]
|
||||
],
|
||||
documents=[
|
||||
"Document 1",
|
||||
"Document 2",
|
||||
"Document 3"
|
||||
],
|
||||
metadatas=[
|
||||
{"category": "AI", "score": 95},
|
||||
{"category": "ML", "score": 88},
|
||||
{"category": "DL", "score": 92}
|
||||
]
|
||||
)
|
||||
|
||||
# Add with only embeddings
|
||||
collection.add(
|
||||
ids=["vec1", "vec2"],
|
||||
embeddings=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
|
||||
)
|
||||
|
||||
collection1 = client.create_collection(
|
||||
name="my_collection1"
|
||||
)
|
||||
|
||||
# Add with only documents - embeddings auto-generated by embedding_function
|
||||
# Requires: collection must have embedding_function set
|
||||
collection1.add(
|
||||
ids=["doc1", "doc2"],
|
||||
documents=["Text document 1", "Text document 2"],
|
||||
metadatas=[{"tag": "A"}, {"tag": "B"}]
|
||||
)
|
||||
```
|
||||
|
||||
## Response parameters
|
||||
|
||||
None
|
||||
|
||||
## References
|
||||
|
||||
* [Update data](300.update-data-of-api.md)
|
||||
* [Update or insert data](400.upsert-data-of-api.md)
|
||||
* [Delete data](500.delete-data-of-api.md)
|
||||
@@ -0,0 +1,88 @@
|
||||
---
|
||||
slug: /update-data-of-api
|
||||
---
|
||||
|
||||
# update - Update data
|
||||
|
||||
The `update()` method is used to update existing records in a collection. The record must exist, otherwise an error will be raised.
|
||||
|
||||
:::info
|
||||
|
||||
This API is only available when using a Client. For more information about the Client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You have connected to the database. For more information about how to connect, see [Client](../50.client.md).
|
||||
|
||||
* If you are using seekdb in client mode or OceanBase Database, make sure that the connected user has the `UPDATE` privilege on the target table. For more information about how to view the privileges of the current user, see [View User Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have this privilege, contact the administrator to grant it. For more information about how to directly grant privileges, see [Directly Grant Privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
update(
|
||||
ids=ids,
|
||||
embeddings=embeddings,
|
||||
documents=documents,
|
||||
metadatas=metadatas
|
||||
)
|
||||
```
|
||||
|
||||
|Parameter|Type|Required|Description|Example value|
|
||||
|---|---|---|---|---|
|
||||
|`ids`|string or List[str]|Yes|The ID to be modified. It can be a single ID or an array of IDs.|item1|
|
||||
|`embeddings`|List[float] or List[List[float]]|No|The new vectors. If provided, they will be used directly (ignoring `embedding_function`). If not provided, you can provide `documents` to automatically generate vectors.|[[0.9, 0.8, 0.7], [0.6, 0.5, 0.4]]|
|
||||
|`documents`|string or List[str]|No|The new documents. If `embeddings` are not provided, `documents` are converted to vectors using the collection's `embedding_function`.|"New document text"|
|
||||
|`metadatas`|dict or List[dict]|No|The new metadata.|`{"category": "AI"}`|
|
||||
|
||||
:::info
|
||||
|
||||
You can update `metadatas` alone, without providing `embeddings` or `documents`. The `embedding_function` used is the one associated with the collection.
|
||||
|
||||
:::
|
||||
|
||||
## Request example
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
collection = client.get_collection("my_collection")
|
||||
collection1 = client.get_collection("my_collection1")
|
||||
|
||||
# Update single item
|
||||
collection.update(
|
||||
ids="item1",
|
||||
metadatas={"category": "AI", "score": 98} # Update metadata only
|
||||
)
|
||||
|
||||
# Update multiple items
|
||||
collection.update(
|
||||
ids=["item1", "item2"],
|
||||
embeddings=[[0.9, 0.8, 0.7], [0.6, 0.5, 0.4]], # Update embeddings
|
||||
documents=["Updated document 1", "Updated document 2"] # Update documents
|
||||
)
|
||||
|
||||
# Update with documents only - embeddings auto-generated by embedding_function
|
||||
# Requires: collection must have embedding_function set
|
||||
collection1.update(
|
||||
ids="doc1",
|
||||
documents="New document text", # Embeddings will be auto-generated
|
||||
metadatas={"category": "AI"}
|
||||
)
|
||||
```
|
||||
|
||||
## Response parameters
|
||||
|
||||
None
|
||||
|
||||
## References
|
||||
|
||||
* [Insert data](200.add-data-of-api.md)
|
||||
* [Update or insert data](400.upsert-data-of-api.md)
|
||||
* [Delete data](500.delete-data-of-api.md)
|
||||
@@ -0,0 +1,93 @@
|
||||
---
|
||||
slug: /upsert-data-of-api
|
||||
---
|
||||
|
||||
# upsert - Update or insert data
|
||||
|
||||
The `upsert()` method is used to insert new records or update existing records. If a record with the given ID already exists, it will be updated; otherwise, a new record will be inserted.
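
To make the difference between `add()`, `update()`, and `upsert()` concrete, here is a minimal in-memory sketch. `ToyCollection` is a hypothetical stand-in that stores documents in a dict; it is not the pyseekdb collection class, and the exception type is an assumption, because the documentation only states that an error is raised.

```python
class ToyCollection:
    """Hypothetical in-memory stand-in used to illustrate the three write modes."""

    def __init__(self):
        self.records = {}

    def add(self, record_id, document):
        # add(): the ID must not exist yet.
        if record_id in self.records:
            raise ValueError(f"record {record_id!r} already exists")
        self.records[record_id] = document

    def update(self, record_id, document):
        # update(): the ID must already exist.
        if record_id not in self.records:
            raise ValueError(f"record {record_id!r} does not exist")
        self.records[record_id] = document

    def upsert(self, record_id, document):
        # upsert(): insert if missing, overwrite if present.
        self.records[record_id] = document


coll = ToyCollection()
coll.upsert("item1", "Document text")      # inserted, because item1 is new
coll.upsert("item1", "Updated document")   # updated, because item1 exists
print(coll.records)
```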
|
||||
|
||||
:::info
|
||||
|
||||
This API is only available when using a Client connection. For more information about the Client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You have connected to the database. For more information about how to connect, see [Client](../50.client.md).
|
||||
|
||||
* If you are using seekdb or OceanBase Database in client mode, ensure that the connected user has the `INSERT` and `UPDATE` privileges on the target table. For more information about how to view the current user privileges, see [View user privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have the required privileges, contact the administrator to grant them. For more information about how to directly grant privileges, see [Directly grant privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
upsert(
|
||||
ids=ids,
|
||||
embeddings=embeddings,
|
||||
documents=documents,
|
||||
metadatas=metadatas
|
||||
)
|
||||
```
|
||||
|
||||
|Parameter|Type|Required|Description|Example value|
|
||||
|---|---|---|---|---|
|
||||
|`ids`|string or List[str]|Yes|The ID to be added or modified. It can be a single ID or an array of IDs.|item1|
|
||||
|`embeddings`|List[float] or List[List[float]]|No|The vectors. If provided, they will be used directly (ignoring `embedding_function`). If not provided, you can provide `documents` to automatically generate vectors.|[0.1, 0.2, 0.3]|
|
||||
|`documents`|string or List[str]|No|The documents. If `embeddings` are not provided, `documents` are converted to vectors using the collection's `embedding_function`.|"Document text"|
|
||||
|`metadatas`|dict or List[dict]|No|The metadata. |`{"category": "AI"}`|
|
||||
|
||||
## Request example
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
collection = client.get_collection("my_collection")
|
||||
collection1 = client.get_collection("my_collection1")
|
||||
|
||||
# Upsert single item (insert or update)
|
||||
collection.upsert(
|
||||
ids="item1",
|
||||
embeddings=[0.1, 0.2, 0.3],
|
||||
documents="Document text",
|
||||
metadatas={"category": "AI", "score": 95}
|
||||
)
|
||||
|
||||
# Upsert multiple items
|
||||
collection.upsert(
|
||||
ids=["item1", "item2", "item3"],
|
||||
embeddings=[
|
||||
[0.1, 0.2, 0.3],
|
||||
[0.4, 0.5, 0.6],
|
||||
[0.7, 0.8, 0.9]
|
||||
],
|
||||
documents=["Doc 1", "Doc 2", "Doc 3"],
|
||||
metadatas=[
|
||||
{"category": "AI"},
|
||||
{"category": "ML"},
|
||||
{"category": "DL"}
|
||||
]
|
||||
)
|
||||
|
||||
# Upsert with documents only - embeddings auto-generated by embedding_function
|
||||
# Requires: collection must have embedding_function set
|
||||
collection1.upsert(
|
||||
ids=["item1", "item2"],
|
||||
documents=["Document 1", "Document 2"],
|
||||
metadatas=[{"category": "AI"}, {"category": "ML"}]
|
||||
)
|
||||
```
|
||||
|
||||
## Response parameters
|
||||
|
||||
None
|
||||
|
||||
## References
|
||||
|
||||
* [Insert data](200.add-data-of-api.md)
|
||||
* [Update data](300.update-data-of-api.md)
|
||||
* [Delete data](500.delete-data-of-api.md)
|
||||
@@ -0,0 +1,87 @@
|
||||
---
|
||||
slug: /delete-data-of-api
|
||||
---
|
||||
|
||||
# delete - Delete data
|
||||
|
||||
`delete()` is used to delete records from a collection. You can delete records by ID, metadata filter, or document filter.
|
||||
|
||||
:::info
|
||||
|
||||
This API is only available when you are connected to the database using a Client. For more information about the Client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Quick Start](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You are connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
|
||||
|
||||
* If you are using seekdb or OceanBase Database in client mode, make sure that the connected user has the `DELETE` privilege on the target table. For more information about how to view the privileges of the current user, see [View user privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980135). If the user does not have this privilege, contact the administrator to grant it. For more information about how to directly grant privileges, see [Directly grant privileges](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003980140).
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
delete(
    ids=ids,
    where=where,
    where_document=where_document
)
```
|
||||
|
||||
|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`ids`|string or List[str]|No|The ID of the record to be deleted. You can specify a single ID or an array of IDs.|item1|
|`where`|dict|No|The metadata filter.|`{"category": {"$eq": "AI"}}`|
|`where_document`|dict|No|The document filter.|`{"$contains": "obsolete"}`|
|
||||
|
||||
:::info
|
||||
|
||||
At least one of the `ids`, `where`, or `where_document` parameters must be specified.
|
||||
|
||||
:::
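
The filter shapes used by `delete()` can be illustrated with a small plain-Python matcher. This sketch is not how pyseekdb evaluates filters. It only covers the operators that appear on this page (`$eq`, `$lt`, `$gte`, `$contains`), it assumes that combined filters must all match, and the names `matches_where`, `matches_where_document`, and `select_for_delete` are hypothetical.

```python
import operator

# Only the comparison operators shown on this page.
COMPARATORS = {"$eq": operator.eq, "$lt": operator.lt, "$gte": operator.ge}


def matches_where(metadata, where):
    """Return True if every field condition in `where` holds for `metadata`."""
    for field, condition in where.items():
        if field not in metadata:
            return False
        for op, expected in condition.items():
            if not COMPARATORS[op](metadata[field], expected):
                return False
    return True


def matches_where_document(document, where_document):
    """Return True if the document contains the `$contains` substring."""
    return where_document["$contains"] in document


def select_for_delete(records, ids=None, where=None, where_document=None):
    """Return the IDs that a delete call with these arguments would target."""
    if ids is None and where is None and where_document is None:
        raise ValueError("specify at least one of ids, where, or where_document")
    if isinstance(ids, str):
        ids = [ids]  # a single ID is treated like a one-element list
    selected = []
    for record_id, record in records.items():
        if ids is not None and record_id not in ids:
            continue
        if where is not None and not matches_where(record["metadata"], where):
            continue
        if where_document is not None and not matches_where_document(
            record["document"], where_document
        ):
            continue
        selected.append(record_id)
    return selected


records = {
    "item1": {"document": "obsolete notes", "metadata": {"category": "AI", "score": 40}},
    "item2": {"document": "current notes", "metadata": {"category": "ML", "score": 88}},
}
print(select_for_delete(records, where={"score": {"$lt": 50}}))
print(select_for_delete(records, where_document={"$contains": "notes"}))
```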
|
||||
|
||||
## Request examples
|
||||
|
||||
```python
|
||||
import pyseekdb
|
||||
|
||||
|
||||
# Create a client
|
||||
client = pyseekdb.Client()
|
||||
|
||||
collection = client.get_collection("my_collection")
|
||||
|
||||
# Delete by IDs
|
||||
collection.delete(ids=["item1", "item2", "item3"])
|
||||
|
||||
# Delete by single ID
|
||||
collection.delete(ids="item1")
|
||||
|
||||
# Delete by metadata filter
|
||||
collection.delete(where={"category": {"$eq": "AI"}})
|
||||
|
||||
# Delete by comparison operator
|
||||
collection.delete(where={"score": {"$lt": 50}})
|
||||
|
||||
# Delete by document filter
|
||||
collection.delete(where_document={"$contains": "obsolete"})
|
||||
|
||||
# Delete with combined filters
|
||||
collection.delete(
|
||||
where={"category": {"$eq": "AI"}},
|
||||
where_document={"$contains": "deprecated"}
|
||||
)
|
||||
```
|
||||
|
||||
## Response parameters
|
||||
|
||||
None
|
||||
|
||||
## References
|
||||
|
||||
* [Insert data](200.add-data-of-api.md)
|
||||
* [Update data](300.update-data-of-api.md)
|
||||
* [Update or insert data](400.upsert-data-of-api.md)
|
||||
@@ -0,0 +1,15 @@
|
||||
---
|
||||
slug: /dql-overview-of-api
|
||||
---
|
||||
|
||||
# Overview of DQL
|
||||
|
||||
DQL (Data Query Language) operations allow you to retrieve data from collections using various query methods.
|
||||
|
||||
For DQL operations, the following API interfaces are supported.
|
||||
|
||||
| API Interface | Description | Documentation Link |
|
||||
|---|---|---|
|
||||
| `query()` | A vector similarity search method. | [Documentation](200.query-interfaces-of-api.md) |
|
||||
| `get()` | Queries specific data from a table using an ID, document, or metadata (excluding vectors). | [Documentation](300.get-interfaces-of-api.md) |
|
||||
| `hybrid_search()` | Combines full-text search and vector similarity search using a ranking method. | [Documentation](400.hybrid-search-of-api.md) |
|
||||
@@ -0,0 +1,161 @@
|
||||
---
|
||||
slug: /query-interfaces-of-api
|
||||
---
|
||||
|
||||
# query - Vector query
|
||||
|
||||
The `query()` method is used to perform vector similarity search to find the most similar documents to the query vector.
|
||||
|
||||
:::info
|
||||
|
||||
This interface is only available when using the Client. For more information about the Client, see [Client](../50.client.md).
|
||||
|
||||
:::
|
||||
|
||||
## Prerequisites
|
||||
|
||||
* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).
|
||||
|
||||
* You have connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).
|
||||
|
||||
* You have created a collection and inserted data. For more information about how to create a collection and insert data, see [create_collection - Create a collection](../200.collection/100.create-collection-of-api.md) and [add - Insert data](../300.dml/200.add-data-of-api.md).
|
||||
|
||||
## Request parameters
|
||||
|
||||
```python
|
||||
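query(
    query_embeddings=query_embeddings,
    query_texts=query_texts,
    n_results=n_results,
    where=where,
    where_document=where_document,
    include=include
)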
|
||||
```
|
||||
|
||||
|Parameter|Value type|Required|Description|Example value|
|---|---|---|---|---|
|`query_embeddings`|List[float] or List[List[float]]|No|A single vector, or a list of vectors for batch queries. If provided, it is used directly and `embedding_function` is ignored. If not provided, `query_texts` must be provided and the collection must have an `embedding_function`.|[1.0, 2.0, 3.0]|
|`query_texts`|str or List[str]|No|A single text, or a list of texts for batch queries. If `query_embeddings` is not provided, the texts are converted to vectors by the collection's `embedding_function`.|["my query text"]|
|`n_results`|int|No|The number of similar results to return. The default value is 10.|3|
|`where`|dict|No|The metadata filter conditions.|`{"category": {"$eq": "AI"}}`|
|`where_document`|dict|No|The document filter conditions.|`{"$contains": "machine"}`|
|`include`|List[str]|No|The fields to include in the result: `documents`, `metadatas`, and `embeddings`.|["documents", "metadatas", "embeddings"]|
|
||||
|
||||
:::info
|
||||
|
||||
The `embedding_function` used is associated with the collection (set during `create_collection()` or `get_collection()`). You cannot override it for each operation.
|
||||
|
||||
:::
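
Conceptually, a vector similarity query ranks stored vectors by their distance to the query vector and returns the closest `n_results`. The sketch below shows this with a brute-force scan and cosine distance, assumed here to be `1 - cosine similarity`. It is not the HNSW index that the collection actually uses, and `cosine_distance` and `top_n` are hypothetical helpers.

```python
import math


def cosine_distance(a, b):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_n(query_embedding, stored, n_results=10):
    """Brute-force nearest neighbours: sort all stored vectors by distance."""
    scored = [
        (record_id, cosine_distance(query_embedding, vector))
        for record_id, vector in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:n_results]


stored = {
    "vec1": [1.0, 2.0, 3.0],
    "vec2": [4.0, 5.0, 6.0],
    "vec3": [-1.0, 0.0, 0.0],
}
for record_id, distance in top_n([1.0, 2.0, 3.0], stored, n_results=2):
    print(f"ID: {record_id}, Distance: {distance:.4f}")
```

A brute-force scan is exact but compares the query against every stored vector. An approximate index such as HNSW trades a little accuracy for much faster lookups on large collections.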
|
||||
|
||||
## Request example

```python
import pyseekdb

# Create a client
client = pyseekdb.Client()

collection = client.get_collection("my_collection")
collection1 = client.get_collection("my_collection1")

# Basic vector similarity query (embedding_function not used)
results = collection.query(
    query_embeddings=[1.0, 2.0, 3.0],
    n_results=3
)

# Iterate over results
for i in range(len(results["ids"][0])):
    print(f"ID: {results['ids'][0][i]}, Distance: {results['distances'][0][i]}")
    if results.get("documents"):
        print(f"Document: {results['documents'][0][i]}")
    if results.get("metadatas"):
        print(f"Metadata: {results['metadatas'][0][i]}")

# Query by texts - vectors auto-generated by embedding_function
# Requires: collection must have embedding_function set
results = collection1.query(
    query_texts=["my query text"],
    n_results=10
)
# The collection's embedding_function will automatically convert query_texts to query_embeddings

# Query by multiple texts (batch query)
results = collection1.query(
    query_texts=["query text 1", "query text 2"],
    n_results=5
)
# Returns dict with lists of lists, one list per query text
for i in range(len(results["ids"])):
    print(f"Query {i}: {len(results['ids'][i])} results")

# Query with metadata filter (using query_texts)
results = collection1.query(
    query_texts=["AI research"],
    where={"category": {"$eq": "AI"}},
    n_results=5
)

# Query with comparison operator (using query_texts)
results = collection1.query(
    query_texts=["machine learning"],
    where={"score": {"$gte": 90}},
    n_results=5
)

# Query with document filter (using query_texts)
results = collection1.query(
    query_texts=["neural networks"],
    where_document={"$contains": "machine learning"},
    n_results=5
)

# Query with combined filters (using query_texts)
results = collection1.query(
    query_texts=["AI research"],
    where={"category": {"$eq": "AI"}, "score": {"$gte": 90}},
    where_document={"$contains": "machine"},
    n_results=5
)

# Query with multiple vectors (batch query)
results = collection.query(
    query_embeddings=[[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]],
    n_results=2
)
# Returns dict with lists of lists, one list per query vector
for i in range(len(results["ids"])):
    print(f"Query {i}: {len(results['ids'][i])} results")

# Query with specific fields
results = collection.query(
    query_embeddings=[1.0, 2.0, 3.0],
    include=["documents", "metadatas", "embeddings"],
    n_results=3
)
```

## Return parameters

|Parameter|Value type|Required|Description|Example value|
|---|---|---|---|---|
|`ids`|List[List[str]]|Yes|The IDs of the matching records, one inner list per query.|`[["vec1", "vec2"]]`|
|`embeddings`|List[List[List[float]]]|No|The vectors of the matching records, one inner list per query.|`[[[0.1, 0.2, 0.3]]]`|
|`documents`|List[List[str]]|No|The documents of the matching records, one inner list per query.|`[["Document text"]]`|
|`metadatas`|List[List[Dict]]|No|The metadata of the matching records, one inner list per query.|`[[{"category": "AI"}]]`|
|`distances`|List[List[float]]|No|The distances between each query vector and its matching records, one inner list per query.|`[[0.0, 0.025368153802923787]]`|

## Return example

```python
ID: vec1, Distance: 0.0
Document: None
Metadata: {}
ID: vec2, Distance: 0.025368153802923787
Document: None
Metadata: {}
Query 0: 4 results
Query 1: 4 results
Query 0: 2 results
Query 1: 2 results
```

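The nested layout of the return value (one inner list per query) can be awkward to consume directly. The following is a minimal sketch that flattens such a dictionary into one record per hit. It runs on a hand-written mock dictionary rather than a live collection. The helper name `flatten_results` and all sample values are assumptions made for illustration and are not part of pyseekdb.

```python
from typing import Any, Dict, List

# Mock result in the lists-of-lists shape described above (values are made up).
results: Dict[str, Any] = {
    "ids": [["vec1", "vec2"], ["vec3"]],
    "distances": [[0.0, 0.025], [0.4]],
    "documents": [[None, "Document text"], ["Another document"]],
    "metadatas": [[{}, {"category": "AI"}], [{"category": "ML"}]],
}


def flatten_results(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn the per-query nested lists into one flat record per hit."""
    rows = []
    for query_index, ids in enumerate(results["ids"]):
        for hit_index, record_id in enumerate(ids):
            row = {"query": query_index, "id": record_id}
            # Copy every optional field that is present in the result.
            for field in ("distances", "documents", "metadatas"):
                values = results.get(field)
                if values:
                    row[field] = values[query_index][hit_index]
            rows.append(row)
    return rows


rows = flatten_results(results)
for row in rows:
    print(row)
```

Each row keeps the index of the query that produced it, so batch results can still be grouped by query after flattening.
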
## Related operations

* [get - Retrieve](300.get-interfaces-of-api.md)
* [Hybrid search](400.hybrid-search-of-api.md)
* [Operators](500.filter-operators-of-api.md)
@@ -0,0 +1,127 @@
---
slug: /get-interfaces-of-api
---

# get - Retrieve

`get()` is used to retrieve documents from a collection without performing vector similarity search.

It supports filtering by IDs, metadata, and documents.

:::info

This interface is only available when using the Client. For more information about the Client, see [Client](../50.client.md).

:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You have connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).

* You have created a collection and inserted data. For more information about how to create a collection and insert data, see [create_collection - Create a collection](../200.collection/100.create-collection-of-api.md) and [add - Insert data](../300.dml/200.add-data-of-api.md).

## Request parameters

```python
get()
```

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`ids`|str or List[str]|No|The ID or list of IDs to retrieve.|`["1", "2", "3"]`|
|`where`|dict|No|The metadata filter.|`{"category": {"$eq": "AI"}}`|
|`where_document`|dict|No|The document filter.|`{"$contains": "machine"}`|
|`limit`|int|No|The maximum number of results to return.|`10`|
|`offset`|int|No|The number of results to skip for pagination.|`1`|
|`include`|List[str]|No|The list of fields to include: `["documents", "metadatas", "embeddings"]`.|`["documents", "metadatas", "embeddings"]`|

:::info

If no parameters are provided, all data is returned.

:::

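The `limit` and `offset` parameters together support page-by-page retrieval. The following is a minimal sketch of the paging arithmetic. It uses an in-memory list and a hypothetical stand-in function, `fetch_page`, in place of `collection.get(limit=..., offset=...)`, so it runs without a database. It also assumes that results come back in a stable order across calls, which this document does not state.

```python
from typing import Iterator, List

# Stand-in data; with a real collection each page would come from
# collection.get(limit=..., offset=...).
RECORD_IDS = [f"id{i}" for i in range(7)]


def fetch_page(limit: int, offset: int) -> List[str]:
    """Hypothetical stand-in for collection.get(limit=limit, offset=offset)."""
    return RECORD_IDS[offset:offset + limit]


def iterate_pages(page_size: int) -> Iterator[List[str]]:
    """Yield successive pages until an empty page signals the end."""
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        if not page:
            break
        yield page
        offset += page_size  # skip everything already returned


pages = list(iterate_pages(page_size=3))
print(pages)
```

The last page can be shorter than `page_size`. The loop ends on the first empty page.
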
## Request example

```python
import pyseekdb

# Create a client
client = pyseekdb.Client()

collection = client.get_collection("my_collection")

# Get by single ID
results = collection.get(ids="123")

# Get by multiple IDs
results = collection.get(ids=["1", "2", "3"])

# Get by metadata filter
results = collection.get(
    where={"category": {"$eq": "AI"}},
    limit=10
)

# Get by comparison operator
results = collection.get(
    where={"score": {"$gte": 90}},
    limit=10
)

# Get by $in operator
results = collection.get(
    where={"tag": {"$in": ["ml", "python"]}},
    limit=10
)

# Get by logical operators ($or)
results = collection.get(
    where={
        "$or": [
            {"category": {"$eq": "AI"}},
            {"tag": {"$eq": "python"}}
        ]
    },
    limit=10
)

# Get by document content filter
results = collection.get(
    where_document={"$contains": "machine learning"},
    limit=10
)

# Get with combined filters
results = collection.get(
    where={"category": {"$eq": "AI"}},
    where_document={"$contains": "machine"},
    limit=10
)

# Get with pagination
results = collection.get(limit=2, offset=1)

# Get with specific fields
results = collection.get(
    ids=["1", "2"],
    include=["documents", "metadatas", "embeddings"]
)

# Get all data (up to limit)
results = collection.get(limit=100)
```

## Response parameters

* If a single ID is provided: The result contains the get object for that ID.
* If multiple IDs are provided: A list of QueryResult objects, one for each ID.
* If filters are provided: A QueryResult object containing all matching results.

## Related operations

* [Vector query](200.query-interfaces-of-api.md)
* [Hybrid search](400.hybrid-search-of-api.md)
* [Operators](500.filter-operators-of-api.md)
@@ -0,0 +1,140 @@
---
slug: /hybrid-search-of-api
---

# hybrid_search - Hybrid search

`hybrid_search()` combines full-text search and vector similarity search with ranking.

:::info

This API is only available when using the Client. For more information about the Client, see [Client](../50.client.md).

:::

## Prerequisites

* You have installed pyseekdb. For more information about how to install pyseekdb, see [Get Started](../../10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

* You have connected to the database. For more information about how to connect to the database, see [Client](../50.client.md).

* You have created a collection and inserted data. For more information about how to create a collection and insert data, see [create_collection - Create a collection](../200.collection/100.create-collection-of-api.md) and [add - Insert Data](../300.dml/200.add-data-of-api.md).

## Request parameters

```python
hybrid_search(
    query={
        "where_document": ...,
        "where": ...,
        "n_results": ...
    },
    knn={
        "query_texts": ...,
        "where": ...,
        "n_results": ...
    },
    rank=...,
    n_results=...,
    include=...
)
```

* `query`: full-text search configuration, including the following parameters:

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`where`|dict|No|Metadata filter conditions.|`{"category": {"$eq": "AI"}}`|
|`where_document`|dict|No|Document filter conditions.|`{"$contains": "machine"}`|
|`n_results`|int|Yes|Number of results for full-text search.|`10`|

* `knn`: vector search configuration, including the following parameters:

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`query_embeddings`|List[float] or List[List[float]]|No|A single vector or a list of vectors for batch queries. If provided, it is used directly and the `embedding_function` is ignored. If not provided, `query_texts` must be provided and the collection must have an `embedding_function`.|`[1.0, 2.0, 3.0]`|
|`query_texts`|str or List[str]|No|A single text or a list of texts for batch queries. The texts are converted to vectors by the `embedding_function` of the collection. Required if `query_embeddings` is not provided.|`["my query text"]`|
|`where`|dict|No|Metadata filter conditions.|`{"category": {"$eq": "AI"}}`|
|`n_results`|int|Yes|Number of results for vector search.|`10`|

* Other parameters are as follows:

|Parameter|Type|Required|Description|Example value|
|---|---|---|---|---|
|`rank`|dict|No|Ranking configuration, for example: `{"rrf": {"rank_window_size": 60, "rank_constant": 60}}`.|`{"rrf": {}}`|
|`n_results`|int|No|Number of similar results to return. The default value is 10.|3|
|`include`|List[str]|No|List of fields to include: `["documents", "metadatas", "embeddings"]`.|`["documents", "metadatas", "embeddings"]`|

:::info

The `embedding_function` used is associated with the collection (set during `create_collection()` or `get_collection()`). You cannot override it for each operation.

:::

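The `rank` parameter selects how the full-text and vector result lists are merged. As an illustration of the `rrf` option, the following sketch implements the generic Reciprocal Rank Fusion formula, in which each document scores `1 / (rank_constant + position)` in every list where it appears and the scores are summed. This is the textbook formula only. Whether seekdb computes the fused score exactly this way is an assumption, and the function name `reciprocal_rank_fusion` and the sample IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], rank_constant: int = 60) -> List[str]:
    """Fuse several ranked ID lists with the generic RRF formula."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Positions are 1-based: the top hit contributes 1 / (rank_constant + 1).
        for position, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank_constant + position)
    # Highest fused score first; ties are broken by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


full_text_hits = ["a", "b", "c"]  # made-up full-text ranking
vector_hits = ["b", "d", "a"]     # made-up vector ranking
fused = reciprocal_rank_fusion([full_text_hits, vector_hits])
print(fused)
```

Documents that appear in both lists accumulate two contributions, so they tend to rise above documents that rank well in only one list.
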
## Request example

```python
import pyseekdb

# Create a client
client = pyseekdb.Client()

collection = client.get_collection("my_collection")
collection1 = client.get_collection("my_collection1")

# Hybrid search with query_embeddings (embedding_function not used)
results = collection.hybrid_search(
    query={
        "where_document": {"$contains": "machine learning"},
        "n_results": 10
    },
    knn={
        "query_embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],  # Used directly
        "n_results": 10
    },
    rank={"rrf": {}},
    n_results=5
)

# Hybrid search with both full-text and vector search (using query_texts)
results = collection1.hybrid_search(
    query={
        "where_document": {"$contains": "machine learning"},
        "where": {"category": {"$eq": "science"}},
        "n_results": 10
    },
    knn={
        "query_texts": ["AI research"],  # Will be embedded automatically
        "where": {"year": {"$gte": 2020}},
        "n_results": 10
    },
    rank={"rrf": {}},  # Reciprocal Rank Fusion
    n_results=5,
    include=["documents", "metadatas", "embeddings"]
)

# Hybrid search with multiple query texts (batch)
results = collection1.hybrid_search(
    query={
        "where_document": {"$contains": "AI"},
        "n_results": 10
    },
    knn={
        "query_texts": ["machine learning", "neural networks"],  # Multiple queries
        "n_results": 10
    },
    rank={"rrf": {}},
    n_results=5
)
```

## Return parameters

A dictionary that contains the search results, including fields such as `ids`, `distances`, `metadatas`, and `documents`.

## Related operations

* [Vector query](200.query-interfaces-of-api.md)
* [get - Retrieve](300.get-interfaces-of-api.md)
* [Operators](500.filter-operators-of-api.md)
@@ -0,0 +1,151 @@
---
slug: /filter-operators-of-api
---

# Operators

Operators are used to connect operands or parameters and return results. In terms of syntax, operators can appear before, after, or between operands.

## Operator examples

### Data filtering (where)

#### Equal to

Use `$eq` to indicate equal to, as shown in the following example:

```python
where={"category": {"$eq": "AI"}}
```

#### Not equal to

Use `$ne` to indicate not equal to, as shown in the following example:

```python
where={"status": {"$ne": "deleted"}}
```

#### Greater than

Use `$gt` to indicate greater than, as shown in the following example:

```python
where={"score": {"$gt": 90}}
```

#### Greater than or equal to

Use `$gte` to indicate greater than or equal to, as shown in the following example:

```python
where={"score": {"$gte": 90}}
```

#### Less than

Use `$lt` to indicate less than, as shown in the following example:

```python
where={"score": {"$lt": 50}}
```

#### Less than or equal to

Use `$lte` to indicate less than or equal to, as shown in the following example:

```python
where={"score": {"$lte": 50}}
```

#### In

Use `$in` to match values that are in the specified list, as shown in the following example:

```python
where={"tag": {"$in": ["ml", "python", "ai"]}}
```

#### Not in

Use `$nin` to match values that are not in the specified list, as shown in the following example:

```python
where={"tag": {"$nin": ["deprecated", "old"]}}
```

#### Logical OR

Use `$or` to indicate logical OR, as shown in the following example:

```python
where={
    "$or": [
        {"category": {"$eq": "AI"}},
        {"tag": {"$eq": "python"}}
    ]
}
```

#### Logical AND

Use `$and` to indicate logical AND, as shown in the following example:

```python
where={
    "$and": [
        {"category": {"$eq": "AI"}},
        {"score": {"$gte": 90}}
    ]
}
```

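To make the meaning of the `where` operators concrete, the following sketch evaluates a filter against a single metadata dictionary in plain Python. It only illustrates the semantics described above and is not how seekdb evaluates filters internally. The helper `matches_where` is hypothetical, and two behaviors are assumptions: several fields in one filter dictionary are combined with AND, and a missing field never satisfies a range comparison.

```python
from typing import Any, Callable, Dict

# Comparison operators from the sections above, expressed as Python predicates.
COMPARATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": lambda value, target: value == target,
    "$ne": lambda value, target: value != target,
    "$gt": lambda value, target: value is not None and value > target,
    "$gte": lambda value, target: value is not None and value >= target,
    "$lt": lambda value, target: value is not None and value < target,
    "$lte": lambda value, target: value is not None and value <= target,
    "$in": lambda value, target: value in target,
    "$nin": lambda value, target: value not in target,
}


def matches_where(metadata: Dict[str, Any], where: Dict[str, Any]) -> bool:
    """Return True if the metadata satisfies the filter (illustrative only)."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches_where(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches_where(metadata, sub) for sub in condition):
                return False
        else:
            # A plain field name maps to one or more {operator: target} pairs.
            value = metadata.get(key)
            for operator, target in condition.items():
                if not COMPARATORS[operator](value, target):
                    return False
    return True


record = {"category": "AI", "score": 95, "tag": "ml"}  # made-up metadata
print(matches_where(record, {"category": {"$eq": "AI"}}))
print(matches_where(record, {"score": {"$lt": 50}}))
```

The same filter dictionaries can be passed unchanged as the `where` argument in the request examples of this document.
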
### Text filtering (where_document)

#### Full-text search (contains substring)

Use `$contains` to indicate full-text search, as shown in the following example:

```python
where_document={"$contains": "machine learning"}
```

#### Regular expression

Use `$regex` to indicate regular expression, as shown in the following example:

```python
where_document={"$regex": "pattern.*"}
```

#### Logical OR

Use `$or` to indicate logical OR, as shown in the following example:

```python
where_document={
    "$or": [
        {"$contains": "machine learning"},
        {"$contains": "artificial intelligence"}
    ]
}
```

#### Logical AND

Use `$and` to indicate logical AND, as shown in the following example:

```python
where_document={
    "$and": [
        {"$contains": "machine"},
        {"$contains": "learning"}
    ]
}
```

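The `where_document` operators can be illustrated in the same way. The sketch below checks a document string against a filter using plain substring matching for `$contains` and Python's `re.search` for `$regex`. Both choices are assumptions: the full-text matching that seekdb performs may tokenize or normalize text differently, and its regular expression dialect may differ from Python's. The helper `matches_document` is hypothetical.

```python
import re
from typing import Any, Dict


def matches_document(document: str, where_document: Dict[str, Any]) -> bool:
    """Return True if the document text satisfies the filter (illustrative only)."""
    for operator, operand in where_document.items():
        if operator == "$contains":
            ok = operand in document  # case-sensitive substring test
        elif operator == "$regex":
            ok = re.search(operand, document) is not None
        elif operator == "$and":
            ok = all(matches_document(document, sub) for sub in operand)
        elif operator == "$or":
            ok = any(matches_document(document, sub) for sub in operand)
        else:
            raise ValueError(f"Unsupported operator: {operator}")
        if not ok:
            return False
    return True


text = "machine learning with pattern matching"  # made-up document
print(matches_document(text, {"$contains": "machine learning"}))
print(matches_document(text, {"$and": [{"$contains": "machine"}, {"$contains": "neural"}]}))
```
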
## Related operations

* [Vector query](200.query-interfaces-of-api.md)
* [get - Retrieve](300.get-interfaces-of-api.md)
* [Hybrid search](400.hybrid-search-of-api.md)
@@ -0,0 +1,107 @@
---
slug: /client
---

# Client

The `Client` class is used to connect to a database in either embedded mode or server mode. It automatically selects the appropriate connection mode based on the provided parameters.

:::tip
OceanBase Database is a fully self-developed, enterprise-level, native distributed database developed by OceanBase. It achieves financial-grade high availability on ordinary hardware and sets a new standard for automatic, lossless disaster recovery across five IDCs in three regions. It also sets a new benchmark in the TPC-C benchmark test, with a single cluster size exceeding 1,500 nodes. OceanBase Database is cloud-native, highly consistent, and highly compatible with Oracle and MySQL. For more information about OceanBase Database, see [OceanBase Database](https://www.oceanbase.com/docs/oceanbase-database-cn).
:::

## Connect to an embedded seekdb instance

Use the `Client` class to connect to a local embedded seekdb instance.

```python
import pyseekdb

# Create embedded client
client = pyseekdb.Client(
    #path="./seekdb",      # Path to seekdb data directory
    #database="test"       # Database name
)
```

The following table describes the parameters.

| Parameter | Value type | Required | Description | Example value |
| --- | --- | --- | --- | --- |
| `path` | string | No | The path to the seekdb data directory. seekdb stores database files in this directory and loads them when it starts. | `./seekdb` |
| `database` | string | No | The name of the database. | `test` |

## Connect to a remote server

Use the `Client` class to connect to a remote server that runs seekdb or OceanBase Database.

:::tip

Before you connect to a remote server, make sure that you have deployed a server instance of seekdb or OceanBase Database. <br/>For information about how to deploy a server instance of seekdb, see [Overview](../../../400.guides/400.deploy/50.deploy-overview.md).<br/>For information about how to deploy OceanBase Database, see [Overview](https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000003976427).

:::

Example: Connect to a server instance of seekdb

```python
import pyseekdb

# Create remote server client (seekdb Server)
client = pyseekdb.Client(
    host="127.0.0.1",      # Server host
    port=2881,             # Server port
    database="test",       # Database name
    user="root",           # Username
    password=""            # Password (can be retrieved from SEEKDB_PASSWORD environment variable)
)
```

The following table describes the parameters.

| Parameter | Value type | Required | Description | Example value |
| --- | --- | --- | --- | --- |
| `host` | string | Yes | The IP address of the server where the instance is located. | `127.0.0.1` |
| `port` | int | No | The port number of the instance. The default value is 2881. | `2881` |
| `database` | string | Yes | The name of the database. | `test` |
| `user` | string | No | The username. The default value is root. | `root` |
| `password` | string | No | The password corresponding to the user. If you do not provide the `password` parameter or specify an empty string, the system retrieves the password from the `SEEKDB_PASSWORD` environment variable. ||

Example: Connect to OceanBase Database

```python
import pyseekdb

# Create remote server client (OceanBase Server)
client = pyseekdb.Client(
    host="127.0.0.1",      # Server host
    port=2881,             # Server port (default: 2881)
    tenant="test",         # Tenant name
    database="test",       # Database name
    user="root",           # Username (default: "root")
    password=""            # Password (can be retrieved from SEEKDB_PASSWORD environment variable)
)
```

The following table describes the parameters.

| Parameter | Value type | Required | Description | Example value |
| --- | --- | --- | --- | --- |
| `host` | string | Yes | The IP address of the server where the database is located. | `127.0.0.1` |
| `port` | int | No | The port number of OceanBase Database. The default value is 2881. | `2881` |
| `tenant` | string | No | The name of the tenant. This parameter is not required for seekdb. For OceanBase Database, the default value is sys. | `test` |
| `database` | string | Yes | The name of the database. | `test` |
| `user` | string | No | The username corresponding to the tenant. The default value is root. | `root` |
| `password` | string | No | The password corresponding to the user. If you do not provide the `password` parameter or specify an empty string, the system retrieves the password from the `SEEKDB_PASSWORD` environment variable. ||

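The password fallback described in the tables above can be sketched as follows. This shows the documented behavior only and is not the pyseekdb implementation: the helper `resolve_password` is hypothetical, and returning an empty string when the environment variable is also unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_password(password: Optional[str] = None,
                     environ: Mapping[str, str] = os.environ) -> str:
    """Illustrative fallback: explicit password first, then SEEKDB_PASSWORD."""
    if password:  # a non-empty string was passed explicitly
        return password
    # Missing or empty password: fall back to the environment variable.
    return environ.get("SEEKDB_PASSWORD", "")


fake_env = {"SEEKDB_PASSWORD": "from-env"}  # made-up environment for the demo
print(resolve_password("secret", environ=fake_env))
print(resolve_password("", environ=fake_env))
```

Keeping the password in `SEEKDB_PASSWORD` avoids hard-coding credentials in scripts, which is why the examples above pass `password=""`.
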
## APIs supported when you use the Client class to connect to a database

When you use the `Client` class to connect to a database, you can call the following APIs.

| API | Description | Document link |
| --- | --- | --- |
| `create_collection()` | Creates a new collection. | [Document](200.collection/100.create-collection-of-api.md) |
| `get_collection()` | Queries a specified collection. | [Document](200.collection/200.get-collection-of-api.md) |
| `delete_collection()` | Deletes a specified collection. | [Document](200.collection/400.delete-collection-of-api.md) |
| `list_collections()` | Lists all collections in the current database. | [Document](200.collection/300.list-collection-of-api.md) |
| `get_or_create_collection()` | Queries a specified collection. If the collection does not exist, it is created. | [Document](200.collection/250.get-or-create-collection-of-api.md) |
| `count_collection()` | Queries the number of collections in the current database. | [Document](200.collection/350.count-collection-of-api.md) |