Initial commit

Zhongwei Li
2025-11-30 08:44:54 +08:00
commit eb309b7b59
133 changed files with 21979 additions and 0 deletions

@@ -0,0 +1,131 @@
---
slug: /deploy-seekdb-testing-environment
---
# Quickly deploy seekdb in client/server mode
seekdb provides embedded mode and client/server mode. You can choose the appropriate deployment mode based on your business scenario. This topic introduces how to quickly deploy seekdb in client/server mode.
:::info
For information about using seekdb in embedded mode, see [Experience embedded seekdb](../50.embedded-mode/25.using-seekdb-in-python-sdk.md).
:::
## Deployment modes
seekdb supports two deployment modes, covering scenarios from rapid prototyping to large-scale user workloads.
* Embedded mode
seekdb embeds as a lightweight library installable with a single pip command, ideal for personal learning or prototyping, and can easily run on various end devices.
* Client/Server mode
A lightweight and easy-to-use deployment mode recommended for both testing and production, delivering stable and efficient service.
:::info
For more detailed and comprehensive deployment methods for seekdb, see [Deployment overview](../../400.guides/400.deploy/50.deploy-overview.md).
:::
## Prerequisites
Before you begin, confirm the following:
* Your environment runs a supported Linux system (the yum-based steps in this topic require an RPM-based distribution). The following systems are currently verified to be supported:
* Anolis OS 8.X (Linux kernel 3.10.0 or later)
* Alibaba Cloud Linux 2/3 (Linux kernel 3.10.0 or later)
* Red Hat Enterprise Linux Server 7.X, 8.X (Linux kernel 3.10.0 or later)
* CentOS Linux 7.X, 8.X (Linux kernel 3.10.0 or later)
* Debian 9.X or later (Linux kernel 3.10.0 or later)
* Ubuntu 20.X or later (Linux kernel 3.10.0 or later)
* SUSE / OpenSUSE 15.X or later (Linux kernel 3.10.0 or later)
* openEuler 22.03 and 24.03 (Linux kernel 5.10.0 or later)
* KylinOS V10
* UOS 1020a/1021a/1021e/1001c
* NFSChina 4.0 or later
* Inspur KOS 5.8
* Your environment has at least 1 CPU core.
* Your environment has at least 2 GB of available memory.
* You have installed a database connection tool (MySQL client or OBClient) in your environment.
* The user you are using has permission to execute sudo commands.
* Requirements for deploying using yum install:
* You have installed the jq command-line tool in your environment and correctly configured systemd as the system and service manager.
* Requirements for deploying using Docker:
* You have installed Docker and started the Docker service.
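The CPU and memory minimums listed above can be checked with a short script before installing. The following is a minimal sketch, not part of seekdb: `check_minimums` and `read_host_resources` are hypothetical helper names, and the memory reading relies on `os.sysconf` keys that are assumed to be available on Linux only.

```python
import os
import platform

MIN_CPU_CORES = 1                 # minimum CPU cores from the prerequisites
MIN_MEMORY_BYTES = 2 * 1024**3    # 2 GB of available memory from the prerequisites


def check_minimums(cpu_cores, available_memory_bytes):
    """Return a list of problems; an empty list means both minimums are met."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"need at least {MIN_CPU_CORES} CPU core, found {cpu_cores}")
    if available_memory_bytes < MIN_MEMORY_BYTES:
        found_gb = available_memory_bytes / 1024**3
        problems.append(f"need at least 2 GB of available memory, found {found_gb:.1f} GB")
    return problems


def read_host_resources():
    """Best-effort read of the CPU count and available memory (Linux only)."""
    cores = os.cpu_count() or 0
    try:
        memory = os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        memory = None  # these sysconf keys are not available on this platform
    return cores, memory


cores, memory = read_host_resources()
print("kernel release:", platform.release())
if memory is None:
    print("available memory could not be determined on this platform")
else:
    for line in check_minimums(cores, memory) or ["CPU and memory minimums are met"]:
        print(line)
```

The script only reports CPU, memory, and the kernel release. Whether the operating system itself is on the verified list still has to be checked manually.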
## Quickly deploy seekdb using yum install
1. Add the seekdb repository.
```shell
[admin@test001 ~]$ sudo yum-config-manager --add-repo https://mirrors.aliyun.com/oceanbase/OceanBase.repo
```
2. Install seekdb.
```shell
[admin@test001 ~]$ sudo yum install seekdb obclient
```
3. Start seekdb.
```shell
[admin@test001 ~]$ sudo systemctl start seekdb
```
4. Check the startup status of seekdb.
```shell
[admin@test001 ~]$ sudo systemctl status seekdb
```
When the status shows `Service is ready`, seekdb has started successfully.
5. Connect to seekdb.
```shell
mysql -h127.0.0.1 -uroot -P2881 -A oceanbase
```
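When the deployment is scripted, it can help to wait until the SQL port accepts connections before running a client. The following is a minimal sketch, assuming the default port `2881` used in the connection command above; `wait_for_port` is a hypothetical helper, not a seekdb tool.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=2881, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port(timeout=120)` returns `True` as soon as the port opens. An open port does not guarantee that the service has finished initializing, so the `systemctl status` check in step 4 remains the authoritative signal.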
## Quickly deploy seekdb in a container environment
If Docker is installed and the Docker service is started in your environment, you can also deploy seekdb using Docker containers. For more information about Docker deployment, see [Deploy seekdb in a container environment](../../400.guides/400.deploy/700.server-mode/200.deploy-by-docker.md).
1. Start a seekdb instance directly.
```shell
[admin@test001 ~]$ sudo docker run -d -p 2881:2881 oceanbase/seekdb
```
:::info
If pulling the Docker image fails, you can also pull the image from the quay.io or ghcr.io repository. Simply replace <code>oceanbase/seekdb</code> in the above command with <code>quay.io/oceanbase/seekdb</code> or <code>ghcr.io/oceanbase/seekdb</code>. For example, execute <code>sudo docker run -d -p 2881:2881 quay.io/oceanbase/seekdb</code> to pull the image from quay.io.
:::
2. Connect to seekdb.
```shell
mysql -h127.0.0.1 -uroot -P2881 -A oceanbase
```
## What's next
After deploying and connecting to seekdb, you can further experience seekdb's AI Native features and try building AI applications based on seekdb:
* [Experience vector search](30.experience-vector-search.md)
* [Experience full-text indexing](40.experience-full-text-indexing.md)
* [Experience hybrid search](50.experience-hybrid-search.md)
* [Experience AI function service](60.experience-ai-function.md)
* [Experience semantic indexing](70.experience-hybrid-vector-index.md)
* [Experience the Vibe Coding paradigm with Cursor Agent + OceanBase MCP](80.experience-vibe-coding-paradigm-with-cursor-agent-oceanbase-mcp.md)
* [Build a knowledge base desktop application based on seekdb](../../500.tutorials/100.create-ai-app-demo/100.build-kb-in-seekdb.md)
* [Build a cultural tourism assistant with multi-model integration based on seekdb](../../500.tutorials/100.create-ai-app-demo/300.build-multi-model-application-based-on-oceanbase.md)
* [Build an image search application based on seekdb](../../500.tutorials/100.create-ai-app-demo/400.build-image-search-app-in-seekdb.md)

@@ -0,0 +1,861 @@
---
slug: /basic-sql-operations
---
# Basic SQL operations
This topic introduces some basic SQL operations in seekdb.
## Create a database
Use the `CREATE DATABASE` statement to create a database.
Example: Create a database named `db1`, specify the character set as `utf8mb4`, and set the read-write attribute.
```sql
obclient> CREATE DATABASE db1 DEFAULT CHARACTER SET utf8mb4 READ WRITE;
Query OK, 1 row affected
```
For more information about the `CREATE DATABASE` statement, see [CREATE DATABASE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974111).
After the database is created, you can use the `SHOW DATABASES` statement to view all databases on the current database server.
```sql
obclient> SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| db1 |
| information_schema |
| mysql |
| oceanbase |
| sys_external_tbs |
| test |
+--------------------+
6 rows in set
```
## Table operations
In seekdb, a table is the most basic data storage unit that contains all data accessible to users. Each table contains multiple rows of records, and each record consists of multiple columns. This topic provides the syntax and examples for creating, viewing, modifying, and deleting tables in a database.
### Create a table
Use the `CREATE TABLE` statement to create a new table in a database.
Example: Create a table named `test` in the database `db1`.
```sql
obclient> USE db1;
Database changed
obclient> CREATE TABLE test (c1 INT PRIMARY KEY, c2 VARCHAR(3));
Query OK, 0 rows affected
```
For more information about the `CREATE TABLE` statement, see [CREATE TABLE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974140).
### View tables
Use the `SHOW CREATE TABLE` statement to view the table creation statement.
Examples:
* View the table creation statement for the table `test`.
```sql
obclient> SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`c1` int(11) NOT NULL,
`c2` varchar(3) DEFAULT NULL,
PRIMARY KEY (`c1`)
) ORGANIZATION INDEX DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE ENABLE_MACRO_BLOCK_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0
1 row in set
```
* Use the `SHOW TABLES` statement to view all tables in the database `db1`.
```sql
obclient> SHOW TABLES FROM db1;
+---------------+
| Tables_in_db1 |
+---------------+
| test |
+---------------+
1 row in set
```
### Modify a table
Use the `ALTER TABLE` statement to modify the structure of an existing table, including modifying table attributes, adding columns, modifying columns and their attributes, and deleting columns.
Examples:
* Rename the column `c2` to `c3` in the table `test` and change its data type.
```sql
obclient> DESCRIBE test;
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| c1 | int(11) | NO | PRI | NULL | |
| c2 | varchar(3) | YES | | NULL | |
+-------+------------+------+-----+---------+-------+
2 rows in set
obclient> ALTER TABLE test CHANGE COLUMN c2 c3 CHAR(10);
Query OK, 0 rows affected
obclient> DESCRIBE test;
+-------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| c1 | int(11) | NO | PRI | NULL | |
| c3 | char(10) | YES | | NULL | |
+-------+----------+------+-----+---------+-------+
2 rows in set
```
* Add and delete columns in the table `test`.
```sql
obclient> DESCRIBE test;
+-------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| c1 | int(11) | NO | PRI | NULL | |
| c3 | char(10) | YES | | NULL | |
+-------+----------+------+-----+---------+-------+
2 rows in set
obclient> ALTER TABLE test ADD c4 int;
Query OK, 0 rows affected
obclient> DESCRIBE test;
+-------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| c1 | int(11) | NO | PRI | NULL | |
| c3 | char(10) | YES | | NULL | |
| c4 | int(11) | YES | | NULL | |
+-------+----------+------+-----+---------+-------+
3 rows in set
obclient> ALTER TABLE test DROP c3;
Query OK, 0 rows affected
obclient> DESCRIBE test;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| c1 | int(11) | NO | PRI | NULL | |
| c4 | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set
```
For more information about the `ALTER TABLE` statement, see [ALTER TABLE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974126).
### Delete a table
Use the `DROP TABLE` statement to delete a table.
Example: Delete the table `test`.
```sql
obclient> DROP TABLE test;
Query OK, 0 rows affected
```
For more information about the `DROP TABLE` statement, see [DROP TABLE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974139).
## Index operations
An index is a structure created on a table that sorts the values of one or more columns in the database table. Its main purpose is to improve query speed and reduce the performance overhead of the database system. This topic introduces the syntax and examples for creating, viewing, and deleting indexes in a database.
### Create an index
Use the `CREATE INDEX` statement to create an index on a table.
Example: Create an index on the table `test`.
```sql
obclient> CREATE TABLE test (c1 INT PRIMARY KEY, c2 VARCHAR(3));
Query OK, 0 rows affected (0.10 sec)
obclient> DESCRIBE test;
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| c1 | int(11) | NO | PRI | NULL | |
| c2 | varchar(3) | YES | | NULL | |
+-------+------------+------+-----+---------+-------+
2 rows in set
obclient> CREATE INDEX test_index ON test (c1, c2);
Query OK, 0 rows affected
```
For more information about the `CREATE INDEX` statement, see [CREATE INDEX](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974165).
### View indexes
Use the `SHOW INDEX` statement to view indexes on a table.
Example: View index information for the table `test`.
```sql
obclient> SHOW INDEX FROM test\G
*************************** 1. row ***************************
Table: test
Non_unique: 0
Key_name: PRIMARY
Seq_in_index: 1
Column_name: c1
Collation: A
Cardinality: NULL
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment: available
Index_comment:
Visible: YES
Expression: NULL
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: test_index
Seq_in_index: 1
Column_name: c1
Collation: A
Cardinality: NULL
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment: available
Index_comment:
Visible: YES
Expression: NULL
*************************** 3. row ***************************
Table: test
Non_unique: 1
Key_name: test_index
Seq_in_index: 2
Column_name: c2
Collation: A
Cardinality: NULL
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment: available
Index_comment:
Visible: YES
Expression: NULL
3 rows in set
```
### Delete an index
Use the `DROP INDEX` statement to delete an index on a table.
Example: Delete the index `test_index` from the table `test`.
```sql
obclient> DROP INDEX test_index ON test;
Query OK, 0 rows affected
```
For more information about the `DROP INDEX` statement, see [DROP INDEX](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974168).
## Insert data
Use the `INSERT` statement to insert data into an existing table.
Examples:
* Create a table `t1` and insert one row of data.
```sql
obclient> CREATE TABLE t1(c1 INT PRIMARY KEY, c2 int) PARTITION BY KEY(c1) PARTITIONS 4;
Query OK, 0 rows affected
obclient> SELECT * FROM t1;
Empty set
obclient> INSERT t1 VALUES(1,1);
Query OK, 1 row affected
obclient> SELECT * FROM t1;
+----+------+
| c1 | c2 |
+----+------+
| 1 | 1 |
+----+------+
1 row in set
```
* Insert multiple rows of data into the table `t1`.
```sql
obclient> INSERT t1 VALUES(2,2),(3,default),(2+2,3*4);
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t1;
+----+------+
| c1 | c2 |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | NULL |
| 4 | 12 |
+----+------+
4 rows in set
```
For more information about the `INSERT` statement, see [INSERT](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974718).
## Delete data
Use the `DELETE` statement to delete data. It supports deleting data from a single table or multiple tables.
Examples:
* Create tables `t2` and `t3`, and then delete the row where `c1 = 2` from the table `t2`. `c1` is the primary key column of `t2`.
```sql
/*Table `t3` is a `KEY` partitioned table, and the partition names are automatically generated by the system according to the partition naming rules, that is, the partition names are `p0`, `p1`, `p2`, and `p3`*/
obclient> CREATE TABLE t2(c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> INSERT t2 VALUES(1,1),(2,2),(3,3),(5,5);
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t2;
+----+------+
| c1 | c2 |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 5 | 5 |
+----+------+
4 rows in set
obclient> CREATE TABLE t3(c1 INT PRIMARY KEY, c2 INT) PARTITION BY KEY(c1) PARTITIONS 4;
Query OK, 0 rows affected
obclient> INSERT INTO t3 VALUES(5,5),(1,1),(2,2),(3,3);
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t3;
+----+------+
| c1 | c2 |
+----+------+
| 5 | 5 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
4 rows in set
obclient> DELETE FROM t2 WHERE c1 = 2;
Query OK, 1 row affected
obclient> SELECT * FROM t2;
+----+------+
| c1 | c2 |
+----+------+
| 1 | 1 |
| 3 | 3 |
| 5 | 5 |
+----+------+
3 rows in set
```
* Delete the first row of data from the table `t2` after sorting by the `c2` column.
```sql
obclient> DELETE FROM t2 ORDER BY c2 LIMIT 1;
Query OK, 1 row affected
obclient> SELECT * FROM t2;
+----+------+
| c1 | c2 |
+----+------+
| 3 | 3 |
| 5 | 5 |
+----+------+
2 rows in set
```
* Delete data from the `p2` partition of the table `t3`.
```sql
obclient> SELECT * FROM t3 PARTITION(p2);
+----+------+
| c1 | c2 |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
3 rows in set
obclient> DELETE FROM t3 PARTITION(p2);
Query OK, 3 rows affected
obclient> SELECT * FROM t3;
+----+------+
| c1 | c2 |
+----+------+
| 5 | 5 |
+----+------+
1 row in set
```
* Delete data from tables `t2` and `t3` where `t2.c1 = t3.c1`.
```sql
obclient> SELECT * FROM t2;
+----+------+
| c1 | c2 |
+----+------+
| 3 | 3 |
| 5 | 5 |
+----+------+
2 rows in set
obclient> SELECT * FROM t3;
+----+------+
| c1 | c2 |
+----+------+
| 5 | 5 |
+----+------+
1 row in set
obclient> DELETE t2, t3 FROM t2, t3 WHERE t2.c1 = t3.c1;
Query OK, 2 rows affected
/*Equivalent to
obclient> DELETE FROM t2, t3 USING t2, t3 WHERE t2.c1 = t3.c1;
*/
obclient> SELECT * FROM t2;
+----+------+
| c1 | c2 |
+----+------+
| 3 | 3 |
+----+------+
1 row in set
obclient> SELECT * FROM t3;
Empty set
```
For more information about the `DELETE` statement, see [DELETE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974138).
## Update data
Use the `UPDATE` statement to modify field values in a table.
Examples:
* Create tables `t4` and `t5` using `CREATE TABLE`. Modify the `c2` column value to `100` for the row where `t4.c1=10` in the table `t4`.
```sql
obclient> CREATE TABLE t4(c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> INSERT t4 VALUES(10,10),(20,20),(30,30),(40,40);
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t4;
+----+------+
| c1 | c2 |
+----+------+
| 10 | 10 |
| 20 | 20 |
| 30 | 30 |
| 40 | 40 |
+----+------+
4 rows in set
obclient> CREATE TABLE t5(c1 INT PRIMARY KEY, c2 INT) PARTITION BY KEY(c1) PARTITIONS 4;
Query OK, 0 rows affected
obclient> INSERT t5 VALUES(50,50),(10,10),(20,20),(30,30);
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t5;
+----+------+
| c1 | c2 |
+----+------+
| 20 | 20 |
| 10 | 10 |
| 50 | 50 |
| 30 | 30 |
+----+------+
4 rows in set
obclient> UPDATE t4 SET t4.c2 = 100 WHERE t4.c1 = 10;
Query OK, 1 row affected
Rows matched: 1 Changed: 1 Warnings: 0
obclient> SELECT * FROM t4;
+----+------+
| c1 | c2 |
+----+------+
| 10 | 100 |
| 20 | 20 |
| 30 | 30 |
| 40 | 40 |
+----+------+
4 rows in set
```
* Modify the `c2` column value to `100` for the first two rows of data in the table `t4` after sorting by the `c2` column.
```sql
obclient> UPDATE t4 set t4.c2 = 100 ORDER BY c2 LIMIT 2;
Query OK, 2 rows affected
Rows matched: 2 Changed: 2 Warnings: 0
obclient> SELECT * FROM t4;
+----+------+
| c1 | c2 |
+----+------+
| 10 | 100 |
| 20 | 100 |
| 30 | 100 |
| 40 | 40 |
+----+------+
4 rows in set
```
* Modify the `c2` column value to `100` for the rows in the `p1` partition of the table `t5` where `t5.c1 > 20`.
```sql
obclient> SELECT * FROM t5 PARTITION (p1);
+----+------+
| c1 | c2 |
+----+------+
| 10 | 10 |
| 50 | 50 |
+----+------+
2 rows in set
obclient> UPDATE t5 PARTITION(p1) SET t5.c2 = 100 WHERE t5.c1 > 20;
Query OK, 1 row affected
Rows matched: 1 Changed: 1 Warnings: 0
obclient> SELECT * FROM t5 PARTITION(p1);
+----+------+
| c1 | c2 |
+----+------+
| 10 | 10 |
| 50 | 100 |
+----+------+
2 rows in set
```
* For rows in tables `t4` and `t5` that satisfy `t4.c2 = t5.c2`, modify the `c2` column value in the table `t4` to `100` and the `c2` column value in the table `t5` to `200`.
```sql
obclient> UPDATE t4,t5 SET t4.c2 = 100, t5.c2 = 200 WHERE t4.c2 = t5.c2;
Query OK, 1 row affected
Rows matched: 4 Changed: 1 Warnings: 0
obclient> SELECT * FROM t4;
+----+------+
| c1 | c2 |
+----+------+
| 10 | 100 |
| 20 | 100 |
| 30 | 100 |
| 40 | 40 |
+----+------+
4 rows in set
obclient> SELECT * FROM t5;
+----+------+
| c1 | c2 |
+----+------+
| 20 | 20 |
| 10 | 10 |
| 50 | 200 |
| 30 | 30 |
+----+------+
4 rows in set
```
For more information about the `UPDATE` statement, see [UPDATE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974152).
## Query data
Use the `SELECT` statement to query the contents of a table.
Examples:
* Create a table `t6` using `CREATE TABLE`. Read the `name` data from the table `t6`.
```sql
obclient> CREATE TABLE t6 (id INT, name VARCHAR(50), num INT);
Query OK, 0 rows affected
obclient> INSERT INTO t6 VALUES(1,'a',100),(2,'b',200),(3,'a',50);
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t6;
+------+------+------+
| id | name | num |
+------+------+------+
| 1 | a | 100 |
| 2 | b | 200 |
| 3 | a | 50 |
+------+------+------+
3 rows in set
obclient> SELECT name FROM t6;
+------+
| name |
+------+
| a |
| b |
| a |
+------+
3 rows in set
```
* Remove duplicates from the `name` column in the query results.
```sql
obclient> SELECT DISTINCT name FROM t6;
+------+
| name |
+------+
| a |
| b |
+------+
2 rows in set
```
* Output the corresponding `id`, `name`, and `num` from the table `t6` based on the filter condition `name = 'a'`.
```sql
obclient> SELECT id, name, num FROM t6 WHERE name = 'a';
+------+------+------+
| id | name | num |
+------+------+------+
| 1 | a | 100 |
| 3 | a | 50 |
+------+------+------+
2 rows in set
```
For more information about the `SELECT` statement, see [SELECT](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974942).
## Commit a transaction
Use the `COMMIT` statement to commit a transaction.
Before committing a transaction (COMMIT):
* Your modifications are visible only to the current session and not visible to other database sessions.
* Your modifications are not persisted. You can undo the modifications using the ROLLBACK statement.
After committing a transaction (COMMIT):
* Your modifications are visible to all database sessions.
* Your modifications are successfully persisted and cannot be rolled back using the ROLLBACK statement.
Example: Create a table `t_insert`, insert rows in a transaction, and commit the transaction with the `COMMIT` statement. Then reconnect to confirm that the committed data is persisted.
```sql
obclient> CREATE TABLE t_insert(
id number NOT NULL PRIMARY KEY,
name varchar(10) NOT NULL,
value number,
gmt_create DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);
Query OK, 0 rows affected
obclient> BEGIN;
Query OK, 0 rows affected
obclient> INSERT INTO t_insert(id, name, value, gmt_create) VALUES(1,'CN',10001, current_timestamp),(2,'US',10002, current_timestamp),(3,'EN',10003, current_timestamp);
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t_insert;
+----+------+-------+---------------------+
| id | name | value | gmt_create |
+----+------+-------+---------------------+
| 1 | CN | 10001 | 2025-11-07 16:01:53 |
| 2 | US | 10002 | 2025-11-07 16:01:53 |
| 3 | EN | 10003 | 2025-11-07 16:01:53 |
+----+------+-------+---------------------+
3 rows in set
obclient> INSERT INTO t_insert(id,name) VALUES(4,'JP');
Query OK, 1 row affected
obclient> COMMIT;
Query OK, 0 rows affected
obclient> exit;
Bye
[admin@test001 ~]$ obclient -h127.0.0.1 -uroot -P2881 -Ddb1
obclient> SELECT * FROM t_insert;
+------+------+-------+---------------------+
| id | name | value | gmt_create |
+------+------+-------+---------------------+
| 1 | CN | 10001 | 2025-11-07 16:01:53 |
| 2 | US | 10002 | 2025-11-07 16:01:53 |
| 3 | EN | 10003 | 2025-11-07 16:01:53 |
| 4 | JP | NULL | 2025-11-07 16:02:02 |
+------+------+-------+---------------------+
4 rows in set
```
For more information about transaction control statements, see [Transaction management overview](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001971667).
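The visibility rules described above (uncommitted changes are visible only to the session that made them) can be reproduced locally without a seekdb server. The following is a sketch that uses Python's built-in `sqlite3` module as a stand-in database, so it illustrates the general transaction behavior rather than seekdb itself. The file name `demo.db` and the helper `count_rows` are hypothetical.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    """Count the rows that this connection can currently see."""
    return conn.execute("SELECT COUNT(*) FROM t_insert").fetchall()[0][0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    # isolation_level=None disables implicit transactions, so BEGIN and COMMIT
    # are issued explicitly, as in the obclient session above.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE t_insert (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

    writer.execute("BEGIN")
    writer.execute("INSERT INTO t_insert VALUES (1, 'CN')")
    seen_by_writer = count_rows(writer)        # the session sees its own change
    seen_before_commit = count_rows(reader)    # the other session does not

    writer.execute("COMMIT")
    seen_after_commit = count_rows(reader)     # visible to all sessions now

    writer.close()
    reader.close()

print(seen_by_writer, seen_before_commit, seen_after_commit)
```

The two connections play the role of two database sessions. Against seekdb, the same check would use two separate client connections.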
## Roll back a transaction
Use the `ROLLBACK` statement to roll back a transaction.
Rolling back a transaction means undoing all modifications made in the transaction. You can roll back the entire uncommitted transaction or roll back to any savepoint in the transaction. To roll back to a savepoint, you must use the `ROLLBACK` statement together with `TO SAVEPOINT`.
* If you roll back the entire transaction:
* The transaction ends.
* All modifications are discarded.
* All savepoints are cleared.
* All locks held by the transaction are released.
* If you roll back to a savepoint:
* The transaction does not end.
* Modifications before the savepoint are retained, and modifications after the savepoint are discarded.
* Savepoints created after the specified savepoint are cleared (the specified savepoint itself is kept).
* All locks acquired by the transaction after the savepoint are released.
Example: Roll back all modifications in a transaction.
```sql
obclient> SELECT * FROM t_insert;
+------+------+-------+---------------------+
| id | name | value | gmt_create |
+------+------+-------+---------------------+
| 1 | CN | 10001 | 2025-11-07 16:01:53 |
| 2 | US | 10002 | 2025-11-07 16:01:53 |
| 3 | EN | 10003 | 2025-11-07 16:01:53 |
| 4 | JP | NULL | 2025-11-07 16:02:02 |
+------+------+-------+---------------------+
4 rows in set
obclient> BEGIN;
Query OK, 0 rows affected
obclient> INSERT INTO t_insert(id, name, value) VALUES(5,'JP',10004),(6,'FR',10005),(7,'RU',10006);
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM t_insert;
+------+------+-------+---------------------+
| id | name | value | gmt_create |
+------+------+-------+---------------------+
| 1 | CN | 10001 | 2025-11-07 16:01:53 |
| 2 | US | 10002 | 2025-11-07 16:01:53 |
| 3 | EN | 10003 | 2025-11-07 16:01:53 |
| 4 | JP | NULL | 2025-11-07 16:02:02 |
| 5 | JP | 10004 | 2025-11-07 16:04:14 |
| 6 | FR | 10005 | 2025-11-07 16:04:14 |
| 7 | RU | 10006 | 2025-11-07 16:04:14 |
+------+------+-------+---------------------+
7 rows in set
obclient> ROLLBACK;
Query OK, 0 rows affected
obclient> SELECT * FROM t_insert;
+------+------+-------+---------------------+
| id | name | value | gmt_create |
+------+------+-------+---------------------+
| 1 | CN | 10001 | 2025-11-07 16:01:53 |
| 2 | US | 10002 | 2025-11-07 16:01:53 |
| 3 | EN | 10003 | 2025-11-07 16:01:53 |
| 4 | JP | NULL | 2025-11-07 16:02:02 |
+------+------+-------+---------------------+
4 rows in set
```
For more information about transaction control statements, see [Transaction management overview](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001971667).
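The example above rolls back an entire transaction. Rolling back to a savepoint keeps the transaction open and discards only the later changes. The following sketch shows that behavior with Python's built-in `sqlite3` module as a stand-in database, so it illustrates standard `SAVEPOINT` semantics rather than seekdb-specific output. The savepoint name `sp1` is arbitrary.

```python
import sqlite3

# isolation_level=None disables implicit transactions; BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t_insert (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t_insert VALUES (1, 'CN')")   # before the savepoint: retained
conn.execute("SAVEPOINT sp1")
conn.execute("INSERT INTO t_insert VALUES (2, 'US')")   # after the savepoint: discarded
conn.execute("ROLLBACK TO SAVEPOINT sp1")
conn.execute("INSERT INTO t_insert VALUES (3, 'EN')")   # the transaction is still open
conn.execute("COMMIT")

rows = conn.execute("SELECT id, name FROM t_insert ORDER BY id").fetchall()
print(rows)
conn.close()
```

Only the row inserted between `SAVEPOINT sp1` and `ROLLBACK TO SAVEPOINT sp1` is lost. The rows inserted before the savepoint and after the partial rollback are committed together.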
## Create a user
Use the `CREATE USER` statement to create a user.
Example:
Create a user named `test`.
```sql
obclient> CREATE USER 'test' IDENTIFIED BY '******';
Query OK, 0 rows affected
```
For more information about the `CREATE USER` statement, see [CREATE USER](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974176).
## Grant user privileges
Use the `GRANT` statement to grant privileges to a user.
Example:
Grant the user `test` the privilege to query all tables in the database `db1`.
```sql
obclient> GRANT SELECT ON db1.* TO test;
Query OK, 0 rows affected
```
Check the privileges of the user `test`.
```sql
obclient> SHOW GRANTS for test;
+-----------------------------------+
| Grants for test@% |
+-----------------------------------+
| GRANT USAGE ON *.* TO 'test' |
| GRANT SELECT ON `db1`.* TO 'test' |
+-----------------------------------+
2 rows in set
```
For more information about the `GRANT` statement, see [GRANT](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974144).
## Delete a user
Use the `DROP USER` statement to delete a user.
Example:
Delete the user `test`.
```sql
obclient> DROP USER test;
Query OK, 0 rows affected
```
For more information about the `DROP USER` statement, see [DROP USER](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974172).

@@ -0,0 +1,230 @@
---
slug: /experience-vector-search
---
# Experience vector search
## Vector search overview
Applications such as online literature databases, e-commerce product catalogs, and multimedia content libraries need retrieval systems that can quickly locate relevant content in very large datasets. As data volumes grow, traditional keyword-based retrieval struggles to deliver the accuracy and speed that users expect. Vector search addresses this by encoding text, images, audio, and other types of data as mathematical vectors and performing retrieval in vector space. Because the vectors capture the deep semantic information of the data, the system can return more accurate results more efficiently.
seekdb provides the capability to store, index, and search embedding vector data, and supports storing vector data together with other data.
seekdb supports float-type dense vectors with up to 16,000 dimensions as well as sparse vectors, and provides several distance calculations, including Manhattan distance, Euclidean distance, inner product, and cosine distance. You can create vector indexes based on HNSW or IVF, and incremental updates and deletions do not affect recall.
seekdb vector search supports hybrid search with scalar filtering. It also provides flexible access interfaces: SQL access through MySQL protocol clients in various languages, and access through the Python SDK. seekdb is also adapted to the AI application development frameworks LlamaIndex and DB-GPT and to the AI application development platform Dify.
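For reference, the four distance calculations named above can be written out in a few lines of Python. These are the textbook definitions, given as a sketch. The function names are illustrative, and the exact conventions of the corresponding seekdb SQL functions (for example, sign or normalization) are not confirmed here.

```python
import math


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Square root of the summed squared differences (L2)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Sum of coordinate-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus the cosine of the angle between a and b (undefined for zero vectors)."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / norm
```

Manhattan, Euclidean, and cosine distance rank smaller values as closer, whereas a larger inner product indicates greater similarity.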
This topic demonstrates how to quickly perform vector search using SQL.
## Prerequisites
* seekdb is installed.
* You are connected to seekdb.
## Quick start
1. Create vector columns and indexes.
When creating a table, you can use the `VECTOR(dim)` data type to declare a column as a vector column and specify its dimension. Vector indexes must be created on vector columns, and at least two parameters, `type` and `distance`, must be provided.
The following example creates a vector column `embedding` with a dimension of `3` and creates an HNSW index on the `embedding` column, specifying L2 as the distance algorithm.
```sql
CREATE TABLE t1(
id INT PRIMARY KEY,
doc VARCHAR(200),
embedding VECTOR(3),
VECTOR INDEX idx1(embedding) WITH (distance=L2, type=hnsw)
);
```
2. Insert vector data.
To simulate a vector search scenario, you need to construct some vector data first. Each row of data includes a description of the data and the corresponding vector. In the example, it is assumed that `'apple'` corresponds to the vector `'[1.2,0.7,1.1]'`, and `'carrot'` corresponds to the vector `'[5.3,4.8,5.4]'`, and so on.
```sql
INSERT INTO t1
VALUES (1, 'apple', '[1.2,0.7,1.1]'),
(2, 'banana', '[0.6,1.2,0.8]'),
(3, 'orange','[1.1,1.1,0.9]'),
(4, 'carrot', '[5.3,4.8,5.4]'),
(5, 'spinach', '[4.9,5.3,4.8]'),
(6, 'tomato','[5.2,4.9,5.1]');
```
For demonstration purposes, this example uses manually generated vectors with only 3 dimensions. In real applications, you generate vectors from actual text with an embedding model, and the dimensions can reach hundreds or thousands.
You can check whether the data is inserted successfully by querying the table.
```sql
SELECT * FROM t1;
```
The expected result is as follows:
```shell
+----+---------+---------------+
| id | doc | embedding |
+----+---------+---------------+
| 1 | apple | [1.2,0.7,1.1] |
| 2 | banana | [0.6,1.2,0.8] |
| 3 | orange | [1.1,1.1,0.9] |
| 4 | carrot | [5.3,4.8,5.4] |
| 5 | spinach | [4.9,5.3,4.8] |
| 6 | tomato | [5.2,4.9,5.1] |
+----+---------+---------------+
6 rows in set
```
3. Perform vector search.
To perform vector search, you need to provide a vector as the search condition. Suppose you want to find the rows most similar to the concept `'fruits'`, whose vector is `[0.9, 1.0, 0.9]`. The corresponding SQL is:
```sql
SELECT id, doc FROM t1
ORDER BY l2_distance(embedding, '[0.9, 1.0, 0.9]')
APPROXIMATE LIMIT 3;
```
The expected result is as follows:
```shell
+----+--------+
| id | doc |
+----+--------+
| 3 | orange |
| 2 | banana |
| 1 | apple |
+----+--------+
3 rows in set
```
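In an application, the query vector usually arrives as a list of floats from an embedding model and has to be serialized into the bracketed string literal used in the statements above. The following is a minimal sketch. `to_vector_literal` is a hypothetical helper, and the literal format is assumed to be exactly the one shown in this topic.

```python
import math


def to_vector_literal(values, dim=None):
    """Serialize numbers into a literal such as '[1.2,0.7,1.1]'."""
    floats = [float(v) for v in values]
    if dim is not None and len(floats) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(floats)}")
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("vector components must be finite numbers")
    return "[" + ",".join(repr(v) for v in floats) + "]"


# Build the search statement from step 3 for a 3-dimensional query vector.
query_vector = to_vector_literal([0.9, 1.0, 0.9], dim=3)
sql = (
    "SELECT id, doc FROM t1 "
    f"ORDER BY l2_distance(embedding, '{query_vector}') "
    "APPROXIMATE LIMIT 3"
)
print(sql)
```

Every component is coerced to `float` before formatting, so the literal can only contain numeric text. Where the client driver supports it, passing the literal as a bound parameter is generally preferable to string formatting.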
## Comparison between exact search and approximate search
### Perform exact search
Exact search uses a full scan strategy: it calculates the distance between the query vector and every vector in the dataset. This guarantees fully accurate results, but because every distance must be computed, search performance degrades significantly as the data scale grows.
When performing exact search, the system calculates and compares the distance between the query vector vₑ and all vectors in the vector space. After completing the full distance calculation, the system selects the k vectors with the closest distance as the search results.
#### Example: Euclidean similarity search
Euclidean similarity search retrieves the top-k vectors closest to the query vector, using Euclidean distance as the metric. The following example uses exact search to retrieve the 5 vectors closest to the query vector from a table:
```sql
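-- Create a test table (if a table named t1 already exists from the previous steps, drop it first or use a different name)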
CREATE TABLE t1 (
id INT PRIMARY KEY,
c1 VECTOR(3)
);
-- Insert data
INSERT INTO t1 VALUES
(1, '[0.1, 0.2, 0.3]'),
(2, '[0.2, 0.3, 0.4]'),
(3, '[0.3, 0.4, 0.5]'),
(4, '[0.4, 0.5, 0.6]'),
(5, '[0.5, 0.6, 0.7]'),
(6, '[0.6, 0.7, 0.8]'),
(7, '[0.7, 0.8, 0.9]'),
(8, '[0.8, 0.9, 1.0]'),
(9, '[0.9, 1.0, 0.1]'),
(10, '[1.0, 0.1, 0.2]');
-- Perform exact search
SELECT c1
FROM t1
ORDER BY l2_distance(c1, '[0.1, 0.2, 0.3]') LIMIT 5;
```
The result is as follows:
```shell
+---------------+
| c1            |
+---------------+
| [0.1,0.2,0.3] |
| [0.2,0.3,0.4] |
| [0.3,0.4,0.5] |
| [0.4,0.5,0.6] |
| [0.5,0.6,0.7] |
+---------------+
5 rows in set
```
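The full scan strategy described in this section can be sketched in a few lines of Python. This is an illustrative model of exact search, not seekdb's implementation: it computes the distance from the query to every row and keeps the `k` smallest, so the work grows linearly with the number of rows. The function name `exact_top_k` is an assumption made for this example.

```python
import heapq
import math


def exact_top_k(query, rows, k):
    """Full scan: measure the distance to every row, then keep the k smallest."""
    return heapq.nsmallest(k, rows, key=lambda row: math.dist(query, row[1]))


# Same ten rows as the INSERT statement above: (id, c1)
ROWS = [
    (1, [0.1, 0.2, 0.3]),
    (2, [0.2, 0.3, 0.4]),
    (3, [0.3, 0.4, 0.5]),
    (4, [0.4, 0.5, 0.6]),
    (5, [0.5, 0.6, 0.7]),
    (6, [0.6, 0.7, 0.8]),
    (7, [0.7, 0.8, 0.9]),
    (8, [0.8, 0.9, 1.0]),
    (9, [0.9, 1.0, 0.1]),
    (10, [1.0, 0.1, 0.2]),
]

top5 = exact_top_k([0.1, 0.2, 0.3], ROWS, 5)
print([row_id for row_id, _ in top5])
```

The ids come back as 1 through 5, in the same order as the vectors in the SQL result.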
### Perform approximate search using vector indexes
Vector index search uses an approximate nearest neighbor (ANN) strategy, accelerating the search process through pre-built index structures. Although it cannot guarantee 100% accuracy of results, it can significantly improve search performance, achieving a good balance between accuracy and performance in practical applications.
#### Example: HNSW index approximate search
```sql
-- Create a table with an HNSW vector index
CREATE TABLE t2 (
id INT PRIMARY KEY,
vec VECTOR(3),
VECTOR INDEX idx(vec) WITH (distance=l2, type=hnsw, lib=vsag)
);
-- Insert test data
INSERT INTO t2 VALUES
(1, '[0.1, 0.2, 0.3]'),
(2, '[0.2, 0.3, 0.4]'),
(3, '[0.3, 0.4, 0.5]'),
(4, '[0.4, 0.5, 0.6]'),
(5, '[0.5, 0.6, 0.7]'),
(6, '[0.6, 0.7, 0.8]'),
(7, '[0.7, 0.8, 0.9]'),
(8, '[0.8, 0.9, 1.0]'),
(9, '[0.9, 1.0, 0.1]'),
(10, '[1.0, 0.1, 0.2]');
-- Perform approximate search, returning the 5 most similar records
SELECT id, vec
FROM t2
ORDER BY l2_distance(vec, '[0.1, 0.2, 0.3]')
APPROXIMATE
LIMIT 5;
```
The result is as follows. Due to the small data volume, it is consistent with the exact search result above:
```shell
+------+---------------+
| id   | vec           |
+------+---------------+
|    1 | [0.1,0.2,0.3] |
|    2 | [0.2,0.3,0.4] |
|    3 | [0.3,0.4,0.5] |
|    4 | [0.4,0.5,0.6] |
|    5 | [0.5,0.6,0.7] |
+------+---------------+
5 rows in set
```
### Summary
A comparison of the two search methods is as follows:
| Comparison item | Exact search | Approximate search |
|----------------|--------------|-------------------|
| Execution method | Full table scan (`TABLE FULL SCAN`) followed by sorting | Direct search through vector index (`VECTOR INDEX SCAN`) |
| Performance characteristics | Requires scanning all table data and sorting, performance significantly decreases as data volume grows | Directly locates target data through index, stable performance |
| Result accuracy | 100% accurate, guarantees returning true nearest neighbors | Approximately accurate, may have minor errors |
| Applicable scenarios | Small data volumes, scenarios with high accuracy requirements | Large-scale datasets, scenarios with high performance requirements |
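
One common way to quantify the "Result accuracy" row is recall@k: the fraction of the true nearest neighbors (from exact search) that the approximate search also returned. The Python sketch below is an illustration only. The function name `recall_at_k` and the `hypothetical` result list are assumptions for this example, not seekdb output.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true nearest neighbors that the approximate search returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)


exact = [1, 2, 3, 4, 5]         # ids returned by the exact search example
approx = [1, 2, 3, 4, 5]        # ids returned by the HNSW example
hypothetical = [1, 2, 3, 4, 7]  # made-up result with one wrong neighbor

print(recall_at_k(exact, approx))        # 1.0
print(recall_at_k(exact, hypothetical))  # 0.8
```

In the two examples above the recall is 1.0 because the dataset is tiny. On large datasets, comparing an approximate result against an exact result for a sample of queries is a practical way to check whether the index settings are accurate enough for your use case.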
## What's next
For more guides on experiencing seekdb's AI Native features and building AI applications based on seekdb, see:
* [Experience full-text indexing](40.experience-full-text-indexing.md)
* [Experience hybrid search](50.experience-hybrid-search.md)
* [Experience AI function service](60.experience-ai-function.md)
* [Experience semantic indexing](70.experience-hybrid-vector-index.md)
* [Experience the Vibe Coding paradigm with Cursor Agent + OceanBase MCP](80.experience-vibe-coding-paradigm-with-cursor-agent-oceanbase-mcp.md)
* [Build a knowledge base desktop application based on seekdb](../../500.tutorials/100.create-ai-app-demo/100.build-kb-in-seekdb.md)
* [Build a cultural tourism assistant with multi-model integration based on seekdb](../../500.tutorials/100.create-ai-app-demo/300.build-multi-model-application-based-on-oceanbase.md)
* [Build an image search application based on seekdb](../../500.tutorials/100.create-ai-app-demo/400.build-image-search-app-in-seekdb.md)
In addition to using SQL for operations, you can also use the Python SDK (pyseekdb) provided by seekdb. For usage instructions, see [Experience embedded seekdb](../50.embedded-mode/25.using-seekdb-in-python-sdk.md) and [pyseekdb overview](../../200.develop/900.sdk/10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).


@@ -0,0 +1,354 @@
---
slug: /experience-full-text-indexing
---
# Experience full-text indexing
## Background information
seekdb's full-text indexing feature addresses common production needs, especially in scenarios such as system log analysis and user behavior and profile analysis. It filters data efficiently and provides high-quality relevance evaluation. In addition, combined with the multi-path recall architecture of sparse and dense vectors, it enables more efficient recall in RAG systems for specific knowledge domains.
This tutorial uses document retrieval scenarios as an example. In such scenarios, three core challenges place higher demands on retrieval systems:
- **Real-time requirements**: Quickly locate target information from TB-level data.
- **Semantic complexity**: Solve natural language processing challenges such as word segmentation and synonym processing.
- **Hybrid query requirements**: Improve the joint optimization capability of text retrieval and structured queries.
This tutorial demonstrates how to quickly find target documents from massive information by using the full-text indexing feature. We will use keywords in queries to demonstrate the improvements of seekdb's full-text indexing in terms of functionality, performance, and ease of use.
## How it works
In seekdb's storage engine, a tokenizer splits user documents and queries into keywords (tokens). These keywords and the statistical features of the documents are stored in internal auxiliary tables (tablets) and are used for relevance evaluation (ranking) during the information retrieval phase. seekdb uses the BM25 algorithm to calculate the relevance score between the keywords in a query and the stored documents, and outputs the documents that meet the conditions together with their scores.
In the full-text query process, seekdb's high-performance query engine optimizes term-at-a-time (TAAT) and document-at-a-time (DAAT) processing and supports union merge across multiple indexes. These improvements enable full-text indexing to handle more complex queries and meet users' data retrieval needs.
![](https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/doc/img/observer-enterprise/V4.3.5/690.tutorials/fulltext-index-structure.png)
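To make the BM25 relevance score more concrete, here is a minimal Python sketch of the textbook Okapi BM25 formula. It is an illustration under stated assumptions, not seekdb's implementation: the parameter values `k1=1.2` and `b=0.75`, the IDF variant, the whitespace tokenization, and the three sample documents are all choices made for this example, so the numbers will not match the scores that seekdb returns.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs, k1=1.2, b=0.75):
    """Score every tokenized document against the query with textbook BM25."""
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: the number of documents in which each token occurs.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for token in query_tokens:
            if token not in term_freq:
                continue
            df = doc_freq[token]
            # Rare tokens get a higher inverse document frequency (IDF).
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            tf = term_freq[token]
            # Length normalization: long documents need more hits for the same score.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample documents, tokenized by whitespace.
DOCS = [
    "mayfair is an affluent area in the west end of london".split(),
    "london is the capital of england".split(),
    "paris is the capital of france".split(),
]

scores = bm25_scores(["london", "mayfair"], DOCS)
for doc, score in zip(DOCS, scores):
    print(round(score, 3), " ".join(doc))
```

The document that contains both query tokens scores highest, the document that contains only "london" scores lower, and the document with neither token scores 0. The rarer token "mayfair" contributes more to the score than the more common "london".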
## Prerequisites
To successfully operate and experience seekdb's full-text indexing feature, ensure that the following prerequisites are met:
1. **Environment requirements**: seekdb is deployed.
2. **Database creation**: Ensure that a database is created. For detailed steps, see [Create a database](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001971662).
## Procedure
The following steps guide you through experiencing seekdb's full-text indexing and common views and query techniques.
### Step 1: Import a dataset
seekdb has a built-in Beng tokenizer that provides efficient word segmentation for English documents, as well as a Boolean mode that is more efficient than traditional natural language processing. seekdb's other built-in tokenizers include IK (for Chinese), space (for space-separated languages), and ngram (which splits text by character length).
This step uses the [wikIR1k dataset](https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/doc/img/SeekDB/get-started/documents.csv). Create a table named `wikir1k` with a `document` column and a full-text index on that column that uses the Beng tokenizer, then import the data into the table.
:::tip
All query results and performance metrics shown in the examples are for reference only. Your actual results may vary depending on your data volume, machine specifications, and query patterns.
:::
```sql
-- Create a table and use the Beng tokenizer for full-text indexing
CREATE TABLE wikir1k (
id INT AUTO_INCREMENT PRIMARY KEY,
document TEXT,
FULLTEXT INDEX ft_idx1_document(document)
WITH PARSER beng
);
```
Import the dataset into the table from a local file on the client:
```sql
-- Import data
LOAD DATA /*+ PARALLEL(8) */ LOCAL INFILE '/home/admin/documents10k.csv' INTO TABLE wikir1k
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
```
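After importing the data, verify the row count and the average document length. The sample output below comes from the complete wikIR1k dataset (369,721 documents). If you import a smaller file, such as a subset of about 10,000 documents, the count is correspondingly smaller.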
```sql
-- Verify the number of imported records
SELECT AVG(LENGTH(document)), COUNT(*) FROM wikir1k;
```
The following result is returned:
```sql
+-----------------------+----------+
| AVG(LENGTH(document)) | COUNT(*) |
+-----------------------+----------+
| 1144.6949 | 369721 |
+-----------------------+----------+
1 row in set (1.07 sec)
```
```sql
-- Query the space usage view to check the storage space occupied by the table
SELECT * FROM oceanbase.DBA_OB_TABLE_SPACE_USAGE WHERE DATABASE_NAME = 'test' AND TABLE_NAME LIKE '%wikir1k%';
```
The following result is returned:
```sql
+----------+---------------+------------+-------------+---------------+
| TABLE_ID | DATABASE_NAME | TABLE_NAME | OCCUPY_SIZE | REQUIRED_SIZE |
+----------+---------------+------------+-------------+---------------+
| 500252 | test | wikir1k | 185571540 | 190853120 |
+----------+---------------+------------+-------------+---------------+
1 row in set (0.05 sec)
```
### Step 2: Query using full-text indexing
With the stored documents and the index in place, you can run queries that combine multiple conditions or filter very selectively. For example, to find documents that contain both "london" and "mayfair", use Boolean mode.
Compared with unindexed `LIKE` string matching, Boolean mode has simpler syntax and is faster.
```sql
-- Use Boolean mode to find documents that contain both "london" and "mayfair"
SELECT COUNT(*) FROM wikir1k
WHERE MATCH (document) AGAINST ('+london +mayfair' IN BOOLEAN MODE);
```
The following result is returned:
```sql
+----------+
| COUNT(*) |
+----------+
| 58 |
+----------+
1 row in set (0.01 sec)
```
In contrast, using the `LIKE` query method:
```sql
-- Use LIKE syntax to query
SELECT COUNT(*) FROM wikir1k
WHERE document LIKE '%london%' AND document LIKE '%mayfair%';
```
The same count is returned, but the query takes far longer (3.48 sec compared with 0.01 sec in Boolean mode):
```sql
+----------+
| COUNT(*) |
+----------+
| 58 |
+----------+
1 row in set (3.48 sec)
```
You can further rank the returned documents by the relevance score in the output to determine which documents are most relevant to the query.
```sql
-- Return the id and score of the documents to help determine relevance
SELECT id, MATCH (document) AGAINST ('london mayfair') AS score
FROM wikir1k
WHERE MATCH (document) AGAINST ('+london +mayfair' IN BOOLEAN MODE)
LIMIT 10;
```
The following result is returned:
```sql
+---------+--------------------+
| id | score |
+---------+--------------------+
| 425035 | 17.661768297948015 |
| 1122217 | 16.349131415195043 |
| 34959 | 14.813025094926918 |
| 1576669 | 14.620715555483576 |
| 2100682 | 13.40354137543347 |
| 1179964 | 13.40354137543347 |
| 1642217 | 13.391619146335605 |
| 123391 | 13.36985391637557 |
| 852529 | 13.336357369363272 |
| 380931 | 13.249691534256172 |
+---------+--------------------+
10 rows in set (0.03 sec)
```
Boolean mode also lets you exclude keywords. For example, to find documents about "london" that do not mention "westminster", use the `-` operator in Boolean mode.
```sql
-- Query documents about london but excluding westminster
SELECT COUNT(*) FROM wikir1k
WHERE MATCH (document) AGAINST ('+london -westminster' IN BOOLEAN MODE);
```
The following result is returned:
```sql
+----------+
| COUNT(*) |
+----------+
| 18771 |
+----------+
1 row in set (0.01 sec)
```
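The `+` and `-` operators used in this step map naturally onto set operations over an inverted index, which is why such queries can avoid scanning every document the way `LIKE` does. The Python sketch below is a simplified model under stated assumptions: it supports only flat `+term` and `-term` expressions, uses a toy tokenizer, and runs on four made-up documents. It is not how seekdb parses or executes Boolean mode internally.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index


def boolean_search(index, expression):
    """Evaluate a flat expression made only of +required and -excluded terms."""
    required, excluded = [], []
    for term in expression.lower().split():
        if term.startswith("+"):
            required.append(term[1:])
        elif term.startswith("-"):
            excluded.append(term[1:])
        else:
            raise ValueError(f"unsupported term: {term!r}")
    if not required:
        return set()
    # Intersect the posting sets of all required terms ...
    result = set.intersection(*(index.get(t, set()) for t in required))
    # ... then remove every document that contains an excluded term.
    for term in excluded:
        result -= index.get(term, set())
    return result


# Hypothetical sample documents.
DOCS = {
    1: "Mayfair is a district in the West End of London",
    2: "Westminster is a city in London",
    3: "London hosts many museums",
    4: "Paris is the capital of France",
}

index = build_inverted_index(DOCS)
print(boolean_search(index, "+london +mayfair"))      # {1}
print(boolean_search(index, "+london -westminster"))  # {1, 3}
```

Each query touches only the posting sets of the terms it mentions, so the cost depends on how many documents contain those terms rather than on the total size of the table.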
### Step 3: Tuning
#### Tune using the `TOKENIZE` function
When full-text query results do not meet expectations, the cause is usually unsatisfactory tokenization. seekdb provides the `TOKENIZE` function for quickly testing tokenization. The function supports all tokenizers and their corresponding properties.
The following example shows how the Beng tokenizer splits English text into words, which helps verify that tokenization works as expected.
1. Use the `TOKENIZE` function to check the tokenizer output:
```sql
-- Verify English document tokenization effects using Beng tokenizer
SELECT TOKENIZE('The computer system provides efficient processing and information management capabilities', 'beng', '[]');
```
The following result is returned:
```sql
+---------------------------------------------------------------------------------------------------------------------+
| TOKENIZE('The computer system provides efficient processing and information management capabilities', 'beng', '[]') |
+---------------------------------------------------------------------------------------------------------------------+
| ["efficient", "processing", "capabilities", "system", "computer", "provides", "and", "information", "management"] |
+---------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
```
The above result shows that the text has been correctly split into individual words.
2. Next, execute the following statement to check whether the query statement hits the target document:
```sql
-- Use Boolean mode to retrieve documents about computer systems
SELECT COUNT(*)
FROM wikir1k
WHERE MATCH (document) AGAINST ('+computer +system' IN BOOLEAN MODE);
```
The following result is returned:
```sql
+----------+
| COUNT(*) |
+----------+
| 1010 |
+----------+
1 row in set (0.01 sec)
```
The above result shows that target records were matched.
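The tuning workflow above (inspect the tokens first, then check whether the query terms can match) can be mimicked with a toy tokenizer. The Python sketch below is only an approximation for building intuition: the splitting rule and the tiny stop-word list are assumptions for this example and do not describe the actual Beng tokenizer.

```python
import re

# Tiny illustrative stop-word list; the real tokenizer's list may differ.
STOP_WORDS = {"the", "a", "an", "of"}


def simple_tokenize(text, stop_words=STOP_WORDS):
    """Lowercase the text, split on non-alphanumeric characters, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in stop_words]


tokens = simple_tokenize(
    "The computer system provides efficient processing "
    "and information management capabilities"
)
print(tokens)

# A query term can only match if it survives tokenization.
query_terms = {"computer", "system"}
print(query_terms <= set(tokens))  # True
```

If a term you expect to match is missing from the token list, for example because it was split differently or treated as a stop word, that usually explains why a Boolean mode query returns fewer rows than expected.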
## Performance comparison with MySQL
To compare the full-text indexing performance differences between seekdb and MySQL, we use MySQL's full-text indexing feature as a reference. The complete dataset `wikir1k` (containing 369,721 rows, with an average of 200 words per row) is used for performance comparison.
:::tip
The test results are provided for reference only and may vary depending on your specific environment, data volume, and query patterns.
:::
The following are the comparison results for various scenarios in natural language mode and Boolean mode. In scenarios that involve a large amount of tokenization or return large result sets, seekdb is significantly faster than MySQL. For small result sets, computation accounts for only a small share of the query time, so the query engine's advantage is not obvious and both databases perform similarly.
**Test environment**: seekdb runs with 8 CPU cores and 16 GB of memory. MySQL is version 8.0.36 for Linux on x86_64 (MySQL Community Server - GPL).
### Natural language mode
```sql
-- q1: Query documents containing "and"
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('and');
-- q2: Query documents containing "and", limit to 10 results
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('and') LIMIT 10;
-- q3: Query documents containing "librettists"
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('librettists');
-- q4: Query documents containing "librettists", limit to 10 results
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('librettists') LIMIT 10;
-- q5: Query documents containing "alleviating librettists"
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('alleviating librettists');
-- q6: Query documents containing "black spotted white yellow"
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('black spotted white yellow');
-- q7: Query documents containing "black spotted white yellow", limit to 10 results
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('black spotted white yellow') LIMIT 10;
-- q8: Query documents containing "between up and down"
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('between up and down');
-- q9: Query documents containing "between up and down", limit to 10 results
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('between up and down') LIMIT 10;
-- q10: Query with many tokens (small result set)
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('alleviating librettists modifications retelling intangible hydrographic administratively berwickshire strathaven dumfriesshire lesmahagow transhumanist musselburgh prestwick cardiganshire montgomeryshire');
-- q11: Query with many tokens, with "and" appended (large result set)
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('alleviating librettists modifications retelling intangible hydrographic administratively berwickshire strathaven dumfriesshire lesmahagow transhumanist musselburgh prestwick cardiganshire montgomeryshire and');
-- q12: Same query as q11, limit to 10 results
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('alleviating librettists modifications retelling intangible hydrographic administratively berwickshire strathaven dumfriesshire lesmahagow transhumanist musselburgh prestwick cardiganshire montgomeryshire and') LIMIT 10;
```
| **Scenario** | **seekdb** | **MySQL** |
|-------------------------------|-------------------|-----------------|
| q1 Single token high-frequency word | 3820458us | 5718430us |
| q2 Single token high-frequency word limit | 231861us | 503772us |
| q3 Single token low-frequency word | 879us | 672us |
| q4 Single token low-frequency word limit | 720us | 700us |
| q5 Multiple tokens small result set | 1591us | 1100us |
| q6 Multiple tokens medium result set | 259700us | 602221us |
| q7 Multiple tokens medium result set limit | 25502us | 42620us |
| q8 Multiple tokens large result set | 3842391us | 6846847us |
| q9 Multiple tokens large result set limit | 301362us | 784024us |
| q10 Many tokens small result set | 22143us | 10161us |
| q11 Many tokens large result set | 3905829us | 5929343us |
| q12 Many tokens large result set limit| 345968us | 769970us |
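
The raw microsecond values are easier to compare as ratios. The following Python sketch copies the latencies from the natural language mode table and divides the MySQL latency by the seekdb latency, so a value above 1 means seekdb was faster in that test. The helper name `speedup` is made up for this example.

```python
# Latencies in microseconds, copied from the table above: (seekdb, MySQL)
LATENCY_US = {
    "q1": (3820458, 5718430),
    "q2": (231861, 503772),
    "q3": (879, 672),
    "q4": (720, 700),
    "q5": (1591, 1100),
    "q6": (259700, 602221),
    "q7": (25502, 42620),
    "q8": (3842391, 6846847),
    "q9": (301362, 784024),
    "q10": (22143, 10161),
    "q11": (3905829, 5929343),
    "q12": (345968, 769970),
}


def speedup(seekdb_us, mysql_us):
    """MySQL latency divided by seekdb latency; values above 1 favor seekdb."""
    return mysql_us / seekdb_us


ratios = {name: round(speedup(*pair), 2) for name, pair in LATENCY_US.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

In this table, seekdb is faster in 8 of the 12 queries, by roughly 1.5x to 2.6x, and these are the high-frequency and large or medium result set cases. MySQL is faster in the four small result set queries (q3, q4, q5, and q10), where the absolute latencies are at most about 22 ms.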
### Boolean mode
```sql
-- q1: +high-frequency word -medium-frequency word
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('+and -which -his' IN BOOLEAN MODE);
-- q2: +high-frequency word -low-frequency word
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('+and -carabantes -bufera' IN BOOLEAN MODE);
-- q3: +medium-frequency word (+high-frequency word -medium-frequency word)
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('+which (+and -his)' IN BOOLEAN MODE);
-- q4: +high-frequency word +low-frequency word
SELECT * FROM wikir1k WHERE MATCH (document) AGAINST ('+and +librettists' IN BOOLEAN MODE);
```
| **Scenario** | **seekdb** | **MySQL** |
|-------------------------------|-------------------|-----------------|
| q1: +high-frequency word -medium-frequency word | 1586657us | 2440798us |
| q2: +high-frequency word -low-frequency word | 3726508us | 7974832us |
| q3: +medium-frequency word (+high-frequency word -medium-frequency word)| 3080644us | 5612041us |
| q4: +high-frequency word +low-frequency word | 230284us | 357580us |
### Performance comparison summary
The data above shows that seekdb is significantly faster than MySQL for complex full-text retrieval in both natural language mode and Boolean mode. The advantage is most pronounced for queries that involve a large amount of tokenization or return large result sets, whereas for small result sets the two databases perform similarly. These results can serve as a reference when you choose a database for applications that require efficient retrieval over massive data.
In these tests, seekdb's full-text indexing maintains fast response times for complex queries, which makes it well suited to application scenarios that require high-concurrency, high-performance retrieval.
## What's next
For more guides on experiencing seekdb's AI Native features and building AI applications based on seekdb, see:
* [Experience vector search](30.experience-vector-search.md)
* [Experience hybrid search](50.experience-hybrid-search.md)
* [Experience AI function service](60.experience-ai-function.md)
* [Experience semantic indexing](70.experience-hybrid-vector-index.md)
* [Experience the Vibe Coding paradigm with Cursor Agent + OceanBase MCP](80.experience-vibe-coding-paradigm-with-cursor-agent-oceanbase-mcp.md)
* [Build a knowledge base desktop application based on seekdb](../../500.tutorials/100.create-ai-app-demo/100.build-kb-in-seekdb.md)
* [Build a cultural tourism assistant with multi-model integration based on seekdb](../../500.tutorials/100.create-ai-app-demo/300.build-multi-model-application-based-on-oceanbase.md)
* [Build an image search application based on seekdb](../../500.tutorials/100.create-ai-app-demo/400.build-image-search-app-in-seekdb.md)
In addition to using SQL for operations, you can also use the Python SDK (pyseekdb) provided by seekdb. For usage instructions, see [Experience embedded seekdb using Python SDK](../50.embedded-mode/25.using-seekdb-in-python-sdk.md) and [pyseekdb overview](../../200.develop/900.sdk/10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).


@@ -0,0 +1,360 @@
---
slug: /experience-hybrid-search
---
# Experience hybrid search in seekdb
This tutorial introduces seekdb's hybrid search feature and demonstrates how it combines keyword retrieval based on full-text indexes with semantic retrieval based on vector indexes, so that you can better understand its practical applications.
## Overview
Hybrid search combines vector-based semantic retrieval with full-text index-based keyword retrieval and ranks the merged results to return more accurate and comprehensive matches. Vector search excels at approximate semantic matching but is weak at matching exact keywords, numbers, and proper nouns; full-text retrieval compensates for this weakness. seekdb provides hybrid search through the `DBMS_HYBRID_SEARCH` system package, which supports the following scenarios:
* Pure vector search: Find relevant content based on semantic similarity, suitable for semantic search, recommendation systems, and other scenarios.
* Pure full-text search: Find content based on keyword matching, suitable for document search, product search, and other scenarios.
* Hybrid search: Combines keyword matching and semantic understanding to provide more accurate and comprehensive search results.
This feature is widely used in intelligent search, document search, product recommendation, and other scenarios.
## Prerequisites
* Contact the administrator to obtain the corresponding database connection string, then execute the following command to connect to the database:
```shell
# host: seekdb database connection IP.
# port: seekdb database connection port.
# database_name: Name of the database to access.
# user_name: Database username.
# password: Database password.
obclient -h$host -P$port -u$user_name -p$password -D$database_name
```
* A test table with a vector index and full-text indexes has been created:
:::collapse
```sql
CREATE TABLE doc_table(
c1 INT,
vector VECTOR(3),
query VARCHAR(255),
content VARCHAR(255),
VECTOR INDEX idx1(vector) WITH (distance=l2, type=hnsw, lib=vsag),
FULLTEXT INDEX idx2(query),
FULLTEXT INDEX idx3(content)
);
INSERT INTO doc_table VALUES
(1, '[1,2,3]', "hello world", "oceanbase Elasticsearch database"),
(2, '[1,2,1]', "hello world, what is your name", "oceanbase mysql database"),
(3, '[1,1,1]', "hello world, how are you", "oceanbase oracle database"),
(4, '[1,3,1]', "real world, where are you from", "postgres oracle database"),
(5, '[1,3,2]', "real world, how old are you", "redis oracle database"),
(6, '[2,1,1]', "hello world, where are you from", "starrocks oceanbase database");
```
:::
## Step 1: Pure vector search
Vector search finds semantically relevant content by calculating vector similarity, suitable for semantic search, recommendation systems, and other scenarios.
Set search parameters and use vector search to find records most similar to the query vector `[1,2,3]`:
```sql
SET @parm = '{
"knn" : {
"field": "vector",
"k": 3,
"query_vector": [1,2,3]
}
}';
SELECT JSON_PRETTY(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm));
```
The following result is returned:
:::collapse
```shell
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm)) |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [
{
"c1": 1,
"query": "hello world",
"_score": 1.0,
"vector": "[1,2,3]",
"content": "oceanbase Elasticsearch database"
},
{
"c1": 5,
"query": "real world, how old are you",
"_score": 0.41421356,
"vector": "[1,3,2]",
"content": "redis oracle database"
},
{
"c1": 2,
"query": "hello world, what is your name",
"_score": 0.33333333,
"vector": "[1,2,1]",
"content": "oceanbase mysql database"
}
] |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
```
:::
The results are sorted by vector similarity, where `_score` represents the similarity score. A higher score indicates greater similarity.
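The `_score` values in this sample output are consistent with mapping the L2 distance `d` between the query vector and each row to `1 / (1 + d)`. This is an observation drawn from the sample numbers above, not a documented formula, so treat the Python sketch below as a way to sanity-check the ranking rather than as a description of seekdb internals. The function name `similarity_from_l2` is made up for this example.

```python
import math


def similarity_from_l2(query, vector):
    """Map an L2 distance d to 1 / (1 + d), so identical vectors score 1.0."""
    return 1.0 / (1.0 + math.dist(query, vector))


QUERY = [1, 2, 3]
# The three rows returned in the sample output: c1 -> vector
ROWS = {1: [1, 2, 3], 5: [1, 3, 2], 2: [1, 2, 1]}

scores = {c1: round(similarity_from_l2(QUERY, vec), 8) for c1, vec in ROWS.items()}
print(scores)
```

The computed values (1.0, 0.41421356, and 0.33333333) match the `_score` fields for rows 1, 5, and 2 in the result above.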
## Step 2: Pure full-text search
Full-text search finds content through keyword matching, suitable for document search, product search, and other scenarios.
Set search parameters and use full-text search to find records containing keywords in the `query` and `content` fields:
```sql
SET @parm = '{
"query": {
"query_string": {
"fields": ["query", "content"],
"query": "hello oceanbase"
}
}
}';
SELECT JSON_PRETTY(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm));
```
The following result is returned:
:::collapse
```shell
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm)) |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [
{
"c1": 1,
"query": "hello world",
"_score": 0.37162162162162166,
"vector": "[1,2,3]",
"content": "oceanbase Elasticsearch database"
},
{
"c1": 2,
"query": "hello world, what is your name",
"_score": 0.3503184713375797,
"vector": "[1,2,1]",
"content": "oceanbase mysql database"
},
{
"c1": 3,
"query": "hello world, how are you",
"_score": 0.3503184713375797,
"vector": "[1,1,1]",
"content": "oceanbase oracle database"
},
{
"c1": 6,
"query": "hello world, where are you from",
"_score": 0.3503184713375797,
"vector": "[2,1,1]",
"content": "starrocks oceanbase database"
}
] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
```
:::
The results are sorted by keyword matching degree, where `_score` represents the matching score. A higher score indicates better matching.
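The search parameter in these steps is a JSON document, and hand-written JSON inside an SQL string literal is easy to get wrong (a missing comma or bracket is enough). As a sketch, the parameter can be assembled in application code and serialized with a JSON library. The helper name `build_search_param` is an assumption for this example, the keys are the ones shown in this tutorial, and how you pass the resulting string to the database depends on your client.

```python
import json


def build_search_param(query_string=None, knn=None):
    """Assemble the JSON search parameter from optional full-text and vector parts."""
    param = {}
    if query_string is not None:
        param["query"] = {"query_string": query_string}
    if knn is not None:
        param["knn"] = knn
    if not param:
        raise ValueError("at least one of query_string or knn is required")
    return json.dumps(param, indent=2)


# Full-text parameter equivalent to the one used in this step.
full_text_param = build_search_param(
    query_string={"fields": ["query", "content"], "query": "hello oceanbase"}
)
print(full_text_param)

# Vector parameter equivalent to the one used in Step 1.
vector_param = build_search_param(
    knn={"field": "vector", "k": 3, "query_vector": [1, 2, 3]}
)
print(vector_param)
```

Passing both `query_string` and `knn` to the helper produces a single document that contains the `query` and `knn` keys side by side.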
## Step 3: Hybrid search
Hybrid search combines keyword matching and semantic understanding to provide more accurate and comprehensive search results, leveraging the advantages of both full-text indexes and vector indexes.
Set search parameters to run full-text search and vector search in the same request:
```sql
SET @parm = '{
"query": {
"query_string": {
"fields": ["query", "content"],
"query": "hello oceanbase"
}
},
"knn" : {
"field": "vector",
"k": 5,
"query_vector": [1,2,3]
}
}';
SELECT JSON_PRETTY(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm));
```
The following result is returned:
:::collapse
```shell
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm)) |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [
{
"c1": 1,
"query": "hello world",
"_score": 1.3716216216216217,
"vector": "[1,2,3]",
"content": "oceanbase Elasticsearch database"
},
{
"c1": 2,
"query": "hello world, what is your name",
"_score": 0.6836518013375796,
"vector": "[1,2,1]",
"content": "oceanbase mysql database"
},
{
"c1": 3,
"query": "hello world, how are you",
"_score": 0.6593354613375797,
"vector": "[1,1,1]",
"content": "oceanbase oracle database"
},
{
"c1": 5,
"query": "real world, how old are you",
"_score": 0.41421356,
"vector": "[1,3,2]",
"content": "redis oracle database"
},
{
"c1": 6,
"query": "hello world, where are you from",
"_score": 0.3503184713375797,
"vector": "[2,1,1]",
"content": "starrocks oceanbase database"
},
{
"c1": 4,
"query": "real world, where are you from",
"_score": 0.30901699,
"vector": "[1,3,1]",
"content": "postgres oracle database"
}
] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
```
:::
The hybrid search results combine the keyword matching score (`_keyword_score`) and the semantic similarity score (`_semantic_score`). The final `_score` is the sum of the two and determines the ranking of the results.
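The scores in the sample output can be checked by hand. The following Python sketch is a reading of that output, not documented seekdb behavior: it assumes that the semantic score is `1 / (1 + d)`, where `d` is the L2 distance to `query_vector`, that only the `k` nearest rows receive a semantic score, and that rows without a keyword match have a keyword score of 0. The keyword scores are copied from the keyword-only (full-text) results for the same rows in this tutorial.

```python
import math

QUERY_VECTOR = [1, 2, 3]
K = 5  # "k" in the knn clause

# c1 -> (vector, keyword score). 0.0 means the row matched no keyword.
ROWS = {
    1: ([1, 2, 3], 0.37162162162162166),
    2: ([1, 2, 1], 0.3503184713375797),
    3: ([1, 1, 1], 0.3503184713375797),
    4: ([1, 3, 1], 0.0),
    5: ([1, 3, 2], 0.0),
    6: ([2, 1, 1], 0.3503184713375797),
}


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hybrid_scores(rows, query_vector, k):
    """Sum keyword and semantic scores under the assumptions stated above."""
    distances = {c1: l2_distance(vec, query_vector) for c1, (vec, _) in rows.items()}
    nearest = set(sorted(distances, key=distances.get)[:k])
    scores = {}
    for c1, (_, keyword_score) in rows.items():
        # Assumed formula: only the k nearest rows get a semantic score.
        semantic_score = 1.0 / (1.0 + distances[c1]) if c1 in nearest else 0.0
        scores[c1] = keyword_score + semantic_score
    # Highest combined score first.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


scores = hybrid_scores(ROWS, QUERY_VECTOR, K)
for c1, score in scores.items():
    print(c1, round(score, 6))
```

Under these assumptions the sketch yields the same ranking as the sample output (rows 1, 2, 3, 5, 6, 4). Row 6 keeps only its keyword score because its vector is the sixth nearest and therefore falls outside `k = 5`.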
## Parameter tuning
In hybrid search, you can adjust the weight ratio of full-text search and vector search through the `boost` parameter to optimize search results. For example, to increase the weight of full-text search:
```sql
SET @parm = '{
"query": {
"query_string": {
"fields": ["query", "content"],
"query": "hello oceanbase",
"boost": 2.0
}
},
"knn" : {
"field": "vector",
"k": 5,
"query_vector": [1,2,3],
"boost": 1.0
}
}';
SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm));
```
By adjusting the `boost` parameter, you can control the weight of keyword search and semantic search in the final ranking. For example, if you focus more on keyword matching, you can increase the `boost` value of `query_string`; if you focus more on semantic similarity, you can increase the `boost` value of `knn`.
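When you experiment with different weights, generating the parameter JSON in code can be less error-prone than editing the string by hand. The following Python sketch builds the same structure as the example above. The helper name `build_hybrid_params` and its arguments are hypothetical; only the JSON keys come from this tutorial.

```python
import json


def build_hybrid_params(text, fields, vector_field, query_vector, k,
                        keyword_boost=1.0, vector_boost=1.0):
    """Return the hybrid search parameter as a JSON string (hypothetical helper)."""
    params = {
        "query": {
            "query_string": {
                "fields": fields,
                "query": text,
                "boost": keyword_boost,
            }
        },
        "knn": {
            "field": vector_field,
            "k": k,
            "query_vector": query_vector,
            "boost": vector_boost,
        },
    }
    return json.dumps(params)


# Favor keyword matching over semantic similarity, as in the SQL example.
parm = build_hybrid_params(
    "hello oceanbase", ["query", "content"], "vector", [1, 2, 3], k=5,
    keyword_boost=2.0, vector_boost=1.0,
)
print(parm)
```

If you send this string from client code, consider passing it as a bound parameter rather than splicing it into the SQL text, so that the driver handles quoting.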
## Summary
Through this tutorial, you have mastered the core features of seekdb hybrid search:
* Pure vector search: Find relevant content through semantic similarity, suitable for semantic search scenarios.
* Pure full-text search: Find content through keyword matching, suitable for precise search scenarios.
* Hybrid search: Combines keywords and semantic understanding to provide more comprehensive and accurate search results.
The hybrid search feature is an ideal choice for processing massive unstructured data and building intelligent search and recommendation systems, significantly improving the accuracy and comprehensiveness of retrieval results.
### What's next
* Explore [AI function service features](../../200.develop/300.ai-function/200.ai-function.md)
* View [hybrid vector index](70.experience-hybrid-vector-index.md) to simplify vector search processes
## More information
For more guides on experiencing seekdb's AI Native features and building AI applications based on seekdb, see:
* [Experience vector search](30.experience-vector-search.md)
* [Experience full-text indexing](40.experience-full-text-indexing.md)
* [Experience AI function service](60.experience-ai-function.md)
* [Experience semantic indexing](70.experience-hybrid-vector-index.md)
* [Experience the Vibe Coding paradigm with Cursor Agent + OceanBase MCP](80.experience-vibe-coding-paradigm-with-cursor-agent-oceanbase-mcp.md)
* [Build a knowledge base desktop application based on seekdb](../../500.tutorials/100.create-ai-app-demo/100.build-kb-in-seekdb.md)
* [Build a cultural tourism assistant with multi-model integration based on seekdb](../../500.tutorials/100.create-ai-app-demo/300.build-multi-model-application-based-on-oceanbase.md)
* [Build an image search application based on seekdb](../../500.tutorials/100.create-ai-app-demo/400.build-image-search-app-in-seekdb.md)
In addition to using SQL for operations, you can also use the Python SDK (pyseekdb) provided by seekdb. For usage instructions, see [Experience embedded seekdb using Python SDK](../50.embedded-mode/25.using-seekdb-in-python-sdk.md) and [pyseekdb overview](../../200.develop/900.sdk/10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

View File

@@ -0,0 +1,332 @@
---
slug: /experience-ai-function
---
# Experience AI function service in seekdb
This tutorial walks you through seekdb's AI function service so that you can see how it brings AI model capabilities into SQL, understand its practical applications, and try out the features of an AI-native database.
## Overview
AI functions integrate AI model capabilities directly into data processing within the database through SQL expressions. They greatly simplify using large AI models to extract, analyze, summarize, and store data, and have become an important feature of modern databases and data warehouses. seekdb manages AI models and endpoints through the `DBMS_AI_SERVICE` package, provides several built-in AI function expressions, and supports monitoring AI model calls through views. You can call AI models directly in SQL without writing additional code. This tutorial covers four core functions that you can try in a few minutes:
* `AI_EMBED`: Converts text data to vector data by calling an embedding model.
* `AI_COMPLETE`: Calls a specified text generation model to process a prompt together with data, and parses the result.
* `AI_PROMPT`: Combines a prompt template with dynamic data into JSON, which you can pass directly to `AI_COMPLETE` in place of the `prompt` parameter.
* `AI_RERANK`: Calls a rerank model to rank texts by their relevance to a query.
This feature can be applied to text generation, text conversion, text reranking, and other scenarios.
## Prerequisites
* Contact the administrator to obtain the corresponding database connection string, then execute the following command to connect to the database:
```shell
# host: seekdb database connection IP.
# port: seekdb database connection port.
# database_name: Name of the database to access.
# user_name: Database username.
# password: Database password.
obclient -h$host -P$port -u$user_name -p$password -D$database_name
```
* Ensure that you have the relevant permissions for [AI function service](../../200.develop/300.ai-function/200.ai-function.md). Complete model and endpoint registration information is provided before each example, which you can copy and use directly.
## Step 1: Use AI_EMBED to generate vectors
`AI_EMBED` converts text to a vector by calling an embedding model. This is the first step in vector retrieval: text is mapped to a high-dimensional vector so that similarity between texts can be computed.
### Register embedding model and endpoint
```sql
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');
-- Replace the access_key value below with your actual key.
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://api.siliconflow.cn/v1/embeddings",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
```
### Try embedding a single row of data
```sql
SELECT AI_EMBED("ob_embed", "Hello world") AS embedding;
```
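The result is a vector, returned as an array of floating-point numbers such as `[0.1, 0.2, 0.3]` (the actual array has one element per dimension of the embedding model). You can apply the same function to a column to batch convert the text in a table to vectors for subsequent vector retrieval.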
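To make "similarity calculation" concrete, the following Python sketch compares embeddings given in the `[x, y, z]` text form shown above. The three-dimensional vectors are made-up values for illustration, and cosine similarity is one common measure, not necessarily the one your index is configured to use.

```python
import json
import math


def parse_vector(text):
    """Parse a vector written as a JSON array, for example "[0.1, 0.2, 0.3]"."""
    return [float(x) for x in json.loads(text)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: the first two are close, the third is not.
hello = parse_vector("[0.1, 0.2, 0.3]")
greeting = parse_vector("[0.1, 0.25, 0.28]")
unrelated = parse_vector("[0.3, -0.2, 0.05]")

print(cosine_similarity(hello, greeting))
print(cosine_similarity(hello, unrelated))
```

The first pair scores close to 1 and the second close to 0, which is the kind of signal that vector retrieval uses to rank rows.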
## Step 2: Use AI_COMPLETE and AI_PROMPT to generate text
`AI_COMPLETE` calls a large language model directly from SQL for text generation, translation, analysis, and similar tasks. `AI_PROMPT` combines a prompt template with dynamic data into JSON that you can pass to `AI_COMPLETE` in place of the `prompt` parameter.
### Register text generation model and endpoint
```sql
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');
-- Replace the access_key value below with your actual key.
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://api.siliconflow.cn/v1/chat/completions",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
```
### Try sentiment analysis
```sql
SELECT AI_COMPLETE("ob_complete", AI_PROMPT('Your task is to perform sentiment analysis on the provided text and determine whether its emotional tendency is positive or negative.
The following is the text to be analyzed:
<text>
{0}
</text>
The judgment criteria are as follows:
If the text expresses positive emotions, output 1; if the text expresses negative emotions, output -1. Do not output anything else.', 'The weather is really good.')) AS sentiment;
```
The following result is returned:
```sql
+----------+
| sentiment|
+----------+
| 1 |
+----------+
```
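The `{0}` placeholder in the template is replaced by the first data argument passed to `AI_PROMPT`. The Python sketch below models only that positional substitution, so that you can preview a filled-in prompt locally. `fill_prompt` is a hypothetical helper, and it does not reproduce the JSON structure that `AI_PROMPT` actually produces.

```python
import re


def fill_prompt(template, *args):
    """Replace {0}, {1}, ... in the template with the positional arguments."""
    def substitute(match):
        index = int(match.group(1))
        return str(args[index])  # raises IndexError if an argument is missing
    return re.sub(r"\{(\d+)\}", substitute, template)


template = (
    "Determine whether the sentiment of the text is positive or negative.\n"
    "<text>\n{0}\n</text>\n"
    "Output 1 for positive and -1 for negative."
)
print(fill_prompt(template, "The weather is really good."))
```

Keeping the template separate from the data in this way lets you reuse one template across many rows.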
## Step 3: Use AI_RERANK to optimize retrieval results
`AI_RERANK` calls a rerank model to score each document in a list by its relevance to a query, so that you can reorder retrieval results by relevance.
### Register rerank model and endpoint
```sql
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_rerank');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_rerank_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_rerank', '{
"type": "rerank",
"model_name": "BAAI/bge-reranker-v2-m3"
}');
-- Replace the access_key value below with your actual key.
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_rerank_endpoint', '{
"ai_model_name": "ob_rerank",
"url": "https://api.siliconflow.cn/v1/rerank",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
```
### Try reranking
```sql
SELECT AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]');
```
The following result is returned:
:::collapse
```sql
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]') |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"index": 0, "relevance_score": 0.9911285638809204}, {"index": 1, "relevance_score": 0.0030552432872354984}, {"index": 2, "relevance_score": 0.0003349370090290904}, {"index": 3, "relevance_score": 0.00001892922773549799}] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
```
:::
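Each entry in the result pairs the `index` of a document in the input array with its `relevance_score`. Reranking can improve the accuracy of retrieval results and is particularly useful in retrieval-augmented generation (RAG) scenarios.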
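Because the result refers to documents by position, client code usually maps each `index` back to the original text. The following Python sketch does this for the sample output above. `rank_documents` and its `min_score` threshold are hypothetical names introduced for illustration.

```python
import json

documents = ["apple", "banana", "fruit", "vegetable"]

# Sample AI_RERANK output copied from the result above.
rerank_output = (
    '[{"index": 0, "relevance_score": 0.9911285638809204}, '
    '{"index": 1, "relevance_score": 0.0030552432872354984}, '
    '{"index": 2, "relevance_score": 0.0003349370090290904}, '
    '{"index": 3, "relevance_score": 0.00001892922773549799}]'
)


def rank_documents(documents, rerank_json, min_score=0.0):
    """Return (document, score) pairs, most relevant first, above min_score."""
    results = json.loads(rerank_json)
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return [
        (documents[r["index"]], r["relevance_score"])
        for r in results
        if r["relevance_score"] >= min_score
    ]


for document, score in rank_documents(documents, rerank_output):
    print(document, score)
```

A threshold such as `min_score=0.001` would keep only `apple` and `banana` from this sample.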
## Step 4: Comprehensive application: Build an intelligent Q&A system
Combine the three AI functions to build a simple intelligent Q&A system in three steps.
### Register all required models and endpoints
This example uses an embedding model, a text generation model, and a rerank model together. Ensure that the following models and endpoints are registered:
:::collapse
```sql
-- Register embedding model (skip if already registered in Step 1)
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://api.siliconflow.cn/v1/embeddings",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
-- Register text generation model (skip if already registered in Step 2)
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://api.siliconflow.cn/v1/chat/completions",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
-- Register rerank model (skip if already registered in Step 3)
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_rerank');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_rerank_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_rerank', '{
"type": "rerank",
"model_name": "BAAI/bge-reranker-v2-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_rerank_endpoint', '{
"ai_model_name": "ob_rerank",
"url": "https://api.siliconflow.cn/v1/rerank",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
```
:::
:::info
Replace all <code>access_key</code> values with actual API keys. If you have already registered the corresponding models in the previous steps, you can skip the corresponding registration steps.
:::
### Prepare data and generate vectors
```sql
CREATE TABLE knowledge_base (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255),
content TEXT,
embedding TEXT
);
INSERT INTO knowledge_base (title, content) VALUES
('seekdb Introduction', 'seekdb is a powerful database system that supports vector retrieval and AI functions.'),
('Vector Retrieval', 'Vector retrieval can be used for semantic search to find similar content.'),
('AI Functions', 'AI functions can directly call AI models in SQL.');
UPDATE knowledge_base
SET embedding = AI_EMBED("ob_embed", content);
```
### Vector retrieval and reranking
```sql
SET @query = "What is vector retrieval?";
SET @query_vector = AI_EMBED("ob_embed", @query);
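-- In a real application, use @query_vector to retrieve candidate documents by vector search.
-- For simplicity, this example constructs the candidate list directly as a JSON array of strings.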
SET @candidate_docs = '["seekdb is a powerful database system that supports vector retrieval and AI functions.", "Vector retrieval can be used for semantic search to find similar content."]';
SELECT AI_RERANK("ob_rerank", @query, @candidate_docs) AS ranked_results;
```
The following result is returned. `index` is the document index, and `relevance_score` is the relevance score:
```sql
+-------------------------------------------------------------------------------------------------------------+
| ranked_results |
+-------------------------------------------------------------------------------------------------------------+
| [{"index": 1, "relevance_score": 0.9904329776763916}, {"index": 0, "relevance_score": 0.16993996500968933}] |
+-------------------------------------------------------------------------------------------------------------+
1 row in set
```
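The answer-generation step needs the highest-ranked document. The following Python sketch, which assumes that you read `ranked_results` in client code, derives the JSON path of that document from the rerank output instead of hard-coding it. `top_document_path` is a hypothetical helper.

```python
import json

# Sample ranked_results value copied from the output above.
ranked_results = (
    '[{"index": 1, "relevance_score": 0.9904329776763916}, '
    '{"index": 0, "relevance_score": 0.16993996500968933}]'
)


def top_document_path(ranked_json):
    """Return the JSON path of the most relevant candidate document."""
    best = max(json.loads(ranked_json), key=lambda r: r["relevance_score"])
    return "$[{}]".format(best["index"])


print(top_document_path(ranked_results))  # prints $[1]
```

The resulting path can be used with `JSON_EXTRACT` on the candidate document array.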
### Generate answers
Using the question from the first step and the top-ranked document from the reranking step (`index` 1), generate an answer:
```sql
SELECT AI_COMPLETE("ob_complete",
AI_PROMPT('Based on the following document content, answer the user''s question.
User question: {0}
Relevant document: {1}
Please answer the user''s question concisely and accurately based on the above document content.', @query, CAST(JSON_EXTRACT(@candidate_docs, '$[1]') AS CHAR))) AS answer;
```
The following result is returned:
:::collapse
```sql
+--------------------------------------------------------------------------------------------------------------------------------------------+
| answer |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| According to the provided document content, vector retrieval is a technology used for semantic search, aimed at finding similar content by comparing vector data. |
+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
```
:::
With these three steps, you can run a complete AI application flow inside seekdb: vectorization, retrieval, reranking, and answer generation.
## Summary
Through this tutorial, you have mastered the core features of seekdb's AI function service:
* AI_EMBED: Convert text to vectors to prepare data for vector retrieval.
* AI_COMPLETE: Directly call LLMs in SQL to implement text generation, translation, analysis, and other functions.
* AI_RERANK: Optimize the accuracy of retrieval results and improve RAG application effectiveness.
### What's next
* View and monitor AI model information and call status through views in the [AI function service usage and examples - AI model call monitoring](../../200.develop/300.ai-function/200.ai-function.md) section
* Learn about [vector retrieval](../../200.develop/100.vector-search/100.vector-search-overview/100.vector-search-intro.md)
* Explore [hybrid search](50.experience-hybrid-search.md) features
* View [hybrid vector index](70.experience-hybrid-vector-index.md) to simplify vector search processes
## More information
For more guides on experiencing seekdb's AI Native features and building AI applications based on seekdb, see:
* [Experience vector search](30.experience-vector-search.md)
* [Experience full-text indexing](40.experience-full-text-indexing.md)
* [Experience hybrid search](50.experience-hybrid-search.md)
* [Experience semantic indexing](70.experience-hybrid-vector-index.md)
* [Experience the Vibe Coding paradigm with Cursor Agent + OceanBase MCP](80.experience-vibe-coding-paradigm-with-cursor-agent-oceanbase-mcp.md)
* [Build a knowledge base desktop application based on seekdb](../../500.tutorials/100.create-ai-app-demo/100.build-kb-in-seekdb.md)
* [Build a cultural tourism assistant with multi-model integration based on seekdb](../../500.tutorials/100.create-ai-app-demo/300.build-multi-model-application-based-on-oceanbase.md)
* [Build an image search application based on seekdb](../../500.tutorials/100.create-ai-app-demo/400.build-image-search-app-in-seekdb.md)
In addition to using SQL for operations, you can also use the Python SDK (pyseekdb) provided by seekdb. For usage instructions, see [Experience embedded seekdb using Python SDK](../50.embedded-mode/25.using-seekdb-in-python-sdk.md) and [pyseekdb overview](../../200.develop/900.sdk/10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

View File

@@ -0,0 +1,186 @@
---
slug: /experience-hybrid-vector-index
---
# Experience hybrid vector index in seekdb
This tutorial walks you through seekdb's hybrid vector index and its practical applications. With a hybrid vector index, you can perform semantic retrieval by storing text directly, without manually converting it to vectors.
## Overview
A hybrid vector index is a vector index that automatically converts text to vectors and indexes them, which makes vectors transparent to users. Compared with an ordinary vector index, it greatly simplifies the workflow.
* Vector index process without hybrid vector index:
```shell
Text → Manually call `AI_EMBED` function to generate vectors → Insert vectors → Use vector retrieval
```
* Hybrid vector index process:
```shell
Text → Direct insertion → Direct text retrieval
```
seekdb automatically converts text to vectors and builds indexes internally. During retrieval, you only need to provide the original text, and the system automatically performs embedding and retrieves the vector index, significantly improving ease of use.
## Prerequisites
* Contact the administrator to obtain the corresponding database connection string, then execute the following command to connect to the database:
```shell
# host: seekdb database connection IP.
# port: seekdb database connection port.
# database_name: Name of the database to access.
# user_name: Database username.
# password: Database password.
obclient -h$host -P$port -u$user_name -p$password -D$database_name
```
* Ensure that you have the relevant permissions for [AI function service](../../200.develop/300.ai-function/200.ai-function.md), and ensure that an embedding model has been registered in the database using the `CREATE_AI_MODEL` and `CREATE_AI_MODEL_ENDPOINT` procedures:
:::collapse
```sql
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');
-- Replace the access_key value below with your actual key.
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://api.siliconflow.cn/v1/embeddings",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
```
:::
:::info
The hybrid vector index feature currently only supports HNSW/HNSW_BQ index types.
:::
## Step 1: Create a hybrid vector index
You can create a hybrid vector index in two ways: **when you create the table** or **after the table is created**.
:::info
When creating a hybrid vector index, you must specify it on a <code>VARCHAR</code> column and specify the embedding model and vector dimension.
:::
### Create during table creation
```sql
CREATE TABLE items (
id INT PRIMARY KEY,
doc VARCHAR(100),
VECTOR INDEX vector_idx(doc)
WITH (distance=l2, lib=vsag, type=hnsw, model=ob_embed, dim=1024, sync_mode=immediate)
);
```
### Create after table creation
```sql
CREATE TABLE items1 (
id INT PRIMARY KEY,
doc VARCHAR(100)
);
CREATE VECTOR INDEX vector_idx
ON items1 (doc)
WITH (distance=l2, lib=vsag, type=hnsw, model=ob_embed, dim=1024, sync_mode=immediate);
```
## Step 2: Insert text data (no manual vectorization required)
When you insert text data, the system embeds it automatically. You do not need to call the `AI_EMBED` function:
```sql
INSERT INTO items(id, doc) VALUES(1, 'Rose');
INSERT INTO items(id, doc) VALUES(2, 'Sunflower');
INSERT INTO items(id, doc) VALUES(3, 'Lily');
```
## Step 3: Use text for direct retrieval
Use the `semantic_distance` function and pass in the original query text. You do not need to generate a query vector manually:
```sql
SELECT id, doc FROM items
ORDER BY semantic_distance(doc, 'flower')
APPROXIMATE LIMIT 3;
```
The following result is returned:
```sql
+----+-----------+
| id | doc |
+----+-----------+
| 1 | Rose |
| 2 | Sunflower |
| 3 | Lily |
+----+-----------+
3 rows in set
```
The system automatically converts the query text `'flower'` to a vector and then retrieves the most similar text in the vector index.
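Conceptually, the query above orders rows by the distance between each stored text's embedding and the embedding of `'flower'`, then keeps the three nearest. The Python sketch below does the same by brute force over made-up two-dimensional vectors. The `TOY_EMBEDDINGS` table stands in for the embedding model and is purely hypothetical, and a real HNSW index returns approximate rather than exhaustive results.

```python
import math

# Hypothetical embeddings; a real model such as the one registered above
# would produce much longer vectors.
TOY_EMBEDDINGS = {
    "Rose": [0.9, 0.1],
    "Sunflower": [0.8, 0.3],
    "Lily": [0.7, 0.4],
    "Bicycle": [-0.6, 0.8],
    "flower": [1.0, 0.0],
}


def l2_distance(a, b):
    """Euclidean distance, matching distance=l2 in the index definition."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_texts(query, docs, limit):
    """Return the `limit` documents whose embeddings are closest to the query."""
    query_vec = TOY_EMBEDDINGS[query]
    ordered = sorted(docs, key=lambda d: l2_distance(TOY_EMBEDDINGS[d], query_vec))
    return ordered[:limit]


print(nearest_texts("flower", ["Lily", "Bicycle", "Rose", "Sunflower"], limit=3))
```

With these toy vectors the three flowers are returned and the unrelated row is dropped.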
## Advanced: Use vector retrieval
If you already have vector representations of the retrieval content (for example, pre-generated through the `AI_EMBED` function), you can also directly use these vectors to retrieve hybrid vector indexes, avoiding repeated embedding operations for each retrieval:
```sql
-- First get the query vector
SET @query_vector = AI_EMBED("ob_embed", "flower");
-- Use vectors for index retrieval
SELECT id, doc FROM items
ORDER BY semantic_vector_distance(doc, @query_vector)
APPROXIMATE LIMIT 3;
```
The following result is returned:
```sql
+----+-----------+
| id | doc |
+----+-----------+
| 1 | Rose |
| 2 | Sunflower |
| 3 | Lily |
+----+-----------+
3 rows in set
```
## Summary
Through this tutorial, you have mastered the core features of seekdb's hybrid vector index:
* Simplified usage process: Achieve semantic retrieval by directly storing text without manually converting to vectors.
* Automatic embedding: The system automatically converts text to vectors and builds indexes. During retrieval, you only need to provide the original text.
* Performance optimization: Supports direct vector retrieval to avoid repeated embedding operations.
The hybrid vector index feature greatly simplifies the usage process of vector retrieval and is an ideal choice for building intelligent search applications.
### What's next
* Learn about [vector index maintenance and monitoring](../../200.develop/100.vector-search/200.vector-index/200.dense-vector-index.md)
* Learn more about [AI function service features](../../200.develop/300.ai-function/200.ai-function.md)
* Explore [hybrid search](50.experience-hybrid-search.md) to combine keyword matching and semantic understanding for more accurate and comprehensive search results.
## More information
For more guides on experiencing seekdb's AI Native features and building AI applications based on seekdb, see:
* [Experience vector search](30.experience-vector-search.md)
* [Experience full-text indexing](40.experience-full-text-indexing.md)
* [Experience hybrid search](50.experience-hybrid-search.md)
* [Experience AI function service](60.experience-ai-function.md)
* [Experience the Vibe Coding paradigm with Cursor Agent + OceanBase MCP](80.experience-vibe-coding-paradigm-with-cursor-agent-oceanbase-mcp.md)
* [Build a knowledge base desktop application based on seekdb](../../500.tutorials/100.create-ai-app-demo/100.build-kb-in-seekdb.md)
* [Build a cultural tourism assistant with multi-model integration based on seekdb](../../500.tutorials/100.create-ai-app-demo/300.build-multi-model-application-based-on-oceanbase.md)
* [Build an image search application based on seekdb](../../500.tutorials/100.create-ai-app-demo/400.build-image-search-app-in-seekdb.md)
In addition to using SQL for operations, you can also use the Python SDK (pyseekdb) provided by seekdb. For usage instructions, see [Experience embedded seekdb using Python SDK](../50.embedded-mode/25.using-seekdb-in-python-sdk.md) and [pyseekdb overview](../../200.develop/900.sdk/10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).

View File

@@ -0,0 +1,287 @@
---
slug: /experience-vibe-coding-paradigm-with-cursor-agent-oceanbase-mcp
---
# Experience the vibe coding paradigm with Cursor Agent + OceanBase MCP
Can you launch a product without writing a single line of code? The arrival of the AI era may mean that "writing code" is no longer about "writing" code. This vision is gradually becoming reality, as AI is bringing transformative changes to people's lives and work. Vibe coding, proposed by AI researcher Andrej Karpathy in 2025, demonstrates a new development approach: developers directly express their intentions to AI through natural language (voice or text), and the rest—from generating code, optimizing structure, to partial debugging—is all handled by AI. In this mode, developers only need to consider how to clearly express "what I want", such as "create a multi-language login module for me", and AI can output the corresponding project structure and implementation code. In other words, developers focus on "goals", and AI handles "implementation".
## Vibe coding vs. traditional AI-assisted development
AI-assisted development is not a new concept. However, unlike traditional AI coding assistants such as Gemini Code Assist, which emphasize code completion and manual line-by-line review, vibe coding aims to complete whole development stages automatically and reduce developers' involvement in low-level details.
That said, the industry is still far from adopting AI output as a black box without reading the code. Most real-world projects still rely on a hybrid process of human-machine collaboration and review.
* **Trust mechanisms are not yet perfect:** In ideal vibe coding scenarios, developers can directly adopt AI output, but this is difficult to achieve in actual projects. Especially in scenarios with high security and business complexity, manual review and testing are still essential.
* **Auxiliary tools continue to evolve:** New IDEs continue to improve natural language processing and context awareness capabilities, enhancing AI code output quality and user interaction experience. However, these tools are mostly limited to standardized or prototyping tasks, and complex systems still require engineers to actively oversee and participate.
* **Collaboration and requirements management are becoming trends:** As vibe coding evolves, collaboration between developers and coordination between projects are gradually becoming new trends. The emergence of protocols such as MCP (Model Context Protocol) enables multiple developers to collaborate through fragmented descriptions, synchronize adjustments, and integrate requirements simultaneously, while agents (such as Cursor) help make these processes smoother.
## Cursor: A new generation of AI-native development environment
As emerging AI-driven development modes such as vibe coding gradually become mainstream, various AI-native tools have emerged to provide developers with more convenient and user-friendly development environments. Cursor is one of the leaders.
Cursor is an AI-driven code editor that understands your codebase in depth and lets you program efficiently through natural language. You can download Cursor from [https://cursor.com](https://cursor.com).
Compared to traditional "code suggestion" tools, AI-native IDEs such as Cursor support deeper natural language-code interaction, automatic context association, and intelligent debugging assistance. For example, a single natural language description can build crawlers, configure dependencies, and even introduce testing and exception handling, helping developers lower development barriers and improve engineering efficiency.
## Cursor Agent + OceanBase MCP: A new vibe coding paradigm
seekdb, based on a unified architecture, provides users with vector capabilities and supports multi-modal fusion queries. Without introducing new technology stacks, it can meet diverse business needs, thereby reducing learning costs and accelerating AI development, making it a preferred choice for many developers when using vector databases.
Both seekdb and Cursor currently support the Model Context Protocol (MCP). With MCP, developers can easily implement the new vibe coding paradigm based on Cursor Agent + seekdb.
 
The MCP protocol is regarded as an "adapter" connecting AI models with actual business systems. Through the MCP protocol, large models can access various external applications, such as Git version management and database software commonly used in development, to obtain more environmental information and automatically complete various development stages. This means that enterprises can seamlessly integrate seekdb's data service capabilities into various AI application processes directly through the MCP protocol, significantly reducing the barriers to data interface development and integration.
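To make the "adapter" role more concrete, the following minimal sketch builds the kind of JSON-RPC 2.0 message that an MCP client sends when the model decides to call a tool. The `tools/call` method and the `name` and `arguments` fields follow the MCP specification. The tool name `execute_sql` and its `sql` argument are assumptions for illustration only and may not match the tools that the OceanBase MCP server actually exposes.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument; check the server's tool list for the real ones.
request = build_tool_call(1, "execute_sql", {"sql": "SELECT 1"})
print(json.dumps(request, indent=2))
```

In practice Cursor builds and sends these messages for you. The sketch only shows what travels between the agent and the database-side MCP server.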
### Prerequisites
* You have deployed seekdb.
* Install [Python 3.11 or later](https://www.python.org/downloads/) and the corresponding [pip](https://pip.pypa.io/en/stable/installation/). If your machine has a lower Python version, you can use Miniconda to create a new Python 3.11 or later environment. For details, see the [Miniconda installation guide](https://docs.anaconda.com/miniconda/install/).
* Install [Git](https://git-scm.com/downloads) according to your operating system.
* Install the Python package manager uv. After installation, you can use the `uv --version` command to verify whether the installation is successful:
```shell
pip install uv
uv --version
```
* Download [Cursor](https://cursor.com/downloads) and install the appropriate version for your operating system. Note that when using Cursor for the first time, you need to register a new account or log in with an existing account. After logging in, you can create a new project or open an existing project.
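The prerequisites above can be checked with a short script before you continue. This is a minimal sketch that uses only the Python standard library. The helper names `python_ok` and `missing_tools` are made up for this example, and the check only looks for `git` and `uv` on `PATH`; it does not verify seekdb or Cursor.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version named in the prerequisites


def python_ok(version_info=sys.version_info) -> bool:
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def missing_tools(tools=("git", "uv")) -> list:
    """Return the command-line tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print("Python version OK:", python_ok())
print("Missing tools:", missing_tools() or "none")
```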
### Vibe coding practice
This section applies the vibe coding concept to a database-backed project and quickly builds an API service.
#### Step 1: Obtain database connection information
Contact the deployment personnel or administrator to obtain the database connection information. A typical connection command looks like this:
```shell
obclient -h$host -P$port -u$user_name -p$password -D$database_name
```
**Parameter description:**
* `$host`: The IP address used to connect to seekdb.
* `$port`: The port used to connect to seekdb. The default is `2881`, which can be customized when you deploy seekdb.
* `$database_name`: The name of the database to access.
* `$user_name`: The account used to connect to the tenant. The default is `root`.
* `$password`: The password of the account.
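The same parameters are needed later when an application connects to the database through a connection URL. The following sketch assembles such a URL and percent-encodes the credentials. The function name `build_database_url`, the `mysql+pymysql` driver prefix, and the sample values are assumptions for illustration; use the driver that is installed in your environment.

```python
from urllib.parse import quote_plus


def build_database_url(host: str, port: int, user: str, password: str, database: str) -> str:
    """Assemble a SQLAlchemy-style MySQL URL from the connection parameters."""
    # Percent-encode the credentials so characters such as '@' or '/' do not break the URL.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder values; 2881 is the default port mentioned above.
print(build_database_url("127.0.0.1", 2881, "root", "p@ss/word", "test"))
```

Encoding matters because an unescaped `@` in the password would otherwise be read as the separator between the credentials and the host.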
#### Step 2: Clone the OceanBase MCP server project
Clone the project to your local machine:
```shell
git clone https://github.com/oceanbase/mcp-oceanbase.git
```
#### Step 3: Prepare the Cursor environment
1. Create a Cursor client working directory and configure the OceanBase MCP server.
Manually create a Cursor working directory and open it with Cursor. Files generated by Cursor will be placed in this directory. The example directory name is `cursor`.
Use the shortcut `Ctrl + L` (Windows) or `Command + L` (macOS) to open the chat dialog, click the gear icon in the upper right corner, select `MCP Tools`, and then click `Add Custom MCP` to edit the configuration file.
<!--![1](https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/doc/img/SeekDB/get-started/mcp-cursor-1.png)-->
The example configuration file is as follows. Replace `/path/to/your/mcp-oceanbase/src/oceanbase_mcp_server` with the absolute path of the `oceanbase_mcp_server` folder, and replace the values of `OB_HOST`, `OB_PORT`, `OB_USER`, `OB_PASSWORD`, and `OB_DATABASE` with your database information:
```json
{
  "mcpServers": {
    "oceanbase": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/your/mcp-oceanbase/src/oceanbase_mcp_server",
        "run",
        "oceanbase_mcp_server"
      ],
      "env": {
        "OB_HOST": "***",
        "OB_PORT": "***",
        "OB_USER": "***",
        "OB_PASSWORD": "***",
        "OB_DATABASE": "***"
      }
    }
  }
}
```
2. If the configuration is successful, it will display the `Available` status.
<!--![3](https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/doc/img/cloud/integrations/AI/cursor-3.1.png)-->
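If you prefer to generate the configuration instead of editing it by hand, the following sketch produces the same JSON structure and resolves the server folder to an absolute path. The function name `build_mcp_config` and all sample values are assumptions for illustration. The script only prints the result and does not write it into Cursor's settings.

```python
import json
from pathlib import Path


def build_mcp_config(server_dir: str, host: str, port: str, user: str,
                     password: str, database: str) -> dict:
    """Return the Cursor MCP configuration as a dictionary."""
    return {
        "mcpServers": {
            "oceanbase": {
                "command": "uv",
                "args": [
                    "--directory",
                    # The configuration expects an absolute path to the server folder.
                    str(Path(server_dir).expanduser().resolve()),
                    "run",
                    "oceanbase_mcp_server",
                ],
                "env": {
                    "OB_HOST": host,
                    "OB_PORT": port,
                    "OB_USER": user,
                    "OB_PASSWORD": password,
                    "OB_DATABASE": database,
                },
            }
        }
    }


# Placeholder values; replace them with your own connection information.
config = build_mcp_config(
    "mcp-oceanbase/src/oceanbase_mcp_server", "127.0.0.1", "2881", "root", "", "test"
)
print(json.dumps(config, indent=2))
```

You can paste the printed JSON into the `Add Custom MCP` dialog.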
#### Step 4: Build a RESTful API
1. Create a customer table.
Enter the instruction `Create a customer table with ID as the primary key, including name, age, telephone, and location fields`. After confirming the SQL statement, click the `Run tool` button to execute the query.
<!--![4](https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/doc/img/SeekDB/get-started/mcp-cursor-4.png)-->
2. Insert test data.
Enter the instruction `Insert 10 test records` in the dialog box. After confirming the SQL statement, click the `Run tool` button. After successful insertion, you will see a message indicating that `10 test records have been successfully inserted into the customer table...`.
<!--![5](https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/doc/img/SeekDB/get-started/mcp-cursor-5.png)-->
3. Create a FastAPI project.
Enter the prompt in the dialog box: `Create a FastAPI project and generate a RESTful API based on the customer table`. After confirming the SQL statement, click the `Run tool` button to execute the query. Cursor automatically generates files such as `main.py`. You can also continue to issue new instructions, for example to start the service automatically.
<!--![6](https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/doc/img/SeekDB/get-started/mcp-cursor-6.png)-->
4. Create a virtual environment and install dependencies.
Execute the following commands to create a virtual environment in the current directory using the uv package manager and install dependency packages:
```shell
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
```
5. Start the FastAPI project.
Execute the following command to start the FastAPI project:
```shell
uvicorn main:app --reload
```
6. View data in the table.
Run the following command in the command line, or use other request tools to view the data in the table:
```shell
curl http://127.0.0.1:8000/customers/
```
The result is as follows:
```json
[{"id":1,"name":"Alice","age":28,"telephone":"1234567890","location":"Beijing"},{"id":2,"name":"Bob","age":32,"telephone":"2345678901","location":"Shanghai"},{"id":3,"name":"Charlie","age":25,"telephone":"3456789012","location":"Guangzhou"},{"id":4,"name":"David","age":40,"telephone":"4567890123","location":"Shenzhen"},{"id":5,"name":"Eve","age":22,"telephone":"5678901234","location":"Chengdu"},{"id":6,"name":"Frank","age":35,"telephone":"6789012345","location":"Wuhan"},{"id":7,"name":"Grace","age":30,"telephone":"7890123456","location":"Hangzhou"},{"id":8,"name":"Heidi","age":27,"telephone":"8901234567","location":"Nanjing"},{"id":9,"name":"Ivan","age":29,"telephone":"9012345678","location":"Tianjin"},{"id":10,"name":"Judy","age":31,"telephone":"0123456789","location":"Chongqing"}]
```
You can see that the RESTful API for create, read, update, and delete operations has been successfully generated:
```python
from typing import List, Optional

from fastapi import FastAPI, HTTPException, Depends
from pydantic import BaseModel
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, Session

# OceanBase connection configuration (modify according to your actual situation)
DATABASE_URL = "mysql://***:***@***:***/***"

engine = create_engine(DATABASE_URL, echo=True)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()


# ORM model mapped to the customer table
class Customer(Base):
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True, index=True)
    name = Column(String(100))
    age = Column(Integer)
    telephone = Column(String(20))
    location = Column(String(100))


# Request body for creating a customer
class CustomerCreate(BaseModel):
    id: int
    name: str
    age: int
    telephone: str
    location: str


# Request body for updating a customer; omitted fields stay unchanged
class CustomerUpdate(BaseModel):
    name: Optional[str] = None
    age: Optional[int] = None
    telephone: Optional[str] = None
    location: Optional[str] = None


# Response body returned by the API
class CustomerOut(BaseModel):
    id: int
    name: str
    age: int
    telephone: str
    location: str

    class Config:
        orm_mode = True


def get_db():
    # Open one session per request and always close it afterwards
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


app = FastAPI()


@app.post("/customers/", response_model=CustomerOut)
def create_customer(customer: CustomerCreate, db: Session = Depends(get_db)):
    db_customer = Customer(**customer.dict())
    db.add(db_customer)
    try:
        db.commit()
        db.refresh(db_customer)
    except Exception as e:
        db.rollback()
        raise HTTPException(status_code=400, detail=str(e))
    return db_customer


@app.get("/customers/", response_model=List[CustomerOut])
def read_customers(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
    return db.query(Customer).offset(skip).limit(limit).all()


@app.get("/customers/{customer_id}", response_model=CustomerOut)
def read_customer(customer_id: int, db: Session = Depends(get_db)):
    customer = db.query(Customer).filter(Customer.id == customer_id).first()
    if customer is None:
        raise HTTPException(status_code=404, detail="Customer not found")
    return customer


@app.put("/customers/{customer_id}", response_model=CustomerOut)
def update_customer(customer_id: int, customer: CustomerUpdate, db: Session = Depends(get_db)):
    db_customer = db.query(Customer).filter(Customer.id == customer_id).first()
    if db_customer is None:
        raise HTTPException(status_code=404, detail="Customer not found")
    # Only overwrite the fields that were provided in the request
    for var, value in vars(customer).items():
        if value is not None:
            setattr(db_customer, var, value)
    db.commit()
    db.refresh(db_customer)
    return db_customer


@app.delete("/customers/{customer_id}")
def delete_customer(customer_id: int, db: Session = Depends(get_db)):
    db_customer = db.query(Customer).filter(Customer.id == customer_id).first()
    if db_customer is None:
        raise HTTPException(status_code=404, detail="Customer not found")
    db.delete(db_customer)
    db.commit()
    return {"ok": True}
```
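Besides `curl`, you can exercise the generated endpoints from Python. The following sketch uses only the standard library and prepares create, update, and delete requests for the routes shown above. The helper names `build_request` and `send` and the sample record are assumptions for illustration. Nothing is sent until you call `send`, which requires the FastAPI service to be running.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8000"  # address used by the uvicorn command above


def build_request(method: str, path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Prepare an HTTP request for the customer API without sending it."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def send(request: urllib.request.Request):
    """Send the request and decode the JSON response (the service must be running)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical new record; the fields match the CustomerCreate model.
new_customer = {"id": 11, "name": "Mallory", "age": 26,
                "telephone": "1122334455", "location": "Xi'an"}
create = build_request("POST", "/customers/", new_customer)
update = build_request("PUT", "/customers/11", {"age": 27})
delete = build_request("DELETE", "/customers/11")
print(create.get_method(), create.full_url)
# With the service running: send(create); send(update); send(delete)
```

The update request carries only `age` because `update_customer` skips fields that are `None`, so the other columns keep their current values.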
## What's next
For more guides on experiencing seekdb's AI Native features and building AI applications based on seekdb, see:
* [Experience vector search](30.experience-vector-search.md)
* [Experience full-text indexing](40.experience-full-text-indexing.md)
* [Experience hybrid search](50.experience-hybrid-search.md)
* [Experience AI function service](60.experience-ai-function.md)
* [Experience semantic indexing](70.experience-hybrid-vector-index.md)
* [Build a knowledge base desktop application based on seekdb](../../500.tutorials/100.create-ai-app-demo/100.build-kb-in-seekdb.md)
* [Build a cultural tourism assistant with multi-model integration based on seekdb](../../500.tutorials/100.create-ai-app-demo/300.build-multi-model-application-based-on-oceanbase.md)
* [Build an image search application based on seekdb](../../500.tutorials/100.create-ai-app-demo/400.build-image-search-app-in-seekdb.md)
In addition to using SQL for operations, you can also use the Python SDK (pyseekdb) provided by seekdb. For usage instructions, see [Experience embedded seekdb using Python SDK](../50.embedded-mode/25.using-seekdb-in-python-sdk.md) and [pyseekdb overview](../../200.develop/900.sdk/10.pyseekdb-sdk/10.pyseekdb-sdk-get-started.md).