Initial commit

Zhongwei Li
2025-11-30 08:44:54 +08:00
commit eb309b7b59
133 changed files with 21979 additions and 0 deletions


@@ -0,0 +1,193 @@
---
sidebar_label: Jina AI
slug: /jina
---
# Integrate seekdb vector search with Jina AI
seekdb supports vector data storage, vector indexes, and embedding-based vector search. You can store vectorized data in seekdb and then run similarity searches over it.
Jina AI is an AI platform focused on multimodal search and vector search. It offers core components and tools for building enterprise-grade Retrieval-Augmented Generation (RAG) applications based on multimodal search, helping organizations and developers create advanced search-driven generative AI solutions.
## Prerequisites
* You have deployed seekdb.
* You have an existing MySQL database and account available in your environment, and the database account has been granted read and write privileges.
* You have installed Python 3.11 or later.
* You have installed required dependencies:
```shell
python3 -m pip install pyobvector requests sqlalchemy
```
## Step 1: Obtain the database connection information
Contact your seekdb deployment engineer or administrator to obtain the database connection string. For example:
```shell
obclient -h$host -P$port -u$user_name -p$password -D$database_name
```
**Parameters:**
* `$host`: The IP address for connecting to seekdb.
* `$port`: The port number for connecting to seekdb. Default is `2881`.
* `$database_name`: The name of the database to access.
:::tip
The connected user must have <code>CREATE</code>, <code>INSERT</code>, <code>DROP</code>, and <code>SELECT</code> privileges on the database.
:::
* `$user_name`: The username for connecting to the database.
* `$password`: The password for the account.
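**Here is an example** (the values below are placeholders; replace them with your actual connection information):
```shell
obclient -hxxx.xxx.xxx.xxx -P2881 -utest_user001 -p****** -Dtest
```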
## Step 2: Build your AI assistant
### Set your Jina AI API key as an environment variable
Get your [Jina AI API key](https://jina.ai/api-dashboard/reader) and configure it, along with your seekdb connection details, as environment variables:
```shell
export OCEANBASE_DATABASE_URL=YOUR_OCEANBASE_DATABASE_URL
export OCEANBASE_DATABASE_USER=YOUR_OCEANBASE_DATABASE_USER
export OCEANBASE_DATABASE_DB_NAME=YOUR_OCEANBASE_DATABASE_DB_NAME
export OCEANBASE_DATABASE_PASSWORD=YOUR_OCEANBASE_DATABASE_PASSWORD
export JINAAI_API_KEY=YOUR_JINAAI_API_KEY
```
### Example code snippets
#### Get embeddings from Jina AI
Jina AI offers several embedding models. You can choose the one that best fits your needs.
| Model | Parameter size | Embedding dimension | Supported text |
| --- | --- | --- | --- |
| [jina-embeddings-v3](https://zilliz.com/ai-models/jina-embeddings-v3) | 570M | Flexible (default: 1024) | Multilingual text embeddings; supports 94 languages in total |
| [jina-embeddings-v2-small-en](https://zilliz.com/ai-models/jina-embeddings-v2-small-en) | 33M | 512 | English monolingual embeddings |
| [jina-embeddings-v2-base-en](https://zilliz.com/ai-models/jina-embeddings-v2-base-en) | 137M | 768 | English monolingual embeddings |
| [jina-embeddings-v2-base-zh](https://zilliz.com/ai-models/jina-embeddings-v2-base-zh) | 161M | 768 | Chinese-English bilingual embeddings |
| [jina-embeddings-v2-base-de](https://zilliz.com/ai-models/jina-embeddings-v2-base-de) | 161M | 768 | German-English bilingual embeddings |
| [jina-embeddings-v2-base-code](https://zilliz.com/ai-models/jina-embeddings-v2-base-code) | 161M | 768 | English and programming languages |
Here is an example using `jina-embeddings-v3`. The following helper function, `generate_embeddings`, calls the Jina AI embedding API:
```python
import os
import requests
from sqlalchemy import Column, Integer, String
from pyobvector import ObVecClient, VECTOR, IndexParam, cosine_distance

JINAAI_API_KEY = os.getenv('JINAAI_API_KEY')

# Step 1. Text data vectorization
def generate_embeddings(text: str):
    JINAAI_API_URL = 'https://api.jina.ai/v1/embeddings'
    JINAAI_HEADERS = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {JINAAI_API_KEY}'
    }
    JINAAI_REQUEST_DATA = {
        'input': [text],
        'model': 'jina-embeddings-v3'
    }
    response = requests.post(JINAAI_API_URL, headers=JINAAI_HEADERS, json=JINAAI_REQUEST_DATA)
    response_json = response.json()
    return response_json['data'][0]['embedding']

TEXTS = [
    'Jina AI offers best-in-class embeddings, reranker and prompt optimizer, enabling advanced multimodal AI.',
    'OceanBase Database is an enterprise-level, native distributed database independently developed by the OceanBase team. It is cloud-native, highly consistent, and highly compatible with Oracle and MySQL.',
    'OceanBase is a native distributed relational database that supports HTAP hybrid transaction analysis and processing. It features enterprise-level characteristics such as high availability, transparent scalability, and multi-tenancy, and is compatible with MySQL/Oracle protocols.'
]

data = []
for text in TEXTS:
    # Generate the embedding for the text via the Jina AI API.
    embedding = generate_embeddings(text)
    data.append({
        'content': text,
        'content_vec': embedding
    })

print(f"Successfully processed {len(data)} texts")
```
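Before creating the table in the next step, you can optionally confirm that the returned vectors have the 1024 dimensions expected by the `VECTOR(1024)` column. This is a small illustrative sanity check, not part of the original script:
```python
# Optional sanity check (illustrative): jina-embeddings-v3 returns 1024-dimensional
# vectors by default, which must match the VECTOR(1024) column defined in the next step.
assert all(len(item['content_vec']) == 1024 for item in data), "Unexpected embedding dimension"
```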
#### Define the vector table structure and store vectors in seekdb
Create a table called `jinaai_oceanbase_demo_documents` with columns for the text (`content`) and the embedding vector (`content_vec`), create an HNSW vector index on the vector column, and then insert the vector data into seekdb:
```python
# Step 2. Connect to seekdb.
OCEANBASE_DATABASE_URL = os.getenv('OCEANBASE_DATABASE_URL')
OCEANBASE_DATABASE_USER = os.getenv('OCEANBASE_DATABASE_USER')
OCEANBASE_DATABASE_DB_NAME = os.getenv('OCEANBASE_DATABASE_DB_NAME')
OCEANBASE_DATABASE_PASSWORD = os.getenv('OCEANBASE_DATABASE_PASSWORD')

client = ObVecClient(
    uri=OCEANBASE_DATABASE_URL,
    user=OCEANBASE_DATABASE_USER,
    password=OCEANBASE_DATABASE_PASSWORD,
    db_name=OCEANBASE_DATABASE_DB_NAME
)

# Step 3. Create the vector table.
table_name = "jinaai_oceanbase_demo_documents"
client.drop_table_if_exist(table_name)

cols = [
    Column("id", Integer, primary_key=True, autoincrement=True),
    Column("content", String(500), nullable=False),
    Column("content_vec", VECTOR(1024))
]

# Create the vector index.
vector_index_params = IndexParam(
    index_name="idx_content_vec",
    field_name="content_vec",
    index_type="HNSW",
    distance_metric="cosine"
)

client.create_table_with_index_params(
    table_name=table_name,
    columns=cols,
    vidxs=[vector_index_params]
)

print('- Inserting data into seekdb...')
client.insert(table_name, data=data)
```
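To confirm that the rows were written, you can optionally run a quick count through the same client. This is a small illustrative check using `perform_raw_text_sql`, the same helper used by the other examples in this guide:
```python
# Optional check (illustrative): count the rows that were just inserted.
cursor = client.perform_raw_text_sql(f"SELECT COUNT(*) FROM {table_name}")
print(f"Rows in {table_name}: {cursor.fetchone()[0]}")
```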
#### Semantic search
Use the Jina AI embedding API to generate an embedding for your query text. Then, search for the most relevant document by calculating the cosine distance between the query embedding and each embedding in the vector table:
```python
# Step 4. Query the most relevant document based on the query.
query = 'What is OceanBase?'

# Generate the embedding for the query via the Jina AI API.
query_embedding = generate_embeddings(query)

res = client.ann_search(
    table_name,
    vec_data=query_embedding,
    vec_column_name="content_vec",
    distance_func=cosine_distance,  # Use the cosine distance function.
    with_dist=True,
    topk=1,
    output_column_names=["id", "content"],
)

print('- The Most Relevant Document and Its Distance to the Query:')
for row in res.fetchall():
    print(f'- ID: {row[0]}\n'
          f'  content: {row[1]}\n'
          f'  distance: {row[2]}')
```
#### Expected result
```plain
- ID: 2
content: OceanBase Database is an enterprise-level, native distributed database independently developed by the OceanBase team. It is cloud-native, highly consistent, and highly compatible with Oracle and MySQL.
distance: 0.14733879001870276
```


@@ -0,0 +1,228 @@
---
sidebar_label: OpenAI
slug: /openai
---
# OpenAI
OpenAI is an artificial intelligence company that has developed several large language models. These models excel at understanding and generating natural language, making them highly effective for tasks such as text generation, answering questions, and engaging in conversations. Access to these models is available through an API.
seekdb offers features such as vector storage, vector indexing, and embedding-based vector search. By using OpenAI's API, you can convert data into vectors, store these vectors in seekdb, and then take advantage of seekdb's vector search capabilities to find relevant data.
## Prerequisites
* You have deployed seekdb.
* You have an existing MySQL database and account available in your environment, and the database account has been granted read and write privileges.
* You have installed [Python 3.9 or later](https://www.python.org/downloads/) and [pip](https://pip.pypa.io/en/stable/installation/).
* You have installed [Poetry](https://python-poetry.org/docs/), [Pyobvector](https://github.com/oceanbase/pyobvector), and the OpenAI SDK. The installation commands are as follows:
```shell
python3 -m pip install poetry
python3 -m pip install pyobvector
python3 -m pip install openai
```
* You have obtained an [OpenAI API key](https://platform.openai.com/api-keys).
## Step 1: Obtain the connection string of seekdb
Contact the seekdb deployment engineer or administrator to obtain the connection string of seekdb, for example:
```shell
obclient -h$host -P$port -u$user_name -p$password -D$database_name
```
**Parameters:**
* `$host`: The IP address for connecting to seekdb.
* `$port`: The port number for connecting to seekdb. Default is `2881`.
* `$database_name`: The name of the database to be accessed.
:::tip
The user for connection must have the <code>CREATE</code>, <code>INSERT</code>, <code>DROP</code>, and <code>SELECT</code> privileges on the database.
:::
* `$user_name`: The database account.
* `$password`: The password of the account.
**Here is an example:**
```shell
obclient -hxxx.xxx.xxx.xxx -P2881 -utest_user001 -p****** -Dtest
```
## Step 2: Register an LLM account
Obtain an OpenAI API key.
1. Log in to the [OpenAI](https://platform.openai.com/) platform.
2. Click **API Keys** in the upper-right corner.
3. Click **Create API Key**.
4. Specify the required information and click **Create API Key**.
Then set the API key as an environment variable.
* For a Unix-based system such as Ubuntu or macOS, you can run the following command in a terminal:
```shell
export OPENAI_API_KEY='your-api-key'
```
* For a Windows system, you can run the following command in Command Prompt:
```shell
set OPENAI_API_KEY=your-api-key
```
You must replace `your-api-key` with the actual OpenAI API key.
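To confirm that the key is visible to the SDK, you can run a minimal check like the following (an illustrative sketch; the OpenAI Python SDK reads `OPENAI_API_KEY` from the environment by default):
```python
import os
from openai import OpenAI

# Illustrative check: the OpenAI SDK reads OPENAI_API_KEY from the environment by default.
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"
client = OpenAI()
print("OpenAI client initialized")
```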
## Step 3: Store vector data in seekdb
1. Prepare test data.
Download the [CSV file](https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20240827/srxyhu/fine_food_reviews.csv) that already contains the vectorized data. This CSV file includes 1,000 food review entries, and the last column contains the vector values. Therefore, you do not need to calculate the vectors yourself. If you want to recalculate the embeddings for the "embedding" column (the vector column), you can use the following code to generate a new CSV file:
```python
from openai import OpenAI
import pandas as pd

input_datapath = "./fine_food_reviews.csv"

client = OpenAI()

# Here the text-embedding-ada-002 model is used. You can change the model as needed.
def embedding_text(text, model="text-embedding-ada-002"):
    # For more information about how to create embedding vectors, see
    # https://community.openai.com/t/embeddings-api-documentation-needs-to-updated/475663.
    res = client.embeddings.create(input=text, model=model)
    return res.data[0].embedding

df = pd.read_csv(input_datapath, index_col=0)

# It takes a few minutes to generate the CSV file by calling the OpenAI Embedding API row by row.
df["embedding"] = df.combined.apply(embedding_text)

output_datapath = './fine_food_reviews_self_embeddings.csv'
df.to_csv(output_datapath)
```
2. Run the following script to insert the test data into seekdb. The script must be located in the same directory as the test data.
```python
import os
import csv
import json
from pyobvector import *
from sqlalchemy import Column, Integer, String

# Connect to seekdb by using pyobvector and replace the at sign (@) in the username and password with %40, if any.
client = ObVecClient(uri="host:port", user="username", password="****", db_name="test")

# The test dataset has been vectorized and is stored in the same directory as the Python script by default.
# If you vectorize the dataset again, specify the new file.
file_name = "fine_food_reviews.csv"
file_path = os.path.join("./", file_name)

# Define columns. The last column is a vector column.
cols = [
    Column('id', Integer, primary_key=True, autoincrement=False),
    Column('product_id', String(256), nullable=True),
    Column('user_id', String(256), nullable=True),
    Column('score', Integer, nullable=True),
    Column('summary', String(2048), nullable=True),
    Column('text', String(8192), nullable=True),
    Column('combined', String(8192), nullable=True),
    Column('n_tokens', Integer, nullable=True),
    Column('embedding', VECTOR(1536))
]

# Define the table name.
table_name = 'fine_food_reviews'

# If the table does not exist, create it.
if not client.check_table_exists(table_name):
    client.create_table(table_name, columns=cols)
    # Create an index on the vector column.
    client.create_index(
        table_name=table_name,
        is_vec_index=True,
        index_name='vidx',
        column_names=['embedding'],
        vidx_params='distance=l2, type=hnsw, lib=vsag',
    )

# Open and read the CSV file.
with open(file_name, mode='r', newline='', encoding='utf-8') as csvfile:
    csvreader = csv.reader(csvfile)
    # Read the header line.
    headers = next(csvreader)
    print("Headers:", headers)

    batch = []  # Buffer rows and insert 10 rows into the database each time.
    for i, row in enumerate(csvreader):
        # The CSV file contains nine columns: `id`, `product_id`, `user_id`, `score`, `summary`,
        # `text`, `combined`, `n_tokens`, and `embedding`.
        if not row:
            break
        food_review_line = {
            'id': row[0], 'product_id': row[1], 'user_id': row[2], 'score': row[3],
            'summary': row[4], 'text': row[5], 'combined': row[6], 'n_tokens': row[7],
            'embedding': json.loads(row[8])
        }
        batch.append(food_review_line)
        # Insert 10 rows each time.
        if (i + 1) % 10 == 0:
            client.insert(table_name, batch)
            batch = []  # Clear the buffer.

# Insert the remaining rows, if any.
if batch:
    client.insert(table_name, batch)

# Check the data in the table and make sure that all data has been inserted.
count_sql = f"select count(*) from {table_name};"
cursor = client.perform_raw_text_sql(count_sql)
result = cursor.fetchone()
print(f"Total number of inserted rows: {result[0]}")
```
## Step 4: Query seekdb data
1. Save the following Python script as `openAIQuery.py`.
```python
import sys
from pyobvector import *
from sqlalchemy import func
from openai import OpenAI

# Obtain the command-line argument.
if len(sys.argv) != 2:
    print("Enter a query statement.")
    sys.exit()
queryStatement = sys.argv[1]

# Connect to seekdb by using pyobvector and replace the at sign (@) in the username and password with %40, if any.
client = ObVecClient(uri="host:port", user="username", password="****", db_name="test")

openAIclient = OpenAI()

# Define the function for generating text vectors.
def generate_embeddings(text, model="text-embedding-ada-002"):
    # For more information about how to create embedding vectors, see
    # https://community.openai.com/t/embeddings-api-documentation-needs-to-updated/475663.
    res = openAIclient.embeddings.create(input=text, model=model)
    return res.data[0].embedding

def query_ob(query, tableName, vector_name="embedding", top_k=1):
    embedding = generate_embeddings(query)
    # Perform an approximate nearest neighbor search (ANNS).
    res = client.ann_search(
        table_name=tableName,
        vec_data=embedding,
        vec_column_name=vector_name,
        distance_func=func.l2_distance,
        topk=top_k,
        output_column_names=['combined']
    )
    for row in res:
        print(str(row[0]).replace("Title: ", "").replace("; Content: ", ": "))

# Specify the table name.
table_name = 'fine_food_reviews'
query_ob(queryStatement, table_name, 'embedding', 1)
```
2. Run the script with your question to get an answer.
```shell
python3 openAIQuery.py 'pet food'
```
The expected result is as follows:
```shell
Crack for dogs.: These thing are like crack for dogs. I am not sure of the make-up but the doggies sure love them.
```

View File

@@ -0,0 +1,205 @@
---
sidebar_label: Qwen
slug: /qwen
---
# Qwen
[Tongyi Qianwen (Qwen)](https://tongyi.aliyun.com) is a large language model (LLM) developed by Alibaba Cloud for interpreting and analyzing user inputs. You can access the Qwen API through [Alibaba Cloud Model Studio](https://bailian.console.alibabacloud.com/#/home).
seekdb offers features such as vector storage, vector indexing, and embedding-based vector search. By using Qwen's API, you can convert data into vectors, store these vectors in seekdb, and then take advantage of seekdb's vector search capabilities to find relevant data.
## Prerequisites
* You have deployed seekdb.
* You have an existing MySQL database and account available in your environment, and the database account has been granted read and write privileges.
* You have installed [Python 3.9 or later](https://www.python.org/downloads/) and [pip](https://pip.pypa.io/en/stable/installation/).
* You have installed [Poetry](https://python-poetry.org/docs/), [Pyobvector](https://github.com/oceanbase/pyobvector), and the DashScope SDK. The installation commands are as follows:
```shell
pip install poetry
pip install pyobvector
pip install dashscope
```
* You have obtained the [Qwen API key](https://help.aliyun.com/zh/model-studio/developer-reference/get-api-key).
## Step 1: Obtain the connection string of seekdb
Contact the seekdb deployment engineer or administrator to obtain the connection string of seekdb, for example:
```shell
obclient -h$host -P$port -u$user_name -p$password -D$database_name
```
**Parameters:**
* `$host`: The IP address for connecting to seekdb.
* `$port`: The port number for connecting to seekdb. Default is `2881`.
* `$database_name`: The name of the database to be accessed.
:::tip
The user for connection must have the <code>CREATE</code>, <code>INSERT</code>, <code>DROP</code>, and <code>SELECT</code> privileges on the database.
:::
* `$user_name`: The database account.
* `$password`: The password of the account.
## Step 2: Configure the environment variable for the Qwen API key
For a Unix-based system (such as Ubuntu or macOS), run the following command in the terminal:
```shell
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"
```
For Windows, run the following command in the command prompt:
```shell
set DASHSCOPE_API_KEY=YOUR_DASHSCOPE_API_KEY
```
You must replace `YOUR_DASHSCOPE_API_KEY` with the actual Qwen API key.
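To confirm that the key is picked up, you can run a minimal check like the following (an illustrative sketch; the DashScope SDK reads `DASHSCOPE_API_KEY` from the environment by default, and you can also assign `dashscope.api_key` explicitly):
```python
import os
import dashscope

# Illustrative check: the DashScope SDK reads DASHSCOPE_API_KEY from the environment by default.
dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")
assert dashscope.api_key, "DASHSCOPE_API_KEY is not set"
print("DashScope API key configured")
```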
## Step 3: Store the vector data in seekdb
1. Prepare the test data.
Download the [CSV file](https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20240827/srxyhu/fine_food_reviews.csv) that already contains the vectorized data. This CSV file includes 1,000 food review entries, and the last column contains the vector values. Therefore, you do not need to calculate the vectors yourself. If you want to recalculate the embeddings for the "embedding" column (the vector column), you can use the following code to generate a new CSV file:
```python
import dashscope
import pandas as pd

input_datapath = "./fine_food_reviews.csv"

# Here the text_embedding_v1 model is used. You can change the model as needed.
def generate_embeddings(text):
    rsp = dashscope.TextEmbedding.call(model=dashscope.TextEmbedding.Models.text_embedding_v1, input=text)
    embeddings = [record['embedding'] for record in rsp.output['embeddings']]
    return embeddings if isinstance(text, list) else embeddings[0]

df = pd.read_csv(input_datapath, index_col=0)

# It takes a few minutes to generate the CSV file by calling the Tongyi Qianwen Embedding API row by row.
df["embedding"] = df.combined.apply(generate_embeddings)

output_datapath = './fine_food_reviews_self_embeddings.csv'
df.to_csv(output_datapath)
```
2. Run the following script to insert the test data into seekdb. The script must be located in the same directory as the test data.
```python
import os
import csv
import json
from pyobvector import *
from sqlalchemy import Column, Integer, String

# Use pyobvector to connect to seekdb. If @ is in the username or password, replace it with %40.
client = ObVecClient(uri="host:port", user="username", password="****", db_name="test")

# The test dataset is prepared in advance and has been vectorized. By default, it is placed in the same
# directory as the Python script. If you have vectorized it yourself, replace it with the corresponding file.
file_name = "fine_food_reviews.csv"
file_path = os.path.join("./", file_name)

# Define the columns. The vector column is the last field.
cols = [
    Column('id', Integer, primary_key=True, autoincrement=False),
    Column('product_id', String(256), nullable=True),
    Column('user_id', String(256), nullable=True),
    Column('score', Integer, nullable=True),
    Column('summary', String(2048), nullable=True),
    Column('text', String(8192), nullable=True),
    Column('combined', String(8192), nullable=True),
    Column('n_tokens', Integer, nullable=True),
    Column('embedding', VECTOR(1536))
]

# Table name.
table_name = 'fine_food_reviews'

# If the table does not exist, create it.
if not client.check_table_exists(table_name):
    client.create_table(table_name, columns=cols)
    # Create an index for the vector column.
    client.create_index(
        table_name=table_name,
        is_vec_index=True,
        index_name='vidx',
        column_names=['embedding'],
        vidx_params='distance=l2, type=hnsw, lib=vsag',
    )

# Open and read the CSV file.
with open(file_name, mode='r', newline='', encoding='utf-8') as csvfile:
    csvreader = csv.reader(csvfile)
    # Read the header row.
    headers = next(csvreader)
    print("Headers:", headers)

    batch = []  # Buffer rows and insert them into the database every 10 rows.
    for i, row in enumerate(csvreader):
        # The CSV file has 9 fields: id, product_id, user_id, score, summary, text, combined, n_tokens, embedding.
        if not row:
            break
        food_review_line = {
            'id': row[0], 'product_id': row[1], 'user_id': row[2], 'score': row[3],
            'summary': row[4], 'text': row[5], 'combined': row[6], 'n_tokens': row[7],
            'embedding': json.loads(row[8])
        }
        batch.append(food_review_line)
        # Insert data every 10 rows.
        if (i + 1) % 10 == 0:
            client.insert(table_name, batch)
            batch = []  # Clear the buffer.

# Insert the remaining rows (if any).
if batch:
    client.insert(table_name, batch)

# Check the data in the table to ensure that all data has been inserted.
count_sql = f"select count(*) from {table_name};"
cursor = client.perform_raw_text_sql(count_sql)
result = cursor.fetchone()
print(f"Total number of inserted rows: {result[0]}")
```
## Step 4: Query seekdb data
1. Save the following Python script as `query.py`.
```python
import sys
from pyobvector import *
from sqlalchemy import func
import dashscope

# Get the command-line argument.
if len(sys.argv) != 2:
    print("Please enter a query statement.")
    sys.exit()
queryStatement = sys.argv[1]

# Use pyobvector to connect to seekdb. If the username or password contains @, replace it with %40.
client = ObVecClient(uri="host:port", user="username", password="****", db_name="test")

# Define a function to generate text vectors.
def generate_embeddings(text):
    rsp = dashscope.TextEmbedding.call(model=dashscope.TextEmbedding.Models.text_embedding_v1, input=text)
    embeddings = [record['embedding'] for record in rsp.output['embeddings']]
    return embeddings if isinstance(text, list) else embeddings[0]

def query_ob(query, tableName, vector_name="embedding", top_k=1):
    embedding = generate_embeddings(query)
    # Perform an approximate nearest neighbor search (ANNS).
    res = client.ann_search(
        table_name=tableName,
        vec_data=embedding,
        vec_column_name=vector_name,
        distance_func=func.l2_distance,
        topk=top_k,
        output_column_names=['combined']
    )
    for row in res:
        print(str(row[0]).replace("Title: ", "").replace("; Content: ", ": "))

# Table name.
table_name = 'fine_food_reviews'
query_ob(queryStatement, table_name, 'embedding', 1)
```
2. Run the script with your question to obtain the related answer.
```shell
python3 query.py 'pet food'
```
The expected result is as follows:
```shell
This is so good!: I purchased this after my sister sent a small bag to me in a gift box. I loved it so much I wanted to find it to buy for myself and keep it around. I always look on Amazon because you can find everything here and true enough, I found this wonderful candy. It is nice to keep in your purse for when you are out and about and get a dry throat or a tickle in the back of your throat. It is also nice to have in a candy dish at home for guests to try.
```