257 lines
11 KiB
Markdown
257 lines
11 KiB
Markdown
---
|
|
|
|
slug: /create-a-json-value
|
|
---
|
|
|
|
# Create a JSON value
|
|
|
|
A JSON value must be one of the following: objects (JSON objects), arrays, strings, numbers, boolean values (true/false), or the null value. Note that false, true, and the null value must be in lowercase.
|
|
|
|
## JSON text structure
|
|
|
|
A JSON text structure includes characters, strings, numbers, and three literal names. Whitespace characters (spaces, horizontal tabs, line feeds, and carriage returns) are allowed before or after any structural character.
|
|
|
|
```sql
|
|
begin-array = [ left square bracket
|
|
|
|
begin-object = { left curly bracket
|
|
|
|
end-array = ] right square bracket
|
|
|
|
end-object = } right curly bracket
|
|
|
|
name-separator = : colon
|
|
|
|
value-separator = , comma
|
|
```
|
|
|
|
### Objects
|
|
|
|
An object is represented by a pair of curly brackets containing zero or more name/value pairs (also called members). Names within an object must be unique. Each name is a string followed by a colon that separates the name from its value. Multiple name/value pairs are separated by commas.
|
|
Here is an example:
|
|
|
|
```sql
|
|
{ "NAME": "SAM", "Height": 175, "Weight": 100, "Registered" : false}
|
|
```
|
|
|
|
### Arrays
|
|
|
|
An array is represented by square brackets containing zero or more values (also called elements). Array elements are separated by commas, and values in an array do not need to be of the same type.
|
|
|
|
Here is an example:
|
|
|
|
```sql
|
|
["abc", 10, null, true, false]
|
|
```
|
|
|
|
### Numbers
|
|
|
|
Numbers use decimal format and contain an integer component that may optionally be prefixed with a minus sign (-). This can be followed by a fractional part and/or an exponent part. Leading zeros are not allowed. The fractional part consists of a decimal point followed by one or more digits. The exponent part begins with an uppercase or lowercase letter E, optionally followed by a plus (+) or minus (-) sign and one or more digits.
|
|
|
|
Here is an example:
|
|
|
|
```sql
|
|
[100, 0, -100, 100.11, -12.11, 10.22e2, -10.22e2]
|
|
```
|
|
|
|
### Strings
|
|
|
|
A string begins and ends with quotation marks ("). All Unicode characters can be placed within the quotation marks, except characters that must be escaped (including quotation marks, backslashes, and control characters).
|
|
|
|
JSON text must be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8.
|
|
|
|
Here is an example:
|
|
|
|
```sql
|
|
{"Url": "http://www.example.com/image/481989943"}
|
|
```
|
|
|
|
## Create JSON values
|
|
|
|
seekdb supports the following DDL operations on JSON types:
|
|
|
|
* Create tables with JSON columns.
|
|
|
|
* Add or drop JSON columns.
|
|
|
|
* Create indexes on generated columns based on JSON columns.
|
|
|
|
* Enable semi-structured encoding when creating tables.
|
|
|
|
* Enable semi-structured encoding on existing tables.
|
|
|
|
### Limitations
|
|
|
|
You can create multiple JSON columns in each table, with the following limitations:
|
|
|
|
* JSON columns cannot be used as `PRIMARY KEY`, `FOREIGN KEY`, or `UNIQUE KEY`, but you can add `NOT NULL` or `CHECK` constraints.
|
|
|
|
* JSON columns cannot have default values.
|
|
|
|
* JSON columns cannot be used as partitioning keys.
|
|
|
|
* The length of JSON data cannot exceed the length of `LONGTEXT`, and the maximum depth of each JSON object or array is 99.
|
|
|
|
### Examples
|
|
|
|
#### Create or modify JSON columns
|
|
|
|
```sql
|
|
obclient> CREATE TABLE tbl1 (id INT PRIMARY KEY, docs JSON NOT NULL, docs1 JSON);
|
|
Query OK, 0 rows affected
|
|
|
|
obclient> ALTER TABLE tbl1 MODIFY docs JSON CHECK(docs <'{"a" : 100}');
|
|
Query OK, 0 rows affected
|
|
|
|
obclient> CREATE TABLE json_tab(
|
|
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT COMMENT 'Primary key',
|
|
json_info JSON COMMENT 'JSON data',
|
|
json_id INT GENERATED ALWAYS AS (json_info -> '$.id') COMMENT 'Virtual field from JSON data',
|
|
json_name VARCHAR(5) GENERATED ALWAYS AS (json_info -> '$.NAME'),
|
|
index json_info_id_idx (json_id)
|
|
)COMMENT 'Example JSON table';
|
|
Query OK, 0 rows affected
|
|
|
|
obclient> ALTER TABLE json_tab ADD COLUMN json_info1 JSON;
|
|
Query OK, 0 rows affected
|
|
|
|
obclient> ALTER TABLE json_tab ADD INDEX (json_name);
|
|
Query OK, 0 rows affected
|
|
|
|
obclient> ALTER TABLE json_tab drop COLUMN json_info1;
|
|
Query OK, 0 rows affected
|
|
```
|
|
|
|
#### Create an index on a specific key using a generated column
|
|
|
|
```sql
|
|
obclient> CREATE TABLE jn ( c JSON, g INT GENERATED ALWAYS AS (c->"$.id"));
|
|
Query OK, 0 rows affected
|
|
|
|
obclient> CREATE INDEX idx1 ON jn(g);
|
|
Query OK, 0 rows affected
|
|
Records: 0 Duplicates: 0 Warnings: 0
|
|
|
|
obclient> INSERT INTO jn (c) VALUES
|
|
('{"id": "1", "name": "Fred"}'), ('{"id": "2", "name": "Wilma"}'),
|
|
('{"id": "3", "name": "Barney"}'), ('{"id": "4", "name": "Betty"}');
|
|
Query OK, 4 rows affected
|
|
Records: 4 Duplicates: 0 Warnings: 0
|
|
|
|
obclient> SELECT c->>"$.name" AS name FROM jn WHERE g <= 2;
|
|
+-------+
|
|
| name |
|
|
+-------+
|
|
| Fred |
|
|
| Wilma |
|
|
+-------+
|
|
2 rows in set
|
|
|
|
obclient> EXPLAIN SELECT c->>"$.name" AS name FROM jn WHERE g <= 2\G
|
|
*************************** 1. row ***************************
|
|
Query Plan: =========================================
|
|
|ID|OPERATOR |NAME |EST. ROWS|COST|
|
|
-----------------------------------------
|
|
|0 |TABLE SCAN|jemp(idx1)|2 |92 |
|
|
=========================================
|
|
|
|
Outputs & filters:
|
|
-------------------------------------
|
|
0 - output([JSON_UNQUOTE(JSON_EXTRACT(jemp.c, '$.name'))]), filter(nil),
|
|
access([jemp.c]), partitions(p0)
|
|
|
|
1 row in set
|
|
```
|
|
|
|
#### Use semi-structured encoding
|
|
|
|
seekdb supports enabling semi-structured encoding when creating tables, primarily controlled by the table-level parameter `SEMISTRUCT_PROPERTIES`. You must also set `ROW_FORMAT=COMPRESSED` for the table, otherwise an error will occur:
|
|
|
|
* When `SEMISTRUCT_PROPERTIES=(encoding_type=encoding)`, the table is considered a semi-structured table, meaning all JSON columns in the table will have semi-structured encoding enabled.
|
|
* When `SEMISTRUCT_PROPERTIES=(encoding_type=none)`, the table is considered a structured table.
|
|
* You can also set the frequency threshold using the `freq_threshold` parameter.
|
|
* Currently, `encoding_type` and `freq_threshold` can only be modified using online DDL statements, not offline DDL statements.
|
|
|
|
1. Enable semi-structured encoding.
|
|
|
|
:::tip
|
|
If you enable semi-structured encoding, make sure that the parameter <a href="https://en.oceanbase.com/docs/common-oceanbase-database-10000000001971939">micro_block_merge_verify_level</a> is set to the default value <code>2</code>. Do not disable micro-block major compaction verification.
|
|
:::
|
|
|
|
:::tab
|
|
tab Example: Enable semi-structured encoding during table creation
|
|
|
|
```sql
|
|
CREATE TABLE t1( j json)
|
|
ROW_FORMAT=COMPRESSED
|
|
SEMISTRUCT_PROPERTIES=(encoding_type=encoding, freq_threshold=50);
|
|
```
|
|
|
|
For more information about the syntax, see [CREATE TABLE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974140).
|
|
|
|
tab Example: Enable semi-structured encoding for existing table
|
|
|
|
```sql
|
|
CREATE TABLE t1(j json);
|
|
ALTER TABLE t1 SET ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES = (encoding_type=encoding, freq_threshold=50);
|
|
```
|
|
|
|
For more information about the syntax, see [ALTER TABLE](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001974126).
|
|
|
|
Some modification limitations:
|
|
|
|
* If semi-structured encoding is not enabled, modifying the frequent column threshold will not report an error but will have no effect.
|
|
* The `freq_threshold` parameter cannot be modified during direct load operations or when the table is locked.
|
|
* Modifying one sub-parameter does not affect the others.
|
|
:::
|
|
|
|
2. Disable semi-structured encoding.
|
|
|
|
When `SEMISTRUCT_PROPERTIES` is set to `(encoding_type=none)`, semi-structured encoding is disabled. This operation does not affect existing data and only applies to data written afterward. Here is an example of disabling semi-structured encoding:
|
|
|
|
```sql
|
|
ALTER TABLE t1 SET ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES = (encoding_type=none);
|
|
```
|
|
|
|
3. Query semi-structured encoding configuration.
|
|
|
|
Use the `SHOW CREATE TABLE` statement to query the semi-structured encoding configuration. Here is an example statement:
|
|
|
|
```sql
|
|
SHOW CREATE TABLE t1;
|
|
```
|
|
|
|
The result is as follows:
|
|
|
|
```shell
|
|
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
| Table | Create Table |
|
|
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
| t1 | CREATE TABLE `t1` (
|
|
`j` json DEFAULT NULL
|
|
) ORGANIZATION INDEX DEFAULT CHARSET = utf8mb4 ROW_FORMAT = COMPRESSED COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE ENABLE_MACRO_BLOCK_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 SEMISTRUCT_PROPERTIES=(ENCODING_TYPE=ENCODING, FREQ_THRESHOLD=50) |
|
|
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
1 row in set
|
|
```
|
|
|
|
When `SEMISTRUCT_PROPERTIES=(encoding_type=encoding)` is specified, the query displays this parameter, indicating that semi-structured encoding is enabled.
|
|
|
|
Using semi-structured encoding can improve the performance of conditional filtering queries with the [JSON_VALUE() function](https://en.oceanbase.com/docs/common-oceanbase-database-10000000001975890). Based on JSON semi-structured encoding technology, seekdb optimizes the performance of `JSON_VALUE` expression conditional filtering query scenarios. Since JSON data is split into sub-columns, the system can filter directly based on the encoded sub-column data without reconstructing the complete JSON structure, significantly improving query efficiency.
|
|
|
|
Here is an example query:
|
|
|
|
```sql
|
|
-- Query rows where the value of the name field is 'Devin'
|
|
SELECT * FROM t WHERE JSON_VALUE(j_doc, '$.name' RETURNING CHAR) = 'Devin';
|
|
```
|
|
|
|
Character set considerations:
|
|
|
|
- seekdb uses `utf8_bin` encoding for JSON.
|
|
|
|
- To ensure string whitebox filtering works properly, we recommend the following settings:
|
|
|
|
```sql
|
|
SET @@collation_server = 'utf8mb4_bin';
|
|
SET @@collation_connection='utf8mb4_bin';
|
|
``` |