JSON and Generated ColumnsΒΆ

The MySQL Server supports schemaless data storage with the addition of the following features:

  1. A JSON data type - JSON values are parsed on insert/update, validated, and then stored in a binary optimized format. The JSON data type does not require any parsing or validation when reading values.
  2. JSON functions - A set of 20+ SQL functions to search, manipulate and create JSON values.
  3. Generated columns - While not specifically tied to JSON, generated columns work similar to functional indexes and allow part of a JSON document to be extracted and indexed.

The optimizer will automatically look for matching indexes from generated columns when querying JSON data [1] . In Example 35, user preferences are stored in a JSON column. Initially querying for users who have requested to be notified on updates (notify_on_updates) results in a table scan operation. By adding a virtual generated column with an index, EXPLAIN shows that an index can now be used.

Example 35: A schemaless representation of user preferences

CREATE TABLE users (
  id INT NOT NULL auto_increment,
  username VARCHAR(32) NOT NULL,
  preferences JSON NOT NULL,
  PRIMARY KEY (id),
  UNIQUE (username)
);

INSERT INTO users
 (id,username,preferences)
VALUES
 (NULL, 'morgan', '{"layout": "horizontal", "warn_before_delete": false, "notify_on_updates": true}'),
 (NULL, 'wes', '{"layout": "horizontal", "warn_before_delete": false, "notify_on_updates": false}'),
 (NULL, 'jasper', '{"layout": "horizontal", "warn_before_delete": false, "notify_on_updates": false}'),
 (NULL, 'gus', '{"layout": "horizontal", "warn_before_delete": false, "notify_on_updates": false}'),
 (NULL, 'olive', '{"layout": "horizontal", "warn_before_delete": false, "notify_on_updates": false}');

EXPLAIN FORMAT=JSON
SELECT * FROM users WHERE preferences->"$.notify_on_updates" = true;
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "2.00"
    },
    "table": {
      "table_name": "users",
      "access_type": "ALL",
      "rows_examined_per_scan": 5,
      "rows_produced_per_join": 5,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "1.00",
        "eval_cost": "1.00",
        "prefix_cost": "2.00",
        "data_read_per_join": "280"
      },
      "used_columns": [
        "id",
        "username",
        "preferences"
      ],
      "attached_condition": "(json_extract(`test`.`users`.`preferences`,'$.notify_on_updates') = TRUE)"
    }
  }
}

Example 36: Adding a virtual generated column with an index

ALTER TABLE users ADD notify_on_updates TINYINT AS (preferences->"$.notify_on_updates"),
 ADD INDEX(notify_on_updates);

EXPLAIN FORMAT=JSON SELECT * FROM users WHERE preferences->"$.notify_on_updates" = true;
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "1.20"
    },
    "table": {
      "table_name": "users",
      "access_type": "ref",
      "possible_keys": [
        "notify_on_updates"
      ],
      "key": "notify_on_updates",
      "used_key_parts": [
        "notify_on_updates"
      ],
      "key_length": "2",
      "ref": [
        "const"
      ],
      "rows_examined_per_scan": 1,
      "rows_produced_per_join": 1,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "1.00",
        "eval_cost": "0.20",
        "prefix_cost": "1.20",
        "data_read_per_join": "56"
      },
      "used_columns": [
        "id",
        "username",
        "preferences",
        "notify_on_updates"
      ]
    }
  }
}
[1]Examples are using the short-hand JSON_EXTRACT operator (->). When extracting strings, you should use the short-hand extract and unquote operator (->>).