Optimizer TraceΒΆ

EXPLAIN only shows the intended execution plan for a query. What is not shown is why some of the alternative execution strategies were not chosen. Understanding why an alternative was not chosen can be perplexing:

  • Was it because the alternative was not suitable (i.e., some optimizations can only apply to specific use cases)?
  • Was it because the alternative was deemed to be a higher cost?
  • If the alternative was a higher cost, by how much?

OPTIMIZER_TRACE provides answers to these questions. It was designed to provide more diagnostic data on the optimizer, but it can be useful for practical troubleshooting as well as learning how the optimizer cost model works.

Example 2: EXPLAIN shows a newly added index is not used

ALTER TABLE Country ADD INDEX p (population);
EXPLAIN FORMAT=JSON
SELECT * FROM Country WHERE continent='Asia' and population > 5000000;

{
  "query_block": {
   "select_id": 1,
   "cost_info": {
   "query_cost": "53.80"
   },
   "table": {
   "table_name": "Country",
   "access_type": "ALL",       # This query is executing as a table scan
   "possible_keys": [          # Even though the optimizer
      "p"                      # recognizes an index is available!
   ],
   "rows_examined_per_scan": 239,
   "rows_produced_per_join": 15,
   "filtered": "6.46",
   "cost_info": {
      "read_cost": "50.71",
      "eval_cost": "3.09",
      "prefix_cost": "53.80",
      "data_read_per_join": "3K"
   },
   "used_columns": [
      "Code",
      "Name",
      "Continent",
      "Region",
      "SurfaceArea",
      "IndepYear",
      "Population",
      "LifeExpectancy",
      "GNP",
      "GNPOld",
      "LocalName",
      "GovernmentForm",
      "HeadOfState",
      "Capital",
      "Code2"
   ],
   "attached_condition": "((`world`.`Country`.`Continent` = 'Asia') and (`world`.`Country`.`Population` > 5000000))"
   }
  }
}

In Example 2 we can see that the index p (Population) is not selected after it has been added to the table. EXPLAIN shows that it is a candidate, but it does not describe why it was not selected. To find the reason, we need to turn to the OPTIMIZER_TRACE.

Example 3: OPTIMIZER_TRACE showing why the index was not used

SET optimizer_trace="enabled=on"
SELECT * FROM Country WHERE continent='Asia' and population > 5000000;
SELECT * FROM information_schema.optimizer_trace;
{
  "steps": [
   {
   "join_preparation": {
      "select#": 1,
      "steps": [
         {
         "expanded_query": "/* select#1 */ select `Country`.`Code` AS `Code`,`Country`.`Name` AS `Name`,`Country`.`Continent` AS `Continent`,`Country`.`Region` AS `Region`,`Country`.`SurfaceArea` AS `SurfaceArea`,`Country`.`IndepYear` AS `IndepYear`,`Country`.`Population` AS `Population`,`Country`.`LifeExpectancy` AS `LifeExpectancy`,`Country`.`GNP` AS `GNP`,`Country`.`GNPOld` AS `GNPOld`,`Country`.`LocalName` AS `LocalName`,`Country`.`GovernmentForm` AS `GovernmentForm`,`Country`.`HeadOfState` AS `HeadOfState`,`Country`.`Capital` AS `Capital`,`Country`.`Code2` AS `Code2` from `Country` where ((`Country`.`Continent` = 'Asia') and (`Country`.`Population` > 5000000))"
         }
      ]
   }
   },
   {
   "join_optimization": {
      "select#": 1,
      "steps": [
         {
         "condition_processing": {
            "condition": "WHERE",
            "original_condition": "((`Country`.`Continent` = 'Asia') and (`Country`.`Population` > 5000000))",
            "steps": [
               {
               "transformation": "equality_propagation",
               "resulting_condition": "((`Country`.`Population` > 5000000) and multiple equal('Asia', `Country`.`Continent`))"
               },
               {
               "transformation": "constant_propagation",
               "resulting_condition": "((`Country`.`Population` > 5000000) and multiple equal('Asia', `Country`.`Continent`))"
               },
               {
               "transformation": "trivial_condition_removal",
               "resulting_condition": "((`Country`.`Population` > 5000000) and multiple equal('Asia', `Country`.`Continent`))"
               }
            ]
         }
         },
         {
         "substitute_generated_columns": {}
         },
         {
         "table_dependencies": [
            {
               "table": "`Country`",
               "row_may_be_null": false,
               "map_bit": 0,
               "depends_on_map_bits": []
            }
         ]
         },
         {
         "ref_optimizer_key_uses": []
         },
         {
         "rows_estimation": [
            {
               "table": "`Country`",
               "range_analysis": {
               "table_scan": {
                  "rows": 239,
                  "cost": 55.9
               },
               "potential_range_indexes": [
                  {
                     "index": "PRIMARY",
                     "usable": false,
                     "cause": "not_applicable"
                  },
                  {
                     "index": "p",
                     "usable": true,
                     "key_parts": [
                     "Population",
                     "Code"
                     ]
                  }
               ],
               "setup_range_conditions": [],
               "group_index_range": {
                  "chosen": false,
                  "cause": "not_group_by_or_distinct"
               },
               "analyzing_range_alternatives": {
                  "range_scan_alternatives": [
                     {
                     "index": "p",
                     "ranges": [
                        "5000000 < Population"
                     ],
                     "index_dives_for_eq_ranges": true,
                     "rowid_ordered": false,
                     "using_mrr": false,
                     "index_only": false,
                     "rows": 108,
                     "cost": 130.61,             # This is the cost to use the index
                     "chosen": false,            # It was not chosen because
                     "cause": "cost"             # it costs more than a tablescan!
                     }
                  ],
                  "analyzing_roworder_intersect": {
                     "usable": false,
                     "cause": "too_few_roworder_scans"
                  }
               }
               }
            }
         ]
         },
         {
         "considered_execution_plans": [
            {
               "plan_prefix": [],
               "table": "`Country`",
               "best_access_path": {
               "considered_access_paths": [
                  {
                     "rows_to_scan": 239,
                     "access_type": "scan",
                     "resulting_rows": 239,
                     "cost": 53.8,
                     "chosen": true
                  }
               ]
               },
               "condition_filtering_pct": 100,
               "rows_for_plan": 239,
               "cost_for_plan": 53.8,
               "chosen": true
            }
         ]
         },
         {
         "attaching_conditions_to_tables": {
            "original_condition": "((`Country`.`Continent` = 'Asia') and (`Country`.`Population` > 5000000))",
            "attached_conditions_computation": [],
            "attached_conditions_summary": [
               {
               "table": "`Country`",
               "attached": "((`Country`.`Continent` = 'Asia') and (`Country`.`Population` > 5000000))"
               }
            ]
         }
         },
         {
         "refine_plan": [
            {
               "table": "`Country`"
            }
         ]
         }
      ]
   }
   },
   {
   "join_execution": {
      "select#": 1,
      "steps": []
   }
   }
  ]
}

In Example 3 range_scan_alternatives lists p (Population) was considered but eliminated based on cost: ("chosen": false and "cause": "cost"). The output even provides an estimated cost for using this index: 130.61 cost units. This compares to 55.9 cost units for a table scan. Lower is better, so a table scan is preferred.

To describe why this is, we need to first understand that indexes need to eliminate work. In this dataset most countries have a population > 5M. The optimizer has decided that it is faster to table scan than to jump back and forth between the index and the data. The costs of how these decisions are made are configurable to advanced users.

Non selective index