You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Bug#36189820 Server crash happened while running 'explain format=json for connection xxx'.
The new iterator based explains are not impacted.
The issue here is a race condition. More than one thread is using the
query term iterator at the same time (whoch is neithe threas safe nor
reantrant), and part of its state is in the query terms being visited
which leads to interference/race conditions.
a) the explain thread
uses an iterator here:
Sql_cmd_explain_other_thread::execute
is inspecting the Query_expression of the running query
calling master_query_expression()->find_blocks_query_term which uses
an iterator over the query terms in the query expression:
for (auto qt : query_terms<>()) {
if (qt->query_block() == qb) {
return qt;
}
}
the above search fails to find qb due to the interference of the
thread b), see below, and then tries to access a nullpointer:
* thread #36, name = ‘connection’, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11
frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50
frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21
frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26
frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32
frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9
frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14
frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20
frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11
frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13
frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7
frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17
frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13
* frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11
frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29
frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19
frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7
frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18
frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13
frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3
frame #20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99
frame #21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15
b) the query thread being explained is itself performing LEX::cleanup
and as part of the iterates over the query terms, but still allows
EXPLAIN of the query plan since
thd->query_plan.set_query_plan(SQLCOM_END, ...)
hasn't been called yet.
20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613)
21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861)
22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286)
30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799)
31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648)
32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303)
33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135)
34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464)
57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304)
58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053)
65:frame: _pthread_start (in libsystem_pthread.dylib) + 99
66:frame: thread_start (in libsystem_pthread.dylib) + 15
Solution:
This patch solves the issue by removing iterator state from
Query_term, making the query_term iterators thread safe. This solution
labels every child query_term with its index in its parent's
m_children vector. The iterator can therefore easily compute the next
child to visit based on Query_term::m_sibling_idx.
A unit test case is added to check reentrancy.
One can also manually verify that we have no remaining race condition
by running two client connections files (with \. <file>) with a big
number of copies of the repro query in one connection and a big number
of EXPLAIN format=json FOR <connection>, e.g.
EXPLAIN FORMAT=json FOR CONNECTION 8\G
in the other. The actual connection number would need to verified
in connection one, of course.
Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
0 commit comments