Skip to content

Commit 01a8081

Browse files
author
Dag Wanvik
committed
Bug#36189820 Server crash happened while running 'explain format=json for connection xxx'.
The new iterator based explains are not impacted. The issue here is a race condition. More than one thread is using the query term iterator at the same time (whoch is neithe threas safe nor reantrant), and part of its state is in the query terms being visited which leads to interference/race conditions. a) the explain thread uses an iterator here: Sql_cmd_explain_other_thread::execute is inspecting the Query_expression of the running query calling master_query_expression()->find_blocks_query_term which uses an iterator over the query terms in the query expression: for (auto qt : query_terms<>()) { if (qt->query_block() == qb) { return qt; } } the above search fails to find qb due to the interference of the thread b), see below, and then tries to access a nullpointer: * thread #36, name = ‘connection’, stop reason = EXC_BAD_ACCESS (code=1, address=0x0) frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11 frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50 frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21 frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26 frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32 frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9 frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14 frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20 frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11 frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13 frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7 frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17 frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13 * frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11 frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29 frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19 frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7 frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18 frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13 frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3 frame #20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99 frame #21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15 b) the query thread being explained is itself performing LEX::cleanup and as part of the iterates over the query terms, but still allows EXPLAIN of the query plan since thd->query_plan.set_query_plan(SQLCOM_END, ...) hasn't been called yet. 20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613) 21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861) 22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286) 30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799) 31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648) 32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303) 33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135) 34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464) 57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304) 58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053) 65:frame: _pthread_start (in libsystem_pthread.dylib) + 99 66:frame: thread_start (in libsystem_pthread.dylib) + 15 Solution: This patch solves the issue by removing iterator state from Query_term, making the query_term iterators thread safe. This solution labels every child query_term with its index in its parent's m_children vector. The iterator can therefore easily compute the next child to visit based on Query_term::m_sibling_idx. A unit test case is added to check reentrancy. One can also manually verify that we have no remaining race condition by running two client connections files (with \. <file>) with a big number of copies of the repro query in one connection and a big number of EXPLAIN format=json FOR <connection>, e.g. EXPLAIN FORMAT=json FOR CONNECTION 8\G in the other. The actual connection number would need to verified in connection one, of course. Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
1 parent c6c3296 commit 01a8081

File tree

6 files changed

+109
-35
lines changed

6 files changed

+109
-35
lines changed

sql/parse_tree_nodes.cc

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1773,6 +1773,7 @@ bool PT_set_operation::contextualize_setop(Parse_context *pc,
17731773
if (setop == nullptr) return true;
17741774

17751775
merge_descendants(pc, setop, ql);
1776+
setop->label_children();
17761777

17771778
Query_expression *qe = pc->select->master_query_expression();
17781779
if (setop->set_block(qe->create_post_processing_block(setop))) return true;

sql/query_term.cc

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -79,8 +79,11 @@ Query_term *Query_term::pushdown_limit_order_by(Query_term_set_op *parent) {
7979
case QT_EXCEPT: {
8080
auto setop = down_cast<Query_term_set_op *>(this);
8181
for (Query_term *&child : setop->m_children) {
82+
const uint sibling_idx = child->sibling_idx();
8283
child = child->pushdown_limit_order_by(
8384
down_cast<Query_term_set_op *>(this));
85+
// Make sure the new child inherits the old child's sibling index
86+
child->set_sibling_idx(sibling_idx);
8487
}
8588
} break;
8689
case QT_UNARY: {

sql/query_term.h

Lines changed: 69 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -260,16 +260,28 @@ class Query_term {
260260
*/
261261
virtual size_t child_count() const { return 0; }
262262

263+
/// Set the correct value of \c Query_term::m_sibling_idx recursively for
264+
/// set operations. For \c Query_term_unary, this is done in its constructor.
265+
/// A no-op for \c Query_block. See also \c set_sibling_idx.
266+
virtual void label_children() = 0;
267+
263268
protected:
264269
/**
265270
Back pointer to the node whose child we are, or nullptr (root term).
266271
*/
267272
Query_term_set_op *m_parent{nullptr};
268273

274+
/// If parent is non-null, this holds the index of the current sibling.
275+
/// Used for efficient iterator traversal up and down the tree.
276+
uint m_sibling_idx{0};
277+
269278
public:
270279
/// Getter for m_parent, q.v.
271280
Query_term_set_op *parent() const { return m_parent; }
272-
281+
/// Setter for \c m_sibling_idx, q.v.
282+
void set_sibling_idx(uint idx) { m_sibling_idx = idx; }
283+
/// Getter for \c m_sibling_idx, q.v.
284+
uint sibling_idx() { return m_sibling_idx; }
273285
/**
274286
Reset resources used.
275287
@param full do full cleanup. Same semantics as for Query_expression's
@@ -384,13 +396,6 @@ class Query_term {
384396
void set_fields(mem_root_deque<Item *> *fields) { m_fields = fields; }
385397
// Getter for m_fields, q.v.
386398
mem_root_deque<Item *> *fields() { return m_fields; }
387-
388-
private:
389-
/// class Query_terms iterator traversal state variable, hence friend
390-
/// declaration immediately below
391-
uint m_curr_id;
392-
template <Visit_order order, Visit_leaves visit_leaves>
393-
friend class Query_terms;
394399
};
395400

396401
/// Common base class for n-ary set operations, including unary.
@@ -441,6 +446,14 @@ class Query_term_set_op : public Query_term {
441446
return false;
442447
}
443448

449+
void label_children() override {
450+
uint idx = 0;
451+
for (auto child : m_children) {
452+
child->set_sibling_idx(idx++);
453+
child->label_children();
454+
}
455+
}
456+
444457
size_t child_count() const override { return m_children.size(); }
445458
bool open_result_tables(THD *thd, int level) override;
446459
void cleanup(bool full) override;
@@ -549,6 +562,7 @@ class Query_term_unary : public Query_term_set_op {
549562
: Query_term_set_op(mem_root) {
550563
m_last_distinct = 0;
551564
m_children.push_back(t);
565+
t->set_sibling_idx(0);
552566
}
553567
Query_term_type term_type() const override { return QT_UNARY; }
554568
const char *operator_string() const override { return "result"; }
@@ -559,7 +573,7 @@ class Query_term_unary : public Query_term_set_op {
559573
Containing class for iterator over the query term tree. The structure is
560574
in part dictated by C++ conventions for iterators.
561575
@tparam visit_order indicates whether pre or post order visiting is requested
562-
@tparam visit_leaves indicated whether to visit the leaf nodes (query blocks)
576+
@tparam visit_leaves indicates whether to visit the leaf nodes (query blocks)
563577
*/
564578
template <Visit_order visit_order, Visit_leaves visit_leaves>
565579
class Query_terms {
@@ -580,50 +594,38 @@ class Query_terms {
580594
*/
581595
Query_term_iterator(Query_term *root) {
582596
m_current = root;
583-
if (m_current != nullptr) m_current->m_curr_id = 0;
597+
m_child_idx = 0;
584598
if constexpr (visit_order == QTC_POST_ORDER) {
585599
// start at left-most leaf node
586-
while (m_current != nullptr &&
587-
m_current->term_type() != QT_QUERY_BLOCK) {
588-
m_current = down_cast<Query_term_set_op *>(m_current)
589-
->m_children[m_current->m_curr_id];
590-
m_current->m_curr_id = 0;
591-
}
600+
dive_to_leftmost_leaf_of_child();
592601
if constexpr (visit_leaves == VL_SKIP_LEAVES) operator++();
593602
}
594603
}
595604

596605
Query_term_iterator() = default;
597606

598607
Query_term_iterator &operator++() {
599-
if (m_current == nullptr) {
600-
return *this;
601-
}
608+
assert(m_current != nullptr);
602609
while (m_current != nullptr) {
603610
uint children = m_current->child_count();
604611
if constexpr (visit_order == QTC_PRE_ORDER) {
605-
while (m_current != nullptr && (m_current->m_curr_id >= children)) {
612+
while (m_current != nullptr && m_child_idx >= children) {
606613
// no more at this level, go back up
607-
m_current = m_current->m_parent;
614+
prepare_for_next_sibling();
608615
if (m_current != nullptr) {
609616
children = m_current->child_count();
610617
}
611618
}
612619

613620
if (m_current == nullptr) return *this;
614621
m_current = down_cast<Query_term_set_op *>(m_current)
615-
->m_children[m_current->m_curr_id++];
616-
m_current->m_curr_id = 0;
622+
->m_children[m_child_idx];
623+
m_child_idx = 0;
617624
} else {
618-
m_current = m_current->m_parent;
625+
prepare_for_next_sibling();
619626
if (m_current == nullptr) return *this;
620-
if (++m_current->m_curr_id < m_current->child_count()) {
621-
while (m_current != nullptr &&
622-
m_current->term_type() != QT_QUERY_BLOCK) {
623-
m_current = down_cast<Query_term_set_op *>(m_current)
624-
->m_children[m_current->m_curr_id];
625-
m_current->m_curr_id = 0;
626-
}
627+
if (m_child_idx < m_current->child_count()) {
628+
dive_to_leftmost_leaf_of_child();
627629
} else {
628630
// return non-leaf
629631
}
@@ -645,8 +647,43 @@ class Query_terms {
645647
bool operator!=(const Query_term_iterator &other) const {
646648
return !((*this) == other);
647649
}
648-
/// Iterator state consists of this and Query_term::m_curr_id
650+
651+
private:
652+
/// Iterator state consists of the next two member variables
649653
Query_term *m_current{nullptr};
654+
655+
/// Used to find next child node to dive into, see \c set_next_child_idx
656+
uint m_child_idx{0};
657+
658+
/// Starting at \c m_current->m_children[m_child_index], dive down through
659+
/// any left-most (index == 0) further children till we find left-most leaf
660+
/// term (a \c Query_block) under the child pointed to by
661+
/// \c m_child_index. After the dive, \c m_child_index will be zero, and
662+
/// \c m_current will be set to the leaf term.
663+
void dive_to_leftmost_leaf_of_child() {
664+
while (m_current != nullptr && m_current->term_type() != QT_QUERY_BLOCK) {
665+
m_current =
666+
down_cast<Query_term_set_op *>(m_current)->m_children[m_child_idx];
667+
m_child_idx = 0;
668+
}
669+
}
670+
671+
/// Find the index of the next sibling, if any, of \c m_current qua child
672+
/// of its parent, so we can visit it: assign it to \c m_child_idx.
673+
/// Setting \c m_child_idx to a value of \c Query_term::child_count means we
674+
/// signal that we are done. Also, \c m_current is set to its parent.
675+
void prepare_for_next_sibling() {
676+
assert(m_current != nullptr);
677+
if (m_current->parent() == nullptr) {
678+
m_current = m_current->parent();
679+
return;
680+
}
681+
assert(m_current->parent()->m_children[m_current->sibling_idx()] ==
682+
m_current);
683+
m_child_idx = m_current->sibling_idx() + 1;
684+
685+
m_current = m_current->parent();
686+
}
650687
};
651688

652689
public:

sql/sql_lex.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1183,6 +1183,7 @@ class Query_block : public Query_term {
11831183
Query_block *query_block() const override {
11841184
return const_cast<Query_block *>(this);
11851185
}
1186+
void label_children() override {}
11861187
void destroy_tree() override { m_parent = nullptr; }
11871188

11881189
bool open_result_tables(THD *, int) override;

sql/sql_union.cc

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -689,9 +689,9 @@ bool Query_expression::can_materialize_directly_into_result() const {
689689
}
690690

691691
/**
692-
Prepares all query blocks of a query expression, including
693-
fake_query_block. If a recursive query expression, this also creates the
694-
materialized temporary table.
692+
Prepares all query blocks of a query expression.
693+
If a recursive query expression, this also creates the materialized temporary
694+
table.
695695
696696
@param thd Thread handler
697697
@param sel_result Result object where the unit's output should go.

unittest/gunit/union_syntax-t.cc

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -342,4 +342,36 @@ TEST_F(UnionSyntaxTest, InnerVsOuterOrder) {
342342
// EXPECT_EQ(2, get_limit(query_expression->fake_query_block));
343343
}
344344

345+
TEST_F(UnionSyntaxTest, QueryTermIteratorReentrancy) {
346+
Query_block *query_block = parse(
347+
"(SELECT * FROM r UNION ALL SELECT * FROM s ORDER BY a LIMIT 10)"
348+
" UNION ALL "
349+
" (SELECT * FROM r UNION DISTINCT SELECT * FROM s) LIMIT 7");
350+
351+
Query_expression *qe = query_block->master_query_expression();
352+
Query_terms<QTC_POST_ORDER, VL_VISIT_LEAVES> terms(qe->query_term());
353+
// set of nodes collected without any "interference"
354+
std::vector<Query_term *> nodes_a_priori;
355+
std::vector<Query_term *> nodes_outer;
356+
std::vector<Query_term *> nodes_inner;
357+
358+
for (Query_term *term1 : terms) {
359+
nodes_a_priori.push_back(term1);
360+
}
361+
362+
EXPECT_EQ(7, nodes_a_priori.size());
363+
364+
for (Query_term *term1 : terms) {
365+
// run a second iterator over the same query terms and verify that it
366+
// doesn't interfere with the outer iterator's job
367+
for (Query_term *term2 : terms) {
368+
nodes_inner.push_back(term2);
369+
}
370+
EXPECT_EQ(nodes_a_priori, nodes_inner);
371+
nodes_inner.clear();
372+
nodes_outer.push_back(term1);
373+
}
374+
EXPECT_EQ(nodes_a_priori, nodes_outer);
375+
}
376+
345377
} // namespace union_syntax_unittest

0 commit comments

Comments
 (0)