
Make auth_socket more flexible. #1


Closed
wants to merge 1 commit

Conversation

dveeden

@dveeden dveeden commented Oct 27, 2014

Now supports "AS '<user>'" so it doesn't have to be a 1-on-1 mapping.

Example:
CREATE USER 'foo'@'localhost' IDENTIFIED WITH auth_socket AS 'myuser';

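For context, a minimal sketch of both forms (hedged; assumes the auth_socket plugin is installed and that "myuser" is the operating-system account that should map to the MySQL account):

CREATE USER 'foo'@'localhost' IDENTIFIED WITH auth_socket;
-- 1-on-1 mapping: only the OS user "foo" can log in as 'foo'@'localhost'

CREATE USER 'bar'@'localhost' IDENTIFIED WITH auth_socket AS 'myuser';
-- with this patch: the OS user "myuser" can log in as 'bar'@'localhost'
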
@roel666

roel666 commented Oct 27, 2014

Thanks

@dveeden
Author

dveeden commented Oct 27, 2014

The MySQL Bug ID for this:
http://bugs.mysql.com/74586

@MyDanny
Member

MyDanny commented Oct 29, 2014

Thank you. Unfortunately, we can't accept pull requests on GitHub. I see you've already submitted your patch via the bug tracker, so I'm closing this pull request.

@MyDanny MyDanny closed this Oct 29, 2014
MyDanny pushed a commit that referenced this pull request Mar 10, 2015
…rows.

Background 1:

When binlog_format = row, CREATE ... SELECT is logged in two pieces,
like:

Anonymous_Gtid
query_log_event(CREATE TABLE without SELECT)
Anonymous_Gtid
query_log_event(BEGIN)
...row events...
query_log_event(COMMIT) (or Xid_log_event)

Internally, there is a call to MYSQL_BIN_LOG::commit after the table has
been created and before the rows are selected.
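
A minimal sketch of a statement with this shape (table t1 is assumed to exist
and the GTID settings are assumed to permit CREATE ... SELECT):

  SET SESSION binlog_format = ROW;
  CREATE TABLE t2 SELECT * FROM t1;
  -- SHOW BINLOG EVENTS then shows the Query event for the CREATE TABLE part
  -- first, followed by a separate BEGIN ... row events ... COMMIT/Xid group.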

When gtid_next='ANONYMOUS', we must not release anonymous ownership for
the commit occurring in the middle of the statement (since that would
allow a concurrent client to set gtid_mode=on, making it impossible to
commit the rest of the statement). Also, the commit in the middle of the
statement should not decrease the counter of ongoing GTID-violating
transactions, since that would allow a concurrent client to set
ENFORCE_GTID_CONSISTENCY=ON even if there is an ongoing transaction that
violates GTID-consistency.

The logic to skip releasing anonymous ownership and skip decreasing the
counter is as follows. Before calling mysql_bin_log.commit, it sets the
flag thd->is_commit_in_middle_of_statement. Eventually,
mysql_bin_log.commit calls gtid_state->update_on_commit, which calls
gtid_state->update_gtids_impl, which reads the
thd->is_commit_in_middle_of_statement and accordingly decides to skip
releasing anonymous ownership and/or skips decreasing the counter.

Problem 1:

When thd->is_commit_in_middle_of_statement has been set, it is crucial
that there is another call to update_gtids_impl when the transaction
ends (otherwise the session will keep holding anonymous ownership and
will not decrease the counters). Normally, this happens because
mysql_bin_log.commit is always called, and mysql_bin_log.commit normally
invokes ordered_commit, which calls update_gtids_impl. However, in case
the SELECT part of the statement does not find any rows,
mysql_bin_log.commit skips the call to ordered_commit, so
update_gtids_impl does not get called. This is the first problem we fix
in this commit.

Fix 1:

We fix this problem as follows. After calling mysql_bin_log.commit to
log the CREATE part of CREATE...SELECT, the CREATE...SELECT code sets
thd->pending_gtid_state_update=true (this is a new flag that we
introduce in this patch). If the flag is set, update_gtids_impl clears
it. At the end of mysql_bin_log.commit, we check the flag to see if
update_gtids_impl has been called by any function invoked by
mysql_bin_log.commit. If not, i.e., if the flag is still true at the end
of mysql_bin_log.commit, it means we have reached the corner case where
update_gtids_impl was skipped. Thus we call it explicitly from
mysql_bin_log.commit.

Background 2:

GTID-violating DDL (CREATE...SELECT and CREATE TEMPORARY) is detected in
is_ddl_gtid_compatible, called from gtid_pre_statement_checks, which is
called from mysql_parse just before the implicit pre-commit.
is_ddl_gtid_compatible determines whether an error or warning or nothing
is to be generated, and whether to increment the counters of GTID-
violating transactions. In case an error is generated, it is important
that the error happens before the implicit commit, so that the statement
fails before it commits the ongoing transaction.

Problem 2:

In case a warning is to be generated, and there is an ongoing
transaction, the implicit commit will write to the binlog, and thus it
will call gtid_state->update_gtids_impl, which will decrease the
counters of GTID-violating transactions. Thus, the counters will be zero
for the duration of the transaction.

Fix 2:

We call is_ddl_gtid_compatible *both* before the implicit commit and
after the implicit commit. If an error is to be generated, the error is
generated before the commit. If a warning is to be generated and/or the
counter of GTID-violating transactions is to be increased, then this
happens after the commit.
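
A hypothetical sequence that exercises the warning path above (assumes tables
t1 and t3 exist, binary logging is enabled, and ENFORCE_GTID_CONSISTENCY
supports the WARN value):

  SET GLOBAL ENFORCE_GTID_CONSISTENCY = WARN;
  BEGIN;
  INSERT INTO t1 VALUES (1);          -- ongoing transaction
  CREATE TABLE t2 SELECT * FROM t3;   -- implicit commit first; the warning and
                                      -- the counter increment happen after it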

Code changes #1:

@sql/binlog.cc
- Move MYSQL_BIN_LOG::commit to a new function
MYSQL_BIN_LOG::write_binlog_and_commit_engine. Make
MYSQL_BIN_LOG::commit call this function, and after the return check
thd->pending_gtid_state_update to see if another call to
gtid_state->update_on_[commit|rollback] is needed.
- Simplify MYSQL_BIN_LOG::write_binlog_and_commit_engine; remove
useless local variable 'error' that would never change its value.

@sql/binlog.h:
- Declaration of new function.

@sql/rpl_gtid_state.cc:
- Set thd->pending_gtid_state_update to false at the end of
update_gtids_impl.

Code changes #2:
@sql/binlog.cc:
- Add two parameters to handle_gtid_consistency and is_ddl_gtid_compatible:
handle_error is true in the call to is_ddl_gtid_compatible that happens
*before* the implicit commit and false in the call to
is_ddl_gtid_compatible that happens *after* the implicit commit. It
tells the function to generate the error, if an error is to be
generated. The other parameter, handle_nonerror, is true in the call to
is_ddl_gtid_compatible that happens *after* the implicit commit and
false in the call that happens *before* the implicit commit. It tells
the function to generate the warnings and increment the counter, if that
needs to be done.

@sql/rpl_gtid_execution.cc:
- Call is_ddl_gtid_compatible after the implicit commit. Pass the two
new parameters to the function.

@sql/sql_class.h:
- Update prototype for is_ddl_gtid_compatible.

@sql/sql_insert.cc:
- Set thd->pending_gtid_state_update = true after committing the CREATE
part of a CREATE...SELECT.

Misc changes:

@sql/binlog.cc
- Add DEBUG_SYNC symbol end_decide_logging_format used in tests.

@sql/rpl_gtid_state.cc:
- For modularity, move out parts of update_gtids_impl to a new function,
end_gtid_violating_transaction.
- Move the lock/unlock of global_sid_lock into update_gtids_impl.
- Make update_gtids_impl release global_sid_lock before the call to
end_gtid_violating_transaction, so as to hold it for as short a time as possible.

@sql/rpl_gtid.h
- Because we release the locks earlier in update_gtids_impl in
rpl_gtid_state.cc, we need to acquire the lock again in
end_[anonymous|automatic]_gtid_violating_transaction, in order to do
some debug assertions.
- Add DBUG_PRINT for the counters.

Test changes:
- Split binlog_enforce_gtid_consistency into six tests, depending on the
type of scenarios it tests:
Three classes of GTID-violation:
    *_create_select_*: test CREATE ... SELECT.
    *_tmp_*: test CREATE TEMPORARY/DROP TEMPORARY inside a transaction.
    *_trx_nontrx_*: test combinations of transactional and
    non-transactional updates in the same statement or in the same
    transaction.
For each class of GTID-violation, one positive and one negative test:
    *_consistent.test: Cases which are *not* GTID-violating
    *_violation.test: Cases which *are* GTID-violating.
- The general logic of these tests is:
- extra/binlog_tests/enforce_gtid_consistency.test iterates over all
    values of GTID_MODE, ENFORCE_GTID_CONSISTENCY, and GTID_NEXT. For
    each case, it sources a file; the name of the sourced file is
    specified by the file that sources
    extra/binlog_tests/enforce_gtid_consistency.test
- The top-level file in suite/binlog/t invokes
    extra/binlog_tests/enforce_gtid_consistency.test, specifying one
    of the filenames
    extra/binlog_tests/enforce_gtid_consistency_[create_select|tmp|
    trx_nontrx]_[consistent|violation].test.
- Each of the files
    extra/binlog_tests/enforce_gtid_consistency_[create_select|tmp|
    trx_nontrx]_[consistent|violation].test
    sets up a number of test scenarios. Each test scenario is executed
    by sourcing
    extra/binlog_tests/enforce_gtid_consistency_statement.inc.
- extra/binlog_tests/enforce_gtid_consistency_statement.inc executes
    the specified statement, checks that warnings are generated and
    counters incremented/decremented as specified by the caller.
- Since the tests set GTID_MODE explicitly, it doesn't make sense to run
the test in both combinations GTID_MODE=ON/OFF. However, for the
*_trx_nontrx_* cases, it is important to test that it works both with
binlog_direct_non_transactional_updates=on and off. The suite is never
run with those combinations, so to make use of the existing GTID_MODE=ON/OFF
combinations, we run the test with
binlog_direct_non_transactional_updates=on if GTID_MODE=ON, and with
binlog_direct_non_transactional_updates=off if GTID_MODE=OFF.
bkandasa pushed a commit that referenced this pull request Aug 4, 2015
Added the extra scope argument to the status variables.
bkandasa pushed a commit that referenced this pull request Aug 4, 2015
Two places in replication code were causing Valgrind errors:

 1. Memory leak in Relay_log_info::wait_for_gtid_set, since it passed
    the return value from Gtid_set::to_string() directly to
    DBUG_PRINT, without storing it anywhere so that it can be freed.

 2. MYSQL_BIN_LOG::init_gtid_sets would pass a bad pointer to
    DBUG_PRINT in some cases.

In problem #1, an underlying problem was that to_string returns newly
allocated memory and this was easy to miss when reading the code that
calls the function.  It would be better to return the value through a
parameter, since that forces the caller to store it in a variable, and
then it is more obvious that the value must be freed.  And in fact
such a function existed already, so we fix the problem by removing the
(redundant) no-args version of Gtid_set::to_string and using the one-
or two-argument function instead.

In problem #2, print an empty string if we detect that the pointer
will be bad.

These bugs were found when adding some debug printouts to
read_gtids_from_binlog.  These debug printouts never made it to the
server code through any other bug report, but would be useful to have
for future debugging, so they are included in this patch.

Additionally, removed the call to global_sid_lock->rdlock() used
before Previous_gtids_log_event::get_str().  This is not needed since
Previous_gtids_log_event doesn't use shared resources.
bjornmu pushed a commit that referenced this pull request Oct 21, 2015
…TO SELF

Problem: If a multi-column update statement fails when updating one of
the columns in a row, it will go on and update the remaining columns
in that row before it stops and reports an error. If the failure
happens when updating a JSON column, and the JSON column is also
referenced later in the update statement, new and more serious errors
can happen when the update statement attempts to read the JSON column,
as it may contain garbage at this point.
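
A hypothetical statement of the problematic shape (table t1, its columns
j, k, c, and the failing expression are illustrative only):

  UPDATE t1
     SET j = CAST(c AS JSON),          -- may fail mid-row (invalid JSON in c)
         k = JSON_EXTRACT(j, '$.id');  -- reads the JSON column again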

The fix is twofold:

1) Field_json::val_str() currently returns NULL if an error happens.
This is correct for val_str() functions in the Item class hierarchy,
but not for val_str() functions in the Field class hierarchy. The
val_str() functions in the Field classes instead return a pointer to
an empty String object on error. Since callers don't expect it to
return NULL, this caused a crash when a caller unconditionally
dereferenced the returned pointer. The patch makes
Field_json::val_str() return a pointer to an empty String on error to
avoid such crashes.

2) Whereas #1 fixes the immediate crash, Field_json::val_str() may
still read garbage when this situation occurs. This could lead to
unreliable behaviour, and both valgrind and ASAN warn about it. The
patch therefore also makes Field_json::store() start by clearing the
field, so that it will hold an empty value rather than garbage after
an error has happened.

Fix #2 is sufficient to fix the reported problems. Fix #1 is included
for consistency, so that Field_json::val_str() behaves the same way as
the other Field::val_str() functions.

The query in the bug report didn't always crash. Since the root cause
was that it had read garbage, it could be lucky and read something
that looked like a valid value. In that case, Field_json::val_str()
didn't return NULL, and the crash was avoided.

The patch also makes these changes:

- It removes the Field_json::store_dom() function, since it is only
  called from one place. It is now inlined instead.

- It corrects information about return values in the comment that
  describes the ensure_utf8mb4() function.
bjornmu pushed a commit that referenced this pull request Oct 21, 2015
Background:

WAIT_FOR_EXECUTED_GTID_SET waits until a specified set of GTIDs is
included in GTID_EXECUTED. SET GTID_PURGED adds GTIDs to
GTID_EXECUTED. RESET MASTER clears GTID_EXECUTED.
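
A hypothetical two-session sketch of this interaction (the UUID is a
placeholder):

  -- session 1: blocks until the GTID set is contained in GTID_EXECUTED
  SELECT WAIT_FOR_EXECUTED_GTID_SET('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1');

  -- session 2: adds the GTID to GTID_EXECUTED; before the fix below, this did
  -- not wake up the waiter in session 1
  SET GLOBAL GTID_PURGED = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1';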

There were multiple issues:

 1. Problem:

    The change in GTID_EXECUTED implied by SET GTID_PURGED did
    not cause WAIT_FOR_EXECUTED_GTID_SET to stop waiting.

    Analysis:

    WAIT_FOR_EXECUTED_GTID_SET waits for a signal to be sent.
    But SET GTID_PURGED never sent the signal.

    Fix:

    Make GTID_PURGED send the signal.

    Changes:
    - sql/rpl_gtid_state.cc:Gtid_state::add_lost_gtids
    - sql/rpl_gtid_state.cc: removal of #ifdef HAVE_GTID_NEXT_LIST
    - sql/rpl_gtid.h: removal of #ifdef HAVE_GTID_NEXT_LIST

 2. Problem:

    There was a race condition where WAIT_FOR_EXECUTED_GTID_SET
    could miss the signal from a commit and go into an infinite
    wait even if GTID_EXECUTED contains all the waited-for GTIDs.

    Analysis:

    In the bug, WAIT_FOR_EXECUTED_GTID_SET took a lock while
    taking a copy of the global state. Then it released the lock,
    analyzed the copy of the global state, and decided whether it
    should wait.  But if the GTID to wait for was committed after
    the lock was released, WAIT_FOR_EXECUTED_GTID_SET would miss
    the signal and go to an infinite wait even if GTID_EXECUTED
    contains all the waited-for GTIDs.

    Fix:

    Refactor the code so that it holds the lock all the way from
    before it reads the global state until it goes to the wait.

    Changes:

    - sql/rpl_gtid_state.cc:Gtid_state::wait_for_gtid_set:
      Most of the changes in this function are to fix this bug.

    Note:

    When the bug existed, it was possible to create a test case
    for this by placing a debug sync point in the section where
    it does not hold the lock.  However, after the bug has been
    fixed this section does not exist, so there is no way to test
    it deterministically.  The bug would also cause the test to
    fail rarely, so a way to test this is to run the test case
    1000 times.

 3. Problem:

    The function would take global_sid_lock.wrlock every time it has
    to wait, and while holding it takes a copy of the entire
    gtid_executed (which implies allocating memory).  This is not very
    optimal: it may process the entire set each time it waits, and it
    may wait once for each member of the set, so in the worst case it
    is O(N^2) where N is the size of the set.

    Fix:

    This is fixed by the same refactoring that fixes problem #2.  In
    particular, it does not re-process the entire Gtid_set for each
    committed transaction. It only removes all intervals of
    gtid_executed for the current sidno from the remainder of the
    wait-for-set.

    Changes:
    - sql/rpl_gtid_set.cc: Add function remove_intervals_for_sidno.
    - sql/rpl_gtid_state.cc: Use remove_intervals_for_sidno and remove
      only intervals for the current sidno. Remove intervals
      incrementally in the innermost while loop, rather than recompute
      the entire set each iteration.

 4. Problem:

    If the client that executes WAIT_FOR_EXECUTED_GTID_SET owns a
    GTID that is included in the set, then there is no chance for
    another thread to commit it, so it will wait forever.  In
    effect, it deadlocks with itself.

    Fix:

    Detect the situation and generate an error.

    Changes:
    - sql/share/errmsg-utf8.txt: new error code
      ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID
    - sql/item_func.cc: check the condition and generate the new error

 5. Various simplifications.

    - sql/item_func.cc:Item_wait_for_executed_gtid_set::val_int:
      - Pointless to set null_value when generating an error.
      - add DBUG_ENTER
      - Improve the prototype for Gtid_state::wait_for_gtid_set so
        that it takes a Gtid_set instead of a string, and also so that
        it requires global_sid_lock.
    - sql/rpl_gtid.h:Mutex_cond_array
      - combine wait functions into one and make it return bool
      - improve some comments
    - sql/rpl_gtid_set.cc:Gtid_set::remove_gno_intervals:
      - Optimize so that it returns early if this set becomes empty

@mysql-test/extra/rpl_tests/rpl_wait_for_executed_gtid_set.inc
- Move all wait_for_executed_gtid_set tests into
  mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test

@mysql-test/include/kill_wait_for_executed_gtid_set.inc
@mysql-test/include/wait_for_wait_for_executed_gtid_set.inc
- New auxiliary scripts.

@mysql-test/include/rpl_init.inc
- Document undocumented side effect.

@mysql-test/suite/rpl/r/rpl_wait_for_executed_gtid_set.result
- Update result file.

@mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test
- Rewrote the test to improve coverage and cover all parts of this bug.

@sql/item_func.cc
- Add DBUG_ENTER
- No point in setting null_value when generating an error.
- Do the decoding from text to Gtid_set here rather than in Gtid_state.
- Check for the new error
  ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID

@sql/rpl_gtid.h
- Simplify the Mutex_cond_array::wait functions in the following ways:
  - Make them one function since they share most code. This also allows
    calling the three-argument function with NULL as the last
    parameter, which simplifies the caller.
  - Make it return bool rather than 0/ETIME/ETIMEOUT, to make it more
    easy to use.
- Make is_thd_killed private.
- Add prototype for new Gtid_set::remove_intervals_for_sidno.
- Add prototype for Gtid_state::wait_for_sidno.
- Un-ifdef-out lock_sidnos/unlock_sidnos/broadcast_sidnos since we now
  need them.
- Make wait_for_gtid_set return bool.

@sql/rpl_gtid_mutex_cond_array.cc
- Remove the now unused check_thd_killed.

@sql/rpl_gtid_set.cc
- Optimize Gtid_set::remove_gno_intervals, so that it returns early
  if the Interval list becomes empty.
- Add Gtid_set::remove_intervals_for_sidno. This is just a wrapper
  around the already existing private member function
  Gtid_set::remove_gno_intervals.

@sql/rpl_gtid_state.cc
- Rewrite wait_for_gtid_set to fix problems 2 and 3. See code
  comment for details.
- Factor out wait_for_sidno from wait_for_gtid.
- Enable broadcast_sidnos/lock_sidnos/unlock_sidnos, which were ifdef'ed out.
- Call broadcast_sidnos after updating the state, to fix issue #1.

@sql/share/errmsg-utf8.txt
- Add error message used to fix issue #4.
bjornmu pushed a commit that referenced this pull request Oct 21, 2015
An assert failure is seen in some queries which have a semijoin and
use the materialization strategy.

The assertion fails if either the length of the key is zero or the
number of key parts is zero. This could indicate two different
problems.

1) If the length is zero, there may not be a problem, as it can
legitimately be zero if, for example, the key is a zero-length string.

2) If the number of key parts is zero, there is a bug, as a key must
have at least one part.

The patch fixes issue #1 by removing the length check in the
assertion.

Issue #2 happens if JOIN::update_equalities_for_sjm() doesn't
recognize the expression selected from a subquery, and fails to
replace it with a reference to a column in a temporary table that
holds the materialized result. This causes it to not recognize it as a
part of the key later, and keyparts could end up as zero. The patch
fixes it by calling real_item() on the expression in order to see
through Item_refs that may wrap the expression if the subquery reads
from a view.
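
A hypothetical query shape for issue #2 (table, view and column names are
illustrative only):

  CREATE VIEW v1 AS SELECT b FROM t2;
  -- semijoin with materialization; the subquery reads from a view, so the
  -- selected expression may be wrapped in an Item_ref
  SELECT * FROM t1 WHERE t1.a IN (SELECT b FROM v1);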
bjornmu pushed a commit that referenced this pull request Feb 5, 2016
Problem:

The binary log group commit sync is failing when committing a group of
transactions into a non-transactional storage engine while another thread
is rotating the binary log.

Analysis:

The binary log group commit procedure (ordered_commit) acquires LOCK_log
during the #1 stage (flush). As it holds the LOCK_log, a binary log
rotation will have to wait until this flush stage to finish before
actually rotating the binary log.

For the #2 stage (sync), the binary log group commit only holds the
LOCK_log if sync_binlog=1. In this case, the rotation has to wait also
for the sync stage to finish.

When sync_binlog>1, the sync stage releases the LOCK_log (to let other
groups enter the flush stage), holding only the LOCK_sync. In this
case, the rotation can acquire the LOCK_log in parallel with the sync
stage.

For commits into a transactional storage engine, the binary log rotation
checks a counter of "flushed but not yet committed" transactions,
waiting for this counter to reach zero before closing the current
binary log file.  As the commit of the transactions happens in the #3
stage of the binary log group commit, the sync of the binary log in
stage #2 always succeeds.

For commits into a non-transactional storage engine, the binary log
rotation checks the "flushed but not yet committed" transactions
counter, but it is zero because it only counts transactions that
contain XIDs. So, the rotation is allowed to take place in parallel
with the #2 stage of the binary log group commit. When the sync is
called at the same time that the rotation has closed the old binary log
file but has not yet opened the new file, the sync fails with the
following error: 'Can't sync file 'UNOPENED' to disk (Errcode: 9 - Bad
file descriptor)'.

Fix:

For a non-transactional-only workload, binary log group commit will keep
the LOCK_log when entering #2 stage (sync) if the current group is
supposed to be synced to the binary log file.
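
A rough reproduction outline, as a sketch (assumes a MyISAM table t_myisam,
i.e. a non-transactional storage engine, and sync_binlog > 1):

  SET GLOBAL sync_binlog = 2;
  -- session 1: non-transactional commits in a loop
  UPDATE t_myisam SET c = c + 1;
  -- session 2: rotate the binary log concurrently
  FLUSH BINARY LOGS;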
hramilison pushed a commit that referenced this pull request Jun 2, 2016
Description: 
For ngshell query
Crud.Find({ name:$.name, count:count(*) }).GroupBy($.name);
an error has been raised:
"Expression #1 of SELECT list is not in GROUP BY clause and contains
nonaggregated column 'test.coll.doc' which is not functionally dependent
on columns in GROUP BY clause; this is incompatible with
sql_mode=only_full_group_by"

Reviewed-by: Lukasz Kotula <[email protected]>
RB:12283
bjornmu pushed a commit that referenced this pull request Apr 10, 2017
Author: Andrei Elkin <[email protected]>
Date:   Fri Nov 25 15:17:17 2016 +0200

    WL#9175 Correct recovery of DDL statements/transactions by binary log

    The patch consists of two parts implementing the WL agenda, which is
    to provide crash-safety for DDL.
    That is, a server (a general one, master or slave) must be able to recover
    from a crash and commit or roll back every DDL command that was in progress
    at the time of the crash.

    The commit decision is made for commands that had reached
    engine-prepared status and were successfully logged into the binary log.
    Otherwise they are rolled back.

    In order to achieve this goal, some refinements are made to the binlogging
    mechanism, a minor addition is made to the server recovery module, and some
    changes are applied to the slave side.

    The binary log part includes Query_log_event, which is made to contain the
    xid, a key item at server recovery. Recovery is now concerned with it along
    with its standard location in Xid_log_event.

    The first part deals with the ACL DDL sub-class and TRIGGER-related
    queries, which are made fully 2pc-capable. It constructs
    the WL's framework, which is proven on these subclasses.
    It also specifies how to cover the rest of the DDLs with the WL's framework.
    For the DDL cases that are not yet 2pc-ready, "stub" tests are sometimes
    prepared, to be refined by the responsible worklogs.

    Take a few notes to the low-level details of implementation.

    Note #1.

    Tagging by xid number is done only for the exact 2pc-capable DDL subclass.
    For DDLs that will become ready for xiding in the future, there is a tech
    specification for how to do so.

    Note #2.

    By virtue of existing mechanisms, the slave applier augments the DDL
    transaction, incorporating the slave info table update and the
    Gtid-executed table update (either one optionally) at the time the DDL is
    ready for the final commit.
    When, for filtering reasons, the DDL skips committing at its regular
    time, the augmented transaction would still be non-empty, consisting of
    only the added statements, and it would have to be committed by
    top-level slave-specific functions through Log_event::do_update_pos().

    To aid this process, Query_log_event::has_committed is introduced.

    Note #3 (QA, please read this.)

    The replication System_table interface that is employed by the handler of
    the TABLE-type slave info had to be refined in a few places.

    Note #4 (runtime code).

    While trying to lessen the footprint on the runtime server code, a few
    concessions had to be made. These include changes to
    ha_commit_trans()
    to invoke the new pre_commit(), post_commit(), and post_rollback() hooks
    due to the extra slave statements.

    -------------------------------------------------------------------

    The second part of the patch extends the basic framework and
    xidifies the rest of the DDL commands that are
    (at least) committable at recovery. At the moment those include
    all Data Definition Statements except the ones related to
    VIEWs, STORED Functions and Procedures.

    A DDL query is recoverable for these subclasses when it has been
    recorded into the binary log and is discovered there at server
    restart, quite compatible with the DML algorithm.
    However, a clean automatic rollback can't be provided for some
    of the commands, and the user would have to complete recovery
    manually.
pobrzut pushed a commit that referenced this pull request May 8, 2017
… WITH DEVELOPER STUDIO

When we do a release type build of the server (with both optimized and
debug enabled server/plugins) with Developer Studio, some MTR tests
when run with --debug-server will fail in one of two ways:

1. Tests which try to load a plugin into the mysql client fail with
   missing symbols. This is caused by the plugin having references to
   functions which do not exist in the non-debug client.

2. Some tests on sparc fail with Thread stack overrun.

Fix for issue #1: mtr will have appended /debug to the plugin dir part
when running with --debug-server and if there actually is such a
directory. The fix is to remove any trailing /debug from the
env. variable within the test. This will affect the client only, not
the server. Developer builds will not have put the plugins in a
subdirectory /debug so it makes no difference to those.

Fix for issue #2: apparently this thread stack overrun is not feasible
to avoid, so just skip the test if running with debug server on sparc;
there is already an include file to do that.

Also added not_sparc_debug.inc to the "white list" so the tests are
skipped even when running mtr --no-skip.

(cherry picked from commit 9c79e477261ab252e38def436bca3336ef597603)
pobrzut pushed a commit that referenced this pull request May 8, 2017
akopytov pushed a commit to akopytov/mysql-server that referenced this pull request Aug 25, 2017
In builds that include the WL, an ASAN run witnessed a missed ~Query_log_event invocation.
The destructor was not called due to the WL's changes in the error propagation
that specifically affect LC MTS.
The failure is exposed in particular by rpl_trigger as the following
stack:

  #0 0x9ecd98 in __interceptor_malloc (/export/home/pb2/test/sb_2-22611026-1489061390.32/mysql-commercial-8.0.1-dmr-linux-x86_64-asan/bin/mysqld+0x9ecd98)
  #1 0x2b1a245 in my_raw_malloc(unsigned long, int) obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:209:12
  #2 0x2b1a245 in my_malloc obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:72
  #3 0x2940590 in Query_log_event::Query_log_event(char const*, unsigned int, binary_log::Format_description_event const*, binary_log::Log_event_type) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:4343:46
  #4 0x293d235 in Log_event::read_log_event(char const*, unsigned int, char const**, Format_description_log_event const*, bool) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:1686:17
  #5 0x293b96f in Log_event::read_log_event()
  #6 0x2a2a1c9 in next_event(Relay_log_info*)

Previously, before the WL,
Mts_submode_logical_clock::wait_for_workers_to_finish() did not
return any error even when the Coordinator thread was killed.

The WL patch needed to refine that behavior, and in doing so
it also had to adjust log_event.cc::schedule_next_event() to register
an error, following an existing pattern.
Since my_error() was not called, the killed Coordinator continued
scheduling, though ineffectively - no Worker gets engaged (a legal case
of deferred scheduling) - and without noticing its killed status up to
the point when it resets the event pointer in
apply_event_and_update_pos():

  *ptr_ev= NULL; // announcing the event is passed to w-worker

The reset was intended to let an assigned Worker perform the event
destruction, or the Coordinator itself when the event is deferred.
As neither happens in this case, the event is never destroyed.

In contrast, in the pre-WL sources the killed Coordinator does find a Worker.
However, such a Worker could already be down (errored out and exited), in
which case apply_event_and_update_pos() reasonably returns an error and executes

  delete ev

in exec_relay_log_event() error branch.

**Fixed** by deploying a my_error() call in the log_event.cc::schedule_next_event()
error branch, which fits the existing pattern.
THD::is_error() has always been checked by the Coordinator before any attempt to
reset *ptr_ev= NULL. In the errored case the Coordinator does not reset it and
destroys the event itself in the exec_relay_log_event() error branch, pretty similarly to
how the pre-WL sources do.

Tested against rpl_trigger and rpl suites to pass.

Approved on rb#15667.
akopytov pushed a commit to akopytov/mysql-server that referenced this pull request Aug 25, 2017
Some character sets are designated as MY_CS_STRNXFRM, meaning that sorting
needs to go through my_strnxfrm() (implemented by the charset), and some are
not, meaning that a client can do the strnxfrm itself based on
cs->sort_order. However, most of the logic related to the latter has been
removed already (e.g. filesort always uses my_strnxfrm() since 2003), and now
it's mostly in the way. The three main uses left are:

 1. A microoptimization for constructing sort keys in filesort.
 2. A home-grown implementation of Boyer-Moore for accelerating certain
    LIKE patterns that should probably be handled through FTS.
 3. Some optimizations to MyISAM prefix keys.

Given that our default collation (utf8mb4_0900_ai_ci) now is a strnxfrm-based
collation, the benefits of keeping these around for a narrow range of
single-byte locales (like latin1_swedish_ci, cp850 and a bunch of more
obscure locales) seem dubious. We can't remove the flag entirely,
since #3 seemingly affects the on-disk MyISAM structure, but we can remove
the code for #1 and #2.

Change-Id: If974e490d451b7278355e33ab1fca993f446b792
bjornmu pushed a commit that referenced this pull request Sep 21, 2017
Patch #1:

When find_child_doms() evaluates an ellipsis path leg, it adds each
matching child twice. First once for every immediate child in the
top-level call to find_child_doms(), and then again in the recursive
call to find_child_doms() on each of the children. This doesn't cause
any wrong results, since duplicate elimination prevents them from
being added to the result, but it is unnecessary work. This patch
makes find_child_doms() stop calling add_if_missing() on the children
before it recurses into them, so that they are only added once.
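
For reference, an example of the kind of ellipsis path this code evaluates:

  SELECT JSON_EXTRACT('{"a": {"x": 1}, "b": {"x": 2}}', '$**.x');
  -- returns [1, 2]; before this patch each match was added (and rejected as a
  -- duplicate) more than once while building that result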

Additionally:

Make find_child_doms() a free, static function instead of a member of
Json_dom. It is a helper function for Json_dom::seek(), and doesn't
use any non-public members of Json_dom, so it doesn't have to be part
of Json_dom's interface.

Remove unnecessary checks for only_need_one in the ellipsis
processing. Json_dom::seek() sets the flag to true only in the last
path leg, and the JSON path parser rejects paths that end with an
ellipsis, so this cannot happen. Added an assert instead.

Microbenchmarks (64-bit, Intel Core i7-4770 3.4 GHz, GCC 6.3):

BM_JsonDomSearchEllipsis              79880 ns/iter [+32.0%]
BM_JsonDomSearchEllipsis_OnlyOne      75872 ns/iter [+35.3%]
BM_JsonDomSearchKey                     129 ns/iter [ -1.6%]
BM_JsonBinarySearchEllipsis          320920 ns/iter [+11.8%]
BM_JsonBinarySearchEllipsis_OnlyOne  315458 ns/iter [+12.6%]
BM_JsonBinarySearchKey                   86 ns/iter [ -2.3%]

Change-Id: I865af789b90b820a6e180ad822f2fb68f411516b
bjornmu pushed a commit that referenced this pull request Jan 23, 2018
…_ROW()' FAILED

Analysis:

When a window with buffering follows an equijoin on a unique index
(JT_EQ_REF), we can get into trouble because windowing modifies the
input record, presuming that once the windowing has been handed the
record, the next time control passes back to the join code a (new) record
will be read into the input record.

However, this does not hold with JT_EQ_REF, cf. the caching done in
join_read_key:

From its Doxygen:

  "Since the eq_ref access method will always return the same row, it
   is not necessary to read the row more than once, regardless of how
   many times it is needed in execution.  This cache element is used
   when a row is needed after it has been read once, unless a key
   conversion error has occurred, or the cache has been disabled."
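
A hypothetical query shape that hits this path (assumes t2.pk is the primary
key, so the join uses JT_EQ_REF, and the two-sided frame requires buffering):

  SELECT t1.a,
         AVG(t2.v) OVER (ORDER BY t1.a
                         ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS s
  FROM t1 JOIN t2 ON t2.pk = t1.a;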

Fix:

We solve this problem by reinstating the input record before handing
control back from end_write_wf. We optimize: only do this if the
window in question follows after such a JOIN, i.e. window #1, and it
has actually clobbered the input record. This can only happen if
the last qep_tab has type JT_EQ_REF.

Another, perhaps better approach, is to refactor to never touch the
input record but keep the copying between the out record and the frame
table record instead. Left for future refactoring.

Added some missing Window method "const"s, and folded a couple of
one-liners into window.h (from .cc).

Repro added.

Change-Id: I33bc43cd99ff79303b17d181abc3805ce226fb85
bjornmu pushed a commit that referenced this pull request Jan 23, 2018
…TABLE_UPGRADE_GUARD

To repeat: cmake -DWITH_ASAN=1 -DWITH_ASAN_SCOPE=1
./mtr --mem --sanitize main.dd_upgrade_error

A few dd tests fail with:
==26861==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7000063bf5e8 at pc 0x00010d4dbe8b bp 0x7000063bda40 sp 0x7000063bda38
READ of size 8 at 0x7000063bf5e8 thread T2
    #0 0x10d4dbe8a in Prealloced_array<st_plugin_int**, 16ul>::empty() const prealloced_array.h:186
    #1 0x10d406a8b in lex_end(LEX*) sql_lex.cc:560
    #2 0x10dae4b6d in dd::upgrade::Table_upgrade_guard::~Table_upgrade_guard() (mysqld:x86_64+0x100f87b6d)
    #3 0x10dadc557 in dd::upgrade::migrate_table_to_dd(THD*, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, bool) (mysqld:x86_64+0x100f7f557)
    #4 0x10dad7e85 in dd::upgrade::migrate_plugin_table_to_dd(THD*) (mysqld:x86_64+0x100f7ae85)
    #5 0x10daec6a1 in dd::upgrade::do_pre_checks_and_initialize_dd(THD*) upgrade.cc:1216
    #6 0x10cd0a5c0 in bootstrap::handle_bootstrap(void*) bootstrap.cc:336

Change-Id: I265ec6dd97ee8076aaf03763840c0cdf9e20325b
Fix: increase lifetime of 'LEX lex;' which is used by 'table_guard'
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request Feb 4, 2018
Fix misc UBSAN warnings in unit tests.
To repeat:
export UBSAN_OPTIONS="print_stacktrace=1"

./runtime_output_directory/merge_large_tests-t --gtest_filter='-*DeathTest*' > /dev/null

unittest/gunit/gis_algos-t.cc:78:70:
runtime error: downcast of address 0x000012dc0be8 which does not point to an object of type 'Gis_polygon_ring'

include/sql_string.h:683:35: runtime error: null pointer passed as argument 2, which is declared to never be null
    #1 0x373e7af in histograms::Value_map<String>::add_values(String const&, unsigned long long) sql/histograms/value_map.cc:149
    #2 0x294fcf2 in dd_column_statistics_unittest::add_values(histograms::Value_map<String>&) unittest/gunit/dd_column_statistics-t.cc:62

runtime_output_directory/merge_keyring_file_tests-t --gtest_filter='-*DeathTest*' > /dev/null

plugin/keyring/common/keyring_key.cc:82:57: runtime error: null pointer passed as argument 2, which is declared to never be null

Change-Id: I2651362e3373244b72e6893f0e22e67402b49a52
(cherry picked from commit 1fe3f72561994da1d912a257689e1b18106f8828)
hramilison pushed a commit that referenced this pull request Apr 19, 2018
              AUTH_COMMON.H

Description:- Server crashes due to a NULL pointer
de-reference.

Analysis:- Server encounters a NULL pointer de-reference
during "acl_load()".

Fix:- A check is introduced to avoid the NULL pointer
de-reference.
This issue is already prevented in 8.0 through Bug#27225806
fix. Therefore, this patch is applicable only for 5.7.
sreedhars pushed a commit that referenced this pull request Jul 27, 2018
Group Replication implements conflict detection in
multi-primary mode to avoid write errors on parallel operations.
Conflict detection is also engaged in single-primary mode in the
particular case of a primary change where the new primary still has a
backlog to apply. Until the backlog is flushed, conflict detection
is enabled to prevent write errors between the backlog and incoming
transactions.

The conflict detection data, which we name certification info, is
also used to detect dependencies between accepted transactions,
dependencies which determine how the transactions are scheduled on the
parallel applier.

In order to avoid the certification info growing forever,
all members periodically exchange their GTID_EXECUTED sets, whose
full intersection provides the set of transactions that are
applied on all members. Future transactions cannot conflict with
this set since all members are operating on top of it, so we can
safely remove from the certification info all write-sets that
belong to those transactions.
More details at WL#6833: Group Replication: Read-set free
Certification Module (DBSM Snapshot Isolation).

However, a corner case was found in which the garbage collection was
purging more data than it should.
The scenario is:
 1) Group with 2 members;
 2) Member1 executes:
      CREATE TABLE t1(a INT, b INT, PRIMARY KEY(a));
      INSERT INTO t1 VALUE(1, 1);
    Both members have a GTID_EXECUTED= UUID:1-4
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4
 3) member1 executes TA
      UPDATE t1 SET b=10 WHERE a=1;
    and blocks immediately before send the transaction to the group.
    This transaction has snapshot_version: UUID:1-4
 4) member2 executes TB
      UPDATE t1 SET b=10 WHERE a=1;
    This transaction has snapshot_version: UUID:1-4
    It goes through the complete path and it is committed.
    This transaction has GTID: UUID:1000002
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
 5) member2 becomes extremely slow in processing transactions, we
    simulate that by holding the transaction queue to the GR
    pipeline.
    Transaction delivery is still working, but the transaction will
    be block before certification.
 6) member1 is able to send its TA transaction; let's recall that
    this transaction has snapshot_version: UUID:1-4.
    On conflict detection on member1, it will conflict with #1,
    since this snapshot_version does not contain the snapshot_version
    of #1, that is, TA was executed on an earlier version than TB.
    On member2 the transaction will be delivered and will be put on
    hold before conflict detection.
 7) meanwhile the certification info garbage collection kicks in.
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Its intersection is UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
    The condition to purge write-sets is:
       snapshot_version.is_subset(intersection)
    We have
       "UUID:1-4:1000002".is_subset("UUID:1-4:1000002)
    which is true, so we remove #1.
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       <empty>
 8) member2 gets back to normal and we release transaction TA; let's
    recall that this transaction has snapshot_version: UUID:1-4.
    On conflict detection, since the certification info is empty,
    the transaction will be allowed to proceed, which is incorrect:
    it must roll back (as on member1) since it conflicts with TB.

The problem is in the certification garbage collection, more
precisely in the condition used to purge data: we cannot leave the
certification info empty, otherwise this situation can happen.
The condition must be changed to
       snapshot_version.is_subset_not_equals(intersection)
which will always leave a placeholder to detect delayed conflicting
transactions.

So a trace of the solution is (starting on step 7):
 7) meanwhile the certification info garbage collection kicks in.
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Its intersection is UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
    The condition to purge write-sets is:
       snapshot_version.is_subset_not_equals(intersection)
    We have
       "UUID:1-4:1000002".is_subset_not_equals("UUID:1-4:1000002)
    which is false, so we do not remove #1.
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
 8) member2 gets back to normal and we release transaction TA; let's
    recall that this transaction has snapshot_version: UUID:1-4.
    On conflict detection on member2, it will conflict with #1,
    since this snapshot_version does not contain the snapshot_version
    of #1, that is, TA was executed on an earlier version than TB.

This is the same scenario that we see in this bug, though here the
pipeline is being blocked by the distributed recovery procedure,
that is, while the joining member is applying the missing data
through the recovery channel, the incoming data is being queued.
Meanwhile the certification info garbage collection kicks in and
purges more data than it should, with the result that conflicts are
not detected.
bjornmu pushed a commit that referenced this pull request Jul 27, 2018
Use void* for function arguments, and cast Item_field* in function body.

compare_fields_by_table_order(Item_field*, Item_field*, void*) through
pointer to incorrect function type 'int (*)(void *, void *, void *)'
sql/sql_optimizer.cc:3904: note: compare_fields_by_table_order(Item_field*,
Item_field*, void*) defined here
    #0 0x34ac57d in base_list::sort(int (*)(void*, void*, void*), void*)
sql/sql_list.h:278:13
    #1 0x49ad303 in substitute_for_best_equal_field(Item*, COND_EQUAL*,

Change-Id: I4f5417304a24201682f32fc7631034de7aa62589
bjornmu pushed a commit that referenced this pull request Jul 27, 2018
…ENERATED_READ_FIELDS

It's a SELECT with WHERE "(-1) minus 0x4d".
This operation has a result type of "unsigned" (because 0x4d is an unsigned
integer) and the result (-78) doesn't fit in an unsigned type.
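
The essential arithmetic in isolation (the actual WHERE clause presumably also
references an indexed column, which is why it is pushed down):

  SELECT -1 - 0x4d;
  -- 0x4d acts as an unsigned integer, so the subtraction uses unsigned
  -- arithmetic and the negative result raises a numeric overflow error
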
This WHERE is evaluated by InnoDB in index condition pushdown:
#0 my_error
#1 Item_func::raise_numeric_overflow
...
#7 Item_cond_and::val_int
#8 innobase_index_cond
...
#12 handler::index_read_map
...
#15 handler::multi_range_read_next
...
#20 rr_quick
#21 join_init_read_record
As val_int() has no "error" return code, the execution continues until
frame #12; there we call update_generated_read_fields(), which has
an assertion about thd->is_error() which fails.

Fix: it would be nice to detect the error as soon as it happens, i.e. in InnoDB
code right after it calls val_bool(). But InnoDB's index condition
pushdown functions only have found / not found return codes, so they cannot
signal "error" to the upper layers. The same is true for MyISAM. Moreover,
"thd" isn't easily accessible there.
Adding detection a bit higher up in the stack (the handler::* functions which
do index reads) is also possible but would require fixing ~20
functions.
The chosen fix here is to change update_generated_*_fields()
to return error if thd->is_error() is true.
Note that the removed assertion was already one cause of
bug 27041382.
bjornmu pushed a commit that referenced this pull request Oct 22, 2018
Post push fix

sql/opt_range.cc:14196:22: runtime error: -256 is outside the range of representable values of type 'unsigned int'
    #0 0x4248a9d in cost_skip_scan(TABLE*, unsigned int, unsigned int, unsigned long long, Cost_estimate*, unsigned long long*, Item*, Opt_trace_object*) sql/opt_range.cc:14196:22
    #1 0x41c524c in get_best_skip_scan(PARAM*, SEL_TREE*, bool) sql/opt_range.cc:14086:5
    #2 0x41b7b65 in test_quick_select(THD*, Bitmap<64u>, unsigned long long, unsigned long long, bool, enum_order, QEP_shared_owner const*, Item*, Bitmap<64u>*, QUICK_SELECT_I**) sql/opt_range.cc:3352:23
    #3 0x458fc08 in get_quick_record_count(THD*, JOIN_TAB*, unsigned long long) sql/sql_optimizer.cc:5542:17
    #4 0x458a0cd in JOIN::estimate_rowcount() sql/sql_optimizer.cc:5290:25

The fix is to handle REC_PER_KEY_UNKNOWN explicitly, to avoid using
-1.0 in computations later.

Change-Id: Ie8a81bdf7323e4f66abcad0a9aca776de8acd945
sjmudd pushed a commit to sjmudd/mysql-server that referenced this pull request Jan 24, 2019
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request May 5, 2023
Introduce class NdbSocket, which includes both an ndb_socket_t
and an SSL *, and wraps all socket operations that might use
TLS.

Change-Id: I20d7aeb4854cdb11cfd0b256270ab3648b067efa
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request May 5, 2023
The patch for
    WL#15130 Socket-level TLS patch mysql#1: class NdbSocket
re-introduced a -Wcast-qual warning.

Use const_cast, and reinterpret_cast to fix it, since the corresponding
posix version of ndb_socket_writev() is const-correct.

Change-Id: Ib446a926b4108edf51eda7d8fd27ada560b67a24
venkatesh-prasad-v pushed a commit to venkatesh-prasad-v/mysql-server that referenced this pull request Aug 3, 2023
…etwork

https://bugs.mysql.com/bug.php?id=109668

Description
-----------
GR suffered from problems caused by security probes and network scanner
processes connecting to the group replication communication port. This usually
is not a problem, but it poses a serious threat when another member tries to join
the cluster by initiating a connection to a member that is affected by
external processes occupying the port dedicated to group communication for long
durations.

During such activity by external processes, the SSL-enabled server stalled forever
on the SSL_accept() call waiting for handshake data. Below is the stacktrace:

    Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)):
    #0 in read ()
    #1 in sock_read ()
    #2 in BIO_read ()
    #3 in ssl23_read_bytes ()
    #4 in ssl23_get_client_hello ()
    #5 in ssl23_accept ()
    #6 in xcom_tcp_server_startup(Xcom_network_provider*) ()

When the server stalled forever in the above path, it prevented other members
from joining the cluster, resulting in the following messages in the joining
server's logs.

    [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
    [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.'

Solution
--------
This patch adds two new variables

1. group_replication_xcom_ssl_socket_timeout

   It is a file-descriptor level timeout in seconds for both accept() and
   SSL_accept() calls when group replication is listening on the xcom port.
   When set to a valid value, say for example 5 seconds, both accept() and
   SSL_accept() return after 5 seconds. The default value has been set to 0
   (waits infinitely) for backward compatibility. This variable is effective
   only when GR is configured with SSL.

2. group_replication_xcom_ssl_accept_retries

   It defines the number of retries to be performed before closing the socket.
   For each retry the server thread calls SSL_accept() with the timeout defined
   by group_replication_xcom_ssl_socket_timeout for the SSL handshake process
   once the connection has been accepted by the first accept() call. The
   default value has been set to 10. This variable is effective only when GR is
   configured with SSL.

Note:
- Both of the above variables are dynamically configurable, but will become
  effective only on START GROUP_REPLICATION.
- This patch is only for Linux systems.
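
A hedged usage sketch of the two variables described above (values are
illustrative only):

  SET GLOBAL group_replication_xcom_ssl_socket_timeout = 5;
  SET GLOBAL group_replication_xcom_ssl_accept_retries = 10;
  -- the new values only take effect on the next start
  STOP GROUP_REPLICATION;
  START GROUP_REPLICATION;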
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Part of WL#15135 Certificate Architecture

This patch introduces a set of C++ classes to implement the creation of
private keys, PKCS#10 signing requests, and X.509 certificates for NDB
cluster.

The TlsSearchPath class provides searching for files over a delimited
list of directories.

The PrivateKey and Certificate classes provide simple wrappers over the
OpenSSL routines to create, free, save, and open keys and certificates.

Classes PendingPrivateKey and PendingCertificate implement file naming
conventions and life cycle for pending key pairs; ActivePrivateKey
and ActiveCertificate implement them for active key pairs. Class
SigningRequest provides the naming conventions and life cycle for
PKCS#10 CSRs.

Class NodeCertificate is the primary in-memory representation of a
node's TLS credentials.

A unit test, NodeCertificate-t, is intended to thoroughly test the
whole suite of classes. It should be possible to run this test under
valgrind with no reported leaks.

Change-Id: I76bf719375ab2a9b6a97245e326158a49dde28c2
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
This is a complete implementation of ndb_sign_keys.

It searches --ndb-tls-search-path for node certificate and key
files, and additionally searches in --CA-search-path for CA-related
key and certificate files.

It includes three methods for remote key signing:
  With --remote-CA-host, run ndb_sign_keys remotely, using ssh.
  With --remote-openssl, run openssl on the remote host, using ssh.
  With --CA-tool, run a local signing helper tool.

Change-Id: I5d93b702a667fa98d820ed150631a91e8444b8d7
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Post-push fix for: WL#15166 patch #1 ndb_sign_keys

DWORD is 'unsigned long' not int
Remove an unused local variable.
C-style cast (LPSTR) drops const qualifier [-Wcast-qual]

Change-Id: I059ad8a5a5f6b1cc644456576a8acff9a78331e3
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Add boolean parameter "RequireCertificate" to [DB] section.
Default is false. If true, node will fail at startup time unless
it finds a TLS key and a current valid certificate.

Add boolean parameter "RequireTls" to [DB] section. Default is
false. If true, every transporter link involving the data node
must use TLS.

Add boolean parameter "RequireTls" to [TCP] sections. This is
computed, and not user-settable. If either endpoint of a link
has RequireTls set to true, RequireTls for the link will be
set to true.

Add some clarifying comments to ndbinfo_plans test.

Change-Id: I889d9b7563022e2ebb2eaae92c3b26b557180d40
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Add an MGM protocol command to turn a plaintext mgm api session
into a TLS session. Add three new MGM API functions:
  ndb_mgm_set_ssl_ctx()
  ndb_mgm_start_tls()
  ndb_mgm_connect_tls()

Define two client TLS requirement levels:
 CLIENT_TLS_RELAXED, CLIENT_TLS_STRICT

This adds a new test: testMgmd -n StartTls

Change-Id: Ib46faacd9198c474558e46c3fa0538c7e759f3fb
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Post push fix. Remove added C++ dependencies in C header mgmapi.h.

 - forward declare SSL_CTX.
 - add missing struct keyword with ndb_mgm_cert_table and
   ndb_mgm_tls_stats
 - make ndb_mgm_set_ssl_ctx return int instead of bool as other mgmapi
   functions do.

Change-Id: I493b4c4fb1272974e1bb72e35abb08c8cef1a534
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Post push fix.

Do not allow ndb_mgm_listen_event to return a socket that uses TLS since
the user cannot access the corresponding SSL object through the public
MgmAPI.

Change-Id: I2a741efe4f80db750419101ecabb03fb5e025346
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Post push fix.

Make NdbSocket::ssl_readln return 0 on timeout.

Change-Id: I4cad95abd319883c16f2c28eff5cf2b6761731d6
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Post push fix.

Add missing socket close in testMgmd -n StartTls.

Change-Id: Ia446b522ad2698f63d588d3c52122df8735765c7
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Problem
================================

Group Replication ASAN run failing without any symptom of a
leak, but with shutdown issues:

worker[6] Shutdown report from
/dev/shm/mtr-3771884/var-gr-debug/6/log/mysqld.1.err after tests:
 group_replication.gr_flush_logs
group_replication.gr_delayed_initialization_thread_handler_error
group_replication.gr_sbr_verifications
group_replication.gr_server_uuid_matches_group_name_bootstrap
group_replication.gr_stop_async_on_stop_gr
group_replication.gr_certifier_message_same_member
group_replication.gr_ssl_mode_verify_identity_error_xcom

Analysis and Fix
================================

It ended up being a leak on gr_ssl_mode_verify_identity_error_xcom test:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7f1709fbe1c7 in operator new(unsigned long)
      ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x7f16ea0df799 in xcom_tcp_server_startup(Xcom_network_provider*)
      (/export/home/tmp/BUG35594709/mysql-trunk/BIN-ASAN/plugin_output_directory
        /group_replication.so+0x65d799)
    #2 0x7f170751e2b2  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc2b2)

This happens because we delegated the cleanup of incoming connections
to the external consumer in incoming_connection_task.
Since it calls incoming_connection() from
Network_provider_manager, in case of a concurrent stop
a connection could be left orphaned in the shared atomic
due to the lack of an Active Provider, thus creating a
memory leak.

The solution is to perform this cleanup in
Network_provider_manager, in both the stop_provider() and
stop_all_providers() methods, thus ensuring that no
incoming connection leaks.

Change-Id: I2367c37608ad075dee63785e9f908af5e81374ca
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
Post push fix.

Make NdbSocket::ssl_readln return 0 on timeout.

Change-Id: I4cad95abd319883c16f2c28eff5cf2b6761731d6
bjornmu pushed a commit that referenced this pull request Jan 16, 2024
BUG#35949017 Schema dist setup lockup
Bug#35948153 Problem setting up events due to stale NdbApi dictionary cache [#2]
Bug#35948153 Problem setting up events due to stale NdbApi dictionary cache [#1]
Bug#32550019 Missing check for ndb_schema_result leads to schema dist timeout

Change-Id: I4a32197992bf8b6899892f21587580788f828f34
msprajap pushed a commit that referenced this pull request Jan 16, 2024
… cache [#1]

Problem:
A MySQL Server which has been disconnected from schema distribution
fails to set up event operations since the columns of the table can't be
found in the event.

Analysis:
The ndbcluster plugin uses NDB table definitions which are cached by the
NdbApi. These cached objects are reference counted and there can be
multiple versions of the same table in the cache; the intention is that
it should be possible to continue using the table even though it
changes in NDB. When changing a table in NDB this cache needs to be
invalidated, both on the local MySQL Server and on all other MySQL
Servers connected to the same cluster. Such invalidation is especially
important before installing in DD and setting up event subscriptions.

The local MySQL Server cache is invalidated directly when releasing the
reference from the NdbApi after having modified the table.

The other MySQL Servers are primarily invalidated by using schema
distribution. Since schema distribution is event driven the invalidation
will happen promptly but as with all things in a distributed system
there is a possibility that these events are not handled for some
reason. This means there must be a fallback mechanism which
invalidates stale cache objects.

The reported problem occurs since there is a stale NDB table definition
in the NdbApi: it has the same name but different columns than the
current table in NDB. In most cases the NdbApi continues
to operate on a cached NDB table definition, but when setting up events
the "mismatch on version" will be detected inside the NdbApi (due to the
relation between the event and the table); this causes the cache to be
invalidated and the current version to be loaded from NDB. However the
caller is still using the "old" cached table definition and thus, when
trying to subscribe the columns, they cannot be found.

Solution:

1) Invalidate NDB table definition in schema event handler that handles
new table created. This covers the case where table is dropped directly
in NDB using for example ndb_drop_table or ndb_restore and then
subsequently created using SQL. This scenario is covered by the existing
metadata_sync test cases which will be detected by 4) before this part of
the fix.

2) Invalidate NDB table definition before table schema synchronization
installs tables in DD and sets up event subscriptions. This function
handles the case when schema distribution is reconnecting to the cluster
and a table it knew about earlier has changed while schema distribution
event handlers have not been active. This scenario is tested by the
drop_util_table test case.

3) Invalidate NDB table definition when the schema distribution event
handler used for drop table and cluster failure occurs. At this
time it's well known that the table does not exist or its status is
unknown. Earlier this invalidation was only performed if there was a
version mismatch in the event vs. table relation.

4) Detect when the problem occurs by checking that the NDB table definition
has not been invalidated (by NdbApi event functions) in the function that
sets up the event subscription. It's currently not possible to handle the
problem this low down, but at least it can be detected and a fix added to
the callers. This detection is only done in debug builds.

Change-Id: I4ed6efb9308be0022e99c51eb23ecf583805b1f4
venkatesh-prasad-v pushed a commit to venkatesh-prasad-v/mysql-server that referenced this pull request Jan 26, 2024
…and a local

             DDL executed

https://bugs.mysql.com/bug.php?id=113727

Problem
-------
In high concurrency scenarios, MySQL replica can enter into a deadlock due to a
race condition between the replica applier thread and the client thread
performing a binlog group commit.

Analysis
--------
It needs at least 3 threads for this deadlock to happen

1. One client thread
2. Two replica applier threads

How this deadlock happens?
--------------------------
0. Binlog is enabled on replica, but log_replica_updates is disabled.

1. Initially, both "Commit Order" and "Binlog Flush" queues are empty.

2. Replica applier thread 1 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier
   thread 1

   3.1. Becomes leader (In Commit_stage_manager::enroll_for()).

   3.2. Registers in the commit order queue.

   3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log.

   3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is
        not yet released.

   NOTE: SE commit for applier thread is already done by the time it reaches
         here.

4. Replica applier thread 2 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the
   applier thread 2

   5.1. Becomes leader (In Commit_stage_manager::enroll_for())

   5.2. Registers in the commit order queue.

   5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier
        thread 1 it will wait until the lock is released.

6. Client thread enters the group commit pipeline to register in the
   "Binlog Flush" queue.

7. Since "Commit Order" queue is not empty (there is applier thread 2 in the
   queue), it enters the conditional wait `m_stage_cond_leader` with an
   intention to become the leader for both the "Binlog Flush" and
   "Commit Order" queues.

8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update
   the GTID by calling gtid_state->update_commit_group() from
   Commit_order_manager::flush_engine_and_signal_threads().

9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log.

   9.1. It checks if there is any thread waiting in the "Binlog Flush" queue
        to become the leader. Here it finds the client thread waiting to be
        the leader.

   9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the
        cond_var `m_stage_cond_leader` and enters a conditional wait until the
        thread's `tx_commit_pending` is set to false by the client thread
       (will be done in the
       Commit_stage_manager::process_final_stage_for_ordered_commit_group()
       called by client thread from fetch_and_process_flush_stage_queue()).

10. The client thread wakes up from the cond_var `m_stage_cond_leader`.  The
    thread has now become a leader and it is its responsibility to update GTID
    of applier thread 2.

    10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log.

    10.2. Returns from `enroll_for()` and proceeds to process the
          "Commit Order" and "Binlog Flush" queues.

    10.3. Fetches the "Commit Order" and "Binlog Flush" queues.

    10.4. Performs the storage engine flush by calling ha_flush_logs() from
          fetch_and_process_flush_stage_queue().

    10.5. Proceeds to update the GTID of threads in "Commit Order" queue by
          calling gtid_state->update_commit_group() from
          Commit_stage_manager::process_final_stage_for_ordered_commit_group().

11. At this point, we will have

    - Client thread performing GTID update on behalf of applier thread 2 (from step 10.5), and
    - Applier thread 1 performing GTID update for itself (from step 8).

    Due to the lack of proper synchronization between the above two threads,
    there exists a time window where both threads can call
    gtid_state->update_commit_group() concurrently.

    In subsequent steps, both threads simultaneously try to modify the contents
    of the array `commit_group_sidnos` which is used to track the lock status of
    sidnos. This concurrent access to `update_commit_group()` can cause a
    lock-leak resulting in one thread acquiring the sidno lock and not
    releasing it at all.

-----------------------------------------------------------------------------------------------------------
Client thread                                           Applier Thread 1
-----------------------------------------------------------------------------------------------------------
update_commit_group() => global_sid_lock->rdlock();     update_commit_group() => global_sid_lock->rdlock();

calls update_gtids_impl_lock_sidnos()                   calls update_gtids_impl_lock_sidnos()

set commit_group_sidno[2] = true                        set commit_group_sidno[2] = true

                                                        lock_sidno(2) -> successful

lock_sidno(2) -> waits

                                                        update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

                                                        if (commit_group_sidnos[2]) {
                                                          unlock_sidno(2);
                                                          commit_group_sidnos[2] = false;
                                                        }

                                                        Applier thread continues..

lock_sidno(2) -> successful

update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

if (commit_group_sidnos[2]) { <=== this check fails and lock is not released.
  unlock_sidno(2);
  commit_group_sidnos[2] = false;
}

Client thread continues without releasing the lock
-----------------------------------------------------------------------------------------------------------

12. As the above lock-leak can also happen the other way round, i.e., the
    applier thread fails to unlock, there can be different consequences
    hereafter.

13. If the client thread continues without releasing the lock, then at a later
    stage, it can enter into a deadlock with the applier thread performing a
    GTID update, with the following stack traces:

    Client_thread
    -------------
    mysql#1  __GI___lll_lock_wait
    mysql#2  ___pthread_mutex_lock
    mysql#3  native_mutex_lock                                       <= waits for commit lock while holding sidno lock
    mysql#4  Commit_stage_manager::enroll_for
    mysql#5  MYSQL_BIN_LOG::change_stage
    mysql#6  MYSQL_BIN_LOG::ordered_commit
    mysql#7  MYSQL_BIN_LOG::commit
    mysql#8  ha_commit_trans
    mysql#9  trans_commit_implicit
    mysql#10 mysql_create_like_table
    mysql#11 Sql_cmd_create_table::execute
    mysql#12 mysql_execute_command
    mysql#13 dispatch_sql_command

    Applier thread
    --------------
    mysql#1  ___pthread_mutex_lock
    mysql#2  native_mutex_lock
    mysql#3  safe_mutex_lock
    mysql#4  Gtid_state::update_gtids_impl_lock_sidnos               <= waits for sidno lock
    mysql#5  Gtid_state::update_commit_group
    mysql#6  Commit_order_manager::flush_engine_and_signal_threads   <= acquires commit lock here
    mysql#7  Commit_order_manager::finish
    mysql#8  Commit_order_manager::wait_and_finish
    mysql#9  ha_commit_low
    mysql#10 trx_coordinator::commit_in_engines
    mysql#11 MYSQL_BIN_LOG::commit
    mysql#12 ha_commit_trans
    mysql#13 trans_commit
    mysql#14 Xid_log_event::do_commit
    mysql#15 Xid_apply_log_event::do_apply_event_worker
    mysql#16 Slave_worker::slave_worker_exec_event
    mysql#17 slave_worker_exec_job_group
    mysql#18 handle_slave_worker

14. If the applier thread continues without releasing the lock, then at a later
    stage, it can perform recursive locking while setting the GTID for the next
    transaction (in set_gtid_next()).

    In debug builds the above case hits the assertion
    `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the
    replica applier thread when it tries to re-acquire the lock.

Solution
--------
In the above problematic example, when seen from each thread
individually, we can conclude that there is no problem in the order of lock
acquisition, thus there is no need to change the lock order.

However, the root cause for this problem is that multiple threads can
concurrently access the array `Gtid_state::commit_group_sidnos`.

In its initial implementation, it was expected that threads should
hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it
was not considered when upstream implemented WL#7846 (MTS:
slave-preserve-commit-order when log-slave-updates/binlog is disabled).

With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired
by the client thread (binlog flush leader) when it tries to perform the GTID
update on behalf of threads waiting in the "Commit Order" queue, thus
guaranteeing that the `Gtid_state::commit_group_sidnos` array is never
accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
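A stripped-down sketch of the guarantee this patch adds, with a std::mutex standing in for MYSQL_BIN_LOG::LOCK_commit and a plain vector standing in for Gtid_state::commit_group_sidnos (all names here are placeholders, not the server's real code): both the applier leader and the binlog flush leader must take the commit lock around the set / own-GTID / unset sequence, so one thread can no longer observe the other's half-finished bookkeeping and leak a sidno lock.

```
#include <mutex>
#include <vector>

// Placeholder stand-ins for MYSQL_BIN_LOG::LOCK_commit and
// Gtid_state::commit_group_sidnos; not the server's real declarations.
std::mutex commit_lock;
std::vector<bool> commit_group_sidnos(16, false);

// Both the applier leader and the binlog flush leader go through this path;
// holding commit_lock makes the set / own-GTID / unset sequence atomic, so
// neither thread can miss the final unset and leak the sidno lock.
void update_commit_group_for_sidno(int sidno) {
  std::lock_guard<std::mutex> guard(commit_lock);
  commit_group_sidnos[sidno] = true;
  // ... lock_sidno(sidno) and update the owned GTID in the real code ...
  if (commit_group_sidnos[sidno]) {
    // ... unlock_sidno(sidno) in the real code ...
    commit_group_sidnos[sidno] = false;
  }
}
```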
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
When built with ASAN, a use-after-free is reported for the TcpPortPool.

AddressSanitizer: heap-use-after-free on address 0x60200019f190 at pc
0x00000076a18d bp 0x7fff51e7d1d0 sp 0x7fff51e7d1c0

    #4 0x770b73 in UniqueId::ProcessUniqueIds::erase(unsigned int)
       ../router/tests/helpers/tcp_port_pool.h:112
    #5 0x770c48 in UniqueId::~UniqueId()
       ../router/tests/helpers/tcp_port_pool.cc:234
    ...
    #12 0x82faa3 in testing::UnitTest::~UnitTest()
	../extra/googletest/googletest-release-1.12.0/googletest/src/gtest.cc:5496
    #13 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8)

0x60200019f190 is located 0 bytes inside of 16-byte region
[0x60200019f190,0x60200019f1a0)
freed by thread T0 here:
    #0 0x7f5fe3cbd10f in operator delete(void*, unsigned long)
       (/lib64/libasan.so.6+0xb710f)
    #1 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8)

Background
==========

__run_exit_handlers destroys "static" and "global" variables in reverse
order of their creation.

googletest's UnitTest is a static, and the TcpPortPool also has
ProcessUniqueIds, which contains the process-wide unique-ids.

At construct: unittest -> tcp-port-pool -> process-unique-ids
At destruct : process-unique-ids -> tcp-port-pool -> 💥

The use-after-free happens as the process-unique-ids static is
destructed before the tcp-port-pool, which tries to erase its Ids from the
process-unique-ids.

Change
======

- extend the lifetime of the process-unique-ids to after the last use of
  the tcp-port-pool via a std::shared_ptr<>

Change-Id: I75b8b781e1d240f18ca72f2c86182639a7699f06
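A self-contained sketch of the lifetime fix (hypothetical stand-ins, not the actual router test helpers): because the pool holds a std::shared_ptr to the process-wide id registry, the registry stays alive until the last pool is destroyed, regardless of the order in which __run_exit_handlers tears down statics.

```
#include <memory>
#include <set>

// Hypothetical stand-ins for ProcessUniqueIds and TcpPortPool.
struct ProcessUniqueIds {
  std::set<unsigned> ids;
  static std::shared_ptr<ProcessUniqueIds> instance() {
    // The shared_ptr lives in a function-local static; every pool that
    // grabs a copy keeps the registry alive past its own destruction.
    static auto self = std::make_shared<ProcessUniqueIds>();
    return self;
  }
};

class TcpPortPool {
 public:
  TcpPortPool() : ids_(ProcessUniqueIds::instance()) { ids_->ids.insert(42); }
  ~TcpPortPool() {
    // Safe even if the registry's own static pointer was already destroyed:
    // our shared_ptr copy keeps the object itself alive.
    ids_->ids.erase(42);
  }

 private:
  std::shared_ptr<ProcessUniqueIds> ids_;
};

int main() {
  static TcpPortPool pool;  // destroyed by exit handlers, after main()
  return 0;
}
```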
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
…nt on Windows and posix [#1]

When passing arguments to NdbProcess::create it will become important,
when introducing quoting, to distinguish spaces that are part of the
argument value from spaces acting as an argument separator.

This patch removes current uses of space as a separator in arguments to
NdbProcess::create.

Change-Id: I1d1bab27e183fc33632bfd9974010129a8970365
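A tiny generic illustration of the convention this patch moves to (not the NdbProcess API itself): each argument becomes its own element, so a space inside a value can never be confused with an argument separator, and quoting can later be added in one place.

```
#include <iostream>
#include <string>
#include <vector>

// Before: "--defaults-file=/tmp/my dir/my.cnf --ndb-nodeid=1" as one string,
// where the space inside "my dir" is ambiguous.
// After: one vector element per argument, so there is no separator ambiguity.
int main() {
  std::vector<std::string> args = {
      "--defaults-file=/tmp/my dir/my.cnf",  // embedded space stays intact
      "--ndb-nodeid=1",
  };
  for (const auto &a : args) std::cout << "[" << a << "]\n";
  return 0;
}
```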
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
Problem:
Starting `ndb_mgmd --bind-address` may potentially cause abnormal
program termination in the MgmtSrvr destructor when ndb_mgmd restarts itself.

  Core was generated by `ndb_mgmd --defa'.
  Program terminated with signal SIGABRT,   Aborted.
  #0  0x00007f8ce4066b8f in raise () from /lib64/libc.so.6
  #1  0x00007f8ce4039ea5 in abort () from /lib64/libc.so.6
  #2  0x00007f8ce40a7d97 in __libc_message () from /lib64/libc.so.6
  #3  0x00007f8ce40af08c in malloc_printerr () from /lib64/libc.so.6
  #4  0x00007f8ce40b132d in _int_free () from /lib64/libc.so.6
  #5  0x00000000006e9ffe in MgmtSrvr::~MgmtSrvr (this=0x28de4b0) at
mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp:
890
  #6  0x00000000006ea09e in MgmtSrvr::~MgmtSrvr (this=0x2) at mysql/8.0/
storage/ndb/src/mgmsrv/MgmtSrvr.cpp:849
  #7  0x0000000000700d94 in mgmd_run () at
mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:260
  #8  0x0000000000700775 in mgmd_main (argc=<optimized out>,
argv=0x28041d0) at mysql/8.0/storage/ndb/src/
mgmsrv/main.cpp:479

Analysis:
While starting up, ndb_mgmd will allocate memory for bind_address in
order to potentially rewrite the parameter. When ndb_mgmd restarts itself
the memory is released, leaving a dangling pointer and causing a double free.

Fix:
Drop support for bind_address=[::]; it is not documented anywhere, is
not useful and doesn't work.
This means the need to rewrite bind_address is gone and the bind_address
argument needs neither alloc nor free.

Change-Id: I7797109b9d8391394587188d64d4b1f398887e94
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
This worklog introduces dynamic offload of queries to RAPID in the
following ways:

When the system variable rapid_use_dynamic_offload is 0/false, we fall
back to the normal cost threshold classifier, which also implies that
when use secondary engine is set to forced, eligible queries will go to
the secondary engine, regardless of cost threshold or this classifier.

When rapid_use_dynamic_offload is 1/true, we proceed to look for the
optimal execution engine for the query; if the secondary engine is
found to be more optimal, the query is offloaded, otherwise it is sent
back to mysql. This is handled in the following scenarios:

1. Static Scenario: When there's no Change Propagation or queue on the
RAPID side, this introduces a decision tree which has > 85 % precision
in training at predicting which queries should be faster on mysql and
which should be faster on RAPID, and accepts or rejects queries. The
decision tree takes around 20-100 microseconds for fast queries, hence
minimal overhead; for bigger queries this introduces an overhead of up
to a maximum observed 700 microseconds, but these end up with long
execution time anyway, hence not a problem. For very fast queries,
defined here as having cost < 10 and being of the point-select form,
dynamic offload is not applied, since 100 % of these queries (out of
16667 samples) are faster on MySQL. Additionally, routing these "very
fast queries" through dynamic offload leads to performance regressions
due to the 3-phase optimisation.

2. Dynamic Scenario: When there's CP or queuing on RAPID, this worklog
 introduces dynamic feature normalization to take into account the
 extra catch-up time RAPID needs and, factoring that in, attempts to
 verify whether RAPID is still the best engine for execution. If the
 queue is too long or CP is too long, this mechanism progressively
 starts shifting queries to mysql, moving gradually towards the heavier
 queries.

The steps in this worklog with respect to query lifecycle in server with
secondary_engine = ON, are described below:

query
   |
Primary Tentatively optimisation -> mysql optimises for Innodb
   |
secondary_engine_pre_prepare_hook -> following Rapid function called:
   |  RapidCachePrimaryInfoAtPrimaryTentativelyStep
   |  If dynamic offload is enabled and query is not "very fast":
   |   This caches features from mysql plan in rapid_statement_context
   |   to be used for dynamic offload.
   |  If dynamic offload is disabled or the query is "very fast":
   |   This function invokes the standard mysql cost threshold classifier,
   |   which decides if query needs further RAPID optimisation.
   |
   |
   |-> if returns False, then query proceeds to Innodb for execution
   |-> if returns true, step below is called
   |
 Secondary optimisation -> mysql optimises for RAPID
   |
prepare_secondary_engine -> following Rapid function is called:
   |   RapidPrepareEstimateQueryCosts
   |     In this function, Dynamic offload combines mysql plan features
   |      retrieved from rapid_statement_context
   |     and RAPID info such as rapid base table cardinality,
   |     dict encoding projection, varlen projection size, rapid queue
   |     size in to decide if query should be offloaded to RAPID.
   |
   |->if returns True, then query proceeds to Innodb for execution
   |->if returns False, step below is called
   |
optimize_secondary_engine -> following Rapid function is called
   |    RapidOptimize
   |     In this function, Dynamic offload retrieves info from
   |     rapid_statement_context and additionally looks at Change
   |     propagation lag to decide if query should be offloaded to rapid
   |
   |->if returns True, then query proceeds to Innodb for execution
   |->if returns False, then query goes to Rapid Execution.

Following new MYSQL ERR log messages are printed with this WL, when
dynamic offload is enabled, and query is not a "very fast query".

1. SelOffload allow decision 1 : as secondary not forced 1 and enable
 var value 1 and transactional enabled 1 and( big shape detected 0
  or small shape detected 1 ) inno: 10737418240 , rpd: 4294967296 ,
   no lh table: 1

   A message such as this shows whether dynamic offload is used to
   classify this query or not and, if not, why not, using each of the
   conditions. 1 = pass, 0 = not pass.

2. myqid=65 Selective offload classifier #1#1#1
    f_mysql_total_ts_nrows <= 2105.5 : 0.173916, f_MySQLCost <=
    68.3899040222168 : 0.028218, f_count_all_base_tables = 0 ,
    f_count_ref_index_ts = 0 ,f_BaseTableSumNrows <= 278177.5 :
    0.173916 are_all_ts_index_ref = true outcome=0

   A line such as this serialises which leg of the decision tree
   decided the outcome of this query: 0 -> back to mysql, 1 -> keep on
   rapid. Each leg is uniquely searchable via an identifier such as
   #1#1#1 here.

This worklog additionally introduces python scripts to run queries on
mysql client with multiple queries and multiple dmls at once, in
various modes such as simulator mode and standard benchmark modes.

By Default this WL is enabled, but before release it will be disabled.
This is tracked via BUG#36343189 #no-close.

Perf mode unittests will be enabled on jenkins after this wl.
Further cleanup will be done via BUG#36368437 #no-close.

Bugs tackled via this WL: 	BUG#35738194, Enh#34132523, Bug#36343208

Unrelated bugs fixed: BUG#35987975

Old gerrit review : 25567 (abandoned due to 1000 update limit reached)

Change-Id: Ie5f9fdcd8b55a669d04b389d3aec5f6b33f0fe2e
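A compact sketch of the decision flow described above. All names, features, and thresholds below are placeholders for illustration; the shipped classifier is a trained decision tree inside the RAPID plugin, not this hand-written rule set.

```
#include <cstdint>

// Placeholder plan and load features; in the server these come from the
// MySQL plan cached in rapid_statement_context and from RAPID runtime state.
struct PlanFeatures {
  double mysql_cost = 0.0;
  bool is_point_select = false;
  std::uint64_t base_table_rows = 0;
};

struct RapidLoad {
  double queue_length = 0.0;            // pending work queued on RAPID
  double change_propagation_lag = 0.0;  // catch-up time RAPID still needs
};

enum class Engine { kInnoDB, kRapid };

// Hand-written stand-in for the trained decision tree; thresholds are
// arbitrary illustration values, not the shipped classifier.
Engine choose_engine(const PlanFeatures &f, const RapidLoad &load,
                     bool dynamic_offload_enabled) {
  // "Very fast" queries (cost < 10, point select) always stay on MySQL.
  if (f.is_point_select && f.mysql_cost < 10.0) return Engine::kInnoDB;

  if (!dynamic_offload_enabled) {
    // Fall back to a plain cost-threshold classifier.
    return f.mysql_cost > 100000.0 ? Engine::kRapid : Engine::kInnoDB;
  }

  // Dynamic scenario: penalise RAPID by its queue and CP catch-up time, so
  // queries gradually shift back to MySQL as RAPID falls behind.
  const double penalty = load.queue_length + load.change_propagation_lag;
  const double rapid_benefit =
      static_cast<double>(f.base_table_rows) / (1.0 + penalty);
  return rapid_benefit > f.mysql_cost ? Engine::kRapid : Engine::kInnoDB;
}
```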
bjornmu pushed a commit that referenced this pull request Jul 1, 2024
… for connection xxx'.

The new iterator based explains are not impacted.

The issue here is a race condition. More than one thread is using the
query term iterator at the same time (which is neither thread safe nor
reentrant), and part of its state is in the query terms being visited,
which leads to interference/race conditions.

a) the explain thread

uses an iterator here:

   Sql_cmd_explain_other_thread::execute

is inspecting the Query_expression of the running query
calling master_query_expression()->find_blocks_query_term which uses
an iterator over the query terms in the query expression:

   for (auto qt : query_terms<>()) {
       if (qt->query_block() == qb) {
           return qt;
       }
   }

the above search fails to find qb due to the interference of
thread b), see below, and then tries to access a null pointer:

    * thread #36, name = ‘connection’, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11
  frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50
  frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21
  frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26
  frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32
  frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9
  frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14
  frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20
  frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11
  frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13
  frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7
  frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17
  frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13
 * frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11
  frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29
  frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19
  frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7
  frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18
  frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13
  frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3
  frame #20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99
  frame #21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15

b) the query thread being explained is itself performing LEX::cleanup
and as part of that iterates over the query terms, but still allows
EXPLAIN of the query plan since

   thd->query_plan.set_query_plan(SQLCOM_END, ...)

hasn't been called yet.

     20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613)
     21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861)
     22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286)
     30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799)
     31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648)
     32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303)
     33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135)
     34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464)
     57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304)
     58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053)
     65:frame: _pthread_start (in libsystem_pthread.dylib) + 99
     66:frame: thread_start (in libsystem_pthread.dylib) + 15

Solution:

This patch solves the issue by removing iterator state from
Query_term, making the query_term iterators thread safe. This solution
labels every child query_term with its index in its parent's
m_children vector.  The iterator can therefore easily compute the next
child to visit based on Query_term::m_sibling_idx.

A unit test case is added to check reentrancy.

One can also manually verify that we have no remaining race condition
by running two client connection files (with \. <file>) with a big
number of copies of the repro query in one connection and a big number
of EXPLAIN format=json FOR <connection>, e.g.

    EXPLAIN FORMAT=json FOR CONNECTION 8\G

in the other. The actual connection number would need to be verified
in connection one, of course.

Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
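A minimal sketch of the technique (generic tree code, not the server's Query_term classes): each node records only its index in its parent's child vector, and all traversal state lives in the iterator object, so two threads can walk the same tree at the same time without interfering.

```
#include <cstddef>
#include <iostream>
#include <vector>

struct Node {
  std::vector<Node *> children;
  std::size_t sibling_idx = 0;  // index of this node in parent->children
  Node *parent = nullptr;

  void add_child(Node *c) {
    c->parent = this;
    c->sibling_idx = children.size();
    children.push_back(c);
  }
};

// Depth-first cursor: all traversal state lives in the cursor itself,
// never in the nodes, so concurrent walks over the same tree do not clash.
struct DfsCursor {
  explicit DfsCursor(Node *root) : current(root) {}
  Node *current;

  Node *next() {
    if (!current->children.empty()) {  // descend to the first child
      current = current->children.front();
      return current;
    }
    Node *n = current;
    while (n->parent != nullptr) {     // climb until a right sibling exists
      std::size_t idx = n->sibling_idx + 1;
      if (idx < n->parent->children.size()) {
        current = n->parent->children[idx];
        return current;
      }
      n = n->parent;
    }
    return current = nullptr;          // traversal done
  }
};

int main() {
  Node root, a, b, c;
  root.add_child(&a);
  root.add_child(&b);
  a.add_child(&c);
  for (DfsCursor it(&root); it.current != nullptr; it.next())
    std::cout << it.current->sibling_idx << '\n';  // pre-order: 0 0 0 1
  return 0;
}
```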
msprajap pushed a commit that referenced this pull request Jul 1, 2024
laurynas-biveinis pushed a commit to laurynas-biveinis/mysql-server that referenced this pull request Jul 9, 2024
gipulla pushed a commit that referenced this pull request Oct 15, 2024
… and .6node3rpl

Issue #1
 Problem:
   Test fails in 4node4rpl (1 node group).
 Solution:
   Skip test when there is only one NG.

Issue #2
  Problem:
    Test fails in 6node3rpl (2 node groups) with timeout.
    The test idea is to restart, with the nostart option, *ALL* nodes
    in the same node group to check if QMGR wrongly handles it as
    "node group is missing".
    In the test only two nodes in the same node group are restarted;
    it works for 2-replica setups but, for 4 replicas, the test
    hangs waiting for the cluster to enter a noStart state.
  Solution:
   Instead of restarting exactly 2 nodes, restart ALL nodes in a
   given node group.

Change-Id: Iafb0511992a553723013e73593ea10540cd03661
dbussink added a commit to planetscale/mysql-server that referenced this pull request Nov 21, 2024
In case `with_ndb_home` is set, `buf` is allocated with `PATH_MAX` and
the home is already written into the buffer.

The additional path is written using `snprintf` and it starts off at
`len`. It can still write up to `PATH_MAX` bytes though, which is wrong:
since we already have the home written into the buffer, only
`PATH_MAX - len` bytes remain available.

On Ubuntu 24.04 with debug builds this is caught and it crashes:

```
*** buffer overflow detected ***: terminated
Signal 6 thrown, attempting backtrace.
stack_bottom = 0 thread_stack 0x0
 #0 0x604895341cb6 <unknown>
 mysql#1 0x7ff22524531f <unknown> at sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
 mysql#2 0x7ff22529eb1c __pthread_kill_implementation at ./nptl/pthread_kill.c:44
 mysql#3 0x7ff22529eb1c __pthread_kill_internal at ./nptl/pthread_kill.c:78
 mysql#4 0x7ff22529eb1c __GI___pthread_kill at ./nptl/pthread_kill.c:89
 mysql#5 0x7ff22524526d __GI_raise at sysdeps/posix/raise.c:26
 mysql#6 0x7ff2252288fe __GI_abort at ./stdlib/abort.c:79
 mysql#7 0x7ff2252297b5 __libc_message_impl at sysdeps/posix/libc_fatal.c:132
 mysql#8 0x7ff225336c18 __GI___fortify_fail at ./debug/fortify_fail.c:24
 mysql#9 0x7ff2253365d3 __GI___chk_fail at ./debug/chk_fail.c:28
 mysql#10 0x7ff225337db4 ___snprintf_chk at ./debug/snprintf_chk.c:29
 mysql#11 0x6048953593ba <unknown>
 mysql#12 0x604895331a3d <unknown>
 mysql#13 0x6048953206e7 <unknown>
 mysql#14 0x60489531f4b1 <unknown>
 mysql#15 0x60489531e8e6 <unknown>
 mysql#16 0x7ff22522a1c9 __libc_start_call_main at sysdeps/nptl/libc_start_call_main.h:58
 mysql#17 0x7ff22522a28a __libc_start_main_impl at csu/libc-start.c:360
 mysql#18 0x60489531ed54 <unknown>
 mysql#19 0xffffffffffffffff <unknown>
```

In practice this buffer overflow would only happen with very long paths.

Signed-off-by: Dirkjan Bussink <[email protected]>
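A minimal sketch of the bound fix (illustrative names, not the actual NDB code): once `len` bytes of the home prefix are in `buf`, the second `snprintf` may only be given the remaining `PATH_MAX - len` bytes.

```
#include <climits>  // PATH_MAX on Linux
#include <cstdio>

// Illustrative only: append `name` to an optional `home` prefix in `buf`,
// which has room for PATH_MAX bytes.
static void build_path(char *buf, const char *home, const char *name) {
  int len = 0;
  if (home != nullptr) {
    len = std::snprintf(buf, PATH_MAX, "%s/", home);
    if (len < 0 || len >= PATH_MAX) return;  // prefix alone filled the buffer
  }
  // Wrong (the bug): std::snprintf(buf + len, PATH_MAX, "%s", name);
  // Right: only PATH_MAX - len bytes remain after the prefix.
  std::snprintf(buf + len, PATH_MAX - len, "%s", name);
}

int main() {
  char buf[PATH_MAX];
  build_path(buf, "/var/lib/mysql-cluster", "config.ini");
  std::printf("%s\n", buf);
  return 0;
}
```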
VarunNagaraju added a commit to VarunNagaraju/mysql-server that referenced this pull request Jan 3, 2025
bjornmu pushed a commit that referenced this pull request Jan 21, 2025
Description:
============
Dropping a primary key and adding a new auto-increment column as a
primary key in descending order using the "inplace" algorithm fails.

Analysis:
=========
Dropping an existing primary key and adding a new auto-increment key in
descending order requires arranging the records in reverse order,
which necessitates a file sort. However, this scenario was not detected
in the method innobase_pk_order_preserved(), causing it to return false.

As a result, the ALTER INPLACE operation, which calls this method,
skips the file sort. Instead, it processes the primary key as usual
in batches, a method known as bulk mode. In bulk mode, records are
inserted into a sort buffer (in descending order in this case).
When the sort buffer becomes full, records are directly inserted into
the B-tree.

Consider a case where we have 2000 records, and the sort buffer can
hold 1000 records in a batch:

Batch #1 inserted: Records 1000 to 1 (in descending order)
Batch #2 inserted: Records 2000 to 1001 (in descending order)

If the records from both batches happen to be in the same page,
the record order is violated.

It's important to note that this record order violation would still
exist even if the sort buffer were skipped when file sort was skipped.
Therefore, enabling file sort is essential to ensure correct record
order across batches.

Fix:
====
Enable file sort when adding an auto-increment column in descending order.

This patch is based on the contribution from Shaohua Wang at Alibaba
Group. We thank you for contributing to MySQL.

Change-Id: I398173bbd27db7f5e29218d217bf11c30297c242
lurkingryuu pushed a commit to lurkingryuu/asql that referenced this pull request Apr 13, 2025
all working, super remaining
bjornmu pushed a commit that referenced this pull request Apr 15, 2025
Post-push fix for broken unit test mdl-t

In Debug mode:
mdl-t: sql/mdl.h:481: void MDL_key::mdl_key_init(enum_mdl_namespace, const char *, const char *):
    Assertion `!use_normalized_object_name()' failed.
unit test got signal 6
stack_bottom = 0 thread_stack 0x0
 #0 0x67f2b7 _ZL14signal_handleri at unittest/gunit/gunit_test_main.cc:62
 #1 0x7f9fd3e4fcff <unknown> at sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
 ....
 #7 0x645bda _ZN7MDL_key12mdl_key_initENS_18enum_mdl_namespaceEPKcS2_ at sql/mdl.h:481
 #8 0x65f6d1 _ZN11MDL_request16init_with_sourceEN7MDL_key18enum_mdl_namespaceEPKcS3_13enum_mdl_type17enum_mdl_durationS3_j at sql/mdl.cc:1520
 #9 0x624783 _ZN12mdl_unittest39MDLHtonNotifyTest_NotifyNamespaces_Test8TestBodyEv at unittest/gunit/mdl-t.cc:3893

In RelWithDebInfo mode:
unittest/gunit/mdl-t.cc:3902: Failure
Expected equality of these values:
  1U
    Which is: 1
  pre_acquire_count()
    Which is: 0

Change-Id: I155ef98b40fb521a1721ba4f34e3a315ef847626