Commit Graph

92 Commits

Author SHA1 Message Date
1c7209e828 Merge 10.6 into 10.11 2025-05-21 07:36:35 +03:00
e69f8cae1a Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
98dbe3bfaf Merge 10.5 into 10.6 2025-01-20 09:57:37 +02:00
901c6c7ab6 MDEV-33064: Sync trx->wsrep state from THD on trx start
InnoDB transactions may be reused after committed:
- when taken from the transaction pool
- during a DDL operation execution

In this case wsrep flag on trx object is cleared, which may cause wrong
execution logic afterwards (wsrep-related hooks are not run).

Make trx->wsrep flag initialize from THD object only once on InnoDB transaction
start and don't change it throughout the transaction's lifetime.
The flag is reset at commit time as before.

Unconditionally set wsrep=OFF for THD objects that represent InnoDB background
threads.

Make Wsrep_schema::store_view() operate in its own transaction.

Fix streaming replication transactions' fragments rollback to not switch
THD->wsrep value during transaction's execution
(use THD->wsrep_ignore_table as a workaround).

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-14 02:17:22 +01:00
133e26fd7d MDEV-34924 : gtid_slave_pos table neven been deleted on non replica nodes (wsrep_gtid_mode = 1)
Problem was caused by MDEV-31413 commit 277968aa where
mysql.gtid_slave_pos table was replicated by Galera.
However, as not all nodes in Galera cluster are replica
nodes, rows were not deleted from table.

In this fix this is corrected so that mysql.gtid_slave_pos
table is not replicated by Galera. Instead when Galera
node receives GTID event and wsrep_gtid_mode=1, this event
is stored to mysql.gtid_slave_pos table.

Added test case galera_2primary_replica for 2 async
primaries replicating to galera cluster.

Added test case galera_circular_replication where
async primary replicates to galera cluster and
one of the galera cluster nodes is master
to async replica.

Modified test case galera_restart_replica to monitor
gtid positions and rows in mysql.gtid_pos_table.
2025-01-13 19:14:26 +01:00
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
e821c9fa7c MDEV-35281 SR transaction crashes with innodb_snapshot_isolation
Ignore snapshot isolation conflict during fragment removal, before
streaming transaction commits. This happens when a streaming
transaction creates a read view that precedes the INSERTion of
fragments into the streaming_log table. Fragments are INSERTed
using a different transaction. These fragment are then removed
as part of COMMIT of the streaming transaction. This fragment
removal operation could fail when the fragments were not part
the transaction's read view, thus violating snapshot isolation.
2024-11-29 08:06:32 +01:00
3d0fb15028 Merge branch '10.6' into 10.11 2024-10-29 15:24:38 +01:00
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.

The following code shows the issue

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocates a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
d058be62b8 Merge branch '10.6' into '10.11' 2024-09-02 03:49:03 +02:00
bac0804d81 Merge branch '10.5' into '10.6' 2024-09-01 06:51:25 +02:00
9091afdc55 MDEV-31173 : Server crashes when setting wsrep_cluster_address after adding invalid value to wsrep_allowlist table
Problem was that wsrep_schema tables were not marked as
category information. Fix allows access to wsrep_schema
tables even when node is detached.

This is 10.4-10.9 version of fix.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-29 13:41:23 +02:00
034a175982 Merge branch '10.6' into 10.11 2024-07-04 11:52:07 +02:00
cfbd57dfb7 MDEV-33064: Sync trx->wsrep state from THD on trx start
InnoDB transactions may be reused after committed:
- when taken from the transaction pool
- during a DDL operation execution

In this case wsrep flag on trx object is cleared, which may cause wrong
execution logic afterwards (wsrep-related hooks are not run).

Make trx->wsrep flag initialize from THD object only once on InnoDB transaction
start and don't change it throughout the transaction's lifetime.
The flag is reset at commit time as before.

Unconditionally set wsrep=OFF for THD objects that represent InnoDB background
threads.

Make Wsrep_schema::store_view() operate in its own transaction.

Fix streaming replication transactions' fragments rollback to not switch
THD->wsrep value during transaction's execution
(use THD->wsrep_ignore_table as a workaround).

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-07-01 13:07:39 +02:00
0aae11ac28 Merge branch '10.6' into 10.11 2024-04-30 16:56:49 +02:00
ef7a2344af Fixup 0ccdf54b64
0ccdf54 removed stack allocated THD objects from functions
Wsrep_schema::replay_transaction(). However, it inadvertedly
anticipated the destruction of the THD, causing assertions and usage
of THD after it was destroyed.
The fix consists in extracting the original function into a separate
function, and leave the allocation and destruction of the THD object
in Wsrep_schema::replay_transaction(), making sure that using the heap
allocated THD has no side effects.
Same for Wsrep_schema::recover_sr_transactions().

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-04-26 11:18:36 +02:00
c9b1ebee2f Merge branch '10.6' into 10.11 2024-04-26 08:02:49 +02:00
0ccdf54b64 Check and remove high stack usage
I checked all stack overflow potential problems found with
gcc -Wstack-usage=16384
and
clang -Wframe-larger-than=16384 -no-inline

Fixes:
Added '#pragma clang diagnostic ignored "-Wframe-larger-than="'
  to a lot of function to where stack usage large but resonable.
- Added stack check warnings to BUILD scrips when using clang and debug.

Function changed to use malloc instead allocating things on stack:
- read_bootstrap_query() now allocates line_buffer (20000 bytes) with
  malloc() instead of using stack. This has a small performance impact
  but this is not releant for bootstrap.
- mroonga grn_select() used 65856 bytes on stack. Changed it to use
  malloc().
- Wsrep_schema::replay_transaction() and
  Wsrep_schema::recover_sr_transactions().
- Connect zipOpen3()

Not fixed:
- mroonga/vendor/groonga/lib/expr.c grn_proc_call() uses
  43712 byte on stack.  However this is not easy to fix as the stack
  used is caused by a lot of code generated by defines.
- Most changes in mroonga/groonga where only adding of pragmas to disable
  stack warnings.
- rocksdb/options/options_helper.cc uses 20288 of stack space.
  (no reason to fix except to get rid of the compiler warning)
- Causes using alloca() where the allocation size is resonable.
- An issue in libmariadb (reported to connectors).
2024-04-23 14:12:31 +03:00
018d537ec1 Merge branch '10.6' into 10.11 2024-04-22 15:23:10 +02:00
829cb1a49c Merge 10.5 into 10.6 2024-04-17 14:14:58 +03:00
9b18275623 Merge branch '10.4' into 10.5 2024-04-16 11:04:14 +02:00
16aa4b5f59 Merge from 10.4 to 10.5
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-15 17:46:49 +02:00
a618ff2b1c MDEV-33216 stack-use-after-return in Wsrep_schema_impl::open_table()
Fix a case of stack-use-after-return reported by ASAN in
Wsrep_schema_impl::open_table(). This function has a stack allocated
TABLE_LIST object and return TABLE_LIST::table to the caller.
Changed the function to take a TABLE_LIST pointer as argument.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-03-28 10:55:37 +01:00
e0c8165487 MDEV-33509 Failed to apply write set with flags=(rollback|pa_unsafe)
Fix function `remove_fragment()` in wsrep_schema so that no error is
raised if the fragment to be removed is not found in the
wsrep_streaming_log table. This is necessary to handle the case where
streaming transaction in idle state is BF aborted. This may result in
the case where the rollbacker thread successfully removes the
transaction's fragments, followed by the applier's attempt to remove
the same fragments. Causing the node to leave the cluster after
reporting a "Failed to apply write set" error.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-03-26 05:56:37 +01:00
ced243a099 Merge branch '10.9' into 10.10 2023-08-05 20:34:09 +02:00
6bf8483cac Merge branch '10.5' into 10.6 2023-08-01 15:08:52 +02:00
f52954ef42 Merge commit '10.4' into 10.5 2023-07-20 11:54:52 +02:00
9f909e546e MDEV-30197 : Missing DBUG_RETURN or DBUG_VOID_RETURN macro in function "Wsrep_schema::restore_view()"
Here user is starting server with unsupported client charset.
We need to create wsrep_schema tables using explicit latin1
charset to avoid errors in restoring view.
2023-05-23 01:10:19 +02:00
956d6c4af9 MDEV-21479 : Galera 4 unable to query cluster state if not primary component
Set mysql.wsrep_cluster and mysql.wsrep_cluster_members as
TABLE_CATEGORY_INFORMATION as mysql.wsrep_streaming_log
so that they can be queried even if node is not primary
component.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-05-16 13:11:44 +02:00
13a294a2c9 Merge branch '10.9' into 10.10 2023-05-03 14:09:13 +02:00
043d69bbcc Merge branch '10.5' into 10.6 2023-05-03 09:51:25 +02:00
10e135b679 Merge branch 'bb-10.4-release' into bb-10.5-release 2023-05-02 15:47:10 +02:00
ef227762b1 MDEV-30838 Assertion `m_thd == _current_thd()'
- Update wsrep-lib which contains fix for the assertion
- Fix error handling for appending fragment to streaming log,
  make sure tables are closed after rollback.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-05-02 03:42:39 +02:00
345356b868 Merge 10.9 into 10.10 2023-02-16 11:36:38 +02:00
6aec87544c Merge 10.5 into 10.6 2023-02-10 13:03:01 +02:00
c41c79650a Merge 10.4 into 10.5 2023-02-10 12:02:11 +02:00
08c852026d Apply clang-tidy to remove empty constructors / destructors
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .

Code style changes have been done on top. The result of this change
leads to the following improvements:

1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
  ~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.

2. Compiler can better understand the intent of the code, thus it leads
   to more optimization possibilities. Additionally it enabled detecting
   unused variables that had an empty default constructor but not marked
   so explicitly.

   Particular change required following this patch in sql/opt_range.cc

   result_keys, an unused template class Bitmap now correctly issues
   unused variable warnings.

   Setting Bitmap template class constructor to default allows the compiler
   to identify that there are no side-effects when instantiating the class.
   Previously the compiler could not issue the warning as it assumed Bitmap
   class (being a template) would not be performing a NO-OP for its default
   constructor. This prevented the "unused variable warning".
2023-02-09 16:09:08 +02:00
c0a6ce61d8 Merge 10.9 into 10.10 2022-09-06 10:51:18 +03:00
9fefd440b5 Merge 10.5 into 10.6 2022-09-05 14:05:30 +03:00
ba987a46c9 Merge 10.4 into 10.5 2022-09-05 13:28:56 +03:00
2917bd0d2c Reduce compilation dependencies on wsrep_mysqld.h
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-08-31 11:05:23 +03:00
7864d955f3 Fixed regression on SST tests. We incorrectly used my_thread_end(), which
sets mysys_var pointer to NULL, so the next THD::THD will crash.
Removed my_thread_init()/end() pairs and because
Wsrep_allowlist_service::allowlist_cb is not always called from a
new thread added a thread to do so.

Fix co-authored by Sergei Golubchik <serg@mariadb.org>
and mkaruza <mario.karuza@galeracluster.com>

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-08-02 17:25:04 +03:00
7fdc993ec6 MDEV-27263 Cluster bootstrap node shows duplicate wsrep allowlist IP warning messages on each restart.
We should clear `wsrep_allowlist` table on bootstrap before writing to
it.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-08-02 17:24:28 +03:00
9743d0043e MDEV-27246 Implement a method to add IPs to allowlist for Galera Cluster node addresses that can make SST/IST requests
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-08-02 17:24:28 +03:00
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
49e3bd2cbc MDEV-27553 Assertion `inited==INDEX' failed: in ha_index_end()
In wsrep_schema code, call ha_index_end() only if the corresponding
ha_index_init() call succeeded.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-01-24 09:46:21 +02:00
a49e394399 Merge 10.5 into 10.6 2021-09-30 10:38:44 +03:00
690c472591 MDEV-21613 : galera_sr.GCF-1018B MTR failed: Failed to open table mysql.wsrep_streaming_log for writing
Query can be bf aborted already earlier and then we should not
even try to open table.
2021-09-27 15:09:50 +03:00