Background:
In MDEV-33474, we introduced runtime dependency packaging primarily to
support libcurl and other potential third-party dependencies from vcpkg.
Problem:
The INSTALL(RUNTIME_DEPENDENCY_SET) command was failing at packaging step
unless shared libraries from the same build were explicitly excluded via
PRE_EXCLUDE_REGEXES. While initially only server.dll was excluded this way,
this turned out insufficient for users compiling their own plugins
Solution:
Exclude all linked shared libraries from the same build via
PRE_EXCLUDE_REGEXES. Move dependency detection and install to the end of
CMake processing, after all add_library/add_executable calls, when all
targets are known.
Also made the INSTALL_RUNTIME_DEPENDENCIES variable independent of vcpkg
detection, for simplicity.
Follow-up 9e1c1d429f
Item_func_hybrid_field_type::val_real_from_int_op() might return wrong
result. This makes mtr fail on Windows/ARM , at least in main.func_format
and main.sp-vars
Many to Chengya/coneco-cy for debugging the issue and finding the cause.
Initialize UErrorCode to U_ZERO_ERROR before passing it to
ucal_getDefaultTimeZone(), as required by the ICU conventions.
Passing an uninitialized status variable leads to undefined behavior and
non-deterministic failures.
This issue was not observed on x64 builds, but appeared during
Windows/ARM64 CI testing. The lack of initialization led to
inconsistent fallback between ICU and native Windows time zone.
mtr tests with restarts showed the alternation of system_time_zone,
as reported by check_testcase after the test.
-SYSTEM_TIME_ZONE Coordinated Universal Time
+SYSTEM_TIME_ZONE Etc/UTC
In commit b6923420f3 (MDEV-29445)
some hash tables were accidentally created with the minimum size
(101 entries) instead of correctly deriving the size from the
initial innodb_buffer_pool_size. This led to very long hash bucket
chains, which are very slow to traverse.
ut_find_prime(): Assert that the size is nonzero in order to catch
this type of regression in the future.
innodb_init_params(): Do not bother reading buf_pool.curr_size()
when it is known to be 0,
srv_start(): Correctly initialize srv_lock_table_size to 5 times
buf_pool.curr_size(), that is, the buffer pool size in pages,
between invoking buf_pool.create() and lock_sys.create().
btr_search_enable(), dict_sys_t::create(), dict_sys_t::resize():
Correctly refer to buf_pool.curr_pool_size(), that is,
innodb_buffer_pool_size in bytes, when calculating the hash table size.
In MDEV-29445 the expressions buf_pool_get_curr_size() were
accidentally replaced with buf_pool.curr_size().
buf_buddy_shrink(): Properly cover the case when KEY_BLOCK_SIZE
corresponds to the innodb_page_size, that is, the ROW_FORMAT=COMPRESSED
page frame is directly allocated from the buffer pool, not via the
binary buddy allocator.
buf_LRU_check_size_of_non_data_objects(): Avoid a crash when the
buffer pool is being shrunk.
buf_pool_t::shrink(): Abort if over 95% of the shrunk buffer pool
would be occupied by the adaptive hash index or record locks.
In commit b6923420f3 (MDEV-29445)
we started to specify the MAP_POPULATE flag for allocating the
InnoDB buffer pool. This would cause a lot of time to be spent
on __mm_populate() inside the Linux kernel, such as 16 seconds
to pre-fault or commit innodb_buffer_pool_size=64G.
Let us revert to the previous way of allocating the buffer pool
at startup. Note: An attempt to increase the buffer pool size by
SET GLOBAL innodb_buffer_pool_size (up to innodb_buffer_pool_size_max)
will invoke my_virtual_mem_commit(), which will use MAP_POPULATE
to zero-fill and prefault the requested additional memory area, blocking
buf_pool.mutex.
Before MDEV-29445 we allocated the InnoDB buffer pool by invoking
mmap(2) once (via my_large_malloc()). After the change, we would
invoke mmap(2) twice, first via my_virtual_mem_reserve() and then
via my_virtual_mem_commit(). Outside Microsoft Windows, we are
reverting back to my_large_malloc() like allocation.
my_virtual_mem_reserve(): Define only for Microsoft Windows.
Other platforms should invoke my_large_virtual_alloc() and
update_malloc_size() instead of my_virtual_mem_reserve() and
my_virtual_mem_commit().
my_large_virtual_alloc(): Define only outside Microsoft Windows.
Do not specify MAP_NORESERVE nor MAP_POPULATE, to preserve compatibility
with my_large_malloc(). Were MAP_POPULATE specified, the mmap()
system call would be significantly slower, for example 18 seconds
to reserve 64 GiB upfront.
log_t::append_prepare_wait(): Do not attempt to read log_sys.write_lsn
because it is not protected by log_sys.latch but by write_lock, which
we cannot hold here. The assertion could fail if log_t::write_buf()
is executing concurrently, and it has not yet executed log_write_buf()
or updated log_sys.write_lsn.
Fixes up commit acd071f599 (MDEV-21923)
There were two issues with the test:
1. A race between a race_condition.inc and status variable, where the
status variable Rpl_semi_sync_master_status could be ON before the
semi-sync connection finished establishing, resulting in
Rpl_semi_sync_master_clients showing 0 (instead of 1). To fix this,
we simply instead wait for Rpl_semi_sync_master_clients to be 1
before proceeding.
2. Another race between a race_condition.inc and status variable,
where the wait_condition waited on a process_list command of
'BINLOG DUMP' to disappear to infer the binlog dump thread was
killed, to where we then verified semi-sync state was correct
using status variables. However, the 'BINLOG DUMP' command is
overridden with a killed status before the semi-sync tear-down
happens, and thereby we could see invalid values. The fix for
this is to change the wait_condition to instead wait for the
connection with the replication user is gone, because that stays
through the binlog dump thread tear-down life-cycle
As seen with openwrt and some other distros, the
determination of hostname can sometime need alternate
commmands.
This provides a cmake option HOSTNAME for non-windows machines
for the mariadb-install-db and mariadbd-safe scripts
and the support-files init scripts..
The minimum statistics level now is rocksdb::StatsLevel::kDisableAll.
The default remains rocksdb::StatsLevel::kExceptHistogramOrTimers
which is now 1 (it used to be 0).
Hopefully, this ends the long story of CapabilityBoundingSet
in mariadb.service.
Started from MDEV-9095 (27e6fd9a59) which was supposed
to let --memlock work without root, but instead of
adding the necessary capability (CAP_IPC_LOCK) by putting it into
AmbientCapabilities it removed all other capabilities,
by putting CAP_IPC_LOCK into CapabilityBoundingSet
(which is the mask of allowed capabilities).
This broke pam plugin, which needed CAP_DAC_OVERRIDE,
it was fixed in MDEV-19878 (dd93028dae) by appending
CAP_DAC_OVERRIDE to CapabilityBoundingSet.
Obviously, memlock still didn't work, this was fixed
in MDEV-33301 (76a27155b4) by moving CAP_IPC_LOCK
to AmbientCapabilities.
Unfortunately, it moved too much (everything), so
MDEV-36229 (85ecb80fa3) fixed it moving CAP_DAC_OVERRIDE
back to CapabilityBoundingSet.
This caused MDEV-36591 (8925877dc8) triggering a bug in old
systemd versions. And it broke pam plugin on CentOS Stream 10,
where CAP_DAC_OVERRIDE alone was apparently not enough.
Let's finally fix this by removing CapabilityBoundingSet
completely and keeping CAP_IPC_LOCK in AmbientCapabilities,
which should've been the correct fix for MDEV-9095 from the start.
Get rid of need of matherialization for usual INSERT (cache results in
Item_cache* if needed)
- subqueries in VALUE do not see new records in the table we are
inserting to
- subqueries in RETIRNING prohibited to use the table we are inserting to
According to buildbot cross-reference atomic.alter_table is failing ~10
times a day due to 900 seconds test case timeout. Split it into two
tests: atomic.alter_table_myisam and atomic.alter_table_innodb.
It should reduce failure frequency down to once a day or so, similarly
to atomic.alter_table_aria test.
Diabling InnoDB for MyISAM/Aria/RocksDB tests makes them 20-35% faster.
inside the get_lookup_field_values when lower_case_table_names's value
is 2 this should reserve the case for the table's and database's names
so this commit changes the condition to lowercase only when the
lower_case_table_name's value is 1 not just 1 and 2 "any value not equal
0"
SELECT FROM information_schema.metadata_lock_info may sporadically
return MDL_SHARED lock acquired by InnoDB purge thread.
Fix proposed in a7bf0a42d0 wasn't enough to solve this problem:
apparently it did protect against purging old rows, but not against
locking the table.
Reverted a7bf0a42d0 in favour of solution proposed in e996f77cd8.
That is use wait_all_purged.inc to make sure purge is completed.
Reason:
=======
- MDEV-16239 does apply the DML logs after bulk insert for
ALTER TABLE..ALGORITHM=COPY, but InnoDB fails to reset the bulk_insert
in ha_innobase::extra(HA_EXTRA_END_ALTER_COPY). This leads to crash
while applying DML logs.
Solution:
=======
ha_innobase::extra(HA_EXTRA_END_ALTER_COPY): Reset TRX_DDL_BULK at the
end of bulk insert operation
When a replica stops an established semi-sync connection, it is
supposed to kill the corresponding binlog dump thread on the primary
server. However, when connections are configured to use SSL, this new
connection created by the replica to kill the dump thread doesn't have
any logic to configure SSL options, and thereby the connection can't be
made, and the dump thread will never be killed.
This patch adds logic to configure the semi-sync kill connection with
SSL. The exising logic to set up the connection options for the regular
connection was extracted into a function that the semi-sync kill
connection invokes.
Co-author: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
A statement SET GLOBAL innodb_buffer_pool_size=...
could fail for no good reason when the buffer pool contains many
pages that can actually be evicted.
buf_flush_LRU_list_batch(): Keep evicting as long as the buffer pool
is being shrunk, for at most innodb_lru_scan_depth extra blocks.
Disregard the flush limit for pages that are marked as freed in files.
buf_flush_LRU_to_withdraw(): Update the to_withdraw target during
buf_flush_LRU_list_batch().
buf_pool_t::will_be_withdrawn(): Allow also ptr=nullptr (the condition
will not hold for it).
This fixes a regression that was introduced in
commit b6923420f3 (MDEV-29445)
and caught by the test innodb.temp_truncate_freed in MariaDB Server 11.4.
Tested by: Thirunarayanan Balathandayuthapani
Reviewed by: Thirunarayanan Balathandayuthapani
Combined AmbientCapabilities and CapabilityBoundingSet configuration
within a service file we have found by testing aren't supported in the
systemd v245 (Ubuntu 20.04) and v239 (RHEL8) for non-root users. This
resulted in a service start error EXIT_CAPABILITIES, a systemd limitation
of the version that we cannot work around consequences.
The systemd version 247 these combined capabilities have been tested to
work on Debian 11. No other supported major distros run systemd
version 246, and if they did, the missing capability of CAP_IPC_LOCK
won't be noticed as it was a convenience for --memlock users.
As such we disable the AmbientCapabilites for CAP_IPC_LOCK rather
that disabling the CapabilityBoundingSet, because doing the later
will disable authentication for MariaDB users that have configured PAM
with MariaDB.
Should a user require CAP_IPC_LOCK they can append in their own
systemd overlay file this configuration in the CapabilityBoundingSet
and configure the capability file attributes on the mariadbd executable
to have the IPC_LOCK capability. This isn't configured by default as the
presence of a capability in the MariaDB Server is detected by
openssl libraries as "insecure" which will then ignore any user configured TLS
configuration file passed though by the OPENSSL_CONF environment variable.
Test changes only. Idea of the test is to test two concurrent
nonconflicting DDL-clauses. Therefore, use debug sync to
really execute two DDL-clauses as concurrently as Galera
allows.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>