MariaDB-open-code

mirror of https://github.com/MariaDB/server.git synced 2025-07-29 11:45:32 +00:00

Author	SHA1	Message	Date
Oleksandr Byelkin	f1102da37a	Merge branch '11.8' into 12.0	2025-05-22 09:22:55 +02:00
Vasilii Lakhin	40c5b62531	Fix remaining typos	2025-04-29 11:18:00 +10:00
Sergei Golubchik	237e24497b	Merge remote-tracking branch 'github/bb-11.4-release' into bb-11.8-serg	2025-04-27 19:40:00 +02:00
Sergei Golubchik	9b824e62d4	Merge branch '11.8' into main	2025-04-18 17:11:01 +02:00
Alexander Barkov	10c063f9f0	MDEV-36213 Doubled memory usage (11.4.4 <-> 11.4.5) Fixing the code adding MySQL _0900_ collations as _uca1400_ aliases not to perform deep initialization of the corresponding _uca1400_ collations. Only basic initialization is now performed, which is enough to make these collations (both _0900_ and _uca1400_) visible in queries to INFORMATION_SCHEMA tables COLLATIONS and COLLATION_CHARACTER_SET_APPLICABILITY, as well as in SHOW COLLATION statements. Deep initialization is now performed only when a collation (either the _0900_ alias or the corresponding _uca1400_ collation) is used for the very first time after the server startup. Refactoring was done to make the code easier to maintain: - most of the _uca1400_ code was moved from ctype-uca.c to a new file ctype-uca1400.c - most of the _0900_ code was moved from ctype-uca.c to a new file ctype-uca0900.c Change details: - The original function add_alias_for_collation() added by the patch for "MDEV-20912 Add support for utf8mb4_0900_* collations in MariaDB Server" was removed from mysys/charset.c, as it had two problems: a. it forced deep initialization of the _uca1400_ collations when adding _0900_ aliases for them at the server startup (the main reported problem) b. the collation initialization code in add_alias_for_collation() was related more to collations than to memory management, so /strings should be a better place for it than /mysys. The code from add_alias_for_collation() was split into separate functions. Cyclic dependency was removed. `#include <my_sys.h>` was removed from /strings/ctype-uca.c. Collations are now added using a callback function MY_CHARSET_LOADER::add_collation, like it is done for user collations defined in Index.xml. The code in /mysys sets MY_CHARSET_LOADER::add_collation to add_compiled_collation(). - The function compare_collations() was removed. 
A new virtual function was added into my_collation_handler_st instead: my_bool (*eq_collation)(CHARSET_INFO *self, CHARSET_INFO *other); because it is the collation handler who knows how to detect equal collations by comparing only some of CHARSET_INFO members without their deep initialization. Three implementations were added: - my_ci_eq_collation_uca() for UCA collations, it compares _0900_ collations as equal to their corresponding _uca1400_ collations. - my_ci_eq_collation_utf8mb4_bin(), it compares utf8mb4_nopad_bin and utf8mb4_0900_bin as equal. - my_ci_eq_collation_generic() - the default implementation, which compares all collations as not equal. A C++ wrapper CHARSET_INFO::eq_collations() was added. The code in /sql was changed to use the wrapper instead of the former calls for the removed function compare_collations(). - A part of add_alias_for_collation() was moved into a new function my_ci_alloc(). It allocates memory for a new charset_info_st instance together with the collation name and the comment using a single MY_CHARSET_LOADER::once_alloc call, which points to my_once_alloc() in the server. - A part of add_alias_for_collation() was moved into a new function my_ci_make_comment_for_alias(). It makes an "Alias for xxx" string, e.g. "Alias for utf8mb4_uca1400_swedish_ai_ci" in case of utf8mb4_sv_0900_ai_ci. - A part of the code in create_tailoring() was moved to a new function my_uca1400_collation_get_initialized_shared_uca(), to reuse the code between _uca1400_ and _0900_ collations. - A new function my_collation_id_is_mysql_uca0900() was added in addition to my_collation_id_is_mysql_uca1400(). 
- Functions to build collation names were added: my_uca0900_collation_build_name() my_uca1400_collation_build_name() - A shared function was added: my_bool my_uca1400_collation_alloc_and_init(MY_CHARSET_LOADER *loader, LEX_CSTRING name, LEX_CSTRING comment, const uca_collation_def_param_t *param, uint id) It's reused to add _uca1400_ and _0900_ collations, with basic initialization (without deep initialization). - The function add_compiled_collation() changed its return type from void to int, to make it compatible with MY_CHARSET_LOADER::add_collation. - Functions mysql_uca0900_collation_definition_add(), mysql_uca0900_utf8mb4_collation_definitions_add(), mysql_utf8mb4_0900_bin_add() were added into ctype-uca0900.c. They get MY_CHARSET_LOADER as a parameter. - Functions my_uca1400_collation_definition_add(), my_uca1400_collation_definitions_add() were moved from charset-def.c to strings/ctype-uca1400.c. The latter now accepts MY_CHARSET_LOADER as the first parameter instead of initializing a MY_CHARSET_LOADER inside. - init_compiled_charsets() now initializes a MY_CHARSET_LOADER variable and passes it to all functions adding collations: - mysql_utf8mb4_0900_collation_definitions_add() - mysql_uca0900_utf8mb4_collation_definitions_add() - mysql_utf8mb4_0900_bin_add() - A new structure was added into ctype-uca.h: typedef struct uca_collation_def_param { my_cs_encoding_t cs_id; uint tailoring_id; uint nopad_flags; uint level_flags; } uca_collation_def_param_t; It simplifies reusing the code for _uca1400_ and _0900_ collations. - The definition of MY_UCA1400_COLLATION_DEFINITION was moved from ctype-uca.c to ctype-uca1400.h, to reuse the code for _uca1400_ and _0900_ collations. - The definitions of "MY_UCA_INFO my_uca_v1400" and "MY_UCA_INFO my_uca1400_info_tailored[][]" were moved from ctype-uca.c to ctype-uca1400.c. 
- The definitions/declarations of: - mysql_0900_collation_start, - struct mysql_0900_to_mariadb_1400_mapping - mysql_0900_to_mariadb_1400_mapping - mysql_utf8mb4_0900_collation_definitions_add() were moved from ctype-uca.c to ctype-uca0900.c - Functions my_uca1400_make_builtin_collation_id() my_uca1400_collation_definition_init() my_uca1400_collation_id_uca400_compat() my_ci_get_collation_name_uca1400_context() were moved from ctype-uca.c to ctype-uca1400.c and ctype-uca1400.h - A part of my_uca1400_collation_definition_init() was moved into my_uca0520_builtin_collation_by_id(), to make functions smaller.	2025-04-17 10:01:53 +04:00
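The deferred "deep initialization" described in MDEV-36213 above can be illustrated with a small, self-contained sketch. Everything here is hypothetical (the `demo_` names, the 256-byte weight table, the call counter); it is not the server's code and only shows the general pattern: registration does cheap basic setup, and the expensive tables are built on first use.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical collation record: only the name is set at registration. */
typedef struct demo_collation
{
  const char *name;
  int deep_initialized;       /* 0 until the collation is first used */
  unsigned char *weights;     /* expensive table, allocated lazily   */
} demo_collation_t;

/* Counts how many times the expensive step actually ran. */
static int demo_deep_init_calls= 0;

/* Basic initialization: cheap, enough to list the collation by name. */
static void demo_collation_register(demo_collation_t *c, const char *name)
{
  c->name= name;
  c->deep_initialized= 0;
  c->weights= NULL;
}

/* Deep initialization happens here, on first use only. */
static const unsigned char *demo_collation_use(demo_collation_t *c)
{
  if (!c->deep_initialized)
  {
    c->weights= (unsigned char *) malloc(256);
    if (!c->weights)
      return NULL;
    for (int i= 0; i < 256; i++)
      c->weights[i]= (unsigned char) toupper(i);
    c->deep_initialized= 1;
    demo_deep_init_calls++;
  }
  return c->weights;
}

/* Releases the lazily allocated table, if any. */
static void demo_collation_free(demo_collation_t *c)
{
  free(c->weights);
  c->weights= NULL;
  c->deep_initialized= 0;
}
```

With this shape, registering many aliases costs almost nothing, and memory is spent only on collations that a query actually touches.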
Marko Mäkelä	bb1d88b6dc	Merge 11.4 into 11.8	2025-04-02 14:07:01 +03:00
Marko Mäkelä	f5bd250f5b	Merge 10.11 into 11.4	2025-03-28 13:55:21 +02:00
Marko Mäkelä	ab0f2a00b6	Merge 10.6 into 10.11	2025-03-27 08:01:47 +02:00
Vasilii Lakhin	5f7c2a617f	Fix typos in C comments in miscellaneous files	2025-03-24 13:36:28 +11:00
Julius Goryavsky	e3d7d5ca26	Merge branch '10.5' into '10.6'	2025-02-27 04:02:33 +01:00
Sergei Golubchik	ba01c2aaf0	Merge branch '11.4' into 11.7 * rpl.rpl_system_versioning_partitions updated for MDEV-32188 * innodb.row_size_error_log_warnings_3 changed error for MDEV-33658 (checks are done in a different order)	2025-02-06 16:46:36 +01:00
Alexander Barkov	583b39811c	MDEV-35620 UBSAN: runtime error: applying zero offset to null pointer in _ma_unique_hash, skip_trailing_space, my_hash_sort_mb_nopad_bin and my_strnncollsp_utf8mb4_bin UBSAN detected the nullptr-with-offset in a few places when handling empty blobs. Fix: - Adding DBUG_ASSERT(source_string) into all hash_sort() implementations to catch this problem in non-UBSAN debug builds. - Fixing mi_unique_hash(), mi_unique_comp(), _ma_unique_hash(), _ma_unique_comp() to replace a NULL pointer with an empty string pointer. Note, we should also add DBUG_ASSERT(source_string != NULL) into all implementations of strnncoll*(). But I'm afraid the patch is going to be too long and too dangerous for 10.5.	2025-02-03 16:45:02 +04:00
Alexander Barkov	c1559f261f	MDEV-35688 UBSAN: SUMMARY: UndefinedBehaviorSanitizer: nullptr-with-offset in my_casedn_utf8mb3 The functions MY_CHARSET_HANDLER::caseup() and MY_CHARSET_HANDLER::casedn() in their virtual implementations do "const char *end= src + srclen" at the very beginning. Therefore src cannot be NULL to avoid "UBSAN: SUMMARY: UndefinedBehaviorSanitizer: nullptr-with-offset". Adding DBUG_ASSERT(src != NULL) into all virtual implementations, to catch this problem in regular Debug builds (without UBSAN). Fixing Master_info_index::get_master_info() to check connection_name->str. If it is NULL then passing empty_clex_str into IdentBufferCasedn instead of connection_name.	2025-01-20 20:01:48 +04:00
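The two UBSAN commits above (MDEV-35620 and MDEV-35688) deal with the same pattern: a callee computes `src + srclen` up front, so the caller must never hand it a NULL pointer, even with a zero length. A minimal sketch of both halves follows; the function names are hypothetical and a plain `assert()` stands in for the server's DBUG_ASSERT.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Callee: computes an end pointer first, so src must not be NULL. */
static size_t demo_count_upper(const char *src, size_t srclen)
{
  assert(src != NULL);                 /* stand-in for DBUG_ASSERT */
  const char *end= src + srclen;
  size_t n= 0;
  for (; src < end; src++)
    if (isupper((unsigned char) *src))
      n++;
  return n;
}

/* Caller: replaces a NULL pointer with an empty string before the call. */
static size_t demo_count_upper_nullsafe(const char *src, size_t srclen)
{
  if (src == NULL)
  {
    src= "";
    srclen= 0;
  }
  return demo_count_upper(src, srclen);
}
```

The substitution keeps the callee's pointer arithmetic well defined without adding a branch to every implementation.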
Alexander Barkov	0d17c540a5	MDEV-27277 Add a warning when max_sort_length is reached Step#1: fixing the return type of strnxfrm() from size_t to this structure: typedef struct { size_t m_output_length; size_t m_source_length_used; uint m_warnings; } my_strnxfrm_ret_t;	2024-10-22 21:42:53 +07:00
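To show why a structured return value helps with MDEV-27277, here is a hedged sketch. The three field names come from the commit message above (with `unsigned int` standing in for the server's `uint`); the function `demo_strnxfrm()`, the flag `DEMO_STRNXFRM_TRUNCATED` and the uppercase "weights" are assumptions made purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Field names follow the commit message; the rest is hypothetical. */
typedef struct
{
  size_t m_output_length;        /* bytes written to dst            */
  size_t m_source_length_used;   /* bytes of src actually consumed  */
  unsigned int m_warnings;       /* bit flags                       */
} my_strnxfrm_ret_t;

#define DEMO_STRNXFRM_TRUNCATED 1U   /* hypothetical warning flag */

/* Toy transform: the "weight" of a byte is its upper-case form. */
static my_strnxfrm_ret_t
demo_strnxfrm(unsigned char *dst, size_t dstlen,
              const unsigned char *src, size_t srclen)
{
  my_strnxfrm_ret_t ret= {0, 0, 0};
  size_t n= srclen < dstlen ? srclen : dstlen;
  for (size_t i= 0; i < n; i++)
    dst[i]= (unsigned char) toupper(src[i]);
  ret.m_output_length= n;
  ret.m_source_length_used= n;
  if (n < srclen)
    ret.m_warnings|= DEMO_STRNXFRM_TRUNCATED;  /* part of src was ignored */
  return ret;
}
```

A caller that previously received only a `size_t` can now tell whether the sort key covers the whole source string and raise a warning when it does not.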
Marko Mäkelä	43465352b9	Merge 11.4 into 11.6	2024-10-03 16:09:56 +03:00
Marko Mäkelä	b53b81e937	Merge 11.2 into 11.4	2024-10-03 14:32:14 +03:00
Marko Mäkelä	63913ce5af	Merge 10.6 into 10.11	2024-10-03 10:55:08 +03:00
Marko Mäkelä	7e0afb1c73	Merge 10.5 into 10.6	2024-10-03 09:31:39 +03:00
Alexander Barkov	841dc07ee1	MDEV-28386 UBSAN: runtime error: negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_strntoull_8bit on SELECT ... OCT The code in my_strntoull_8bit() and my_strntoull_mb2_or_mb4() could hit undefined behavior by negating LONGLONG_MIN. Fixing the code to avoid this.	2024-09-20 11:01:31 +04:00
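The undefined behavior in MDEV-28386 comes from negating the most negative signed value. A small sketch of the usual workaround, doing the negation in unsigned arithmetic, is shown below. The helper names are hypothetical and this is not the patch itself; it uses the standard `LLONG_MIN`/`LLONG_MAX` macros rather than the server's LONGLONG_MIN.

```c
#include <assert.h>
#include <limits.h>

/* Magnitude of v, computed without ever negating a signed value.
   Converting a negative value to unsigned is well defined (modulo 2^64),
   and unsigned subtraction cannot overflow. */
static unsigned long long demo_magnitude(long long v)
{
  return v < 0 ? 0ULL - (unsigned long long) v : (unsigned long long) v;
}

/* Applies a sign to a magnitude, saturating at the type limits. */
static long long demo_apply_sign(unsigned long long mag, int negative)
{
  if (!negative)
    return mag > (unsigned long long) LLONG_MAX ? LLONG_MAX : (long long) mag;
  if (mag >= (unsigned long long) LLONG_MAX + 1ULL)
    return LLONG_MIN;
  return -(long long) mag;   /* safe: mag <= LLONG_MAX here */
}
```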
Alexander Barkov	fd247cc21f	MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() This patch also fixes: MDEV-33050 Build-in schemas like oracle_schema are accent insensitive MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0 MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0 MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0 MDEV-33088 Cannot create triggers in the database `MYSQL` MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0 MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0 MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0 MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0 - Removing the virtual function strcasecmp() from MY_COLLATION_HANDLER - Adding a wrapper function CHARSET_INFO::streq(), to compare two strings for equality. For now it calls strnncoll() internally. In the future it will turn into a virtual function. - Adding new accent sensitive case insensitive collations: - utf8mb4_general1400_as_ci - utf8mb3_general1400_as_ci They implement accent sensitive case insensitive comparison. The weight of a character is equal to the code point of its upper case variant. These collations use Unicode-14.0.0 casefolding data. The result of my_charset_utf8mb3_general1400_as_ci.strcoll() is very close to the former my_charset_utf8mb3_general_ci.strcasecmp() There is only a difference in a couple dozen rare characters, because of: - the switch from "tolower" to "toupper" comparison, to make utf8mb3_general1400_as_ci closer to utf8mb3_general_ci - the switch from Unicode-3.0.0 to Unicode-14.0.0 This difference should be tolerable. See the list of affected characters in the MDEV description. Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters! 
Unlike utf8mb4_general_ci, it does not treat all non-BMP characters as equal. - Adding classes representing names of the file based database objects: Lex_ident_db Lex_ident_table Lex_ident_trigger Their comparison collation depends on the underlying file system case sensitivity and on --lower-case-table-names and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci. - Adding classes representing names of other database objects, whose names have case insensitive comparison style, using my_charset_utf8mb3_general1400_as_ci: Lex_ident_column Lex_ident_sys_var Lex_ident_user_var Lex_ident_sp_var Lex_ident_ps Lex_ident_i_s_table Lex_ident_window Lex_ident_func Lex_ident_partition Lex_ident_with_element Lex_ident_rpl_filter Lex_ident_master_info Lex_ident_host Lex_ident_locale Lex_ident_plugin Lex_ident_engine Lex_ident_server Lex_ident_savepoint Lex_ident_charset engine_option_value::Name - All the mentioned Lex_ident_xxx classes implement a method streq(): if (ident1.streq(ident2)) do_equal(); This method works as a wrapper for CHARSET_INFO::streq(). - Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name" in class members and in function/method parameters. - Replacing all calls like system_charset_info->coll->strcasecmp(ident1, ident2) with ident1.streq(ident2) - Taking advantage of the C++11 user defined literal operator for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h) data types. Use example: const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column; is now a shorter version of: const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});	2024-04-18 15:22:10 +04:00
Alexander Barkov	929c2e06aa	MDEV-31531 Remove my_casedn_str() and my_caseup_str() Under terms of MDEV 27490 we'll add support for non-BMP identifiers and upgrade casefolding information to Unicode version 14.0.0. In Unicode-14.0.0 conversion to lower and upper cases can increase octet length of the string, so conversion won't be possible in-place any more. This patch removes virtual functions performing in-place casefolding: - my_charset_handler_st::casedn_str() - my_charset_handler_st::caseup_str() and fixes the code to use the non-inplace functions instead: - my_charset_handler_st::casedn() - my_charset_handler_st::caseup()	2024-02-28 22:20:29 +04:00
Andrew Hutchings	f552febe43	MDEV-30879 Add support for up to BASE 62 to CONV() BASE 62 uses 0-9, A-Z and then a-z to give the numbers 0-61. This patch increases the range of the string functions to cover this. Based on ideas and tests in PR #2589, but re-written into the charset functions. Includes fix by Sergei, UBSAN complained: ctype-simple.c:683:38: runtime error: negation of -9223372036854775808 cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself Co-authored-by: Weijun Huang <huangweijun1001@gmail.com> Co-authored-by: Sergei Golubchik <serg@mariadb.org>	2024-01-17 15:24:26 +00:00
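The digit ordering stated in MDEV-30879 above (0-9, then A-Z, then a-z, giving 0-61) can be sketched as follows. The function names are hypothetical, and treating lower-case letters as 10-35 for bases up to 36 is an assumption made here for illustration; the code also assumes an ASCII-compatible execution character set.

```c
#include <assert.h>
#include <limits.h>

/* Value of digit c in the given base, or -1 if c is not a valid digit.
   Assumption: for base <= 36 letters are case insensitive; for larger
   bases a-z continue after A-Z, i.e. they map to 36..61. */
static int demo_digit_value(char c, int base)
{
  int v;
  if (c >= '0' && c <= '9')
    v= c - '0';
  else if (c >= 'A' && c <= 'Z')
    v= c - 'A' + 10;
  else if (c >= 'a' && c <= 'z')
    v= base <= 36 ? c - 'a' + 10 : c - 'a' + 36;
  else
    return -1;
  return v < base ? v : -1;
}

/* Parses leading digits of s in the given base (2..62),
   saturating at ULLONG_MAX on overflow. */
static unsigned long long demo_parse(const char *s, int base)
{
  unsigned long long result= 0;
  for (; *s; s++)
  {
    int d= demo_digit_value(*s, base);
    if (d < 0)
      break;
    if (result > (ULLONG_MAX - (unsigned long long) d) /
                 (unsigned long long) base)
      return ULLONG_MAX;
    result= result * (unsigned long long) base + (unsigned long long) d;
  }
  return result;
}
```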
Marko Mäkelä	656c2e18b1	Merge 10.10 into 10.11	2023-04-14 13:08:28 +03:00
Marko Mäkelä	a009280e60	Merge 10.9 into 10.10	2023-04-14 12:24:14 +03:00
Marko Mäkelä	5bada1246d	Merge 10.5 into 10.6	2023-04-11 16:15:19 +03:00
Alexander Barkov	62e137d4d7	Merge remote-tracking branch 'origin/10.4' into 10.5	2023-04-05 16:16:19 +04:00
Alexander Barkov	8020b1bd73	MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations - Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars() and a flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES. The flag defines if strnncollsp_nchars() should emulate trailing spaces which were possibly trimmed earlier (e.g. in InnoDB CHAR compression). This is important for NOPAD collations. For example, with this input: - str1= 'a ' (Latin letter a followed by one space) - str2= 'a  ' (Latin letter a followed by two spaces) - nchars= 3 if the flag is given, strnncollsp_nchars() will virtually restore one trailing space to str1 up to nchars (3) characters and compare two strings as equal: - str1= 'a  ' (one extra trailing space emulated) - str2= 'a  ' (as is) If the flag is not given, strnncollsp_nchars() does not add trailing virtual spaces, so in case of a NOPAD collation, str1 will be compared as less than str2 because it is shorter. - Field_string::cmp_prefix() now passes the new flag. Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do not pass the new flag. - The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc (which handles the CHAR data type) now also passes the new flag. - Fixing UCA collations to respect the new flag. Other collations are possibly also affected, however I had no success in making an SQL script demonstrating the problem. Other collations will be extended to respect this flag in a separate patch later. - Changing the meaning of the last parameter of Field::cmp_prefix() from "number of bytes" (internal length) to "number of characters" (user visible length). The code calling cmp_prefix() from handler.cc was wrong. After this change, the call in handler.cc became correct. The code calling cmp_prefix() from key_rec_cmp() in key.cc was adjusted according to this change. - Old strnncollsp_nchar() related tests in unittest/strings/strings-t.c now pass the new flag. 
A few new tests also were added, without the flag.	2023-04-04 12:30:50 +04:00
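The effect of the flag described in MDEV-30034 above can be reproduced with a toy comparison over single-byte strings. This is only a sketch under stated assumptions: `demo_cmp_nchars()` is a hypothetical stand-in for a NOPAD binary collation, one byte is one character, and the `emulate_trimmed_spaces` argument plays the role of MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES.

```c
#include <assert.h>
#include <stddef.h>

/* Compares at most nchars characters of a and b (one byte = one character).
   With emulate_trimmed_spaces set, a string shorter than nchars is treated
   as if it were padded with spaces up to nchars. Without it, NOPAD rules
   apply: a string that ends first compares as less. */
static int demo_cmp_nchars(const char *a, size_t alen,
                           const char *b, size_t blen,
                           size_t nchars, int emulate_trimmed_spaces)
{
  for (size_t i= 0; i < nchars; i++)
  {
    int has_a= i < alen;
    int has_b= i < blen;
    if (!has_a && !has_b)
      return 0;                          /* both exhausted: equal */
    if (!emulate_trimmed_spaces && has_a != has_b)
      return has_a ? 1 : -1;             /* NOPAD: shorter is less */
    unsigned char ca= has_a ? (unsigned char) a[i] : (unsigned char) ' ';
    unsigned char cb= has_b ? (unsigned char) b[i] : (unsigned char) ' ';
    if (ca != cb)
      return ca < cb ? -1 : 1;
  }
  return 0;
}
```

With the input from the commit message (str1 is 'a' plus one space, str2 is 'a' plus two spaces, nchars is 3), the call returns 0 when the flag is set and a negative value when it is not.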
Marko Mäkelä	95d51369c9	Merge 10.10 into 10.11	2023-02-28 10:52:42 +02:00
Alexander Barkov	7f6b648d7d	MDEV-30661 UPPER() returns an empty string for U+0251 in uca1400 collations for utf8 String length growth during upper/lower conversion in Unicode collations depends only on the underlying MY_UNICASE_INFO used in the collation. Maintaining separate members CHARSET_INFO::caseup_multiply and CHARSET_INFO::casedn_multiply duplicated this information and caused bugs like this (when MY_UNICASE_INFO and case??_multiply went out of sync because of incomplete CHARSET_INFO initialization). Fix: Changing CHARSET_INFO::caseup_multiply and CHARSET_INFO::casedn_multiply from members to virtual functions. The virtual functions in Unicode collations calculate case conversion growth factors from the MY_UNICASE_INFO. This guarantees that the growth factors are always in sync with the MY_UNICASE_INFO.	2023-02-17 17:33:27 +04:00
NTH19	80fbd0ee94	Remove redundant variable (#2237 )	2022-08-27 10:19:16 +01:00
Alexander Barkov	133446828c	MDEV-27009 Add UCA-14.0.0 collations - Added one neutral and 22 tailored (language specific) collations based on Unicode Collation Algorithm version 14.0.0. Collations were added for Unicode character sets utf8mb3, utf8mb4, ucs2, utf16, utf32. Every tailoring was added with four accent and case sensitivity flag combinations, e.g: * utf8mb4_uca1400_swedish_as_cs * utf8mb4_uca1400_swedish_as_ci * utf8mb4_uca1400_swedish_ai_cs * utf8mb4_uca1400_swedish_ai_ci and their _nopad_ variants: * utf8mb4_uca1400_swedish_nopad_as_cs * utf8mb4_uca1400_swedish_nopad_as_ci * utf8mb4_uca1400_swedish_nopad_ai_cs * utf8mb4_uca1400_swedish_nopad_ai_ci - Introducing the concept of contextually typed named collations: CREATE DATABASE db1 CHARACTER SET utf8mb4; CREATE TABLE db1.t1 (a CHAR(10) COLLATE uca1400_as_ci); The idea is that there is no need to specify the character set prefix in the new collation names. It's enough to type just the suffix "uca1400_as_ci". The character set is taken from the context. In the above example script the context character set is utf8mb4. So the CREATE TABLE will make a column with the collation utf8mb4_uca1400_as_ci. Short collation names can be used in any parts of the SQL syntax where the COLLATE clause is understood. 
- New collations are displayed only one time (without character set combinations) by these statements: SELECT * FROM INFORMATION_SCHEMA.COLLATIONS; SHOW COLLATION; For example, all these collations: - utf8mb3_uca1400_swedish_as_ci - utf8mb4_uca1400_swedish_as_ci - ucs2_uca1400_swedish_as_ci - utf16_uca1400_swedish_as_ci - utf32_uca1400_swedish_as_ci have just one entry in INFORMATION_SCHEMA.COLLATIONS and SHOW COLLATION, with COLLATION_NAME equal to "uca1400_swedish_as_ci", which is the suffix without the character set name: SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLLATIONS WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci'; +-----------------------+ | COLLATION_NAME | +-----------------------+ | uca1400_swedish_as_ci | +-----------------------+ Note, the behaviour of old collations did not change. Non-unicode collations (e.g. latin1_swedish_ci) and old UCA-4.0.0 collations (e.g. utf8mb4_unicode_ci) are still displayed with the character set prefix, as before. - The structure of the table INFORMATION_SCHEMA.COLLATIONS was changed. The NOT NULL constraint was removed from these columns: - CHARACTER_SET_NAME - ID - IS_DEFAULT and from the corresponding columns in SHOW COLLATION. For example: SELECT COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT FROM INFORMATION_SCHEMA.COLLATIONS WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci'; +-----------------------+--------------------+------+------------+ | COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | +-----------------------+--------------------+------+------------+ | uca1400_swedish_as_ci | NULL | NULL | NULL | +-----------------------+--------------------+------+------------+ The NULL value in these columns now means that the collation is applicable to multiple character sets. The behaviour of old collations did not change. Make sure your client programs can handle NULL values in these columns. 
- The structure of the table INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY was changed. Three new NOT NULL columns were added: - FULL_COLLATION_NAME - ID - IS_DEFAULT New collations have multiple entries in COLLATION_CHARACTER_SET_APPLICABILITY. The column COLLATION_NAME contains the collation name without the character set prefix. The column FULL_COLLATION_NAME contains the collation name with the character set prefix. Old collations have full collation name in both FULL_COLLATION_NAME and COLLATION_NAME. SELECT COLLATION_NAME, FULL_COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY WHERE FULL_COLLATION_NAME RLIKE '^(utf8mb4|latin1).*swedish.*ci$'; +-----------------------------+-------------------------------------+--------------------+------+------------+ | COLLATION_NAME | FULL_COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | +-----------------------------+-------------------------------------+--------------------+------+------------+ | latin1_swedish_ci | latin1_swedish_ci | latin1 | 8 | Yes | | latin1_swedish_nopad_ci | latin1_swedish_nopad_ci | latin1 | 1032 | | | utf8mb4_swedish_ci | utf8mb4_swedish_ci | utf8mb4 | 232 | | | uca1400_swedish_ai_ci | utf8mb4_uca1400_swedish_ai_ci | utf8mb4 | 2368 | | | uca1400_swedish_as_ci | utf8mb4_uca1400_swedish_as_ci | utf8mb4 | 2370 | | | uca1400_swedish_nopad_ai_ci | utf8mb4_uca1400_swedish_nopad_ai_ci | utf8mb4 | 2372 | | | uca1400_swedish_nopad_as_ci | utf8mb4_uca1400_swedish_nopad_as_ci | utf8mb4 | 2374 | | +-----------------------------+-------------------------------------+--------------------+------+------------+ - Other INFORMATION_SCHEMA queries: SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS; SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.PARAMETERS; SELECT TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES; SELECT DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA; SELECT 
COLLATION_NAME FROM INFORMATION_SCHEMA.ROUTINES; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.EVENTS; SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.EVENTS; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.ROUTINES; SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.ROUTINES; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.TRIGGERS; SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.TRIGGERS; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.VIEWS; display full collation names, including the character set prefix, for all collations, including new collations. Corresponding SHOW commands also display full collation names in collation related columns: SHOW CREATE TABLE t1; SHOW CREATE DATABASE db1; SHOW TABLE STATUS; SHOW CREATE FUNCTION f1; SHOW CREATE PROCEDURE p1; SHOW CREATE EVENT ev1; SHOW CREATE TRIGGER tr1; SHOW CREATE VIEW; These INFORMATION_SCHEMA queries and SHOW statements may change in the future, to display short collation names.	2022-08-10 15:04:24 +02:00
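The "contextually typed" collation names introduced by MDEV-27009 above amount to prefixing a short name with the character set taken from the context. A minimal sketch of that resolution step follows. The function `demo_resolve_collation()` is hypothetical, and recognising short names by the literal prefix "uca1400_" is a simplification assumed here; the server's actual lookup is not shown in the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the full collation name into dst.
   Short names starting with "uca1400_" get "<charset>_" prepended;
   any other name is copied unchanged.
   Returns 0 on success, -1 if dst is too small. */
static int demo_resolve_collation(char *dst, size_t dstlen,
                                  const char *context_charset,
                                  const char *collation)
{
  static const char short_prefix[]= "uca1400_";
  int n;
  if (strncmp(collation, short_prefix, sizeof(short_prefix) - 1) == 0)
    n= snprintf(dst, dstlen, "%s_%s", context_charset, collation);
  else
    n= snprintf(dst, dstlen, "%s", collation);
  return (n >= 0 && (size_t) n < dstlen) ? 0 : -1;
}
```

With the context character set utf8mb4, the short name uca1400_as_ci resolves to utf8mb4_uca1400_as_ci, matching the CREATE TABLE example in the commit message, while an old-style name such as latin1_swedish_ci passes through unchanged.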
Oleksandr Byelkin	f5c5f8e41e	Merge branch '10.5' into 10.6	2022-02-03 17:01:31 +01:00
Oleksandr Byelkin	cf63eecef4	Merge branch '10.4' into 10.5	2022-02-01 20:33:04 +01:00
Alexander Barkov	b915f79e4e	MDEV-25904 New collation functions to compare InnoDB style trimmed NO PAD strings	2022-01-21 12:16:07 +04:00
Alexander Barkov	0d68b0a2d6	MDEV-26669 Add MY_COLLATION_HANDLER functions min_str() and max_str()	2021-09-27 17:10:22 +04:00
Marko Mäkelä	80ed136e6d	Merge 10.4 into 10.5	2021-04-21 09:01:01 +03:00
Monty	031f11717d	Fix all warnings given by UBSAN The easiest way to compile and test the server with UBSAN is to run: ./BUILD/compile-pentium64-ubsan and then run mysql-test-run. After this commit, one should be able to run this without any UBSAN warnings. There are still a few compiler warnings that should be fixed at some point, but these do not expose any real bugs. The 'special' cases where we disable, suppress or circumvent UBSAN are: - ref10 source (as here we intentionally do some shifts that UBSAN complains about). - x86 version of optimized int#korr() methods. UBSAN does not like unaligned memory access of integers. Fixed by using byte_order_generic.h when compiling with UBSAN - We use smaller thread stack with ASAN and UBSAN, which forced me to disable a few tests that print the thread stack size. - Verifying class types does not work for shared libraries. I added suppression in mysql-test-run.pl for this case. - Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is safe to have overflows (two cases, in item_func.cc). Things fixed: - Don't left shift signed values (byte_order_generic.h, mysqltest.c, item_sum.cc and many more) - Don't assign non-existent values to enum variables. - Ensure that bool and enum values are properly initialized in constructors. This was needed as UBSAN checks that these types have correct values when one copies an object. (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...) - Ensure we do not call handler functions on unallocated objects or deleted objects. (events.cc, sql_acl.cc). - Fixed bugs in Item_sp::Item_sp() where we did not call constructor on Query_arena object. - Fixed several casts of objects to an incompatible class! (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc, sql_select.cc ...) - Ensure we do not do integer arithmetic that causes over or underflows. This includes also ++ and -- of integers. 
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...) - Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that value_type is initialized to this instead of to -1, which is not a valid enum value for json_value_types. - Ensure we do not call memcpy() when second argument could be null. - Fixed that Item_func_str::make_empty_result() creates an empty string instead of a null string (safer as it ensures we do not do arithmetic on null strings). Other things: - Changed struct st_position to an OBJECT and added an initialization function to it to ensure that we do not copy or use uninitialized members. The change to a class was also motivated by the fact that we used "struct st_position" and POSITION randomly through the code, which was confusing. - Notably big rewrite in sql_acl.cc to avoid using deleted objects. - Changed in sql_partition to use '^' instead of '-'. This is safe as the operator is either 0 or 0x8000000000000000ULL. - Added check for select_nr < INT_MAX in JOIN::build_explain() to avoid bug when get_select() could return NULL. - Reordered elements in POSITION for better alignment. - Changed sql_test.cc::print_plan() to use pointers instead of objects. - Fixed bug in find_set() where we could execute '1 << -1'. - Added variable have_sanitizer, used by mtr. (This variable was before only in 10.5 and up). It can now have one of two values: ASAN or UBSAN. - Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked it virtual. This was an effort to get UBSAN to work with loaded storage engines. I kept the change as the new place is better. - Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast in tabutil.cpp. - Added HAVE_REPLICATION around usage of rgi_slave, to get embedded server to compile with UBSAN. (Patch from Marko). - Added #ifdef for powerpc64 to avoid a bug in old gcc versions related to integer arithmetic. 
Changes that should not be needed but had to be done to suppress warnings from UBSAN: - Added static_cast<uint16_t> around shift to get rid of a LOT of compiler warnings when using UBSAN. - Had to change some '/' of 2 base integers to shift to get rid of some compile time warnings. Reviewed by: - Json changes: Alexey Botchkov - Charset changes in ctype-uca.c: Alexander Barkov - InnoDB changes & Embedded server: Marko Mäkelä - sql_acl.cc changes: Vicențiu Ciorbaru - build_explain() changes: Sergey Petrunia	2021-04-20 12:30:09 +03:00
Alexander Barkov	cfe5ee90c8	MDEV-22043 Special character leads to assertion in my_wc_to_printable_generic on 10.5.2 (debug) The code did not take into account that: - U+005C (backslash) can occupy more than mbminlen characters (e.g. in sjis) - Some character sets do not have a code for U+005C (e.g. swe7) Adding a new function my_wc_to_printable into MY_CHARSET_HANDLER to cover all special cases easier.	2020-05-09 16:01:30 +04:00
Marko Mäkelä	8b6cfda631	Merge 10.4 into 10.5	2020-02-07 08:51:20 +02:00
Monty	4d61f1247a	Fixed compiler warnings from gcc 7.4.1 - Fixed possible error in rocksdb/rdb_datadic.cc	2020-01-29 23:23:55 +02:00
Alexander Barkov	f1e13fdc8d	MDEV-21581 Helper functions and methods for CHARSET_INFO	2020-01-28 12:29:23 +04:00
Marko Mäkelä	5ab70e7f68	Merge 10.2 into 10.3	2019-12-27 15:14:48 +02:00
Marko Mäkelä	73985d8301	Merge 10.1 into 10.2	2019-12-23 07:14:51 +02:00
Alexander Barkov	3d98892232	Merge remote-tracking branch 'origin/5.5' into 10.1	2019-12-16 13:08:17 +04:00
Alexander Barkov	fc860d3fa3	MDEV-21065 UNIQUE constraint causes a query with string comparison to omit a row in the result set	2019-12-16 12:57:08 +04:00
Oleksandr Byelkin	55b2281a5d	Merge branch '10.2' into 10.3	2019-10-31 10:58:06 +01:00
Marko Mäkelä	19ceaf2928	Merge 10.1 into 10.2	2019-10-25 12:57:36 +03:00
Sergei Golubchik	790a74d22b	Merge branch 'github/5.5' into 10.1	2019-10-23 15:55:23 +02:00
Sergei Golubchik	719ac0ad4a	crash in string-to-int conversion. Using specially crafted strings one could overflow the `shift` variable and cause a crash by dereferencing d10[-2147483648] (on a sufficiently old gcc). This is a correct fix and a test case for Bug #29723340: MYSQL SERVER CRASH AFTER SQL QUERY WITH DATA ?AST	2019-10-19 11:48:38 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00


230 Commits