Files
php-src/ext/mbstring/tests/unicode_versions.phpt
Ayesh Karunaratne 3afb96184e ext/mbstring: Update to Unicode 16
Updates UCD to Unicode 16.0 (released 2024 Sept).

Previously: 0fdffc18, #7502, #14680

Unicode 16 adds several new character sets and case folding rules.
However, the existing ucgendat script can still parse them.

This also adds a couple test cases to make sure the new rules for
East Asian Wide characters and case folding work correctly. These
tests fail on Unicode 15.1 and older because those verisons do not
contain those rules.
2024-09-17 10:40:00 +09:00

42 lines
1.1 KiB
PHP

--TEST--
mbstring Unicode Data tests
--EXTENSIONS--
mbstring
--FILE--
<?php
echo "Char widths:\n";
print "ASCII (PHP): " . mb_strwidth('PHP', 'UTF-8') . "\n";
print "Vietnamese (Xin chào): " . mb_strwidth('Xin chào', 'UTF-8') . "\n";
print "Traditional Chinese (你好): " . mb_strwidth('你好', 'UTF-8') . "\n";
print "Sinhalese (අයේෂ්): " . mb_strwidth('අයේෂ්', 'UTF-8') . "\n";
print "Emoji (\u{1F418}): " . mb_strwidth("\u{1F418}", 'UTF-8') . "\n";
// New in Unicode 15.0, width=2
print "Emoji (\u{1F6DC}): " . mb_strwidth("\u{1F6DC}", 'UTF-8') . "\n";
// Changed in Unicode 16.0, U+2630...U+2637 are wide
print "Emoji (\u{2630}): " . mb_strwidth("\u{2630}", 'UTF-8') . "\n";
echo "Char case changes:\n";
print "Upper(\u{019b}) = \u{a7dc} : ";
var_dump(mb_strtoupper("\u{019b}", 'UTF-8') === "\u{a7dc}");
?>
--EXPECT--
Char widths:
ASCII (PHP): 3
Vietnamese (Xin chào): 8
Traditional Chinese (你好): 4
Sinhalese (අයේෂ්): 5
Emoji (🐘): 2
Emoji (🛜): 2
Emoji (☰): 2
Char case changes:
Upper(ƛ) = Ƛ : bool(true)