ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2025-12-08 06:09:58 +00:00

Author	SHA1	Message	Date
Timothy Flynn	1869399fd1	AK: Specialize Optional for Utf16String and Utf16FlyString. We added this for String some time ago, so let's give Utf16String the same optimization. Note that Utf16String was already handling its data member potentially being null as of `5af99f4dd0`.	2025-08-19 06:24:09 -04:00
Timothy Flynn	298ec6a12a	AK: Ensure StringBuilder encodes U+10000 as 2 UTF-16 code units	2025-08-07 02:05:50 +02:00
Timothy Flynn	1b611fba67	AK: Ensure Utf16FlyString is hash-compatible with Utf16View/Utf16String	2025-08-07 02:05:50 +02:00
Timothy Flynn	2dc0a3b3ce	AK: Add trim methods to Utf16String that skip allocation when not needed. If the string does not begin with any of the provided code units, we do not need to create a new string.	2025-08-05 15:13:36 +02:00
Timothy Flynn	0bf565b97f	AK: Allow comparing UTF-16 strings to UTF-8 strings. Previously, you could compare a Utf16View to a StringView, but the result was only valid if the StringView was ASCII. When porting code to UTF-16, it will be handy to have a code point-aware implementation for non-ASCII StringViews.	2025-08-05 07:07:15 -04:00
Timothy Flynn	13ed6aba71	AK+LibIPC: Implement an encoder/decoder for UTF-16 strings	2025-08-02 10:10:14 -07:00
Timothy Flynn	a740bfd8ff	AK+LibUnicode: Implement Unicode-aware UTF-16 case transformations	2025-07-25 18:16:22 +02:00
Timothy Flynn	df77ae1920	AK: Implement creating a UTF-16 string from a repeated code point	2025-07-25 18:16:22 +02:00
Timothy Flynn	f53389bab1	AK: Add a couple of Utf16String factories: Utf16String::from_utf8_with_replacement_character and Utf16String::from_code_point	2025-07-24 19:00:20 +02:00
Timothy Flynn	2803d66d87	AK: Support UTF-16 string formatting. The underlying storage used during string formatting is StringBuilder. To support UTF-16 strings, this patch allows callers to specify a mode during StringBuilder construction. The default mode is UTF-8, for which StringBuilder remains unchanged. In UTF-16 mode, we treat the StringBuilder's internal ByteBuffer as a series of u16 code units. Appending a single character will append 2 bytes for that character (cast to a char16_t). Appending a StringView will transcode the string to UTF-16. Utf16String also gains the same memory optimization that we added for String, where we hand off the underlying buffer to Utf16String to avoid having to re-allocate. In the future, we may want to further optimize for ASCII strings. For example, we could defer committing to the u16-esque storage until we see a non-ASCII code point.	2025-07-18 12:45:38 -04:00
Timothy Flynn	fe676585f5	AK: Add a UTF-16 string with optimized short- and ASCII-string storage. This is a strictly UTF-16 string with some optimizations for ASCII: (1) If created from a short UTF-8 or UTF-16 string that is also ASCII, then the string is stored in an inlined byte buffer. (2) If created with a long UTF-8 or UTF-16 string that is also ASCII, then the string is stored in an outlined char buffer. (3) If created with a short or long UTF-8 or UTF-16 string that is not ASCII, then the string is stored in an outlined char16 buffer. We do not store short non-ASCII text in the inlined buffer to avoid confusion with operations such as `length_in_code_units` and `code_unit_at`. For example, "😀" would be stored as 4 UTF-8 bytes in short string form. But we still want `length_in_code_units` to be 2, and `code_unit_at(0)` to be 0xD83D.	2025-07-18 12:45:38 -04:00
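
The storage rules described in fe676585f5 can be sketched in standard C++. This is only an illustration of the decision the commit message describes, not AK's implementation: `Storage`, `choose_storage`, `is_ascii`, and `hypothetical_inline_capacity` are names invented here, and the real inline capacity is not stated in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical labels for the three storage forms named in the commit message.
enum class Storage { InlineAscii, OutlinedAscii, OutlinedUtf16 };

// Assumption: the commit message only says "short"; this threshold is made up.
constexpr std::size_t hypothetical_inline_capacity = 7;

// True if every UTF-16 code unit is in the ASCII range.
bool is_ascii(std::u16string_view text)
{
    for (char16_t unit : text) {
        if (unit > 0x7F)
            return false;
    }
    return true;
}

// Pick a storage form: non-ASCII text always goes to the outlined char16
// buffer, ASCII text is split into inline vs. outlined by length.
Storage choose_storage(std::u16string_view text)
{
    if (!is_ascii(text))
        return Storage::OutlinedUtf16;
    if (text.size() <= hypothetical_inline_capacity)
        return Storage::InlineAscii;
    return Storage::OutlinedAscii;
}
```

Under these assumptions, `choose_storage(u"abc")` selects the inline form, while `choose_storage(u"\U0001F600")` selects the outlined char16 buffer. The emoji occupies two UTF-16 code units, the first of which is 0xD83D, which matches the `length_in_code_units` and `code_unit_at(0)` values the commit message expects.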

11 commits
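
To illustrate what a code point-aware comparison between UTF-16 and UTF-8 text (as in 0bf565b97f) might involve, here is a minimal sketch in standard C++. It does not use AK's Utf16View or StringView; `next_utf8`, `next_utf16`, and `equals_by_code_point` are hypothetical helpers, and the UTF-8 decoder assumes well-formed input.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Decode one code point from UTF-8 starting at `i`, advancing `i`.
// Assumes well-formed UTF-8; no validation is performed.
static char32_t next_utf8(std::string_view s, std::size_t& i)
{
    unsigned char lead = static_cast<unsigned char>(s[i++]);
    if (lead < 0x80)
        return lead;
    int extra = (lead >= 0xF0) ? 3 : (lead >= 0xE0) ? 2 : 1;
    char32_t code_point = lead & (0x3F >> extra);
    while (extra-- > 0 && i < s.size())
        code_point = (code_point << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    return code_point;
}

// Decode one code point from UTF-16 starting at `i`, advancing `i`.
// An unpaired surrogate is returned as-is.
static char32_t next_utf16(std::u16string_view s, std::size_t& i)
{
    char32_t unit = s[i++];
    bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
    if (is_high && i < s.size() && s[i] >= 0xDC00 && s[i] <= 0xDFFF) {
        char32_t low = s[i++];
        return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
    }
    return unit;
}

// Compare the two strings code point by code point rather than unit by unit.
bool equals_by_code_point(std::u16string_view utf16, std::string_view utf8)
{
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < utf16.size() && j < utf8.size()) {
        if (next_utf16(utf16, i) != next_utf8(utf8, j))
            return false;
    }
    return i == utf16.size() && j == utf8.size();
}
```

A unit-by-unit comparison would only agree for ASCII, since any non-ASCII code point takes a different number of code units in the two encodings. Decoding both sides to code points first sidesteps that mismatch.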