AK: Introduce the new String, replacement for DeprecatedString

DeprecatedString (formerly String) has been with us since the start,
and it has served us well. However, it has a number of shortcomings
that I'd like to address.

Some of these issues are hard if not impossible to solve incrementally
inside of DeprecatedString, so instead of doing that, let's build a new
String class and then incrementally move over to it instead.

Problems in DeprecatedString:

- It assumes string allocation never fails. This makes it impossible
  to use in allocation-sensitive contexts, and is the reason we had to
  ban DeprecatedString from the kernel entirely.
- The awkward null state. DeprecatedString can be null. It's different
  from the empty state, although null strings are considered empty.
  All code is immediately nicer when using Optional<DeprecatedString>
  but DeprecatedString came before Optional, which is how we ended up
  like this.
- The encoding of the underlying data is ambiguous. For the most part,
  we use it as if it's always UTF-8, but there have been cases where
  we pass around strings in other encodings (e.g. ISO-8859-1).
- operator[] and length() are used to iterate over DeprecatedString one
  byte at a time. This is done all over the codebase, and will *not*
  give the right results unless the string is all ASCII.

How we solve these issues in the new String:

- Functions that may allocate now return ErrorOr<String> so that ENOMEM
  errors can be passed to the caller.
- String has no null state. Use Optional<String> when needed.
- String is always UTF-8. This is validated when constructing a String.
  We may need to add a bypass for this in the future, for cases where
  you have a known-good string, but for now: validate all the things!
- There is no operator[] or length(). You can get the underlying data
  with bytes(), but for iterating over code points, you should be using
  a UTF-8 iterator.

Furthermore, it has two nifty new features:

- String implements a small string optimization (SSO) for strings that
  can fit entirely within a pointer. This means up to 3 bytes on 32-bit
  platforms, and 7 bytes on 64-bit platforms. Such small strings will
  not be heap-allocated.
- String can create substrings without making a deep copy of the
  substring. Instead, the superstring gets +1 refcount from the
  substring, and it acts like a view into the superstring. To make
  substrings like this, use the substring_with_shared_superstring() API.

One caveat:

- String does not guarantee that the underlying data is null-terminated
  like DeprecatedString does today. While this was nifty in a handful of
  places where we were calling C functions, it did stand in the way of
  shared-superstring substrings.

/*
 * Copyright (c) 2018-2022, Andreas Kling <awesomekling@gmail.com>
 * Copyright (c) 2020, Fei Wu <f.eiwu@yahoo.com>
 *
 * SPDX-License-Identifier: BSD-2-Clause
 */
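The commit message's warning that byte-wise iteration with operator[] and length() goes wrong for non-ASCII text can be illustrated with a small standalone sketch. It deliberately uses std::string_view instead of AK types so that it compiles on its own. The helper names count_code_points and check_example are hypothetical and are not part of AK, and the sketch assumes its input is already valid UTF-8 (it performs no validation).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of AK: counts code points in a string that is
// ASSUMED to be valid UTF-8. Every code point starts with exactly one byte
// that is not a continuation byte (continuation bytes look like 0b10xxxxxx).
inline std::size_t count_code_points(std::string_view utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Hypothetical self-check: "h\xC3\xA9llo" is "héllo" encoded as UTF-8.
// It occupies 6 bytes but contains only 5 code points, so a loop bounded by
// the byte length visits one more element than there are characters.
inline void check_example()
{
    std::string_view text = "h\xC3\xA9llo";
    assert(text.size() == 6);
    assert(count_code_points(text) == 5);
}
```

For pure ASCII input the two counts agree, which is why byte-wise loops appear to work until the first multi-byte character shows up.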
					
						
#include <AK/CharacterTypes.h>
#include <AK/MemMem.h>
#include <AK/Optional.h>
#include <AK/String.h>
#include <AK/StringBuilder.h>
#include <AK/StringUtils.h>
#include <AK/StringView.h>
#include <AK/Vector.h>

#ifdef KERNEL
#    include <Kernel/StdLib.h>
#else
#    include <AK/DeprecatedString.h>
#    include <AK/FloatingPointStringConversions.h>
#    include <string.h>
#endif

namespace AK {

namespace StringUtils {
					
						
bool matches(StringView str, StringView mask, CaseSensitivity case_sensitivity, Vector<MaskSpan>* match_spans)
{
    auto record_span = [&match_spans](size_t start, size_t length) {
        if (match_spans)
            match_spans->append({ start, length });
    };

    if (str.is_null() || mask.is_null())
        return str.is_null() && mask.is_null();

    if (mask == "*"sv) {
        record_span(0, str.length());
        return true;
    }

    char const* string_ptr = str.characters_without_null_termination();
    char const* string_start = str.characters_without_null_termination();
    char const* string_end = string_ptr + str.length();
    char const* mask_ptr = mask.characters_without_null_termination();
    char const* mask_end = mask_ptr + mask.length();

    while (string_ptr < string_end && mask_ptr < mask_end) {
        auto string_start_ptr = string_ptr;
        switch (*mask_ptr) {
        case '*':
            if (mask_ptr == mask_end - 1) {
                record_span(string_ptr - string_start, string_end - string_ptr);
                return true;
            }
            while (string_ptr < string_end && !matches({ string_ptr, static_cast<size_t>(string_end - string_ptr) }, { mask_ptr + 1, static_cast<size_t>(mask_end - mask_ptr - 1) }, case_sensitivity))
                ++string_ptr;
            record_span(string_start_ptr - string_start, string_ptr - string_start_ptr);
            --string_ptr;
            break;
        case '?':
            record_span(string_ptr - string_start, 1);
            break;
        case '\\':
            // If the backslash is the last character in the mask, treat it as an exact match;
            // otherwise, use it as an escape for the next character.
            if (mask_ptr + 1 < mask_end)
                ++mask_ptr;
            [[fallthrough]];
        default:
            auto p = *mask_ptr;
            auto ch = *string_ptr;
            if (case_sensitivity == CaseSensitivity::CaseSensitive ? p != ch : to_ascii_lowercase(p) != to_ascii_lowercase(ch))
                return false;
            break;
        }
        ++string_ptr;
        ++mask_ptr;
    }

    if (string_ptr == string_end) {
        // Allow trailing '*' characters in the mask to match nothing.
        while (mask_ptr != mask_end && *mask_ptr == '*') {
            record_span(string_ptr - string_start, 0);
            ++mask_ptr;
        }
    }

    return string_ptr == string_end && mask_ptr == mask_end;
}
					
						
template<typename T>
Optional<T> convert_to_int(StringView str, TrimWhitespace trim_whitespace)
{
    auto string = trim_whitespace == TrimWhitespace::Yes
        ? str.trim_whitespace()
        : str;
    if (string.is_empty())
        return {};

    T sign = 1;
    size_t i = 0;
    auto const characters = string.characters_without_null_termination();

    if (characters[0] == '-' || characters[0] == '+') {
        if (string.length() == 1)
            return {};
        i++;
        if (characters[0] == '-')
            sign = -1;
    }

    T value = 0;
    for (; i < string.length(); i++) {
        if (characters[i] < '0' || characters[i] > '9')
            return {};

        if (__builtin_mul_overflow(value, 10, &value))
            return {};

        if (__builtin_add_overflow(value, sign * (characters[i] - '0'), &value))
            return {};
    }
    return value;
}

template Optional<i8> convert_to_int(StringView str, TrimWhitespace);
template Optional<i16> convert_to_int(StringView str, TrimWhitespace);
template Optional<i32> convert_to_int(StringView str, TrimWhitespace);
template Optional<long> convert_to_int(StringView str, TrimWhitespace);
template Optional<long long> convert_to_int(StringView str, TrimWhitespace);
					
						
template<typename T>
Optional<T> convert_to_uint(StringView str, TrimWhitespace trim_whitespace)
{
    auto string = trim_whitespace == TrimWhitespace::Yes
        ? str.trim_whitespace()
        : str;
    if (string.is_empty())
        return {};

    T value = 0;
    auto const characters = string.characters_without_null_termination();

    for (size_t i = 0; i < string.length(); i++) {
        if (characters[i] < '0' || characters[i] > '9')
            return {};

        if (__builtin_mul_overflow(value, 10, &value))
            return {};

        if (__builtin_add_overflow(value, characters[i] - '0', &value))
            return {};
    }
    return value;
}

template Optional<u8> convert_to_uint(StringView str, TrimWhitespace);
template Optional<u16> convert_to_uint(StringView str, TrimWhitespace);
template Optional<u32> convert_to_uint(StringView str, TrimWhitespace);
template Optional<unsigned long> convert_to_uint(StringView str, TrimWhitespace);
template Optional<unsigned long long> convert_to_uint(StringView str, TrimWhitespace);
					
						
template<typename T>
Optional<T> convert_to_uint_from_hex(StringView str, TrimWhitespace trim_whitespace)
{
    auto string = trim_whitespace == TrimWhitespace::Yes
        ? str.trim_whitespace()
        : str;
    if (string.is_empty())
        return {};

    T value = 0;
    auto const count = string.length();
    const T upper_bound = NumericLimits<T>::max();

    for (size_t i = 0; i < count; i++) {
        char digit = string[i];
        u8 digit_val;
        if (value > (upper_bound >> 4))
            return {};

        if (digit >= '0' && digit <= '9') {
            digit_val = digit - '0';
        } else if (digit >= 'a' && digit <= 'f') {
            digit_val = 10 + (digit - 'a');
        } else if (digit >= 'A' && digit <= 'F') {
            digit_val = 10 + (digit - 'A');
        } else {
            return {};
        }

        value = (value << 4) + digit_val;
    }
    return value;
}

template Optional<u8> convert_to_uint_from_hex(StringView str, TrimWhitespace);
template Optional<u16> convert_to_uint_from_hex(StringView str, TrimWhitespace);
template Optional<u32> convert_to_uint_from_hex(StringView str, TrimWhitespace);
template Optional<u64> convert_to_uint_from_hex(StringView str, TrimWhitespace);
					
						
template<typename T>
Optional<T> convert_to_uint_from_octal(StringView str, TrimWhitespace trim_whitespace)
{
    auto string = trim_whitespace == TrimWhitespace::Yes
        ? str.trim_whitespace()
        : str;
    if (string.is_empty())
        return {};

    T value = 0;
    auto const count = string.length();
    const T upper_bound = NumericLimits<T>::max();

    for (size_t i = 0; i < count; i++) {
        char digit = string[i];
        u8 digit_val;
        if (value > (upper_bound >> 3))
            return {};

        if (digit >= '0' && digit <= '7') {
            digit_val = digit - '0';
        } else {
            return {};
        }

        value = (value << 3) + digit_val;
    }
    return value;
}

template Optional<u8> convert_to_uint_from_octal(StringView str, TrimWhitespace);
template Optional<u16> convert_to_uint_from_octal(StringView str, TrimWhitespace);
template Optional<u32> convert_to_uint_from_octal(StringView str, TrimWhitespace);
template Optional<u64> convert_to_uint_from_octal(StringView str, TrimWhitespace);
					
						
							| 
									
										
										
										
											2022-10-11 00:48:45 +02:00
										 |  |  | #ifndef KERNEL
 | 
					
						
							|  |  |  | template<typename T> | 
					
						
							|  |  |  | Optional<T> convert_to_floating_point(StringView str, TrimWhitespace trim_whitespace) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     static_assert(IsSame<T, double> || IsSame<T, float>); | 
					
						
							|  |  |  |     auto string = trim_whitespace == TrimWhitespace::Yes | 
					
						
							|  |  |  |         ? str.trim_whitespace() | 
					
						
							|  |  |  |         : str; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     char const* start = string.characters_without_null_termination(); | 
					
						
							|  |  |  |     return parse_floating_point_completely<T>(start, start + string.length()); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | template Optional<double> convert_to_floating_point(StringView str, TrimWhitespace); | 
					
						
							|  |  |  | template Optional<float> convert_to_floating_point(StringView str, TrimWhitespace); | 
					
						
							|  |  |  | #endif
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2023-03-10 08:48:54 +01:00
										 |  |  | bool equals_ignoring_ascii_case(StringView a, StringView b) | 
					
						
							| 
									
										
										
										
											2020-03-22 13:07:45 +01:00
										 |  |  | { | 
					
						
							|  |  |  |     if (a.length() != b.length()) | 
					
						
							|  |  |  |         return false; | 
					
						
							|  |  |  |     for (size_t i = 0; i < a.length(); ++i) { | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |         if (to_ascii_lowercase(a.characters_without_null_termination()[i]) != to_ascii_lowercase(b.characters_without_null_termination()[i])) | 
					
						
							| 
									
										
										
										
											2020-03-22 13:07:45 +01:00
										 |  |  |             return false; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return true; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | bool ends_with(StringView str, StringView end, CaseSensitivity case_sensitivity) | 
					
						
							| 
									
										
										
										
											2020-05-26 02:58:34 -07:00
										 |  |  | { | 
					
						
							|  |  |  |     if (end.is_empty()) | 
					
						
							|  |  |  |         return true; | 
					
						
							|  |  |  |     if (str.is_empty()) | 
					
						
							|  |  |  |         return false; | 
					
						
							|  |  |  |     if (end.length() > str.length()) | 
					
						
							|  |  |  |         return false; | 
					
						
							| 
									
										
										
										
											2020-05-26 02:12:18 -07:00
										 |  |  | 
 | 
					
						
							|  |  |  |     if (case_sensitivity == CaseSensitivity::CaseSensitive) | 
					
						
							|  |  |  |         return !memcmp(str.characters_without_null_termination() + (str.length() - end.length()), end.characters_without_null_termination(), end.length()); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     auto str_chars = str.characters_without_null_termination(); | 
					
						
							|  |  |  |     auto end_chars = end.characters_without_null_termination(); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     size_t si = str.length() - end.length(); | 
					
						
							|  |  |  |     for (size_t ei = 0; ei < end.length(); ++si, ++ei) { | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |         if (to_ascii_lowercase(str_chars[si]) != to_ascii_lowercase(end_chars[ei])) | 
					
						
							| 
									
										
										
										
											2020-05-26 02:12:18 -07:00
										 |  |  |             return false; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return true; | 
					
						
							| 
									
										
										
										
											2020-05-26 02:58:34 -07:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | bool starts_with(StringView str, StringView start, CaseSensitivity case_sensitivity) | 
					
						
							| 
									
										
										
										
											2020-07-18 17:59:38 +01:00
										 |  |  | { | 
					
						
							|  |  |  |     if (start.is_empty()) | 
					
						
							|  |  |  |         return true; | 
					
						
							|  |  |  |     if (str.is_empty()) | 
					
						
							|  |  |  |         return false; | 
					
						
							|  |  |  |     if (start.length() > str.length()) | 
					
						
							|  |  |  |         return false; | 
					
						
							|  |  |  |     if (str.characters_without_null_termination() == start.characters_without_null_termination()) | 
					
						
							|  |  |  |         return true; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (case_sensitivity == CaseSensitivity::CaseSensitive) | 
					
						
							|  |  |  |         return !memcmp(str.characters_without_null_termination(), start.characters_without_null_termination(), start.length()); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     auto str_chars = str.characters_without_null_termination(); | 
					
						
							|  |  |  |     auto start_chars = start.characters_without_null_termination(); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     size_t si = 0; | 
					
						
							|  |  |  |     for (size_t starti = 0; starti < start.length(); ++si, ++starti) { | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |         if (to_ascii_lowercase(str_chars[si]) != to_ascii_lowercase(start_chars[starti])) | 
					
						
							| 
									
										
										
										
											2020-07-18 17:59:38 +01:00
										 |  |  |             return false; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return true; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | bool contains(StringView str, StringView needle, CaseSensitivity case_sensitivity) | 
					
						
							| 
									
										
										
										
											2020-10-20 15:07:03 -06:00
										 |  |  | { | 
					
						
							|  |  |  |     if (str.is_null() || needle.is_null() || str.is_empty() || needle.length() > str.length()) | 
					
						
							|  |  |  |         return false; | 
					
						
							|  |  |  |     if (needle.is_empty()) | 
					
						
							|  |  |  |         return true; | 
					
						
							|  |  |  |     auto str_chars = str.characters_without_null_termination(); | 
					
						
							|  |  |  |     auto needle_chars = needle.characters_without_null_termination(); | 
					
						
							|  |  |  |     if (case_sensitivity == CaseSensitivity::CaseSensitive) | 
					
						
							|  |  |  |         return memmem(str_chars, str.length(), needle_chars, needle.length()) != nullptr; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |     auto needle_first = to_ascii_lowercase(needle_chars[0]); | 
					
						
							| 
									
										
										
										
											2020-11-12 23:44:32 +00:00
										 |  |  |     for (size_t si = 0; si < str.length(); si++) { | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |         if (to_ascii_lowercase(str_chars[si]) != needle_first) | 
					
						
							| 
									
										
										
										
											2020-10-20 15:07:03 -06:00
										 |  |  |             continue; | 
					
						
							| 
									
										
										
										
											2020-11-12 23:44:32 +00:00
										 |  |  |         for (size_t ni = 0; si + ni < str.length(); ni++) { | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |             if (to_ascii_lowercase(str_chars[si + ni]) != to_ascii_lowercase(needle_chars[ni])) { | 
					
						
							| 
									
										
										
										
											2022-03-18 18:02:07 +00:00
										 |  |  |                 // Do not skip ahead here: a match may begin inside the span just compared (e.g. "aab" in "aaab"). | 
					
						
							| 
									
										
										
										
											2020-10-20 15:07:03 -06:00
										 |  |  |                 break; | 
					
						
							| 
									
										
										
										
											2020-11-12 23:44:32 +00:00
										 |  |  |             } | 
					
						
							|  |  |  |             if (ni + 1 == needle.length()) | 
					
						
							|  |  |  |                 return true; | 
					
						
							| 
									
										
										
										
											2020-10-20 15:07:03 -06:00
										 |  |  |         } | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return false; | 
					
						
							|  |  |  | } | 
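For experimenting with case-insensitive substring search outside AK, here is a minimal sketch with standard-library types. `ascii_lower`, `contains_ignoring_ascii_case` and the demo function are hypothetical names; the early returns mirror the order of the checks in `contains` (an empty haystack never matches, an empty needle matches otherwise).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// ASCII-only lowercase, standing in for AK's to_ascii_lowercase.
inline char ascii_lower(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

inline bool contains_ignoring_ascii_case(std::string_view haystack, std::string_view needle)
{
    if (haystack.empty() || needle.size() > haystack.size())
        return false;
    if (needle.empty())
        return true;
    // Try every start offset. Candidates may overlap, so a partial match
    // must not advance the start offset by more than one.
    for (std::size_t start = 0; start + needle.size() <= haystack.size(); ++start) {
        std::size_t matched = 0;
        while (matched < needle.size()
            && ascii_lower(haystack[start + matched]) == ascii_lower(needle[matched]))
            ++matched;
        if (matched == needle.size())
            return true;
    }
    return false;
}

// Usage: "aab" begins at offset 1 of "aaab", inside the first partial match.
inline void demo_contains()
{
    assert(contains_ignoring_ascii_case("Hello World", "WORLD"));
    assert(contains_ignoring_ascii_case("aaab", "AAB"));
}
```

The scan is quadratic in the worst case; it trades speed for being easy to check by hand.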
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | bool is_whitespace(StringView str) | 
					
						
							| 
									
										
										
										
											2020-09-20 18:05:04 +04:30
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2021-07-25 15:05:48 -06:00
										 |  |  |     return all_of(str, is_ascii_space); | 
					
						
							| 
									
										
										
										
											2021-01-03 02:56:02 +03:30
										 |  |  | } | 
					
						
							| 
									
										
										
										
											2020-09-20 18:05:04 +04:30
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | StringView trim(StringView str, StringView characters, TrimMode mode) | 
					
						
							| 
									
										
										
										
											2021-01-03 02:56:02 +03:30
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2020-09-20 18:05:04 +04:30
										 |  |  |     size_t substring_start = 0; | 
					
						
							|  |  |  |     size_t substring_length = str.length(); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (mode == TrimMode::Left || mode == TrimMode::Both) { | 
					
						
							|  |  |  |         for (size_t i = 0; i < str.length(); ++i) { | 
					
						
							|  |  |  |             if (substring_length == 0) | 
					
						
							| 
									
										
										
										
											2022-07-11 17:32:29 +00:00
										 |  |  |                 return ""sv; | 
					
						
							| 
									
										
										
										
											2021-05-25 09:42:01 +02:00
										 |  |  |             if (!characters.contains(str[i])) | 
					
						
							| 
									
										
										
										
											2020-09-20 18:05:04 +04:30
										 |  |  |                 break; | 
					
						
							|  |  |  |             ++substring_start; | 
					
						
							|  |  |  |             --substring_length; | 
					
						
							|  |  |  |         } | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (mode == TrimMode::Right || mode == TrimMode::Both) { | 
					
						
							| 
									
										
										
										
											2022-10-11 14:38:09 +01:00
										 |  |  |         for (size_t i = str.length(); i > 0; --i) { | 
					
						
							| 
									
										
										
										
											2020-09-20 18:05:04 +04:30
										 |  |  |             if (substring_length == 0) | 
					
						
							| 
									
										
										
										
											2022-07-11 17:32:29 +00:00
										 |  |  |                 return ""sv; | 
					
						
							| 
									
										
										
										
											2022-10-11 14:38:09 +01:00
										 |  |  |             if (!characters.contains(str[i - 1])) | 
					
						
							| 
									
										
										
										
											2020-09-20 18:05:04 +04:30
										 |  |  |                 break; | 
					
						
							|  |  |  |             --substring_length; | 
					
						
							|  |  |  |         } | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     return str.substring_view(substring_start, substring_length); | 
					
						
							|  |  |  | } | 
					
						
							| 
									
										
										
										
											2021-01-12 23:28:45 +03:30
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | StringView trim_whitespace(StringView str, TrimMode mode) | 
					
						
							| 
									
										
										
										
											2021-05-25 09:42:01 +02:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2022-07-11 17:32:29 +00:00
										 |  |  |     return trim(str, " \n\t\v\f\r"sv, mode); | 
					
						
							| 
									
										
										
										
											2021-05-25 09:42:01 +02:00
										 |  |  | } | 
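The start/length bookkeeping in `trim` can be sketched with `std::string_view` as follows. `TrimSide` and `trim_chars` are hypothetical stand-ins for `TrimMode` and `trim`; this is an illustration under those assumptions, not the AK code.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

enum class TrimSide { Left, Right, Both }; // hypothetical stand-in for TrimMode

inline std::string_view trim_chars(std::string_view text, std::string_view characters, TrimSide side)
{
    std::size_t start = 0;
    std::size_t length = text.size();
    if (side == TrimSide::Left || side == TrimSide::Both) {
        // Drop leading characters that belong to the trim set.
        while (length > 0 && characters.find(text[start]) != std::string_view::npos) {
            ++start;
            --length;
        }
    }
    if (side == TrimSide::Right || side == TrimSide::Both) {
        // Drop trailing characters, measured from the already-trimmed start.
        while (length > 0 && characters.find(text[start + length - 1]) != std::string_view::npos)
            --length;
    }
    return text.substr(start, length);
}

// Usage with the same whitespace set that trim_whitespace passes to trim.
inline void demo_trim()
{
    assert(trim_chars("  hi \n", " \n\t\v\f\r", TrimSide::Both) == "hi");
    assert(trim_chars("   ", " ", TrimSide::Both).empty());
}
```

Because the result is a view into the input, no allocation happens and the caller must keep the original buffer alive.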
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | Optional<size_t> find(StringView haystack, char needle, size_t start) | 
					
						
							| 
									
										
										
										
											2021-01-12 23:28:45 +03:30
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2021-07-01 14:58:37 +02:00
										 |  |  |     if (start >= haystack.length()) | 
					
						
							|  |  |  |         return {}; | 
					
						
							|  |  |  |     for (size_t i = start; i < haystack.length(); ++i) { | 
					
						
							|  |  |  |         if (haystack[i] == needle) | 
					
						
							|  |  |  |             return i; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return {}; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | Optional<size_t> find(StringView haystack, StringView needle, size_t start) | 
					
						
							| 
									
										
										
										
											2021-07-01 14:58:37 +02:00
										 |  |  | { | 
					
						
							|  |  |  |     if (start > haystack.length()) | 
					
						
							|  |  |  |         return {}; | 
					
						
							|  |  |  |     auto index = AK::memmem_optional( | 
					
						
							|  |  |  |         haystack.characters_without_null_termination() + start, haystack.length() - start, | 
					
						
							| 
									
										
										
										
											2021-01-12 23:28:45 +03:30
										 |  |  |         needle.characters_without_null_termination(), needle.length()); | 
					
						
							| 
									
										
										
										
											2021-07-01 14:58:37 +02:00
										 |  |  |     return index.has_value() ? (*index + start) : index; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | Optional<size_t> find_last(StringView haystack, char needle) | 
					
						
							| 
									
										
										
										
											2021-07-01 14:58:37 +02:00
										 |  |  | { | 
					
						
							|  |  |  |     for (size_t i = haystack.length(); i > 0; --i) { | 
					
						
							|  |  |  |         if (haystack[i - 1] == needle) | 
					
						
							|  |  |  |             return i - 1; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return {}; | 
					
						
							| 
									
										
										
										
											2021-01-12 23:28:45 +03:30
										 |  |  | } | 
					
						
							| 
									
										
										
										
											2021-02-20 22:39:22 +01:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-12-15 21:20:14 +00:00
										 |  |  | Optional<size_t> find_last(StringView haystack, StringView needle) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     for (size_t i = haystack.length(); i > 0; --i) { | 
					
						
							|  |  |  |         auto value = StringUtils::find(haystack, needle, i - 1); | 
					
						
							|  |  |  |         if (value.has_value()) | 
					
						
							|  |  |  |             return value; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     return {}; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-09-30 21:19:53 +02:00
										 |  |  | Optional<size_t> find_last_not(StringView haystack, char needle) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     for (size_t i = haystack.length(); i > 0; --i) { | 
					
						
							|  |  |  |         if (haystack[i - 1] != needle) | 
					
						
							|  |  |  |             return i - 1; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return {}; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | Vector<size_t> find_all(StringView haystack, StringView needle) | 
					
						
							| 
									
										
										
										
											2021-07-01 17:00:34 +02:00
										 |  |  | { | 
					
						
							|  |  |  |     Vector<size_t> positions; | 
					
						
							|  |  |  |     size_t current_position = 0; | 
					
						
							|  |  |  |     while (current_position <= haystack.length()) { | 
					
						
							|  |  |  |         auto maybe_position = AK::memmem_optional( | 
					
						
							|  |  |  |             haystack.characters_without_null_termination() + current_position, haystack.length() - current_position, | 
					
						
							|  |  |  |             needle.characters_without_null_termination(), needle.length()); | 
					
						
							|  |  |  |         if (!maybe_position.has_value()) | 
					
						
							|  |  |  |             break; | 
					
						
							|  |  |  |         positions.append(current_position + *maybe_position); | 
					
						
							|  |  |  |         current_position += *maybe_position + 1; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return positions; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-11-11 00:55:02 +01:00
										 |  |  | Optional<size_t> find_any_of(StringView haystack, StringView needles, SearchDirection direction) | 
					
						
							| 
									
										
										
										
											2021-07-01 18:12:21 +02:00
										 |  |  | { | 
					
						
							|  |  |  |     if (haystack.is_empty() || needles.is_empty()) | 
					
						
							|  |  |  |         return {}; | 
					
						
							|  |  |  |     if (direction == SearchDirection::Forward) { | 
					
						
							|  |  |  |         for (size_t i = 0; i < haystack.length(); ++i) { | 
					
						
							|  |  |  |             if (needles.contains(haystack[i])) | 
					
						
							|  |  |  |                 return i; | 
					
						
							|  |  |  |         } | 
					
						
							|  |  |  |     } else if (direction == SearchDirection::Backward) { | 
					
						
							|  |  |  |         for (size_t i = haystack.length(); i > 0; --i) { | 
					
						
							|  |  |  |             if (needles.contains(haystack[i - 1])) | 
					
						
							|  |  |  |                 return i - 1; | 
					
						
							|  |  |  |         } | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return {}; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-02-16 00:24:43 +02:00
										 |  |  | #ifndef KERNEL
 | 
					
						
							| 
									
										
										
										
											2022-12-04 18:02:33 +00:00
										 |  |  | DeprecatedString to_snakecase(StringView str) | 
					
						
							| 
									
										
										
										
											2021-02-20 22:39:22 +01:00
										 |  |  | { | 
					
						
							|  |  |  |     auto should_insert_underscore = [&](auto i, auto current_char) { | 
					
						
							|  |  |  |         if (i == 0) | 
					
						
							|  |  |  |             return false; | 
					
						
							|  |  |  |         auto previous_ch = str[i - 1]; | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |         if (is_ascii_lower_alpha(previous_ch) && is_ascii_upper_alpha(current_char)) | 
					
						
							| 
									
										
										
										
											2021-02-20 22:39:22 +01:00
										 |  |  |             return true; | 
					
						
							|  |  |  |         if (i >= str.length() - 1) | 
					
						
							|  |  |  |             return false; | 
					
						
							|  |  |  |         auto next_ch = str[i + 1]; | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |         if (is_ascii_upper_alpha(current_char) && is_ascii_lower_alpha(next_ch)) | 
					
						
							| 
									
										
										
										
											2021-02-20 22:39:22 +01:00
										 |  |  |             return true; | 
					
						
							|  |  |  |         return false; | 
					
						
							|  |  |  |     }; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     StringBuilder builder; | 
					
						
							|  |  |  |     for (size_t i = 0; i < str.length(); ++i) { | 
					
						
							|  |  |  |         auto ch = str[i]; | 
					
						
							|  |  |  |         if (should_insert_underscore(i, ch)) | 
					
						
							|  |  |  |             builder.append('_'); | 
					
						
							| 
									
										
										
										
											2021-07-06 13:46:46 +02:00
										 |  |  |         builder.append_as_lowercase(ch); | 
					
						
							| 
									
										
										
										
											2021-02-20 22:39:22 +01:00
										 |  |  |     } | 
					
						
							| 
									
										
										
										
											2022-12-06 01:12:49 +00:00
										 |  |  |     return builder.to_deprecated_string(); | 
					
						
							| 
									
										
										
										
											2021-02-20 22:39:22 +01:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-12-04 18:02:33 +00:00
										 |  |  | DeprecatedString to_titlecase(StringView str) | 
					
						
							| 
									
										
										
										
											2021-08-26 13:55:41 -04:00
										 |  |  | { | 
					
						
							|  |  |  |     StringBuilder builder; | 
					
						
							|  |  |  |     bool next_is_upper = true; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     for (auto ch : str) { | 
					
						
							|  |  |  |         if (next_is_upper) | 
					
						
							| 
									
										
										
										
											2022-10-20 08:44:18 -04:00
										 |  |  |             builder.append(to_ascii_uppercase(ch)); | 
					
						
							| 
									
										
										
										
											2021-08-26 13:55:41 -04:00
										 |  |  |         else | 
					
						
							| 
									
										
										
										
											2022-10-20 08:44:18 -04:00
										 |  |  |             builder.append(to_ascii_lowercase(ch)); | 
					
						
							| 
									
										
										
										
											2021-08-26 13:55:41 -04:00
										 |  |  |         next_is_upper = ch == ' '; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-12-06 01:12:49 +00:00
										 |  |  |     return builder.to_deprecated_string(); | 
					
						
							| 
									
										
										
										
											2021-08-26 13:55:41 -04:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-12-04 18:02:33 +00:00
										 |  |  | DeprecatedString invert_case(StringView str) | 
					
						
							| 
									
										
										
										
											2022-05-18 22:23:45 -07:00
										 |  |  | { | 
					
						
							|  |  |  |     StringBuilder builder(str.length()); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     for (auto ch : str) { | 
					
						
							|  |  |  |         if (is_ascii_lower_alpha(ch)) | 
					
						
							|  |  |  |             builder.append(to_ascii_uppercase(ch)); | 
					
						
							|  |  |  |         else | 
					
						
							|  |  |  |             builder.append(to_ascii_lowercase(ch)); | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-12-06 01:12:49 +00:00
										 |  |  |     return builder.to_deprecated_string(); | 
					
						
							| 
									
										
										
										
											2022-05-18 22:23:45 -07:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-12-04 18:02:33 +00:00
										 |  |  | DeprecatedString replace(StringView str, StringView needle, StringView replacement, ReplaceMode replace_mode) | 
					
						
							| 
									
										
										
										
											2021-09-11 02:15:44 +03:00
										 |  |  | { | 
					
						
							|  |  |  |     if (str.is_empty()) | 
					
						
							|  |  |  |         return str; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     Vector<size_t> positions; | 
					
						
							| 
									
										
										
										
											2022-07-05 22:33:15 +02:00
										 |  |  |     if (replace_mode == ReplaceMode::All) { | 
					
						
							| 
									
										
										
										
											2021-09-11 02:15:44 +03:00
										 |  |  |         positions = str.find_all(needle); | 
					
						
							|  |  |  |         if (!positions.size()) | 
					
						
							|  |  |  |             return str; | 
					
						
							|  |  |  |     } else { | 
					
						
							|  |  |  |         auto pos = str.find(needle); | 
					
						
							|  |  |  |         if (!pos.has_value()) | 
					
						
							|  |  |  |             return str; | 
					
						
							|  |  |  |         positions.append(pos.value()); | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     StringBuilder replaced_string; | 
					
						
							|  |  |  |     size_t last_position = 0; | 
					
						
							|  |  |  |     for (auto& position : positions) { | 
					
						
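							|  |  |  |         // find_all() reports overlapping matches; skip any that start inside the previous replacement. | 
							|  |  |  |         if (position < last_position) | 
							|  |  |  |             continue; | 
							|  |  |  |         replaced_string.append(str.substring_view(last_position, position - last_position)); | 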
					
						
							|  |  |  |         replaced_string.append(replacement); | 
					
						
							|  |  |  |         last_position = position + needle.length(); | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     replaced_string.append(str.substring_view(last_position, str.length() - last_position)); | 
					
						
							| 
									
										
										
										
											2023-01-26 18:58:09 +00:00
										 |  |  |     return replaced_string.to_deprecated_string(); | 
					
						
							| 
									
										
										
										
											2021-09-11 02:15:44 +03:00
										 |  |  | } | 
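The copy-then-substitute pattern used by `replace` can be sketched with standard-library strings. `replace_all` and `demo_replace_all` are hypothetical names. As a design choice in this sketch, each search resumes after the previous match, so matches never overlap, and an empty needle returns the input unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

inline std::string replace_all(std::string_view text, std::string_view needle, std::string_view replacement)
{
    if (text.empty() || needle.empty())
        return std::string(text);
    std::string result;
    std::size_t last = 0; // end of the previous match in text
    for (std::size_t pos = text.find(needle); pos != std::string_view::npos; pos = text.find(needle, last)) {
        result.append(text.substr(last, pos - last)); // untouched text before the match
        result.append(replacement);
        last = pos + needle.size();
    }
    result.append(text.substr(last)); // tail after the final match
    return result;
}

// Usage: the second "aa" in "aaa" overlaps the first and is not replaced.
inline void demo_replace_all()
{
    assert(replace_all("a-b-c", "-", "+") == "a+b+c");
    assert(replace_all("aaa", "aa", "b") == "ba");
}
```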
					
						
							| 
									
										
											  
											
											
										 
ErrorOr<String> replace(String const& haystack, StringView needle, StringView replacement, ReplaceMode replace_mode)
{
    if (haystack.is_empty())
        return haystack;

    // FIXME: Propagate Vector allocation failures (or do this without putting positions in a vector)
    Vector<size_t> positions;
    if (replace_mode == ReplaceMode::All) {
        positions = haystack.bytes_as_string_view().find_all(needle);
        if (!positions.size())
            return haystack;
    } else {
        auto pos = haystack.bytes_as_string_view().find(needle);
        if (!pos.has_value())
            return haystack;
        positions.append(pos.value());
    }

    StringBuilder replaced_string;
    size_t last_position = 0;
    for (auto& position : positions) {
        replaced_string.append(haystack.bytes_as_string_view().substring_view(last_position, position - last_position));
        replaced_string.append(replacement);
        last_position = position + needle.length();
    }
    replaced_string.append(haystack.bytes_as_string_view().substring_view(last_position, haystack.bytes_as_string_view().length() - last_position));
    return replaced_string.to_string();
}
					
						
#endif

// TODO: Benchmark against KMP (AK/MemMem.h) and switch over if it's faster for short strings too
size_t count(StringView str, StringView needle)
{
    if (needle.is_empty())
        return str.length();

    size_t count = 0;
    // Written as an addition so the bound cannot underflow when needle is longer than str.
    for (size_t i = 0; i + needle.length() <= str.length(); ++i) {
        if (str.substring_view(i).starts_with(needle))
            count++;
    }
    return count;
}
					
						
}

}