Commit graph

67 commits

Author SHA1 Message Date
Timothy Flynn
a740bfd8ff AK+LibUnicode: Implement Unicode-aware UTF-16 case transformations 2025-07-25 18:16:22 +02:00
Timothy Flynn
7f069efbc4 AK: Implement a flyweight string for Utf16String
Utf16FlyString more or less works exactly the same as FlyString. It will
store the raw encoded data of the string instance. If the string is a
short ASCII string, Utf16FlyString holds the ShortString bytes; else,
Utf16FlyString holds a pointer to the Utf16StringData.
2025-07-18 12:45:38 -04:00
Timothy Flynn
fe676585f5 AK: Add a UTF-16 string with optimized short- and ASCII-string storage
This is a strictly UTF-16 string with some optimizations for ASCII.

* If created from a short UTF-8 or UTF-16 string that is also ASCII,
  then the string is stored in an inlined byte buffer.

* If created with a long UTF-8 or UTF-16 string that is also ASCII,
  then the string is stored in an outlined char buffer.

* If created with a short or long UTF-8 or UTF-16 string that is not
  ASCII, then the string is stored in an outlined char16 buffer.

We do not store short non-ASCII text in the inlined buffer to avoid
confusion with operations such as `length_in_code_units` and
`code_unit_at`. For example, "😀" would be stored as 4 UTF-8 bytes
in short string form. But we still want `length_in_code_units` to
be 2, and `code_unit_at(0)` to be 0xD83D.
2025-07-18 12:45:38 -04:00
ayeteadoe
25f5936dee CMake: Rename serenity_* helper functions/macros to ladybird_* 2025-07-03 23:19:41 +02:00
Timothy Flynn
62d9a84b8d AK+Everywhere: Replace custom number parsers with fast_float
Our floating point number parser was based on the fast_float library:
https://github.com/fastfloat/fast_float

However, our implementation only supports 8-bit characters. To support
UTF-16, we will need to be able to convert char16_t-based strings to
numbers as well. This works out-of-the-box with fast_float.

We can also use fast_float for integer parsing.
2025-07-03 09:51:56 -04:00
Timothy Flynn
35a1832d08 Tests/AK: Rename TestUtf16 / TestUtf8 to TestUtf16View / TestUtf8View
These are the files they actually test, so let's rename them to avoid
confusion with an upcoming Utf16String test.
2025-07-03 09:51:56 -04:00
Tomasz Strejczek
8f8e51b1fc AK: Implement AK::UnixDateTime::to_string()
Copy implementation of LibCore::DateTime::to_string()
to AK.
Rename TestDuration.cpp to TestTime.cpp and add
there tests for to_string().
2025-06-19 18:42:45 -06:00
Tomasz Strejczek
e03c558a0a AK: Implement demangle() for MSVC ABI
This implements demangle() using Windows API. Also some rudimentary
test is provided.
2025-06-17 18:39:18 -06:00
ayeteadoe
8cf01a25c2 AK: Add initial support for AK testsuite on Windows
We now explicitly enabling support for the minimum libraries needed
to build and run the AK testsuite. 81/82 tests are running and
passing. The exception is LexicalPath, as some path behaviour on
Windows is different than Unix, so the current tests will have lots of
platform specific failures. The implementer of LexicalPathWindows
recommended windows-specific tests here, so I will do that in a
follow up.
2025-05-20 10:58:43 -06:00
Ashton
4b3a3b0856 AK: Remove redundant TestPrint test
This test was only useful when AK/PrintfImplementation.h existed. But
that was removed 11 months ago, so since then this has just been
testing std library functions not implemented by us.
2025-05-18 19:18:13 -06:00
Andrew Kaster
01ac48b36f AK: Support storing blocks in AK::Function
This has two slightly different implementations for ARC and non-ARC
compiler modes. The main idea is to store a block pointer as our
closure and use either ARC magic or BlockRuntime methods to manage
the memory for the block. Things are complicated by the fact that
we don't yet force-enable swift, so we can't count on the swift.org
llvm fork being our compiler toolchain. The patch adds some CMake
checks and ifdefs to still support environments without support
for blocks or ARC.
2025-03-18 17:15:08 -06:00
Ivan Zuzak
49fc569136 AK: Add tests for GenericShorthands 2025-01-09 09:03:50 +01:00
Pavel Shliak
b521badbef AK: Remove QuickSelect 2024-12-05 16:53:28 +01:00
Pavel Shliak
ede0dbafc6 AK: Remove Statistics.h
This also removes MedianCut and GIFWriter
2024-12-05 16:53:28 +01:00
Jelle Raaijmakers
f88acedc8f AK: Remove unused floating point conversion code
Currently I don't expect this code to be ever used in Ladybird.
2024-10-08 19:02:51 +02:00
Andrew Kaster
782926601d Tests: Convert Swift tests to use Testing module where possible
The AK tests can't seem to use it because it crashes the frontend :)
2024-08-28 21:27:35 -06:00
Andrew Kaster
315a666e53 Tests: Add test to verify CxxSequence protocol conformance of containers
Building the test in debug mode currently crashes the swift frontend,
so we'll need to build this in release mode until that's fixed.
2024-08-17 17:44:37 -06:00
Andreas Kling
b88e0eb50a AK: Remove unused Complex.h 2024-06-18 12:00:14 +02:00
Andreas Kling
fe1aec124e AK: Remove unused ArbitrarySizedEnum class 2024-06-18 12:00:14 +02:00
Andreas Kling
6321e97b09 AK: Remove various unused things 2024-06-04 09:19:39 +02:00
Dan Klishch
45a0ba2167 AK: Introduce AK::enumerate
Co-Authored-By: Tim Flynn <trflynn89@pm.me>
2024-03-23 09:02:58 -04:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punnycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
vincent-rg
a9df60ff1c AK: Update OptionParser::m_arg_index by substracting skipped args
On argument swapping to put positional ones toward the end,
m_arg_index was pointing at "last arg  index" + "skipped args" +
"consumed args" and thus was pointing ahead of the skipped ones.

m_arg_index now points after the current parsed option arguments.
2024-02-06 00:08:30 +01:00
Aliaksandr Kalenik
e394971209 AK+LibWeb: Use segmented vector to store commands in RecordingPainter
Using a vector to represent a list of painting commands results in many
reallocations, especially on pages with a lot of content.

This change addresses it by introducing a SegmentedVector, which allows
fast appending by representing a list as a sequence of fixed-size
vectors. Currently, this new data structure supports only the
operations used in RecordingPainter, which are appending and iterating.
2023-12-30 23:02:46 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Gurkirat Singh
f1b79e0cd3 AK: Implement slugify function for URL slug generation
The slugify function is used to convert input into URL-friendly slugs.
It processes each character in the input, keeping ascii alpha characters
after lowercase and replacing non-alphanum characters with the glue
character or a space if multiple spaces are encountered consecutively.
The resulting string is trimmed of leading and trailing whitespace, and
any internal whitespace is replaced with the glue character.

It is currently used in LibMarkdown headings generation code.
2023-10-30 10:39:59 +00:00
Tim Ledbetter
bf75ecdcf7 Tests/AK: Add FuzzyMatch tests 2023-10-06 22:09:18 +02:00
kleines Filmröllchen
213025f210 AK: Rename Time to Duration
That's what this class really is; in fact that's what the first line of
the comment says it is.

This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
2023-05-24 23:18:07 +02:00
Andreas Kling
b7e847e58b AK: Fix crash during teardown of self-owning objects
We now null out smart pointers *before* calling unref on the pointee.
This ensures that the same smart pointer can't be used to acquire a new
reference to the pointee after its destruction has begun.

I ran into this when destroying a non-empty IntrusiveList of RefPtrs,
but the problem was more general so this fixes it for all of RefPtr,
NonnullRefPtr, OwnPtr and NonnullOwnPtr.
2023-04-21 18:15:00 +02:00
Jelle Raaijmakers
f4342c9118 AK: Add AK::SIMD::exp_approximate
This approximation tries to generate values within 0.1% of their actual
expected value. Microbenchmarks indicate that this iterative SIMD
version can be up to 60x faster than `AK::SIMD::exp`.
2023-02-18 01:45:00 +01:00
Tim Schumacher
e8d5e938de AK: Remove the deprecated Stream implementation :^) 2023-02-08 19:18:26 +00:00
Staubfinger
da1023fcc5 AK: Add thresholds to quickselect_inline and Statistics::Median
I did a bit of Profiling and made the quickselect and median algorithms
use the best of option for the respective input size.
2023-02-03 19:04:15 +01:00
Staubfinger
becd6d106f AK: Testing for AK::quickselect_inline
The testing code found here is mainly derived from the Tests for
`AK::quick_sort`.
2023-02-03 19:04:15 +01:00
Tim Schumacher
093cf428a3 AK: Move memory streams from LibCore 2023-01-29 19:16:44 -07:00
Tim Schumacher
2470dd3bb5 AK: Move bit streams from LibCore 2023-01-29 19:16:44 -07:00
Tim Schumacher
ae64b68717 AK: Deprecate the old AK::Stream
This also removes a few cases where the respective header wasn't
actually required to be included.
2023-01-29 19:16:44 -07:00
Tim Schumacher
7526f9a8b7 AK: Remove CircularDuplexStream 2023-01-14 12:05:52 -05:00
Timothy Flynn
1d4f287582 AK: Implement FlyString for the new String class
This implements a FlyString that will de-duplicate String instances. The
FlyString will store the raw encoded data of the String instance: If the
String is a short string, FlyString holds the String::ShortString bytes;
otherwise FlyString holds a pointer to the Detail::StringData.

FlyString itself does not know about String's storage or how to refcount
its Detail::StringData. It defers to String to implement these details.
2023-01-12 11:23:58 +01:00
Timothy Flynn
6fcc1c7426 AK+LibUnicode: Provide Unicode-aware String case transformations
Since AK can't refer to LibUnicode directly, the strategy here is that
if you need case transformations, you can link LibUnicode and receive
them. If you try to use either of these methods without linking it, then
you'll of course get a linker error (note we don't do any fallbacks to
e.g. ASCII case transformations). If you don't need these methods, you
don't have to link LibUnicode.
2023-01-09 19:23:46 -07:00
Lucas CHOLLET
f12e81b74a AK: Add CircularBuffer
The class is very similar to `CircularDuplexStream` in its behavior.
Main differences are that `CircularBuffer`:
 - does not inherit from `AK::Stream`
 - uses `ErrorOr` for its API
 - is heap allocated (and OOM-Safe)

 This patch also add some tests.
2022-12-31 04:44:17 -07:00
Lenny Maiorani
f2336d0144 AK+Everywhere: Move custom deleter capability to OwnPtr
`OwnPtrWithCustomDeleter` was a decorator which provided the ability
to add a custom deleter to `OwnPtr` by wrapping and taking the deleter
as a run-time argument to the constructor. This solution means that no
additional space is needed for the `OwnPtr` because it doesn't need to
store a pointer to the deleter, but comes at the cost of having an
extra type that stores a pointer for every instance.

This logic is moved directly into `OwnPtr` by adding a template
argument that is defaulted to the default deleter for the type. This
means that the type itself stores the pointer to the deleter instead
of every instance and adds some type safety by encoding the deleter in
the type itself instead of taking a run-time argument.
2022-12-17 16:00:08 -05:00
Lenny Maiorani
5875e66531 Tests: ASCII-betical-ize CMakeLists AK tests 2022-12-17 18:32:26 +01:00
Marc Luqué
22f472249d AK: Introduce cutoff to insertion sort for Quicksort
Implement insertion sort in AK. The cutoff value 7 is a magic number
here, values [5, 15] should work well. Main idea of the cutoff is to
reduce recursion performed by quicksort to speed up sorting
of small partitions.
2022-12-12 15:03:57 +00:00
Andreas Kling
a3e82eaad3 AK: Introduce the new String, replacement for DeprecatedString
DeprecatedString (formerly String) has been with us since the start,
and it has served us well. However, it has a number of shortcomings
that I'd like to address.

Some of these issues are hard if not impossible to solve incrementally
inside of DeprecatedString, so instead of doing that, let's build a new
String class and then incrementally move over to it instead.

Problems in DeprecatedString:

- It assumes string allocation never fails. This makes it impossible
  to use in allocation-sensitive contexts, and is the reason we had to
  ban DeprecatedString from the kernel entirely.

- The awkward null state. DeprecatedString can be null. It's different
  from the empty state, although null strings are considered empty.
  All code is immediately nicer when using Optional<DeprecatedString>
  but DeprecatedString came before Optional, which is how we ended up
  like this.

- The encoding of the underlying data is ambiguous. For the most part,
  we use it as if it's always UTF-8, but there have been cases where
  we pass around strings in other encodings (e.g ISO8859-1)

- operator[] and length() are used to iterate over DeprecatedString one
  byte at a time. This is done all over the codebase, and will *not*
  give the right results unless the string is all ASCII.

How we solve these issues in the new String:

- Functions that may allocate now return ErrorOr<String> so that ENOMEM
  errors can be passed to the caller.

- String has no null state. Use Optional<String> when needed.

- String is always UTF-8. This is validated when constructing a String.
  We may need to add a bypass for this in the future, for cases where
  you have a known-good string, but for now: validate all the things!

- There is no operator[] or length(). You can get the underlying data
  with bytes(), but for iterating over code points, you should be using
  an UTF-8 iterator.

Furthermore, it has two nifty new features:

- String implements a small string optimization (SSO) for strings that
  can fit entirely within a pointer. This means up to 3 bytes on 32-bit
  platforms, and 7 bytes on 64-bit platforms. Such small strings will
  not be heap-allocated.

- String can create substrings without making a deep copy of the
  substring. Instead, the superstring gets +1 refcount from the
  substring, and it acts like a view into the superstring. To make
  substrings like this, use the substring_with_shared_superstring() API.

One caveat:

- String does not guarantee that the underlying data is null-terminated
  like DeprecatedString does today. While this was nifty in a handful of
  places where we were calling C functions, it did stand in the way of
  shared-superstring substrings.
2022-12-06 15:21:26 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Dan Klishch
fdc53a5995 AK: Add framework for a unified floating point to string conversion
Currently, the floating point to string conversion is implemented
several times across the codebase. This commit provides a pretty
low-level function to unify all of such conversions. It converts the
given double to a fixed point decimal satisfying a few correctness
criteria.
2022-11-03 20:17:09 -06:00
davidot
53b7f5e6a1 AK: Add an exact and fast floating point parsing algorithm
This is based on the paper by Daniel Lemire called
"Number parsing at a Gigabyte per second", currently available at
https://arxiv.org/abs/2101.11408
An implementation can be found at
https://github.com/fastfloat/fast_float

To support both strtod like methods and String::to_double we have two
different APIs. The parse_first_floating_point gives back both the
result, next character to read and the error/out of range status.
Out of range here means we rounded to infinity 0.

The other API, parse_floating_point_completely, will return a floating
point only if the given character range contains just the floating point
and nothing else. This can be much faster as we can skip actually
computing the value if we notice we did not parse the whole range.

Both of these APIs support a very lenient format to be usable in as many
places as possible. Also it does not check for "named" values like
"nan", "inf", "NAN" etc. Because this can be different for every usage.

For integers and small values this new method is not faster and often
even a tiny bit slower than the current strtod implementation. However
the strtod implementation is wrong for a lot of values and has a much
less predictable running time.

For correctness this method was tested against known string -> double
datasets from https://github.com/nigeltao/parse-number-fxx-test-data
This method gives 100% accuracy.
The old strtod gave an incorrect value in over 50% of the numbers
tested.
2022-10-23 15:48:45 +02:00
Jelle Raaijmakers
8483064b59 AK: Add FloatingPoint.h
This is a set of functions that allow you to convert between arbitrary
IEEE 754 floating point types, as long as they can be represented
within 64 bits. Conversion methods between floats and doubles are
provided, as well as a generic `float_to_float()`.

Example usage:

  #include <AK/FloatingPoint.h>

  double val = 1.234;
  auto weird_f16 =
      convert_from_native_double<FloatingPointBits<0, 6, 10>>(val);

Signed and unsigned floats are supported, and both NaN and +/-Inf are
handled correctly. Values that do not fit in the target floating point
type are clamped.
2022-08-27 12:28:05 +02:00
safarp
704e1d13f4 AK: Allow printing wide characters using %ls modifier 2022-03-30 11:30:43 +04:30
Linus Groh
22308e52cf AK: Add an ArbitrarySizedEnum template
This is an enum-like type that works with arbitrary sized storage > u64,
which is the limit for a regular enum class - which limits it to 64
members when needing bit field behavior.

Co-authored-by: Ali Mohammad Pur <mpfard@serenityos.org>
2022-03-27 18:54:56 +02:00