Replace the remaining UTF-16-mode StringBuilder instances in LibWeb
with Utf16StringBuilder when they assemble Utf16String values or UTF-16
text views. This keeps those paths in UTF-16 throughout and uses
explicit ASCII append helpers for ASCII-only pieces.
Keep StringBuilder in place for JSON, markup, CSS serialization, and
other byte-oriented string construction paths.
Enable -Wexit-time-destructors for all in-tree library targets and
update process-lifetime library statics so they no longer register
exit-time destructors. Long-lived caches, lookup tables, singleton
registries, and generated constants now use NeverDestroyed or leaked
references where the data is intended to live until process exit.
Update LibWeb, LibLine, and the binding generators so regenerated
sources follow the same rule instead of reintroducing destructed
statics.
Represent BufferSource and ArrayBufferView as ordinary IDL typedefs over
their underlying union types, instead of special casing in the IDL
generator. This allows the union conversion/return machinery handle
these types consistently with other typedefs, which removes buffer
specific paths from the IDL generator.
This necessitates changing the WebIDL::BufferSource and
WebIDL::ArrayBufferView classes as views over these variants. This
replaces the old GC backed BufferableObject wrapper structure and
provide convenience helpers to determine things such as the byte length,
byte offset, backing buffer, and typed-array APIs.
Represent WebIDL C++ types with a single CppType model that tracks
nullability, optional presence, and contained storage.
GC-like values now use GC::Ref/GC::Ptr directly, while containers choose
"plain", "Root", or "Conservative" container types depending on what
they contain. For example, sequence<Element> becomes a RootVector of
GC::Ref values, while sequence<SomeDictionary> becomes a
ConservativeVector only when the dictionary contains GC-like values.
This moves the generated bindings away from wrapping GC values in
GC::Root by default.
This has broad fallout as the types passed to interfaces for GC
objects changes almost fully across the board.
Previously we were inconsistent by generating code for enum definitions
but not generating code for dictionaries. With future changes to the
IDL generator to expose helpers to convert to and from IDL values
this produced circular depdendencies. To solve this problem, also
generate the dictionary definitions in bindings headers.
Previously, the LibWeb bindings generator would output multiple per
interface files like Prototype/Constructor/Namespace/GlobalMixin
depending on the contents of that IDL file.
This complicates the build system as it means that it does not know
what files will be generated without knowledge of the contents of that
IDL file.
Instead, for each IDL file only generate a single Bindings/<IDLFile>.h
and Bindings/<IDLFile>.cpp.
While this does cost us an extra byte to serialize as it
contains _all_ interface names instead of the set of serializable
types, doing this will allow us to remove to use the same
enum for checking whether that interface is exposed in a future
commit.
We have a couple of ways to designate spec notes and (our) developer
notes in comments, but we never really settled on a single approach. As
a result, we have a bit of a mixed bag of note comments on our hands.
To the extent that I could find them, I changed developer notes to
`// NB: ...` and changed spec notes to `// NOTE: ...`. The rationale for
this is that in most web specs, notes are prefixed by `NOTE: ...` so
this makes it easier to copy paste verbatim. The choice for `NB: ...` is
pretty arbitrary, but it makes it stand out from the regular spec notes
and it was already in wide use in our codebase.
Previously we did not execute the constructor steps if we constructed a
Blob from a ByteBuffer and a String. Though we only construct a Blob
from ByteBuffer and String internally we still need to run these steps.
By creating the new type 'BlobPartsOrByteBuffer' we can consolidate
those two approaches to creating a Blob into our already existing
constructor steps.
This first pass only applies to the following two cases:
- Public functions returning a view type into an object they own
- Public ctors storing a view type
This catches a grand total of one (1) issue, which is fixed in
the previous commit.
This required some changes in LibURL & LibIPC since it has its own
definition of an BlobURLEntry. For now, we don't have a concrete usage
of MediaSource in LibURL so it is defined as an empty struct.
This removes one FIXME in an idl file.
Most locations already make this assumption, but we had a few that would
silently ignore data mismatches. Let's just always assume the type is
correct for now. If a bad actor has a hold of our transport socket, it's
probably better to crash WebContent than to continue on with incorrect
data.
In the future, maybe we will want to throw an exception instead.
Our structured serialization implementation had its own bespoke encoder
and decoder to serialize JS values. It also used a u32 buffer under the
hood, which made using its structures a bit awkward. We had previously
worked around its data structures in transferable streams, which nested
transfers of MessagePort instances. We basically had to add hooks into
the MessagePort to route to the correct transfer receiving steps, and
we could not invoke the correct AOs directly as the spec dictates.
We now use IPC mechanics to encode and decode data. This works because,
although we are encoding JS values, we are only ultimately encoding
primitive and basic AK types. The resulting data structures actually
enforce that we implement transferable streams exactly as the spec is
worded (I had planned to do that in a separate commit, but the fallout
of this patch actually required that change).
Our currently implementation of structured serialization has a design
flaw, where if the serialized/transferred type was not used in the
destination realm, it would not be seen as exposed and thus we would
not re-create the type on the other side.
This is very common, for example, transferring a MessagePort to a just
inserted iframe, or the just inserted iframe transferring a MessagePort
to it's parent. This is what Google reCAPTCHA does.
This flaw occurred due to relying on lazily populated HashMaps of
constructors, namespaces and interfaces. This commit changes it so that
per-type "is exposed" implementations are generated.
Since it no longer relies on interface name strings, this commit
changes serializable types to indicate their type with an enum,
in line with how transferrable types indicate their type.
This makes Google reCAPTCHA work on https://www.google.com/recaptcha/api2/demo
It currently doesn't work on non-Google origins due to a separate
same-origin policy bug.
By definition, the web allows lonely surrogates by default. Let's have
our string APIs reflect this, so we don't have to pass an allow option
all over the place.
To prepare for an upcoming Utf16String, this migrates Utf16View to store
its data as a char16_t. Most function definitions are moved inline and
made constexpr.
This also adds a UDL to construct a Utf16View from a string literal:
auto string = u"hello"sv;
This let's us remove the NTTP Utf16View constructor, as we have found
that such constructors bloat binary size quite a bit.
The streams AO that we were calling to close the stream would assert
if it was not in a readable state. This version of close is exported
publicly in the streams specification, and properly handles this
situation.
Fixes a crash in the imported test, and happens to fix some others!
Before this change, we were going through the chain of base classes for
each IDL interface object and having them set the prototype to their
prototype.
Instead of doing that, reorder things so that we set the right prototype
immediately in Foo::initialize(), and then don't bother in all the base
class overrides.
This knocks off a ~1% profile item on Speedometer 3.
The main streams AO file has gotten very large, and is a bit difficult
to navigate. In an effort to improve DX, this migrates ReadableStream
AOs to their own file. And the helper classes used for the tee and pipe-
to operations are also in their own files.
I can't find of a way to trigger this codepath, and there is for sure no
correpsonding WPT test, but let's fix this up for whenever it does
become relevant.
The spec intends to pass through a URL record object as it needs to
be serialized on removal. This has no functional impact on our
implementation other than the double parsing of every URL being
revoked.
It is also missing an error check for an invalid URL being passed
through. This does not impact our implementation currently as we
just end up using an empty URL which is not part of the blob entry
map. This will cause problems once DOMURL::parse is updated to
return an Optional<URL::URL> however.
Which will have proper handling of the exection context when performing
a microtask checkpoint. This fixes an assertion to be added in the next
commit where perform_a_microtask_check is invoked on a non-empty
execution context stack.
C++ will jovially select the implicit conversion operator, even if it's
complete bogus, such as for unknown-size types or non-destructible
types. Therefore, all such conversions (which incur a copy) must
(unfortunately) be explicit so that non-copyable types continue to work.
NOTE: We make an exception for trivially copyable types, since they
are, well, trivially copyable.
Co-authored-by: kleines Filmröllchen <filmroellchen@serenityos.org>