Commit graph

67 commits

Author SHA1 Message Date
Timothy Flynn
9375660b64 LibHTTP+LibWeb+RequestServer: Move Fetch's HTTP header infra to LibHTTP
The end goal here is for LibHTTP to be the home of our RFC 9111 (HTTP
caching) implementation. We currently have one implementation in LibWeb
for our in-memory cache and another in RequestServer for our disk cache.

The implementations both largely revolve around interacting with HTTP
headers. But in LibWeb, we are using Fetch's header infra, and in RS we
are using are home-grown header infra from LibHTTP.

So to give these a common denominator, this patch replaces the LibHTTP
implementation with Fetch's infra. Our existing LibHTTP implementation
was not particularly compliant with any spec, so this at least gives us
a standards-based common implementation.

This migration also required moving a handful of other Fetch AOs over
to LibHTTP. (It turns out these AOs were all from the Fetch/Infra/HTTP
folder, so perhaps it makes sense for LibHTTP to be the implementation
of that entire set of facilities.)
2025-11-27 14:57:29 +01:00
Timothy Flynn
3dce6766a3 LibWeb: Extract some CORS and MIME Fetch helpers to their own files
An upcoming commit will migrate the contents of Headers.h/cpp to LibHTTP
for use outside of LibWeb. These CORS and MIME helpers depend on other
LibWeb facilities, however, so they cannot be moved.
2025-11-27 14:57:29 +01:00
Timothy Flynn
0fd80a8f99 LibTextCodec+LibWeb: Move isomorphic coders to LibTextCodec
This will be used outside of LibWeb.
2025-11-27 14:57:29 +01:00
Timothy Flynn
f675cfe90f LibWeb: Store HTTP methods and headers as ByteString
The spec declares these as a byte sequence, which we then implemented as
a ByteBuffer. This has become pretty awkward to deal with, as evidenced
by the plethora of `MUST(ByteBuffer::copy(...))` and `.bytes()` calls
everywhere inside Fetch. We would then treat the bytes as a string
anyways by wrapping them in StringView everywhere.

We now store these as a ByteString. This is more comfortable to deal
with, and we no longer need to continually copy underlying storage (as
ByteString is ref-counted).

This work is largely preparatory for an upcoming HTTP header refactor.
2025-11-26 09:15:06 -05:00
Timothy Flynn
ed27eea091 LibWeb: Do not copy the result of HeaderList::extract_header_list_values
There's no need to copy the Vector out of this result every time we call
it. We can move it out or access it directly.
2025-11-26 09:15:06 -05:00
Timothy Flynn
d70224ad2e LibWeb: Organize Fetch Headers.h/Headers.cpp a bit
Generally just define things in the order they are declared (will make a
change to use ByteString in this file a bit easier to follow). Also make
a couple of free functions be class methods on Header / HeaderList.
2025-11-26 09:15:06 -05:00
Aliaksandr Kalenik
69cede4a0f AK+LibWeb: Make StringBase::bytes() lvalue-only
Disallow calling `StringBase::bytes()` on temporaries to avoid returning
`ReadonlyBytes` that outlive the underlying string.

With this change, we catch a real UAF:
`load_result.data = maybe_response.release_value().bytes();`
All other updated call sites were already safe, they just needed to use
an intermediate named variable to satisfy the new lvalue-only
requirement.
2025-11-25 13:02:20 -05:00
Aliaksandr Kalenik
16b0f1e6c2 LibWeb: Delete unused ResourceLoader::load()
...and rename `load_unbuffered()` to `load()`.
2025-11-20 06:29:13 -05:00
Aliaksandr Kalenik
3058274386 LibWeb: Use unbuffered network requests for all Fetch requests
Previously, unbuffered requests were only available as a special mode
for EventSource. With this change, they are enabled by default, which
means chunks can be read from the stream as soon as they arrive.

This unlocks some interesting possibilities, such as starting to parse
HTML documents before the entire response has been received (that, in
turn, allows us to initiate subresource fetches earlier or begin
executing scripts sooner), or start rendering videos before they are
fully downloaded.

Co-authored-by: Timothy Flynn <trflynn89@pm.me>
2025-11-20 06:29:13 -05:00
Timothy Flynn
813986237e LibWeb: Add some tests that exercise the HTTP disk cache
Our HTTP disk cache is currently manually tested against various sites.
This patch adds some tests to cover various scenarios, including non-
cacheable responses, expired responses, and revalidation.

In order to ensure we hit the disk cache in RequestServer, we must
disable the in-memory cache in WebContent.
2025-11-20 09:33:49 +01:00
Psychpsyo
100f37995f Everywhere: Clean up AD-HOC and FIXME comments without colons 2025-11-13 15:56:04 +01:00
Luke Wilde
167de08c81 LibWeb: Remove exception throwing from Fetch
These were only here to manage OOMs, but there's not really any way to
recover from small OOMs in Fetch especially with its async nature.
2025-11-07 04:08:30 +01:00
Timothy Flynn
e0a8eb3767 LibWeb+WebContent: Hook Fetch's HTTP cache into the clear-cache action
And fix a typo in an invocation to clear the cache.
2025-11-05 18:27:36 +01:00
Timothy Flynn
5b40398c39 LibWeb: Invoke process_response_consume_body with null in error cases
We were previously invoking it with an empty ByteBuffer, which will be
interpreted as a successful load by HTMLLinkElement in a future commit.
2025-11-05 18:27:36 +01:00
Luke Wilde
eeb5446c1b LibWeb: Avoid including Navigable.h in headers
This greatly reduces how much is recompiled when changing Navigable.h,
from >1000 to 82.
2025-10-20 10:16:55 +01:00
Julian Dominguez-Schatz
4e3387778e LibWeb: Respect IncludeCredentials for Set-Cookie during fetch
Per https://fetch.spec.whatwg.org/#http-network-fetch, Set-Cookie should
only store a cookie if IncludeCredentials::Yes is set. Fixes 1 web
platform test.
2025-09-24 10:12:56 +01:00
Timothy Flynn
2df4835025 LibWeb: Place HTTP cache logging behind a debug flag
It's quite verbose to have logging on by default here.
2025-09-19 13:52:07 +02:00
Timothy Flynn
b4df857a57 LibWeb+LibWebView+WebContent: Replace DNT with GPC
Global Privacy Control aims to be a replacement for Do Not Track. DNT
ended up not being a great solution, as it wasn't enforced by law. This
actually resulted in the DNT header serving as an extra fingerprinting
data point.

GPC is becoming enforced by law in USA states such as California and
Colorado. CA is further working on a bill which requires that browsers
implement such an opt-out preference signal (OOPS):

https://cppa.ca.gov/announcements/2025/20250911.html

This patch replaces DNT with GPC and hooks up the associated settings.
2025-09-16 10:38:20 +02:00
Luke Wilde
05438e70f1 LibWeb: Receive cookies through principal_host_defined_page
Previously we depended on an associated document on the ESO to get to
the page, but Workers do not have documents. However, we can simply get
to the page with `principal_host_defined_page`, removing the issue.
2025-09-09 15:28:38 +02:00
ayeteadoe
0a699132f3 WebContent: Enable in Windows CI 2025-08-23 16:04:36 -06:00
Tete17
658477620a LibWeb/LibURL/LibIPC: Extend createObjectURL to also accept MediaSources
This required some changes in LibURL & LibIPC since it has its own
definition of an BlobURLEntry. For now, we don't have a concrete usage
of MediaSource in LibURL so it is defined as an empty struct.

This removes one FIXME in an idl file.
2025-08-19 23:50:38 +02:00
Kenneth Myhra
1228063a85 LibWeb: Enforce Integrity Policy on Fetch requests 2025-08-14 13:37:38 +01:00
Kenneth Myhra
0dc2fb3781 LibWeb: Update Fetch's compute the redirect-taint concept 2025-08-12 07:08:33 -04:00
Kenneth Myhra
1b350596fb LibWeb: Align Fetching chapter's "To fetch" with latest spec changes 2025-08-08 11:12:53 +01:00
Kenneth Myhra
593ee1ae0a LibWeb: Implement AO populate request from client 2025-08-08 11:12:53 +01:00
Kenneth Myhra
70cafc558e LibWeb: Replace request's "window" with "traversable for user prompts"
User prompts are not tied to specific Windows or the client's Window.
They are tied to a traversable navigable (browser tab).
2025-08-08 11:12:53 +01:00
Kenneth Myhra
681e4e5d01 LibWeb: Define stream variable before using it
This is an editoral change from the fetch spec. Since we already defined
the stream before it being used this only re-numbers the spec steps.

It also corrects a minor typo ('followings' to 'following') which was
corrected in the same editoral spec change.
2025-08-08 11:12:53 +01:00
Andreas Kling
66a19b8550 LibWeb: Make ESO "fetch group" weakly reference the fetch records
Otherwise we end up holding on to every fetch record indefinitely.

Found by analyzing GC heap graphs on Discord.
2025-07-29 20:00:17 -04:00
Andreas Kling
03256a2543 LibWeb: Add "parallel queue" and allow it as fetch task destination
Note that it's not actually executing tasks in parallel, it's still
throwing them on the HTML event loop task queue, each with its own
unique task source.

This makes our fetch implementation a lot more robust when HTTP caching
is enabled, and you can now click links on https://terminal.shop/
without hitting TODO assertions in fetch.
2025-07-17 00:13:39 +02:00
Shannon Booth
8a3c66d8a6 LibWeb: Make a bunch of CSP classes not realm associated
These are not associated with a javascript realm, so to avoid
confusion about which realm these need to be created in, make
all of these objects a GC::Cell, and deal with the fallout.
2025-04-28 12:41:28 +02:00
Timothy Flynn
6539c72e7e LibWeb: Allow CORS requests from opaque origins to resource:// URLs
JavaScript module requests (in a non-worker context) always have CORS
enabled. However, CORS requests are only allowed for same-origin or
HTTP/S requests. This patch extends this to allow resource:// requests
from opaque origins (e.g. about: URLs).

We must also set the Access-Control-Allow-Origin header to "null" to
ensure that the response is accepted by the CORS checks. This does not
affect requesting resource:// URLs from resource:// URLs as those are
same-origin and skip CORS checks.

This ultimately enables requesting resource:// JS modules from the
about:settings page.
2025-04-23 19:58:58 -04:00
Timothy Flynn
5f9b1d3cd4 LibWeb: Make the main fetch response callback a bit easier to read
Each `if` branch in this callback returns a value, so let's add a bit of
whitespace between them to be easier on the eyes.
2025-04-23 19:58:58 -04:00
Timothy Flynn
2b7b7d4d23 Revert "LibWeb: Mark body stream with a TypeError if the request failed"
This reverts commit 4d0301d2d2.

This caused /html/dom/reflection-embedded.html to massively regress.
2025-04-21 09:06:15 +02:00
Timothy Flynn
4d0301d2d2 LibWeb: Mark fetched body streams with a TypeError if the request failed
This will cause an exception to be thrown if user attempts to read from
the response stream of a failed request.

This is unfortunately not testable in CI. It requires a network response
(i.e. not a file:// URL). We also cannot import relevant WPT tests; they
exercise this condition with a python-generated response.
2025-04-20 16:50:37 +02:00
Timothy Flynn
3e8c6dbaff LibWeb: Move TransformStream AOs into their own file
The main streams AO file has gotten very large, and is a bit difficult
to navigate. In an effort to improve DX, this migrates TransformStream
AOs to their own file.
2025-04-18 06:55:40 -04:00
Timothy Flynn
a9ddd427cb LibWeb: Move ReadableStream AOs into their own file
The main streams AO file has gotten very large, and is a bit difficult
to navigate. In an effort to improve DX, this migrates ReadableStream
AOs to their own file. And the helper classes used for the tee and pipe-
to operations are also in their own files.
2025-04-18 06:55:40 -04:00
Timothy Flynn
f070264800 Everywhere: Remove sv suffix from format string literals
This prevents the compile-time checks that would catch errors in the
format invocation (which would usually lead to a runtime crash).
2025-04-08 20:00:18 -04:00
Shannon Booth
a5df972055 LibWeb: Do not store network errors as a StringView
This is very clearly a very dangerous API to have, and was causing
a crash on Linux as a result of a stack use-after-free when visiting
https://www.index.hr/.

Fixes #3901
2025-04-02 11:43:53 +02:00
Luke Wilde
7643a079c0 LibWeb: Enforce Content Security Policy of Fetch responses 2025-03-19 00:55:14 +01:00
Luke Wilde
51796e2d3a LibWeb: Report CSP violations for request 2025-03-19 00:55:14 +01:00
Luke Wilde
6f771f45e2 LibWeb: Enforce Content Security Policy on Fetch requests 2025-03-19 00:55:14 +01:00
Luke Wilde
6d1f78198d LibWeb: Implement Resource Timing 2025-03-06 09:00:53 -07:00
Luke Wilde
23c84e62a5 LibWeb/Fetch: Update timing info with the timings received from RS 2025-03-06 09:00:53 -07:00
Luke Wilde
618697ef13 LibWeb: Make reference to global in report timing steps non-const
Marking a resource timing entry requires calling non-const methods on
the global object to append to the performance buffer.
2025-03-06 09:00:53 -07:00
Luke Wilde
209b10e53e RequestServer: Retrieve timing info from curl and pipe it to LibWeb
This timing info will be used to create a PerformanceResourceTiming
entry.
2025-03-06 09:00:53 -07:00
Luke Wilde
cae0ab2139 LibWeb: Make PolicyContainer GC allocated
This is required to store Content Security Policies, as their
Directives are implemented as subclasses with overridden virtual
functions. Thus, they cannot be stored as generic Directive classes, as
it'll lose the ability to call overridden functions when they are
copied.
2025-02-21 12:43:48 +00:00
Luke Wilde
b35979c3f7 LibWeb: Set Sec-Fetch-Site header to same-site where appropriate
This also fixes it looking at the request's current URL origin instead
of the request's actual origin.
2025-01-30 19:32:57 +01:00
Shannon Booth
00cef330ef LibWeb: Partition Blob URL fetches by Storage Key
This was a security mechanism introduced in the fetch spec, with
supporting AOs added to the FileAPI spec.
2025-01-21 19:22:07 +00:00
Shannon Booth
ca3d9d9ee0 LibURL+LibWeb+LibIPC: Represent blob URL entry's object using structs
Instead of just putting in members directly, wrap them up in structs
which represent what a URL blob entry is meant to hold per the spec.
This makes more obvious what this is meant to represent, such as the
ByteBuffer being used to represent the bytes behind a Blob.

This also allows us to use a stronger type for a function that needs
to return a Blob URL entry's object.
2025-01-21 19:22:07 +00:00
Shannon Booth
ffda698d3a LibWeb/Streams: Actually implement the piped through steps
This mistakenly implemented the 'piped to' operation on ReadableStream.
No functional difference as the caller was doing the extra work already
of 'piped through' vs 'piped to'.
2024-12-27 06:56:38 -08:00