14 commits

Author SHA1 Message Date
Sam Atkins
c5f117f1a2 LibWeb/Fetch: Store Body Blob as Ref not Root
To avoid some churn, Body::source() returns the SourceTypeInternal
object, as it's not exposed to JS. The constructor is also adjusted so
that Body::clone() doesn't have to convert to a SourceType and then
immediately back again.
2026-02-17 07:40:03 -05:00
Andreas Kling
37bdcc3488 LibWeb: Support MIME type sniffing for streaming HTTP responses
Previously, when loading a document, we would try to sniff the MIME
type by reading from the response body's source. However, for streaming
HTTP responses, the body source is Empty (the data comes through the
stream instead), so we had no bytes to sniff.

This caused pages like hypr.land (which sends no Content-Type header)
to be misidentified as plain text instead of HTML, since the MIME
sniffing algorithm would receive zero bytes and fall back to the
default type.

The fix captures the first bytes of the response body during fetch,
storing them on the Body object. These bytes are the "resource header"
defined by the MIME Sniffing spec - up to 1445 bytes, which is enough
to identify any MIME type the spec can detect.

Since bytes may arrive asynchronously during streaming, we use a
callback mechanism: if bytes aren't ready yet when load_document()
needs them, it registers a callback that fires once enough bytes have
been captured (or the stream ends).

The flow is:
1. FetchedDataReceiver receives network bytes, buffers them
2. When Body is created, buffered bytes are flushed to Body's sniff
   buffer, and subsequent bytes are appended as they arrive
3. Before calling load_document(), Navigable waits for sniff bytes
4. load_document() passes the bytes to MimeSniff::Resource::sniff()
2026-01-24 15:21:26 +01:00
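
The capture-and-callback mechanism described in the commit above can be sketched roughly as follows. This is a minimal, standalone illustration, not LibWeb's code: the `SniffBuffer` class and its `append`, `end_of_stream`, and `when_ready` members are hypothetical names, and only the 1445-byte limit comes from the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sniff-byte storage described in the commit;
// this is not the actual Body class.
class SniffBuffer {
public:
    // Resource header length limit cited in the commit message.
    static constexpr std::size_t max_sniff_bytes = 1445;

    using Callback = std::function<void(std::vector<std::uint8_t> const&)>;

    // Append bytes as they arrive, keeping at most max_sniff_bytes.
    void append(std::vector<std::uint8_t> const& chunk)
    {
        if (m_complete)
            return;
        std::size_t room = max_sniff_bytes - m_bytes.size();
        std::size_t count = std::min(room, chunk.size());
        m_bytes.insert(m_bytes.end(), chunk.begin(),
            chunk.begin() + static_cast<std::ptrdiff_t>(count));
        if (m_bytes.size() == max_sniff_bytes)
            finish();
    }

    // Signal that the stream ended, possibly before the limit was reached.
    void end_of_stream()
    {
        if (!m_complete)
            finish();
    }

    // Run the callback immediately if the bytes are ready; otherwise keep it
    // until enough bytes have been captured or the stream ends.
    void when_ready(Callback callback)
    {
        if (m_complete) {
            callback(m_bytes);
            return;
        }
        m_pending = std::move(callback);
    }

private:
    void finish()
    {
        m_complete = true;
        if (m_pending) {
            auto callback = std::move(m_pending);
            m_pending = nullptr;
            callback(m_bytes);
        }
    }

    std::vector<std::uint8_t> m_bytes;
    Callback m_pending;
    bool m_complete { false };
};
```

In this sketch a caller that asks too early simply parks its callback, which mirrors the "wait for sniff bytes" step in the flow above. The sketch keeps a single pending callback for brevity.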
Andreas Kling
03256a2543 LibWeb: Add "parallel queue" and allow it as fetch task destination
Note that it's not actually executing tasks in parallel, it's still
throwing them on the HTML event loop task queue, each with its own
unique task source.

This makes our fetch implementation a lot more robust when HTTP caching
is enabled, and you can now click links on https://terminal.shop/
without hitting TODO assertions in fetch.
2025-07-17 00:13:39 +02:00
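
As a rough illustration of the commit above, a "parallel queue" that does not run anything in parallel might look like the sketch below: it forwards each task to a single event loop queue, tagged with a task source unique to that parallel queue. The `EventLoop` and `ParallelQueue` types here are hypothetical simplifications, not the LibWeb classes.

```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical, simplified event loop: one FIFO of tasks, each tagged with
// the id of the task source that queued it.
struct EventLoop {
    struct Task {
        std::uint64_t source;
        std::function<void()> steps;
    };

    std::deque<Task> queue;

    // Run queued tasks in order on the calling thread.
    void run_all()
    {
        while (!queue.empty()) {
            Task task = std::move(queue.front());
            queue.pop_front();
            task.steps();
        }
    }
};

// Hypothetical parallel queue: it owns a unique task source id and simply
// pushes its tasks onto the shared event loop queue.
class ParallelQueue {
public:
    explicit ParallelQueue(EventLoop& loop)
        : m_loop(loop)
        , m_source(s_next_source++)
    {
    }

    void enqueue(std::function<void()> steps)
    {
        m_loop.queue.push_back({ m_source, std::move(steps) });
    }

    std::uint64_t source() const { return m_source; }

private:
    static inline std::uint64_t s_next_source = 1;

    EventLoop& m_loop;
    std::uint64_t m_source;
};
```

Two `ParallelQueue` instances created on the same `EventLoop` get distinct source ids, and their tasks still run one after another in the order they were queued.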
Timothy Flynn
0cd5e99066 LibWeb: Use the correct target realm to tee a stream
We currently store Web::Fetch::Infrastructure::Response objects in the
HTTP cache. They are associated with their original realm, but when we
use a cached response, we clone it into the target realm. For example,
two <iframe> objects loading the same HTML will be in different realms.

When we clone the response, we must use the target realm throughout the
entire cloning process. We neglected to do this for the cloned response
body stream, which is cloned via teeing. The result was that the stream
for the "cloned" response was created in the original realm, causing
issues down the line when reading from that stream tried to handle read
promises on behalf of the original realm. There are protections in place
to prevent this from happening, so the cached response read would never
complete.
2025-04-30 09:30:15 -04:00
Timothy Flynn
a9ddd427cb LibWeb: Move ReadableStream AOs into their own file
The main streams AO file has gotten very large, and is a bit difficult
to navigate. In an effort to improve DX, this migrates ReadableStream
AOs to their own file. The helper classes used for the tee and pipe-to
operations are also in their own files.
2025-04-18 06:55:40 -04:00
Andreas Kling
de424d6879 LibJS: Make Completion.[[Value]] non-optional
Instead, just use js_undefined() whenever the [[Value]] field is unused.
This avoids a whole bunch of presence checks.
2025-04-05 11:20:26 +02:00
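
The difference the commit above describes can be sketched with a toy value type. Everything here except the name `js_undefined()` is a hypothetical simplification for illustration; the real `Value` and `Completion` types in LibJS are much richer.

```cpp
#include <optional>

// Hypothetical miniature value type, used only for this illustration.
struct Value {
    enum class Type {
        Undefined,
        Number,
    };

    Type type { Type::Undefined };
    double number { 0 };

    bool is_undefined() const { return type == Type::Undefined; }
};

inline Value js_undefined() { return Value {}; }

// Before (sketch): the value may be absent, so every reader has to check
// has_value() before using it.
struct OptionalCompletion {
    std::optional<Value> value;
};

// After (sketch): the value is always present, and "unused" is spelled
// js_undefined().
struct Completion {
    Value value { js_undefined() };
};
```

With the second shape, code that reads the value no longer needs a presence check; an unused value is just the undefined value.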
Shannon Booth
9ce0c5914b LibWeb: Add a 'get a reader' helper method on ReadableStream
2024-12-25 12:00:54 +01:00
Jelle Raaijmakers
1514197e36 LibWeb: Remove dom_ from dom_exception_to_throw_completion
We're not converting `WebIDL::DOMException`, but `WebIDL::Exception`
instead.
2024-12-09 20:02:51 -07:00
Jelle Raaijmakers
17d5dfe597 LibWeb: Implement Web::Fetch::Body::fully_read() closer to spec
By actually using streams, they get marked as disturbed and the
`.bodyUsed` API starts to work. Fixes at least 94 subtests in the WPT
`fetch/api/request` test suite.

Co-authored-by: Timothy Flynn <trflynn89@pm.me>
2024-12-09 20:02:51 -07:00
Timothy Flynn
953fe75271 LibWeb: Remove exception handling from safely extracting response bodies
The entire purpose of this AO is to avoid handling exceptions, which we
can do now that the underlying AOs do not throw exceptions on OOM.
2024-12-09 20:02:51 -07:00
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00
Shannon Booth
1e54003cb1 LibJS+LibWeb: Rename Heap::allocate_without_realm to Heap::allocate
Now that the heap has no knowledge about a JavaScript realm and is
purely for managing the memory of the heap, it does not make sense
to name this function to say that it is a non-realm variant.
2024-11-13 16:51:44 -05:00
Shannon Booth
9b79a686eb LibJS+LibWeb: Use realm.create<T> instead of heap.allocate<T>
The main motivation behind this is to remove JS specifics of the Realm
from the implementation of the Heap.

As a side effect of this change, this is a bit nicer to read than the
previous approach, and in my opinion, also makes it a little more clear
that this method is specific to a JavaScript Realm.
2024-11-13 16:51:44 -05:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level
2024-11-10 12:50:45 +01:00
Renamed from Userland/Libraries/LibWeb/Fetch/Infrastructure/HTTP/Bodies.cpp