49 commits

Author SHA1 Message Date
Timothy Flynn
674075f79e Everywhere: Remove LibCore/System.h includes from header files
This reduces the number of compilation jobs when System.h changes from
about 750 to 60. (There are still a large number of linker jobs.)
2025-12-04 15:40:46 +00:00
Timothy Flynn
9375660b64 LibHTTP+LibWeb+RequestServer: Move Fetch's HTTP header infra to LibHTTP
The end goal here is for LibHTTP to be the home of our RFC 9111 (HTTP
caching) implementation. We currently have one implementation in LibWeb
for our in-memory cache and another in RequestServer for our disk cache.

The implementations both largely revolve around interacting with HTTP
headers. But in LibWeb, we are using Fetch's header infra, and in RS we
are using our home-grown header infra from LibHTTP.

So to give these a common denominator, this patch replaces the LibHTTP
implementation with Fetch's infra. Our existing LibHTTP implementation
was not particularly compliant with any spec, so this at least gives us
a standards-based common implementation.

This migration also required moving a handful of other Fetch AOs over
to LibHTTP. (It turns out these AOs were all from the Fetch/Infra/HTTP
folder, so perhaps it makes sense for LibHTTP to be the implementation
of that entire set of facilities.)
2025-11-27 14:57:29 +01:00
Timothy Flynn
0480934afb LibHTTP+LibWeb: Remove unused HTTP::HTTPResponse
The only thing in HTTPResponse being used is reason_phrase_for_code,
which is just a static helper method. Move it to its own file and remove
HTTPResponse.

This is just one less thing to have to port to an upcoming HTTP header
refactor.
2025-11-27 14:57:29 +01:00
Aliaksandr Kalenik
69cede4a0f AK+LibWeb: Make StringBase::bytes() lvalue-only
Disallow calling `StringBase::bytes()` on temporaries to avoid returning
`ReadonlyBytes` that outlive the underlying string.

With this change, we catch a real UAF:
`load_result.data = maybe_response.release_value().bytes();`
All other updated call sites were already safe; they just needed to use
an intermediate named variable to satisfy the new lvalue-only
requirement.
2025-11-25 13:02:20 -05:00
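The lvalue-only restriction described in 69cede4a0f can be illustrated with C++ ref-qualified member functions. This is a minimal sketch, not AK's actual code: `DemoString`, `CanCallBytes`, and `safe_length` are hypothetical names, and `std::string_view` stands in for `ReadonlyBytes`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a string type; this is not AK's StringBase.
class DemoString {
public:
    explicit DemoString(std::string value)
        : m_value(std::move(value))
    {
    }

    // Callable on lvalues only: the owner is a named object that outlives the view.
    std::string_view bytes() const& { return m_value; }

    // Calling bytes() on a temporary is rejected at compile time.
    std::string_view bytes() const&& = delete;

private:
    std::string m_value;
};

// Compile-time probe: can bytes() be called on an expression of type T?
template<typename T, typename = void>
struct CanCallBytes : std::false_type { };

template<typename T>
struct CanCallBytes<T, std::void_t<decltype(std::declval<T>().bytes())>> : std::true_type { };

static_assert(CanCallBytes<DemoString&>::value, "lvalues may call bytes()");
static_assert(CanCallBytes<DemoString const&>::value, "const lvalues may call bytes()");
static_assert(!CanCallBytes<DemoString>::value, "temporaries may not call bytes()");

// The safe pattern: keep the owner in a named variable, then take the view.
inline std::size_t safe_length(DemoString owner)
{
    auto view = owner.bytes(); // `owner` outlives `view` within this function
    return view.size();
}
```

With the deleted overload in place, an expression of the shape `make_string().bytes()` no longer compiles, which is how a dangling view like the one quoted in the commit message would be caught.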
Aliaksandr Kalenik
16b0f1e6c2 LibWeb: Delete unused ResourceLoader::load()
...and rename `load_unbuffered()` to `load()`.
2025-11-20 06:29:13 -05:00
Aliaksandr Kalenik
3058274386 LibWeb: Use unbuffered network requests for all Fetch requests
Previously, unbuffered requests were only available as a special mode
for EventSource. With this change, they are enabled by default, which
means chunks can be read from the stream as soon as they arrive.

This unlocks some interesting possibilities, such as starting to parse
HTML documents before the entire response has been received (that, in
turn, allows us to initiate subresource fetches earlier or begin
executing scripts sooner), or start rendering videos before they are
fully downloaded.

Co-authored-by: Timothy Flynn <trflynn89@pm.me>
2025-11-20 06:29:13 -05:00
Aliaksandr Kalenik
cce5197f90 LibWeb: Implement :resource support for load_unbuffered()
Preparation work before enabling unbuffered fetching by default.
2025-11-20 06:29:13 -05:00
Aliaksandr Kalenik
1f4b53d9fc LibWeb: Implement :about protocol support for load_unbuffered()
Preparation work before enabling unbuffered fetching by default.
2025-11-20 06:29:13 -05:00
Aliaksandr Kalenik
2fd7b70784 LibWeb: Implement file: protocol support for load_unbuffered()
Preparation work before enabling unbuffered fetching by default.
2025-11-20 06:29:13 -05:00
Timothy Flynn
ac246caa0c LibWeb: Remove now-unused Resource and ResourceClient
And deal with the fallout of transitive includes.
2025-11-05 18:27:36 +01:00
Julian Dominguez-Schatz
4e3387778e LibWeb: Respect IncludeCredentials for Set-Cookie during fetch
Per https://fetch.spec.whatwg.org/#http-network-fetch, Set-Cookie should
only store a cookie if IncludeCredentials::Yes is set. Fixes 1 web
platform test.
2025-09-24 10:12:56 +01:00
Timothy Flynn
b1b218596f LibWeb+WebContent: Add IPC to re-establish RequestServer connections
2025-08-10 11:02:50 +02:00
Timothy Flynn
3171d57639 LibWeb: Restore flags to prevent formatting timestamps as local time
The flag to stringify these timestamps as UTC was errantly dropped in
6fb2be96bf. This was causing test-web to
fail in time zones other than GMT+0.
2025-06-25 23:41:04 +02:00
Tomasz Strejczek
6fb2be96bf Everywhere: Replace DateTime::to_string() with UnixDateTime::to_string()
Replace LibCore::DateTime::to_string() with
AK::UnixDateTime::to_string().
Remove unnecessary #include <LibCore/DateTime.h>.
2025-06-19 18:42:45 -06:00
Callum Law
6c3ceb9284 LibWeb: Don't crash when handling invalid HTTP status codes
Example crash: https://wpt.live/fetch/h1-parsing/status-code.window.html

There is still work to make the above tests pass.
2025-05-27 12:58:08 -06:00
Timothy Flynn
6539c72e7e LibWeb: Allow CORS requests from opaque origins to resource:// URLs
JavaScript module requests (in a non-worker context) always have CORS
enabled. However, CORS requests are only allowed for same-origin or
HTTP/S requests. This patch extends this to allow resource:// requests
from opaque origins (e.g. about: URLs).

We must also set the Access-Control-Allow-Origin header to "null" to
ensure that the response is accepted by the CORS checks. This does not
affect requesting resource:// URLs from resource:// URLs as those are
same-origin and skip CORS checks.

This ultimately enables requesting resource:// JS modules from the
about:settings page.
2025-04-23 19:58:58 -04:00
Shannon Booth
733dfdaa05 LibWeb: Avoid URL validity check for 'Resource'
An invalid Resource was previously signalled by a default-constructed
invalid URL. Instead, switch this over to an Optional URL.
2025-04-19 07:18:43 -04:00
stasoid
32ddeb82d6 LibURL+LibWeb: Remove leading slash when converting url to path
...on Windows
2025-04-10 19:04:21 -06:00
Timothy Flynn
0de017df9b LibRequests: Move NetworkError stringification to LibRequests
Let's also rename the file to NetworkError.h while we're here. No need
to have "Enum" in the name.
2025-04-02 08:52:45 -04:00
Gingeh
5838c73a72 LibWeb: Restrict weird about:foo URIs
This commit:
- Prevents path traversal via the about: scheme
- Prevents loading about:inspector
- Requires about: URIs to be opaque paths
- Prevents crashes with invalid percent encoded paths
2025-03-12 10:41:06 +00:00
Luke Wilde
209b10e53e RequestServer: Retrieve timing info from curl and pipe it to LibWeb
This timing info will be used to create a PerformanceResourceTiming
entry.
2025-03-06 09:00:53 -07:00
Shannon Booth
097f7fe169 LibWeb/Loader: Explicitly parse URL generating directory response
Removing one more caller of the implicit URL constructors.
2025-03-04 16:24:19 -05:00
Andrew Kaster
47716a4e11 LibWeb: Explicitly capture resource in closures in load_resource
A hygiene patch to not use a generic = capture, and remove some
unnecessary const-casts.
2025-02-18 11:26:34 -07:00
Andrew Kaster
8760825bb4 LibWeb: Don't store Page on ResourceLoader
We only need a Page for file:// urls. At some point we probably
needed it for other kinds of requests, but the current functionality
doesn't need to store the Page pointer on the ResourceLoader.
2025-02-18 11:26:34 -07:00
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00
Andreas Kling
b3b97d2049 LibWeb: Unregister network requests *after* invoking callbacks
This ensures that the network request actually gets unreffed and deleted
at the right time.
2024-11-11 21:40:56 +01:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level
2024-11-10 12:50:45 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/
2021-01-12 12:17:46 +01:00
Andreas Kling
a6d52e0c97 LibWeb: Add a basic content filter (ad blocking!) :^)
This patch adds a global (per-process) filter list to LibWeb that is
used to filter all outgoing resource load requests.

Basically we check the URL against a list of filter patterns and if
it's a match for any one of them, we immediately fail the load.

The filter list is a simple text file:

    ~/.config/BrowserContentFilters.txt

It's one filter per line and they are simple glob filters for now,
with implicit asterisks (*) at the start and end of the line.
2021-01-05 21:20:15 +01:00
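The filter matching described in a6d52e0c97 (one glob per line, with implicit asterisks at both ends) can be sketched as follows. This is an illustration under stated assumptions rather than LibWeb's implementation: `glob_matches` and `DemoContentFilter` are hypothetical names, and only `*` is treated as a wildcard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Minimal glob matcher: '*' matches any run of characters (including none);
// every other character must match literally.
inline bool glob_matches(std::string_view pattern, std::string_view text)
{
    std::size_t p = 0;
    std::size_t t = 0;
    std::size_t star = std::string_view::npos; // index of the last '*' seen in the pattern
    std::size_t retry = 0;                     // text index to resume from after backtracking

    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;
            retry = t;
        } else if (p < pattern.size() && pattern[p] == text[t]) {
            ++p;
            ++t;
        } else if (star != std::string_view::npos) {
            // Backtrack: let the last '*' absorb one more character.
            p = star + 1;
            t = ++retry;
        } else {
            return false;
        }
    }
    // Only trailing asterisks may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*')
        ++p;
    return p == pattern.size();
}

// Hypothetical filter list; each line is wrapped in implicit asterisks.
class DemoContentFilter {
public:
    void add_pattern(std::string_view line)
    {
        if (line.empty())
            return; // an empty line would otherwise match every URL
        m_patterns.push_back("*" + std::string(line) + "*");
    }

    // Returns true if the URL matches any pattern and the load should fail.
    bool is_filtered(std::string_view url) const
    {
        for (auto const& pattern : m_patterns) {
            if (glob_matches(pattern, url))
                return true;
        }
        return false;
    }

private:
    std::vector<std::string> m_patterns;
};
```

Because of the implicit asterisks, a line without any wildcard behaves like a substring test against the full URL.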
Andreas Kling
5e157eaf37 LibWeb: Convert a bunch of String::format() => String::formatted()
2021-01-03 14:35:09 +01:00
Tom
a4b3eb6b2d LibWeb: Clear circular download reference when download finished
2020-12-31 22:15:00 +01:00
AnotherTest
83fed3fd5d LibWeb: Don't hold on to the Download instance after it's finished
Fixes #4668
2020-12-31 16:57:09 +01:00
AnotherTest
4a2da10e38 ProtocolServer: Stream the downloaded data if possible
This patchset makes ProtocolServer stream the downloads to its client
(LibProtocol), and as such changes the download API; a possible
download lifecycle could be as such:
notation = client->server:'>', server->client:'<', pipe activity:'*'
```
> StartDownload(GET, url, headers, {})
< Response(0, fd 8)
* {data, 1024b}
< HeadersBecameAvailable(0, response_headers, 200)
< DownloadProgress(0, 4K, 1024)
* {data, 1024b}
* {data, 1024b}
< DownloadProgress(0, 4K, 2048)
* {data, 1024b}
< DownloadProgress(0, 4K, 1024)
< DownloadFinished(0, true, 4K)
```

Since managing the received file descriptor is a pain, LibProtocol
implements `Download::stream_into(OutputStream)`, which can be used to
stream the download into any given output stream (be it a file, or
memory, or writing stuff with a delay, etc.).
Also, as some of the users of this API require all the downloaded data
upfront, LibProtocol also implements `set_should_buffer_all_input()`,
which causes the download instance to buffer all the data until the
download is complete, and to call the `on_buffered_download_finish`
hook.
2020-12-30 13:31:55 +01:00
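The buffering mode described in 4a2da10e38 can be sketched as a client object that either forwards chunks as they arrive or accumulates them until the download finishes. This is a rough model, not LibProtocol's API: `DemoDownload`, `did_receive_chunk`, `did_finish`, and `on_chunk` are hypothetical; only `set_should_buffer_all_input()` and `on_buffered_download_finish` are names taken from the commit message, and their signatures here are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>

// Hypothetical client-side download object; not LibProtocol's actual class.
class DemoDownload {
public:
    // Invoked once with (success, full payload) when buffering is enabled.
    std::function<void(bool, std::string const&)> on_buffered_download_finish;
    // Invoked per chunk when buffering is disabled (hypothetical hook).
    std::function<void(std::string_view)> on_chunk;

    void set_should_buffer_all_input(bool value) { m_buffer_all = value; }

    // Models pipe activity: "* {data, ...}".
    void did_receive_chunk(std::string_view chunk)
    {
        m_received += chunk.size();
        if (m_buffer_all)
            m_buffer.append(chunk);
        else if (on_chunk)
            on_chunk(chunk);
    }

    // Models "< DownloadFinished(id, success, total_size)".
    void did_finish(bool success, std::size_t total_size)
    {
        // Treat a size mismatch as a failed download.
        bool complete = success && m_received == total_size;
        if (m_buffer_all && on_buffered_download_finish)
            on_buffered_download_finish(complete, m_buffer);
    }

private:
    bool m_buffer_all = false;
    std::size_t m_received = 0;
    std::string m_buffer;
};
```

Callers that need the whole body up front enable buffering and wait for the finish hook; streaming callers leave it off and consume each chunk as it arrives.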
Andreas Kling
497f1fd472 LibWeb: Don't use ByteBuffer::wrap() when loading about: URLs
Let's just copy an empty string here to make ourselves a ByteBuffer.
2020-12-19 18:29:13 +01:00
Andreas Kling
685d5f4e25 LibProtocol: Remove use of ByteBuffer::wrap() in protocol API
2020-12-19 13:09:02 +01:00
Luke
62a74bf282 LibWeb: Advertise to servers that we support gzip encoding
We've had gzip support for a while now, but it never really got
used because we never advertised it.
2020-11-11 12:15:18 +01:00
Andreas Kling
2946a684ef ProtocolServer+LibWeb: Support more detailed HTTP requests
This patch adds the ability for ProtocolServer clients to specify which
HTTP method to use, and also to include an optional HTTP request body.
2020-09-28 11:55:26 +02:00
Andreas Kling
e2f32b8f9d LibCore: Make Core::Object properties more dynamic
Instead of everyone overriding save_to() and set_property() and doing
a pretty asymmetric job of implementing the various properties, let's
add a bit of structure here.

Object properties are now represented by a Core::Property. Properties
are registered with a getter and setter (optional) in constructors.
I've added some convenience macros for creating and registering
properties, but this does still feel a bit bulky. We'll have to
iterate on this and see where it goes.
2020-09-15 21:46:26 +02:00
asynts
b3d1a05261 Refactor: Expose const_cast by removing ByteBuffer::wrap(const void*, size_t)
This function did a const_cast internally which made the call site look
"safe". This method is removed completely and call sites are replaced
with ByteBuffer::wrap(const_cast<void*>(data), size) which makes the
behaviour obvious.
2020-08-06 10:33:16 +02:00
AnotherTest
97256ad977 ProtocolServer+LibTLS: Pipe certificate requests from LibTLS to clients
This makes gemini.circumlunar.space (and some more gemini pages) work
again :^)
2020-08-02 18:57:51 +02:00
Andreas Kling
ee4cf0bc69 LibWeb: Let's not pass "%u" to String() and expect something to happen
2020-06-26 00:53:25 +02:00
Andreas Kling
4f7c7bbb09 LibWeb: Treat all HTTP 4xx codes as errors
2020-06-25 17:19:29 +02:00
Andreas Kling
1678aaa555 ProtocolServer+LibProtocol: Propagate HTTP status codes to clients
Clients now receive HTTP status codes like 200, 404, etc.
Note that a 404 with content is still considered a "successful"
download from ProtocolServer's perspective. It's up to the client
to interpret the status code.

I'm not sure if this is the best API, but it'll work for now.
2020-06-13 22:20:37 +02:00
Andreas Kling
e46ee46ed6 LibWeb: Silence debug spam about reuse of cached resources
2020-06-13 15:27:53 +02:00
Andreas Kling
f2aa21ebc4 LibWeb: Assert that we don't reuse cached resources with wrong type
2020-06-05 23:35:08 +02:00
Andreas Kling
d4ddb0013c LibWeb: Share decoded images at the Resource level :^)
This patch adds ImageResource as a subclass of Resource. This new class
also keeps a Gfx::ImageDecoder so that we can share decoded bitmaps
between all clients of an image resource inside LibWeb.

With this, we now share both encoded and decoded data for images. :^)

I had to change how the purgeable-volatile flag is updated to keep the
volatile-images-outside-the-visible-viewport optimization working.
HTMLImageElement now inherits from ImageResourceClient (a subclass of
ResourceClient with additional image-specific stuff) and informs its
ImageResource about whether it's inside the viewport or outside.

This is pretty awesome! :^)
2020-06-02 20:32:38 +02:00
Andreas Kling
7af337764e LibWeb: Add a naive Resource cache
This patch introduces a caching mechanism in ResourceLoader. It's keyed
on a LoadRequest object which is what you provide to load_resource()
when you want to load a resource.

We currently never prune the cache, so resources will stay in there
forever. This is obviously not gonna stay that way, but we're just
getting started here. :^)

This should drastically reduce the number of requests when loading
some sites (like Twitter) that reuse the same images over and over.
2020-06-01 21:58:29 +02:00
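The naive cache from 7af337764e, keyed on the request object and never pruned, can be sketched like this. The types are hypothetical stand-ins (`DemoLoadRequest`, `DemoResource`, `DemoResourceLoader`); the fields chosen for the key (URL and method) are an assumption and not necessarily what LibWeb's LoadRequest compares.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical request key; the real LoadRequest may compare other fields.
struct DemoLoadRequest {
    std::string url;
    std::string method = "GET";

    bool operator==(DemoLoadRequest const& other) const
    {
        return url == other.url && method == other.method;
    }
};

// Combines the hashes of both key fields.
struct DemoLoadRequestHash {
    std::size_t operator()(DemoLoadRequest const& request) const
    {
        std::size_t h = std::hash<std::string> {}(request.url);
        return h ^ (std::hash<std::string> {}(request.method) + 0x9e3779b9 + (h << 6) + (h >> 2));
    }
};

// Hypothetical shared resource; fetch_id records which fetch created it.
struct DemoResource {
    DemoLoadRequest request;
    std::size_t fetch_id;
};

class DemoResourceLoader {
public:
    // Returns the cached resource for an equal request, or creates a new one.
    std::shared_ptr<DemoResource> load_resource(DemoLoadRequest const& request)
    {
        if (auto it = m_cache.find(request); it != m_cache.end())
            return it->second;
        auto resource = std::make_shared<DemoResource>(DemoResource { request, ++m_fetch_count });
        m_cache.emplace(request, resource); // never pruned, as in the commit
        return resource;
    }

    std::size_t fetch_count() const { return m_fetch_count; }

private:
    std::unordered_map<DemoLoadRequest, std::shared_ptr<DemoResource>, DemoLoadRequestHash> m_cache;
    std::size_t m_fetch_count = 0;
};
```

Two loads of the same URL return the same shared object, so only the first one would trigger a network request in this model.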
Andreas Kling
5ed66cb8d9 LibWeb: Start building a new Resource class to share more resources
A Resource represents a resource that we're loading, have loaded or
will soon load. Basically, it's a downloadable resource that can be
shared by multiple clients.

A typical use case is multiple <img> elements with the same src.
In a future patch, we will try to make sure that those <img> elements
get the same Resource if possible. This will reduce network usage,
memory usage, and CPU usage. :^)

For now, this first patch simply introduces the mechanism.

You get a Resource by calling ResourceLoader::load_resource().
To get notified about changes to a Resource's load status, you inherit
from ResourceClient and implement the callbacks you're interested in.

This patch turns HTMLImageElement into a ResourceClient.
2020-06-01 21:36:43 +02:00
Andreas Kling
6ed11f1d1c LibWeb: Move ResourceLoader into a new Loader/ directory
2020-06-01 20:42:50 +02:00
Renamed from Libraries/LibWeb/ResourceLoader.cpp