56 commits

Author SHA1 Message Date
Timothy Flynn
163e8e5b44 LibWebView+RequestServer: Support clearing the HTTP disk cache
This is a bit of a blunt hammer, but this hooks an action to clear the
HTTP disk cache into the existing Clear Cache action. Upon invocation,
it stops all existing cache entries from making further progress, and
then deletes the entire cache index and all cache files.

In the future, we will of course want more fine-grained control over
cache deletion, e.g. via an about:history page.
2025-10-14 13:40:33 +02:00
Timothy Flynn
3516a2344f LibRequests+RequestServer: Begin implementing an HTTP disk cache
This adds a disk cache for HTTP responses received from the network. For
now, we take a rather conservative approach to caching. We don't cache a
response until we're 100% sure it is cacheable (there are heuristics we
can implement in the future based on the absence of specific headers).

The cache is broken into 2 categories of files:

1. An index file. This is a SQL database containing metadata about each
   cache entry (URL, timestamps, etc.).
2. Cache files. Each cached response is in its own file. The file is an
   amalgamation of all info needed to reconstruct an HTTP response. This
   includes the status code, headers, body, etc.

A cache entry is created once we receive the headers for a response. The
index, however, is not updated at this point. We stream the body into
the cache entry as it is received. Once we've successfully cached the
entire body, we create an index entry in the database. If any of these
steps failed along the way, the cache entry is removed and the index is
left untouched.

Subsequent requests are checked for cache hits from the index. If a hit
is found, we read just enough of the cache entry to inform WebContent of
the status code and headers. The body of the response is piped to WC via
syscalls, such that the transfer happens entirely in the kernel; no need
to allocate the memory for the body in userspace (WC still allocates a
buffer to hold the data, of course). If an error occurs while piping the
body, we currently error out the request. There is a FIXME to switch to
a network request.

Cache hits are also validated for freshness before they are used. If a
response has expired, we remove it and its index entry, and proceed with
a network request.
2025-10-14 13:40:33 +02:00
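The write ordering described in the commit above (3516a2344f) can be sketched roughly as follows. This is a minimal illustration, not RequestServer's actual code: an in-memory map stands in for the SQL index, and `DiskCacheSketch`, `Writer`, `IndexEntry`, and `path_for` are hypothetical names. It only shows that the index is updated after the body is fully written, that an abandoned entry leaves no trace, and that expired hits are evicted.

```cpp
#include <chrono>
#include <filesystem>
#include <fstream>
#include <functional>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <unordered_map>
#include <utility>

namespace fs = std::filesystem;
using Clock = std::chrono::system_clock;

// Hypothetical stand-in for one row of the SQL index.
struct IndexEntry {
    fs::path file;
    Clock::time_point expires_at;
};

class DiskCacheSketch {
public:
    explicit DiskCacheSketch(fs::path directory)
        : m_directory(std::move(directory))
    {
        fs::create_directories(m_directory);
    }

    // One in-progress cache entry. The index is untouched until commit().
    class Writer {
    public:
        Writer(DiskCacheSketch& cache, std::string url, std::string_view status_and_headers, Clock::time_point expires_at)
            : m_cache(cache)
            , m_url(std::move(url))
            , m_path(cache.path_for(m_url))
            , m_expires_at(expires_at)
            , m_stream(m_path, std::ios::binary | std::ios::trunc)
        {
            m_stream << status_and_headers; // created once headers arrive
        }

        // An entry that was never committed is removed; the index was never touched.
        ~Writer()
        {
            if (m_committed)
                return;
            m_stream.close();
            std::error_code ec;
            fs::remove(m_path, ec);
        }

        void append_body(std::string_view chunk) { m_stream << chunk; }

        // Only after the whole body is on disk does the entry become visible.
        bool commit()
        {
            m_stream.close();
            if (m_stream.fail())
                return false;
            m_cache.m_index[m_url] = IndexEntry { m_path, m_expires_at };
            m_committed = true;
            return true;
        }

    private:
        DiskCacheSketch& m_cache;
        std::string m_url;
        fs::path m_path;
        Clock::time_point m_expires_at;
        std::ofstream m_stream;
        bool m_committed { false };
    };

    // Returns the cache file on a fresh hit. Expired entries are removed
    // (file and index row) and reported as a miss.
    std::optional<fs::path> lookup(std::string const& url, Clock::time_point now = Clock::now())
    {
        auto it = m_index.find(url);
        if (it == m_index.end())
            return std::nullopt;
        if (now >= it->second.expires_at) {
            std::error_code ec;
            fs::remove(it->second.file, ec);
            m_index.erase(it);
            return std::nullopt;
        }
        return it->second.file;
    }

private:
    fs::path path_for(std::string const& url) const
    {
        return m_directory / std::to_string(std::hash<std::string> {}(url));
    }

    fs::path m_directory;
    std::unordered_map<std::string, IndexEntry> m_index;
};
```

A caller would construct a `Writer` when headers arrive, call `append_body()` per chunk, and call `commit()` at end of stream; letting the `Writer` go out of scope without committing models the failure path.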
ayeteadoe
58be9e6400 RequestServer: Enable in Windows CI 2025-08-23 16:04:36 -06:00
Jelle Raaijmakers
9080af4085 RequestServer: Don't set CURLOPT_HTTPGET _and_ CURLOPT_CUSTOMREQUEST
We always set CURLOPT_CUSTOMREQUEST, so we can skip setting
CURLOPT_HTTPGET.
2025-08-13 10:30:04 -04:00
Jelle Raaijmakers
ed57d2de98 RequestServer: Don't return unused bool when setting curl options 2025-08-13 10:30:04 -04:00
Jelle Raaijmakers
da351ac468 RequestServer: Pass nullptr instead of an unused variable 2025-08-13 10:30:04 -04:00
Jelle Raaijmakers
585e4ed875 RequestServer: Add some useful dbgln_if()s
While debugging request handling, I found these helpful for
understanding what's going on behind the scenes.
2025-08-13 10:30:04 -04:00
Jelle Raaijmakers
41cf150a5b LibDNS+RequestServer: Don't construct Vectors to validate DNS response
Instead of filling vectors and returning them just to invoke
`.is_empty()`, forward the calls to the underlying vectors directly.
2025-08-13 10:30:04 -04:00
rmg-x
d489e46448 RequestServer: Clarify comment for removing Content-Type header
This comment assumed that the reader knew how curl behaved with empty
header values.
2025-08-13 06:30:56 -04:00
rmg-x
d1db27dc42 RequestServer: Remove unused include of AK/Badge.h 2025-08-13 06:30:56 -04:00
Timothy Flynn
58f5fe7d79 RequestServer: Add an IPC method to reconnect N request clients
Similar to the same IPC on ImageDecoder, this just avoids IPC churn.
2025-08-10 11:02:50 +02:00
Luke Wilde
08a03534af Meta+RequestServer: Enable HTTP/3 for curl
This adds an overlay port for curl that adds the features required for
HTTP/3.

This is not quite compatible with the upstream vcpkg.json, because
enabling HTTP/2 makes it use the default SSL backend, which is
sectransp for macOS and schannel on Windows. These backends are not
compatible with ngtcp2. Additionally, we cannot build curl with
multiple SSL backends when using ngtcp2.

I couldn't find a way to selectively disable/enable dependencies based
on what features are enabled, so I made HTTP/2 pick OpenSSL in our
overlay port. Upstream vcpkg will likely want to support wolfSSL and
GnuTLS backends for ngtcp2, so there'll be additional work to get this
into upstream.
2025-07-09 14:44:56 -06:00
Luke Wilde
05ee3f1876 RequestServer: Wait until initial connection is open for multiplexing
By default, if multiple requests start to a newly seen origin, curl
will not wait for a connection to open to figure out if the server
supports multiplexing and will instead open a new connection for each
request (including a new TLS session and such).

This is particularly an issue for initial page load, where a complex
website could, for example, request tens of items at once (e.g. a bunch
of scripts).

We can be kinder to servers that support multiplexing by telling curl
to wait till an initial connection is established to determine if
multiplexing is supported.

On my machine and internet connection, this reduces the number of
connections to github.githubassets.com on initial load of
https://github.com/LadybirdBrowser/ladybird from 12 to 2.
2025-07-07 14:11:26 +02:00
Andrew Kaster
d9c85288d9 LibTLS: Remove blocking option and simplify Options struct
The complex macro for options with defaults doesn't make sense
now that there's only one option.
2025-06-23 17:49:21 +02:00
Luke Wilde
2dead9231d RequestServer: Handle client disappearance more gracefully
Without these fixes, RequestServer was likely to crash if the client
crashed (e.g. WebContent). This was because there was no error handling
for when writing to the client failed.

This is particularly an issue because RequestServer has shared
instances, so it would then crash every other client of RequestServer.
Then, because another RequestServer instance is not currently spun up,
it becomes impossible to start any clients that need a RequestServer
instance. Recreating a RequestServer should also be handled, but that's
not in the scope of this change.

We can tell curl that we failed to write data to the client and that
the request should be aborted by returning `CURL_WRITEFUNC_ERROR` from
the write callback.

It is also possible for requests to be destroyed with buffered data,
which is normal to happen if the client disappears
(i.e. ConnectionFromClient is destroyed) or the request is cancelled by
the client. We log a warning in case this is not expected, to assist
with debugging related issues.
2025-06-13 17:03:57 +02:00
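The abort path described in the commit above (2dead9231d) might look roughly like this. It is a sketch under assumptions: `RequestSketch` and `send_to_client` are hypothetical, and `WRITEFUNC_ERROR_SKETCH` is a local stand-in for curl's `CURL_WRITEFUNC_ERROR` (its value is assumed here) so that the example compiles without libcurl. Real code would include `<curl/curl.h>` and return the real constant.

```cpp
#include <cstddef>
#include <functional>

// Local stand-in for CURL_WRITEFUNC_ERROR from <curl/curl.h> (value assumed).
constexpr std::size_t WRITEFUNC_ERROR_SKETCH = 0xFFFFFFFF;

// Hypothetical per-request state. `send_to_client` returns false once the
// client (e.g. WebContent) has gone away and the write failed.
struct RequestSketch {
    std::function<bool(char const*, std::size_t)> send_to_client;
    bool aborted { false };
};

// Same shape as a CURLOPT_WRITEFUNCTION callback.
std::size_t on_data_received(char* data, std::size_t size, std::size_t nmemb, void* user_data)
{
    auto* request = static_cast<RequestSketch*>(user_data);
    std::size_t const total = size * nmemb;

    if (!request->send_to_client(data, total)) {
        // Report the failure instead of crashing; curl then aborts the transfer.
        request->aborted = true;
        return WRITEFUNC_ERROR_SKETCH;
    }

    // Returning the full byte count tells curl that everything was consumed.
    return total;
}
```

The point of the design is that a write failure becomes an ordinary per-request error, so one vanished client does not take down a RequestServer instance shared with other clients.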
Ali Mohammad Pur
4b5664f867 LibWebView+RequestServer: Wire up a validate-DNSSEC setting option to RS 2025-06-11 18:16:29 +02:00
Andreas Kling
e0e09f71be RequestServer: Don't try to self-destruct already-destroyed request 2025-05-29 03:46:49 +02:00
Aliaksandr Kalenik
ceaeea3c26 RequestServer: Use write notifier instead of busy waiting for socket
...to become writable.

Solves triangular deadlock problem that happened in the following case
(copied from https://github.com/LadybirdBrowser/ladybird/issues/1816):
- The WebContent process is spinning on
   `send_sync_but_allow_failure` waiting for the UI process to respond
- The UI process is spinning on `send_sync_but_allow_failure`, waiting
   for RequestServer to respond
- RequestServer is stuck in this loop, trying to write to the
  WebContent's socket file (when I attach to RS, we are always in the
  sched_yield call, so we're spinning on EAGAIN).

For me the issue was reliably reproducible on Google Maps and with this
change we no longer deadlock there.
2025-05-24 16:28:48 +03:00
Colin Reeder
5ac88e7726 RequestServer: Leave Accept-Encoding up to curl 2025-05-19 13:18:44 +02:00
Mohamed amine Bounya
b77643a2e8 RequestServer: Don't assert for socket fd not being CURL_SOCKET_BAD
The assertion in `WebSocketImplCurl::did_connect()` keeps failing for
multiple websockets when loading `https://www.speedtest.net/` since
commit 14ebcd4. This fixes that by checking and returning false if
something went wrong and letting the caller function handle it.
2025-04-30 18:20:26 -06:00
Ali Mohammad Pur
2c13504bfc LibWebView+RequestServer: Add some UI for DNS settings 2025-04-22 18:05:07 -04:00
Timothy Flynn
b54a520b69 LibRequests+RequestServer: Add an error code for bad content encoding
This error is set by curl when, e.g., a gzipped response body has an
invalid gzip encoding.
2025-04-20 16:50:37 +02:00
Timothy Flynn
e0c4801e0f RequestServer: Ignore CURLE_RECV_ERROR in some cases
The HTTPS server used by WPT will close TLS connections without sending
a "close notify" alert. For responses that did not have a Content-Length
header, curl treats this as an error.
2025-04-20 16:50:37 +02:00
stasoid
beb11f0447 RequestServer: Compile on Windows 2025-04-10 19:03:00 -06:00
Aliaksandr Kalenik
db8c443392 Everywhere: Make TransportSocket non-movable
Instead of wrapping all non-movable members of TransportSocket in OwnPtr
to keep it movable, make TransportSocket itself non-movable and wrap it
in OwnPtr.
2025-04-09 15:27:52 +02:00
rmg-x
f39d14fa8a RequestServer: Remove check for square brackets in websocket_connect
This is no longer necessary since commit:
6480e1a3fe
2025-04-08 09:13:33 +02:00
rmg-x
27c19c02d2 RequestServer: Remove check for square brackets in host before resolving
This is no longer needed since `IPv6Address::from_string` supports
square brackets. After the update to curl, `CURLOPT_RESOLVE` now
supports replacing IPv6 hosts as well.
2025-04-05 14:26:09 -04:00
Timothy Flynn
0de017df9b LibRequests: Move NetworkError stringification to LibRequests
Let's also rename the file to NetworkError.h while we're here. No need
to have "Enum" in the name.
2025-04-02 08:52:45 -04:00
Timothy Flynn
cf69f52d53 LibIPC+Everywhere: Always pass ownership of transferred data to clients
This has been a longstanding ergonomic issue with our IPC compiler. Non-
trivial types were previously passed by const&. So if we wanted to avoid
expensive copies, we would have to const_cast and move the data.

We now pass ownership of all transferred data to the client subclasses.
This allows us to remove const_cast from these methods, and allows us to
avoid some trivial expensive copies that we didn't bother to const_cast.
2025-03-09 11:14:20 -04:00
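The ergonomic difference described in the commit above (cf69f52d53) can be illustrated with a small sketch. The names `Payload`, `HandlerSketch`, and `dispatch` are hypothetical and do not reflect the code the IPC compiler generates; the sketch only contrasts a `const&` handler, which must copy, with a by-value handler that takes ownership and can move.

```cpp
#include <string>
#include <utility>
#include <vector>

using Payload = std::vector<std::string>;

struct HandlerSketch {
    Payload stored;

    // Old shape: a const& parameter means keeping the data requires a copy
    // (or a const_cast followed by a move).
    void handle_by_const_ref(Payload const& payload) { stored = payload; }

    // New shape: the handler owns its argument and can simply move it.
    void handle_by_value(Payload payload) { stored = std::move(payload); }
};

// Stand-in for generated dispatch code: the decoded message is no longer
// needed after the call, so ownership is handed over.
void dispatch(HandlerSketch& handler, Payload decoded)
{
    handler.handle_by_value(std::move(decoded));
}
```

With the by-value form, the vector's buffer is handed from the decoded message to the handler without reallocation, and no `const_cast` is needed.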
Andrew Kaster
00c9031304 RequestServer: Remove dead code from ConnectionFromClient 2025-03-06 14:43:02 -05:00
Luke Wilde
209b10e53e RequestServer: Retrieve timing info from curl and pipe it to LibWeb
This timing info will be used to create a PerformanceResourceTiming
entry.
2025-03-06 09:00:53 -07:00
devgianlu
85d46a71d9 LibDNS: Ensure non-blocking socket is used for TCP connections 2025-02-22 18:39:58 +01:00
Andrew Kaster
06faa7b160 LibWebSocket+RequestServer: Resolve WebSocket hosts using our resolver 2025-02-20 15:04:50 -07:00
Andrew Kaster
1809ab2743 RequestServer: Enable the new WebSocketImplCurl backend for WebSockets 2025-02-20 15:04:50 -07:00
Shannon Booth
2823ac92d0 RequestServer: Do not check for invalid URL starting a request
Now that LibIPC ensures that an invalid URL is not passed through,
we do not need to check the validity of the URL here.
2025-02-19 08:01:35 -05:00
devgianlu
42a18a4a91 RequestServer: Use default certificate for DNS over TLS 2025-02-18 15:46:44 +01:00
devgianlu
24d3da64e5 LibWebSocket: Support specifying root certificate path 2025-02-17 19:52:43 +01:00
David Hewitt
6b661a91c6 RequestServer: Send empty headers in requests 2025-02-17 13:43:16 +01:00
rmg-x
41c6f93aa8 RequestServer: Remove IPV6 bracket notation on host before resolving
Previously, we were passing in the serialized value which included
square brackets but that isn't a valid IPV6 address.
2025-02-17 12:02:41 +01:00
stasoid
3e46cb9067 LibWebView+ImageDecoder+RequestServer+WebContent: Add init_transport 2025-02-12 22:32:13 -07:00
rmg-x
ec481aa08a LibDNS+RequestServer: Fix UAF in lookup() by changing Span -> Vector
Co-authored-by: Ali Mohammad Pur <ali.mpfard@gmail.com>
2025-02-11 07:24:33 +01:00
rmg-x
17c0d4469c RequestServer: Check for empty list of IP addresses in DNS result
Before, if something went wrong with DNS lookup and there were unrelated
records (i.e. not A or AAAA) then we would still attempt to build a
resolve list. This resulted in curl errors related to the option itself
and displayed as "unknown network error" to the user.
2025-02-03 00:14:22 +01:00
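The guard described in the commit above (17c0d4469c) amounts to refusing to build a resolve list from an empty address set. A minimal sketch, assuming addresses arrive as strings: `build_resolve_entry` is a hypothetical helper, and the `HOST:PORT:ADDRESS[,ADDRESS]` layout is the entry format documented for curl's `CURLOPT_RESOLVE`.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Builds one "HOST:PORT:ADDRESS[,ADDRESS...]" entry for a resolve list.
// Returns std::nullopt when the lookup produced no usable addresses (for
// example, only records that are neither A nor AAAA), so the caller can
// report a DNS failure instead of handing curl a malformed option.
std::optional<std::string> build_resolve_entry(
    std::string const& host, unsigned port, std::vector<std::string> const& addresses)
{
    if (addresses.empty())
        return std::nullopt;

    std::string entry = host + ":" + std::to_string(port) + ":";
    for (std::size_t i = 0; i < addresses.size(); ++i) {
        if (i > 0)
            entry += ",";
        entry += addresses[i];
    }
    return entry;
}
```

Checking before building the entry lets the caller surface a specific DNS error rather than the generic "unknown network error" mentioned in the commit message.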
Andrew Kaster
91161d77e0 RequestServer: Remove unused content length check for received data 2025-01-23 21:35:58 +01:00
Andreas Kling
7cd6ea6f33 RequestServer: Clean up the CURLM "multi handle" when client drops
Otherwise we may leak all kinds of things inside CURL.
2024-12-25 19:14:45 +01:00
Shannon Booth
0fa54c2327 LibURL+LibWeb: Make URL::serialize return a String
Simplifying a bunch of unneeded error handling around the place.
2024-12-04 16:34:13 +00:00
rmg-x
c490118b6c RequestServer: Free curl string lists and check result before setting
Previously, we leaked the `curl_slist`s on every request. This also
validates the pointer we get from `curl_slist_append` before setting the
option.

Also, use the `set_option` helper for CURLOPT_RESOLVE as it will print
when there is an error.
2024-12-01 11:32:45 +01:00
rmg-x
32358df13e RequestServer: Remove unused global "g_dns_cache"
`curl_slist_append` copies the string so we don't need to keep it around
ourselves.
2024-12-01 11:32:45 +01:00
rmg-x
9b8987e34e RequestServer: Use existing type for buffered UDP socket 2024-12-01 11:32:45 +01:00
Sam Atkins
900c131178 LibURL: Make URL::serialized_host() infallible
This can no longer fail, so update the return type to match.

This makes a few more methods now unable to return errors, but one thing
at a time. 😅
2024-11-30 12:07:39 +01:00
Ali Mohammad Pur
ff311c1560 RequestServer+LibDNS: Don't .await() the DNS lookup promise
...and make sure it will eventually complete (or fail) by adding a
timeout retry sequence.

Fixes an issue where RequestServer would stick around after exit,
waiting for piled up DNS requests for a long time.
2024-11-25 11:46:35 +01:00