This mode allows us to test the HTTP disk cache with two mechanisms:
1. If RequestServer is launched with --http-disk-cache-mode=testing, it
will cache requests with an X-Ladybird-Enable-Disk-Cache header.
2. In test mode, RS will include an X-Ladybird-Disk-Cache-Status response
header indicating how the response was handled by the cache. There is
no standard way for a web request to know what happened with respect
to the disk cache, so this fills that hole for testing.
This mode is not exposed to users.
The Windows RequestPipe implementation uses a non-blocking local socket
pair, which means the non-fatal "resource is temporarily unavailable"
error that can occur in the non-blocking HTTP Response data writes can
be retried. This was seen often when loading https://ladybird.org.
While the EAGAIN errno is defined on Windows, WSAEWOULDBLOCK is the
error code returned in this scenario, so we were not detecting that we
could retry and treated the failed write attempt as a proper error.
We now detect WSAEWOULDBLOCK and convert it into the errno equivalent
EWOULDBLOCK. There is precedent for doing a similar conversion in the
Windows PosixSocketHelper::read() implementation.
Finally, we retry when we receive either EAGAIN or EWOULDBLOCK error
codes on all platforms. While POSIX allows these two error codes to have
the same value, which they do on Linux according to
https://www.man7.org/linux/man-pages/man3/errno.3.html, it is not
guaranteed. So we now ensure platforms that return EWOULDBLOCK with a
value different from EAGAIN also perform write retries.
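
As a rough illustration of the retry check described above, here is a
minimal sketch. The helper name is_retryable_write_error() is hypothetical
and is not taken from the codebase. It assumes any WSAEWOULDBLOCK has
already been converted to EWOULDBLOCK by the time it is called.

```cpp
#include <cerrno>

// Hypothetical helper (not the actual RequestServer function): returns true
// when a failed non-blocking write should be retried instead of being
// reported as an error.
static bool is_retryable_write_error(int error_code)
{
    if (error_code == EAGAIN)
        return true;
// POSIX permits EAGAIN and EWOULDBLOCK to share a value (as on Linux) but
// does not require it, so only emit the second comparison when they differ.
#if EAGAIN != EWOULDBLOCK
    if (error_code == EWOULDBLOCK)
        return true;
#endif
    return false;
}
```

A caller would keep the unwritten bytes queued and try again once the fd is
writable when this returns true, and fail the request otherwise.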
When a cached response becomes stale, we will now issue a revalidation request
(if the response indicates it may be revalidated). We do this by issuing
a normal fetch request, with If-None-Match and/or If-Modified-Since
request headers.
If the server replies with an HTTP 304 status, we update the stored
response headers to match the 304's headers, and serve the response to
the client from the cache.
If the server replies with any other code, we remove the cache entry.
We will open a new cache entry to cache the new response, if possible.
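
The revalidation flow above could be sketched roughly as follows. All type
and function names here (StoredValidators, revalidation_headers(),
outcome_for_status()) are hypothetical and only illustrate the decision. The
sketch assumes the validators come from the stored ETag and Last-Modified
response headers.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified view of the validators stored with a cache entry.
struct StoredValidators {
    std::optional<std::string> etag;          // from the stored ETag header
    std::optional<std::string> last_modified; // from the stored Last-Modified header
};

using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Builds the conditional headers to attach to the revalidation request.
HeaderList revalidation_headers(StoredValidators const& validators)
{
    HeaderList headers;
    if (validators.etag)
        headers.emplace_back("If-None-Match", *validators.etag);
    if (validators.last_modified)
        headers.emplace_back("If-Modified-Since", *validators.last_modified);
    return headers;
}

enum class RevalidationOutcome {
    ServeFromCache, // 304: update the stored headers, serve the cached body
    ReplaceEntry,   // anything else: remove the entry, cache the new response if possible
};

RevalidationOutcome outcome_for_status(int status_code)
{
    return status_code == 304 ? RevalidationOutcome::ServeFromCache
                              : RevalidationOutcome::ReplaceEntry;
}
```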
We currently store response headers in the cache entry file, before the
response body. When we implement cache revalidation, we will need to
update the stored response headers with whatever headers are received
in a 304 response. It's not unlikely that those headers will have a size
that differs from the stored headers. We would then have to rewrite the
entire response body after the new headers.
Rather than dealing with those inefficiencies, let's store the
response headers in the cache index. This will allow us to update the
headers with a simple SQL query.
The Win32 API equivalent to pipe2() is CreatePipe(), which creates read
and write anonymous pipe handles that we can set to non-blocking via
SetNamedPipeHandleState(); however, this initial approach caused issues
as our Windows infrastructure assumes socket-based handles/fds and that
we don't use Windows pipes at all (see Core::System::is_socket() in
SystemWindows.cpp). So we use socketpair() to keep our current
assumptions true.
Given that Windows uses socketpair() and Unix uses pipe2(), this
RequestPipe abstraction avoids ifdef soup by hiding the details about
how the read/write fds pair is created and how response data is written
to the client.
We previously had no protection against the same URL being requested
multiple times concurrently. For example, if a URL did not have any
cache entry and was requested twice, we would open two cache writers
concurrently. This would result in both writers piping the response to
disk, and we'd have a corrupt cache file.
We now hold back requests under certain scenarios until existing cache
entries have completed:
* If we are opening a cache entry for reading:
  - If there is an existing reader entry, carry on as normal. We can
    have multiple readers.
  - If there is an existing writer entry, defer the request until it is
    complete.
* If we are opening a cache entry for writing:
  - If there is an existing reader or writer entry, defer the request
    until it is complete.
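
The rules above amount to a readers/writer policy per cache key. A minimal
sketch of the decision, using hypothetical names (OpenMode, OpenEntries,
must_defer()) that are not taken from the codebase:

```cpp
// Whether the incoming request wants to read or write the cache entry.
enum class OpenMode {
    Read,
    Write,
};

// Hypothetical bookkeeping of the entries currently open for one cache key.
struct OpenEntries {
    unsigned readers { 0 };
    bool has_writer { false };
};

// Returns true if the request must wait until the existing entries complete.
bool must_defer(OpenMode mode, OpenEntries const& open)
{
    // Readers may coexist with other readers, but not with a writer.
    if (mode == OpenMode::Read)
        return open.has_writer;

    // A writer needs exclusive access: no readers and no other writer.
    return open.has_writer || open.readers > 0;
}
```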
This object will be needed in a future commit to store requests awaiting
other requests to finish. Doing this in a separate commit just to make
that commit less noisy.
We previously waited until we received all response headers before we
would create the cache entry. We now create one immediately, and handle
writing the headers in its own function. This will allow us to know if
a cache entry writer already exists for a given cache key, and thus
prevent creating a second writer at the same time.
We currently manage request lifetime as both an ActiveRequest structure
and a series of lambda callbacks. In an upcoming patch, we will want to
"pause" a request to de-duplicate equivalent requests, such that only
one request goes over the network and saves its response to the disk
cache.
To make that easier to reason about, this adds a Request class to manage
the lifetime of a request via a state machine. We will now be able to
add a "waiting for disk cache" state to pause the request.
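
To illustrate the idea, here is a heavily simplified sketch of such a state
machine. Apart from the "waiting for disk cache" state named above, every
state and method name is an assumption made for this example and does not
reflect the actual Request class.

```cpp
// Hypothetical states; only "waiting for disk cache" comes from the text above.
enum class State {
    Init,
    WaitingForDiskCache,
    Fetching, // serving the request, from the cache or the network
    Complete,
};

class Request {
public:
    State state() const { return m_state; }

    // Begin processing. If an equivalent request currently holds the cache
    // entry, park this request instead of letting it proceed.
    void start(bool cache_entry_busy)
    {
        if (m_state != State::Init)
            return;
        m_state = cache_entry_busy ? State::WaitingForDiskCache : State::Fetching;
    }

    // Called once the cache entry that blocked this request has completed.
    void cache_entry_released()
    {
        if (m_state == State::WaitingForDiskCache)
            m_state = State::Fetching;
    }

    // Only a request that is actually being served can complete.
    void finish()
    {
        if (m_state == State::Fetching)
            m_state = State::Complete;
    }

private:
    State m_state { State::Init };
};
```

Keeping every transition inside the class means a paused request cannot be
completed by accident; it has to leave the waiting state first.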