This mode allows us to test the HTTP disk cache with two mechanisms:
1. If RequestServer is launched with --http-disk-cache-mode=testing, it
will cache requests with an X-Ladybird-Enable-Disk-Cache header.
2. In test mode, RS will include an X-Ladybird-Disk-Cache-Status response
header indicating how the response was handled by the cache. There is
no standard way for a web request to know what happened with respect
to the disk cache, so this fills that hole for testing.
This mode is not exposed to users.
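The two mechanisms above can be sketched roughly as follows. This is an
illustration only: the mode names other than "testing", the function
names, and the status string are hypothetical, and the sketch assumes
that the mere presence of the request header is what opts a request in.

```python
from enum import Enum


class DiskCacheMode(Enum):
    DISABLED = "disabled"  # hypothetical name
    ENABLED = "enabled"    # hypothetical name
    TESTING = "testing"    # from --http-disk-cache-mode=testing


ENABLE_HEADER = "x-ladybird-enable-disk-cache"
STATUS_HEADER = "X-Ladybird-Disk-Cache-Status"


def should_use_cache(mode, request_headers):
    """Decide whether a request may use the disk cache at all."""
    if mode is DiskCacheMode.DISABLED:
        return False
    if mode is DiskCacheMode.TESTING:
        # Assumption: in testing mode, only opted-in requests are cached.
        names = {name.lower() for name in request_headers}
        return ENABLE_HEADER in names
    return True


def annotate_response(mode, response_headers, status):
    """In testing mode only, report how the cache handled the response."""
    if mode is DiskCacheMode.TESTING:
        response_headers[STATUS_HEADER] = status
    return response_headers
```

Outside of testing mode the status header is never added, so regular
responses are unaffected.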
We can use Duration as-is without coercing values to seconds. Coercion
truncates sub-second precision, which probably doesn't make much
difference in real-world scenarios, but when we hammer the cache during
tests, the truncation causes flaky behavior.
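A small illustration of why truncation matters, using Python's timedelta
as a stand-in for Duration. The function names and the exact comparison
are assumptions made for this example, not the actual implementation.

```python
from datetime import timedelta


def is_fresh_truncated(freshness_lifetime, age):
    # Lossy: both values are coerced to whole seconds before comparing.
    return int(age.total_seconds()) < int(freshness_lifetime.total_seconds())


def is_fresh_exact(freshness_lifetime, age):
    # Full precision: compare the durations directly.
    return age < freshness_lifetime


lifetime = timedelta(milliseconds=1500)
age = timedelta(milliseconds=1200)

exact_result = is_fresh_exact(lifetime, age)          # 1.2s < 1.5s
truncated_result = is_fresh_truncated(lifetime, age)  # 1 < 1
```

With sub-second lifetimes and ages, the two comparisons disagree: the
exact check reports the entry as fresh while the truncated one reports
it as stale.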
This commit extends is_cacheable() to allow storage of responses that
rely on heuristic cacheability, including status codes defined as
heuristically cacheable.
We implement heuristic freshness lifetime calculation based on the
Last-Modified header as guidance, and apply it when no explicit
expiration information is present.
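A minimal sketch of such a heuristic. The status code set is the list
RFC 9110 defines as heuristically cacheable. The 10% fraction is the
value RFC 9111 mentions as a typical setting; whether this commit uses
exactly that fraction is an assumption, as are all names below.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Status codes defined as heuristically cacheable (RFC 9110).
HEURISTICALLY_CACHEABLE = {200, 203, 204, 206, 300, 301, 308, 404, 405, 410, 414, 501}

# Assumption: lifetime is one tenth of the time since Last-Modified.
HEURISTIC_DIVISOR = 10


def heuristic_freshness_lifetime(status_code, headers, response_date):
    """Return a heuristic lifetime, or None if the heuristic does not apply."""
    lowered = {name.lower(): value for name, value in headers.items()}
    has_explicit_expiration = (
        "expires" in lowered
        or "max-age" in lowered.get("cache-control", "").lower()
    )
    if has_explicit_expiration or status_code not in HEURISTICALLY_CACHEABLE:
        return None
    if "last-modified" not in lowered:
        return None
    try:
        last_modified = parsedate_to_datetime(lowered["last-modified"])
    except (TypeError, ValueError):
        return None
    if last_modified.tzinfo is None:
        last_modified = last_modified.replace(tzinfo=timezone.utc)
    elapsed = response_date - last_modified
    if elapsed <= timedelta(0):
        return timedelta(0)
    return elapsed / HEURISTIC_DIVISOR
```

For example, a 200 response received ten days after its Last-Modified
date would be considered fresh for one day under this sketch.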
We were treating Content-Type as exempt from header updates during cache
revalidation and incorrectly allowing Content-Length to get overwritten
when handling an HTTP 304 response.
This caused cached entries to end up with a mismatched Content-Length
that described the validating response instead of the stored body.
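The corrected behavior can be sketched like this. The exemption set here
contains only Content-Length because that is the header this commit
discusses; the real list may be longer, and the function name is
hypothetical.

```python
# Headers that describe the stored body; a 304 must not overwrite them.
# Assumption: only Content-Length is listed, for illustration.
EXEMPT_FROM_UPDATE = {"content-length"}


def update_stored_headers(stored, revalidation):
    """Merge a 304 response's headers into the stored response headers."""
    updated = dict(stored)
    for name, value in revalidation.items():
        if name.lower() in EXEMPT_FROM_UPDATE:
            continue
        # Replace any existing header with the same case-insensitive name.
        for existing in [key for key in updated if key.lower() == name.lower()]:
            del updated[existing]
        updated[name] = value
    return updated
```

Content-Type from the 304 now replaces the stored value, while the
stored Content-Length keeps describing the cached body.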
Per RFC 9111, we're allowed to cache HTTP responses that don't have a
"Cache-Control" header, provided they do have an "Expires" header.
This lets us cache JavaScript resources on https://x.com/ and makes
it load much faster when cached.
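A rough sketch of the explicit freshness lifetime calculation with an
Expires fallback: max-age wins when present, otherwise the lifetime is
Expires minus Date. The function name and the simplified Cache-Control
parsing are assumptions made for this example.

```python
import re
from datetime import timedelta
from email.utils import parsedate_to_datetime


def explicit_freshness_lifetime(headers):
    """Return the explicit lifetime, or None if the response states none."""
    lowered = {name.lower(): value for name, value in headers.items()}
    match = re.search(
        r"(?:^|,)\s*max-age\s*=\s*(\d+)",
        lowered.get("cache-control", ""),
        re.IGNORECASE,
    )
    if match:
        return timedelta(seconds=int(match.group(1)))
    if "expires" in lowered and "date" in lowered:
        try:
            expires = parsedate_to_datetime(lowered["expires"])
            date = parsedate_to_datetime(lowered["date"])
            return max(expires - date, timedelta(0))
        except (TypeError, ValueError):
            # An unparsable Expires value is treated as already expired.
            return timedelta(0)
    return None
```

A response carrying only Date and an Expires one hour later therefore
gets a one-hour lifetime even though it has no Cache-Control header.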
When a cached response becomes stale, we will now issue a revalidation
request (if the response indicates it may be revalidated). We do this by
issuing a normal fetch request, with If-None-Match and/or
If-Modified-Since request headers.
If the server replies with an HTTP 304 status, we update the stored
response headers to match the 304's headers, and serve the response to
the client from the cache.
If the server replies with any other code, we remove the cache entry.
We will open a new cache entry to cache the new response, if possible.
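The revalidation flow can be sketched as below. The cache is modeled as
a plain dict and all names are hypothetical. The sketch assumes the
validators come from the stored ETag and Last-Modified headers, which is
the standard pairing for If-None-Match and If-Modified-Since.

```python
def build_revalidation_headers(stored_headers):
    """Return conditional request headers; empty if nothing can be validated."""
    lowered = {name.lower(): value for name, value in stored_headers.items()}
    conditional = {}
    if "etag" in lowered:
        conditional["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        conditional["If-Modified-Since"] = lowered["last-modified"]
    return conditional


def handle_revalidation_response(cache, key, status, response_headers):
    """Return the cached entry to serve, or None to use the network response."""
    if status == 304:
        for name, value in response_headers.items():
            # Simplified merge; Content-Length keeps describing the stored body.
            if name.lower() != "content-length":
                cache[key]["headers"][name] = value
        return cache[key]
    # Any other status invalidates the stored entry.
    cache.pop(key, None)
    return None
```

When the second function returns None, the caller would go on to cache
the new network response under a fresh entry, if it is cacheable.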
We previously waited until we received all response headers before we
would create the cache entry. We now create one immediately, and handle
writing the headers in its own function. This will allow us to know if
a cache entry writer already exists for a given cache key, and thus
prevent creating a second writer at the same time.
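One way to picture the new ordering: the writer is registered under its
cache key as soon as it is created, and headers are written later in a
separate step. The class and method names are hypothetical, and an
in-memory dict stands in for the real bookkeeping.

```python
class CacheEntryWriter:
    def __init__(self, key):
        self.key = key
        self.status = None
        self.headers = None
        self.body = bytearray()

    def write_headers(self, status, headers):
        # Headers arrive later, after the writer already exists.
        self.status = status
        self.headers = dict(headers)

    def write_body(self, chunk):
        self.body.extend(chunk)


class DiskCache:
    def __init__(self):
        self._open_writers = {}

    def create_entry(self, key):
        """Return a new writer, or None if one is already open for this key."""
        if key in self._open_writers:
            return None
        writer = CacheEntryWriter(key)
        self._open_writers[key] = writer
        return writer

    def finish_entry(self, writer):
        self._open_writers.pop(writer.key, None)
```

A second request for the same key made while the first is still in
flight gets None back instead of a competing writer.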
This adds a disk cache for HTTP responses received from the network. For
now, we take a rather conservative approach to caching. We don't cache a
response until we're 100% sure it is cacheable (there are heuristics we
can implement in the future based on the absence of specific headers).
The cache is broken into two categories of files:
1. An index file. This is a SQL database containing metadata about each
cache entry (URL, timestamps, etc.).
2. Cache files. Each cached response is in its own file. The file is an
amalgamation of all info needed to reconstruct an HTTP response. This
includes the status code, headers, body, etc.
A cache entry is created once we receive the headers for a response. The
index, however, is not updated at this point. We stream the body into
the cache entry as it is received. Once we've successfully cached the
entire body, we create an index entry in the database. If any of these
steps fails along the way, the cache entry is removed and the index is
left untouched.
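The write path described above can be sketched as follows. The file
layout (a JSON line followed by the body), the table schema, and the
SHA-256 file naming are all assumptions made for illustration; only the
ordering (file first, index last, cleanup on failure) is from the text.

```python
import hashlib
import json
import os
import sqlite3
import tempfile
import time


class DiskCache:
    def __init__(self, directory):
        self.directory = directory
        self.index = sqlite3.connect(os.path.join(directory, "index.db"))
        self.index.execute(
            "CREATE TABLE IF NOT EXISTS entries ("
            "key TEXT PRIMARY KEY, url TEXT, path TEXT, stored_at REAL)"
        )

    def store(self, url, status, headers, body_chunks):
        key = hashlib.sha256(url.encode()).hexdigest()
        path = os.path.join(self.directory, key + ".cache")
        try:
            with open(path, "wb") as entry:
                head = json.dumps({"status": status, "headers": headers})
                entry.write(head.encode() + b"\n")
                for chunk in body_chunks:  # stream the body as it arrives
                    entry.write(chunk)
        except Exception:
            # Any failure removes the entry file; the index is never touched.
            if os.path.exists(path):
                os.remove(path)
            return False
        with self.index:  # commit the index row only after the body is complete
            self.index.execute(
                "INSERT OR REPLACE INTO entries VALUES (?, ?, ?, ?)",
                (key, url, path, time.time()),
            )
        return True

    def is_indexed(self, url):
        row = self.index.execute(
            "SELECT 1 FROM entries WHERE url = ?", (url,)
        ).fetchone()
        return row is not None


def interrupted_body():
    yield b"partial"
    raise ConnectionError("network dropped mid-body")


with tempfile.TemporaryDirectory() as directory:
    cache = DiskCache(directory)
    complete_ok = cache.store("https://example.com/a.js", 200, {}, [b"abc", b"def"])
    interrupted_ok = cache.store("https://example.com/b.js", 200, {}, interrupted_body())
    complete_indexed = cache.is_indexed("https://example.com/a.js")
    interrupted_indexed = cache.is_indexed("https://example.com/b.js")
    entry_files = [name for name in os.listdir(directory) if name.endswith(".cache")]
    cache.index.close()
```

Because the index row is written last, a crash or network error in the
middle of a body never leaves the index pointing at a partial file.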
Subsequent requests are checked for cache hits from the index. If a hit
is found, we read just enough of the cache entry to inform WebContent of
the status code and headers. The body of the response is piped to WC via
syscalls, such that the transfer happens entirely in the kernel; no need
to allocate the memory for the body in userspace (WC still allocates a
buffer to hold the data, of course). If an error occurs while piping the
body, we currently fail the request. There is a FIXME to fall back to a
network request instead.
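As an illustration of the read path, the sketch below reads only the
head of an entry in userspace and hands the body to the kernel. The text
does not name the syscalls involved; sendfile (via Python's
socket.sendfile, which falls back to plain sends where sendfile is
unavailable) is used here as an assumption. The entry layout and the
socket pair standing in for the WebContent connection are hypothetical.

```python
import json
import os
import socket
import struct
import tempfile


def write_entry(path, status, headers, body):
    """Hypothetical layout: 4-byte head size, JSON head, then the raw body."""
    head = json.dumps({"status": status, "headers": headers}).encode()
    with open(path, "wb") as entry:
        entry.write(struct.pack("<I", len(head)))
        entry.write(head)
        entry.write(body)


def serve_entry(path, client):
    """Read just the head, then let the kernel copy the body to the client."""
    with open(path, "rb") as entry:
        (head_size,) = struct.unpack("<I", entry.read(4))
        head = json.loads(entry.read(head_size))
        body_offset = 4 + head_size
        # The body bytes are not read into this process's memory here.
        sent = client.sendfile(entry, offset=body_offset)
    return head, sent


with tempfile.TemporaryDirectory() as directory:
    entry_path = os.path.join(directory, "entry.cache")
    write_entry(entry_path, 200, {"Content-Type": "text/plain"}, b"hello from the cache")
    request_server_end, web_content_end = socket.socketpair()
    with request_server_end, web_content_end:
        head, sent = serve_entry(entry_path, request_server_end)
        received = web_content_end.recv(sent)
```

The receiving side still allocates a buffer for the data, matching the
parenthetical note above.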
Cache hits are also validated for freshness before they are used. If a
response has expired, we remove it and its index entry, and proceed with
a network request.
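A simplified sketch of that freshness check on lookup. The age
calculation here is just "now minus time stored", which is an
assumption; a full implementation would also account for details such as
the Age header. The dicts stand in for the index and the entry files.

```python
from datetime import datetime, timedelta, timezone


def lookup(index, entries, key, now):
    """Return a fresh entry, or None (after cleanup) so the caller hits the network."""
    record = index.get(key)
    if record is None:
        return None
    age = now - record["stored_at"]
    if age >= record["freshness_lifetime"]:
        # Expired: drop both the entry and its index record.
        entries.pop(key, None)
        del index[key]
        return None
    return entries[key]


stored_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
index = {"k": {"stored_at": stored_at, "freshness_lifetime": timedelta(minutes=5)}}
entries = {"k": b"cached response"}

fresh_hit = lookup(index, entries, "k", stored_at + timedelta(minutes=1))
stale_hit = lookup(index, entries, "k", stored_at + timedelta(minutes=6))
```

After the stale lookup, neither the entry nor its index record remains,
so the next request for the same key is a plain cache miss.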