
87 commits

Author SHA1 Message Date
Timothy Flynn
918f6a4c9f LibHTTP: Ensure we use the Vary key when updating last access time
Apparently, sqlite will fill this placeholder value in with NULL if we
do not pass a value. The query being executed here is:

    UPDATE CacheIndex
    SET last_access_time = ?
    WHERE cache_key = ? AND vary_key = ?;
2026-02-06 16:24:49 +01:00
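The effect described in commit 918f6a4c9f can be reproduced in miniature. The sketch below uses Python's sqlite3 module and a hypothetical three-column CacheIndex table (the real schema is not shown in this log). Python refuses to run a statement with a missing binding, so the sketch binds NULL explicitly for the vary_key placeholder; the UPDATE then matches no rows, because `vary_key = NULL` is never true in SQL.

```python
import sqlite3

UPDATE_SQL = """
    UPDATE CacheIndex
    SET last_access_time = ?
    WHERE cache_key = ? AND vary_key = ?;
"""


def make_index():
    """Create an in-memory table with one entry (hypothetical schema)."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE CacheIndex ("
        "cache_key INTEGER, vary_key INTEGER, last_access_time INTEGER)"
    )
    connection.execute("INSERT INTO CacheIndex VALUES (1, 42, 100)")
    return connection


def update_last_access_time(connection, now, cache_key, vary_key):
    """Run the UPDATE and return how many rows it touched."""
    cursor = connection.execute(UPDATE_SQL, (now, cache_key, vary_key))
    return cursor.rowcount


connection = make_index()
# A NULL vary key silently matches nothing, so the access time stays stale.
rows_with_null_key = update_last_access_time(connection, 200, 1, None)
# Passing the real vary key updates the single matching row.
rows_with_real_key = update_last_access_time(connection, 200, 1, 42)
```

The failure mode is quiet: the statement succeeds and simply updates zero rows.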
Zaggy1024
4eb310cd3f LibWeb: Skip range requests for media if the server won't accept them
Currently, this just respects the reported value from Accept-Ranges,
but we could also just try sending a range request and see if the
server rejects it, then fall back to a normal request after. For now,
this is fine, and we can make it use a fallback later if needed.
2026-01-29 05:22:27 -06:00
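As a rough illustration of the Accept-Ranges check in commit 4eb310cd3f, the sketch below decides whether a range request is worth sending. The function name is hypothetical, and treating a missing header as "not accepted" is an assumption made here, not something the commit message states.

```python
def server_accepts_byte_ranges(response_headers):
    """Return True only if Accept-Ranges lists the "bytes" range unit.

    response_headers is a list of (name, value) pairs. Header names are
    case-insensitive, so they are compared in lowercase.
    """
    for name, value in response_headers:
        if name.lower() == "accept-ranges":
            units = [unit.strip().lower() for unit in value.split(",")]
            return "bytes" in units
    # Assumption: no header means we do not attempt a range request.
    return False
```

A caller would consult this after an initial response and fall back to a normal request when it returns False.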
Zaggy1024
99020b50a3 LibHTTP: Implement extraction of the Content-Range values in HeaderList 2026-01-29 05:22:27 -06:00
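For reference, a Content-Range value such as `bytes 0-499/1234` carries a first byte, a last byte, and an optional complete length. The sketch below parses that shape following the RFC 9110 grammar; it is not the HeaderList implementation from commit 99020b50a3, and the `ContentRange` type and function name are hypothetical. The unsatisfied form (`bytes */1234`) is deliberately rejected here.

```python
import re
from typing import NamedTuple, Optional


class ContentRange(NamedTuple):
    first: int
    last: int
    complete_length: Optional[int]  # None when the server sent "*"


_PATTERN = re.compile(r"bytes ([0-9]+)-([0-9]+)/([0-9]+|\*)")


def parse_content_range(value):
    """Return a ContentRange, or None if the value is malformed."""
    match = _PATTERN.fullmatch(value.strip())
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    if first > last:
        return None
    total = None if match.group(3) == "*" else int(match.group(3))
    # The last byte position must lie inside the complete length.
    if total is not None and last >= total:
        return None
    return ContentRange(first, last, total)
```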
Timothy Flynn
4585734696 LibHTTP: Honor the min-fresh Cache-Control request directive 2026-01-28 11:31:04 -05:00
Timothy Flynn
896ecb28ab LibHTTP: Honor the max-stale Cache-Control request directive 2026-01-28 11:31:04 -05:00
Timothy Flynn
4a728b1f29 LibHTTP: Honor the max-age Cache-Control request directive 2026-01-28 11:31:04 -05:00
Timothy Flynn
26ddd0a904 LibHTTP: Honor the no-cache Cache-Control request directive 2026-01-28 11:31:04 -05:00
Timothy Flynn
cb1ad8a904 LibHTTP: Honor the no-store Cache-Control request directive 2026-01-28 11:31:04 -05:00
Timothy Flynn
2918537596 LibHTTP: Add helper to extract a duration from a Cache-Control directive 2026-01-28 11:31:04 -05:00
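The request directives honored in the commits above (max-age, min-fresh, max-stale) all adjust one freshness decision. The sketch below is one reading of RFC 9111 section 5.2.1, not the LibHTTP code; both function names are hypothetical, all durations are whole seconds, and a valueless max-stale ("accept any staleness") is not modelled.

```python
def parse_delta_seconds(value):
    """Return a non-negative integer duration, or None if malformed."""
    text = value.strip().strip('"')
    if not text.isascii() or not text.isdigit():
        return None
    return int(text)


def can_serve_from_cache(freshness_lifetime, current_age,
                         max_age=None, min_fresh=None, max_stale=None):
    """Decide whether a stored response may be used without revalidation."""
    lifetime = freshness_lifetime
    if max_age is not None:
        # The client caps how old a response it is willing to accept.
        lifetime = min(lifetime, max_age)
    if min_fresh is not None and lifetime < current_age + min_fresh:
        # The response would not stay fresh for long enough.
        return False
    if lifetime > current_age:
        return True
    if max_stale is not None:
        # Stale, but possibly within the staleness the client tolerates.
        return current_age - lifetime <= max_stale
    return False
```

For example, a response with a 60 second lifetime and an age of 90 seconds is served when the request carries `max-stale=30`, but not with `max-stale=10`.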
Timothy Flynn
40800fd91e LibHTTP: Implement a strict method to extract Cache-Control directives
Our previous implementation was a bit too tolerant of bad header values.
For example, extracting a "max-age" from a header value of "abmax-agecd"
would have incorrectly parsed successfully.

We now find exact (case-insensitive) directive matches. We also handle
quoted string values, which may contain important delimiters that we
would have previously split on.
2026-01-28 11:31:04 -05:00
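The two behaviours described in commit 40800fd91e, exact case-insensitive name matching and awareness of quoted strings, can be sketched as follows. This is an illustration and not the LibHTTP implementation; all function names are hypothetical.

```python
import re


def split_directives(header_value):
    """Split on commas that are not inside a quoted string."""
    parts, current = [], []
    in_quotes = escaped = False
    for ch in header_value:
        if escaped:
            escaped = False
        elif in_quotes and ch == "\\":
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
            continue
        current.append(ch)
    parts.append("".join(current))
    return [part.strip() for part in parts if part.strip()]


def unquote(value):
    """Strip surrounding quotes and resolve backslash escapes."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return re.sub(r"\\(.)", r"\1", value[1:-1])
    return value


def extract_directive(header_value, name):
    """Return the directive's value ("" if valueless), or None if absent."""
    for part in split_directives(header_value):
        key, _, value = part.partition("=")
        if key.strip().lower() == name.lower():
            return unquote(value.strip())
    return None
```

With this approach, `abmax-agecd` yields no max-age directive, and a comma inside a quoted value does not start a new directive.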
Timothy Flynn
54c2ecedca RequestServer: Do not flush the disk cache for unsuccessful requests
If a request failed, or was stopped, do not attempt to write the cache
entry footer to disk. Note that at this point, the cache index will not
have been created, thus this entry will not be used in the future. We do
still delete any partial file on disk.

This serves as a more general fix for the issue addressed in commit
9f2ac14521.
2026-01-23 14:24:20 +01:00
Timothy Flynn
12552f0d72 LibHTTP: Avoid UAF while deleting exempt cache headers
HeaderList::delete involves a Vector::remove_all_matching internally.
So if an exempt header appeared again later in the header list, we would
be accessing the name string of the previously deleted header.
2026-01-22 13:18:29 -05:00
Timothy Flynn
d3041dc054 LibHTTP+LibWeb: Support the HTTP Vary response header
We now partition the HTTP disk cache based on the Vary response header.
If a cached response contains a Vary header, we look for each of the
header names in the outgoing HTTP request. The outgoing request must
match every header value in the original request for the cache entry
to be used; otherwise, a new request will be issued, and a separate
cache entry will be created.

Note that we must now defer creating the disk cache file itself until we
have received the response headers. The Vary key is computed from these
headers, and affects the partitioned disk cache file name.

There are further optimizations we can make here. If we have a Vary
mismatch, we could find the best candidate cached response and issue a
conditional HTTP request. The content server may then respond with an
HTTP 304 if the mismatched request headers are actually okay. But for
now, if we have a Vary mismatch, we issue an unconditional request as
a purely correctness-oriented patch.
2026-01-22 08:54:49 -05:00
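The matching rule from commit d3041dc054 can be sketched as below: every header named in the cached response's Vary value must carry the same value in the stored request and the new request. The hashing scheme in `vary_key` is purely an assumption to illustrate partitioning; the log does not say how LibHTTP derives its key. Headers are plain dicts here for brevity.

```python
import hashlib


def _normalise(headers):
    """Lowercase header names so lookups are case-insensitive."""
    return {name.lower(): value.strip() for name, value in headers.items()}


def vary_names(vary_value):
    return [name.strip().lower() for name in vary_value.split(",") if name.strip()]


def vary_matches(vary_value, stored_request_headers, new_request_headers):
    """True if the new request may reuse the cached response."""
    names = vary_names(vary_value)
    if "*" in names:
        # "Vary: *" never matches (RFC 9111 section 4.1).
        return False
    stored = _normalise(stored_request_headers)
    new = _normalise(new_request_headers)
    return all(stored.get(name) == new.get(name) for name in names)


def vary_key(vary_value, request_headers):
    """Hypothetical partition key built from the varied request headers."""
    headers = _normalise(request_headers)
    digest = hashlib.sha256()
    for name in sorted(vary_names(vary_value)):
        digest.update(f"{name}:{headers.get(name, '')}\n".encode("utf-8"))
    return digest.hexdigest()
```

Two requests that differ in a varied header produce different keys and therefore land in separate cache entries.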
Timothy Flynn
36a826815d LibHTTP+LibWeb+RequestServer: Store request headers in the HTTP caches
We need to store request headers in order to handle Vary mismatches.

(Note we should also be using BLOB for header storage in sqlite, as they
are not necessarily UTF-8.)
2026-01-22 08:54:49 -05:00
Timothy Flynn
24da225b3b LibHTTP: Update disk cache entry format comment with latest format
In commit 20cd19be4d, HTTP headers were
moved from the cache entry to the cache index, but this comment was not
updated.
2026-01-22 08:54:49 -05:00
Timothy Flynn
aa1517b727 LibHTTP+LibWeb+RequestServer: Handle the Fetch API's cache mode
If the cache mode is no-store, we must not interact with the cache at
all.

If the cache mode is reload, we must not use any cached response.

If the cache mode is only-if-cached or force-cache, we are permitted
to respond with stale cache responses.

Note that we currently cannot test only-if-cached in test-web. Setting
this mode also requires setting the cors mode to same-origin, but our
http-test-server infra requires setting the cors mode to cors.
2026-01-22 07:05:06 -05:00
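The cache-mode rules in commit aa1517b727 reduce to a few predicates. The sketch below uses the mode names from the Fetch specification; the helper names are hypothetical, and allowing writes under reload follows the Fetch specification's description of that mode rather than anything stated in the commit message.

```python
from enum import Enum


class CacheMode(Enum):
    DEFAULT = "default"
    NO_STORE = "no-store"
    RELOAD = "reload"
    NO_CACHE = "no-cache"
    FORCE_CACHE = "force-cache"
    ONLY_IF_CACHED = "only-if-cached"


def may_read_from_cache(mode):
    """no-store never touches the cache; reload ignores cached responses."""
    return mode not in (CacheMode.NO_STORE, CacheMode.RELOAD)


def may_write_to_cache(mode):
    """Only no-store forbids storing the response."""
    return mode is not CacheMode.NO_STORE


def may_serve_stale(mode):
    """force-cache and only-if-cached may answer with stale responses."""
    return mode in (CacheMode.FORCE_CACHE, CacheMode.ONLY_IF_CACHED)
```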
Timothy Flynn
6b91199253 LibHTTP+LibWeb: Move Infrastructure::Request::CacheMode to LibHTTP
We will need to send this enum over IPC to RequestServer to affect the
disk cache's behavior.
2026-01-22 07:05:06 -05:00
Timothy Flynn
2ac219405f LibHTTP+LibWeb: Purge non-fresh entries from the memory cache
Once a cache entry is not fresh, we now remove it from the memory cache.
We will avoid handling revalidation from within WebContent. Instead, we
will just forward the request to RequestServer, where the disk cache
will handle revalidation for itself if needed.
2026-01-19 08:02:14 -05:00
Timothy Flynn
17d7c2b6bd LibHTTP: Allow revalidating heuristically cacheable responses
This is expected by WPT (the /fetch/http-cache/304-update.any.html test
in particular).
2026-01-19 08:02:14 -05:00
Timothy Flynn
bc1cafc716 LibHTTP+LibWebView+RequestServer: Allow using the disk cache during WPT
We currently disable the disk cache because the WPT runner will run more
than one RequestServer process at a time. The SQLite database does not
handle this concurrent read/write access well.

We will now enable the disk cache with a per-process database. This is
needed to ensure that WPT Fetch cache tests are sufficiently handled by
RequestServer.
2026-01-19 08:02:14 -05:00
Timothy Flynn
457a319cda LibHTTP: Define the DiskCache::cache_directory getter as const 2026-01-19 08:02:14 -05:00
Zaggy1024
84c0eb3dbf LibCore+LibHTTP+RequestServer: Send data via sockets instead of pipes
This brings the implementation on Unix in line with Windows, so we can
drop a few ifdefs.
2026-01-19 06:53:29 -05:00
Timothy Flynn
cb4da2c6c2 LibHTTP: Defer setting the response time until headers are received
We currently set the response time to when the cache entry writer is
created. This is more or less the same as the request start time, so
this is not correct.

This was a regression from 5384f84550.
That commit changed when the writer was created, but did not move the
setting of the response time to match.

We now set the response time to when the HTTP response headers have been
received (again), which matches how Chromium behaves:

https://source.chromium.org/chromium/chromium/src/+/refs/tags/144.0.7500.0:net/url_request/url_request_job.cc;l=425-433
2026-01-10 23:31:42 +01:00
Timothy Flynn
453764d3f0 LibHTTP: Do not respond to Range requests with cached full responses
If we have the response for a non-Range request in the memory cache, we
would previously use it in reply to Range requests. Similar to commit
878b00ae61f998a26aad7f50fae66cf969878ad6, we are just punting on Range
requests in the HTTP caches for now.
2026-01-10 09:02:41 -05:00
Timothy Flynn
b35645523c LibHTTP+LibWeb: Make memory cache debug logs consistent with disk cache
Let's also not yell.
2026-01-10 09:02:41 -05:00
Timothy Flynn
04171d42f0 LibHTTP: Prefix disk cache debug messages with "[disk]" text
A future commit will format memory cache debug messages similarly to the
disk cache messages. To make it easy to read them both at a glance when
both debug flags are turned on, let's add a prefix to these messages.
2026-01-10 09:02:41 -05:00
Timothy Flynn
0d99d54c46 LibHTTP+LibWeb: Do not cache range requests (for now)
We currently do not handle responses for range requests at all in our
HTTP caches. This means if we issue a request for a range of bytes=1-10,
that response will be served to a subsequent request for a range of
bytes=10-20. This is obviously invalid - so until we handle these
requests, just don't cache them for now.
2026-01-08 11:59:12 +01:00
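The problem in commit 0d99d54c46, and the related lookup fix in 453764d3f0, comes from a cache whose key ignores the Range header. A toy sketch of the "punt on Range requests" policy follows; the class and its URL-only key are hypothetical and far simpler than the real caches.

```python
def has_range_header(request_headers):
    """Header names are case-insensitive, so compare in lowercase."""
    return any(name.lower() == "range" for name, _value in request_headers)


class UrlKeyedCache:
    """Toy cache keyed only by URL (for illustration)."""

    def __init__(self):
        self._entries = {}

    def store(self, url, request_headers, body):
        # Never store a partial body under a key that ignores the range.
        if has_range_header(request_headers):
            return False
        self._entries[url] = body
        return True

    def lookup(self, url, request_headers):
        # Never answer a Range request with a cached full response.
        if has_range_header(request_headers):
            return None
        return self._entries.get(url)
```

Without the two guards, a response to `bytes=1-10` would be handed to a later request for `bytes=10-20`.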
Timothy Flynn
9f2ac14521 LibHTTP+RequestServer: Do not flush partial responses to the cache index
If the cURL request completes with anything other than CURLE_OK, we must
not keep the cache entry. For example, if the server's connection closes
while transferring data, we receive CURLE_PARTIAL_FILE. We don't want
this cache entry to be treated as valid in a subsequent request.
2026-01-08 11:59:12 +01:00
Sam Kravitz
bef8423f0f AK: Rename CaseInsensitiveStringTraits
To CaseInsensitiveASCIIStringTraits. This change indicates that these
traits are about ASCII-only insensitivity.
2025-12-31 10:24:42 +01:00
Timothy Flynn
9c8322d1b3 LibHTTP: Use correct cache key type in disk cache index entry storage
We also don't need to store the cache key itself in the entry struct.
2025-12-21 09:24:51 -06:00
Timothy Flynn
bf7b812d0b LibHTTP+LibWeb: Store the in-memory HTTP cache without JS realms
The in-memory HTTP Fetch cache currently keeps the realm which created
each cache entry alive indefinitely. This patch migrates this cache to
LibHTTP, to ensure it is completely unaware of any JS objects.

Now that we are not interacting with Fetch response objects, we can no
longer use Streams infrastructure to pipe the response body into the
Fetch response. Fetch also ultimately creates the cache response once
the HTTP response headers have arrived. So the LibHTTP cache will hold
entries in a pending list until we have received the entire response
body. Then it is moved to a completed list and may be used thereafter.
2025-12-21 08:59:31 -06:00
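The pending and completed lists described in commit bf7b812d0b can be pictured as below. This is a minimal sketch, assuming entries are keyed by URL and stored as plain bytes; the class and method names are hypothetical and do not come from LibHTTP.

```python
class MemoryCache:
    """Entries are usable only after the whole body has arrived."""

    def __init__(self):
        self._pending = {}
        self._completed = {}

    def create_entry(self, key, headers):
        # Called once the response headers have arrived.
        self._pending[key] = {"headers": headers, "body": bytearray()}

    def append_body(self, key, chunk):
        self._pending[key]["body"] += chunk

    def finalize_entry(self, key):
        # The full body is in: move the entry to the completed list.
        entry = self._pending.pop(key)
        self._completed[key] = (entry["headers"], bytes(entry["body"]))

    def abandon_entry(self, key):
        # A failed or cancelled request never becomes visible.
        self._pending.pop(key, None)

    def lookup(self, key):
        return self._completed.get(key)
```

Lookups consult only the completed entries, so a half-received body is never served.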
Timothy Flynn
46b3218241 LibHTTP+LibWeb: Use LibHTTP to calculate stale-while-revalidate values
No need to duplicate this in LibWeb.

In doing so, this also fixes an apparent bug for SWR handling in LibWeb.
We were previously deciding if we were in the SWR lifetime with:

    stale_while_revalidate > current_age

However, the SWR lifetime is meant to be an additional time on top of
the freshness lifetime:

    freshness_lifetime + stale_while_revalidate > current_age
2025-12-14 11:33:02 -05:00
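The two comparisons quoted in commit 46b3218241 differ in a way that is easy to see with numbers. In the sketch below the function names are hypothetical and all values are in seconds.

```python
def in_swr_lifetime_old(stale_while_revalidate, current_age):
    """The previous LibWeb check, which ignored the freshness lifetime."""
    return stale_while_revalidate > current_age


def in_swr_lifetime(freshness_lifetime, stale_while_revalidate, current_age):
    """The SWR window is extra time on top of the freshness lifetime."""
    return freshness_lifetime + stale_while_revalidate > current_age


# A response that is fresh for 600 s and allows 60 s of SWR, now 630 s old.
old_result = in_swr_lifetime_old(60, 630)
new_result = in_swr_lifetime(600, 60, 630)
```

With these numbers the old check reports the response as outside the window, while the corrected check places it 30 seconds into a 60 second window.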
Timothy Flynn
add8402536 LibHTTP+RequestServer: Implement the stale-while-revalidate directive
This directive allows our disk cache to serve stale responses for a time
indicated by the directive itself, while we revalidate the response in
the background.

Issuing requests that weren't initiated by a client is a new thing for
RequestServer. In this implementation, we associate the request with
the client that initiated the request to the stale cache entry. This
adds a "background request" mode to the Request object, to prevent us
from trying to send any of the revalidation response over IPC.
2025-12-13 13:07:02 -06:00
Timothy Flynn
8a0c8743b6 LibHTTP: Correctly hold an exclusive cache entry for revalidation
We were returning the incorrect result when upgrading a cache entry to
have exclusivity on must-revalidate requests. This could result in the
entry being read and updated at the same time, especially if the server
returned a non-304 response.
2025-12-13 13:07:02 -06:00
Timothy Flynn
0946d802bc LibHTTP+RequestServer: Mark a couple classes as final 2025-12-13 13:07:02 -06:00
Timothy Flynn
aae8574d25 LibHTTP: Place HTTP disk cache log points behind a debug flag
These log points are quite verbose. Before we enable the disk cache by
default, let's place them behind a debug flag.
2025-12-02 12:19:42 +01:00
Timothy Flynn
2453f0bc04 LibHTTP+LibWeb: Use LibHTTP's cache implementation in LibWeb
There are a couple of remaining RFC 9111 methods in LibWeb's Fetch, but
these are currently directly tied to the way we store GC-allocated HTTP
response objects. So de-coupling that is left as a future exercise.
2025-11-29 08:35:02 -05:00
Timothy Flynn
21bbbacd07 LibHTTP+RequestServer: Move the HTTP cache implementation to LibHTTP
We currently have two ongoing implementations of RFC 9111, HTTP caching.
In order to consolidate these, this patch moves the implementation from
RequestServer to LibHTTP for re-use within LibWeb.
2025-11-29 08:35:02 -05:00
Andreas Kling
949053cee7 LibHTTP: Remove unused HttpRequest functions re: basic auth 2025-11-28 08:48:33 -05:00
Timothy Flynn
9375660b64 LibHTTP+LibWeb+RequestServer: Move Fetch's HTTP header infra to LibHTTP
The end goal here is for LibHTTP to be the home of our RFC 9111 (HTTP
caching) implementation. We currently have one implementation in LibWeb
for our in-memory cache and another in RequestServer for our disk cache.

The implementations both largely revolve around interacting with HTTP
headers. But in LibWeb, we are using Fetch's header infra, and in RS we
are using our home-grown header infra from LibHTTP.

So to give these a common denominator, this patch replaces the LibHTTP
implementation with Fetch's infra. Our existing LibHTTP implementation
was not particularly compliant with any spec, so this at least gives us
a standards-based common implementation.

This migration also required moving a handful of other Fetch AOs over
to LibHTTP. (It turns out these AOs were all from the Fetch/Infra/HTTP
folder, so perhaps it makes sense for LibHTTP to be the implementation
of that entire set of facilities.)
2025-11-27 14:57:29 +01:00
Timothy Flynn
0480934afb LibHTTP+LibWeb: Remove unused HTTP::HTTPResponse
The only thing in HTTPResponse being used is reason_phrase_for_code,
which is just a static helper method. Move it to its own file and remove
HTTPResponse.

This is just one less thing to have to port to an upcoming HTTP header
refactor.
2025-11-27 14:57:29 +01:00
Timothy Flynn
426773e8cf LibHTTP: Add a method to remove a header from a HeaderMap 2025-11-02 13:03:29 -05:00
ayeteadoe
25f5936dee CMake: Rename serenity_* helper functions/macros to ladybird_* 2025-07-03 23:19:41 +02:00
Shannon Booth
3f73cd30a2 LibURL: Rename 'cannot have a base URL' to 'has an opaque path'
This follows a rename made in the URL specification.
2025-04-06 08:24:54 -04:00
rmg-x
63249ba96a LibHTTP: Add more reason phrases for 4xx response codes
https://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml
2025-03-14 01:23:52 +01:00
Jonne Ransijn
d7596a0a61 AK: Don't implicitly convert Optional<T&> to Optional<T>
C++ will jovially select the implicit conversion operator, even if it's
completely bogus, such as for unknown-size types or non-destructible
types. Therefore, all such conversions (which incur a copy) must
(unfortunately) be explicit so that non-copyable types continue to work.

NOTE: We make an exception for trivially copyable types, since they
are, well, trivially copyable.

Co-authored-by: kleines Filmröllchen <filmroellchen@serenityos.org>
2024-12-04 01:58:22 +01:00
Sam Atkins
900c131178 LibURL: Make URL::serialized_host() infallible
This can no longer fail, so update the return type to match.

This makes a few more methods now unable to return errors, but one thing
at a time. 😅
2024-11-30 12:07:39 +01:00
Pavel Shliak
caf7983039 LibHTTP: Clean up #include directives
This change aims to improve the speed of incremental builds.
2024-11-21 14:08:33 +01:00
Ali Mohammad Pur
7f72c28e78 LibHTTP: Make HeaderMap movable and copyable 2024-11-20 21:37:58 +01:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00