This just lets us create fewer cache directories during WPT. We do still
create cache entries on disk, so for WPT, we introduce an extra cache
key to prevent conflicts. There is an existing FIXME about this.
This implements the must-understand response cache directive per RFC
9111 Section 5.2.2.3. When a response contains must-understand, this
cache now ignores the no-store directive for status codes whose
caching behavior it implements. For status codes the cache does not
understand, the response is not stored.
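The rule can be sketched as follows. This is an illustrative Python sketch, not the actual implementation, and the set of understood status codes shown is a stand-in:

```python
# Hypothetical sketch of RFC 9111 Section 5.2.2.3 (must-understand).
# The status-code set is illustrative, not the cache's actual list.
UNDERSTOOD_STATUS_CODES = {200, 301, 304, 404, 410}

def may_store(status_code: int, directives: set[str]) -> bool:
    if "must-understand" in directives:
        # must-understand overrides no-store, but only for status codes
        # whose caching behavior we actually implement.
        return status_code in UNDERSTOOD_STATUS_CODES
    return "no-store" not in directives
```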
This adds a settings box to about:settings to allow users to limit the
disk cache size. This will override the default 5 GiB limit. We do not
automatically delete cache data if the new limit is smaller than the
space already in use; eviction will happen on the next request instead.
This allows the setting to be changed multiple times in a row without
thrashing the cache.
In the future, we can add more toggles, such as disabling the disk
cache altogether.
Let's not attempt to cache entries that are excessively large. We limit
an individual entry's data size to 1/8 of the total disk cache limit,
with a cap of 256 MiB.
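The per-entry cap works out to the following arithmetic (a hypothetical sketch, not the actual implementation):

```python
def max_entry_size(total_cache_limit: int) -> int:
    # A single entry may use at most 1/8 of the total disk cache
    # limit, capped at 256 MiB.
    return min(total_cache_limit // 8, 256 * 1024 * 1024)
```

With the default 5 GiB cache limit, the 256 MiB cap is what actually applies.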
Rather than letting our disk cache grow unbounded, let's impose a limit
on the estimated total disk cache size. The limits chosen are vaguely
inspired by Chromium.
We impose a total disk cache limit of 5 GiB. Chromium imposes an overall
limit of 1.25 GiB; I've chosen more here because we currently cache
uncompressed data from cURL.
The limit is further restricted by the amount of available disk space,
which we check just once at startup (as Chromium does). On systems with
limited disk space, we instead choose a percentage of the available
free space.
Our eviction errs on the side of simplicity. We will remove the least
recently accessed entries until the total estimated cache size does not
exceed our limit. This could potentially be improved in the future. For
example, if the next entry to consider is 40 MiB, and we only need to
free 1 MiB of space, we could try evicting slightly more recently used
entries. This would prevent evicting more than we need to.
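The eviction strategy can be sketched as follows (illustrative names; the real implementation operates on the SQLite-backed index):

```python
from dataclasses import dataclass

@dataclass
class Entry:
    key: str
    size: int
    last_access_time: int

def evict(entries: list[Entry], limit: int) -> list[str]:
    """Remove least recently accessed entries until the estimated
    total size fits within the limit. Returns the evicted keys."""
    total = sum(e.size for e in entries)
    evicted = []
    for entry in sorted(entries, key=lambda e: e.last_access_time):
        if total <= limit:
            break
        total -= entry.size
        evicted.append(entry.key)
    return evicted
```

Note how the sketch reproduces the limitation described above: to free 5 units of space, it will happily evict a 40-unit entry.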
The caching RFC is quite strict about the format of date strings. If we
received a revalidation attribute with an invalid date string, we would
previously fail a runtime assertion. This was because, to start a
revalidation request, we would simply check for the presence of any
revalidation header; but when we actually issued the request, we would
fail to parse the header and end up with all attributes being null.
We no longer parse the revalidation attributes at all. Whatever we
receive in the Last-Modified response header is what we send back in
the If-Modified-Since request header, verbatim. For better or worse,
this is how other browsers behave. So if the server sends us an invalid
date string, it will receive its own date format back for revalidation.
Apparently, SQLite will bind NULL for this placeholder if we do not
explicitly pass a value. The query being executed here is:
UPDATE CacheIndex
SET last_access_time = ?
WHERE cache_key = ? AND vary_key = ?;
Our previous implementation was a bit too tolerant of bad header values.
For example, extracting "max-age" from a header value of "abmax-agecd"
would previously have parsed successfully.
We now require exact (case-insensitive) directive matches. We also
handle quoted string values, which may contain significant delimiters
that we would previously have split on.
If a request failed, or was stopped, do not attempt to write the cache
entry footer to disk. Note that at this point, the cache index will not
have been created, thus this entry will not be used in the future. We do
still delete any partial file on disk.
This serves as a more general fix for the issue addressed in commit
9f2ac14521.
HeaderList::delete uses Vector::remove_all_matching internally. So if an
exempt header appeared again later in the header list, we would access
the name string of a header that had already been deleted (a
use-after-free).
We now partition the HTTP disk cache based on the Vary response header.
If a cached response contains a Vary header, we look for each of the
header names in the outgoing HTTP request. The outgoing request must
match every header value in the original request for the cache entry
to be used; otherwise, a new request will be issued, and a separate
cache entry will be created.
Note that we must now defer creating the disk cache file itself until we
have received the response headers. The Vary key is computed from these
headers, and affects the partitioned disk cache file name.
There are further optimizations we can make here. If we have a Vary
mismatch, we could find the best candidate cached response and issue a
conditional HTTP request. The content server may then respond with an
HTTP 304 if the mismatched request headers are actually okay. But for
now, if we have a Vary mismatch, we issue an unconditional request as
a purely correctness-oriented patch.
We need to store request headers in order to handle Vary mismatches.
(Note we should also be using BLOB for header storage in sqlite, as they
are not necessarily UTF-8.)
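The vary-key computation can be sketched as follows (the key format here is illustrative, not the on-disk format):

```python
def vary_key(vary_header: str, request_headers: dict[str, str]) -> str:
    """Build a cache partition key from the request header values named
    by the response's Vary header. Header names are case-insensitive,
    and a missing header is recorded as an empty value."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary_header.split(","))
    return "\x00".join(f"{n}\x01{headers.get(n, '')}" for n in names)
```

A stored entry is usable only when the new request's vary key equals the stored one; otherwise a new request is issued and a separate entry is created.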
If the cache mode is no-store, we must not interact with the cache at
all.
If the cache mode is reload, we must not use any cached response.
If the cache mode is only-if-cached or force-cache, we are permitted
to respond with stale cached responses.
Note that we currently cannot test only-if-cached in test-web. Setting
this mode also requires setting the cors mode to same-origin, but our
http-test-server infra requires setting the cors mode to cors.
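The lookup side of these modes can be sketched as follows (hypothetical; only the read path is shown, and no-store additionally prevents storing):

```python
def lookup(mode: str, entry, is_fresh: bool):
    """Illustrative cache-mode handling for a lookup. `entry` is the
    cached response, or None when there is no cache entry."""
    if mode == "no-store":
        return None            # never touch the cache at all
    if mode == "reload":
        return None            # never use a cached response
    if entry is None:
        return None
    if mode in ("only-if-cached", "force-cache"):
        return entry           # stale responses are permitted
    return entry if is_fresh else None
```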
Once a cache entry is not fresh, we now remove it from the memory cache.
We will avoid handling revalidation from within WebContent. Instead, we
will just forward the request to RequestServer, where the disk cache
will handle revalidation for itself if needed.
We currently disable the disk cache because the WPT runner will run more
than one RequestServer process at a time. The SQLite database does not
handle this concurrent read/write access well.
We will now enable the disk cache with a per-process database. This is
needed to ensure that WPT Fetch cache tests are sufficiently handled by
RequestServer.
We currently set the response time to when the cache entry writer is
created. This is more or less the same as the request start time, so
this is not correct.
This was a regression from 5384f84550.
That commit changed when the writer was created, but did not move the
setting of the response time to match.
We now set the response time to when the HTTP response headers have been
received (again), which matches how Chromium behaves:
https://source.chromium.org/chromium/chromium/src/+/refs/tags/144.0.7500.0:net/url_request/url_request_job.cc;l=425-433
If we have the response for a non-Range request in the memory cache, we
would previously use it in reply to Range requests. Similar to commit
878b00ae61f998a26aad7f50fae66cf969878ad6, we are just punting on Range
requests in the HTTP caches for now.
A future commit will format memory cache debug messages similarly to the
disk cache messages. To make it easy to read them both at a glance when
both debug flags are turned on, let's add a prefix to these messages.
We currently do not handle responses to range requests at all in our
HTTP caches. This means that if we issue a request for a range of
bytes=1-10, that response would be served to a subsequent request for a
range of bytes=10-20. This is obviously invalid, so until we handle
these requests properly, we just don't cache them for now.
If the cURL request completes with anything other than CURLE_OK, we must
not keep the cache entry. For example, if the server's connection closes
while transferring data, we receive CURLE_PARTIAL_FILE. We don't want
this cache entry to be treated as valid in a subsequent request.
The in-memory HTTP Fetch cache currently keeps the realm which created
each cache entry alive indefinitely. This patch migrates this cache to
LibHTTP, to ensure it is completely unaware of any JS objects.
Now that we are not interacting with Fetch response objects, we can no
longer use Streams infrastructure to pipe the response body into the
Fetch response. Fetch also ultimately creates the cache response once
the HTTP response headers have arrived. So the LibHTTP cache will hold
entries in a pending list until we have received the entire response
body. The entry is then moved to a completed list and may be used
thereafter.
No need to duplicate this in LibWeb.
In doing so, this also fixes an apparent bug for SWR handling in LibWeb.
We were previously deciding if we were in the SWR lifetime with:
stale_while_revalidate > current_age
However, the SWR lifetime is meant to be an additional time on top of
the freshness lifetime:
freshness_lifetime + stale_while_revalidate > current_age
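The corrected check, as a sketch (illustrative, using the variable names above):

```python
def in_swr_window(freshness_lifetime: int,
                  stale_while_revalidate: int,
                  current_age: int) -> bool:
    # The SWR window extends past the freshness lifetime, so the SWR
    # value is added to the lifetime rather than compared against the
    # current age on its own.
    return (current_age > freshness_lifetime
            and freshness_lifetime + stale_while_revalidate > current_age)
```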
This directive allows our disk cache to serve stale responses for a time
indicated by the directive itself, while we revalidate the response in
the background.
Issuing requests that weren't initiated by a client is new for
RequestServer. In this implementation, we associate the background
request with the client whose request hit the stale cache entry. This
adds a "background request" mode to the Request object, to prevent us
from trying to send any of the revalidation response over IPC.
We were returning the incorrect result when upgrading a cache entry to
have exclusivity on must-revalidate requests. This could result in the
entry being read and updated at the same time, especially if the server
returned a non-304 response.
There are a couple of remaining RFC 9111 methods in LibWeb's Fetch, but
these are currently directly tied to the way we store GC-allocated HTTP
response objects. So de-coupling that is left as a future exercise.
We currently have two parallel implementations of RFC 9111 (HTTP caching).
In order to consolidate these, this patch moves the implementation from
RequestServer to LibHTTP for re-use within LibWeb.