We currently have two in-progress implementations of RFC 9111 (HTTP
Caching). To consolidate them, this patch moves the implementation from
RequestServer to LibHTTP for reuse within LibWeb.
For example, we will want to be able to test that a cached object expires
after N seconds. Rather than waiting that long during testing, this patch
adds a testing-only request header to internally advance the clock for a
single HTTP request.
This mode allows us to test the HTTP disk cache with two mechanisms:
1. If RequestServer is launched with --http-disk-cache-mode=testing, it
will only cache requests that include an X-Ladybird-Enable-Disk-Cache
header.
2. In test mode, RequestServer will include an X-Ladybird-Disk-Cache-Status
response header indicating how the response was handled by the cache.
There is no standard way for a web request to know what happened with
respect to the disk cache, so this header fills that gap for testing.
This mode is not exposed to users.
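A minimal sketch of how these two mechanisms might look inside
RequestServer (the types and function names here are illustrative, not
the actual API):

    #include <string>
    #include <unordered_map>

    using HeaderMap = std::unordered_map<std::string, std::string>;

    enum class CacheMode { Normal, Testing };

    // In testing mode, a request is only eligible for the disk cache if
    // it explicitly opts in; normal-mode eligibility is unrestricted here
    // (RFC 9111 cacheability checks still apply elsewhere).
    bool is_eligible_for_disk_cache(CacheMode mode, HeaderMap const& request_headers)
    {
        if (mode != CacheMode::Testing)
            return true;
        return request_headers.contains("X-Ladybird-Enable-Disk-Cache");
    }

    // In testing mode, report how the cache handled the response,
    // e.g. "Hit", "Miss", or "Revalidated".
    void annotate_response(CacheMode mode, HeaderMap& response_headers, std::string const& status)
    {
        if (mode == CacheMode::Testing)
            response_headers["X-Ladybird-Disk-Cache-Status"] = status;
    }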
This commit extends is_cacheable() to allow storage of responses that
rely on heuristic cacheability, including status codes defined as
heuristically cacheable.
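For reference, RFC 9110 §15.1 lists the status codes that are defined as
heuristically cacheable. A check along these lines (a sketch, not the
actual is_cacheable() change) covers them:

    // Status codes defined as heuristically cacheable (RFC 9110 §15.1).
    static bool is_heuristically_cacheable_status(unsigned status)
    {
        switch (status) {
        case 200: case 203: case 204: case 206:
        case 300: case 301: case 308:
        case 404: case 405: case 410: case 414:
        case 501:
            return true;
        default:
            return false;
        }
    }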
We implement heuristic freshness lifetime calculation, using the
Last-Modified header as guidance, and apply it when no explicit
expiration information is present.
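A common choice, suggested by RFC 9111 §4.2.2, is to use a fraction
(typically 10%) of the interval since the Last-Modified time. A sketch of
that calculation (the function name and types are illustrative):

    #include <chrono>
    #include <optional>

    using namespace std::chrono;

    // Heuristic freshness lifetime: 10% of the time elapsed between
    // Last-Modified and the response's Date, per the RFC's suggestion.
    std::optional<seconds> heuristic_freshness_lifetime(
        std::optional<system_clock::time_point> last_modified,
        system_clock::time_point date)
    {
        if (!last_modified.has_value())
            return std::nullopt;
        return duration_cast<seconds>(date - *last_modified) / 10;
    }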
When a cached response becomes stale, we will now issue a revalidation
request (if the response indicates it may be revalidated). We do this by
issuing a normal fetch request with If-None-Match and/or If-Modified-Since
request headers.
If the server replies with an HTTP 304 status, we update the stored
response headers to match the 304's headers, and serve the response to
the client from the cache.
If the server replies with any other status code, we remove the cache
entry and, if possible, open a new cache entry to cache the new response.
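A sketch of the conditional-request side of this flow (names are
illustrative, not the actual API):

    #include <string>
    #include <unordered_map>

    using HeaderMap = std::unordered_map<std::string, std::string>;

    // Attach validators from the stored response to the outgoing
    // revalidation request.
    void add_conditional_headers(HeaderMap& request_headers, HeaderMap const& stored_headers)
    {
        if (auto it = stored_headers.find("ETag"); it != stored_headers.end())
            request_headers["If-None-Match"] = it->second;
        if (auto it = stored_headers.find("Last-Modified"); it != stored_headers.end())
            request_headers["If-Modified-Since"] = it->second;
    }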
We currently store response headers in the cache entry file, before the
response body. When we implement cache revalidation, we will need to
update the stored response headers with whatever headers are received
in a 304 response. It's not unlikely that those headers will have a size
that differs from the stored headers. We would then have to rewrite the
entire response body after the new headers.
Instead of dealing with those inefficiencies, let's instead store the
response headers in the cache index. This will allow us to update the
headers with a simple SQL query.
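With the headers in the index, the 304 merge becomes a single statement.
A sketch using the SQLite C API directly (the table and column names are
illustrative, and the actual index may sit behind a different database
layer):

    #include <sqlite3.h>
    #include <string>

    bool update_stored_headers(sqlite3* db, std::string const& key,
        std::string const& serialized_headers)
    {
        sqlite3_stmt* statement = nullptr;
        if (sqlite3_prepare_v2(db,
                "UPDATE cache_entries SET headers = ? WHERE key = ?;",
                -1, &statement, nullptr) != SQLITE_OK)
            return false;
        sqlite3_bind_text(statement, 1, serialized_headers.c_str(), -1, SQLITE_TRANSIENT);
        sqlite3_bind_text(statement, 2, key.c_str(), -1, SQLITE_TRANSIENT);
        bool ok = sqlite3_step(statement) == SQLITE_DONE;
        sqlite3_finalize(statement);
        return ok;
    }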
We previously had no protection against the same URL being requested
multiple times at once. For example, if a URL had no cache entry and was
requested twice concurrently, we would open two cache writers. Both
writers would then pipe the response to disk, leaving us with a corrupt
cache file.
We now hold back requests in certain scenarios until existing cache
entries have completed (see the sketch after this list):
* If we are opening a cache entry for reading:
- If there is an existing reader entry, carry on as normal. We can
have multiple readers.
- If there is an existing writer entry, defer the request until it is
complete.
* If we are opening a cache entry for writing:
- If there is an existing reader or writer entry, defer the request
until it is complete.
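A sketch of this decision (a minimal distillation; the real bookkeeping
lives alongside the cache entry table):

    #include <cstddef>

    enum class Access { Read, Write };

    // Returns true if the request must wait for existing cache entry
    // operations on the same cache key to complete.
    bool must_defer(Access access, std::size_t active_readers, bool has_active_writer)
    {
        if (access == Access::Read)
            return has_active_writer; // concurrent readers are fine
        return active_readers > 0 || has_active_writer; // writers are exclusive
    }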
This object will be needed in a future commit to store requests that are
waiting for other requests to finish. It is added in a separate commit
just to make that future commit less noisy.
We previously waited until we had received all response headers before
creating the cache entry. We now create one immediately and write the
headers in a separate function. This lets us know whether a cache entry
writer already exists for a given cache key, and thus prevent creating a
second writer at the same time.
This is a bit of a blunt hammer, but it hooks HTTP disk cache clearing
into the existing Clear Cache action. Upon invocation, it stops all
existing cache entries from making further progress, and then deletes
the entire cache index and all cache files.
In the future, we will of course want more fine-grained control over
cache deletion, e.g. via an about:history page.
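A sketch of the blunt-hammer version (the path handling and the abort
step are illustrative, and this assumes the index lives in the cache
directory):

    #include <filesystem>

    void clear_disk_cache(std::filesystem::path const& cache_directory)
    {
        // First, abort all in-flight cache readers and writers (not
        // shown), so nothing is left streaming into or out of the
        // directory. Then drop the index and every cache file in one
        // sweep.
        std::filesystem::remove_all(cache_directory);
    }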
This adds a disk cache for HTTP responses received from the network. For
now, we take a rather conservative approach to caching. We don't cache a
response until we're 100% sure it is cacheable (there are heuristics we
can implement in the future based on the absence of specific headers).
The cache is broken into two categories of files:
1. An index file. This is a SQL database containing metadata about each
cache entry (URL, timestamps, etc.).
2. Cache files. Each cached response is in its own file. The file is an
amalgamation of all info needed to reconstruct an HTTP response. This
includes the status code, headers, body, etc.
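A sketch of what writing a cache file's prelude might look like (the
on-disk format is an implementation detail and may differ):

    #include <cstdint>
    #include <ostream>
    #include <string>
    #include <utility>
    #include <vector>

    using Header = std::pair<std::string, std::string>;

    // Write the status code and headers ahead of the body. The body is
    // streamed into the same file afterwards.
    void write_prelude(std::ostream& out, std::uint32_t status,
        std::vector<Header> const& headers)
    {
        out.write(reinterpret_cast<char const*>(&status), sizeof(status));
        auto count = static_cast<std::uint32_t>(headers.size());
        out.write(reinterpret_cast<char const*>(&count), sizeof(count));
        for (auto const& [name, value] : headers)
            out << name << '\0' << value << '\0';
    }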
A cache entry is created once we receive the headers for a response. The
index, however, is not updated at this point. We stream the body into
the cache entry as it is received. Once we've successfully cached the
entire body, we create an index entry in the database. If any of these
steps fails along the way, the cache entry is removed and the index is
left untouched.
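In outline, the write path looks like this (names are illustrative):

    #include <cstddef>

    class CacheEntryWriter {
    public:
        void on_headers_received()
        {
            // Open the cache file and write the prelude. No index row is
            // inserted yet, so readers cannot see this partial entry.
        }

        void on_body_chunk([[maybe_unused]] char const* data, [[maybe_unused]] std::size_t size)
        {
            // Stream the chunk straight into the cache file.
        }

        void on_complete()
        {
            // Only now insert the index row, making the entry visible.
        }

        void on_error()
        {
            // Delete the partial cache file; the index was never touched.
        }
    };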
Subsequent requests are checked against the index for cache hits. If a
hit is found, we read just enough of the cache entry to inform WebContent
of the status code and headers. The body of the response is piped to WC
via syscalls, so the transfer happens entirely in the kernel; there is no
need to allocate memory for the body in userspace (WC still allocates a
buffer to hold the data, of course). If an error occurs while piping the
body, we currently fail the request; there is a FIXME to fall back to a
network request instead.
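On Linux, one way to achieve this kernel-only transfer is sendfile(2);
a sketch (the actual mechanism in RequestServer may differ):

    #include <sys/sendfile.h>
    #include <sys/types.h>
    #include <cstddef>

    // Copy `remaining` bytes from the cache file to the WebContent-bound
    // socket without the data ever entering userspace.
    bool pipe_body(int cache_fd, int socket_fd, off_t offset, std::size_t remaining)
    {
        while (remaining > 0) {
            ssize_t n = sendfile(socket_fd, cache_fd, &offset, remaining);
            if (n <= 0)
                return false; // caller errors out the request (see FIXME above)
            remaining -= static_cast<std::size_t>(n);
        }
        return true;
    }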
Cache hits are also validated for freshness before they are used. If a
response has expired, we remove it and its index entry, and proceed with
a network request.
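The freshness test itself is the standard RFC 9111 comparison: a stored
response is fresh while its current age is below its freshness lifetime.

    #include <chrono>

    // RFC 9111 §4.2: response_is_fresh = (freshness_lifetime > current_age).
    bool is_fresh(std::chrono::seconds freshness_lifetime, std::chrono::seconds current_age)
    {
        return freshness_lifetime > current_age;
    }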