ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2026-04-18 18:00:31 +00:00

Author	SHA1	Message	Date
Andreas Kling	34d954e2d7	LibRegex: Add ECMAScriptRegex and migrate callers Add `ECMAScriptRegex`, LibRegex's C++ facade for ECMAScript regexes. The facade owns compilation, execution, captures, named groups, and error translation for the Rust backend, which lets callers stop depending on the legacy parser and matcher types directly. Use it in the remaining non-LibJS callers: URLPattern, HTML input pattern handling, and the places in LibHTTP that only needed token validation. Where a full regex engine was unnecessary, replace those call sites with direct character checks. Also update focused LibURL, LibHTTP, and WPT coverage for the migrated callers and corrected surrogate handling.	2026-03-27 17:32:19 +01:00
Shannon Booth	d76330645a	LibHTTP: Ignore empty list elements when extracting token headers It turns out that the validation of header values in `db5f16f042` was a bit overaggressive. extract_token_headers previously treated empty list elements (empty or whitespace-only after trimming) as parse failures. This is incorrect per RFC 9110, which specifies that recipients must ignore empty list elements in comma-separated header values. > A recipient MUST parse and ignore a reasonable number of empty list elements	2026-03-16 13:55:26 +01:00
Shannon Booth	db5f16f042	LibHTTP: Parse token-list headers according to their ABNF The previous implementation did not fully align with each header's ABNF, so it would not reject some headers that we should have been rejecting. Fixes 6 WPT subtests for https://wpt.live/cors/access-control-expose-headers-parsing.window.html	2026-03-01 18:16:16 +00:00
Timothy Flynn	ef134c940e	LibHTTP: Correctly normalize header whitespace in cache utilities We also shouldn't trim whitespace at all when reading headers from the cache index. We store them as-is and should therefore read them as-is.	2026-02-26 22:27:46 +01:00
Timothy Flynn	0652a33043	LibHTTP: Return a StringView from HTTP::normalize_header_value This lets callers that do not need a string avoid a needless allocation. All callers that do need a string will already either: * Turn it into a ByteString themselves * Pass this along to the isomorphic encoder	2026-02-26 22:27:46 +01:00
Timothy Flynn	3b5c5f68bb	LibHTTP: Use IdentityHashTraits for HashMaps keyed by the cache key The cache key itself is already an integral export of a SHA-1 hash of some request fields. We don't need to hash it again for these maps.	2026-02-24 15:10:59 +01:00
Ben Wiederhake	2e51182560	LibHTTP: Remove unused header in HttpRequest	2026-02-23 12:15:23 +01:00
Ben Wiederhake	7ad95c78af	LibHTTP: Remove unused header in HeaderList	2026-02-23 12:15:23 +01:00
Ben Wiederhake	2a369a2a26	LibHTTP: Remove unused header in ParsedCookie	2026-02-23 12:15:23 +01:00
Shannon Booth	2e3c59f791	LibHTTP: Simplify serializing a URL for cache storage The serialization function already has a flag to skip the fragment or not.	2026-02-18 12:52:19 -05:00
Timothy Flynn	bda0820b8b	LibHTTP: Use a memory-backed database for the disk cache in test modes This just lets us create fewer cache directories during WPT. We do still create cache entries on disk, so for WPT, we introduce an extra cache key to prevent conflicts. There is an existing FIXME about this.	2026-02-15 15:25:30 -05:00
Praise-Garfield	3e719be607	LibHTTP: Handle InvalidURL in parse_error_to_string The ParseError::InvalidURL variant is returned by from_raw_request() when a query string cannot be converted to a valid String. However, parse_error_to_string() does not handle this variant, causing it to fall through to VERIFY_NOT_REACHED() and crash. This adds the missing case.	2026-02-14 14:35:01 -05:00
Praise-Garfield	9b8e341828	LibHTTP: Implement the must-understand cache directive This implements the must-understand response cache directive per RFC 9111 Section 5.2.2.3. When a response contains must-understand, this cache now ignores the no-store directive for status codes whose caching behavior it implements. For status codes the cache does not understand, the response is not stored.	2026-02-14 14:34:34 -05:00
Shannon Booth	d3624c328f	LibDatabase: Allow creating a memory backed database	2026-02-14 10:25:33 -05:00
Timothy Flynn	7d60d0bfb7	LibHTTP+LibWebView+RequestServer: Allow users to set disk cache limits This adds a settings box to about:settings to allow users to limit the disk cache size. This will override the default 5 GiB limit. We do not automatically delete cache data if the new limit is suddenly less than the used disk space; this will happen on the next request. This allows multiple changes to the settings in a row without thrashing the cache. In the future, we can add more toggles, such as disabling the disk cache altogether.	2026-02-13 10:20:52 -05:00
Timothy Flynn	16fb2ea3b7	LibHTTP: Impose a limit on singular disk cache entry sizes Let's not attempt to cache entries that are excessively large. We limit the cache data size to be 1/8 of the total disk cache limit, with a cap of 256 MiB.	2026-02-13 10:20:52 -05:00
Timothy Flynn	d773ba25cf	LibHTTP: Impose a limit on the total disk cache size Rather than letting our disk cache grow unbounded, let's impose a limit on the estimated total disk cache size. The limits chosen are vaguely inspired by Chromium. We impose a total disk cache limit of 5 GiB. Chromium imposes an overall limit of 1.25 GiB; I've chosen more here because we currently cache uncompressed data from cURL. The limit is further restricted by the amount of available disk space, which we just check once at startup (as does Chromium). We will choose a percentage of the free space available on systems with limited space. Our eviction errs on the side of simplicity. We will remove the least recently accessed entries until the total estimated cache size does not exceed our limit. This could potentially be improved in the future. For example, if the next entry to consider is 40 MiB, and we only need to free 1 MiB of space, we could try evicting slightly more recently used entries. This would prevent evicting more than we need to.	2026-02-13 10:20:52 -05:00
Timothy Flynn	5f2063d5d9	LibHTTP: Include request header length in the estimated disk cache size Request headers were added in `36a826815d`, but this estimation was not updated.	2026-02-13 10:20:52 -05:00
Praise-Garfield	b270b2cacb	LibHTTP: Fix inverted Content-Range complete-length parsing parse_single_byte_content_range_as_values() has the condition on consume_specific('*') inverted. When the complete-length is a numeric value like "1000", the negated check causes the wildcard branch to run, discarding the length. When it is "*" (unknown), the else branch tries to parse digits after consuming the "*", which fails entirely. Removing the "!" fixes both cases so that "*" correctly produces an empty complete_length, and numeric values are parsed normally. Also adds an EOF check after parsing to reject trailing garbage, matching the pattern used by parse_single_range_header_value().	2026-02-13 09:39:49 +01:00
Timothy Flynn	d97a3d9b5a	LibHTTP+RequestServer: Send revalidation attributes without parsing The caching RFC is quite strict about the format of date strings. If we received a revalidation attribute with an invalid date string, we would previously fail a runtime assertion. This was because to start a revalidation request, we would simply check for the presence of any revalidation header; but then when we issued the request, we would fail to parse the header, and end up with all attributes being null. We now don't parse the revalidation attributes at all. Whatever we receive in the Last-Modified response header is what we will send in the If-Modified-Since request header, verbatim. For better or worse, this is how other browsers behave. So if the server sends us an invalid date string, it can receive its own date format for revalidation.	2026-02-10 09:09:53 -05:00
Timothy Flynn	d75aee2a56	LibHTTP+LibWeb: Move the IncludeCredentials enum to LibHTTP This will be sent over IPC to RequestServer in an upcoming patch.	2026-02-10 12:21:20 +01:00
Timothy Flynn	8b10a3a39e	LibHTTP: Clean up old cookie code a bit * Transfer cookie-related enums over IPC as u8, rather than an int * Use AK's safe ASCII ctype alternatives * Use SCREAMING_CASE for constants	2026-02-10 12:21:20 +01:00
Timothy Flynn	8d97389038	LibHTTP+Everywhere: Move the cookie implementation to LibHTTP This will allow parsing cookies outside of LibWeb. LibHTTP is basically becoming the home of HTTP WG specs.	2026-02-10 12:21:20 +01:00
Timothy Flynn	918f6a4c9f	LibHTTP: Ensure we use the Vary key when updating last access time Apparently, sqlite will fill this placeholder value in with NULL if we do not pass a value. The query being executed here is: UPDATE CacheIndex SET last_access_time = ? WHERE cache_key = ? AND vary_key = ?;	2026-02-06 16:24:49 +01:00
Zaggy1024	4eb310cd3f	LibWeb: Skip range requests for media if the server won't accept them Currently, this just respects the reported value from Accept-Ranges, but we could also just try sending a range request and see if the server rejects it, then fall back to a normal request after. For now, this is fine, and we can make it use a fallback later if needed.	2026-01-29 05:22:27 -06:00
Zaggy1024	99020b50a3	LibHTTP: Implement extraction of the Content-Range values in HeaderList	2026-01-29 05:22:27 -06:00
Timothy Flynn	4585734696	LibHTTP: Honor the min-fresh Cache-Control request directive	2026-01-28 11:31:04 -05:00
Timothy Flynn	896ecb28ab	LibHTTP: Honor the max-stale Cache-Control request directive	2026-01-28 11:31:04 -05:00
Timothy Flynn	4a728b1f29	LibHTTP: Honor the max-age Cache-Control request directive	2026-01-28 11:31:04 -05:00
Timothy Flynn	26ddd0a904	LibHTTP: Honor the no-cache Cache-Control request directive	2026-01-28 11:31:04 -05:00
Timothy Flynn	cb1ad8a904	LibHTTP: Honor the no-store Cache-Control request directive	2026-01-28 11:31:04 -05:00
Timothy Flynn	2918537596	LibHTTP: Add helper to extract a duration from a Cache-Control directive	2026-01-28 11:31:04 -05:00
Timothy Flynn	40800fd91e	LibHTTP: Implement a strict method to extract Cache-Control directives Our previous implementation was a bit too tolerant of bad header values. For example, extracting a "max-age" from a header value of "abmax-agecd" would have incorrectly parsed successfully. We now find exact (case-insensitive) directive matches. We also handle quoted string values, which may contain important delimiters that we would have previously split on.	2026-01-28 11:31:04 -05:00
Timothy Flynn	54c2ecedca	RequestServer: Do not flush the disk cache for unsuccessful requests If a request failed, or was stopped, do not attempt to write the cache entry footer to disk. Note that at this point, the cache index will not have been created, thus this entry will not be used in the future. We do still delete any partial file on disk. This serves as a more general fix for the issue addressed in commit `9f2ac14521`.	2026-01-23 14:24:20 +01:00
Timothy Flynn	12552f0d72	LibHTTP: Avoid UAF while deleting exempt cache headers HeaderList::delete involves a Vector::remove_all_matching internally. So if an exempt header appeared again later in the header list, we would be accessing the name string of the previously deleted header.	2026-01-22 13:18:29 -05:00
Timothy Flynn	d3041dc054	LibHTTP+LibWeb: Support the HTTP Vary response header We now partition the HTTP disk cache based on the Vary response header. If a cached response contains a Vary header, we look for each of the header names in the outgoing HTTP request. The outgoing request must match every header value in the original request for the cache entry to be used; otherwise, a new request will be issued, and a separate cache entry will be created. Note that we must now defer creating the disk cache file itself until we have received the response headers. The Vary key is computed from these headers, and affects the partitioned disk cache file name. There are further optimizations we can make here. If we have a Vary mismatch, we could find the best candidate cached response and issue a conditional HTTP request. The content server may then respond with an HTTP 304 if the mismatched request headers are actually okay. But for now, if we have a Vary mismatch, we issue an unconditional request as a purely correctness-oriented patch.	2026-01-22 08:54:49 -05:00
Timothy Flynn	36a826815d	LibHTTP+LibWeb+RequestServer: Store request headers in the HTTP caches We need to store request headers in order to handle Vary mismatches. (Note we should also be using BLOB for header storage in sqlite, as they are not necessarily UTF-8.)	2026-01-22 08:54:49 -05:00
Timothy Flynn	24da225b3b	LibHTTP: Update disk cache entry format comment with latest format In commit `20cd19be4d`, HTTP headers were moved from the cache entry to the cache index, but this comment was not updated.	2026-01-22 08:54:49 -05:00
Timothy Flynn	aa1517b727	LibHTTP+LibWeb+RequestServer: Handle the Fetch API's cache mode If the cache mode is no-store, we must not interact with the cache at all. If the cache mode is reload, we must not use any cached response. If the cache-mode is only-if-cached or force-cache, we are permitted to respond with stale cache responses. Note that we currently cannot test only-if-cached in test-web. Setting this mode also requires setting the cors mode to same-origin, but our http-test-server infra requires setting the cors mode to cors.	2026-01-22 07:05:06 -05:00
Timothy Flynn	6b91199253	LibHTTP+LibWeb: Move Infrastructure::Request::CacheMode to LibHTTP We will need to send this enum over IPC to RequestServer to affect the disk cache's behavior.	2026-01-22 07:05:06 -05:00
Timothy Flynn	2ac219405f	LibHTTP+LibWeb: Purge non-fresh entries from the memory cache Once a cache entry is not fresh, we now remove it from the memory cache. We will avoid handling revalidation from within WebContent. Instead, we will just forward the request to RequestServer, where the disk cache will handle revalidation for itself if needed.	2026-01-19 08:02:14 -05:00
Timothy Flynn	17d7c2b6bd	LibHTTP: Allow revalidating heuristically cacheable responses This is expected by WPT (the /fetch/http-cache/304-update.any.html test in particular).	2026-01-19 08:02:14 -05:00
Timothy Flynn	bc1cafc716	LibHTTP+LibWebView+RequestServer: Allow using the disk cache during WPT We currently disable the disk cache because the WPT runner will run more than one RequestServer process at a time. The SQLite database does not handle this concurrent read/write access well. We will now enable the disk cache with a per-process database. This is needed to ensure that WPT Fetch cache tests are sufficiently handled by RequestServer.	2026-01-19 08:02:14 -05:00
Timothy Flynn	457a319cda	LibHTTP: Define the DiskCache::cache_directory getter as const	2026-01-19 08:02:14 -05:00
Zaggy1024	84c0eb3dbf	LibCore+LibHTTP+RequestServer: Send data via sockets instead of pipes This brings the implementation on Unix in line with Windows, so we can drop a few ifdefs.	2026-01-19 06:53:29 -05:00
Timothy Flynn	cb4da2c6c2	LibHTTP: Defer setting the response time until headers are received We currently set the response time to when the cache entry writer is created. This is more or less the same as the request start time, so this is not correct. This was a regression from `5384f84550`. That commit changed when the writer was created, but did not move the setting of the response time to match. We now set the response time to when the HTTP response headers have been received (again), which matches how Chromium behaves: https://source.chromium.org/chromium/chromium/src/+/refs/tags/144.0.7500.0:net/url_request/url_request_job.cc;l=425-433	2026-01-10 23:31:42 +01:00
Timothy Flynn	453764d3f0	LibHTTP: Do not respond to Range requests with cached full responses If we have the response for a non-Range request in the memory cache, we would previously use it in reply to Range requests. Similar to commit 878b00ae61f998a26aad7f50fae66cf969878ad6, we are just punting on Range requests in the HTTP caches for now.	2026-01-10 09:02:41 -05:00
Timothy Flynn	b35645523c	LibHTTP+LibWeb: Make memory cache debug logs consistent with disk cache Let's also not yell.	2026-01-10 09:02:41 -05:00
Timothy Flynn	04171d42f0	LibHTTP: Prefix disk cache debug messages with "[disk]" text A future commit will format memory cache debug messages similarly to the disk cache messages. To make it easy to read them both at a glance when both debug flags are turned on, let's add a prefix to these messages.	2026-01-10 09:02:41 -05:00
Timothy Flynn	0d99d54c46	LibHTTP+LibWeb: Do not cache range requests (for now) We currently do not handle responses for range requests at all in our HTTP caches. This means if we issue a request for a range of bytes=1-10, that response will be served to a subsequent request for a range of bytes=10-20. This is obviously invalid - so until we handle these requests, just don't cache them for now.	2026-01-08 11:59:12 +01:00
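
The strict Cache-Control extraction described in commit `40800fd91e` can be illustrated with a minimal standalone sketch. This is not LibHTTP's code: `extract_directive`, `trim`, and `equals_ignoring_ascii_case` are hypothetical names, the sketch uses the C++ standard library rather than AK types, and it assumes that a quoted value is returned with its surrounding quotes removed but without unescaping. It only shows the two properties the commit message names: exact, case-insensitive directive matching, and not splitting on commas inside quoted strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: compares two strings, ignoring ASCII case.
static bool equals_ignoring_ascii_case(std::string_view a, std::string_view b)
{
    return a.size() == b.size()
        && std::equal(a.begin(), a.end(), b.begin(), [](unsigned char x, unsigned char y) {
               return std::tolower(x) == std::tolower(y);
           });
}

// Hypothetical helper: strips leading and trailing spaces and tabs.
static std::string_view trim(std::string_view s)
{
    while (!s.empty() && (s.front() == ' ' || s.front() == '\t'))
        s.remove_prefix(1);
    while (!s.empty() && (s.back() == ' ' || s.back() == '\t'))
        s.remove_suffix(1);
    return s;
}

// Hypothetical function, not LibHTTP's API. Returns the directive's value,
// an empty string if the directive is present without a value, or
// std::nullopt if the directive does not appear as an exact name.
std::optional<std::string> extract_directive(std::string_view header, std::string_view directive)
{
    std::size_t start = 0;
    while (start <= header.size()) {
        // Find the end of this list element, ignoring commas inside quoted strings.
        bool in_quotes = false;
        std::size_t end = start;
        for (; end < header.size(); ++end) {
            char c = header[end];
            if (in_quotes && c == '\\' && end + 1 < header.size())
                ++end; // Skip the escaped character.
            else if (c == '"')
                in_quotes = !in_quotes;
            else if (c == ',' && !in_quotes)
                break;
        }

        auto element = trim(header.substr(start, end - start));
        auto equals = element.find('=');
        auto name = trim(element.substr(0, equals));

        // Match the whole directive name, never a substring of it.
        if (equals_ignoring_ascii_case(name, directive)) {
            if (equals == std::string_view::npos)
                return std::string {};
            auto value = trim(element.substr(equals + 1));
            if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
                value = value.substr(1, value.size() - 2);
            return std::string { value };
        }

        start = end + 1;
    }
    return std::nullopt;
}
```

With this sketch, looking up "max-age" in "abmax-agecd" finds nothing, while looking it up in `private="a, b", max-age=5` skips the comma inside the quoted string and finds the value "5".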

110 commits