browsertrix-crawler/src
Ilya Kreymer 178b10a37f
remove early serialization which may result in missing WARC-Protocol and security metadata (#844)
- drop early serialization in handleFetchResponse(), which can write the
WARC record too early, before the WARC-Protocol and other data are
available. (It was added previously for requests loaded via browser
context / service worker that did not get a 'loadingFinished' message,
but these are now still closed in awaitPageResources().)
- don't log the 'skipping URL from unknown frame' warning: it is often
spurious, because the frame can be added in a subsequent message and the
response is *not* skipped.
2025-05-29 08:33:30 -07:00
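
The idea behind the first bullet can be illustrated with a minimal sketch. This is not the actual browsertrix-crawler code: `DeferredWriter`, `PendingRecord`, `onExtraInfo`, `flushAll`, and the `Hypothetical-Security-State` header are names invented for illustration, and the real recorder is considerably more involved. The sketch only assumes that the response body arrives first, that protocol and security metadata may arrive later, and that a final pass (the role the commit message gives to awaitPageResources()) closes any record that never saw a finish event.

```typescript
// Hypothetical record shape; the real crawler's types differ.
type PendingRecord = {
  url: string;
  body: string;
  protocol?: string; // may arrive after the body
  securityState?: string; // may arrive after the body
};

// Build a simplified, WARC-like text record from whatever is known so far.
function serialize(rec: PendingRecord): string {
  const headers = [`WARC-Target-URI: ${rec.url}`];
  if (rec.protocol) headers.push(`WARC-Protocol: ${rec.protocol}`);
  if (rec.securityState) {
    // Header name is made up for this sketch.
    headers.push(`Hypothetical-Security-State: ${rec.securityState}`);
  }
  return headers.join("\r\n") + "\r\n\r\n" + rec.body;
}

class DeferredWriter {
  private pending: Record<string, PendingRecord> = {};
  readonly output: string[] = [];

  // Body fetched: remember it, but do NOT serialize yet.
  onFetchResponse(id: string, url: string, body: string): void {
    this.pending[id] = { url, body };
  }

  // Late metadata: attach it to the still-pending record.
  onExtraInfo(id: string, protocol: string, securityState: string): void {
    const rec = this.pending[id];
    if (rec) {
      rec.protocol = protocol;
      rec.securityState = securityState;
    }
  }

  // Request finished: serialize now, with all metadata seen so far.
  finish(id: string): void {
    const rec = this.pending[id];
    if (!rec) return;
    this.output.push(serialize(rec));
    delete this.pending[id];
  }

  // End of page: close records that never received a finish event.
  flushAll(): void {
    for (const id of Object.keys(this.pending)) this.finish(id);
  }
}
```

Serializing inside `onFetchResponse` instead would freeze the record before `onExtraInfo` has run, which is the failure mode the commit message describes.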
| Name | Last commit | Date |
| --- | --- | --- |
| util | remove early serialization which may result in missing WARC-Protocol and security metadata (#844) | 2025-05-29 08:33:30 -07:00 |
| crawler.ts | tmpdir: use os.tmpdir() instead of hardcoded '/tmp' (#842) | 2025-05-28 12:48:06 -07:00 |
| create-login-profile.ts | Make sure all exit calls use ExitCodes enum (#767) | 2025-02-11 12:04:38 -08:00 |
| main.ts | Add more exit codes to detect interruption reason (#764) | 2025-02-10 14:00:55 -08:00 |
| replaycrawler.ts | Remove extra console.log statements (#811) | 2025-04-02 09:25:11 -07:00 |