browsertrix-crawler/src/util
Ilya Kreymer 178b10a37f
remove early serialization which may result in missing WARC-Protocol and security metadata (#844)
- drop early serialization in handleFetchResponse(), which can result in
writing the WARC record too early, before the WARC-Protocol and other data
are available. (It was added previously for requests loaded via browser context /
service worker that did not get a 'loadingFinished' message, but
these will now still be closed in awaitPageResources().)
- don't log the 'skipping URL from unknown frame' warning, as it is often
spurious: the frame can be added in a subsequent message, and the response is
*not* skipped.
2025-05-29 08:33:30 -07:00
argParser.ts lang code fixes: (#834) 2025-05-12 16:06:29 -07:00
blockrules.ts Retry support and additional fixes (#743) 2025-01-25 22:55:49 -08:00
browser.ts remove early serialization which may result in missing WARC-Protocol and security metadata (#844) 2025-05-29 08:33:30 -07:00
constants.ts support pause interrupt: (#825) 2025-05-05 10:10:08 -07:00
file_reader.ts Remove hardcoded /tmp prefix from path (#843) 2025-05-28 15:46:19 -07:00
flowbehavior.ts Support for behaviors from 'recorder flow' JSON created in devtools (#818) 2025-04-09 12:24:29 +02:00
healthcheck.ts Add more exit codes to detect interruption reason (#764) 2025-02-10 14:00:55 -08:00
logger.ts Add option to push behavior + behavior script logs to Redis (#805) 2025-04-03 15:46:10 -07:00
originoverride.ts Retry support and additional fixes (#743) 2025-01-25 22:55:49 -08:00
proxy.ts Strip credentials from proxy address in crawl logs (#778) 2025-02-26 15:23:38 -05:00
recorder.ts remove early serialization which may result in missing WARC-Protocol and security metadata (#844) 2025-05-29 08:33:30 -07:00
redis.ts Add Prettier to the repo, and format all the files! (#428) 2023-11-09 16:11:11 -08:00
replayserver.ts deps update: update webrecorder dependencies (#810) 2025-04-01 22:11:56 -07:00
reqresp.ts Add WARC-Protocol header (#715) 2025-05-19 18:59:52 -07:00
screencaster.ts Remove extra console.log statements (#811) 2025-04-02 09:25:11 -07:00
screenshots.ts Implemented option for FullPage screenshot after the behaviours have run (#656) 2024-11-23 21:26:55 -08:00
seeds.ts Apply exclusions to redirects (#745) 2025-01-28 11:28:23 -08:00
sitemapper.ts Remove extra console.log statements (#811) 2025-04-02 09:25:11 -07:00
state.ts optimization: normalize dedup status: treat 0 (response code not yet known) or 206 as 200… (#835) 2025-05-28 15:46:40 -07:00
storage.ts exit code cleanup (#753) 2025-02-06 17:54:51 -08:00
textextract.ts Better logging of all queue WARCWriter operations (#536) 2024-04-12 14:31:07 -07:00
timing.ts link extraction promise cleanup: (#701) 2024-10-11 00:11:24 -07:00
wacz.ts Streaming in-place WACZ creation + CDXJ indexing (#673) 2024-08-29 13:21:20 -07:00
warcwriter.ts useSHA1 Parameter for generating SHA1 record hashes (#532) (#812) 2025-04-02 17:10:50 -07:00
worker.ts browser crash handling, follow-up to #808: (#813) 2025-04-03 16:10:54 -07:00