mirror of
https://github.com/webrecorder/browsertrix-crawler.git
synced 2025-10-19 06:23:16 +00:00
* interrupts: simplify interrupt behavior:
  - SIGTERM/SIGINT behave the same way and trigger a graceful shutdown after page load
* improvements of remote state / parallel crawlers (for browsertrix-cloud):
  - SIGUSR1 before SIGINT/SIGTERM ensures data is saved and marks the crawler as done (for use with gracefully stopping a crawl)
  - SIGUSR2 before SIGINT/SIGTERM ensures data is saved but does not mark the crawler as done (for use with scaling down a single crawler)
* scope check: check the scope of each URL retrieved from the queue (in case scoping rules changed); URLs matching a seed are automatically in scope

| File |
|---|
| argParser.js |
| blockrules.js |
| browser.js |
| constants.js |
| redis.js |
| screencaster.js |
| seeds.js |
| state.js |
| storage.js |
| textextract.js |
| windowconcur.js |
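
The signal behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the crawler's actual code: the `InterruptState` class, its fields, the returned action strings, and `installHandlers` are all hypothetical names. The commit message does not say what happens if both SIGUSR1 and SIGUSR2 arrive, so the sketch simply lets the later one win.

```javascript
// Hypothetical sketch: none of these names come from the crawler source.
class InterruptState {
  constructor() {
    this.saveData = false; // set by SIGUSR1 or SIGUSR2
    this.markDone = false; // set only by SIGUSR1
    this.stopping = false; // set by SIGINT or SIGTERM
  }

  // Record a signal and return a label for what the crawler should do next.
  handle(signal) {
    switch (signal) {
      case "SIGUSR1":
        // Graceful stop of the whole crawl: save data, mark crawler as done.
        this.saveData = true;
        this.markDone = true;
        return "prepared";
      case "SIGUSR2":
        // Scaling down one crawler: save data, do not mark crawler as done.
        // Assumption: a later SIGUSR2 overrides an earlier SIGUSR1.
        this.saveData = true;
        this.markDone = false;
        return "prepared";
      case "SIGINT":
      case "SIGTERM":
        // Both behave the same way. A crawl loop would check `stopping`
        // once the current page load has finished.
        this.stopping = true;
        return "graceful-shutdown";
      default:
        return "ignored";
    }
  }
}

// Wire the state to any emitter that fires signal-named events,
// such as Node's global `process` object.
function installHandlers(emitter, state) {
  for (const signal of ["SIGUSR1", "SIGUSR2", "SIGINT", "SIGTERM"]) {
    emitter.on(signal, () => state.handle(signal));
  }
  return state;
}
```

In a real process this would be wired up with `installHandlers(process, new InterruptState())`. Note that Node.js reserves SIGUSR1 for starting its debugger, so installing a listener for it may interfere with debugging.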