browsertrix-crawler/util
Ilya Kreymer 65933c6b12
Interrupt Handling Fixes (#167)
* interrupts: simplify interrupt behavior:
- SIGTERM/SIGINT behave the same way: both trigger a graceful shutdown after page load

improvements to remote state / parallel crawlers (for browsertrix-cloud):
- SIGUSR1 before SIGINT/SIGTERM ensures data is saved and marks the crawler as done; for use when gracefully stopping a crawl
- SIGUSR2 before SIGINT/SIGTERM ensures data is saved but does not mark the crawler as done; for use when scaling down a single crawler
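
The signal behavior described above could be wired up roughly as follows. This is only a sketch, not the crawler's actual code: `createInterruptState`, the `state` fields, and the injectable `proc` parameter are hypothetical names chosen for illustration. Only `process.on(<signal>, handler)` is a real Node.js API.

```javascript
// Hypothetical sketch of the signal semantics in the commit message.
// `proc` defaults to the real Node.js `process`, but any object with an
// `on(signal, handler)` method can be passed (useful for testing).
function createInterruptState(proc = process) {
  const state = {
    interrupted: false, // set by SIGINT/SIGTERM: stop after the page load
    saveState: false,   // set by SIGUSR1/SIGUSR2: make sure data is saved
    markDone: false,    // set by SIGUSR1 only: mark this crawler as done
  };

  // SIGUSR1: save data and mark the crawler as done (graceful crawl stop).
  proc.on("SIGUSR1", () => {
    state.saveState = true;
    state.markDone = true;
  });

  // SIGUSR2: save data, but do not mark as done (scaling down one crawler).
  proc.on("SIGUSR2", () => {
    state.saveState = true;
    state.markDone = false;
  });

  // SIGINT and SIGTERM behave the same way: request a graceful shutdown.
  const onInterrupt = () => {
    state.interrupted = true;
  };
  proc.on("SIGINT", onInterrupt);
  proc.on("SIGTERM", onInterrupt);

  return state;
}
```

A crawl loop would then check `state.interrupted` after each page load and consult `state.saveState` and `state.markDone` before exiting. Note that Node.js reserves SIGUSR1 for starting the debugger, so installing a listener for it may interfere with that feature.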

* scope check: re-check the scope of each URL retrieved from the queue (in case scoping rules changed); URLs matching a seed are automatically in scope
2022-09-20 17:09:52 -07:00
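
The scope check mentioned in the commit message can be illustrated with a small sketch. Everything here is an assumption made for illustration: `isInScope`, `nextInScopeUrl`, the plain-array queue, and the regex-based include rules are hypothetical and do not reflect the actual code in `seeds.js` or `state.js`.

```javascript
// Hypothetical helper: decide whether a queued URL is still in scope.
// `seedUrls` is an array of seed URL strings; `includeRules` is an array
// of RegExp objects standing in for the current scoping rules.
function isInScope(url, seedUrls, includeRules) {
  // A URL that matches a seed is automatically in scope.
  if (seedUrls.includes(url)) {
    return true;
  }
  return includeRules.some((rule) => rule.test(url));
}

// Hypothetical dequeue step: re-validate each URL as it leaves the queue,
// since the scoping rules may have changed after it was queued.
function nextInScopeUrl(queue, seedUrls, includeRules) {
  while (queue.length > 0) {
    const url = queue.shift();
    if (isInScope(url, seedUrls, includeRules)) {
      return url;
    }
    // Out-of-scope URLs are dropped and the next entry is tried.
  }
  return null; // queue exhausted
}
```

Checking at dequeue time rather than only at enqueue time means URLs queued under older rules are filtered against the rules currently in effect.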
argParser.js Default Wait-Time Improvements (#162) 2022-09-08 23:39:26 -07:00
blockrules.js Page Resource Block Rules Avoid Duplicate Handlers + Ignore top-level pages + README update (0.4.4) (#81) 2021-08-17 20:54:18 -07:00
browser.js Logging and browser improvements: (#158) 2022-08-21 00:30:25 -07:00
constants.js Customizable extract selectors + typo fix (0.4.2) (#72) 2021-07-23 18:31:43 -07:00
redis.js Support for uploading to S3 (#95) 2021-11-23 12:53:30 -08:00
screencaster.js Page-reuse concurrency + Browser Repair + Screencaster Cleanup Improvements (#157) 2022-08-19 09:23:40 -07:00
seeds.js Interrupt Handling Fixes (#167) 2022-09-20 17:09:52 -07:00
state.js 0.6.0 Wait State + Screencasting Fixes (#141) 2022-06-17 11:58:44 -07:00
storage.js Health Check + Size Limits + Profile fixes (#138) 2022-05-18 22:51:55 -07:00
textextract.js Arg Parsing Refactor + Support for YAML Config Support (take 2!) (#59) 2021-06-23 19:45:40 -07:00
windowconcur.js Page-reuse concurrency + Browser Repair + Screencaster Cleanup Improvements (#157) 2022-08-19 09:23:40 -07:00