mirror of
https://github.com/webrecorder/browsertrix-crawler.git
synced 2025-10-19 06:23:16 +00:00
Mention command line options when restarting (#577)
It's probably worth reminding people that the command line options need to be passed in again since the crawl state doesn't include them. Refs #568
parent 1735c3d8e2
commit 2ef116d667
1 changed files with 1 additions and 1 deletions
@@ -127,7 +127,7 @@ A crawl can be gracefully interrupted with Ctrl-C (SIGINT) or a SIGTERM (see bel
 
 When a crawl is interrupted, the current crawl state is written to the `crawls` subdirectory inside the collection directory. The crawl state includes the current YAML config, if any, plus the current state of the crawl.
 
-This crawl state YAML file can then be used as `--config` option to restart the crawl from where it was left of previously.
+This crawl state YAML file can then be used as `--config` option to restart the crawl from where it was left of previously. When restarting a crawl you will need to include any command line options you used to start the original crawl (e.g. `--url`), since these are not persisted to the crawl state.
 
 By default, the crawl interruption waits for current pages to finish. A subsequent SIGINT will cause the crawl to stop immediately. Any unfinished pages are recorded in the `pending` section of the crawl state (if gracefully finished, the section will be empty).
 
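The restart step described in the changed paragraph could be scripted roughly as follows. This is a minimal sketch and not part of the commit: `latest_state` and `restart_command` are hypothetical helper names, and the sketch assumes that the state files in the `crawls` subdirectory end in `.yaml` and that the most recently modified one is the one to resume from. It only prints a command line; it does not start a crawl.

```shell
#!/bin/sh
# Hypothetical helpers (not part of browsertrix-crawler) that assemble a
# restart command from a saved crawl state plus the original options.

latest_state() {
  # $1: collection directory. Prints the newest *.yaml file in its
  # crawls/ subdirectory, or nothing if there is none.
  # Assumption: state files use a .yaml extension.
  ls -t "$1"/crawls/*.yaml 2>/dev/null | head -n 1
}

restart_command() {
  # $1: collection directory; remaining arguments: the command line
  # options used for the original crawl (e.g. --url), which have to be
  # passed again because they are not persisted to the crawl state.
  coll_dir=$1
  shift
  state=$(latest_state "$coll_dir")
  if [ -z "$state" ]; then
    echo "no crawl state found in $coll_dir/crawls" >&2
    return 1
  fi
  echo "crawl --config $state $*"
}
```

For example, `restart_command ./crawls/collections/my-crawl --url https://example.com/` would print a `crawl --config ...` line that repeats `--url`; the collection path and URL here are placeholders. When the crawler runs in a container, the printed path may need to be mapped to the path as seen inside the container.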