mirror of
https://github.com/webrecorder/browsertrix-crawler.git
synced 2026-02-06 18:00:30 +00:00
Fixes #920

- Downloads profile, custom behavior, and seed list to the `/downloads` directory in the crawl
- Seed File: Downloaded into `/downloads`. Never refetched on subsequent crawl restarts if it already exists.
- Custom Behaviors: Git: Downloaded into dir, then moved to `/downloads/behaviors/<dir name>`. If the directory already exists, a failed download reuses the existing directory.
- Custom Behaviors: File: Downloaded into a temp file, then moved to `/downloads/behaviors/<name.js>`. If the file already exists, a failed download reuses the existing file.
- Profile: Uses the `/profile` directory to contain the browser profile
- Profile: Downloaded to a temp file, then placed into `/downloads/profile.tar.gz`. If the download fails but the profile already exists, the existing `/profile` directory is used.
- Also fixes #897

| Name | | |
|---|---|---|
| proxies | ||
| crawl-1.yaml | ||
| crawl-2.yaml | ||
| driver-1.mjs | ||
| pages.jsonl | ||
| sample-profile.tar.gz | ||
| urlSeedFile.txt | ||
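
The download-then-fallback behavior described in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the crawler's actual code: the helper name `fetchWithFallback`, its `download` callback, and the `refetch` option are all hypothetical, and the sketch is synchronous for brevity, whereas a real download would be asynchronous.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper (not part of browsertrix-crawler).
// `download(tmpPath)` is an assumed callback that writes the resource to
// tmpPath and throws if the download fails.
function fetchWithFallback(download, destPath, { refetch = true } = {}) {
  // Seed-file style: never refetch when a copy already exists.
  if (!refetch && fs.existsSync(destPath)) {
    return { path: destPath, reused: true };
  }

  // Download into a temp location first so a failed or partial download
  // never overwrites a good existing copy.
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "dl-"));
  const tmpPath = path.join(tmpDir, path.basename(destPath));
  try {
    download(tmpPath);
    fs.mkdirSync(path.dirname(destPath), { recursive: true });
    // Copy rather than rename: the temp dir may be on another filesystem.
    fs.copyFileSync(tmpPath, destPath);
    return { path: destPath, reused: false };
  } catch (err) {
    // On failure, fall back to the previously downloaded copy if present.
    if (fs.existsSync(destPath)) {
      return { path: destPath, reused: true };
    }
    throw err;
  } finally {
    fs.rmSync(tmpDir, { recursive: true, force: true });
  }
}
```

Under these assumptions, a seed file would be fetched with `{ refetch: false }`, while behaviors and the profile would use the default and only reuse the existing copy when the download fails.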