browsertrix-crawler/tests
Tessa Walsh 2af94ffab5
Support downloading seed file from URL (#852)
Fixes #841 

Crawler-side work toward supporting long URL lists in Browsertrix. This
PR moves seed handling from the arg parser's validation step to the
crawler's bootstrap step, so that the seed file can be fetched
asynchronously from a URL.
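
As a rough illustration of why the move matters: argument validation is typically synchronous, while fetching a remote seed file has to be awaited. The sketch below is not the crawler's actual code. The names `parseSeedList`, `loadSeeds`, `bootstrap`, and the `seedFile`/`seeds` parameter shape are all hypothetical, and it assumes Node 18+ for the global `fetch`.

```javascript
// Hypothetical sketch only: none of these function names come from
// browsertrix-crawler itself.

// Split seed file text into one entry per line, skipping blank lines.
function parseSeedList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Load the seed list from an http(s) URL or from a local file path.
async function loadSeeds(source) {
  if (/^https?:\/\//i.test(source)) {
    const resp = await fetch(source); // global fetch, assumes Node 18+
    if (!resp.ok) {
      throw new Error(`Failed to fetch seed file: HTTP ${resp.status}`);
    }
    return parseSeedList(await resp.text());
  }
  // Dynamic import keeps the sketch usable from both CommonJS and ESM.
  const { readFile } = await import("node:fs/promises");
  return parseSeedList(await readFile(source, "utf8"));
}

// Validation only records where the seeds come from; the awaited load
// happens here, in an async bootstrap step.
async function bootstrap(params) {
  if (params.seedFile) {
    return loadSeeds(params.seedFile);
  }
  return params.seeds ?? [];
}
```

With this split, a call such as `await bootstrap({ seedFile: "https://example.com/seeds.txt" })` (placeholder URL) resolves to the parsed list, and a local path goes through the same parser.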

---------

Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-07-03 10:49:37 -04:00
custom-behaviors Support for behaviors from 'recorder flow' JSON created in devtools (#818) 2025-04-09 12:24:29 +02:00
fixtures Support custom css selectors for extracting links (#689) 2024-11-08 11:04:41 -05:00
invalid-behaviors detect invalid custom behaviors on load: (#450) 2023-12-13 15:14:53 -05:00
.DS_Store tests text extraction (#30) 2021-03-01 16:00:23 -08:00
adblockrules.test.js tests: reduce logging (#596) 2024-06-26 13:05:13 -07:00
add-exclusion.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
basic_crawl.test.js Streaming in-place WACZ creation + CDXJ indexing (#673) 2024-08-29 13:21:20 -07:00
blockrules.test.js tests: disable blockrules youtube tests in CI (#698) 2024-10-04 17:37:13 -07:00
brave-query-redir.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
collection_name.test.js Add Prettier to the repo, and format all the files! (#428) 2023-11-09 16:11:11 -08:00
config_file.test.js Add Prettier to the repo, and format all the files! (#428) 2023-11-09 16:11:11 -08:00
config_stdin.test.js tests: reduce logging (#596) 2024-06-26 13:05:13 -07:00
crawl_overwrite.js Add Prettier to the repo, and format all the files! (#428) 2023-11-09 16:11:11 -08:00
custom-behavior-flow.test.js Support for behaviors from 'recorder flow' JSON created in devtools (#818) 2025-04-09 12:24:29 +02:00
custom-behavior.test.js Support for behaviors from 'recorder flow' JSON created in devtools (#818) 2025-04-09 12:24:29 +02:00
custom_driver.test.js Support custom css selectors for extracting links (#689) 2024-11-08 11:04:41 -05:00
custom_selector.test.js Validate Autoclick selector, fail crawl if invalid (#800) 2025-03-30 13:48:41 -07:00
dryrun.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
exclude-redirected.test.js Apply exclusions to redirects (#745) 2025-01-28 11:28:23 -08:00
extra_hops_depth.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
file_stats.test.js Retry same queue (#757) 2025-02-06 18:48:40 -08:00
http-auth.test.js http auth support per seed (supersedes #566): (#616) 2024-06-20 16:35:30 -07:00
lang-code.test.js lang code fixes: (#834) 2025-05-12 16:06:29 -07:00
limit_reached.test.js Add more exit codes to detect interruption reason (#764) 2025-02-10 14:00:55 -08:00
log_filtering.test.js Better default crawlId (#806) 2025-04-01 13:40:03 -07:00
mult_url_crawl_with_favicon.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
multi-instance-crawl.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
non-html-crawl.test.js Always download PDF + non HTML page cleanup + enterprise policy cleanup (#629) 2024-06-26 09:16:24 -07:00
pageinfo-records.test.js Add WARC-Protocol header (#715) 2025-05-19 18:59:52 -07:00
proxy.test.js Strip credentials from proxy address in crawl logs (#778) 2025-02-26 15:23:38 -05:00
qa_compare.test.js tests: update qa test to use awp site 2025-03-21 13:06:53 -07:00
retry-failed.test.js Retry same queue (#757) 2025-02-06 18:48:40 -08:00
rollover-writer.test.js Autoclick Support (#729) 2025-01-16 09:38:11 -08:00
saved-state.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
scopes.test.js Support downloading seed file from URL (#852) 2025-07-03 10:49:37 -04:00
screenshot.test.js Fix for --rolloverSize for individual WARCs in 1.x (#542) 2024-04-15 13:43:08 -07:00
seeds.test.js tests: use old.webrecorder.net for testing (#710) 2024-10-31 13:24:58 -04:00
sitemap-parse.test.js Deps update 1.6.1 (#826) 2025-05-02 00:43:37 -07:00
storage.test.js tests: reduce logging (#596) 2024-06-26 13:05:13 -07:00
text-extract.test.js Add Prettier to the repo, and format all the files! (#428) 2023-11-09 16:11:11 -08:00
upload-wacz.test.js base: bump to brave 1.80.113 (#857) 2025-06-30 19:55:38 -07:00
url_file_list.test.js Support downloading seed file from URL (#852) 2025-07-03 10:49:37 -04:00
warcinfo.test.js tests: reduce logging (#596) 2024-06-26 13:05:13 -07:00