browsertrix-crawler/config
Ilya Kreymer 4495532606
Always download PDF + non HTML page cleanup + enterprise policy cleanup (#629)
Adds an enterprise policy to always download PDFs and sets the download
directory to /dev/null.
Moves policies to chromium.json and brave.json for clarity.
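
As a rough illustration of the policy change described above, the sketch below builds a policy object and serializes it to JSON. The contents of chromium.json and brave.json are not shown here, so this is an assumption about their shape: `AlwaysOpenPdfExternally` and `DownloadDirectory` are documented Chrome enterprise policy names, but whether this commit uses exactly these keys is not confirmed. `buildDownloadPolicies` and `ChromePolicies` are hypothetical names introduced for the example.

```typescript
// Hypothetical helper, not part of browsertrix-crawler.
// A flat map of policy names to values, as managed policy JSON files typically use.
type ChromePolicies = Record<string, boolean | string | number>;

function buildDownloadPolicies(downloadDir: string = "/dev/null"): ChromePolicies {
  return {
    // Chrome enterprise policy: download PDFs instead of opening the built-in viewer.
    AlwaysOpenPdfExternally: true,
    // Chrome enterprise policy: directory where downloads are written.
    // Assumption: pointing this at /dev/null matches the commit's description.
    DownloadDirectory: downloadDir,
  };
}

// Serialize the way a policy file such as chromium.json might be written.
const policyJson: string = JSON.stringify(buildDownloadPolicies(), null, 2);
console.log(policyJson);
```

Keeping the policies in one JSON file per browser, as the commit describes, presumably lets each browser load only the keys it understands.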
Further cleanup of the non-HTML loading path:
- sets downloadResponse when the page load is aborted but the response is
  actually a download
- sets firstResponse when the first response finishes but the page doesn't
  fully load
- logs in one place that non-HTML pages skip all post-crawl behaviors
- moves the page extra delay to a separate awaitPageExtraDelay() function,
  applied to all pages (the post-load delay is applied only to HTML pages)
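
The delay split in the last bullet can be sketched roughly as follows. Only the function name awaitPageExtraDelay() comes from the commit message. The `PageState` and `DelayOptions` shapes, the `awaitPostLoadDelay` and `finishPage` helpers, and the injectable `sleep` parameter are all assumptions made for illustration, not the crawler's actual code.

```typescript
// Hypothetical per-page state; field names are assumptions.
interface PageState {
  url: string;
  isHTMLPage: boolean;
}

// Hypothetical delay settings, in seconds.
interface DelayOptions {
  postLoadDelay: number; // applied to HTML pages only
  pageExtraDelay: number; // applied to all pages
}

type Sleep = (seconds: number) => Promise<void>;

const realSleep: Sleep = (seconds: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, seconds * 1000));

// Post-load delay: skipped for non-HTML pages (e.g. a downloaded PDF).
async function awaitPostLoadDelay(
  state: PageState,
  opts: DelayOptions,
  sleep: Sleep = realSleep,
): Promise<boolean> {
  if (!state.isHTMLPage || opts.postLoadDelay <= 0) return false;
  await sleep(opts.postLoadDelay);
  return true;
}

// Page extra delay: a separate function that does not check the page type.
async function awaitPageExtraDelay(
  opts: DelayOptions,
  sleep: Sleep = realSleep,
): Promise<boolean> {
  if (opts.pageExtraDelay <= 0) return false;
  await sleep(opts.pageExtraDelay);
  return true;
}

// Runs both delays in order and reports which ones were applied.
async function finishPage(
  state: PageState,
  opts: DelayOptions,
  sleep: Sleep = realSleep,
): Promise<string[]> {
  const applied: string[] = [];
  if (await awaitPostLoadDelay(state, opts, sleep)) applied.push("postLoadDelay");
  if (await awaitPageExtraDelay(opts, sleep)) applied.push("pageExtraDelay");
  return applied;
}
```

Passing `sleep` as a parameter is only a convenience for the sketch: it lets the ordering be checked without real waiting.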

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-06-26 09:16:24 -07:00
policies      Always download PDF + non HTML page cleanup + enterprise policy cleanup (#629)   2024-06-26 09:16:24 -07:00
config.yaml   Add Prettier to the repo, and format all the files! (#428)                       2023-11-09 16:11:11 -08:00
openssl.conf  Add --netIdleWait, bump dependencies (0.7.0-beta.2) (#145)                       2022-07-08 17:17:46 -07:00
uwsgi.ini     Add --netIdleWait, bump dependencies (0.7.0-beta.2) (#145)                       2022-07-08 17:17:46 -07:00