Commit graph

8 commits

Author SHA1 Message Date
Tessa Walsh
1325cc3868
Gracefully handle non-absolute path for create-login-profile --filename (#521)
Fixes #513 

If an absolute path isn't provided to the `create-login-profile`
entrypoint's `--filename` option, resolve the value given within
`/crawls/profiles`.

Also updates the docs cli-options section to include the
`create-login-profile` entrypoint and adjusts the script to
automatically generate this page accordingly.
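
The path handling described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: the names `PROFILES_DIR` and `resolveProfileFilename` are hypothetical, and POSIX path semantics are assumed because the crawler runs in a Linux container.

```typescript
import * as path from "node:path";

// Assumed default directory, taken from the commit message above.
const PROFILES_DIR = "/crawls/profiles";

// Hypothetical helper: absolute paths are returned unchanged,
// anything else is resolved inside PROFILES_DIR.
function resolveProfileFilename(filename: string): string {
  if (path.posix.isAbsolute(filename)) {
    return filename;
  }
  return path.posix.resolve(PROFILES_DIR, filename);
}
```

Under these assumptions, a bare name such as `login.tar.gz` would end up under `/crawls/profiles`, while a value such as `/tmp/login.tar.gz` would be used as given.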

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2024-03-29 13:46:54 -07:00
Ilya Kreymer
783d006d52
follow-up to #428: update ignore files (#431)
- actually update lint/prettier/git ignore files with scratch, crawls, test-crawls, behaviors, as needed
2023-11-09 17:13:53 -08:00
Emma Segal-Grossman
2a49406df7
Add Prettier to the repo, and format all the files! (#428)
This adds prettier to the repo, and sets up the pre-commit hook to
auto-format as well as lint.
Also updates ignore files to exclude crawls, test-crawls, scratch, dist as needed.
2023-11-09 16:11:11 -08:00
Tessa Walsh
62fe4b4a99
Add options to filter logs by --logLevel and --context (#271)
* Add .DS_Store to gitignore

* Add --logLevel and --context filtering options

* Add log filtering test
2023-04-01 10:07:59 -07:00
Tessa Walsh
e02058f001
Add ad blocking via request interception (#173)
* ad blocking via request interception, extending block rules system, adding new AdBlockRules
* Load list of hosts to block from https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts added as json on image build
* Enabled via --blockAds and setting a custom message via --adBlockMessage
* new test to check for ad blocking
* Add test-crawls dir to .gitignore and .dockerignore
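
The host-list approach in the bullets above can be sketched as follows. This is a rough illustration under stated assumptions, not the crawler's actual `AdBlockRules` code: `parseHostsFile` and `isBlockedUrl` are hypothetical names, and the input is assumed to use the usual hosts-file layout of `0.0.0.0 hostname` entries with `#` comments.

```typescript
// Hypothetical helper: turn hosts-file text into a set of blocked hostnames.
function parseHostsFile(text: string): Set<string> {
  const hosts = new Set<string>();
  for (const rawLine of text.split("\n")) {
    // Drop trailing comments and surrounding whitespace.
    const line = rawLine.split("#")[0].trim();
    if (!line) {
      continue;
    }
    const parts = line.split(/\s+/);
    // Keep only "0.0.0.0 <hostname>" entries; skip localhost-style lines.
    if (parts.length >= 2 && parts[0] === "0.0.0.0" && parts[1] !== "0.0.0.0") {
      hosts.add(parts[1].toLowerCase());
    }
  }
  return hosts;
}

// Hypothetical check an interception handler could run for each request URL.
function isBlockedUrl(url: string, hosts: Set<string>): boolean {
  try {
    return hosts.has(new URL(url).hostname.toLowerCase());
  } catch {
    // Unparseable URLs are not treated as ads.
    return false;
  }
}
```

Parsing the list once at image build time and storing the result as JSON, as the commit describes, keeps the per-request work down to a single set lookup.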
2022-11-15 18:30:27 -08:00
Ilya Kreymer
a875aa90d3
Dockerfile: switch to cmd 'crawl', instead of entrypoint to support running 'pywb' also
update README with docker-compose and docker run examples, update commandline example
default output to './crawls' subdirectory
2020-11-01 21:35:00 -08:00
Ilya Kreymer
91b8994a08
refactor crawler and default driver:
- add extensible defaultDriver, wrap crawling functionality in Crawler class
- support headless/non-headless, custom driver
- support custom collection name for pywb, generate-cdx option
- autoplay: add slight delay for splash loading
2020-11-01 19:53:47 -08:00
Ilya Kreymer
ded83b52b3
initial commit after split from zimit
2020-10-31 13:16:37 -07:00