Mirror of https://github.com/webrecorder/browsertrix-crawler.git
Synced 2025-10-19 06:23:16 +00:00
- parse URL username/password, store in `auth` field in seed, or pass in `auth` field directly (from YAML config)
- add `Authorization` header with base64-encoded basic auth via `setExtraHTTPHeaders()`
- tests: add test for crawling with auth using http-server using local docs build (now build docs as part of CI)
- docs: add HTTP Auth to YAML config section

Co-authored-by: Ed Summers <ehs@pobox.com>
This commit is contained in:
parent
6329b19a20
commit
3339374092
8 changed files with 437 additions and 9 deletions
````diff
@@ -46,3 +46,16 @@ seeds:
     depth: 1
     scopeType: "prefix"
 ```
+
+## HTTP Auth
+
+Browsertrix Crawler supports HTTP Basic Auth, which can be provided on a per-seed basis as part of the URL, for example:
+
+`--url https://username:password@example.com/`.
+
+Alternatively, credentials can be added to the `auth` field for each seed:
+
+```yaml
+seeds:
+  - url: https://example.com/
+    auth: username:password
+```
````