mirror of https://github.com/webrecorder/browsertrix-crawler.git (synced 2025-10-19 14:33:17 +00:00)
update CHANGES for 0.5.0 release
parent 7ed5586bdb
commit bfd72835d1
2 changed files with 12 additions and 4 deletions
14
CHANGES.md
@@ -1,15 +1,23 @@
## CHANGES

v0.5.0
- State: Support for serialization and reloading of crawl state to config.yaml
- Scope: support for `scopeType: domain` to include all subdomains and ignoring 'www.' if specified in the seed.
- Profiles: support loading remote profile from URL as well as local file
- Non-HTML Pages: Load non-200 responses in browser, even if non-html, fix waiting issues with non-HTML pages (eg. PDFs)
- Config options: Fix setting user-agent
- Page behavior: latest browsertrix-behaviors, also add experimental Cloudflare interstitial wait.
- Error handling: better error handling for redis errors
- State: Support loading of crawl state from config.yaml
- State: Support serialization of crawl state to `crawls` subdirectory, both while running (keeping last N states) and on exit.
- State: Graceful saving of crawl state on ctrl+c interrupt
- State: Memory or Redis based crawl state
- Config: Aadditional crawl config via env var
- Config: Support additional options via `CRAWL_ARGS` environment variable
- WACZ Upload: Support for S3 upload of WACZ upon crawl completion
- WACZ Upload: HTTP/Redis webhook to notify of upload completion
- Crawl Scope: Support for `extraHops` to optionally crawl an extra hop beyond scope
- Signing: Support for optional signing of WACZ
- Dependencies: update to latest pywb and wacz packages
- Dependencies: update to latest pywb, wacz and browsertrix-behaviors packages


v0.4.4
- Page Block Rules Fix: 'request already handled' errors by avoiding adding duplicate handlers to same page.
@@ -1,6 +1,6 @@
{
"name": "browsertrix-crawler",
"version": "0.5.0-beta.8",
"version": "0.5.0",
"main": "browsertrix-crawler",
"repository": "https://github.com/webrecorder/browsertrix-crawler",
"author": "Ilya Kreymer <ikreymer@gmail.com>, Webrecorder Software",