update CHANGES for 0.5.0 release

2025-10-19 14:33:17 +00:00 · 2022-04-09 21:57:58 -07:00 · 2022-04-09 21:57:58 -07:00 · bfd72835d1
commit bfd72835d1
parent 7ed5586bdb
2 changed files with 12 additions and 4 deletions
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -1,15 +1,23 @@
 ## CHANGES

 v0.5.0
- State: Support for serialization and reloading of crawl state to config.yaml
+- Scope: support for `scopeType: domain` to include all subdomains and ignore 'www.' if specified in the seed.
+- Profiles: support loading remote profile from URL as well as local file
+- Non-HTML Pages: Load non-200 responses in browser, even if non-HTML, fix waiting issues with non-HTML pages (e.g. PDFs)
+- Config options: Fix setting user-agent
+- Page behavior: latest browsertrix-behaviors, also add experimental Cloudflare interstitial wait.
+- Error handling: better error handling for redis errors
+- State: Support loading of crawl state from config.yaml
+- State: Support serialization of crawl state to `crawls` subdirectory, both while running (keeping last N states) and on exit.
 - State: Graceful saving of crawl state on ctrl+c interrupt
 - State: Memory or Redis based crawl state
- Config: Aadditional crawl config via env var
+- Config: Support additional options via `CRAWL_ARGS` environment variable
 - WACZ Upload: Support for S3 upload of WACZ upon crawl completion
 - WACZ Upload: HTTP/Redis webhook to notify of upload completion
 - Crawl Scope: Support for `extraHops` to optionally crawl an extra hop beyond scope
 - Signing: Support for optional signing of WACZ
- Dependencies: update to latest pywb and wacz packages
+- Dependencies: update to latest pywb, wacz and browsertrix-behaviors packages
+

 v0.4.4
 - Page Block Rules Fix: 'request already handled' errors by avoiding adding duplicate handlers to same page.
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "browsertrix-crawler",
-  "version": "0.5.0-beta.8",
+  "version": "0.5.0",
  "main": "browsertrix-crawler",
  "repository": "https://github.com/webrecorder/browsertrix-crawler",
  "author": "Ilya Kreymer <ikreymer@gmail.com>, Webrecorder Software",
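
The `scopeType: domain` entry in the changelog above describes including all subdomains and ignoring a leading 'www.' in the seed. A minimal sketch of that kind of check might look like the following. The function name `inDomainScope` and the matching logic are assumptions made for illustration; they are not taken from the crawler's source, which may implement scoping differently (for example with generated regexes).

```typescript
// Hypothetical helper illustrating domain-style scoping.
// Not browsertrix-crawler's actual implementation.
function inDomainScope(seedUrl: string, candidateUrl: string): boolean {
  // Drop a leading "www." from the seed host so the bare domain anchors the scope.
  const seedHost = new URL(seedUrl).hostname.replace(/^www\./, "");
  const host = new URL(candidateUrl).hostname;
  // In scope: the domain itself, or any subdomain of it.
  return host === seedHost || host.endsWith("." + seedHost);
}
```

Under these assumptions, a seed of `https://www.example.com/` would put `https://blog.example.com/post` in scope, while `https://notexample.com/` would stay out of scope because the match requires a dot before the domain.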