browsertrix-crawler/src
Ilya Kreymer 88a2fbd0a0
Fix 206 response + general video handling (#646)
Refactors handling of 206 responses:
- If a 206 response is encountered and it actually covers the full range,
convert it to 200 and rewrite the range and content-range headers to x-range
and x-orig-range. This supports rewriting of 206 responses for DASH
manifests.
- If a partial 206 response starts with `0-`, do a full async fetch
separately.
- If a partial 206 response does not start with `0-`, just ignore it (very
likely a duplicate picked up when handling the `0-` response).
- Don't stream content-types that can be rewritten, since streaming
prevents rewriting. Fixes rewriting on DASH/HLS manifests which have no
content-length and don't get properly rewritten.
- Overall, adds missing rewriting of DASH/HLS manifests that have no
content-length and are served as 206.
- Update to the latest wabac.js, which fixes rewriting of DASH manifests to
avoid a duplicate '<?xml' prefix (webrecorder/wabac.js#192).
- Fixes #645
2024-07-17 13:24:25 -07:00
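
The 206 rules in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the crawler's actual code: the names `classify206`, `parseContentRange`, `canStream` and `REWRITABLE_TYPES` are hypothetical, the decision is assumed to be made from the response's content-range header, unparseable headers are assumed to be ignored, and the list of rewritable content types is only an example. The header renaming to x-range and x-orig-range is left out, since the message does not spell out which header maps to which.

```typescript
// Hypothetical sketch of the 206 handling described in the commit message.
// None of these names come from browsertrix-crawler itself.

type RangeAction = "convert-to-200" | "fetch-full-async" | "ignore";

interface ContentRange {
  start: number;
  end: number;
  total: number | null; // null when the server reports "*" (unknown length)
}

// Parses a header value such as "bytes 0-99/100"; returns null if malformed.
function parseContentRange(value: string): ContentRange | null {
  const m = /^bytes\s+(\d+)-(\d+)\/(\d+|\*)$/i.exec(value.trim());
  if (!m) return null;
  return {
    start: Number(m[1]),
    end: Number(m[2]),
    total: m[3] === "*" ? null : Number(m[3]),
  };
}

// Decides what to do with a 206 response based on its content-range header.
function classify206(contentRange: string): RangeAction {
  const range = parseContentRange(contentRange);
  if (!range) return "ignore"; // assumption: skip anything unparseable

  const isFullRange =
    range.start === 0 &&
    range.total !== null &&
    range.end === range.total - 1;

  if (isFullRange) return "convert-to-200"; // 206 that is really the whole body
  if (range.start === 0) return "fetch-full-async"; // partial, starts at 0-
  return "ignore"; // partial, later offset: likely a duplicate
}

// Example set only (assumption): content types that a rewriter may need to
// buffer in full, including the DASH and HLS manifest types.
const REWRITABLE_TYPES = new Set<string>([
  "application/dash+xml",
  "application/vnd.apple.mpegurl",
  "application/x-mpegurl",
  "text/html",
  "text/css",
  "application/javascript",
]);

// Returns false for rewritable content types, since streaming would
// prevent rewriting them.
function canStream(contentType: string | undefined): boolean {
  if (!contentType) return true;
  const mime = contentType.split(";")[0].trim().toLowerCase();
  return !REWRITABLE_TYPES.has(mime);
}
```

Under these assumptions, `classify206("bytes 0-99/100")` yields "convert-to-200", `classify206("bytes 0-49/100")` yields "fetch-full-async", and `classify206("bytes 50-99/100")` yields "ignore". A DASH manifest with content type "application/dash+xml" would be buffered rather than streamed, whether or not it carries a content-length.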
util Fix 206 response + general video handling (#646) 2024-07-17 13:24:25 -07:00
crawler.ts don't disable extraHops when using sitemaps: (#639) 2024-07-11 19:48:43 -07:00
create-login-profile.ts Loosen selectors for login fields in automated profile creation (#638) 2024-07-11 15:55:06 -07:00
defaultDriver.ts Add Prettier to the repo, and format all the files! (#428) 2023-11-09 16:11:11 -08:00
main.ts QA Crawl Support (Beta) (#469) 2024-03-22 17:32:42 -07:00
replaycrawler.ts Improved handling of pages that redirect back to the same page. (#635) 2024-07-08 10:51:37 -07:00