mirror of
https://github.com/webrecorder/browsertrix-crawler.git
synced 2025-10-19 06:23:16 +00:00

Chromium now interrupts fetch() if abort() is called or the page is navigated, so autofetch behavior using native fetch() is less than ideal. This PR adds support for a __bx_fetch() command for autofetch behavior (supported in browsertrix-behaviors 0.6.6) to fetch separately from the browser's regular fetch():
- __bx_fetch() starts a fetch but does not return content to the browser; it doesn't need abort() and is unaffected by page navigation, but will still try to use the browser network stack when possible, making it more efficient for background fetching.
- if the network stack fetch fails, fall back to regular node fetch() in the crawler.

Additional improvements for interrupted fetch:
- don't store truncated media responses, even for 200
- avoid doing duplicate async fetching if the response was already handled (e.g. fetch handled in multiple contexts)
- fixes #735, where fetch was interrupted, resulting in an empty response
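
As a rough illustration of the fallback, truncation, and de-duplication points in the commit message above, here is a minimal TypeScript sketch. It is not the crawler's implementation: `SeenUrls`, `isTruncated`, `fetchWithFallback`, and the `FetchResult` type are hypothetical names introduced for this example, and detecting truncation by comparing against Content-Length is an assumption.

```typescript
// Hypothetical sketch; none of these names come from browsertrix-crawler.

// Minimal structural type that a standard fetch() Response also satisfies.
interface FetchResult {
  headers: { get(name: string): string | null };
  arrayBuffer(): Promise<ArrayBuffer>;
}

type Fetcher = (url: string) => Promise<FetchResult>;

// Tracks URLs that were already handled, so the same response is not
// fetched again from another context.
class SeenUrls {
  private seen = new Set<string>();

  // Returns true only the first time a given URL is offered.
  claim(url: string): boolean {
    if (this.seen.has(url)) {
      return false;
    }
    this.seen.add(url);
    return true;
  }
}

// Assumption: a body counts as truncated when a Content-Length was declared
// and fewer bytes than that actually arrived.
function isTruncated(declaredLength: string | null, receivedBytes: number): boolean {
  if (declaredLength === null) {
    return false;
  }
  const expected = Number(declaredLength);
  return Number.isFinite(expected) && receivedBytes < expected;
}

// Try the primary fetcher first (standing in for the browser network stack),
// then fall back to a second fetcher (standing in for node fetch()).
// Returns null when the URL was already handled or the body is truncated.
async function fetchWithFallback(
  url: string,
  seen: SeenUrls,
  primary: Fetcher,
  fallback: Fetcher,
): Promise<Uint8Array | null> {
  if (!seen.claim(url)) {
    return null; // already handled elsewhere, skip the duplicate fetch
  }

  let resp: FetchResult;
  try {
    resp = await primary(url);
  } catch {
    resp = await fallback(url);
  }

  const body = new Uint8Array(await resp.arrayBuffer());
  if (isTruncated(resp.headers.get("content-length"), body.length)) {
    return null; // do not store a truncated response, even if the status was 200
  }
  return body;
}
```

In a Node.js 18+ environment the fallback could be passed as `(u) => fetch(u)`; the primary fetcher is left abstract here because how the crawler reaches the browser network stack is not described in this commit message.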
73 lines
1.7 KiB
YAML
name: Node.js CI

on:
  push:
  pull_request:

jobs:
  lint:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        node-version: [20.x]

    steps:
      - uses: actions/checkout@v3
      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
      - name: install requirements
        run: yarn install
      - name: run linter
        run: yarn lint && yarn format

  build:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        node-version: [20.x]

    steps:
      - uses: actions/checkout@v3

      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}

      - uses: actions/setup-python@v4
        with:
          python-version: 3.x

      - name: install requirements
        run: yarn install

      - name: build js
        run: yarn run tsc

      - name: build docker
        run: docker compose build

      - name: install python deps for docs
        run: pip install mkdocs-material

      - name: build docs for crawl test
        run: cd docs/ && mkdocs build

      - name: add http-server for tests
        run: yarn add -D http-server

      - name: install py-wacz as root for tests
        run: sudo pip install wacz --ignore-installed

      - name: run all tests as root
        run: sudo DOCKER_HOST_NAME=172.17.0.1 CI=true yarn test -validate

      - name: run saved state + qa compare test as non-root - with volume owned by current user
        run: |
          sudo rm -rf ./test-crawls
          mkdir test-crawls
          sudo CI=true yarn test ./tests/saved-state.test.js ./tests/qa_compare.test.js