mirror of
https://github.com/webrecorder/browsertrix-crawler.git
synced 2025-10-19 06:23:16 +00:00

Initial (beta) support for QA/replay crawling!
- Supports running a crawl over a given WACZ / list of WACZ (multi WACZ) input, hosted in ReplayWeb.page
- Runs a local HTTP server with a full-page, UI-less ReplayWeb.page embed
- ReplayWeb.page release version is configured in the Dockerfile; pinned ui.js and sw.js are fetched directly from cdnjs

Can be deployed with the `webrecorder/browsertrix-crawler qa` entrypoint.
- Requires `--qaSource`, pointing to the WACZ or multi-WACZ JSON that will be replayed and QA'd
- Also supports `--qaRedisKey`, the key where QA comparison data will be pushed, if specified
- Supports `--qaDebugImageDiff` for outputting crawl / replay / diff images
- If using `--writePagesToRedis`, a `comparison` key is added to existing page data where:
```
comparison: {
  screenshotMatch?: number;
  textMatch?: number;
  resourceCounts: {
    crawlGood?: number;
    crawlBad?: number;
    replayGood?: number;
    replayBad?: number;
  };
};
```
- bump version to 1.1.0-beta.2
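As an illustration of how a consumer might read the `comparison` data described above, here is a minimal TypeScript sketch. The type mirrors the shape given in the commit message; the `PageComparison` name, the `needsReview` helper, the 0.9 threshold, and the assumption that match scores lie between 0 and 1 are all hypothetical and not taken from the crawler itself.

```typescript
// Shape of the `comparison` key, copied from the commit message.
// The type name `PageComparison` is an assumption for this sketch.
type PageComparison = {
  screenshotMatch?: number;
  textMatch?: number;
  resourceCounts: {
    crawlGood?: number;
    crawlBad?: number;
    replayGood?: number;
    replayBad?: number;
  };
};

// Hypothetical helper: flags a page for manual review.
// Assumes match scores lie in [0, 1]; the default threshold is illustrative.
function needsReview(c: PageComparison, threshold = 0.9): boolean {
  // Keep only the scores that are actually present.
  const scores = [c.screenshotMatch, c.textMatch].filter(
    (s): s is number => typeof s === "number",
  );
  const lowScore = scores.some((s) => s < threshold);

  // Missing counts are treated as zero.
  const { crawlBad = 0, replayBad = 0 } = c.resourceCounts;

  // Flag the page if any score is low, or if more resources
  // failed on replay than failed during the original crawl.
  return lowScore || replayBad > crawlBad;
}
```

A page with high scores and no extra replay failures is not flagged, while a low score or a higher `replayBad` than `crawlBad` flags it. Which signals matter in practice depends on the QA workflow.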
39 lines
666 B
HTML
<!doctype html>
<html>
  <head>
    <script src="/ui.js"></script>
    <style>
      html {
        width: 100%;
        height: 100%;
        display: flex;
      }
      body {
        width: 100%;
        margin: 0;
        padding: 0;
      }
      replay-web-page {
        margin: 0;
        padding: 0;
        border: 0;
        position: fixed;
        width: 100vw;
        height: 100vh;
        top: 0;
        left: 0;
      }
    </style>
  </head>
  <body>
    <replay-web-page
      embed="replayonly"
      deepLink="true"
      source="$SOURCE"
      url="about:blank"
      ts=""
      coll="replay"
    >
    </replay-web-page>
  </body>
</html>
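The `$SOURCE` placeholder in the `source` attribute suggests this page is a template that is filled in before being served. How the crawler actually performs that substitution is not shown here; the following is only a sketch that assumes plain string replacement. The `escapeAttr` and `renderReplayPage` helpers and the example URL are hypothetical.

```typescript
// Hypothetical helper: escapes characters that could break out of a
// double-quoted HTML attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: fills the first `$SOURCE` placeholder in the template.
function renderReplayPage(template: string, source: string): string {
  // A replacer function is used so that `$` sequences in the source URL
  // are not interpreted as special replacement patterns.
  return template.replace("$SOURCE", () => escapeAttr(source));
}

// Shortened stand-in for the template above.
const template =
  '<replay-web-page source="$SOURCE" url="about:blank"></replay-web-page>';

// Example URL is illustrative only.
const page = renderReplayPage(
  template,
  "https://example.com/crawl.wacz?a=1&b=2",
);
```

Escaping the value before insertion keeps a source URL containing `"` or `&` from producing malformed markup, which matters if the value comes from a command-line flag such as `--qaSource`.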