* base: update to chrome 112
headless: switch to using new headless mode available in 112 which is more in sync with headful mode
viewport: use fixed viewport matching screen dimensions for headless and headful mode (if GEOMETRY is set)
profiles: fix catching new window message, reopening page in current window
versions: bump to pywb 2.7.4, update puppeteer-core to 20.2.1
bump to 0.10.0-beta.4
* profile: force reopen in current window only for headless mode (currently breaks otherwise), remove logging messages
* crawl stopping / additional states:
- adds check for 'isCrawlStopped()', which checks a redis key to see if the crawl has been stopped externally, interrupts the work
loop, and prevents the crawl from starting on load (see the sketch after this list)
- additional crawl states: 'generate-wacz', 'generate-cdx', 'generate-warc', 'uploading-wacz', and 'pending-wait' to indicate
when the crawl is no longer running but the crawler is still performing work
- addresses part of webrecorder/browsertrix-cloud#263, webrecorder/browsertrix-cloud#637
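A minimal sketch of the external-stop check described above, assuming ioredis and a hypothetical `<crawlId>:stopping` key (the actual key name isn't given here):
```js
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL || "redis://localhost:6379/0");

// returns true if an external controller has flagged this crawl as stopped
async function isCrawlStopped(crawlId) {
  return (await redis.get(`${crawlId}:stopping`)) === "1";
}

// in the work loop, bail out before starting the next page:
// if (await isCrawlStopped(crawlId)) { break; }
```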
* Catch 400 pywb errors on page load and mark page failed
* Add --failOnFailedSeed option to fail crawl with exit code 1 if seed doesn't load, resolves #207
* Handle 4xx or 5xx page.goto responses as page load errors
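A sketch of treating 4xx/5xx responses as load failures; the wrapper name is hypothetical, but `page.goto()` and `response.status()` are standard puppeteer APIs:
```js
async function loadPage(page, url) {
  const resp = await page.goto(url, { waitUntil: "load" });
  // resp can be null, e.g. for about:blank or same-document navigations
  if (resp && resp.status() >= 400) {
    throw new Error(`page load failed: ${url} returned ${resp.status()}`);
  }
  return resp;
}
```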
- reduced memory usage, avoids memory leak issues caused by using playwright (see #298)
- browser: split Browser into Browser and BaseBrowser
- browser: puppeteer-specific functions added to Browser for additional flexibility if need to change again later
- browser: use defaultArgs from playwright
- browser: attempt to recover if initial target is gone
- logging: add debug logging from process.memoryUsage() after every page
- request interception: use priorities for cooperative request interception (see the sketch after this list)
- request interception: move to setupPage() to run once per page; enable if any of blockrules, adblockrules or originOverrides are used
- request interception: fix originOverrides enabled check, fix to work with catch-all request interception
- default args: set --waitUntil back to 'load,networkidle2'
- Update README with changes for puppeteer
- tests: fix extra hops depth test to ensure more than one page crawled
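A sketch of cooperative request interception with priorities, which lets several handlers (block rules, ad block rules, origin overrides) coexist on one page; `isBlocked()` is hypothetical:
```js
await page.setRequestInterception(true);

page.on("request", (request) => {
  // another handler may have already resolved this request
  if (request.isInterceptResolutionHandled()) {
    return;
  }
  if (isBlocked(request.url())) {
    request.abort("blockedbyclient", 1);  // priority 1: a later handler can outbid
  } else {
    request.continue(request.continueRequestOverrides(), 0);
  }
});
```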
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
* browser: just pass profileUrl and track if a custom profile is used
browser: don't always disable service workers (accidentally added as part of the playwright migration);
only disable when using a profile, same as 0.8.x behavior
fix for #288
* Fix full page screenshot (#296)
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
* Store done in redis as integer rather than full json
* Add numFailed to crawler stats
* Cast numDone to int before returning
* Increment done counter for failed URLs
* Fix movefailed to push failed URLs to the failed key, not the done key
* Don't add failed to total stats twice
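A sketch of the counter-based bookkeeping described above, assuming ioredis and hypothetical `:d` (done) and `:f` (failed) key suffixes:
```js
async function markFinished(redis, crawlId) {
  await redis.incr(`${crawlId}:d`);         // done is a plain integer counter, not json
}

async function markFailed(redis, crawlId, url) {
  await redis.rpush(`${crawlId}:f`, url);   // failed URLs go to the failed key
  await redis.incr(`${crawlId}:d`);         // failed still increments done, counted once
}

async function numDone(redis, crawlId) {
  return parseInt(await redis.get(`${crawlId}:d`), 10) || 0;  // cast to int before returning
}
```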
* max page limit:
- rename --limit -> --pageLimit (keep alias for now)
- add new --maxPageLimit flag which overrides --pageLimit to ensure it is not greater than max
- readme: add new --pageLimit, --maxPageLimit to README
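A sketch of the clamping logic, assuming already-parsed argv values:
```js
function effectivePageLimit({ pageLimit, maxPageLimit }) {
  if (maxPageLimit) {
    // --maxPageLimit caps --pageLimit, and applies even if no --pageLimit is set
    return pageLimit ? Math.min(pageLimit, maxPageLimit) : maxPageLimit;
  }
  return pageLimit;
}
```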
* pending lock reset:
- quicker retry of pending URLs after a crawler crash by clearing pending page locks
- pending URLs are locked with <crawl>:p:<url> to indicate they are currently being rendered
- when a crawler restarts, check if <crawl>:p:<url> is set to its unique id and remove the pending lock, allowing the URL
to be retried, as it's no longer actively being crawled (see the sketch after this list)
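A sketch of the lock reset, assuming ioredis; the `<crawl>:p:<url>` scheme is from the notes above, while the uid handling is an assumption:
```js
async function clearOwnPendingLocks(redis, crawlPrefix, uid) {
  const keys = await redis.keys(`${crawlPrefix}:p:*`);
  for (const key of keys) {
    // only remove locks this crawler instance set before it crashed,
    // making those URLs immediately retryable by any worker
    if ((await redis.get(key)) === uid) {
      await redis.del(key);
    }
  }
}
```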
* Add --pageExtraDelay option to add extra delay/wait time after every page (fixes #131)
* Store total page time in 'maxPageTime', include pageExtraDelay
* Rename timeout->pageLoadTimeout
* cleanup:
- store seconds for most interval checks, convert to ms only for api calls, remove most sec<->ms conversions
- add secondsElapsed() utility function to help check elapsed time (see the sketch after this list)
- cleanup comments
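A sketch of the utility, assuming millisecond timestamps from Date.now():
```js
export function secondsElapsed(startTime, now = Date.now()) {
  return (now - startTime) / 1000;
}

// e.g. interrupt behaviors once the per-page budget is spent:
// if (secondsElapsed(pageStart) > maxPageTime) { ... }
```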
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
* various loading improvements to avoid pages getting 'stuck' + load state tracking
- add PageState object: store loadState (0 to 4) as well as other per-page-state properties on a defined object (see the sketch after this list)
- set loadState to 0 (failed) by default
- set loadState to 1 (content-loaded) on 'domcontentloaded' event
- if page.goto() finishes, set loadState to 2 ('full-page-load')
- if page.goto() times out: if domcontentloaded hasn't fired either, fail immediately; if domcontentloaded was reached, extract links but don't run behaviors
- page considered 'finished' if it got to at least loadState 2 ('full-page-load'), even if behaviors timed out
- pages: log 'loadState' as part of pages.jsonl
- improve frame detection: detect if a frame is actually not from a frame tag (eg. an OBJECT tag) and skip it as well
- screencaster: try screencasting every frame for now instead of every other frame, for smoother screencasting
- deps: behaviors: bump to browsertrix-behaviors 0.5.0-beta.0 release (includes autoscroll improvements)
- worker ids: just use 0, 1, ... n-1 worker indexes; send the numeric index as part of screencast messages
- worker: only keep track of crash state to recreate the page; decouple crash state from page failed/succeeded state
- screencaster: allow reusing caster slots with fixed ids
- interrupt timedCrawlPage() wait if 'crash' event happens
- crawler: pageFinished() callback when page finishes
- worker: add workerIdle callback, call screencaster.stopById() and send 'close' message when worker is empty
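A sketch of the per-page state object described at the top of this list; states 0-2 follow the notes, while the names for states 3 and 4 are assumptions:
```js
const LoadState = {
  FAILED: 0,            // default until the page proves otherwise
  CONTENT_LOADED: 1,    // 'domcontentloaded' fired
  FULL_PAGE_LOADED: 2,  // page.goto() resolved
  EXTRACTION_DONE: 3,   // assumed name
  BEHAVIORS_DONE: 4,    // assumed name
};

class PageState {
  constructor({ url, depth, extraHops }) {
    this.url = url;
    this.depth = depth;
    this.extraHops = extraHops;
    this.loadState = LoadState.FAILED;
    this.skipBehaviors = false;
  }
}

// a page counts as finished once loadState >= FULL_PAGE_LOADED,
// even if behaviors later time out
```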
* Migrate from Puppeteer to Playwright!
- use playwright persistent browser context to support profiles
- move on-new-page setup actions to worker
- fix screencaster, init only one per page object, associate with worker-id
- fix device emulation: load on startup, also replace '-' with space for more friendly command-line usage
- port additional chromium setup options
- create / detach cdp per page for each new page, screencaster just uses existing cdp
- fix evaluateWithCLI to call CDP command directly
- create workers directly in WorkerPool - await not necessary
* State / Worker Refactor (#252)
* refactoring state:
- use RedisCrawlState, defaulting to local redis; remove MemoryCrawlState and BaseState
- remove 'real' accessors / draining queue - no longer needed without puppeteer-cluster
- switch to a sorted set for the crawl queue, with depth + extraHops as the score (fixes #150, see the sketch after this list)
- override console.error to avoid logging ioredis errors (fixes #244)
- add MAX_DEPTH as a const for extraHops
- fix immediate exit on second interrupt
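A sketch of the sorted-set queue, assuming ioredis; the `:q` key name is hypothetical:
```js
async function queueUrl(redis, crawlId, url, depth, extraHops = 0) {
  // score = depth + extraHops, so shallower URLs are dequeued first
  await redis.zadd(
    `${crawlId}:q`,
    depth + extraHops,
    JSON.stringify({ url, depth, extraHops }),
  );
}

async function nextFromQueue(redis, crawlId) {
  const res = await redis.zpopmin(`${crawlId}:q`);  // [member, score] or []
  return res.length ? JSON.parse(res[0]) : null;
}
```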
* worker/state refactor:
- remove job object from puppeteer-cluster
- rename shift() -> nextFromQueue()
- condense crawl mgmt logic to crawlPageInWorker: init page, mark pages as finished/failed, close page on failure, etc...
- screencaster: don't screencast about:blank pages
* more worker queue refactor:
- remove p-queue
- initialize PageWorkers, each of which runs in its own loop to process pages until there are no pending and no queued pages
- add setupPage(), teardownPage() to crawler, called from worker
- await runWorkers() promise which runs all workers until completion
- remove: p-queue, node-fetch, update README (no longer using any puppeteer-cluster base code)
- bump to 0.9.0-beta.1
* use the existing data object for per-page context instead of adding things to the page object (will become clearer with the typescript transition)
* more fixes for playwright:
- fix profile creation
- browser: add newWindowPageWithCDP() to create new page + cdp in new window, use with timeout
- crawler: various fixes, including for html check
- logging: additional logging for screencaster, new window, etc...
- remove unused packages
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
* time limits: re-add total timeout to runTask() in worker, just in case
refactor worker runTask() to return true/false indicating whether the task timed out
if timed out, recreate the page
redis: add a limit on retries per URL, currently set to 1
* retry: remove URL if not retrying, log removal of URL from queue
* Add timedRun and apply to network requests
* Remove debugging print statement
* minor tweaks:
- move seconds to 2nd param, make param required
- use FETCH_TIMEOUT_SECS for fetch events and PAGE_OP_TIMEOUT_SECS for in-page events, respectively
- use timedRun() for the CF check action (see the sketch after this list)
- remove extra async
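A sketch of timedRun(); the exact signature is an assumption:
```js
export function timedRun(promise, seconds, message = "timed out") {
  let timer;
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(() => reject(new Error(message)), seconds * 1000);
  });
  // whichever settles first wins; always clear the timer to avoid leaks
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// e.g. bound an in-page operation:
// await timedRun(page.evaluate(fn), PAGE_OP_TIMEOUT_SECS, "evaluate timed out");
```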
* additional logging
ensure queue is cleared when interrupting!
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
* This commit removes puppeteer-cluster as a dependency in favor of
a simpler concurrency implementation, using p-queue to limit
concurrency to the number of available workers. As part of the
refactor, the custom window concurrency model in windowconcur.js
is removed and its logic implemented in the new Worker class's
initPage method.
* Remove concurrency models, always use new tab
* logging improvements: include worker-id in logs, use 'worker' context
- logging: log info string / version as first line
- logging: improve logging of error stack traces
- interruption: support interrupting crawl directly with 'interrupt' check which stops the job queue
- interruption: don't repair if interrupting, wait for queue to be idle
- log text extraction
- init order: ensure wb-manager init called first, then logs created
- logging: adjust info->debug logging
- Log no jobs available as debug
* tests: bail on first failure
* iframe filtering:
- fix filtering for about:blank iframes, support non-async shouldProcessFrame() (see the sketch after this list)
- filter iframes both for behaviors and for link extraction
- add 5-second timeout to link extraction, to avoid link extraction holding up crawl!
- cache filtered frames
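A sketch of cached, non-async frame filtering; `isAdHost()` is hypothetical:
```js
const frameCache = new WeakMap();

function shouldProcessFrame(frame) {
  if (frameCache.has(frame)) {
    return frameCache.get(frame);  // reuse the earlier decision for this frame
  }
  const url = frame.url();
  // skip empty/about:blank frames and frames served from ad hosts
  const ok = Boolean(url) && url !== "about:blank" && !isAdHost(url);
  frameCache.set(frame, ok);
  return ok;
}
```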
* healthcheck/worker reuse:
- refactor healthchecker into separate class
- increment healthchecker (if provided) if new page load fails
- remove experimental repair functionality for now
- add healthcheck
* deps: bump puppeteer-core to 17.1.2
- bump to 0.9.0-beta.0
--------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
* logging:
- write most of the crawl log to '{coll}/logs/crawl-{iso-timestamp}.log', part of #230
- ensure log filename consists of numeric timestamp only
- close log before wacz file is generated to allow storing log in wacz
- close log after writing stats
- add logs/ directory to wacz with new py-wacz
- deps: bump to py-wacz 0.4.8 to support logs in wacz
- Ensure page is included in all logging details
- Update logging messages to be a single string, with variables added in the details
- Always wait for all pending wait requests to finish (unless counter <0)
- Don't set puppeteer-cluster timeout (prep for removing puppeteer-cluster)
- Apply behaviorTimeout when running behaviors in the crawler, in addition to within the behaviors themselves.
- Add logging for behavior start, finish and timeout
- Move writeStats() logging to beginning of each page as well as at the end, to avoid confusion about pending pages.
- For events from frames, use frameUrl along with current page
- deps: bump browsertrix-behaviors to 0.4.2
- version: bump to 0.8.1
* behaviors: don't run behaviors in iframes that are about:blank or are from an ad host (even if ad blocking is not enabled), fixes #210
* logging: log behavior wait start and success, in addition to error, with url in details
* crawl state: add getPendingList() to return pending state from either memory or redis crawl state; fix stats logging with redis state; return pending list as a json object
logging: check if the data object is an error and log fields from the error; convert remaining console.* calls to the new logger
* evaluate failure: log as error, not fatal
- Add Logger class with methods for info, error, warn, debug, fatal (see the sketch after this list)
- Add context, timestamp, and details fields to log entries
- Log messages as JSON Lines
- Replace puppeteer-cluster stats with custom stats implementation
- Log behaviors by default
- Amend argParser to reflect logging changes
- Capture and log stdout/stderr from awaited child_processes
- Modify tests to use webrecorder.net to avoid timeouts
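A minimal sketch of the JSON Lines logger; the exact field layout and the fatal exit code are assumptions:
```js
class Logger {
  logAsJSON(message, details, context, logLevel) {
    console.log(JSON.stringify({
      timestamp: new Date().toISOString(),
      logLevel,
      context,
      message,               // a single string; variables go in details
      details: details || {},
    }));
  }
  info(message, details = {}, context = "general")  { this.logAsJSON(message, details, context, "info"); }
  warn(message, details = {}, context = "general")  { this.logAsJSON(message, details, context, "warn"); }
  debug(message, details = {}, context = "general") { this.logAsJSON(message, details, context, "debug"); }
  error(message, details = {}, context = "general") { this.logAsJSON(message, details, context, "error"); }
  fatal(message, details = {}, context = "general") {
    this.logAsJSON(message, details, context, "fatal");
    process.exit(1);  // assumed exit code
  }
}
```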
* profiles: use vnc for automatic profile creation (fixes #194):
- add x11vnc and serve via vnc when not headless, keep existing screencast for headless mode
- use @novnc/novnc to serve vnc JS library
- add novnc_lite.html to serve the content from an iframe
- optimization: don't show initial blank page / don't wait for initial page in puppeteer
* more vnc work:
- set position of browser at 0,0, avoid needing offset to fit
- add /vncpass endpoint to query vnc password (for use with browsertrix-cloud)
- remove websockify, x11vnc now supports ws connections directly!
- vnc_lite: support reconnecting ws if gracefully disconnected
* x11vnc cleanup: just pass password via cmdline to simplify setup
* make interactive profile creation the default; automated mode enabled only if --automated or --username / --password flags are specified
README updates:
- mention new VNC-based streaming
- mention new --automated flag, move automated info below interactive
* README: adjust auto-login example to use mastodon example instead of twitter, which works more consistently
* Add screenshot and thumbnail functionality
Introduces a --screenshot CLI option, which takes a comma-separated
list of screenshot types: view,fullPage,thumbnail.
In addition, this commit:
- Adds '--experimental-global-webcrypto' to ensure webcrypto is
available in node
- Deprecates newContext, instead always using page context for 1 worker
and window context for >1 worker
* Separate screenshotTypes into exported const
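A sketch of the exported const; the exact option shapes are assumptions, though type/fullPage are standard puppeteer screenshot options:
```js
export const screenshotTypes = {
  view:      { type: "png",  fullPage: false },
  fullPage:  { type: "png",  fullPage: true },
  thumbnail: { type: "jpeg", fullPage: false },
};

// --screenshot view,fullPage,thumbnail -> take each requested screenshot type
// const requested = argv.screenshot.split(",").filter((t) => t in screenshotTypes);
```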
Co-authored-by: Emma Dickson <emmadickson@Emmas-MacBook-Air.local>
* ad blocking via request interception, extending block rules system, adding new AdBlockRules
* Load list of hosts to block from https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts, added as json at image build time
* Enabled via --blockAds and setting a custom message via --adBlockMessage
* new test to check for ad blocking
* Add test-crawls dir to .gitignore and .dockerignore
* switch base image to chrome/chromium 105 with node 18.x
* convert all source to esm for node 18.x, remove unneeded node-fetch dependency
* ci: use node 18.x, update to latest actions
* tests: convert to esm, run with --experimental-vm-modules
* tests: set higher default timeout (90s) for all tests
* tests: rename driver test fixture to .mjs for loading in jest
* bump to 0.8.0
* interrupts: simplify interrupt behavior:
- SIGTERM/SIGINT behave the same way: trigger a graceful shutdown after page load
improvements of remote state / parallel crawlers (for browsertrix-cloud):
- SIGUSR1 before SIGINT/SIGTERM ensures data is saved and marks the crawler as done - for use with gracefully stopping a crawl
- SIGUSR2 before SIGINT/SIGTERM ensures data is saved but does not mark the crawler as done - for use when scaling down a single crawler
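A sketch of the signal wiring; the flag names are assumptions:
```js
let interrupted = false;
let saveStateOnExit = false;
let markCrawlDone = false;

process.on("SIGUSR1", () => { saveStateOnExit = true; markCrawlDone = true; });
process.on("SIGUSR2", () => { saveStateOnExit = true; markCrawlDone = false; });

for (const sig of ["SIGINT", "SIGTERM"]) {
  // both signals trigger the same graceful shutdown after the current page loads
  process.on(sig, () => { interrupted = true; });
}
```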
* scope check: check scope of URL retrieved from queue (in case scoping rules changed); URLs matching a seed are automatically in scope! (see the sketch after this list)
- netIdleWait better defaults: if not set, set to 15 seconds for page/page-spa scope, otherwise to 2 seconds
- default behaviors: include autoscroll in default behavior as well
- restart: if crawl already done, don't attempt to crawl further. if 'waitOnDone' set, wait for signal before exiting.
- bump to puppeteer-core 17.1.2
- bump to 0.7.0-beta.4
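A sketch of re-checking scope when a URL is pulled from the queue; crawlState, scopedSeeds, and seed.isIncluded() are assumed names:
```js
async function nextInScope(crawlState, scopedSeeds, logger) {
  const data = await crawlState.nextFromQueue();
  if (data && !scopedSeeds.some((seed) => seed.isIncluded(data.url, data.depth))) {
    // scoping rules may have changed since this URL was queued
    logger.debug("Queued URL no longer in scope, skipping", { url: data.url });
    return null;
  }
  return data;
}
```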
* logging: add 'jserrors' option to --logging to print JS errors
* browser config: use flags from playwright
* browser: use socat to allow connecting to devtools on port 9222 while crawling
* new window: use cdp instead of window.open
* new window tweaks: add reuseCount, use browser.target() instead of opening a new blank page
* rename NewWindowPage -> ReuseWindowConcurrency, move to windowconcur.js
potential fix for #156
* browser repair:
- when using window-concurrency, attempt to repair / relaunch browser if cdp errors occur
- mark pages as failed and don't reuse if page error or cdp errors occur
- screencaster: clear previous targets if screencasting when repairing browser
* bump version to 0.7.0-beta.3
improved logging of pywb + redis:
- if 'logging' includes 'pywb', log pywb and redis output, to pywb.log and redis.log
- otherwise, just ignore (don't print to stdout as that's too confusing)
- print if wb-manager fails, likely due to existing collection
waitUntil: default to just 'load' to avoid potential infinite loop, separate --netIdle can configure idle wait
dependency: update to latest puppeteer-core (16.1.0)
- add --netIdleWait option, default to 10 seconds - necessary for some sites that start fetching immediately after page load
- add openssl.conf to allow pywb to avoid 'unsafe legacy renegotiation disabled' from openssl
- update to browsertrix-behaviors 0.3.2
- update current url for screencasting of page before page load starts
bump to 0.7.0-beta.2
* update base image
- switch to browsertrix-base-image:101 with chrome/chromium 101
- includes additional fonts and ubuntu 22.04 as base
- add --disable-site-isolation-trials as default flag to support behaviors accessing iframes
* debugging support for shared redis state:
- support pausing crawler indefinitely if crawl state is set to 'debug'
- must be set/unset manually via external redis
- designed for browsertrix-cloud for now
bump to 0.7.0-beta.0
* new options:
- to support browsertrix-cloud, add a --waitOnDone option, which has browsertrix crawler wait when finished
- when running with redis shared state, set the `<crawl id>:status` field to `running`, `failing`, `failed` or `done` to let the job controller know when the crawl is finished (see the sketch after this list)
- set redis state to `failing` in case of an exception; set to `failed` in case of 3 or more failed exits within 60 seconds (todo: make customizable)
- when receiving a SIGUSR1, assume final shutdown and finalize files (eg. save WACZ) before exiting.
- also write WACZ if exiting due to size limit exceeded, but not due to other interruptions
- change sleep() to be in seconds
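A sketch of the status field updates, assuming ioredis; the `<crawl id>:status` key is from the notes above:
```js
async function setStatus(redis, crawlId, status) {
  // one of: "running", "failing", "failed", "done"
  await redis.set(`${crawlId}:status`, status);
}

// e.g. on an unhandled exception:
// await setStatus(redis, crawlId, "failing");
```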
* misc fixes:
- crawlstate.finished() -> isFinished() - returns true if >0 pages crawled and none left in the queue
- don't fail crawl if isFinished() is true
- don't keep looping in the pending wait for URLs to finish if an abort request was received
* screencast improvements (fix related to webrecorder/browsertrix-cloud#233)
- more optimized screencasting, don't close and restart after every page.
- don't assume targets change after every page, they don't in window mode!
- only send 'close' message when target is actually closed
* bump to 0.6.0
- Add optional health check via `--healthCheckPort`. If set, runs a server on the designated port that returns 200 if the healthcheck succeeds (num of consecutive failed page loads < 2*num workers), or 503 if it fails. Useful for k8s health checks (see the sketch after this list)
- Add crawl size limit (in bytes), via `--sizeLimit`. Crawl exits (and state optionally saved) when size limit is exceeded.
- Add crawl total time limit (in seconds), via `--timeLimit`. Crawl exits (and state optionally saved) when total running time is exceeded.
- Add option to overwrite existing collection. If `--overwrite` is included, any existing data for specified collection is deleted.
- S3 Storage refactor, simplify, don't add additional paths by default.
- Add interpolateFilename as generic utility, supported in filename and STORE_PATH env value.
- wacz save: reenable wacz validation after save.
- Profiles: support /navigate endpoint, return origins from /ping, prevent opening new tabs.
- bump to 0.6.0-beta.1
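A sketch of the health check server described at the top of this list, using the threshold from the notes:
```js
import http from "http";

function startHealthCheckServer(port, numWorkers, getFailedPageLoads) {
  http.createServer((req, res) => {
    // healthy while consecutive failed page loads stay under 2 * num workers
    const healthy = getFailedPageLoads() < 2 * numWorkers;
    res.writeHead(healthy ? 200 : 503);  // 503 tells k8s the pod is unhealthy
    res.end();
  }).listen(port);
}
```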
* interactive profile api improvements:
- refactor profile creation into separate class
- if profile starts with '@', load as relative path using current s3 storage
- support uploading profiles to s3
- profile api: support filename passed to /createProfileJS as part of json POST
- profile api: support /ping to keep profile browser running, --shutdownWait to add autoshutdown timeout (extendable via ping)
- profile api: add /target to retrieve target and /navigate to navigate by url.
* bump to 0.6.0-beta.0