5 commits

Author SHA1 Message Date
Ilya Kreymer
bc4a95883d
Clear out core dumps to avoid using up volume space (#740)
- add 'ulimit -c' to startup script
- delete any './core' files that exist in working dir just in case
- fixes #738
2025-01-16 15:50:59 -08:00
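
The two steps listed in this commit can be sketched in shell. This is a minimal illustration, not the actual startup script: the limit value of `0` and the helper name `remove_core_files` are assumptions, since the message only names `ulimit -c` and `./core` files.

```sh
#!/bin/sh
# Assumption: a core file size limit of 0 is what disables core dumps;
# the commit message only names 'ulimit -c'.
ulimit -c 0

# Hypothetical helper: delete a leftover 'core' file from a directory
# (defaults to the current working directory).
remove_core_files() {
  dir="${1:-.}"
  rm -f "$dir/core"
}

remove_core_files .
```

The limit is inherited by child processes started from the same shell, which is why setting it once in a startup script is usually enough.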
Ilya Kreymer
1fe810b1df
Improved support for running as non-root (#503)
This PR provides improved support for running crawler as non-root,
matching the user to the uid/gid of the crawl volume.

This fixes #502, the initial regression from 0.12.4, where `chmod u+x` was
used instead of `chmod a+x` on the node binary files.

However, that was not enough to fully support signal handling and
graceful shutdown equivalent to running as the same user. To make the
different-user path work the same way:
- switch to `gosu` instead of `su` (added in Brave 1.64.109 image)
- run all child processes as detached (redis-server, socat, wacz, etc.)
to avoid them being killed automatically via SIGINT/SIGTERM
- running detached is controlled via the `DETACHED_CHILD_PROC=1` env
variable, set to 1 by default in the Dockerfile (to allow for overrides
just in case)

A test has been added which runs one of the tests with a non-root
`test-crawls` directory to test the different user path. The test
(saved-state.test.js) includes sending interrupt signals and graceful
shutdown and allows testing of those features for a non-root gosu
execution.

Also bumping crawler version to 1.0.1
2024-03-21 08:16:59 -07:00
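
The difference between `chmod u+x` and `chmod a+x` mentioned in this commit can be seen on a throwaway file. This is only a sketch: the temporary file stands in for a node binary, and `perm_string` is a hypothetical helper, not part of the project.

```sh
#!/bin/sh
# Hypothetical helper: print the permission string of a file
# as shown by 'ls -l' (first 10 characters).
perm_string() {
  ls -l "$1" | cut -c1-10
}

demo=$(mktemp)          # stand-in for a node binary file
chmod 644 "$demo"

chmod u+x "$demo"       # execute bit for the owner only
perm_string "$demo"     # -rwxr--r--

chmod a+x "$demo"       # execute bit for owner, group, and others
perm_string "$demo"     # -rwxr-xr-x

rm -f "$demo"
```

With `u+x`, a process running as any user other than the file owner cannot execute the file, which matches the non-root regression described above.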
Tessa Walsh
b303af02ef
Add --title and --description CLI args to write metadata into datapackage.json (#276)
Multi-word values including spaces must be enclosed in double quotes.

Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2023-04-04 10:46:03 -04:00
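
The quoting requirement follows from ordinary shell word splitting, which can be shown without running the crawler at all. In this sketch, `count_args` is a hypothetical helper and the title value is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: report how many arguments a command would receive.
count_args() {
  echo "$#"
}

count_args --title "Example Crawl"   # 2: the flag plus one quoted value
count_args --title Example Crawl     # 3: the value is split at the space
```

Without the quotes, only the first word would be seen as the value of `--title`, and the rest would arrive as a separate argument.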
Ed Summers
cd17764b77
Check if group/user exists (#176)
Ensure that group and user do not already exist before creating them.

Fixes #174
2022-11-03 17:28:13 -07:00
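
One common way to make user and group creation safe to repeat is to look the names up with `getent` first. The sketch below assumes a Linux system with `getent`, `groupadd`, and `useradd` available. The function names are hypothetical and the commit's actual implementation may differ.

```sh
#!/bin/sh
# Return success if the named group or user already exists.
group_exists() {
  getent group "$1" >/dev/null 2>&1
}

user_exists() {
  getent passwd "$1" >/dev/null 2>&1
}

# Hypothetical usage: create the group and user only when missing.
# Creating accounts requires root, so this is defined but not called here.
ensure_user() {
  name="$1"; uid="$2"; gid="$3"
  group_exists "$name" || groupadd --gid "$gid" "$name"
  user_exists "$name" || useradd --uid "$uid" --gid "$gid" --no-create-home "$name"
}
```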
Ed Summers
3ba64535a5
Run in Docker as User (#171)
* Run in Docker as User

This follows a similar pattern to pywb to run as the user that owns the
crawls directory.

bump version to 0.7.0-beta.6

Closes #170
2022-09-28 12:49:52 -07:00
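
The pattern of running as the user that owns the crawls directory can be sketched as follows. This is an illustration under assumptions: it relies on GNU `stat`, the function names are hypothetical, and it uses `gosu` (which the later commit above switched to) rather than whatever this commit originally used.

```sh
#!/bin/sh
# Print "uid:gid" for the owner of a directory (GNU coreutils 'stat').
dir_owner() {
  stat -c '%u:%g' "$1"
}

# Hypothetical entrypoint logic: run the given command as the owner
# of the directory, dropping privileges only when the ids differ.
# Defined but not called here, since it replaces the current process.
run_as_dir_owner() {
  dir="$1"; shift
  owner=$(dir_owner "$dir")
  if [ "$owner" = "$(id -u):$(id -g)" ]; then
    exec "$@"
  else
    exec gosu "$owner" "$@"
  fi
}
```

Matching the directory owner means files written during a crawl end up owned by the same user as the mounted volume, rather than by root.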