browsertrix-crawler/docker-entrypoint.sh
Ilya Kreymer 1fe810b1df
Improved support for running as non-root (#503)
This PR provides improved support for running crawler as non-root,
matching the user to the uid/gid of the crawl volume.

This fixes the initial regression from 0.12.4 (#502), where `chmod u+x` was
used instead of `chmod a+x` on the node binary files.
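
To illustrate why the two modes differ for a non-owner user, here is a minimal sketch. It uses a temporary file as a stand-in for the node binary files, and `demo_modes` is a hypothetical helper that is not part of the PR. `u+x` grants execute only to the file's owner, while `a+x` grants it to owner, group, and others.

```shell
#!/bin/sh
# Hypothetical demo: compare the permission bits left by `chmod u+x`
# and `chmod a+x` on a throwaway file (not the real node binaries).
demo_modes() {
    f=$(mktemp)
    chmod 644 "$f"
    chmod u+x "$f"                # execute bit for the owner only
    u_mode=$(stat -c '%a' "$f")
    chmod 644 "$f"
    chmod a+x "$f"                # execute bit for owner, group, and others
    a_mode=$(stat -c '%a' "$f")
    rm -f "$f"
    echo "$u_mode $a_mode"
}

demo_modes   # prints "744 755"
```

With mode 744, a user other than the file's owner cannot execute it, which is the situation a crawler running under a different uid would hit.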

However, that alone was not enough to get the same signal handling and
graceful shutdown as when running as the same user. To make the
different-user path behave the same way:
- switch to `gosu` instead of `su` (added in Brave 1.64.109
image)
- run all child processes as detached (redis-server, socat, wacz, etc.)
so that they are not automatically killed by SIGINT/SIGTERM
- control detached mode via the `DETACHED_CHILD_PROC=1` env
variable, set to 1 by default in the Dockerfile (to allow for overrides
just in case)
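
As a rough shell analogy for the detached behavior (the crawler's own implementation is not shown here, and `run_child` and `CHILD_PID` are hypothetical names), a child can be started in its own session with `setsid` so that signals sent to the parent's process group do not reach it directly. The sketch assumes `setsid` from util-linux is available and falls back to a plain background job if it is not.

```shell
#!/bin/sh
# Hypothetical helper: start a command in the background, in its own
# session when DETACHED_CHILD_PROC=1 and setsid is available.
run_child() {
    if [ "${DETACHED_CHILD_PROC:-0}" = "1" ] && command -v setsid >/dev/null 2>&1; then
        # New session: the child leaves the parent's process group
        setsid "$@" &
    else
        # Same process group as the parent
        "$@" &
    fi
    CHILD_PID=$!
}
```

A caller would then be responsible for shutting the child down explicitly, for example by signalling `$CHILD_PID` during its own graceful shutdown.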

A test has been added which runs one of the tests with a non-root
`test-crawls` directory to test the different user path. The test
(saved-state.test.js) includes sending interrupt signals and graceful
shutdown and allows testing of those features for a non-root gosu
execution.

Also bumping crawler version to 1.0.1
2024-03-21 08:16:59 -07:00

27 lines
623 B
Bash
Executable file

#!/bin/sh
# Get UID/GID from volume dir
VOLUME_UID=$(stat -c '%u' /crawls)
VOLUME_GID=$(stat -c '%g' /crawls)
# Get the UID/GID we are running as
MY_UID=$(id -u)
MY_GID=$(id -g)
# If we aren't running as the owner of the /crawls/ dir then add a new user
# btrix with the same UID/GID of the /crawls dir and run as that user instead.
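if [ "$MY_GID" != "$VOLUME_GID" ] || [ "$MY_UID" != "$VOLUME_UID" ]; then
    # Create the btrix group, then set its GID to match the volume
    # (-o permits a GID that is already in use)
    groupadd btrix
    groupmod -o --gid "$VOLUME_GID" btrix
    # Create the btrix user with a home dir and bash shell, then set its
    # UID to match the volume (-o permits a UID that is already in use)
    useradd -ms /bin/bash -g "$VOLUME_GID" btrix
    usermod -o -u "$VOLUME_UID" btrix > /dev/null
    # Replace this shell with the requested command, running as btrix
    exec gosu btrix:btrix "$@"
else
    exec "$@"
fi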
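
The ownership comparison at the top of the script can be tried in isolation. The following is a minimal sketch, assuming GNU `stat`; `needs_user_switch` is a hypothetical helper that is not part of the entrypoint. It takes the directory as an argument instead of hard-coding `/crawls`.

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 when the current uid or gid differs
# from the owner of the given directory, i.e. when the entrypoint would
# take the gosu branch; non-zero when they match.
needs_user_switch() {
    dir=$1
    dir_uid=$(stat -c '%u' "$dir")
    dir_gid=$(stat -c '%g' "$dir")
    [ "$(id -g)" != "$dir_gid" ] || [ "$(id -u)" != "$dir_uid" ]
}

# Example usage:
#   if needs_user_switch /crawls; then echo "would run as btrix"; fi
```

Passing the directory as a parameter keeps the check usable against a test directory such as a non-root `test-crawls` volume.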