Restore the previous behaviour where the Prometheus /metrics endpoint
required auth if auth was enabled.
A new -prometheus-no-auth flag allows you to override this and disable
auth for that specific endpoint.
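How this could be wired is sketched below, assuming hypothetical
config fields and an auth middleware; none of these names are the
actual rest-server identifiers.

    package server

    import (
        "net/http"

        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    type config struct {
        AuthEnabled      bool // auth was enabled for the server
        PrometheusNoAuth bool // set by the -prometheus-no-auth flag
    }

    // metricsHandler wraps the Prometheus handler in the auth
    // middleware whenever auth is enabled, unless the operator
    // explicitly opted out with -prometheus-no-auth.
    func metricsHandler(cfg config, auth func(http.Handler) http.Handler) http.Handler {
        h := promhttp.Handler()
        if cfg.AuthEnabled && !cfg.PrometheusNoAuth {
            h = auth(h)
        }
        return h
    }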
- Helper method for internal server errors with consistent logging.
- Add PanicOnError option to panic on internal server errors. This
makes it easier to trace where the condition was hit in testing (see
the sketch after this list).
- Do not allow '.' as a path component, because it undermines depth
checks, and add tests
- Fix GiB reporting
- Fix metrics label
- Helper function for HTTP errors
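The pattern behind the error helpers above could look like this
sketch; Server, PanicOnError, and the helper names are illustrative,
not necessarily the real identifiers:

    package server

    import (
        "log"
        "net/http"
    )

    type Server struct {
        PanicOnError bool // panic instead of returning 500, useful in tests
    }

    // internalServerError logs internal errors in one consistent place.
    // With PanicOnError set, it panics so the stack trace points at the
    // exact code path that failed.
    func (s *Server) internalServerError(w http.ResponseWriter, err error) {
        log.Printf("internal server error: %v", err)
        if s.PanicOnError {
            panic(err)
        }
        httpDefaultError(w, http.StatusInternalServerError)
    }

    // httpDefaultError writes the default message for an HTTP status code.
    func httpDefaultError(w http.ResponseWriter, code int) {
        http.Error(w, http.StatusText(code), code)
    }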
This contains all the glue to make Server use the new repo.Handler:
- Remove all old handlers
- Add ServeHTTP to make Server a single http.Handler
- Remove Goji routing and replace it with net/http and custom routing
logic (see the sketch below)
Additionally, this implements two-level backup repositories.
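A minimal sketch of the ServeHTTP glue, assuming a handler field of
the new repo.Handler type and a metrics handler; the field names are
illustrative:

    package server

    import "net/http"

    type Server struct {
        metrics http.Handler // Prometheus /metrics endpoint
        handler http.Handler // the new repo.Handler
    }

    // ServeHTTP makes Server itself an http.Handler, so it can be
    // passed straight to http.ListenAndServe with no routing library.
    func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
        if r.URL.Path == "/metrics" {
            s.metrics.ServeHTTP(w, r)
            return
        }
        // All other requests go to the repository handler, which does
        // its own path parsing (including two-level repositories).
        s.handler.ServeHTTP(w, r)
    }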
Goji routes incoming requests without first URL-decoding the path, so
'%2F' in a URL is not decoded to a '/' before routing. But by the time
we perform the path checks for private URLs on r.URL.Path, these
characters have been decoded.
As a consequence, a user 'foo' could use 'foo%2Fbar' as the repo name.
The private repo check would see that the path starts with 'foo/' and
allow it, and rest-server would happily create a 'foo/bar' repo. Other
more harmful variants are possible.
To resolve this issue, we now reject any name part that contains a '/'.
Additionally, we immediately reject a few other characters that are
disallowed under some operating systems or filesystems.
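A sketch of such a check, with the exact rejected character set being
an assumption:

    package server

    import "strings"

    // validName reports whether a single path component is safe. A '/'
    // inside a component (e.g. from an encoded %2F) is never allowed,
    // and '.' or '..' would undermine the depth checks.
    func validName(name string) bool {
        if name == "" || name == "." || name == ".." {
            return false
        }
        // '\', ':' and NUL are disallowed on some operating systems
        // or filesystems, so reject them outright too.
        return !strings.ContainsAny(name, "/\\:\x00")
    }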
* Add --max-size flag to limit repository size
* Only update repo size on successful write
* Use initial size as current size for first SaveBlob
* Apply LimitReader to request body (see the sketch after this list)
* Use HTTP 413 for size overage responses
* Refactor size limiting; do checks after every write
* Remove extra commented lines, d'oh
* Account for deleting blobs when counting space usage
* Remove extra commented line
* Fix unrelated bug (inverted err check)
* Update comment to trigger new CI build
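A minimal sketch of the size limiting described above, not the actual
implementation; the field names and the write stand-in are assumptions:

    package server

    import (
        "io"
        "net/http"
        "sync/atomic"
    )

    type sizeLimiter struct {
        maxSize     int64 // from the --max-size flag
        currentSize int64 // updated only after successful writes
    }

    func (s *sizeLimiter) saveBlob(w http.ResponseWriter, r *http.Request) {
        remaining := s.maxSize - atomic.LoadInt64(&s.currentSize)
        if remaining <= 0 {
            http.Error(w, "repository full", http.StatusRequestEntityTooLarge)
            return
        }
        // LimitReader caps how much we ever read; the extra byte lets
        // us detect that the body exceeded the remaining quota.
        body := io.LimitReader(r.Body, remaining+1)
        n, err := io.Copy(io.Discard, body) // stand-in for the real file write
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        if n > remaining {
            // Real code would also delete the partially written blob.
            http.Error(w, "repository size limit exceeded",
                http.StatusRequestEntityTooLarge)
            return
        }
        // Count the bytes only after a successful write; deletions
        // decrement this counter the same way.
        atomic.AddInt64(&s.currentSize, n)
    }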
Admittedly, in some places we just document the fact that we ignore
error return values, because we don't know what to do with them. At
least the linter is happy.
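For example, explicitly discarding the result documents the decision
and keeps linters such as errcheck quiet:

    package server

    import "net/http"

    func writeStatus(w http.ResponseWriter, msg string) {
        // Ignore the error: there is nothing sensible we can do if
        // the client connection is already gone.
        _, _ = w.Write([]byte(msg))
    }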
Exposes a few metrics for Prometheus under /metrics if started with --prometheus.
Example:
# HELP rest_server_blob_read_bytes_total Total number of bytes read from blobs
# TYPE rest_server_blob_read_bytes_total counter
rest_server_blob_read_bytes_total{repo="test",type="data"} 2.13557024e+09
rest_server_blob_read_bytes_total{repo="test",type="index"} 1.198653e+06
rest_server_blob_read_bytes_total{repo="test",type="keys"} 5388
rest_server_blob_read_bytes_total{repo="test",type="locks"} 1975
rest_server_blob_read_bytes_total{repo="test",type="snapshots"} 10018
# HELP rest_server_blob_read_total Total number of blobs read
# TYPE rest_server_blob_read_total counter
rest_server_blob_read_total{repo="test",type="data"} 3985
rest_server_blob_read_total{repo="test",type="index"} 21
rest_server_blob_read_total{repo="test",type="keys"} 12
rest_server_blob_read_total{repo="test",type="locks"} 12
rest_server_blob_read_total{repo="test",type="snapshots"} 32
# HELP rest_server_blob_write_bytes_total Total number of bytes written to blobs
# TYPE rest_server_blob_write_bytes_total counter
rest_server_blob_write_bytes_total{repo="test",type="data"} 1.063726179e+09
rest_server_blob_write_bytes_total{repo="test",type="index"} 395586
rest_server_blob_write_bytes_total{repo="test",type="locks"} 1975
rest_server_blob_write_bytes_total{repo="test",type="snapshots"} 1933
# HELP rest_server_blob_write_total Total number of blobs written
# TYPE rest_server_blob_write_total counter
rest_server_blob_write_total{repo="test",type="data"} 226
rest_server_blob_write_total{repo="test",type="index"} 6
rest_server_blob_write_total{repo="test",type="locks"} 12
rest_server_blob_write_total{repo="test",type="snapshots"} 6
The Linux kernel page cache ALWAYS knows better. Fighting it brings
only worse performance. Usage of fadvise() is wrong 9 times out of 10.
Removing the whole fs package brings a nice 100% speedup when running
the costly prune command. And that was measured on localhost; the
improvement could be much bigger on a network with higher latency.