mirror of
https://codeberg.org/forgejo/forgejo.git
synced 2025-12-08 06:29:47 +00:00
If, for any reason (e.g. a server crash), a task is recorded as done in the database but its logs are still in the database instead of in storage, they need to be collected. The `log_in_storage` field is only set to true after the logs have been transferred to storage, so it can be relied upon to identify which tasks have lingering logs. A cron job collects lingering logs every day, 3000 at a time, sleeping one second between them. In normal circumstances there will only be a few of them, even on a large instance, and there is no need to collect them as quickly as possible. When for some reason there are a lot of them, garbage collection must happen at a rate that is not too hard on storage I/O.

Refs https://codeberg.org/forgejo/forgejo/issues/9999

---

Note on backports: the v11 backport is done manually because of minor conflicts.
https://codeberg.org/forgejo/forgejo/pulls/10024

## Checklist

The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org).

### Tests

- I added test coverage for Go changes...
  - [x] in their respective `*_test.go` for unit tests.
  - [x] in the `tests/integration` directory if it involves interactions with a live Forgejo server.
- I added test coverage for JavaScript changes...
  - [ ] in `web_src/js/*.test.js` if it can be unit tested.
  - [ ] in `tests/e2e/*.test.e2e.js` if it requires interactions with a live Forgejo server (see also the [developer guide for JavaScript testing](https://codeberg.org/forgejo/forgejo/src/branch/forgejo/tests/e2e/README.md#end-to-end-tests)).

### Documentation

- [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change.
- [x] I did not document these changes and I do not expect someone else to do it.

### Release notes

- [ ] I do not want this change to show in the release notes.
- [x] I want the title to show in the release notes with a link to this pull request.
- [ ] I want the content of the `release-notes/<pull request number>.md` to be used for the release notes instead of the title.

<!--start release-notes-assistant-->

## Release notes
<!--URL:https://codeberg.org/forgejo/forgejo-->
- Bug fixes
  - [PR](https://codeberg.org/forgejo/forgejo/pulls/10009): <!--number 10009 --><!--line 0 --><!--description Z2FyYmFnZSBjb2xsZWN0IGxpbmdlcmluZyBhY3Rpb25zIGxvZ3M=-->garbage collect lingering actions logs<!--description-->
<!--end release-notes-assistant-->

Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/10009
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.codeberg.org>
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
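The daily collection described above (batch limit, per-task pause, and marking `log_in_storage` only after a successful transfer) can be sketched as follows. This is a minimal illustration, not the actual Forgejo implementation: `task`, `collectLingeringLogs`, and the `transfer` callback are hypothetical stand-ins for the real cron job and storage code.

```go
package main

import (
	"fmt"
	"time"
)

// task stands in for an actions task whose logs may still be lingering in
// the database; LogInStorage mirrors the log_in_storage field described above.
type task struct {
	ID           int64
	LogInStorage bool
}

// collectLingeringLogs moves at most batchSize lingering logs per run,
// sleeping `pause` between tasks so garbage collection is not too hard on
// storage I/O. transfer is a stand-in for the real "move logs to storage"
// step. It returns how many tasks were collected in this run.
func collectLingeringLogs(tasks []*task, batchSize int, pause time.Duration, transfer func(*task) error) int {
	collected := 0
	for _, t := range tasks {
		if t.LogInStorage {
			continue // logs already in storage, nothing to do
		}
		if collected >= batchSize {
			break // leave the rest for the next daily run
		}
		if err := transfer(t); err != nil {
			continue // retry on a later run
		}
		t.LogInStorage = true // only set after a successful transfer
		collected++
		time.Sleep(pause)
	}
	return collected
}

func main() {
	tasks := []*task{{ID: 1}, {ID: 2, LogInStorage: true}, {ID: 3}, {ID: 4}}
	n := collectLingeringLogs(tasks, 3000, time.Millisecond, func(t *task) error {
		fmt.Printf("moving logs of task %d to storage\n", t.ID)
		return nil
	})
	fmt.Println("collected:", n)
}
```

With the real cron job the batch size is 3000 and the pause is one second; the tiny values here only keep the example fast.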
131 lines
4.1 KiB
Go
```go
// Copyright 2022 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT

package dbfs

import (
	"context"
	"io/fs"
	"os"
	"path"
	"time"

	"forgejo.org/models/db"
)

/*
The reasons behind the DBFS (database-filesystem) package:

When a Gitea action is running, the Gitea action server should collect and store all the logs.

The requirements are:
* The running logs must be stored across the cluster if the Gitea servers are deployed as a cluster.
* The logs will be archived to Object Storage (S3/MinIO, etc.) after a period of time.
* The Gitea action UI should be able to render the running logs and the archived logs.

Some possible solutions for the running logs:
* [Not ideal] Using a local temp file: it cannot be shared across the cluster.
* [Not ideal] Using a shared file in the filesystem of the git repository: although at the moment the Gitea
  cluster's git repositories must be stored in a shared filesystem, in the future Gitea may need a dedicated
  Git Service Server to decouple the shared filesystem. Then the action logs would become a blocker.
* [Not ideal] Recording the logs in a database table line by line: it has a couple of problems:
  - It's difficult to maintain multiple increasing sequences (log line numbers) across different databases.
  - The database table will have a lot of rows and be affected by the big-table performance problem.
  - It's difficult to load logs through the same interface as other storages.
  - It's difficult to calculate the size of the logs.

The DBFS solution:
* It can be used in a cluster.
* It can share the same interface (Read/Write/Seek) as other storages.
* It's very friendly to the database because it only needs to store far fewer rows than the line-by-line solution.
* In the future, when Gitea Actions needs to limit the log size (other CI/CD services also do so), it's easier
  to calculate the log file size.
* Even when the UI needs to render the tailing lines, they can be found by counting the "\n" characters from
  the end of the file using seek. Seeking and scanning is not the fastest way, but it's still acceptable and
  won't affect performance too much.
*/

type DbfsMeta struct { //revive:disable-line:exported
	ID              int64  `xorm:"pk autoincr"`
	FullPath        string `xorm:"VARCHAR(500) UNIQUE NOT NULL"`
	BlockSize       int64  `xorm:"BIGINT NOT NULL"`
	FileSize        int64  `xorm:"BIGINT NOT NULL"`
	CreateTimestamp int64  `xorm:"BIGINT NOT NULL"`
	ModifyTimestamp int64  `xorm:"BIGINT NOT NULL"`
}

type DbfsData struct { //revive:disable-line:exported
	ID         int64  `xorm:"pk autoincr"`
	Revision   int64  `xorm:"BIGINT NOT NULL"`
	MetaID     int64  `xorm:"BIGINT index(meta_offset) NOT NULL"`
	BlobOffset int64  `xorm:"BIGINT index(meta_offset) NOT NULL"`
	BlobSize   int64  `xorm:"BIGINT NOT NULL"`
	BlobData   []byte `xorm:"BLOB NOT NULL"`
}

func init() {
	db.RegisterModel(new(DbfsMeta))
	db.RegisterModel(new(DbfsData))
}

func OpenFile(ctx context.Context, name string, flag int) (File, error) {
	f, err := newDbFile(ctx, name)
	if err != nil {
		return nil, err
	}
	err = f.open(flag)
	if err != nil {
		_ = f.Close()
		return nil, err
	}
	return f, nil
}

func Open(ctx context.Context, name string) (File, error) {
	return OpenFile(ctx, name, os.O_RDONLY)
}

func Create(ctx context.Context, name string) (File, error) {
	return OpenFile(ctx, name, os.O_RDWR|os.O_CREATE|os.O_TRUNC)
}

func Rename(ctx context.Context, oldPath, newPath string) error {
	f, err := newDbFile(ctx, oldPath)
	if err != nil {
		return err
	}
	defer f.Close()
	return f.renameTo(newPath)
}

func Remove(ctx context.Context, name string) error {
	f, err := newDbFile(ctx, name)
	if err != nil {
		return err
	}
	defer f.Close()
	return f.delete()
}

var _ fs.FileInfo = (*DbfsMeta)(nil)

func (m *DbfsMeta) Name() string {
	return path.Base(m.FullPath)
}

func (m *DbfsMeta) Size() int64 {
	return m.FileSize
}

func (m *DbfsMeta) Mode() fs.FileMode {
	return os.ModePerm
}

func (m *DbfsMeta) ModTime() time.Time {
	return fileTimestampToTime(m.ModifyTimestamp)
}

func (m *DbfsMeta) IsDir() bool {
	return false
}

func (m *DbfsMeta) Sys() any {
	return nil
}
```