Commit graph

5 commits

Author SHA1 Message Date
Niklas Haas
a37c00c4e9 avformat/shared: add missing ret = 0
Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-14 11:10:12 +02:00
Andreas Rheinhardt
5b9d8901a9 avformat/shared: use av_fallthrough to mark fallthroughs
Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-06-04 19:43:15 +02:00
Niklas Haas
afce637550 avformat/shared: add option to verify cache file contents
This will effectively disable the cache but allows the cache layer to verify
cached files against the original input file. Useful only for debugging
the shared cache protocol itself, as file corruption can already be caught by
the CRC check.
2026-06-04 17:48:12 +02:00
Niklas Haas
ca748964fe avformat/shared: implement 16-bit CRC check
Decided to split this off from the previous commit in case we
ever want to revert it, since it does double the overhead of the spacemap
as well as adding extra overhead to both the read and write path.

Bump the cache version to 2 to reflect the changed disk format.
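
As a rough illustration of what a per-block 16-bit CRC check can look like (not the code from this commit): the sketch below assumes the CRC-16/CCITT-FALSE parameters (polynomial 0x1021, initial value 0xFFFF), which the message above does not specify. The names `crc16_ccitt` and `block_matches_crc` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
 * no bit reflection, no final XOR. The choice of polynomial is an
 * assumption made for this sketch only. */
static uint16_t crc16_ccitt(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0xFFFF;

    assert(buf || !len);
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Hypothetical read-path check: returns 1 if the block data still matches
 * the checksum that was stored alongside it when it was written. */
static int block_matches_crc(const uint8_t *block, size_t len, uint16_t stored)
{
    return crc16_ccitt(block, len) == stored;
}
```

The write path would compute the checksum once per block and store it next to the block's state; the read path recomputes it and treats a mismatch as a missing block. Both steps touch every byte of the block, which is one source of the extra read/write overhead mentioned above.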
2026-06-04 17:48:12 +02:00
Niklas Haas
56de70a2e6 avformat: add shared concurrent block cache protocol
This adds a new protocol, `shared:URI`, which is distinct from the existing
`cache:` in that it is explicitly designed to be thread-safe and cross-process,
enabling multiple ffmpeg processes (or multiple ffmpeg decoders within the same
process) to share a single cache file, for e.g. a remote HTTP stream. As such,
it uses a radically different internal design.

To facilitate zero-knowledge cross-process interoperability, the cache file
itself is just a memory-mapped representation of the underlying file data,
which has the side benefit that the resulting cache file will contain a
working copy of the streamed file (assuming the stream was read to
completion).
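
A minimal sketch of the idea, assuming a POSIX system: data is written into the cache file at the same offset it has in the source stream, through a shared mapping, so any other process mapping or reading the file sees the bytes in place. The helpers `cache_open` and `cache_store` are hypothetical and not taken from the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) the cache data file. */
static int cache_open(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Hypothetical helper: store `len` bytes at offset `pos`, i.e. at the same
 * position they occupy in the source stream. Grows the file if needed.
 * Returns 0 on success, -1 on failure. */
static int cache_store(int fd, off_t pos, const uint8_t *data, size_t len)
{
    struct stat st;
    off_t end = pos + (off_t)len;
    uint8_t *map;

    assert(fd >= 0 && pos >= 0 && len > 0);
    if (fstat(fd, &st) < 0)
        return -1;
    if (st.st_size < end && ftruncate(fd, end) < 0)
        return -1;

    /* Map from offset 0 so that `pos` needs no page alignment. */
    map = mmap(NULL, (size_t)end, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;
    memcpy(map + pos, data, len);
    munmap(map, (size_t)end);
    return 0;
}
```

Mapping and unmapping on every store keeps the sketch short; a real implementation would more likely keep the mapping alive and remap only when the file grows. Regions that were never written read back as zeros, which is why a separate record of cached blocks is needed.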

To keep track of which regions are cached and which are not, we use a
secondary file that contains a minimal header along with a static bytemap of
blocks within the file. This secondary file is also used to store metadata
such as the filesize, if known, as well as marking "failed" blocks.
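
To make the bytemap idea concrete, here is a small sketch with one state byte per block and a lookup that checks whether a byte range is fully cached. The struct layout, field names, and state values are assumptions for illustration; the actual on-disk format is not described in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block states. */
enum BlockState {
    BLOCK_EMPTY  = 0, /* not cached yet */
    BLOCK_CACHED = 1, /* data is present in the cache file */
    BLOCK_FAILED = 2, /* a previous fetch of this block failed */
};

/* Hypothetical in-memory view: a minimal header plus one byte per block. */
typedef struct SpaceMap {
    uint32_t version;    /* format version */
    uint32_t block_size; /* bytes per block */
    int64_t  filesize;   /* -1 if not known */
    size_t   nb_blocks;  /* number of entries in blocks[] */
    uint8_t *blocks;     /* one BlockState value per block */
} SpaceMap;

/* Returns 1 if every block overlapping [pos, pos + len) is cached, else 0. */
static int spacemap_range_cached(const SpaceMap *map, int64_t pos, int64_t len)
{
    size_t first, last;

    assert(map->block_size > 0 && pos >= 0 && len > 0);
    first = (size_t)(pos / map->block_size);
    last  = (size_t)((pos + len - 1) / map->block_size);
    if (last >= map->nb_blocks)
        return 0;
    for (size_t i = first; i <= last; i++)
        if (map->blocks[i] != BLOCK_CACHED)
            return 0;
    return 1;
}
```

A reader would call something like `spacemap_range_cached()` before serving a request from the cache file, and fall back to the underlying protocol for any block that is empty or marked as failed.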

Both files can grow dynamically in order to accommodate larger/growing files,
and can be atomically updated (through the use of shared space maps). I have
extensively checked the space map initialization and update code for race
conditions, and I believe the current design to be solid.

That said, it is to some extent the user's responsibility to ensure that the
same URI is not used for different streams, as we rely on the URI to uniquely
identify the cache files. To protect against possible abuse, we use a
cryptographic hash with sufficient collision resistance. The lack of any
implicit default on `-cache_dir` also means that `shared:` can't be enabled
via URL injection to possibly access random files on the disk (or intentionally
leak content from other streams with similar URIs, even if the cryptographic
hash function is broken).
2026-06-04 17:48:12 +02:00