Commit graph

7 commits

Author SHA1 Message Date
Lynne
f80addbb07
ffv1enc_vulkan: fix encoding with large contexts
When RGB_LINECACHE == 2, then top2 is not the current line.
2025-12-04 16:53:58 +01:00
Lynne
8a2d921627
ffv1_common: minor RGB optimization
2025-05-20 19:53:01 +09:00
Lynne
bd41838b60
ffv1enc_vulkan: switch to 2-line cache, unify prediction code
2025-05-20 19:53:01 +09:00
Lynne
77f777d925
ffv1/vulkan: redo context count tracking and quant_table_idx management
This commit also makes it possible for the encoder to choose a different
quantization table on a per-slice basis, as well as adding this capability
to the decoder.

Also, this commit fully fixes decoding of context=1 encoded files.
2025-04-14 06:10:42 +02:00
Lynne
6bad55eb17
ffv1: add a Vulkan-based decoder
This patch adds a fully-featured level 3 and 4 decoder for FFv1,
supporting Golomb and all Range coding variants, all pixel formats,
and all features, except for the newly added floating-point formats.

On a 6000 Ada, for 3840x2160 bgr0 content at 50Mbps (standard desktop
recording), it decodes at 400fps.
An Alder Lake CPU with 24 threads barely reaches 100fps.
2025-03-17 08:51:23 +01:00
Lynne
b2ebe9884e
ffv1enc_vulkan: refactor code to support sharing with decoder
The shaders were written to support sharing, but needed slight
tweaking.
2025-03-17 08:49:14 +01:00
Lynne
ed2391d341
ffv1enc: add a Vulkan encoder
This commit implements a standard, compliant, version 3 and version 4
FFv1 encoder, entirely in Vulkan. The encoder is written in standard
GLSL and requires a GPU that supports Vulkan 1.3 and the BDA extension.

The encoder can use any number of slices, but nominally, should use
a 32x32 slice grid (1024 slices in total) to maximize parallelism.
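
To illustrate the slice count, here is a minimal sketch of splitting a frame
into a 32x32 grid. The helper name slice_grid and the plain integer split are
assumptions for illustration; the encoder's actual boundary computation (for
example, any alignment for subsampled chroma) may differ.

```python
def slice_grid(width, height, h_slices=32, v_slices=32):
    """Split a frame into h_slices x v_slices rectangles.

    Hypothetical helper: uses a plain integer split, which is an assumption
    and not necessarily how the encoder places slice boundaries.
    Returns a list of (x, y, w, h) tuples in raster order.
    """
    rects = []
    for sy in range(v_slices):
        y0 = sy * height // v_slices
        y1 = (sy + 1) * height // v_slices
        for sx in range(h_slices):
            x0 = sx * width // h_slices
            x1 = (sx + 1) * width // h_slices
            rects.append((x0, y0, x1 - x0, y1 - y0))
    return rects


rects = slice_grid(1920, 1080)
print(len(rects))  # number of independently codable slices
```

Each rectangle can be coded independently of the others, which is why a
larger grid exposes more parallel work to the GPU.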

All features are supported, as well as all pixel formats.
This includes:
 - Rice
 - Range coding with a custom quantization table
 - PCM encoding

CRC calculation is also massively parallelized on the GPU.

Encoding of unaligned dimensions on subsampled data requires
version 4, or requires oversizing the image to 64-pixel alignment
and cropping out the padding via container flags.
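
The oversizing workaround amounts to rounding each dimension up to a multiple
of 64 and recording how much to crop. A minimal sketch, assuming both
dimensions are padded the same way; the function names are hypothetical and
how the crop is signalled depends on the container:

```python
def align_up(value, alignment=64):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def padded_size(width, height, alignment=64):
    """Return (coded_w, coded_h, crop_right, crop_bottom).

    Hypothetical helper: the coded size is what would be encoded, and the
    crop values are the padding a player would need to remove.
    """
    coded_w = align_up(width, alignment)
    coded_h = align_up(height, alignment)
    return coded_w, coded_h, coded_w - width, coded_h - height


print(padded_size(1920, 1080))
```

For 1920x1080 input, the width is already a multiple of 64, while the height
is padded to 1088, leaving 8 rows to crop.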

Performance-wise, this makes real-time 1080p screen capture possible
at 60fps, even on modest GPUs.
2024-11-18 07:54:22 +01:00