14 commits

Author SHA1 Message Date
Timo Rothenpieler
262d41c804 all: fix typos found by codespell 2025-08-03 13:48:47 +02:00
Lynne
cb8f4b675d
vulkan/ffv1: unify encode and decode get/put primitives
This simply makes a get_rac/put_rac_internal variant that can be
reused.
2025-05-20 19:53:02 +09:00
Lynne
52595025c5
ffv1enc_vulkan: minor EC optimizations 2025-05-20 19:53:01 +09:00
Lynne
7c0a8c07ce
ffv1enc_vulkan: unify EC code between setup and encode 2025-05-20 19:53:00 +09:00
Lynne
69f83bafd1
ffv1enc_vulkan: get rid of temporary data for the setup shader 2025-05-20 19:53:00 +09:00
Lynne
36c6c66deb vulkan/rangecoder: minor cleanup 2025-04-16 23:38:16 +02:00
Lynne
e040c087c7
vulkan: add support for expect/assume
This commit adds support for compiler hints.
While AMD does not use or need these, Nvidia benefits from them, giving
a sizeable 10% speedup on 4k.
2025-04-14 06:10:43 +02:00
Lynne
4d561e6a1e
vulkan_ffv1: remove need for scratch data during setup
This saves on some VRAM, but mainly allows for a more unified path.
2025-04-14 06:10:43 +02:00
Lynne
45d7abf6d9
vulkan_ffv1: init overread/corrupt fields
Forgotten.
2025-04-14 06:10:42 +02:00
Lynne
defebd74c0
vulkan_ffv1: slightly optimize the range decoder
GPUs have cmovs as standard.
2025-04-14 06:10:41 +02:00
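
The "cmovs" remark in defebd74c0 points at replacing a branch with conditional selects. A minimal C sketch of one binary range-decoder step written both ways follows. It assumes a heavily simplified coder state: the struct `rc_sketch` and both function names are hypothetical, renormalization and adaptive state updates are left out, and this is not the shader code from the commit.

```c
#include <stdint.h>

/* Hypothetical, simplified coder state: no refill, no adaptive probability. */
typedef struct {
    uint32_t low;
    uint32_t range;
} rc_sketch;

/* Branchy form: the two outcomes take different code paths. */
static int decode_bit_branchy(rc_sketch *c, uint8_t prob)
{
    uint32_t range1 = (c->range * prob) >> 8;
    c->range -= range1;
    if (c->low < c->range)
        return 0;
    c->low  -= c->range;
    c->range = range1;
    return 1;
}

/* Select form: both outcomes are computed, then chosen with ?:,
 * which compilers can typically lower to conditional moves. */
static int decode_bit_select(rc_sketch *c, uint8_t prob)
{
    uint32_t range1 = (c->range * prob) >> 8;
    uint32_t range0 = c->range - range1;
    int bit = c->low >= range0;
    c->low   = bit ? c->low - range0 : c->low;
    c->range = bit ? range1 : range0;
    return bit;
}
```

Both functions leave the coder in the same state for the same input; only the control flow differs.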
Lynne
6bad55eb17
ffv1: add a Vulkan-based decoder
This patch adds a fully-featured level 3 and 4 decoder for FFv1,
supporting Golomb and all Range coding variants, all pixel formats,
and all features, except for the newly added floating-point formats.

On a 6000 Ada, for 3840x2160 bgr0 content at 50Mbps (standard desktop
recording), it is able to do 400fps.
An Alder Lake with 24 threads can barely do 100fps.
2025-03-17 08:51:23 +01:00
Lynne
b2ebe9884e
ffv1enc_vulkan: refactor code to support sharing with decoder
The shaders were written to support sharing, but needed slight
tweaking.
2025-03-17 08:49:14 +01:00
Lynne
9691ac6af2
ffv1enc_vulkan: increase max outstanding byte count to 16bit
At higher resolutions, the outstanding byte counter could overflow
when the image had a lot of blank areas.
2024-11-20 05:23:35 +01:00
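
The overflow fixed in 9691ac6af2 can be illustrated with a small C sketch of a counter that wraps at a given bit width. The helper `count_run` is hypothetical, and the 8-bit width in the comparison is an assumption chosen purely for illustration; the commit only states that the new limit is 16 bits.

```c
#include <stdint.h>

/* Count a run of pending bytes with a counter of the given bit width.
 * The mask models wraparound of a narrow unsigned counter. */
static uint32_t count_run(uint32_t run_length, unsigned counter_bits)
{
    uint32_t mask = (counter_bits >= 32) ? 0xFFFFFFFFu
                                         : ((1u << counter_bits) - 1u);
    uint32_t counter = 0;
    for (uint32_t i = 0; i < run_length; i++)
        counter = (counter + 1u) & mask;
    return counter;
}
```

With a run of 300 pending bytes, `count_run(300, 8)` wraps to 44 while `count_run(300, 16)` returns 300, so a wider counter holds longer runs before wrapping.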
Lynne
ed2391d341
ffv1enc: add a Vulkan encoder
This commit implements a standard, compliant, version 3 and version 4
FFv1 encoder, entirely in Vulkan. The encoder is written in standard
GLSL and requires a Vulkan 1.3 supporting GPU with the BDA extension.

The encoder can use any number of slices, but nominally, should use
32x32 slices (1024 in total) to maximize parallelism.

All features are supported, as well as all pixel formats.
This includes:
 - Rice
 - Range coding with a custom quantization table
 - PCM encoding

CRC calculation is also massively parallelized on the GPU.

Encoding of unaligned dimensions on subsampled data requires
version 4, or requires oversizing the image to 64-pixel alignment
and cropping out the padding via container flags.

Performance-wise, this makes 1080p real-time screen capture possible
at 60fps on even modest GPUs.
2024-11-18 07:54:22 +01:00
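
The 64-pixel oversizing mentioned in ed2391d341 comes down to rounding each dimension up to a multiple of 64 and recording how much padding to crop. A minimal C sketch, assuming hypothetical helper names `align_up` and `padding_for` (these are not functions from the encoder):

```c
#include <assert.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static int align_up(int v, int a)
{
    assert(a > 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Rows or columns of padding to crop after encoding at the aligned size. */
static int padding_for(int v, int a)
{
    return align_up(v, a) - v;
}
```

For a 1920x1080 frame this gives a 1920x1088 coded size, with 8 rows of padding to crop.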