Commit graph

16 commits

Author SHA1 Message Date
Lynne
8dcf02ac63
vulkan: remove IS_WITHIN macro
This is the more correct GLSL solution.
2026-01-19 16:37:15 +01:00
Lynne
026e94e339
vulkan_prores_raw: use compile-time SPIR-V generation 2026-01-12 17:28:42 +01:00
Lynne
cfcf52a08c
vulkan: deduplicate shorthand casting defines to common.comp 2025-12-22 19:46:27 +01:00
averne
a2475d16ed lavc/vulkan/common: allow configurable bitstream caching in shared memory 2025-12-15 12:29:00 +00:00
Lynne
c3291993eb
vulkan_ffv1: use proper rounded divisions for plane width and height
Fixes #20314
2025-12-13 19:12:24 +01:00
averne
fd2fd3828c libavcodec/vulkan: remove unnessary member in GetBitContext
The number of remaining bits can be calculated using existing state.
This simplifies calculations and frees up one register.
2025-11-30 19:21:08 +01:00
averne
ef7354d471 libavcodec/vulkan: introduce cached bitstream reader
This stores a small buffer in shared memory per decode thread (16 bytes),
which helps reduce the number of memory accesses.
The bitstream buffer is first aligned to a 4 byte boundary, so that the
buffer can be filled with a single memory request.
2025-11-30 19:21:04 +01:00
Lynne
d36d88dcbb
vulkan/common: add reverse2 endian reversal macro 2025-11-26 15:16:39 +01:00
Lynne
2c3315b04c
lavc/vulkan/common: sign-ify lengths
This makes left_bits return useful data rather than overflowing, and
also saves some 64-bit integer operations, which is still always a plus sadly.
2025-08-05 23:51:21 +09:00
Lynne
6bad55eb17
ffv1: add a Vulkan-based decoder
This patch adds a fully-featured level 3 and 4 decoder for FFv1,
supporting Golomb and all Range coding variants, all pixel formats,
and all features, except for the newly added floating-point formats.

On a 6000 Ada, for 3840x2160 bgr0 content at 50Mbps (standard desktop
recording), it is able to do 400fps.
An Alder Lake with 24 threads can barely do 100fps.
2025-03-17 08:51:23 +01:00
Lynne
89704f07bb
lavc/vulkan: add a u8vec2buf buffer type
Useful, since it doesn't have alignment limitations.
2025-02-21 03:19:20 +01:00
IndecisiveTurtle
351fd8460a vulkan/common: Add put_bytes_count 2024-11-28 10:03:01 +09:00
IndecisiveTurtle
e3ac63b213 vulkan/common: Use u32vec2 buffer type instead of u64
According to the GL_EXT_buffer_reference spec alignment
"must be a power of two and be greater than or equal to the largest scalar/component type in the block."

This means by using u32vec2 we can drop the requirement alignment from 8 bytes to 4 bytes
and save a pack64 call in reverse8 (though I assume in most ISAs that compiles to nothing)

Allows the vc2 vulkan encoder to function without setting PB_UNALIGNED
2024-11-28 09:31:43 +09:00
IndecisiveTurtle
f794ed48c0 vulkan/common: Fix off-by-one error in flush_put_bits
If caller wrote a divisible by eight number of bits it would write an extra byte.
Also increment by to_write instead of BUF_BYTES which overly pads the bitstream.
2024-11-28 09:31:43 +09:00
Lynne
4d3e96c90c
lavc/vulkan/common: fix reverse4's incorrect swizzle
The function is responsible for converting little to big endian.
It had an incorrect swizzle for the last 2 bytes.
2024-11-20 05:23:36 +01:00
Lynne
ed2391d341
ffv1enc: add a Vulkan encoder
This commit implements a standard, compliant, version 3 and version 4
FFv1 encoder, entirely in Vulkan. The encoder is written in standard
GLSL and requires a Vulkan 1.3 supporting GPU with the BDA extension.

The encoder can use any amount of slices, but nominally, should use
32x32 slices (1024 in total) to maximize parallelism.

All features are supported, as well as all pixel formats.
This includes:
 - Rice
 - Range coding with a custom quantization table
 - PCM encoding

CRC calculation is also massively parallelized on the GPU.

Encoding of unaligned dimensions on subsampled data requires
version 4, or requires oversizing the image to 64-pixel alignment
and cropping out the padding via container flags.

Performance-wise, this makes 1080p real-time screen capture possible
at 60fps on even modest GPUs.
2024-11-18 07:54:22 +01:00