ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-02-06 18:00:17 +00:00

Author	SHA1	Message	Date
Lynne	f2a55af9a4	vulkan_dpx: switch to compile-time SPIR-V generation	2026-01-12 17:28:43 +01:00
Lynne	e27b510da8	vulkan_prores: generate SPIR-V at compile-time	2026-01-12 17:28:42 +01:00
Lynne	026e94e339	vulkan_prores_raw: use compile-time SPIR-V generation	2026-01-12 17:28:42 +01:00
Lynne	f2affdfafb	configure/make: support compile-time SPIR-V generation	2026-01-12 17:28:40 +01:00
Lynne	6eced88188	vulkan: merge ProRes and ProRes RAW iDCTs This cleans up the code a bit, and reduces binary size.	2025-12-22 19:46:26 +01:00
Lynne	9e8e34d475	vulkan_ffv1: remove unused RCT shader files The 2 files were made redundant when the RCT was merged into encode/decode.	2025-12-13 22:12:26 +01:00
averne	c384b1e803	vulkan/prores: use vkCmdClearColorImage The VK spec forbids using clear commands on YUV images, so we need to allocate separate per-plane images. This removes the need for a separate reset shader.	2025-12-07 18:17:36 +00:00
Lynne	531ce713a0	dpxdec: add a Vulkan hwaccel	2025-11-26 15:16:43 +01:00
Lynne	bb30a0d0d8	vulkan_prores_raw: split up decoding and DCT This commit optimizes the Vulkan decoder by splitting up decoding from iDCT, and merging the few tables needed directly into the shader. The speedup on Intel is 10x.	2025-11-26 15:16:41 +01:00
averne	98412edfed	lavc: add a ProRes Vulkan hwaccel Add a shader-based Apple ProRes decoder. It supports all codec features for profiles up to the 4444 XQ profile, i.e.: 4:2:2 and 4:4:4 chroma subsampling; 10- and 12-bit component depth; interlacing; alpha. The implementation consists of two shaders: the VLD kernel does entropy decoding for color/alpha, and the IDCT kernel performs the inverse transform on color components. Benchmarks for a 4k yuv422p10 sample: AMD Radeon 6700XT: 178 fps; Intel i7 Tiger Lake: 37 fps; NVidia Orin Nano: 70 fps	2025-10-25 19:54:13 +00:00
Lynne	75aeffb1c6	lavc: add a ProRes RAW Vulkan hwaccel This commit adds a ProRes RAW hardware implementation written in Vulkan. Both version 0 and version 1 streams are supported. The implementation is highly parallelized, with 512 invocations dispatched for every tile, with generally 4k tiles on a 5.8k stream. Thanks to unlord for the 8-point iDCT. Benchmark for a generic 5.8k RAW HQ file: 6900XT: 63fps; 7900XTX: 84fps; 6000 Ada: 120fps; Intel: 9fps	2025-08-08 18:29:41 +09:00
Lynne	7576410af7	ffv1enc_vulkan: implement RCT search for level >= 4	2025-05-20 19:53:01 +09:00
Lynne	ebbc7ff650	ffv1enc_vulkan: merge all encoder variants into one file Makes it easier to work with, despite the heavy ifdeffery.	2025-05-20 19:52:55 +09:00
Lynne	66b8c92df2	vulkan_ffv1: cache only 2 lines when decoding RGB This reduces the intermediate VRAM used for RGB decoding by a factor of 100 for 6k video. This also speeds the decoder up by 16% for 4k RGB24 and 31% for 6k video. This is equivalent to what the software decoder does, but with fewer pointers.	2025-04-14 06:10:42 +02:00
Lynne	6bad55eb17	ffv1: add a Vulkan-based decoder This patch adds a fully-featured level 3 and 4 decoder for FFv1, supporting Golomb and all Range coding variants, all pixel formats, and all features, except for the newly added floating-point formats. On a 6000 Ada, for 3840x2160 bgr0 content at 50Mbps (standard desktop recording), it is able to do 400fps. An Alder Lake with 24 threads can barely do 100fps.	2025-03-17 08:51:23 +01:00
Lynne	ed2391d341	ffv1enc: add a Vulkan encoder This commit implements a standard, compliant, version 3 and version 4 FFv1 encoder, entirely in Vulkan. The encoder is written in standard GLSL and requires a Vulkan 1.3 supporting GPU with the BDA extension. The encoder can use any number of slices, but nominally, should use 32x32 slices (1024 in total) to maximize parallelism. All features are supported, as well as all pixel formats. This includes: Rice; range coding with a custom quantization table; PCM encoding. CRC calculation is also massively parallelized on the GPU. Encoding of unaligned dimensions on subsampled data requires version 4, or requires oversizing the image to 64-pixel alignment and cropping out the padding via container flags. Performance-wise, this makes 1080p real-time screen capture possible at 60fps on even modest GPUs.	2024-11-18 07:54:22 +01:00
Lynne	4e861ad8e0	libavcodec/Makefile: add a makefile for Vulkan shaders	2024-10-15 17:45:19 +02:00

17 commits