ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2025-12-08 06:09:50 +00:00

Author	SHA1	Message	Date
Lynne	eb9e000584	vulkan_decode: add ifdefs around VP9 definitions and privatize profile struct The struct is not referenced anywhere else.	2025-08-08 15:07:33 +00:00
Benjamin Cheng	f7a5128109	vulkan_av1: Fix frame threading Basically do the same thing that was done for VP9, and remove the vestigial frame_id_alloc_mask in the context.	2025-08-08 14:45:58 +00:00
Lynne	75aeffb1c6	lavc: add a ProRes RAW Vulkan hwaccel This commit adds a ProRes RAW hardware implementation written in Vulkan. Both version 0 and version 1 streams are supported. The implementation is highly parallelized, with 512 invocations dispatched per every tile, with generally 4k tiles on a 5.8k stream. Thanks to unlord for the 8-point iDCT. Benchmark for a generic 5.8k RAW HQ file: 6900XT: 63fps 7900XTX: 84fps 6000 Ada: 120fps Intel: 9fps	2025-08-08 18:29:41 +09:00
Lynne	2caf23e7c4	vp9: add Vulkan VP9 hwaccel	2025-08-08 18:29:40 +09:00
Timo Rothenpieler	262d41c804	all: fix typos found by codespell	2025-08-03 13:48:47 +02:00
Lynne	7b45d9c5fd	vulkan_ffv1: pipe through slice decoding status	2025-05-20 19:53:02 +09:00
Lynne	ec3f3457fd	vulkan_decode: add STORAGE flag to output images In filtering, and SDR encoding, we use storage images. This fixes using Vulkan filters on Intel. Tested not to break anything on the three major vendors.	2025-04-19 10:59:16 +02:00
Lynne	193610d9ba	vulkan_decode: allow using NULL offsets/nb_slices in ff_vk_decode_add_slice() For codecs like VP9 which use a single slice.	2025-03-27 17:22:11 +01:00
Lynne	5fc4acae9c	vulkan_decode: allow using NULL sequence_params when decoding The function had some checks to allow for this, but as it always tried to dereference a bufferref, it wasn't fully ready.	2025-03-27 17:22:11 +01:00
Lynne	6bad55eb17	ffv1: add a Vulkan-based decoder This patch adds a fully-featured level 3 and 4 decoder for FFv1, supporting Golomb and all Range coding variants, all pixel formats, and all features, except for the newly added floating-point formats. On a 6000 Ada, for 3840x2160 bgr0 content at 50Mbps (standard desktop recording), it is able to do 400fps. An Alder Lake with 24 threads can barely do 100fps.	2025-03-17 08:51:23 +01:00
Lynne	31176b16ac	vulkan_decode: use VK_KHR_video_maintenance2 if available	2025-03-17 08:49:12 +01:00
Lynne	e15e85b869	vulkan_decode: adjust number of async contexts created This caps the number of contexts we create based on thread count. This saves VRAM and filters out cases where more async is of lesser benefit.	2025-03-17 08:49:11 +01:00
Lynne	4495802bdb	vulkan_decode: support multiple image views Enables non-monochrome video decoding using all our existing functions in the context of an SDR decoder.	2025-03-17 08:49:11 +01:00
Lynne	491b65e343	vulkan_decode: support software-defined decoders	2025-03-17 08:49:11 +01:00
Lynne	551041e384	vulkan_decode: remove informative queries We queried the decoder whether it was able to decode successfully, but since we operated asynchronously, we weren't able to do anything with this information but let the user know decoding failed for the previous frame(s). Since we parse the slice headers ourselves and we're reasonably sure we can decode before actually starting to decode, this was rarely triggered on corrupt data, and hardware's understanding of whether there was an error or not is vague. There's also a semantic problem with our use of the queries - if there's a seek, we flush, but what happens to the queries is vague according to the spec. Most hardware dealt fine, since queries are nothing more than GPU memory with integers stored. But with Intel, they seem to be more of a register to which a driver must keep track of, leading to issues if there's been a reset (seek) and we query the previous submission before the seek. Just get rid of them. The query code is still used in encoding. This fixes seeking with HEVC and AV1 on Intel.	2025-01-03 14:53:41 +09:00
Lynne	8fbecfd1a0	vulkan_decode: add queue_flags field to specify queue used	2024-12-23 04:25:09 +09:00
Lynne	2e06b84e27	vulkan: do not reinvent a queue context struct We recently introduced a public field which was a superset of the queue context we used to have. Switch to using it entirely. This also allows us to get rid of the NIH function which was valid only for video queues.	2024-12-23 04:25:09 +09:00
Lynne	7239be07be	vulkan_decode: use a single execution pool Originally, the decoder had a single execution pool, with one execution context per thread. Execution pools were always intended to be thread-safe, as long as there were enough execution contexts in the pool to satisfy all threads. Due to synchronization issues, the threading part was removed at some point, and, for decoding, each thread had its own execution pool. Having a single execution pool per context is hacky, not to mention wasteful. Most importantly, we cannot associate single shaders across multiple execution pools for a single application. This means that we cannot use shaders to either apply film grain, or use this framework for software-defined decoders. The recent commits added threading capabilities back to the execution pool, and the number of contexts in each pool was increased. This was done with the assumption that the execution pool was singular, which it was not. This led to increased parallelism and number of frames in flight, which is taxing on memory. This commit finally restores proper threading behaviour. The validation layer has issues that are reported and addressed in the earlier commit.	2024-12-23 04:25:08 +09:00
Anton Khirnov	56ba57b672	lavc/refstruct: move to lavu and make public It is highly versatile and generally useful.	2024-12-15 14:03:47 +01:00
Lynne	41f65b7326	vulkan_decode: ensure there's at least one context per decode thread Otherwise, what may happen is that 2 threads will both write into the same context.	2024-11-28 01:29:21 +09:00
Lynne	a5e6860a89	vulkan_decode: fix counting for parallelism ff_vk_exec_pool_init used to multiply the number by the number of queues, but that got changed, yet this use of the function was not updated.	2024-11-28 01:29:15 +09:00
Lynne	37d5cb84e8	vulkan: check if current buffer has finished execution before picking another This saves resources, as dependencies are freed/reclaimed with a lower latency, and provides a speedup.	2024-10-04 10:10:42 +02:00
Lynne	5e9845f11e	vulkan(_decode): fix, simplify and improve queries The old query code never worked properly, and did some hideous heuristics to read the status bit, and work that into a return code. This is all best left to callers to do, which simplifies our code a lot. This also fixes minor validation errors regarding calling queries which are not in their active state.	2024-09-09 07:05:46 +02:00
Lynne	9c65325819	vulkan_decode: use ff_vk_init This solves the issue of an av_log function being called with a context with invalid class. Co-authored-by: Anton Khirnov <anton@khirnov.net>	2024-09-09 07:05:45 +02:00
Lynne	66e950fcac	vulkan_video: move imageview creation and DPB fields to common context Shared between decoders and encoders.	2024-09-09 07:05:44 +02:00
Lynne	18d964fc2c	vulkan: enable encoding of images if video_maintenance1 is enabled Vulkan encoding was designed in a very... consolidated way. You had to know the exact codec and profile that the image was going to eventually be encoded as at... image creation time. Unfortunately, as good as our code is, glimpsing into the exact future isn't what it's capable of. video_maintenance1 removed that requirement, which only then made encoding images practically possible.	2024-08-16 01:22:16 +02:00
Lynne	869f4aec48	vulkan_decode: use the correct queue family for decoding ops In `680d969a30`, the new API was used to find a queue family for dispatch, but the found queue family was not used for decoding, just for dispatching.	2024-08-16 01:22:08 +02:00
Lynne	680d969a30	vulkan_decode: port to the new queue family API	2024-08-11 05:13:16 +02:00
Lynne	1c05661ec4	vulkan_decode: add \n to error message	2024-08-11 05:13:15 +02:00
Lynne	ca591e6b50	vulkan_decode: force layered_dpb to 0 when dedicated_dpb is 0 layered_dpb only makes sense when dedicated_dpb is set to 1. For some mysterious reason, some Nvidia drivers stopped indicating SEPARATE_REFRENCES, but kept the COINCIDE flag, which broke the code.	2024-08-11 05:13:14 +02:00
Lynne	6757cdb535	vulkan_video: remove NIH pooled buffer implementation The code predates ff_vk_get_pooled_buffer().	2024-08-11 05:13:10 +02:00
Lynne	db09f1a5d8	vulkan_av1: add workaround for NVIDIA drivers tested on broken CTS The first release of the CTS for AV1 decoding had incorrect offsets for the OrderHints values. The CTS will be fixed, and eventually, the drivers will be updated to the proper spec-conforming behaviour, but we still need to add a workaround as this will take months. Only NVIDIA use these values at all, so limit the workaround to only NVIDIA. Also, other vendors don't tend to provide accurate CTS information.	2024-04-15 02:40:02 +02:00
Andreas Rheinhardt	790f793844	avutil/common: Don't auto-include mem.h There are lots of files that don't need it: The number of object files that actually need it went down from 2011 to 884 here. Keep it for external users in order to not cause breakages. Also improve the other headers a bit while just at it. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:43 +01:00
Lynne	ecdc94b97f	vulkan_av1: port to the new stable API Co-Authored-by: Dave Airlie <airlied@redhat.com>	2024-03-25 08:54:40 +01:00
Andreas Rheinhardt	ccb432c1fe	avcodec/vulkan_decode: Remove always-false check These fields are set for all Vulkan decoding hwaccels; they would be useless if it were different. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-07 09:00:47 +01:00
Andreas Rheinhardt	f9d35e78fe	avcodec/vulkan_decode: Un-sparse extensions table Only three of the 226 (== AV_CODEC_ID_AV1) entries have been used. Unsparsing this table is especially important given that this array lives in .data.rel.ro. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-07 09:00:39 +01:00
Andreas Rheinhardt	f7b227bec3	avcodec/vulkan_video: Merge dec part of FFVkCodecMap and extension props All the fields of FFVkCodecMap are either decoder-only or encoder-only (with the latter being unused and unset for now). Yet there is already a per-decoder struct containing static information about these decoders, namely VkExtensionProperties. This commit merges the decoder-parts of FFVkCodecMap with the VkExtensionProperties into a common structure. Given that FFVkCodecMap is now unused, it is removed. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-07 09:00:30 +01:00
Andreas Rheinhardt	e429b0fdb7	avutil/vulkan: Don't autoinclude vulkan_loader.h Only include it where necessary. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-03 22:55:26 +01:00
Andreas Rheinhardt	cb15b7b29e	avcodec/vulkan_video: Don't use sparse table ff_vk_codec_map currently is an array indexed by AVCodecID; it has AV_CODEC_ID_FIRST_AUDIO (= 65536) entries, but uses only three of them; only 24B of 1MiB were actually used This commit fixes this by adding an AVCodecID field to the table and making it non-sparse. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-03 17:17:13 +01:00
Sam James	2f24f10d9c	libavcodec: fix -Wint-conversion in vulkan Fix warnings (soon to be errors in GCC 14, already so in Clang 15): ``` src/libavcodec/vulkan_av1.c: In function ‘vk_av1_create_params’: src/libavcodec/vulkan_av1.c:183:43: error: initialization of ‘long long unsigned int’ from ‘void *’ makes integer from pointer without a cast [-Wint-conversion] 183 | .videoSessionParametersTemplate = NULL, | ^~~~ src/libavcodec/vulkan_av1.c:183:43: note: (near initialization for ‘(anonymous).videoSessionParametersTemplate’) ``` Use Vulkan's VK_NULL_HANDLE instead of bare NULL. Fix Trac ticket #10724. Was reported downstream in Gentoo at https://bugs.gentoo.org/919067. Signed-off-by: Sam James <sam@gentoo.org>	2024-01-06 22:38:55 +01:00
Lynne	70864e6adb	vulkan_decode: correct flipped condition in image layout Changed by the previous commit. Caused validation issues on hardware with !reuse_dpb_dst but not layered_dpb.	2023-10-25 22:01:21 +02:00
Lynne	0b3616231d	vulkan_decode: fix another validation issue Surprising no one, the insane usage rule has a catch.	2023-10-25 20:51:55 +02:00
Lynne	467e411839	vulkan_decode: fix pedantic validation issue "Validation Error: [ VUID-VkImageViewCreateInfo-imageViewType-04974 ] Object 0: handle = 0x9f9b41000000003c, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0xc120e150 | vkCreateImageView(): Using pCreateInfo->viewType VK_IMAGE_VIEW_TYPE_2D and the subresourceRange.layerCount VK_REMAINING_ARRAY_LAYERS=(17) and must 1 (try looking into VK_IMAGE_VIEW_TYPE_*_ARRAY). The Vulkan spec states: If viewType is VK_IMAGE_VIEW_TYPE_1D, VK_IMAGE_VIEW_TYPE_2D, or VK_IMAGE_VIEW_TYPE_3D; and subresourceRange.layerCount is VK_REMAINING_ARRAY_LAYERS, then the remaining number of layers must be 1"	2023-10-25 20:51:54 +02:00
Lynne	9ee4f47c94	vulkan_decode: use coded_width/height instead of the non-coded width and height Partially fixes https://streams.videolan.org/issues/19938/20000_20180305-15.04.59.ts The stream is coded as 1920x1080, meant to be rendered at 1440x1080 with cropping, or 1680x1080 before cropping. Currently, the created DPB is 1440x1080, which results in the image being decoded incorrectly, as the decoder overwrites output memory. This commit fixes this.	2023-10-25 20:51:05 +02:00
Andreas Rheinhardt	6695c0af0e	avcodec/vulkan_decode: Use RefStruct API for shared_ref Avoids allocations, error checks and indirections. Also increases type-safety. Reviewed-by: Lynne <dev@lynne.ee> Tested-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-07 22:35:50 +02:00
Lynne	9310ffc809	vulkan_decode: don't call get_proc_addr on every frame's destruction The issue is that we cannot rely on any context existing when we free frames. The Vulkan functions are loaded in each context separately, so until now, we've just been loading them on every frame's destruction. Rather than do this, just save the function pointers we need in each frame. The function pointers are guaranteed to not change and exist.	2023-09-15 17:35:22 +02:00
Lynne	552a5fa496	vulkan_hevc: switch from a buffer pool to a malloc and simplify Simpler and more robust now that contexts are not shared between threads.	2023-09-15 17:35:19 +02:00
Andreas Rheinhardt	c1b6235d41	avcodec/vulkan_decode: Factor creating session params out, fix leak All Vulkan HWAccels share the same boilerplate code for creating session params and this includes a common bug: In case actually creating the video session parameters fails, the buffer destined to hold them leaks; in case of HEVC this is also true if get_data_set_buf() fails. This commit factors this code out and fixes the leak. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-09-15 02:38:22 +02:00
Lynne	398467f519	vulkan_decode: convert max level from vulkan to av for comparisons	2023-09-08 06:56:43 +02:00
Andreas Rheinhardt	8238bc0b5e	avcodec/defs: Add AV_PROFILE_* defines, deprecate FF_PROFILE_* defines These defines are also used in other contexts than just AVCodecContext ones, e.g. in libavformat. Furthermore, given that these defines are public, the AV-prefix is the right one, so deprecate (and not just move) the FF-macros. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-09-07 00:39:02 +02:00

66 commits