We recently introduced a public field which was a superset
of the queue context we used to have.
Switch to using it entirely.
This also allows us to get rid of the NIH function which was
valid only for video queues.
Originally, the decoder had a single execution pool, with one
execution context per thread. Execution pools were always intended
to be thread-safe, as long as there were enough execution contexts
in the pool to satisfy all threads.
Due to synchronization issues, the threading part was removed at some
point, and, for decoding, each thread had its own execution pool.
Having a single execution pool per thread context is hacky, not to
mention wasteful.
Most importantly, we *cannot* share a single shader across multiple
execution pools within a single application. This means that we cannot
use shaders to apply film grain, nor use this framework for
software-defined decoders.
The recent commits added threading capabilities back to the execution
pool, and the number of contexts in each pool was increased. This was
done under the assumption that the execution pool was singular, which
it was not. This led to increased parallelism and more frames
in flight, which is taxing on memory.
This commit finally restores proper threading behaviour.
The validation layer has issues that are reported and addressed in the
earlier commit.
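As a rough sketch of the intended design (all names below are
hypothetical, not the actual libavcodec API): a single pool owns
several execution contexts, and any decode thread borrows a free one
for the duration of a submission, so shaders and other pool-owned
resources are shared by every thread.

    /* Hypothetical sketch of a shared execution pool: one pool, many
     * contexts, any thread may borrow a free context for one submission. */
    #include <pthread.h>

    typedef struct ExecContext {
        int in_use;   /* guarded by ExecPool.lock */
        /* command buffer, fence, query pool, etc. would live here */
    } ExecContext;

    typedef struct ExecPool {
        pthread_mutex_t lock;
        pthread_cond_t  free_ctx;
        ExecContext    *ctx;
        int             nb_ctx;
    } ExecPool;

    /* Borrow a context; blocks until one is free, so the pool stays
     * thread-safe as long as nb_ctx covers the peak number of threads. */
    static ExecContext *exec_pool_acquire(ExecPool *p)
    {
        pthread_mutex_lock(&p->lock);
        for (;;) {
            for (int i = 0; i < p->nb_ctx; i++) {
                if (!p->ctx[i].in_use) {
                    p->ctx[i].in_use = 1;
                    pthread_mutex_unlock(&p->lock);
                    return &p->ctx[i];
                }
            }
            pthread_cond_wait(&p->free_ctx, &p->lock);
        }
    }

    static void exec_pool_release(ExecPool *p, ExecContext *e)
    {
        pthread_mutex_lock(&p->lock);
        e->in_use = 0;
        pthread_cond_signal(&p->free_ctx);
        pthread_mutex_unlock(&p->lock);
    }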
The first release of the CTS for AV1 decoding had incorrect
offsets for the OrderHints values.
The CTS will be fixed, and eventually, the drivers will be
updated to the proper spec-conforming behaviour, but we still
need to add a workaround as this will take months.
Only NVIDIA use these values at all, so limit the workaround
to NVIDIA alone. Also, other vendors don't tend to provide accurate
CTS information.
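For illustration, a vendor-gated workaround of this kind boils down to
a driver ID check; the helper name below is hypothetical, the Vulkan
types and enum are standard:

    #include <vulkan/vulkan.h>

    /* Sketch: apply the OrderHints offset quirk only on NVIDIA's driver,
     * until fixed CTS/driver releases are widespread. */
    static int needs_av1_orderhint_workaround(VkPhysicalDevice phys_dev)
    {
        VkPhysicalDeviceDriverProperties drv = {
            .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES,
        };
        VkPhysicalDeviceProperties2 props = {
            .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2,
            .pNext = &drv,
        };

        vkGetPhysicalDeviceProperties2(phys_dev, &props);

        return drv.driverID == VK_DRIVER_ID_NVIDIA_PROPRIETARY;
    }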
Only three of the 226 (== AV_CODEC_ID_AV1) entries
are actually used. Making this table non-sparse is especially
important given that the array lives in .data.rel.ro.
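For illustration, the before/after shape of such a table ("Desc" and
its fields are made up for the example):

    #include <stddef.h>
    #include "libavcodec/codec_id.h"

    typedef struct Desc {
        enum AVCodecID codec_id;
        const char    *name;   /* stand-in for the static per-codec data */
    } Desc;

    /* Before (sparse): AV_CODEC_ID_AV1 + 1 slots indexed by codec ID,
     * of which only three are populated:
     *
     *     static const Desc table[AV_CODEC_ID_AV1 + 1] = {
     *         [AV_CODEC_ID_H264] = { AV_CODEC_ID_H264, "h264" },
     *         ...
     *     };
     *
     * After (dense): just the used entries plus a short linear lookup. */
    static const Desc table[] = {
        { AV_CODEC_ID_H264, "h264" },
        { AV_CODEC_ID_HEVC, "hevc" },
        { AV_CODEC_ID_AV1,  "av1"  },
    };

    static const Desc *get_desc(enum AVCodecID codec_id)
    {
        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
            if (table[i].codec_id == codec_id)
                return &table[i];
        return NULL;
    }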
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
All the fields of FFVkCodecMap are either decoder-only
or encoder-only (with the latter being unused and unset for now).
Yet there is already a per-decoder struct containing
static information about these decoders, namely
VkExtensionProperties.
This commit merges the decoder parts of FFVkCodecMap
with the VkExtensionProperties into a common structure.
Given that FFVkCodecMap is now unused, it is removed.
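Schematically, the result is something like the following (the struct
and field names here are illustrative, not the actual ones):

    #include <vulkan/vulkan.h>

    /* One per-decoder descriptor instead of two parallel per-codec tables:
     * the decoder-only fields that used to live in FFVkCodecMap sit next
     * to the extension properties they were previously separated from. */
    typedef struct DecoderDescriptor {
        VkVideoCodecOperationFlagBitsKHR decode_op; /* decoder-only field  */
        VkExtensionProperties            ext_props; /* formerly kept apart */
    } DecoderDescriptor;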
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The issue is that we cannot rely on any context existing when we free
frames. The Vulkan functions are loaded in each context separately,
so until now, we've just been loading them on every frame's destruction.
Rather than do this, just save the function pointers we need in each
frame. The function pointers are guaranteed to exist and not to change.
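A sketch of the idea (struct and field names are hypothetical): each
frame carries the few destruction entry points it needs, so its free
callback never has to look up a context.

    #include <vulkan/vulkan.h>

    /* Hypothetical per-frame bookkeeping: the pointers are copied once at
     * frame creation and stay valid for the lifetime of the device. */
    typedef struct VulkanFramePriv {
        VkDevice           dev;
        VkImage            img;
        VkDeviceMemory     mem;
        PFN_vkDestroyImage destroy_image;
        PFN_vkFreeMemory   free_memory;
    } VulkanFramePriv;

    static void frame_free(VulkanFramePriv *f)
    {
        /* No context needed here: everything required was saved up front. */
        f->destroy_image(f->dev, f->img, NULL);
        f->free_memory(f->dev, f->mem, NULL);
    }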
All Vulkan HWAccels share the same boilerplate code for creating
session params, and this code includes a common bug: if actually
creating the video session parameters fails, the buffer destined
to hold them leaks; for HEVC, this is also true if
get_data_set_buf() fails.
This commit factors this code out and fixes the leak.
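In rough outline, the shared helper looks something like this (names
are illustrative; the point is that the buffer is released on every
failure path):

    #include <string.h>
    #include <vulkan/vulkan.h>
    #include "libavutil/buffer.h"
    #include "libavutil/error.h"

    static int create_session_params(VkDevice dev,
                                     PFN_vkCreateVideoSessionParametersKHR create,
                                     const VkVideoSessionParametersCreateInfoKHR *ci,
                                     AVBufferRef **out)
    {
        VkVideoSessionParametersKHR params;
        AVBufferRef *buf = av_buffer_allocz(sizeof(params));
        if (!buf)
            return AVERROR(ENOMEM);

        if (create(dev, ci, NULL, &params) != VK_SUCCESS) {
            av_buffer_unref(&buf);   /* this was the leak being fixed */
            return AVERROR_EXTERNAL;
        }

        memcpy(buf->data, &params, sizeof(params));
        *out = buf;
        return 0;
    }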
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
libavcodec/hwconfig.h currently contains HWACCEL_CAP_* flags
as well as the definition of AVCodecHWConfigInternal and some
macros to create them.
The users of these two are nearly disjoint: The flags are used
by files providing AVHWAccels whereas AVCodecHWConfigInternal
is used by files providing codecs (for FFCodec.hw_configs).
This patch therefore moves these flags to a new file hwaccel_internal.h.
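Roughly, the split looks like this (simplified excerpts; the exact
contents may differ):

    /* hwaccel_internal.h -- included by files providing AVHWAccels: */
    #define HWACCEL_CAP_ASYNC_SAFE (1 << 0)

    /* hwconfig.h -- included by files providing codecs (FFCodec.hw_configs): */
    typedef struct AVCodecHWConfigInternal {
        AVCodecHWConfig public;          /* the part exposed to users     */
        const struct AVHWAccel *hwaccel; /* set if backed by an AVHWAccel */
    } AVCodecHWConfigInternal;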
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit scraps the bool used to signal that the session parameters
should be recreated, and instead destroys them outright, forcing them
to be recreated. As this can happen between start_frame and end_frame,
do this in both places.
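Schematically (the context and field names below are stand-ins for the
actual ones):

    #include <vulkan/vulkan.h>

    typedef struct DecodeContext {                 /* hypothetical */
        VkDevice                               dev;
        VkVideoSessionParametersKHR            params;
        PFN_vkDestroyVideoSessionParametersKHR destroy_params;
        int                                    params_changed;
    } DecodeContext;

    /* Called from both start_frame and end_frame: stale parameters are
     * destroyed on the spot instead of being flagged for later recreation;
     * the existing creation path then rebuilds them when it finds none. */
    static void invalidate_session_params(DecodeContext *dec)
    {
        if (!dec->params_changed)
            return;
        dec->destroy_params(dec->dev, dec->params, NULL);
        dec->params         = VK_NULL_HANDLE;
        dec->params_changed = 0;
    }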
Move the slice offsets buffer to the thread decode context.
It isn't part of the resources needed for frame decoding; the driver
has to process it and be done with it at submission time.
That way, it doesn't need to be allocated and freed on every frame.
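As a sketch (the context and field names are hypothetical): keep the
offsets array in the per-thread decode context and only grow it when a
frame needs more slices, instead of allocating and freeing it per frame.

    #include <stdint.h>
    #include "libavutil/error.h"
    #include "libavutil/mem.h"

    typedef struct ThreadDecodeContext {   /* hypothetical */
        uint32_t *slice_offsets;
        unsigned  slice_offsets_size;      /* allocated size in bytes */
    } ThreadDecodeContext;

    /* Grow the reusable array only when needed; no per-frame alloc/free. */
    static int reserve_slice_offsets(ThreadDecodeContext *c, unsigned nb_slices)
    {
        void *tmp = av_fast_realloc(c->slice_offsets, &c->slice_offsets_size,
                                    nb_slices * sizeof(*c->slice_offsets));
        if (!tmp)
            return AVERROR(ENOMEM);
        c->slice_offsets = tmp;
        return 0;
    }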