This commit introduces a new hardware-accelerated video filter, scale_d3d11,
which performs scaling and format conversion using Direct3D 11. The filter enables
efficient GPU-based scaling and pixel format conversion (p010 to nv12), reducing
CPU overhead and latency in video pipelines.
This change improves pipeline stability and reduces
dynamic GPU surface allocations when using AMF with copy_frame = 1.
This optimization has no negative effect.
Add type removed function wrappers to resolve UB of calling function
through pointer to incorrect function type.
Fixes: FATE-{hmac,srtp}
Fixes: call to function av_md5_init through pointer to incorrect
function type 'void (*)(void *)' and similar for others.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
av_get_token() allocates an output buffer with the same size as the
input. Generally, this is harmless, but when the input string is large
and consists of many small tokens, calling av_get_token() repeatedly to
extract all tokens will significantly amplify memory allocations.
To fix this, after obtaining the return value, simply realloc the buffer
to the actual size needed for output string.
Fixes OOM when parsing filter graph string.
Fixes OSS-Fuzz: 394983446
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
If the image data is not at the start of the buffer allocation, such as
when the buffer has padding before the image data, this function maps too
much memory, since src_data + src_buf->size exceeds the buffer size.
Fix this by subtracting the difference between the buffer start and the
provided image data pointer from the size of the memory range to map.
An easy way to reproduce this issue is using the vf_pad filter, which
allocates image data buffers with a nonzero offset whenever padding is
requested before the start of the image data.
The issue is that vulkan_device_create_internal() is only called for
devices that lavu creates by itself.
For external devices, this was never done.
This also solves some mid-function declaration warnings.
This feature fundamentally relies on host-visible VRAM, which restricts the
set of available memory types to (typically) host-visible device-local ones.
When resizable BAR is disabled, this memory type is usually limited to
e.g. 256 MiB in size, which is just plain insufficient for allocation of
general purpose GPU images, causing OOM errors on even the simplest of
commands.
The easiest solution is to disable host transfers entirely on machines
without host-addressable VRAM. In theory, we could try and recover the use
of host transfers for images which are *not* restricted to device-local
memory types, but this is rarely the case in practice, and the effort
required would exceed the benefit, especially since ReBAR is a standard
feature on all platforms recent enough to have Vulkan drivers, and only
occasionally disabled in the UEFI for by default for some hare-brained
notion of "backwards compatibiility" with ancient software.
This fixes building with Clang in MSVC mode, for x86, which was
broken in 6e49b86996 (in Nov 2024);
previously it failed with undefined symbols for the constants
defined with DECLARE_ASM_CONST, accessed via inline assembly.
Before 57861911a3, there was an
#elif defined(__GNUC__) || defined(__clang__)
case before the
#elif defined(_MSC_VER)
case for defining DECLARE_ASM_CONST, which included av_used.
(This case included the explicit "defined(__clang__)" since
f637046d31.)
After 57861911a3, it used the
generic definition of DECLARE_ASM_CONST that also included
av_used - which also worked for Clang in MSVC mode. But after
6e49b86996, Clang in MSVC mode
ended up using the MSVC specific variant which lacked the
av_used declaration, causing linker errors due to undefined
symbols.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fix incorrect enum value used in color primaries check by replacing
AVCOL_SPC_UNSPECIFIED with AVCOL_PRI_UNSPECIFIED.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Fixes use of bultins on clang x86_64-pc-windows-msvc which does not
define any __GNUC__. Also on other targets __GNUC__ is defined to 4 by
default, so any feature testing based on version is not really valid.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
On NVIDIA, there's a global maximum limit of approximately 112 queues,
which means it takes ONLY 7 total programs using the maximum amount of
queues to cause the driver to error out/*segfault* during initialization.
Also, each queue takes about 30ms to allocate, which quickly adds up.
This reduces the queues allocate to the minimum that we would be happy
with. Its not worth limiting decode/encode queues as they're generally
not a lot, and do help.
GCC/Clang is smart enough to emit minss/maxss the same way as these functions.
The only theoretical benefit was in x86_32, where x87 floats are used, but the
penalty of making the clipping opaque to the compiler's scheduler plus moving
values from mmx regs to xmm and back will offset any potential speedup.
x86_32 builds targetting anything made in the last two decades and a half
should use -msse -mfp=sse anyway.
Signed-off-by: James Almer <jamrial@gmail.com>
Just do it like av_frame_replace().
Also "fixes" the swapped order or arguments to av_malloc_array()
(the first is supposed to be the number of elements, the second
the size of an element) and therefore part of ticket #11620.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(I don't know whether this can be triggered for a file with
nonnegative channel count, given that src's extended data can't
have been allocated in this case.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Useful to let the compiler and static analyzers know that
something is unreachable without adding an av_assert
(which would be either dead for the compiler or add runtime
overhead) for this.
The implementation used here enforces the use of a message
to provide a reason why a particular code is supposed to be
unreachable.
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>