Fix#20940
The feedback and its sub-filter both request frame
from each other, casuing block since 4440e499ba
The feedback should only request inputs[1] once
rather than continuously request frame cause blocking.
This patch add check whether feedback already request
inputs[1] via ff_outlink_frame_wanted(ctx->outputs[1]),
if true, then exit and waiting inputs[0] because it means
we need more frames input to proceed.
Signed-off-by: Jack Lau <jacklau1222gm@gmail.com>
(cherry picked from commit 3f0842294f)
The check if a native layout can be created from the sources was incomplete and
casued a crash with custom layouts if the layout contained a native channel
multiple times, as in this example command line:
ffmpeg -lavfi "sine[a0];sine,pan=FL+FL[a1];[a0][a1]amerge[aout]" -map "[aout]" -t 1 -f framecrc -
Signed-off-by: Marton Balint <cus@passwd.hu>
(cherry picked from commit e8b10a9b09)
For GET_UTF8(val, GET_BYTE, ERROR), val has type of uint32_t,
GET_BYTE must return an unsigned integer, otherwise signed
extension happened due to val= (GET_BYTE), and GET_UTF8 went to
the error path.
This bug incidentally cancelled the bug where hb_buffer_add_utf8
was being called with incorrect argument, allowing drawtext to
function correctly on x86 and macOS ARM, which defined char as
signed. However, on Linux and Android ARM environments, because
char is unsigned by default, GET_UTF8 now returns the correct
return, which unexpectedly revealed issue #20906.
(cherry picked from commit a5cc0e5c9e)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
From the doc of HarfBuzz, what hb_buffer_add_utf8 needs is the
number of bytes, not Unicode character:
hb_buffer_add_utf8(buf, text, strlen(text), 0, strlen(text));
Fix issue #20906.
(cherry picked from commit 9bc3c572ea)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
no test case
Found-by: Joshua Rogers <joshua@joshua.hu> with ZeroPath
Reviewed-by: Joshua Rogers <joshua@joshua.hu>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit ad956ff076)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since 3b26b782ee it would only look at the
first channel.
Signed-off-by: Carl Hetherington <cth@carlh.net>
Reviewed-by: Niklas Haas <ffmpeg@haasn.xyz>
(cherry picked from commit 1eb2cbd865)
text + 1 can break a multibyte character, e.g., Chinese in UTF-8.
There is no space at the beginning in this case.
(cherry picked from commit 1d06e8ddcd)
Remove redundant av_freep() to avoid double free since task will be freed in dnn_free_model_tf() after the success of ff_queue_push_back().
Fixes: af052f9066 ("lavfi/dnn: fix mem leak in TF backend error handle")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
(cherry picked from commit b8d5f65b9e)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Instead of undconditionally using the first input. This covers the case of
one layer fully obscuring another layer, in which case that should become
the new "base" layer.
This prevents leaking stale metadata from previous frames, for example if
an overlay temporarily obscures this input and then un-obscures it again. It
is worth pointing out that this does change the semantics subtly, because of
the smoothing period on detected HDR metadata, but I argue that the new
behavior is an improvement, as it will avoid leaking past metadata that is
definitely no longer relevant after an image is unobscured.
Sometimes, one input fully obscures another. In this case, we can skip
actually rendering any input below the obscuring one.
The reason I don't simply start the main render loop at `idx_start` will
become apparent in the following commit.
We can't use pl_frame_is_cropped() on this dummy frame, but we need to
determine the reference frame before we can map the real output, so to
resolve this conflict, we just reimplement the crop detection logic using
the output link dimensions.
It is possible for pl_queue_update() to return PL_QUEUE_OK, but to generate
an empty frame mix. This happens if the first frame of that input is in the
future.
In this case, we should skip an input as not active, similar to inputs that
have already reached EOF.
Instead of copying over the entire target and changing a few fields,
set the entire struct to a whitelist of safe properties that we want to
persist on the intermediate texture.
In particular, this avoids leaking irrelevant state related to the
acquire/release callbacks, e.g., which can otherwise cause deadlocks
when the same vulkan frame is attempted to be acquired twice.
Add check for the return value of av_malloc_array() to avoid potential NULL pointer dereference.
Fixes: d3be186ed1 ("avfilter/firequalizer: add dumpfile and dumpscale option")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
I also tried replacing some of the instructions by more elaborate ones
using masks, but I found no performance gain significant enough to be worth
maintaining two code paths, so this implementation merely replaces the AVX2
implementation by drop-in AVX512 equivalents.
bwdif8_c: 6362.2 ( 1.00x)
bwdif8_sse2: 1004.9 ( 6.33x)
bwdif8_ssse3: 946.0 ( 6.73x)
bwdif8_avx2: 477.9 (13.31x)
bwdif8_avx512: 273.3 (23.28x)
bwdif10_c: 6341.5 ( 1.00x)
bwdif10_sse2: 872.4 ( 7.27x)
bwdif10_ssse3: 803.4 ( 7.89x)
bwdif10_avx2: 416.7 (15.22x)
bwdif10_avx512: 224.3 (28.27x)
Realtime test at 3840x2160 yuv420p:
avx2: frame=20000 fps=3370 q=-0.0 Lsize=N/A time=00:06:40.00 bitrate=N/A speed=67.4x elapsed=0:00:05.93
avx512: frame=20000 fps=5077 q=-0.0 Lsize=N/A time=00:06:40.00 bitrate=N/A speed= 102x elapsed=0:00:03.93
The use of this function is gated behind avx512icl so that it doesn't
downclock on Skylake.
This commit introduces a new hardware-accelerated video filter, scale_d3d11,
which performs scaling and format conversion using Direct3D 11. The filter enables
efficient GPU-based scaling and pixel format conversion (p010 to nv12), reducing
CPU overhead and latency in video pipelines.
There is no convenient way, from the command line, to figure out which
formats a filter actually supports. This commit changes that by adding a
log output, at debug level, to simply print the list of formats each filter
advertises on its links, before any negotiation.
Furthermore, we can use the exact same helper function to also print out the
corresponding filter links when there is an error during format negotiation.
We need to use AV_BRINT_SIZE_UNLIMITED because the default format list for
filters like vf_scale is about 1700 characters long, significantly larger than
the the 1 kB default buffer.
The new logic should be easier to follow.
It also uses ff_inlink_consume_frame() for all simple passthrough operations
making custom get_audio_buffer callback unnecessary.
Fate changes are because the new logic does not repacketize input audio up
until the crossfade. Content is the same.
Signed-off-by: Marton Balint <cus@passwd.hu>
This gives vastly improved blending results than when blending directly in
the desired output colorspace. Overridable by the existing "disable_linear"
option.
This is functionally similar to combining multiple "libplacebo" filters,
but does not rely on the existence of a Vulkan filter link, so it can be used
without performance penalty in all circumstances. It's also enabled by
default, without requiring special action from the user.
The previous formula was introduced without justification in 6e713841e8,
and the only thing Paul had to say about it over IRC was that it was copied
from an unspecified source on the internet.
I decided to do some testing and came to the conclusion that this term not
only produces "illegal" files, but also lowers PSNR score, over the naive
implementation without this extra term.
Here are the results of a round-trip test, using allrgb/allyuv (respectively)
as the input, and fade=alpha=yes:n=256 to cycle through every possible alpha
value, comparing the round-trip output against the input:
Before patch:
PSNR r:26.677431 g:26.677431 b:26.677431 a:inf average:27.926818 min:6.012093 max:55.400791
PSNR y:26.677431 u:21.101981 v:21.101981 a:inf average:23.548981 min:9.013835 max:53.182303 (full)
PSNR y:27.348055 u:21.101981 v:21.101981 a:inf average:23.625238 min:9.554991 max:45.652221 (limited)
After patch:
PSNR r:27.321996 g:27.321996 b:27.321996 a:inf average:28.571384 min:6.012093 max:52.424553
PSNR y:27.321996 u:23.187879 v:23.187879 a:inf average:25.431773 min:9.013835 max:50.199232 (full)
PSNR y:27.868544 u:23.187879 v:23.187879 a:inf average:25.515660 min:9.554991 max:45.078298 (limited)
It's worth pointing out that previous version sometimes artificially inflates
PSNR by producing values that are too high (i.e. RGB > A), such as for the
input pair (R = 255, A = 2) which should give R = 2, but actually gives R = 3
under the old logic.
As a second evaluation without this shortcoming, here is a comparison against
the reference value computed with a floating point format:
Before patch:
PSNR r:53.600599 g:53.957833 b:53.540948 a:inf average:54.945316 min:50.508901 max:inf (premul only)
PSNR r:30.734183 g:30.734183 b:30.734183 a:inf average:31.983570 min:12.058264 max:inf (round-trip)
After patch:
PSNR r:61.751104 g:65.239091 b:61.339191 a:inf average:63.710714 min:55.441130 max:inf (premul only)
PSNR r:32.611851 g:32.611851 b:32.611851 a:inf average:33.861238 min:12.058264 max:inf (round-trip)