This commit adds support for 4:2:2 decoding for HEVC and H.264 on
NVIDIA Blackwell GPUs. Additionally, it supports 10-bit decoding
for H.264 on Blackwell GPUs.
Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
`POS(1,` and `POS(2,` may trigger UBSAN report:
"runtime error: applying non-zero offset 304 to null pointer"
Looks like values are not used without `chroma_format_idc`,
so maybe there is no other issues than the UB.
Can't reproduce with "fate".
Signed-off-by: Vitaly Buka <vitalybuka@google.com>
Signed-off-by: James Almer <jamrial@gmail.com>
These properties are unreliable because they depend on the frames decoded so
far, users should check directly the presence of the decoded AVFrame side data
or AVFrame flags.
Signed-off-by: Marton Balint <cus@passwd.hu>
This can be useful in other places, e.g. it can replace objpool in
fftools.
The API is modified in the following nontrivial ways:
* opaque pointers can be passed through to all user callbacks
* read and write were previously separate callbacks in order to
accomodate the caller wishing to write a new reference to the FIFO and
keep the original one; the two callbacks are now merged into one, and
a flags argument is added that allows to request such behaviour on a
per-call basis
* new peek and drain functions
This does not replicate on my setup, thus this is a blind fix based on ossfuzz trace
Fixes: use of uninitialized value
Fixes: 71747/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5427736120721408
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
And ensure the buffer is synced between threads.
Based on a patch by Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: James Almer <jamrial@gmail.com>
This removes the ABI breaking use of sizeof(AVFilmGrainParams), and achieves the
same size reduction to decoder structs as 08b1bffa49.
Signed-off-by: James Almer <jamrial@gmail.com>
AVFilmGrainAFGS1Params, the offending struct, is using sizeof(AVFilmGrainParams)
when it should not. This change also forgot to make the necessary changes to the
frame threading sync code.
Both of these will be fixed by the following commit.
H274FilmGrainDatabase will be handled later.
This reverts commit 08b1bffa49.
Signed-off-by: James Almer <jamrial@gmail.com>
Film grain support adds a huge amount of overhead to the H264Context
structure for a feature that is rarely used. On low end devices or
pages that have lots of media this bloats memory usage rapidly.
This changes the static film grain metadata allocations to be dynamic
which reduces the H264Context size from 851808 bytes to 53444 bytes.
Bug: https://crbug.com/359358875
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Niklas Haas <git@haasn.dev>
The "progress2" API in pthread_slice.c currently associates a progress
value with a thread rather than a job, relying on the broken assumption
that a job's thread number is equal to its job number modulo thread
count.
This removes this API entirely, and changes hevcdec to use a
ThreadProgress-based implementation that associates a
mutex/cond/progress value with every job.
Fixes races and deadlocks in hevdec with slice threading, e.g. some of
those mentioned in #11221.
It can't be higher than vps_max_sub_layers.
Do this while keeping the workaround for qsvenc_hevc calling ff_hevc_parse_sps()
without a vps_list, as in some cases it needs to parse an sps to generate a fake
vps derived from it.
Signed-off-by: James Almer <jamrial@gmail.com>
With multilayer001.heic:
Before:
[hevc @ ...] Scalability type 2 not supported
[hevc @ ...] Ignoring unsupported VPS extension
[hevc @ ...] The following bit-depths are currently specified: 8, 9, 10 and 12 bits, chroma_format_idc is 0, depth is 0
After:
[hevc @ ...] Scalability type 2 not supported
[hevc @ ...] Ignoring unsupported VPS extension
[hevc @ ...] SPS 1 references an unsupported VPS extension. Ignoring
Signed-off-by: James Almer <jamrial@gmail.com>
The per-frame reference picture set contains two more lists -
INTER_LAYER[01]. Assuming at most two layers, INTER_LAYER1 is always
empty, but is added anyway for completeness.
When inter-layer prediction is enabled, INTER_LAYER0 for the
second-layer frame will contain the base-layer frame from the same
access unit, if it exists.
The new lists are then used in per-slice reference picture set
construction as per F.8.3.4 "Decoding process for reference picture
lists construction".
pkt_dts needs to be set manually when using the receive_frame() callback, so
it was unset after 2fdecbb239.
Fixes PTS guessing for certain files with broken timestamps. Cf.
https://github.com/mpv-player/mpv/issues/14806
Reported-by: llyyr <llyyr.public@gmail.com>
Handle them together with other sps-dependent arrays.
Note that current code only allocates these arrays when hwaccel is not
set, but this is wrong as the relevant code runs BEFORE get_format() is
called and hence before we know whether hwaccel is in use.