To do so, simply add these init files to X86ASM-OBJS instead of OBJS
in the Makefile. The former is already used for the actual assembly
files, but using them for the C init files just works, because the build
system uses file extensions to derive whether it is a C or a NASM file.
This avoids compiling unused function stubs and also reduces our
reliance on DCE: We don't add %if checks to the asm files except
for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4
functions will be available. It also allows to remove HAVE_X86ASM checks
in these init files.
Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
seq_decode is used to ensure that a picture and all of its reference
pictures use the same SPS. Any time the SPS changes, seq_decode should
be incremented. Prior to this patch, seq_decode was incremented in
frame_context_setup, which is called after the SPS is potentially
changed in decode_sps. Should the decoder encounter an error between
changing the SPS and incrementing seq_decode, the SPS could be modified
while seq_decode was not incremented, which could lead to invalid
reference pictures and various downstream issues. By instead updating
seq_decode within the picture set manager, we ensure seq_decode and the
SPS are always updated in tandem.
This loosens the coupling between CBS and the decoder by no longer using
CodedBitstreamH266Context (containing the most recently parsed PSs & PH)
to retrieve the PSs & PH in the decoder. Doing so is beneficial in two
ways:
1. It improves robustness to the case in which an AVPacket doesn't
contain precisely one PU.
2. It allows the decoder parameter set manager to properly handle the
case in which a single PU (erroneously) contains conflicting
parameter sets.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Right now, the private contexts of every decoder supporting
H.274 film grain synthesis (namely H.264, HEVC and VVC)
contain a H274FilmGrainDatabase; said structure is very large
700442B before this commit) and takes up the overwhelming
majority of said contexts: Removing it reduces sizeof(H264Context)
by 92.88%, sizeof(HEVCContext) by 97.78% and sizeof(VVCContext)
by 99.86%. This is especially important for H.264 and HEVC
when using frame-threading.
The content of said film grain database does not depend on
any input parameter; it is shareable between all its users and
could be hardcoded in the binary (but isn't, because it is so huge).
This commit adds a database with static storage duration to h274.c
and uses it instead of the elements in the private contexts above.
It is still lazily initialized as-needed; a mutex is used
for the necessary synchronization. An alternative would be to use
an AV_ONCE to initialize the whole database either in the decoders'
init function (which would be wasteful given that most videos
don't use film grain synthesis) or in ff_h274_apply_film_grain().
Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows compilers to optimize accesses like
ff_vvc_diag_scan_x[2][2][x] by baking the offset derived
from [2][2] into the relocation (so that it is performed
at link-time).
Reviewed-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_vvc_palette_escape_val() can return AVERROR in which case the
coeff*scale will overflow.
Fixes: runtime error: signed integer overflow: -1094995529 * 6528 cannot
be represented in type 'int'
Fixes: 435225406/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-5118570024730624
Found-by: OSS-Fuzz
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Consider the following sequence of NALUs (with some PPSs etc. omitted
for brevity):
1. SPS (ID=0, content=A)
2. IDR (SPS=0)
3. IDR (SPS=0)
4. SPS (ID=0, content=B)
5. TRAIL (SPS=0)
When decode_sps is called for NALU 3., ps->sps_id_used is cleared as
IDRs are one way of forming a CLVSS. Then, old_sps is non-NULL
containing the result of calling decode_sps for NALU 2. We haven't
received any SPSs between NALUs 2. and 3., therefore old_sps and rsps
are identical and the function returns. The issue is that, at this
point, ps->sps_id_used is still zero despite the SPS being used for IDR
3. This results in the check for conflicting SPSs not working properly
when decode_sps is called for NALU 5., allowing prediction between
pictures with different SPSs and probably all sorts of other
shenanigans.
Patch addresses the problem outlined above by also setting
ps->sps_id_used in the early return case.
Prior to this patch, kth_order_egk_decode could read arbitrarily
large values which then overflowed and caused various issues.
Patch fixes this by making kth_order_egk_decode falliable,
requiring the caller to specify an upper bound and returning an
error if the read value would exceed that bound.
This patch resolves the same issue as
eb52251c0a, but I think this is the proper
fix as it also addresses issues with syntax elements besides
ff_vvc_num_signalled_palette_entries.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Fixes: index 107 out of bounds for type 'uint16_t const[63]'
Fixes: 421336912/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-6436225806565376
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The ret value is checked later on again, so this check
is redundant and would cause the frame to not be unrefd on
failure as well.
So remove this check and add one before av_frame_remove_side_data
to ensure it is not called with an invalid frame.
Fix CID 1648350
Reviewed-by: Frank Plowman <post@frankplowman.com>
Add handling here for
sps_scaling_matrix_for_alternative_colour_space_disabled_flag.
Also add parentheses to make behaviour a little more explicit,
where &&'s precedence over || was relied on previously.
Reported-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Frank Plowman <post@frankplowman.com>
While the current code iterated over the messages, it always returned
in the first iteration. Instead keep iterating and warn for failure to
parse. At time of writing, none of the parsing functions seems to
actually return an error, ever.
Fix CID 1648348
When checking for filmgrain here, needs_fg can be true even when
film_grain_characteristics is NULL (when aom_film_grain.enable is true),
therefore this check could end up dereferencing film_grain_characteristics
even though it is NULL.
Fix CID 1648347
Add three missing requirements on bitstream conformance from 7.4.3.19 of
H.266 (V3). Issue found using fuzzing.
Signed-off-by: Frank Plowman <post@frankplowman.com>
When called for palette-predicted CUs, boundary_strength could cause
undefined behaviour due to accessing uninitialised motion information.
The spec doesn't include this, but in the reference software it seems
the deblock strength is always set to 0 for palette CUs due to some
implementation details: perhaps this is a spec issue?
Signed-off-by: Frank Plowman <post@frankplowman.com>
In d5dbcc00d8, it was hoped that detection
of subpicture overlaps could be performed at the tile level, so as to
avoid introducing per-CTU checks. Unfortunately since that patch,
fuzzing has indicated there are some structures involving
pps_subpic_one_or_more_tiles_slice where tile-level checking is not
sufficient. Performing the check at the CTU level should (touch wood)
be the be-all and and-all of this, as CTUs are the lowest common
denominator of the picture partitioning.
Signed-off-by: Frank Plowman <post@frankplowman.com>