When encoding VP9 with a YUV pixel format (e.g. yuv420p) and
AVCOL_SPC_RGB colorspace metadata, libvpxenc unconditionally set
VPX_CS_SRGB. This produced a spec-violating bitstream: Profile 0
(4:2:0) with sRGB colorspace, which is only valid for Profile 1/3
(4:4:4). The resulting file is undecodable.
Fix this by setting ctx->vpx_cs to VPX_CS_SRGB in set_pix_fmt()
for 4:4:4 YUV formats when AVCOL_SPC_RGB is set, matching the
existing GBRP path. This covers the legitimate case of RGB data in
YUV444 containers (e.g. H.264 High 4:4:4 with identity matrix).
With this change, any AVCOL_SPC_RGB that reaches the switch in
set_colorspace() is guaranteed to be a subsampled format where
sRGB is invalid. Return an error so the user can fix their
pipeline rather than silently producing incorrect output.
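The decision described above can be sketched as follows. This is a
minimal illustration, not the libvpxenc code: the enum and function
names are hypothetical stand-ins for the AVCOL_SPC_* and VPX_CS_* values,
and the is_444 flag stands in for the pixel format check.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real libavutil / libvpx enums. */
enum sketch_colorspace { SKETCH_SPC_UNSPECIFIED, SKETCH_SPC_BT709, SKETCH_SPC_RGB };
enum sketch_vpx_cs     { SKETCH_CS_UNKNOWN, SKETCH_CS_BT_709, SKETCH_CS_SRGB };

/*
 * Returns 0 and stores the chosen colorspace in *out on success,
 * or -1 when sRGB is requested for a subsampled format.
 */
static int sketch_pick_vpx_cs(enum sketch_colorspace spc, bool is_444,
                              enum sketch_vpx_cs *out)
{
    if (spc == SKETCH_SPC_RGB) {
        if (!is_444)
            return -1;          /* subsampled input: sRGB would be invalid */
        *out = SKETCH_CS_SRGB;  /* RGB data carried in a 4:4:4 format */
        return 0;
    }
    *out = spc == SKETCH_SPC_BT709 ? SKETCH_CS_BT_709 : SKETCH_CS_UNKNOWN;
    return 0;
}
```

Returning an error for the subsampled case mirrors the intent stated
above: the caller is told to fix the pipeline instead of getting a file
that cannot be decoded.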
To reproduce:
ffmpeg -f lavfi -i testsrc=s=64x64:d=1:r=1 \
-c:v libvpx-vp9 -pix_fmt yuv420p -colorspace rgb bad.webm
ffprobe bad.webm
# -> "vp9 (Profile 0), none(pc, gbr/...), 64x64"
ffmpeg -i bad.webm -f null -
# -> 0 frames decoded, error
See also:
https://issues.webmproject.org/487307225
Signed-off-by: Guangyu Sun <gsun@roblox.com>
Signed-off-by: James Zern <jzern@google.com>
When returning early without updating any pixels, we previously
returned directly to the return address in the caller's scope,
bypassing one function entirely. While this may seem like a neat
optimization, it makes the return stack predictor mispredict
the returns, which can potentially cost more performance than
it gains.
Secondly, if the armv9.3 feature GCS (Guarded Control Stack) is
enabled, then returns _must_ match the expected value; this feature
is being enabled across Linux distributions, and by fixing the
hevc assembly, we can enable the security feature in ffmpeg as well.
Cap ulNumDecodeSurfaces to 32 and ulNumOutputSurfaces to 64 to prevent
cuvidCreateDecoder from failing with CUDA_ERROR_INVALID_VALUE when
initial_pool_size exceeds the hardware limits.
Also cap the decoder index pool (dpb_size) to 32 so that indices
handed out via av_refstruct_pool_get stay within the valid range
for cuvidDecodePicture's CurrPicIdx.
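A minimal sketch of the capping, assuming both surface counts are
derived from the same requested pool size. The macro, struct and
function names are hypothetical; only the limits 32 and 64 come from
the text above.

```c
/* Limits quoted in the text; the macro names are hypothetical. */
#define SKETCH_MAX_DECODE_SURFACES 32
#define SKETCH_MAX_OUTPUT_SURFACES 64

struct sketch_surface_counts {
    int decode;  /* would feed ulNumDecodeSurfaces */
    int output;  /* would feed ulNumOutputSurfaces */
};

static int sketch_min(int a, int b)
{
    return a < b ? a : b;
}

/* Clamp a requested pool size to the two limits independently. */
static struct sketch_surface_counts sketch_cap_surfaces(int requested)
{
    struct sketch_surface_counts c;
    c.decode = sketch_min(requested, SKETCH_MAX_DECODE_SURFACES);
    c.output = sketch_min(requested, SKETCH_MAX_OUTPUT_SURFACES);
    return c;
}
```

With a requested size of 100 this yields 32 decode surfaces and 64
output surfaces; smaller requests pass through unchanged.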
When unsafe_output is enabled, stop holding idx_ref in the unmap
callback. Since cuvidMapVideoFrame copies decoded data into an
independent output mapping slot, the decode surface index can safely
be reused as soon as the DPB releases it, without waiting for the
downstream consumer to release the mapped frame. This decouples the
decode surface index lifetime (max 32) from the output mapping slot
lifetime (max 64), eliminating the "No decoder surfaces left" error
that occurred when downstream components like nvenc held too many
frames.
Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
Fixes ticket #22420.
When the first decoded frame is type 1, xan_decode_frame_type1() reads y_buffer as prior-frame state before any data has been written to it.
Since y_buffer is allocated with av_malloc(), this may propagate uninitialized heap data into the decoded luma output.
Allocate y_buffer with av_mallocz() instead.
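The effect of the change can be illustrated with standard C, using
calloc() as a stand-in for av_mallocz(). The helper names are
hypothetical; the point is only that a zeroed buffer gives a type 1
first frame a defined prior state to read.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for av_mallocz(): returns zero-filled memory or NULL. */
static uint8_t *sketch_alloc_zeroed(size_t size)
{
    return calloc(1, size);
}

/* Returns 1 if every byte of buf is zero, 0 otherwise. */
static int sketch_all_zero(const uint8_t *buf, size_t size)
{
    for (size_t i = 0; i < size; i++)
        if (buf[i])
            return 0;
    return 1;
}
```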
Fixes UB in the form of adding a 0 offset to a NULL pointer, and subtracting a
NULL pointer from another.
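For reference, a generic guard pattern for both cases is sketched
below. It is not the code from this change, and the helper names are
hypothetical. It only shows one way to avoid evaluating NULL + 0 and
NULL - NULL, both of which are undefined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Advance p by n bytes without ever evaluating NULL + 0. */
static const uint8_t *sketch_advance(const uint8_t *p, size_t n)
{
    return n ? p + n : p;
}

/* Distance between two pointers, treating a NULL operand as "empty". */
static ptrdiff_t sketch_distance(const uint8_t *end, const uint8_t *start)
{
    return (end && start) ? end - start : 0;
}
```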
Signed-off-by: James Almer <jamrial@gmail.com>
The buffers are allocated using the worst case scenario of the entire NALU
being written, which is often not the case.
Signed-off-by: James Almer <jamrial@gmail.com>
The specification for LCEVC states that start codes may be three or four bytes
long except for the first NALU in an AU, which must be four bytes long.
Signed-off-by: James Almer <jamrial@gmail.com>
The specification for H.26{4,5,6} states that start codes may be three or four
bytes long, except for the first NALU in an AU and for NALUs of parameter
set types, which must be four bytes long.
This is checked by ff_cbs_h2645_unit_requires_zero_byte(), which is made
available outside of CBS for this change.
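The rule can be sketched as below. This is an illustration only: the
function name and the two boolean inputs are hypothetical, and the
sketch picks the three byte form wherever the four byte form is not
required.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write an Annex B start code to dst (which must hold 4 bytes) and
 * return its length: 4 when the zero byte is required, 3 otherwise.
 */
static size_t sketch_write_start_code(uint8_t *dst, bool first_in_au,
                                      bool is_parameter_set)
{
    static const uint8_t code[4] = { 0, 0, 0, 1 };
    size_t len = (first_in_au || is_parameter_set) ? 4 : 3;

    /* The 3 byte code is the 4 byte code without its leading zero. */
    memcpy(dst, code + (4 - len), len);
    return len;
}
```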
Signed-off-by: James Almer <jamrial@gmail.com>
The correct syntax after country_code is:
t35_uk_country_code_second_octet b(8)
t35_uk_manufacturer_code_first_octet b(8)
t35_uk_manufacturer_code_second_octet b(8)
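A minimal sketch of reading these fields from a byte buffer, assuming
country_code is the single byte that precedes them. The struct and
function names are hypothetical, and merging the two manufacturer
octets into one 16-bit value is a convenience of the sketch, not
something the syntax requires.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_t35_uk {
    uint8_t  country_code;
    uint8_t  country_code_second_octet;
    uint16_t manufacturer_code;  /* first octet in the high byte */
};

/* Returns 0 on success, -1 if fewer than 4 bytes are available. */
static int sketch_parse_t35_uk(const uint8_t *buf, size_t size,
                               struct sketch_t35_uk *out)
{
    if (size < 4)
        return -1;
    out->country_code              = buf[0];
    out->country_code_second_octet = buf[1];
    out->manufacturer_code         = (uint16_t)((buf[2] << 8) | buf[3]);
    return 0;
}
```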
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: use of uninitialized memory
Fixes: 490707906/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_EXR_DEC_fuzzer-6310933506097152
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This also reverts: c2364e9222
Fixes: out of array access (testcase exists but did not replicate for me)
Found-by: Gil Portnoy <dddhkts1@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: invalid state leading to out of array access
Fixes: 490615782/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-4711353817563136
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
An H.264 picture with 65536 slices makes slice_num collide with the
slice_table sentinel. slice_table is uint16_t, initialized via
memset(..., -1, ...) so spare entries (one per row, mb_stride =
mb_width + 1) stay 0xFFFF. slice_num is an uncapped ++h->current_slice.
At slice 65535 the collision makes slice_table[spare] == slice_num
pass, defeating the deblock_topleft check in xchg_mb_border and the
top_type zeroing in fill_decode_caches.
With both guards bypassed at mb_x = 0, top_borders[top_idx][-1]
underflows 96 bytes and XCHG writes at -88 below the allocation
(plus -72 and -56 for chroma in the non-444 path).
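The collision itself is easy to show in isolation. The sketch below
uses hypothetical names and a four entry table; the range check at the
end is one possible guard, given here as an assumption rather than as
the fix that was applied.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if slice_num compares equal to an untouched table entry. */
static int sketch_collides(unsigned slice_num)
{
    uint16_t slice_table[4];

    /* memset with -1 sets every byte to 0xFF, so each entry is 0xFFFF. */
    memset(slice_table, -1, sizeof(slice_table));
    return slice_table[0] == slice_num;
}

/* Hypothetical guard: keep slice numbers strictly below the sentinel. */
static int sketch_slice_num_valid(unsigned slice_num)
{
    return slice_num < 0xFFFF;
}
```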
Fixes: heap-buffer-overflow
Found-by: Nicholas Carlini <nicholas@carlini.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array read with --disable-safe-bitstream-reader
Fixes: poc_wmv2.avi
Note: this requires the user to turn off the safe bitstream reader and to disregard the security warning.
Change suggested by: Guanni Qu <qguanni@gmail.com>
Found-by: Guanni Qu <qguanni@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
`spectrum_decode` currently executes Frequency Domain (FD) decoding steps
for all channels, regardless of their `core_mode`. When a channel is in
Linear Prediction Domain (LPD) mode (`core_mode == 1`), FD-specific
parameters such as scalefactor offsets (`sfo`) and individual channel
stream (`ics`) information are not parsed.
This causes a global-buffer-overflow in `dequant_scalefactors`. Because
`spectrum_scale` is called on LPD channels, it uses stale or
uninitialized `sfo` values to index `ff_aac_pow2sf_tab`. In the reported
crash, a stale `sfo` value of 240 resulted in an index of 440
(240 + POW_SF2_ZERO), exceeding the table's size of 428.
Fix this by ensuring `spectrum_scale` and `imdct_and_windowing` are only
called for channels where `core_mode == 0` (FD).
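The arithmetic and the guard can be sketched as follows. The names are
hypothetical. The zero offset of 200 is implied by 240 + POW_SF2_ZERO
== 440 above, and the table size of 428 is taken from the same
paragraph.

```c
#define SKETCH_POW_SF2_ZERO 200  /* implied by 240 + POW_SF2_ZERO == 440 */
#define SKETCH_TABLE_SIZE   428  /* table size quoted in the text */

/* Returns 1 if sfo maps to a valid table index, 0 otherwise. */
static int sketch_sfo_in_range(int sfo)
{
    int idx = sfo + SKETCH_POW_SF2_ZERO;
    return idx >= 0 && idx < SKETCH_TABLE_SIZE;
}

struct sketch_channel {
    int core_mode;  /* 0 = FD, 1 = LPD */
    int scaled;     /* set when the FD steps ran for this channel */
};

/* Run the FD-only steps for FD channels; returns how many were processed. */
static int sketch_spectrum_decode(struct sketch_channel *ch, int nb_channels)
{
    int processed = 0;

    for (int i = 0; i < nb_channels; i++) {
        if (ch[i].core_mode != 0)
            continue;      /* LPD channel: sfo and ics were never parsed */
        ch[i].scaled = 1;  /* stands in for scaling and IMDCT/windowing */
        processed++;
    }
    return processed;
}
```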
Co-authored-by: CodeMender <codemender-patching@google.com>
Fixes: https://issues.oss-fuzz.com/486160985
Group assignments by filter family (qpel, epel), variant
(base, uni, bi, uni_w, bi_w) and direction (pixels, h, v, hv).
Add NEON8_FNASSIGN_QPEL_H macro to replace repeated manual
qpel horizontal assignments.
No functional change.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Add NEON-optimized implementations for HEVC QPEL uni-directional
weighted HV interpolation (put_hevc_qpel_uni_w_hv) at 8-bit depth,
for block widths 6, 12, 24, and 48.
These functions perform horizontal then vertical 8-tap QPEL filtering
with weighting (wx, ox, denom) and output to uint8_t. Previously
only widths 4, 8, 16, 32, 64 were implemented; this completes
coverage for all standard HEVC block widths.
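For orientation, the per-sample weighting step can be sketched in C as
below, assuming the usual HEVC weighted prediction arithmetic at 8-bit
depth (a shift of denom + 14 - 8). The function names are hypothetical,
and filtered stands for the intermediate value left after the
horizontal and vertical filter passes.

```c
#include <stdint.h>

static uint8_t sketch_clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Weight one filtered intermediate sample and clip it to 8 bits. */
static uint8_t sketch_uni_w_8bit(int filtered, int denom, int wx, int ox)
{
    int shift  = denom + 6;         /* denom + 14 - bit depth (8) */
    int offset = 1 << (shift - 1);  /* rounding term */

    return sketch_clip_u8(((filtered * wx + offset) >> shift) + ox);
}
```

With denom = 0, wx = 1 and ox = 0 the step reduces to a rounded shift
by 6, so an intermediate of 100 << 6 maps back to 100.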
Performance results on Apple M4:
./tests/checkasm/checkasm --test=hevc_pel --bench
put_hevc_qpel_uni_w_hv6_8_neon: 3.11x
put_hevc_qpel_uni_w_hv12_8_neon: 3.19x
put_hevc_qpel_uni_w_hv24_8_neon: 2.26x
put_hevc_qpel_uni_w_hv48_8_neon: 1.80x
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Move the subs instruction before the store macro in the 8x-unrolled
loops of qpel_uni_w_v4/v8/v16/v64 and qpel_uni_w_hv4/hv8/hv16, so
that many NEON instructions from the store macro separate it from the
conditional branch. This gives the CPU pipeline time to resolve the
condition flags before the branch decision.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Not only do some sources not provide an aspect ratio, as is the case with
MPEG-TS, but some enhanced streams also have no change in dimensions, and this
heuristic would generate bogus values.
Instead, we need to parse the LCEVC bitstream for a Global Config process block
in order to get the actual dimensions. This adds a little overhead, but it can't
be avoided.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: use after free
Fixes: 478301106/clusterfuzz-testcase-minimized-ffmpeg_dem_HEVC_fuzzer-6155792247226368
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array writes
Fixes: 492054712/clusterfuzz-testcase-minimized-ffmpeg_BSF_EXTRACT_EXTRADATA_fuzzer-5705993148497920
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>