Commit graph

53979 commits

Author SHA1 Message Date
Marvin Scholz
3daf664a5a avcodec: cbs_lcevc: remove dead code
The error is already handled before the loop, so this can never be true.

Fix Coverity issue 1683139
2026-04-28 12:24:54 +00:00
Jun Zhao
3cdd76ba96 avcodec/libsvtav1: reject tiny inputs on SVT-AV1 < 3.0.0
SVT-AV1 < 3.0.0 requires input dimensions of at least 64x64.
Older versions may otherwise silently accept smaller inputs without
producing output and cause the caller to hang. Reject such inputs
explicitly in config_enc_params() to produce a clear error.
v3.0.0+ supports sub-64px dimensions and validates the
input itself, so the check is gated with SVT_AV1_CHECK_VERSION.

Fix #22817

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Zuxy Meng
c55ab93eef avcodec/x86/h264_intrapred: Replace pred8x8_horizontal_8_mmxext with SSE2
Deprecating MMX. Instruction count unchanged.

Signed-off-by: Zuxy Meng <zuxy.meng@gmail.com>
2026-04-27 20:31:20 -07:00
Jeongkeun Kim
4ea59d5665 avcodec/aarch64: add NEON DCA LFE FIR filter functions
Port lfe_fir0_float and lfe_fir1_float to AArch64 NEON. These polyphase
FIR interpolation filters have an x86 SSE/AVX path but no AArch64
equivalent, falling back to scalar C.

The inner loop computes two dot products per output pair. Precomputing a
reversed LFE sample vector before the inner loop avoids per-iteration
shuffle overhead.

Benchmarks on AWS Graviton3 (Neoverse V1, c7g.xlarge):
  lfe_fir0_float: C 5902.0 cycles -> NEON 2135.0 cycles (2.77x)
  lfe_fir1_float: C 2836.3 cycles -> NEON 1527.8 cycles (1.86x)
Measured with: taskset -c 0 ./tests/checkasm/checkasm --test=dcadsp --bench,
3-run average, Ubuntu 22.04 (kernel 6.8.0-1052-aws), perf_event_paranoid=0.

Signed-off-by: Jeongkeun Kim <variety0724@gmail.com>
2026-04-27 20:13:23 +00:00
Georgii Zagoruiko
1ced59326a aarch64/vvc: Optimisations of put_chroma_hv() functions for 10/12-bit
Apple M4:
put_chroma_hv_10_2x2_c:                                  9.1 ( 1.00x)
put_chroma_hv_10_4x4_c:                                 20.1 ( 1.00x)
put_chroma_hv_10_8x8_c:                                 35.6 ( 1.00x)
put_chroma_hv_10_8x8_neon:                              15.4 ( 2.31x)
put_chroma_hv_10_16x16_c:                              113.7 ( 1.00x)
put_chroma_hv_10_16x16_neon:                            57.0 ( 1.99x)
put_chroma_hv_10_32x32_c:                              406.9 ( 1.00x)
put_chroma_hv_10_32x32_neon:                           225.7 ( 1.80x)
put_chroma_hv_10_64x64_c:                             1498.8 ( 1.00x)
put_chroma_hv_10_64x64_neon:                           876.2 ( 1.71x)
put_chroma_hv_10_128x128_c:                           5757.0 ( 1.00x)
put_chroma_hv_10_128x128_neon:                        3446.6 ( 1.67x)
put_chroma_hv_12_2x2_c:                                  9.9 ( 1.00x)
put_chroma_hv_12_4x4_c:                                 19.2 ( 1.00x)
put_chroma_hv_12_8x8_c:                                 36.1 ( 1.00x)
put_chroma_hv_12_8x8_neon:                              17.9 ( 2.02x)
put_chroma_hv_12_16x16_c:                              112.2 ( 1.00x)
put_chroma_hv_12_16x16_neon:                            55.6 ( 2.02x)
put_chroma_hv_12_32x32_c:                              416.6 ( 1.00x)
put_chroma_hv_12_32x32_neon:                           224.3 ( 1.86x)
put_chroma_hv_12_64x64_c:                             1464.8 ( 1.00x)
put_chroma_hv_12_64x64_neon:                           860.1 ( 1.70x)
put_chroma_hv_12_128x128_c:                           5776.8 ( 1.00x)
put_chroma_hv_12_128x128_neon:                        3445.2 ( 1.68x)

RPi5:
put_chroma_hv_10_2x2_c:                                118.5 ( 1.00x)
put_chroma_hv_10_4x4_c:                                190.6 ( 1.00x)
put_chroma_hv_10_8x8_c:                                303.1 ( 1.00x)
put_chroma_hv_10_8x8_neon:                             172.6 ( 1.76x)
put_chroma_hv_10_16x16_c:                             1036.1 ( 1.00x)
put_chroma_hv_10_16x16_neon:                           626.7 ( 1.65x)
put_chroma_hv_10_32x32_c:                             3624.4 ( 1.00x)
put_chroma_hv_10_32x32_neon:                          2386.9 ( 1.52x)
put_chroma_hv_10_64x64_c:                            13612.1 ( 1.00x)
put_chroma_hv_10_64x64_neon:                          9314.8 ( 1.46x)
put_chroma_hv_10_128x128_c:                          52975.4 ( 1.00x)
put_chroma_hv_10_128x128_neon:                       37083.5 ( 1.43x)
put_chroma_hv_12_2x2_c:                                118.6 ( 1.00x)
put_chroma_hv_12_4x4_c:                                188.1 ( 1.00x)
put_chroma_hv_12_8x8_c:                                303.4 ( 1.00x)
put_chroma_hv_12_8x8_neon:                             176.7 ( 1.72x)
put_chroma_hv_12_16x16_c:                             1037.9 ( 1.00x)
put_chroma_hv_12_16x16_neon:                           626.5 ( 1.66x)
put_chroma_hv_12_32x32_c:                             3629.0 ( 1.00x)
put_chroma_hv_12_32x32_neon:                          2386.6 ( 1.52x)
put_chroma_hv_12_64x64_c:                            13649.0 ( 1.00x)
put_chroma_hv_12_64x64_neon:                          9313.6 ( 1.47x)
put_chroma_hv_12_128x128_c:                          52978.0 ( 1.00x)
put_chroma_hv_12_128x128_neon:                       37101.2 ( 1.43x)
2026-04-27 20:10:57 +00:00
Andreas Rheinhardt
3dd03e4d4c avcodec/libvorbisenc: Cleanup on error
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:08:46 +02:00
Andreas Rheinhardt
df70e8297b avcodec/libvorbisenc: Return error upon error
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:08:46 +02:00
Andreas Rheinhardt
51c5cc0ca3 avcodec/libvorbisenc: Fix leak of vorbis_comment upon error
Just free it immediately and unconditionally after it is no longer
needed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:08:46 +02:00
Semih Baskan
bdc0a29204 avcodec/nvenc: gate MVHEVC capability check on codec_id
NV_ENC_H264_PROFILE_HIGH_10 and NV_ENC_HEVC_PROFILE_MULTIVIEW_MAIN
both equal 3 when their respective NVENC_HAVE_* flags are defined.
The MVHEVC check in nvenc_check_capabilities() matches against
ctx->profile alone, so an H.264 encode with profile=high10 is
rejected as if it were an HEVC multiview request on hardware
without MVHEVC support.

Signed-off-by: Semih Baskan <strst.gs@gmail.com>
2026-04-26 19:25:12 +00:00
Macdu
78da965b58 avcodec/atrac9tab: correct base curve value 2026-04-25 22:44:39 +02:00
Ramiro Polla
08f56d4898 avcodec/webp: remove write-only lossless field from WebPContext 2026-04-23 16:46:42 +00:00
Ramiro Polla
9e8e82308b avcodec/webp: use av_fourcc2str() to print fourccs 2026-04-23 16:46:42 +00:00
Lynne
162ad61486
vulkan/ffv1: fix second-line linecache initialization for Golomb
This was a difficult problem to find.

Sponsored-by: Sovereign Tech Fund
2026-04-22 23:24:04 +02:00
Ashrit Shetty
9acd820732 avcodec/mfenc: populate video input type with size, rate, interlace
mf_encv_input_adjust() currently only validates the pixel format and
otherwise leaves the input IMFMediaType unchanged. The Microsoft
H.264, H.265 and AV1 encoder MFTs tolerate this and internally infer
the missing attributes from the previously-set output type. Other
MediaFoundation encoder MFTs that follow the specification more
strictly reject the input type with MF_E_INVALIDMEDIATYPE (due to
MF_E_ATTRIBUTENOTFOUND on MF_MT_FRAME_SIZE / MF_MT_FRAME_RATE) when
those attributes are absent, which causes IMFTransform::SetInputType
to fail and aborts encoding.

Set MF_MT_FRAME_SIZE, MF_MT_FRAME_RATE and MF_MT_INTERLACE_MODE on
the input media type, mirroring what mf_encv_output_adjust() already
writes to the output type. Behaviour on the Microsoft MFTs is
unchanged (they were already using these values) and encoding now
works with stricter third-party MFTs.

The MF_MT_FRAME_SIZE assignment has been present but commented out
since the original MediaFoundation wrapper was added in 050b72ab5e.

Signed-off-by: Ashrit Shetty <ashritshetty@microsoft.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2026-04-22 10:48:24 +03:00
Diego de Souza
0bba4603e2 avcodec/cuviddec: fix monochrome formats misclassified as YUV444
Monochrome formats (gray, gray10le) have log2_chroma_w == 0 and
log2_chroma_h == 0 because they have no chroma planes — the same
values as YUV444. This caused them to be misclassified as YUV444 by
the is_yuv444 detection introduced in bcea693f75.

After fed6612415 changed cuvid_test_capabilities to use is_yuv444
instead of hardcoding cudaVideoChromaFormat_420, monochrome AV1
streams were rejected with "Codec av1_cuvid is not supported with
this chroma format".

Add an nb_components > 1 guard to exclude single-component formats
from the YUV444 path.

Patch by: Aniket Dhok <adhok@nvidia.com>
Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2026-04-21 16:51:54 +02:00
Diego de Souza
afc8556a6a avcodec/cuviddec: handle 4-byte AV1CodecConfigurationRecord
AV1CodecConfigurationRecord may contain only the 4-byte header and no
configOBUs. Still skip the header in that case so only configOBUs are
passed to cuvidParseVideoData().

Otherwise the av1C header itself is treated as sequence header data
and AV1 decoding can fail with an unknown error.

Suggested-by: Aniket Dhok <adhok@nvidia.com>
Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2026-04-21 14:03:58 +00:00
Paul Adenot
99d8d3891f avcodec: Allow enabling DTX in libopusenc 2026-04-21 13:38:44 +00:00
Víctor Manuel Jáquez Leal
f4ac7dee87 avcodec/vulkan_encode_h265: fix capabilities flags
Replace the H264 ones.
2026-04-21 13:08:59 +00:00
Jun Zhao
75838b9c89 lavc/hevc: add aarch64 NEON for reference sample filtering
3-tap [1,2,1]>>2: shared implementation body across size-specialized
entry points (8x8/16x16/32x32) to reduce code size. Fold the 3-tap
kernel into uhadd + urhadd: uhadd gives floor((prev+next)/2), then
urhadd rounds with curr to produce (prev + 2*curr + next + 2) >> 2
on 16 bytes in-place (no widen/narrow needed). Overlap-last technique
for tail avoids partial stores. Caller pads input arrays by 16 bytes
to guarantee safe over-read.

Strong smoothing (32x32): preloaded weight tables, interleaved
umull/umlal pairs (two 16-byte blocks at a time) to hide
rshrn-to-store latency, with paired st1 for 32-byte writes.

checkasm --bench --runs=15 (Apple M4, average of 3 trials):
  ref_filter_3tap_8x8_8_neon:    4.1x
  ref_filter_3tap_16x16_8_neon:  3.3x
  ref_filter_3tap_32x32_8_neon:  2.5x
  ref_filter_strong_8_neon:      1.9x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-21 07:50:49 +00:00
Jun Zhao
a3d8e417c0 lavc/hevc: extract reference sample filter into function pointers
Extract 3-tap [1,2,1]>>2 and strong intra smoothing from
intra_pred() into HEVCPredContext function pointers, preparing
for arch-specific overrides.

ref_filter_3tap[3] indexed by log2_size - 3 (sizes 8/16/32).
ref_filter_strong for 32x32 luma only.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-21 07:50:49 +00:00
Zuxy Meng
dc23adde9b avcodec/x86/h264_intrapred: Replace pred8x8_dc_8_mmxext with SSE2
Deprecating MMX w/o performance regression; nearly identical performance
numbers on my Zen 4 (1.99x vs c)

Signed-off-by: Zuxy Meng <zuxy.meng@gmail.com>
2026-04-20 19:38:56 -07:00
Andreas Rheinhardt
6135ccbf80 avcodec/pdvenc: Return directly upon error
This encoder has the FF_CODEC_CAP_INIT_CLEANUP cap set.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt
87a6be19f8 avcodec/pdvenc: Remove always false check
av_image_check_size() already checks that width*height
fits into an int.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt
c94cb9c04f avcodec/pdvenc: Remove always-false pixel format check
Already checked via CODEC_PIXFMTS.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt
e908c92f5a avcodec/cavs: Don't allocate block separately
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:21:41 +02:00
Andreas Rheinhardt
415b466d41 avcodec/x86/vp3dsp: Port ff_vp3_idct_dc_add_mmxext to SSE2
This change should improve performance on Skylake and later
Intel CPUs (which have only half the ports for saturated adds/subs
for mmx register compared to xmm register): llvm-mca predicts
a 25% performance improvement on Skylake.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt
e7a613b274 avcodec/x86/vp3dsp: Avoid loads and stores
Instead reuse values in registers.

Old benchmarks:
idct_add_c:                                             74.2 ( 1.00x)
idct_add_sse2:                                          60.4 ( 1.23x)
idct_put_c:                                            100.8 ( 1.00x)
idct_put_sse2:                                          58.7 ( 1.72x)

New benchmarks:
idct_add_c:                                             74.2 ( 1.00x)
idct_add_sse2:                                          55.2 ( 1.34x)
idct_put_c:                                            107.5 ( 1.00x)
idct_put_sse2:                                          54.1 ( 1.99x)

Hint: For x64, all the intermediate stores could be avoided.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt
ed59fc77e8 avcodec/x86/vp3dsp: Use named args in idct functions
Also avoid REX prefixes while just at it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt
c1af56357b avcodec/x86/vp3dsp: Avoid unnecessary macro, repetition
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt
84b9de0633 avcodec/x86/vp3dsp: Port ff_put_vp_no_rnd_pixels8_l2_mmx to SSE2
This allows to use pavgb to reduce the amount of instructions used
to calculate the average; processing two rows via movhps allows
to reduce the amount of pxor and pavgb even further and turned
out to be beneficial.
This patch also avoids a load as the constant used here can be easily
generated at runtime.

Old benchmarks:
put_no_rnd_pixels_l2_c:                                 13.3 ( 1.00x)
put_no_rnd_pixels_l2_mmx:                               11.6 ( 1.15x)

New benchmarks:
put_no_rnd_pixels_l2_c:                                 13.4 ( 1.00x)
put_no_rnd_pixels_l2_sse2:                               7.5 ( 1.77x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:15:54 +02:00
Andreas Rheinhardt
79ce4432e0 avcodec/simple_idct10_template: Reduce amount of registers used
This allows to avoid the stack for the 8 bit simple IDCT;
for the other IDCTs, it avoids storing and restoring two
xmm registers on Win64.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 07:36:37 +02:00
Michael Niedermayer
d538a71ad5 avcodec/svq1dec: Check input space for minimum
We reject inputs that are significantly smaller than the smallest frame.
This check raises the minimum input needed before time consuming computations are performed
it thus improves the computation per input byte and reduces the potential DoS impact

Fixes: Timeout
Fixes: 472769364/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SVQ1_DEC_fuzzer-5519737145851904

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-18 18:32:50 +00:00
Vignesh Venkat
7faa6ee2aa libavformat/matroska: Support smpte 2094-50 metadata
Add support for parsing and muxing smpte 2094-50 metadata. It will
be stored as an ITUT-T35 message in the BlockAdditional element with
an AddId type of 4 (which is reserved for ITUT-T35 in the matroska
spec).

https://www.matroska.org/technical/codec_specs.html#itu-t35-metadata

Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
2026-04-17 18:51:25 +00:00
Lynne
c1b19ee69f
aacdec: add support for 960-frame HE-AAC (DAB+) decoding
Finally, after so many years. I'm sure there's good DAB+ content
out there being broadcast. Go and listen to it.
2026-04-17 16:46:52 +02:00
Hassan Hany
9e4041d5ea avcodec/opus: use precomputed NLSF weights for Silk decoder
Precompute the SILK NLSF residual weights from the stage-1 codebooks and use the table during LPC decode. This removes the per-coefficient mandated fixed-point weight calculation in silk_decode_lpc() while preserving the same decoded values.
2026-04-17 14:39:20 +00:00
Peter von Kaenel
d013863f00 avcodec/lcevcdec: poll on LCEVC_Again from LCEVC_ReceiveDecoderPicture
The V-Nova LCEVC pipeline processes frames on internal background
worker threads. LCEVC_ReceiveDecoderPicture returns LCEVC_Again (-1)
when the worker has not yet completed the frame, which is the
documented "not ready, try again" response. The original code treated
any non-zero return as a fatal error (AVERROR_EXTERNAL), causing decode
to abort mid-stream.

Poll until LCEVC_Success or a genuine error is returned.

Signed-off-by: Peter von Kaenel <Peter.vonKaenel@harmonicinc.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-16 17:19:28 -03:00
Andreas Rheinhardt
9ab37ef918 avcodec/packet: Remove always-true check
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt
4b5e1d25c3 avcodec/decode: Short-circuit side-data processing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt
a85709537e avcodec/decode: Avoid temporary frame in ff_reget_buffer()
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt
b595b3075e avcodec/decode: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt
f99e4a0f23 avcodec/decode: Optimize call away if possible
post_process_opaque is only used by LCEVC, so it is unused
on most builds.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt
312bfd512d avcodec/decode: Remove always-true checks
dc->lcevc.ctx is only != NULL for video.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt
2d062dd0c6 avcodec/decode: Make post_process_opaque a RefStruct reference
Avoids the post_process_opaque_free callback; the only user of
this is already a RefStruct reference and presumably other users
would want to use a pool for this, too, so they would use
RefStruct-objects, too.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt
0ee1947d9b avcodec/lcevcdec: Use pool to avoid allocations of FFLCEVCFrame
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Hassan Hany
3b19a61837 avcodec/vorbisdec: validate windowtype and transformtype 2026-04-16 10:24:41 +00:00
Gyan Doshi
5abc240a27 avcodec/videotoolboxenc: add missing field and rectify cap flags 2026-04-16 09:58:06 +00:00
Andreas Rheinhardt
3de38c6b6e avcodec/h264chroma: Fix incorrect alignment documentation
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 07:36:01 +02:00
Andreas Rheinhardt
53a26f7d41 avcodec/mpegvideo_dec: Use C version of h264chroma mc2 functions
H.264 only uses these functions with height 2 or 4 and
the aarch64, arm and mips versions of them optimize based
on this. Yet this is not true when these functions are used
by the lowres code in mpegvideo_dec.c. So revert back to
the C versions of these functions for mpegvideo_dec so that
the H.264 decoder can still use fully optimized functions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 07:36:01 +02:00
James Almer
2feb213287 avcodec/lcevc: make CBS reallocate the LCEVC payload
Frame side data unfortunately lacks padding, which CBS needs, so we can't reuse
the existing AVBufferRef.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-15 16:13:44 -03:00
Stéphane Cerveau
3f9e04b489 vulkan: fix encode feedback query handling
Check that the driver supports both BUFFER_OFFSET and BYTES_WRITTEN
encode feedback flags before creating the query pool, failing with
EINVAL if either is missing.

Set these flags explicitly instead of masking off HAS_OVERRIDES with a
bitwise NOT, which could pass unrecognized bits from newer drivers to
vkCreateQueryPool causing validation errors and
crashes.
2026-04-14 21:31:45 +00:00