Commit graph

53858 commits

Author SHA1 Message Date
James Almer
ad7d270935 avcodec/libdav1d: call ff_attach_decode_data() on output frames
This will allow the injection of LCEVC side data.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer
823c6fc0b8 avcodec/decode: make LCEVC injection available to decoders that don't call ff_get_buffer()
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer
8528c697c7 avcodec/av1dec: add support for LCEVC ITU-T35 payloads
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer
4c7a8df34d avcodec/av1dec: refactor parsing ITU-T35 metadata
Use a switch case. Will be useful in the following commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer
29d8c2af4d avcodec/libdav1d: add support for LCEVC ITU-T35 payloads
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer
fe1ffd63fb avcodec/libdav1d: refactor parsing ITU-T35 metadata
Use a switch case. Will be useful in the following commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
Andreas Rheinhardt
1a7979a2f8 avcodec/x86/h26x/h2656_inter: Simplify splatting coefficients
For pre-AVX2, vpbroadcastw is emulated via a load, followed
by two shuffles. Yet given that one always wants to splat
multiple pairs of coefficients which are adjacent in memory,
one can do better than that: Load all of them at once, perform
a punpcklwd with itself and use one pshufd per register.
In case one has to sign-extend the coefficients, too,
one can replace the punpcklwd with one pmovsxbw (instead of one
per register) and use pshufd directly afterwards.

This saved 4816B of .text here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt
a72b00675c avcodec/x86/h26x/h2656_inter: Don't prepare unused coeffs for hv funcs
8 tap motion compensation functions with both vertical and horizontal
components are under severe register pressure, so that the filter
coefficients have to be put on the stack. Before this commit,
this meant that coefficients for use with pmaddubsw and pmaddwd
were always created. Yet this is completely unnecessary, as
every such register is only used for exactly one purpose and
it is known at compile time which one it is (only 8bit horizontal
filters are used with pmaddubsw), so only prepare that one.
This also allows to half the amount of stack used.

This saves 2432B of .text here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt
88870f33ab avcodec/x86/h26x/h2656_inter: Remove always-true checks
It has already been checked before that we are only dealing
with high bitdepth here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt
c00721310f avcodec/x86/hevc/deblock: Avoid vmovdqa
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt
4c179adeaf avcodec/Makefile: Add avformat->h2645_parse.o lcevctab.o dependencies
Fixes static --disable-everything builds.
Forgotten in 053822d9ce
and 49c449b33a.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 23:25:31 +01:00
Andreas Rheinhardt
e91727e7ef avcodec/x86/mpeg4videodsp: Fix build failure without x86asm
Since ba793127c4,
the x86 mpeg4videodsp code uses ff_emulated_edge_mc_sse2()
instead of ff_emulated_edge_mc_8. This leads to linker errors
when x86asm is disabled. Fix this by also falling back to ff_gmc_c()
in case edge emulation is needed with external SSE2 being unavailable.

An alternative is to go back to ff_emulated_edge_mc_8(), but this
would readd the uglyness to videodsp for a niche case.

Reported-by: James Almer <jamrial@gmail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 22:39:05 +01:00
James Almer
a5c10346fc avcodec/lcevcdec: do nothing with unsupported pixel formats
Instead of failing and stopping the decoding process.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 18:33:12 -03:00
James Almer
d069ba22ff avcodec/decode: don't try to apply LCEVC enhancements if some other kind of post processing is active
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 20:14:13 +00:00
James Almer
d6a22cda38 avcodec/decode: add a hwaccel specific post_process callback to FrameDecodeData
Leave the existing one for non decoder-specific, post processing usage.
With this, scenarios like nvdec decoding can work algonside lcevc enhancement application.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 20:14:13 +00:00
Priyanshu Thapliyal
d1bcaab230 avcodec/alsdec: preserve full float value in zero-truncated samples
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-28 12:18:37 +00:00
Priyanshu Thapliyal
febc82690d avcodec/alsdec: propagate read_diff_float_data() errors in read_frame_data()
The return value of read_diff_float_data() was previously ignored,
allowing decode to continue silently with partially transformed samples
on malformed floating ALS input. Check and propagate the error.

All failure paths in read_diff_float_data() already return
AVERROR_INVALIDDATA, so the caller fix is sufficient without
any normalization inside the function.

Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-28 11:53:38 +00:00
Andreas Rheinhardt
bb65b54f2f avcodec/x86/sbcdsp: Port MMX sbc_calc_scalefactors to SSE4
Besides giving a nice speedup over the MMX version,
it also avoids processing unnecessarily much input and
touching unnecessarily much output in the 2ch-4subbands case.

calc_scalefactors_1ch_4subbands_c:                     106.9 ( 1.00x)
calc_scalefactors_1ch_4subbands_mmx:                    46.7 ( 2.29x)
calc_scalefactors_1ch_4subbands_sse4:                   11.8 ( 9.05x)
calc_scalefactors_1ch_8subbands_c:                     220.5 ( 1.00x)
calc_scalefactors_1ch_8subbands_mmx:                    92.3 ( 2.39x)
calc_scalefactors_1ch_8subbands_sse4:                   23.8 ( 9.28x)
calc_scalefactors_2ch_4subbands_c:                     222.5 ( 1.00x)
calc_scalefactors_2ch_4subbands_mmx:                   139.3 ( 1.60x)
calc_scalefactors_2ch_4subbands_sse4:                   23.6 ( 9.41x)
calc_scalefactors_2ch_8subbands_c:                     440.3 ( 1.00x)
calc_scalefactors_2ch_8subbands_mmx:                   196.8 ( 2.24x)
calc_scalefactors_2ch_8subbands_sse4:                   46.5 ( 9.48x)

The MMX version has been removed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
cd886bf0a5 avcodec/x86/sbcdsp: Port ff_sbc_analyze_[48]_mmx to SSE2
Halfs the amount of pmaddwd and improves performance a lot:
sbc_analyze_4_c:                                        55.7 ( 1.00x)
sbc_analyze_4_mmx:                                       7.0 ( 7.94x)
sbc_analyze_4_sse2:                                      4.3 (12.93x)
sbc_analyze_8_c:                                       131.1 ( 1.00x)
sbc_analyze_8_mmx:                                      22.4 ( 5.84x)
sbc_analyze_8_sse2:                                     10.7 (12.25x)

It also saves 224B of .text and allows to remove the emms_c()
from sbcenc.c (notice that ff_sbc_calc_scalefactors_mmx()
issues emms on its own, so it already abides by the ABI).

Hint: A pshufd could be avoided per function if the constants
were reordered.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
90215634f1 avcodec/sbcenc: Remove redundant memset()
A codec's private context is zero-allocated.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
f670006960 avcodec/sbcenc: Use correct size for PutBitContext
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
3540a6a308 avcodec/sbcenc: Don't output uninitialized data
Check in init whether the parameters are valid.
This can be triggered with
ffmpeg -i tests/data/asynth-44100-2.wav -c sbc -sbc_delay 0.001 \
-b:a 100k -f null -

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
b5ce98b3ff avcodec/sbcdsp: Constify
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
0a81a1ce66 avcodec/x86/sbcdsp: Fix calculating four-subbands stereo scalefactors
sbc_calc_scalefactors uses an int32_t [16/*max blocks*/][2/*max
channels*/][8/*max subbands*/] array. The MMX version of this code
treats the two inner arrays as one [2*8] array to process
and it processes subbands*channels of them. But when subbands
is < 8 and channels is two, the entries to process are not
contiguous: One has to process 0..subbands-1 and 8..7+subbands,
yet the code processed 0..2*subbands-1.
This commit fixes this by processing entries 0..7+subbands
if there are two channels.

Before this commit, the following command line triggered an
av_assert2() in put_bits():
ffmpeg_g -i tests/data/asynth-44100-2.wav -c sbc -b:a 200k \
-sbc_delay 0.003 -f null -

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
1c9f56f969 avcodec/sbc: Use union to save space
One buffer is encoder-only, the other decoder-only.
Also move crc_ctx before the buffers (into padding).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
7e032d6963 avcodec/sbcdec: Remove AVClass* from context
This decoder has no private class.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
James Almer
eb40d70081 avcodec/lcevcdec: add missing pixel formats
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-27 21:00:14 -03:00
James Almer
96b1b0bf67 avcodec/lcevcdec: also decompose NON_IDR NALUs
The first Global Config process block may be in one of them.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-27 20:56:45 -03:00
Anton Khirnov
3befae81f1 lavc/decode: change sw format selection logic in avcodec_default_get_format()
Choose the first non-hwaccel format rather than the last one. This
matches the logic in ffmpeg CLI and selects YUVA rather than YUV for
HEVC with alpha.
2026-03-27 19:42:08 -03:00
Andreas Rheinhardt
6ed6815b46 avcodec/tests/motion: Remove test tool
It only tests MMX (me_cmp does not have pure MMX functions any more)
and MMXEXT and is therefore x86-only. Furthermore, checkasm is superior
in every regard.

Removing it also fixes a build failure (there is no dependency of this
tool on me_cmp).

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-27 18:48:48 +01:00
osamu620
edab091ac2 avcodec/jpeg2000: Remove trailing whitespace
Remove trailing whitespace
2026-03-27 13:56:00 +00:00
Osamu Watanabe
8490363634 avcodec/jpeg2000: Fix undefined behavior on ROI shift-up 2026-03-27 13:56:00 +00:00
Georgii Zagoruiko
1c385023aa aarch64/vvc: Optimisations of put_chroma_v() functions for 10/12-bit
Apple M4:
put_chroma_v_10_2x2_c:                                   5.8 ( 1.00x)
put_chroma_v_10_4x4_c:                                   9.0 ( 1.00x)
put_chroma_v_10_4x4_neon:                                1.7 ( 5.29x)
put_chroma_v_10_8x8_c:                                  22.1 ( 1.00x)
put_chroma_v_10_8x8_neon:                                5.8 ( 3.79x)
put_chroma_v_10_16x16_c:                                56.3 ( 1.00x)
put_chroma_v_10_16x16_neon:                             21.2 ( 2.66x)
put_chroma_v_10_32x32_c:                               181.6 ( 1.00x)
put_chroma_v_10_32x32_neon:                             86.9 ( 2.09x)
put_chroma_v_10_64x64_c:                               680.3 ( 1.00x)
put_chroma_v_10_64x64_neon:                            337.4 ( 2.02x)
put_chroma_v_10_128x128_c:                            2567.3 ( 1.00x)
put_chroma_v_10_128x128_neon:                         1374.8 ( 1.87x)
put_chroma_v_12_2x2_c:                                   6.4 ( 1.00x)
put_chroma_v_12_4x4_c:                                   8.2 ( 1.00x)
put_chroma_v_12_4x4_neon:                                1.5 ( 5.56x)
put_chroma_v_12_8x8_c:                                  18.9 ( 1.00x)
put_chroma_v_12_8x8_neon:                                5.7 ( 3.29x)
put_chroma_v_12_16x16_c:                                52.6 ( 1.00x)
put_chroma_v_12_16x16_neon:                             19.9 ( 2.65x)
put_chroma_v_12_32x32_c:                               185.7 ( 1.00x)
put_chroma_v_12_32x32_neon:                             81.9 ( 2.27x)
put_chroma_v_12_64x64_c:                               661.8 ( 1.00x)
put_chroma_v_12_64x64_neon:                            342.1 ( 1.93x)
put_chroma_v_12_128x128_c:                            2547.8 ( 1.00x)
put_chroma_v_12_128x128_neon:                         1368.0 ( 1.86x)

RPi4:
put_chroma_v_10_2x2_c:                                  64.8 ( 1.00x)
put_chroma_v_10_4x4_c:                                 157.2 ( 1.00x)
put_chroma_v_10_4x4_neon:                               39.7 ( 3.96x)
put_chroma_v_10_8x8_c:                                 562.1 ( 1.00x)
put_chroma_v_10_8x8_neon:                               98.8 ( 5.69x)
put_chroma_v_10_16x16_c:                              1170.7 ( 1.00x)
put_chroma_v_10_16x16_neon:                            380.7 ( 3.07x)
put_chroma_v_10_32x32_c:                              3696.6 ( 1.00x)
put_chroma_v_10_32x32_neon:                           1723.8 ( 2.14x)
put_chroma_v_10_64x64_c:                             13170.9 ( 1.00x)
put_chroma_v_10_64x64_neon:                           7284.1 ( 1.81x)
put_chroma_v_10_128x128_c:                           46068.3 ( 1.00x)
put_chroma_v_10_128x128_neon:                        27219.5 ( 1.69x)
put_chroma_v_12_2x2_c:                                  63.8 ( 1.00x)
put_chroma_v_12_4x4_c:                                 156.5 ( 1.00x)
put_chroma_v_12_4x4_neon:                               39.3 ( 3.98x)
put_chroma_v_12_8x8_c:                                 560.9 ( 1.00x)
put_chroma_v_12_8x8_neon:                               98.7 ( 5.68x)
put_chroma_v_12_16x16_c:                              1169.9 ( 1.00x)
put_chroma_v_12_16x16_neon:                            380.8 ( 3.07x)
put_chroma_v_12_32x32_c:                              3693.9 ( 1.00x)
put_chroma_v_12_32x32_neon:                           1728.4 ( 2.14x)
put_chroma_v_12_64x64_c:                             13170.9 ( 1.00x)
put_chroma_v_12_64x64_neon:                           7284.9 ( 1.81x)
put_chroma_v_12_128x128_c:                           46068.0 ( 1.00x)
put_chroma_v_12_128x128_neon:                        27224.6 ( 1.69x)
2026-03-27 13:42:50 +00:00
Priyanshu Thapliyal
ae6f233988 avcodec/alsdec: fix mantissa unpacking in compressed Part A path
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-26 16:25:09 +00:00
Priyanshu Thapliyal
e7b4ddc9d6 avcodec/pngdec: fix dead overflow check in decode_text_to_exif()
The expression (exif_len & ~SIZE_MAX) is always 0 for size_t,
making the overflow guard permanently dead code.

Reported-by: Guanni Qu <qguanni@gmail.com>
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-25 16:48:12 +00:00
Aleksoid
e84b3c7e98 avcodec/vp9: Fixed memory leak when vp9_frame_alloc() function fails. 2026-03-25 14:31:34 +00:00
Kacper Michajłow
e17d84ac8a avcodec/vp9: fix cbs fragment leak on error
Fixes: c0bf1382a7
Fixes: 490257166/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VP9_fuzzer-6185031050788864
Fixes: 490131106/clusterfuzz-testcase-minimized-fuzzer_loadfile-5438205762797568
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-25 14:02:19 +00:00
Zhao Zhili
44ad73031d avcodec/bsf/lcevc_metadata: fix copy-paste typo in chroma loc setup
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-25 12:19:46 +00:00
Priyanshu Thapliyal
1853c80e20 avcodec/alsdec: fix abs(INT_MIN) UB in read_diff_float_data()
Replace abs() with FFABSU() to avoid undefined behavior when
raw_samples[c][i] == INT_MIN. Per libavutil/common.h, FFABS()
has the same INT_MIN UB as abs(); FFABSU() is the correct
helper as it casts to unsigned before negation.

Reported-by: Guanni Qu <qguanni@gmail.com>
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-25 00:16:41 +00:00
Andreas Rheinhardt
92d06a8027 avcodec/vvc/ctu: Put scratchbufs into union to save space
This reduces sizeof(VVCLocalContext) from 4580576B to
3408032B here.

Reviewed-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 18:12:00 +01:00
Andreas Rheinhardt
a10c731723 avcodec/vvc/ctu: Move often accessed fields to the start of structs
And move the big buffers to the end. This reduces codesize
as offset+displacement addressing modes are either unavailable
or require more bytes of displacement is too large. E.g. this
saves 5952B on x64 here and 3008B on AArch64. This change should
also improve data locality.

Reviewed-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 18:10:55 +01:00
Andreas Rheinhardt
e41799d6ec avcodec/vvc: Use static_assert where appropriate
Reviewed-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 18:09:43 +01:00
James Almer
d61d724905 avcodec/bsf/lcevc_metadata: write Aditional Info blocks after the Global Config block
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-24 11:14:17 -03:00
James Almer
35a1e43a6a avcodec/cbs_lcevc: fix writing process blocks with size 6
6 is an undefined value for payload_size_type. For those, 7 is used to signal
a custom_byte_size synxtax element.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-24 11:12:25 -03:00
Michael Niedermayer
1bde76da89 avcodec/dvdsub_parser: Fix buf_size check
Fixes: signed integer overflow
Fixes: out of array access
Fixes: dvdsub_int_overflow_mixed_ps.mpg

Found-by: Quang Luong of Calif.io in collaboration with OpenAI Codex
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-22 00:33:26 +00:00
Andreas Rheinhardt
9d97771bc6 avcodec/bsf/extract_extradata: Remove pointless checks
It doesn't hurt to keep track of filtered_size:
The end result will be ignored if extradata is not removed
from the bitstream.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-21 15:05:17 +00:00
Andreas Rheinhardt
1dd853010a avcodec/bsf/extract_extradata: Redo extracting LCEVC extradata
Changes compared to the current version include:
1. We no longer use a dummy PutByteContext on the first pass
for checking whether there is extradata in the NALU. Instead
the first pass no longer writes anything to any PutByteContext
at all; the size information is passed via additional int*
parameters. (This no longer discards const when initializing
the dummy PutByteContext, fixing a compiler warning.)
2. We actually error out on invalid data in the first pass,
ensuring that the second pass never fails.
3. The first pass is used to get the exact sizes of both
the extradata and the filtered data. This obviates the need
for reallocating the buffers lateron. (It also means
that the extradata side data will have been allocated with
av_malloc (ensuring proper alignment) instead of av_realloc().)
4. The second pass now writes both extradata and (if written)
the filtered data instead of parsing the NALUs twice.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-21 15:05:17 +00:00
Andreas Rheinhardt
548b9f5ca7 avcodec/bsf/extract_extradata: Inline constants
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-21 15:05:17 +00:00
Michael Niedermayer
313e776ba7 avcodec/ffv1dec: Allocate the minimum size for fltmap and fltmap32 with the current implementation
Found-by: Lynne
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-20 15:50:09 +00:00
Martin Storsjö
f72f692afa aarch64: Add PAC sign/validation of the link register
Whenever the link register is stored on the stack, sign it
before storing it and validate at a symmetrical point (with the
stack at the same level as when it was signed).

These macros only have an effect if built with PAC enabled (e.g.
through -mbranch-protection=standard), otherwise they don't
generate any extra instructions.

None of these cases were present when PAC support was added
in 248986a0db in 2022.

Without these changes, PAC still had an effect in the compiler
generated code and in the existing cases where we these macros were
used - but make it apply to the remaining cases of link register
on the stack.
2026-03-20 13:16:06 +02:00