Commit graph

53712 commits

Author SHA1 Message Date
Lynne
c40318d663 aacdec_usac_mps212: Fix CID 1681701
Fixes Coverity issue #1681701
2026-03-07 11:56:47 +00:00
Lynne
46cf8f1873 aacdec_usac_mps212: fix CID 1681703
Fixes Coverity issue #1681703
2026-03-07 11:56:47 +00:00
Lynne
558738a6d0 aacdec_usac_mps212: Fix CID 1681704
Fixes Coverity issue #1681704
2026-03-07 11:56:47 +00:00
Lynne
e7e001a804 aacdec_usac_mps212: fix CID 1681705
Fixes Coverity issue #1681705
2026-03-07 11:56:47 +00:00
Michael Niedermayer
c2364e9222 avcodec/aac/aacdec_usac_mps212: Fix invalid array index
Without the specification, limiting the index is the best that can be done.

Fixes: out of array access
Fixes: 487591441/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_LATM_fuzzer-6205915698364416

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-07 11:11:52 +00:00
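The "limiting the index" approach named in c2364e9222 can be sketched as a plain clamp. This is a minimal illustration only; `clamp_index` and `table_size` are hypothetical names, not taken from the commit:

```c
/* Hypothetical helper: clamp idx into [0, table_size - 1] before it is
 * used as an array index. Assumes table_size > 0. */
static int clamp_index(int idx, int table_size)
{
    if (idx < 0)
        return 0;
    if (idx >= table_size)
        return table_size - 1;
    return idx;
}
```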
Michael Niedermayer
c4ee599760 avcodec/aac/aacdec_usac_mps212: Fix invalid shift
Fixes: left shift of negative value -2
Fixes: 487591441/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_LATM_fuzzer-6205915698364416

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-07 11:11:52 +00:00
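Left-shifting a negative signed value, as reported for c4ee599760, is undefined behavior in C. One common way to avoid it, assuming the intent is scaling by a power of two, is to multiply instead. `shift_left_safe` is a hypothetical helper and not necessarily what the commit does:

```c
#include <stdint.h>

/* Hypothetical helper: scale a possibly negative value by 2^bits.
 * `value << bits` is undefined when value is negative, so multiply instead.
 * Assumes 0 <= bits < 31 and that the result fits in int32_t. */
static int32_t shift_left_safe(int32_t value, unsigned bits)
{
    return value * (int32_t)(1U << bits);
}
```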
James Almer
a1496ced65 avcodec/av1dec: sync frame header and tile group behavior with CBS
A new Sequence Header or a Temporal Delimiter OBU invalidates any previous frame
if not yet complete (as is the case with missing Tile Groups).
Similarly, a new Frame Header invalidates any ongoing Tile Group parsing.

Fixes: out of array access
Fixes: av1dec_tile_desync.mp4
Fixes: av1dec_tile_desync_bypass.mp4

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-06 23:18:35 -03:00
James Almer
282cf4425d avcodec/cbs_av1: don't try to write a Redundant Frame Header as a normal one
Section 6.8.1 of the AV1 specification states:

"If obu_type is equal to OBU_REDUNDANT_FRAME_HEADER, it is a requirement of
bitstream conformance that SeenFrameHeader is equal to 1."

Leave the existing behavior for reading scenarios as such a file may still
be readable.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-06 23:18:35 -03:00
Michael Niedermayer
d5e2e678ab avcodec/magicyuv: fix small median images
Fixes: out of array access
Fixes: 487838419/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MAGICYUV_DEC_fuzzer-4683933221715968

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-06 23:25:29 +01:00
Michael Niedermayer
6084f07189 avcodec/utils: fix duration computation based on frame_bytes
Fixes: signed integer overflow: 256 * 8396351 cannot be represented in type 'int'
Fixes: 482692578/clusterfuzz-testcase-minimized-ffmpeg_dem_SWF_fuzzer-5865521093607424

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-06 23:08:03 +01:00
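The overflow reported for 6084f07189 (256 * 8396351 does not fit in a 32-bit int) can be avoided by widening one operand before multiplying. This is a minimal sketch; `mul_wide` is a hypothetical helper and not necessarily the change made in the commit:

```c
#include <stdint.h>

/* Hypothetical helper: multiply two int operands in 64-bit arithmetic so
 * that products such as 256 * 8396351 do not overflow a 32-bit int. */
static int64_t mul_wide(int a, int b)
{
    return (int64_t)a * b;
}
```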
Andreas Rheinhardt
0ddece40c5 avcodec/x86/vvc/alf: Simplify vb_pos comparisons
The value of vb_pos at vb_bottom, vb_above is known
at compile-time, so one can avoid the modifications
to vb_pos and just compare against immediates.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
1960320112 avcodec/x86/vvc/alf: Avoid pointless wrappers for alf_filter
They are completely unnecessary for the 8bit case (which only
handles 8bit) and overly complicated for the 10 and 12bit cases:
All one needs to do is set up the (1<<bpp)-1 vector register
and jmp from (say) the 12bpp function stub inside the 10bpp
function. The way it is done here even allows sharing the
prologue between the two functions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
467f8d8415 avcodec/x86/vvc/alf: Improve offsetting pointers
It can be combined with an earlier lea for the loop
processing 16 pixels at a time; it is unnecessary
for the tail, because the new values will be overwritten
immediately afterwards anyway.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
cb5f6c055b avcodec/x86/vvc/alf: Don't modify rsp unnecessarily
The vvc_alf_filter functions don't use x86inc's stack management
feature at all; they merely push and pop some regs themselves.
So don't tell x86inc to provide stack (which in this case
entails aligning the stack).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
38062ebd18 avcodec/x86/vvc/alf: Remove pointless counter, stride
Each luma alf block has 2*12 auxiliary coefficients associated
with it that the alf_filter functions consume; the C version
simply increments the pointers.

The x64 dsp function meanwhile does things differently:
The vvc_alf_filter functions have three levels of loops.
The middle layer uses two counters, one of which is
just the horizontal offset xd in the current line. It is only
used for addressing these auxiliary coefficients and
yet one needs to perform work to translate from it to
the coefficient offset, namely a *3 via lea and a *2 scale.
Furthermore, the base pointers of the coefficients are incremented
in the outer loop; the stride used for this is calculated
in the C wrapper functions. Furthermore, due to GPR pressure xd
is reused as loop counter for the innermost loop; the
xd from the middle loop is pushed to the stack.

Apart from the translation from horizontal offset to coefficient
offset all of the above has been done for chroma, too, although
the coefficient pointers don't get modified for them at all.

This commit changes this to just increment the pointers
after reading the relevant coefficients.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
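The difference described in 38062ebd18 (deriving a coefficient offset from a counter versus bumping the pointer after each block) can be illustrated in scalar C. This is only a sketch: the function names are hypothetical, and summing stands in for whatever the real filter does with the 2*12 coefficients.

```c
#include <stdint.h>

#define COEFFS_PER_BLOCK (2 * 12) /* auxiliary coefficients per luma block */

/* Variant 1: derive each block's coefficient offset from a block counter. */
static int sum_by_offset(const int16_t *coeffs, int nb_blocks)
{
    int sum = 0;
    for (int blk = 0; blk < nb_blocks; blk++) {
        const int16_t *c = coeffs + blk * COEFFS_PER_BLOCK;
        for (int i = 0; i < COEFFS_PER_BLOCK; i++)
            sum += c[i];
    }
    return sum;
}

/* Variant 2: advance the pointer after consuming a block's coefficients;
 * no separate offset computation is needed. */
static int sum_by_increment(const int16_t *coeffs, int nb_blocks)
{
    int sum = 0;
    for (int blk = 0; blk < nb_blocks; blk++) {
        for (int i = 0; i < COEFFS_PER_BLOCK; i++)
            sum += coeffs[i];
        coeffs += COEFFS_PER_BLOCK;
    }
    return sum;
}
```

Both variants read the same elements in the same order; the second simply drops the counter-to-offset translation.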
Andreas Rheinhardt
d2e7fe5b19 avcodec/x86/vvc/alf: Improve deriving ac
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
5da3cab645 avcodec/x86/vvc/alf: Avoid broadcast
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
c9da0193ff avcodec/x86/vvc/alf: Don't use 64bit where unnecessary
Reduces codesize (avoids REX prefixes).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
a489a623fb avcodec/x86/vvc/alf: Use memory sources directly
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
df7885d6c3 avcodec/x86/vvc/alf: Improve writing classify parameters
The permutation that was applied before the write macro
is actually only beneficial when one has 16 entries to write,
so move it into the macro to write 16 entries and optimize
the other macro.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
1bc91eb552 avcodec/x86/vvc/alf: Avoid checking twice
Also avoids a vpermq in case width is eight.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:42 +01:00
Andreas Rheinhardt
e4a9d54e48 avcodec/x86/vvc/alf: Avoid nonvolatile registers
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
a2d9cd6dcb avcodec/x86/vvc/alf: Don't calculate twice
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
01a897020e avcodec/x86/vvc/alf: Use xmm registers where sufficient
One always has eight samples when processing the luma remainder,
so xmm registers are sufficient for everything. In fact, this
actually simplifies loading the luma parameters.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
9cb5280c0e avcodec/x86/vvc/alf: Improve storing 8bpp
When width is known to be 8 (i.e. for luma that is not width 16),
the upper lane is unused, so use an xmm-sized packuswb and avoid
the vpermq altogether. For chroma not known to be 16 (i.e. 4,8 or
12) defer extracting from the high lane until it is known to be needed.
Also do so via vextracti128 instead of vpermq (also do this for
bpp>8).
Also use vextracti128 and an xmm-sized packuswb in case of width 16
instead of an ymm-sized packuswb followed by vextracti128.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
56a4c15c23 avcodec/x86/vvc/alf: Avoid checking twice
Also avoid doing unnecessary work in the width==8 case.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
43cc8f05df avcodec/x86/vvc/alf: Don't clip for 8bpp
packuswb does it already.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
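Why the explicit clip is redundant for 8bpp (43cc8f05df): packuswb converts signed 16-bit words to unsigned bytes with saturation, which already bounds the result to [0, 255]. A scalar model of one lane, with hypothetical function names:

```c
#include <stdint.h>

/* Scalar model of what packuswb does to one signed 16-bit lane:
 * convert to an unsigned byte with saturation. */
static uint8_t pack_saturate_u8(int16_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Explicit clip to the 8bpp range [0, (1 << 8) - 1], followed by the pack.
 * The clip never changes the result of the pack. */
static uint8_t clip_then_pack(int16_t v)
{
    const int16_t pixel_max = (1 << 8) - 1;
    if (v < 0)
        v = 0;
    if (v > pixel_max)
        v = pixel_max;
    return pack_saturate_u8(v);
}
```

For bit depths above 8 the output stays in 16-bit lanes, so the clip against (1<<bpp)-1 is still needed there.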
Andreas Rheinhardt
a8b3b9c26f avcodec/x86/vvc/alf: Remove unused array
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
94f9ad8061 avcodec/x86/vvc/alf: Use immediate for shift when possible
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
2159e40ab3 avcodec/x86/vvc/of: Avoid jump
At the end of the height==8 codepath, a jump to RET at the end
of the height==16 codepath is performed. Yet the epilogue
is so cheap on Unix64 that this jump is not worthwhile.
For Win64 meanwhile, one can still avoid jumps, because
for width 16 >8bpp and width 8 8bpp content a jump is performed
to the end of the height==8 position, immediately followed
by a jump to RET. These two jumps can be combined into one.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
2a93d09968 avcodec/x86/vvc/of: Ignore upper lane for width 8
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
9fe9fd95b6 avcodec/x86/vvc/of: Only clip for >8bpp
packuswb does it already for 8bpp.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
83694749ad avcodec/x86/vvc/of,dsp_init: Avoid unnecessary wrappers
Write them in assembly instead; this exchanges a call+ret
with a jmp and also avoids the stack for (1<<bpp)-1.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
d6ed5d6e3d avcodec/x86/vvc/of: Deduplicate writing, save jump
Both the 8bpp width 16 and >8bpp width 8 cases write
16 contiguous bytes; deduplicate writing them. In fact,
by putting this block of code at the end of the SAVE macro,
one can even save a jmp for the width 16 8bpp case
(without adversely affecting the other cases).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
e7e19fcb1b avcodec/x86/vvc/of: Avoid unnecessary jumps
For 8bpp width 8 content, an unnecessary jump was performed
for every write: First to the end of the SAVE_8BPC macro,
then to the end of the SAVE macro. This commit changes this.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
dee361a5bf avcodec/x86/vvc/of: Avoid initialization, addition for last block
When processing the last block, we no longer need to preserve
some registers for the next block, allowing simplifications.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
c6205355b4 avcodec/x86/vvc/of: Avoid initialization, addition for first block
Output directly to the desired destination registers instead
of zeroing them, followed by adding the desired values.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Andreas Rheinhardt
f177672df2 avcodec/x86/vvc/of: Avoid unnecessary additions
BDOF_PROF_GRAD just adds some values to m12,m13,
so one can avoid two pxor, paddw by deferring
saving these registers prematurely.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-06 20:02:41 +01:00
Michael Niedermayer
e5c1ca60d8 avcodec/cbs_h266_syntax_template: bound slice width/height by remaining tiles
Fixes: out of array access
Fixes: crash_vvc_heap_oob_read.bin

Found-by: akshay jain <akshaythe@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-06 04:06:01 +01:00
Michael Niedermayer
d707a4af80 avcodec/pnmdec: Check input size against width*height assuming at least 1bit per pixel
Fixes: Timeout
Fixes: 481427018/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PHM_DEC_fuzzer-6315469467615232
Fixes: 485843949/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PHM_DEC_fuzzer-4753439270961152

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-06 02:33:59 +01:00
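The check described in d707a4af80 can be sketched as follows: an input holding fewer than width*height bits cannot describe every pixel even at 1 bit per pixel, so it can be rejected before any decoding work starts. `input_size_plausible` and its parameters are hypothetical names, not the decoder's actual code:

```c
#include <stdint.h>

/* Hypothetical check: reject input too small to hold width * height pixels
 * at a minimum of 1 bit per pixel. Returns 1 if plausible, 0 otherwise.
 * Assumes input_bytes, width and height are non-negative. */
static int input_size_plausible(int input_bytes, int width, int height)
{
    int64_t min_bits   = (int64_t)width * height;
    int64_t input_bits = (int64_t)input_bytes * 8;
    return input_bits >= min_bits;
}
```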
IndecisiveTurtle
cebe0b577e lavc: implement a Vulkan-based prores encoder
Adds a Vulkan implementation of the reference prores kostya encoder. Provides about 3-4x speedup over the CPU code.
2026-03-05 14:02:39 +00:00
IndecisiveTurtle
576de002e5 lavc: Split out common components used by vulkan prores encoder
2026-03-05 14:02:39 +00:00
Martin Storsjö
74cfcd1c69 aarch64/vvc: Fix DCE undefined references with MSVC
This fixes compiling with MSVC for aarch64 after
510999f6b0.

While MSVC does do dead code elimination for function references
within e.g. "if (0)", it doesn't do that for functions referenced
within a static function, even if that static function itself ends
up not used.

A reproduction example:

    void missing(void);
    void (*func_ptr)(void);

    static void wrapper(void) {
        missing();
    }

    void init(int cpu_flags) {
        if (0) {
            func_ptr = wrapper;
        }
    }

If "wrapper" is entirely unreferenced, then MSVC doesn't produce
any reference to the symbol "missing". Also, if we do
"func_ptr = missing;" then the reference to missing also is
eliminated. But for the case of referencing the function in a
static function, even if the reference to the static function can
be eliminated, MSVC does keep the reference to the symbol.
2026-03-05 11:57:40 +02:00
Michael Niedermayer
cbbe68fb1a avcodec/snowenc: avoid NULL ptr arithmetic
Fixes: applying non-zero offset 16 to null pointer
Fixes: 471614378/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SNOW_fuzzer-5967030642868224

Note: FF_PTR_ADD() does not work as this code has NULL + 123 cases where the pointer is unused afterwards

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-05 01:23:40 +01:00
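Adding a non-zero offset to a null pointer, as reported for cbbe68fb1a, is undefined behavior even if the result is never dereferenced. One way to avoid it is to guard on the base pointer rather than on the offset. `ptr_add_or_null` is a hypothetical helper, not the commit's actual change:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: only offset the pointer when the base is non-NULL,
 * so that an expression like "NULL + 16" is never evaluated. */
static uint8_t *ptr_add_or_null(uint8_t *base, ptrdiff_t offset)
{
    return base ? base + offset : NULL;
}
```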
Zuoqiang He
1fc7464cf7 libavcodec/huffyuvdsp: Add NEON optimization for the add_int16 function
Benchmark Results (1024 iterations, Raspberry Pi 5 - Cortex-A76):
add_int16_128_c:                                       914.0 ( 1.00x)
add_int16_128_neon:                                    516.9 ( 1.77x)
add_int16_rnd_width_c:                                 914.0 ( 1.00x)
add_int16_rnd_width_neon:                              517.5 ( 1.77x)

Co-Authored-By: Martin Storsjö <martin@martin.st>
2026-03-04 22:31:19 +00:00
Georgii Zagoruiko
510999f6b0 aarch64/vvc: sme2 optimisation of alf_filter_luma() 8/10/12 bit
Apple M4:
vvc_alf_filter_luma_8x8_8_c:                           347.3 ( 1.00x)
vvc_alf_filter_luma_8x8_8_neon:                        138.7 ( 2.50x)
vvc_alf_filter_luma_8x8_8_sme2:                        134.5 ( 2.58x)
vvc_alf_filter_luma_8x8_10_c:                          299.8 ( 1.00x)
vvc_alf_filter_luma_8x8_10_neon:                       129.8 ( 2.31x)
vvc_alf_filter_luma_8x8_10_sme2:                       128.6 ( 2.33x)
vvc_alf_filter_luma_8x8_12_c:                          293.0 ( 1.00x)
vvc_alf_filter_luma_8x8_12_neon:                       126.8 ( 2.31x)
vvc_alf_filter_luma_8x8_12_sme2:                       126.3 ( 2.32x)
vvc_alf_filter_luma_16x16_8_c:                        1386.1 ( 1.00x)
vvc_alf_filter_luma_16x16_8_neon:                      560.3 ( 2.47x)
vvc_alf_filter_luma_16x16_8_sme2:                      540.1 ( 2.57x)
vvc_alf_filter_luma_16x16_10_c:                       1200.3 ( 1.00x)
vvc_alf_filter_luma_16x16_10_neon:                     515.6 ( 2.33x)
vvc_alf_filter_luma_16x16_10_sme2:                     531.3 ( 2.26x)
vvc_alf_filter_luma_16x16_12_c:                       1223.8 ( 1.00x)
vvc_alf_filter_luma_16x16_12_neon:                     510.7 ( 2.40x)
vvc_alf_filter_luma_16x16_12_sme2:                     524.9 ( 2.33x)
vvc_alf_filter_luma_32x32_8_c:                        5488.8 ( 1.00x)
vvc_alf_filter_luma_32x32_8_neon:                     2233.4 ( 2.46x)
vvc_alf_filter_luma_32x32_8_sme2:                     1093.6 ( 5.02x)
vvc_alf_filter_luma_32x32_10_c:                       4738.0 ( 1.00x)
vvc_alf_filter_luma_32x32_10_neon:                    2057.5 ( 2.30x)
vvc_alf_filter_luma_32x32_10_sme2:                    1053.6 ( 4.50x)
vvc_alf_filter_luma_32x32_12_c:                       4808.3 ( 1.00x)
vvc_alf_filter_luma_32x32_12_neon:                    1981.2 ( 2.43x)
vvc_alf_filter_luma_32x32_12_sme2:                    1047.7 ( 4.59x)
vvc_alf_filter_luma_64x64_8_c:                       22116.8 ( 1.00x)
vvc_alf_filter_luma_64x64_8_neon:                     8951.0 ( 2.47x)
vvc_alf_filter_luma_64x64_8_sme2:                     4225.2 ( 5.23x)
vvc_alf_filter_luma_64x64_10_c:                      19072.8 ( 1.00x)
vvc_alf_filter_luma_64x64_10_neon:                    8448.1 ( 2.26x)
vvc_alf_filter_luma_64x64_10_sme2:                    4225.8 ( 4.51x)
vvc_alf_filter_luma_64x64_12_c:                      19312.6 ( 1.00x)
vvc_alf_filter_luma_64x64_12_neon:                    8270.9 ( 2.34x)
vvc_alf_filter_luma_64x64_12_sme2:                    4245.4 ( 4.55x)
vvc_alf_filter_luma_128x128_8_c:                     88530.5 ( 1.00x)
vvc_alf_filter_luma_128x128_8_neon:                  35686.3 ( 2.48x)
vvc_alf_filter_luma_128x128_8_sme2:                  16961.2 ( 5.22x)
vvc_alf_filter_luma_128x128_10_c:                    76904.9 ( 1.00x)
vvc_alf_filter_luma_128x128_10_neon:                 32439.5 ( 2.37x)
vvc_alf_filter_luma_128x128_10_sme2:                 16845.6 ( 4.57x)
vvc_alf_filter_luma_128x128_12_c:                    77363.3 ( 1.00x)
vvc_alf_filter_luma_128x128_12_neon:                 32907.5 ( 2.35x)
vvc_alf_filter_luma_128x128_12_sme2:                 17018.1 ( 4.55x)
2026-03-04 23:52:58 +02:00
Tong Wu
5b8a4a0e14 avcodec/d3d12va_encode_h264: simplify deblock default option
The deblocking filter is enabled by default. This behavior is the same
as priv->deblock == 1.

Signed-off-by: Tong Wu <wutong1208@outlook.com>
2026-03-04 14:25:00 +00:00
Georgii Zagoruiko
90431417cb aarch64/vvc: Optimisations of put_luma_hv() functions for 10/12-bit
Apple M2:
put_luma_hv_10_4x4_c:                                   36.3 ( 1.00x)
put_luma_hv_10_8x8_c:                                   82.9 ( 1.00x)
put_luma_hv_10_8x8_neon:                                34.9 ( 2.37x)
put_luma_hv_10_16x16_c:                                239.2 ( 1.00x)
put_luma_hv_10_16x16_neon:                             119.0 ( 2.01x)
put_luma_hv_10_32x32_c:                                900.3 ( 1.00x)
put_luma_hv_10_32x32_neon:                             429.3 ( 2.10x)
put_luma_hv_10_64x64_c:                               2984.7 ( 1.00x)
put_luma_hv_10_64x64_neon:                            1736.2 ( 1.72x)
put_luma_hv_10_128x128_c:                            11194.2 ( 1.00x)
put_luma_hv_10_128x128_neon:                          6357.3 ( 1.76x)
put_luma_hv_12_4x4_c:                                   35.9 ( 1.00x)
put_luma_hv_12_8x8_c:                                   82.6 ( 1.00x)
put_luma_hv_12_8x8_neon:                                34.3 ( 2.41x)
put_luma_hv_12_16x16_c:                                240.2 ( 1.00x)
put_luma_hv_12_16x16_neon:                             115.3 ( 2.08x)
put_luma_hv_12_32x32_c:                                787.7 ( 1.00x)
put_luma_hv_12_32x32_neon:                             414.2 ( 1.90x)
put_luma_hv_12_64x64_c:                               3058.4 ( 1.00x)
put_luma_hv_12_64x64_neon:                            1592.3 ( 1.92x)
put_luma_hv_12_128x128_c:                            11350.8 ( 1.00x)
put_luma_hv_12_128x128_neon:                          6378.3 ( 1.78x)

RPi4:
put_luma_hv_10_4x4_c:                                  637.8 ( 1.00x)
put_luma_hv_10_8x8_c:                                 1044.9 ( 1.00x)
put_luma_hv_10_8x8_neon:                               483.7 ( 2.16x)
put_luma_hv_10_16x16_c:                               3098.0 ( 1.00x)
put_luma_hv_10_16x16_neon:                            1603.1 ( 1.93x)
put_luma_hv_10_32x32_c:                              10054.8 ( 1.00x)
put_luma_hv_10_32x32_neon:                            5843.6 ( 1.72x)
put_luma_hv_10_64x64_c:                              40506.2 ( 1.00x)
put_luma_hv_10_64x64_neon:                           24384.0 ( 1.66x)
put_luma_hv_10_128x128_c:                           130604.2 ( 1.00x)
put_luma_hv_10_128x128_neon:                         99746.6 ( 1.31x)
put_luma_hv_12_4x4_c:                                  638.2 ( 1.00x)
put_luma_hv_12_8x8_c:                                 1074.6 ( 1.00x)
put_luma_hv_12_8x8_neon:                               482.6 ( 2.23x)
put_luma_hv_12_16x16_c:                               3094.0 ( 1.00x)
put_luma_hv_12_16x16_neon:                            1602.5 ( 1.93x)
put_luma_hv_12_32x32_c:                              10034.4 ( 1.00x)
put_luma_hv_12_32x32_neon:                            5843.3 ( 1.72x)
put_luma_hv_12_64x64_c:                              40447.5 ( 1.00x)
put_luma_hv_12_64x64_neon:                           24377.2 ( 1.66x)
put_luma_hv_12_128x128_c:                           130610.4 ( 1.00x)
put_luma_hv_12_128x128_neon:                         99765.8 ( 1.31x)
2026-03-04 12:53:16 +00:00
zengshuang
9d73d10c50 avformat,avcodec: use PRI format macros for uint32_t in log messages
Use PRIu32/PRIX32 format specifiers instead of %d/%u/%X for uint32_t
variables in av_log calls. On some platforms (e.g. NuttX), uint32_t is
typedef'd as unsigned long rather than unsigned int, which triggers
-Wformat warnings despite both types being 4 bytes. Using PRI macros
is the portable way to match the actual underlying type of uint32_t.

Signed-off-by: zengshuang <zengshuang@xiaomi.com>
2026-03-04 10:40:12 +00:00
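The PRI macros from `<inttypes.h>` mentioned in 9d73d10c50 expand to the correct conversion specifier for whatever type underlies uint32_t. A minimal sketch, with `snprintf` standing in for `av_log` and `format_u32` as a hypothetical helper:

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a uint32_t portably: PRIu32 / PRIX32 expand to the right
 * specifier whether uint32_t is unsigned int or unsigned long.
 * Returns the number of characters written, as snprintf does. */
static int format_u32(char *buf, size_t size, uint32_t value)
{
    return snprintf(buf, size, "value=%" PRIu32 " hex=0x%" PRIX32,
                    value, value);
}
```

Writing `"%u"` or `"%X"` here would trigger -Wformat on platforms where uint32_t is unsigned long, which is the warning the commit addresses.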
James Almer
264283bd0a avcodec/av1_parser: also decompose Redundant Frame Headers
Ensures samples where a missing Frame Header is handled by a subsequent
Redundant one are parsed correctly.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-03 13:52:58 -03:00