ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-06-04 22:50:24 +00:00

Author	SHA1	Message	Date
DROOdotFOO	34501921fd	tests/checkasm/sw_yuv2rgb: cover nv12 and nv21 The previous chroma stride formula (width >> log2_chroma_w) is correct for planar yuv but wrong for semi-planar nv12/nv21, where the UV plane is interleaved at width bytes per row (width/2 UV pairs of 2 bytes each). Use av_image_get_linesize() so the test feeds a valid stride to libswscale regardless of input format; for the existing planar suites the value is unchanged. With the stride fixed, add nv12 and nv21 to check_yuv2rgb() so the upcoming NEON 16bpp paths get bench coverage. ff_get_unscaled_swscale does not wire a C yuv2rgb fast path for these inputs, so the suites report bench-only (no correctness reference); they still run clobber detection and cycle counts. Signed-off-by: DROOdotFOO <drew@axol.io>	2026-05-22 10:03:07 +00:00
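The stride difference described in the commit above can be sketched with plain arithmetic. This is only an illustration under stated assumptions: the helper names below are hypothetical and are not FFmpeg API (the commit itself uses av_image_get_linesize()), and 8-bit samples are assumed.

```python
def ceil_rshift(value: int, shift: int) -> int:
    """Right shift that rounds up, so odd widths keep their last sample."""
    return -((-value) >> shift)


def old_chroma_stride(width: int, log2_chroma_w: int) -> int:
    """The formula quoted in the commit message: width >> log2_chroma_w."""
    return width >> log2_chroma_w


def planar_chroma_stride(width: int, log2_chroma_w: int, bytes_per_sample: int = 1) -> int:
    """Planar yuv: the U and V planes each hold one sample per chroma column."""
    return ceil_rshift(width, log2_chroma_w) * bytes_per_sample


def semiplanar_chroma_stride(width: int, log2_chroma_w: int, bytes_per_sample: int = 1) -> int:
    """nv12/nv21: one interleaved plane holding a U,V pair per chroma column."""
    return ceil_rshift(width, log2_chroma_w) * 2 * bytes_per_sample


if __name__ == "__main__":
    width, log2_chroma_w = 64, 1  # 4:2:0 horizontal subsampling
    print("old formula:", old_chroma_stride(width, log2_chroma_w))
    print("planar     :", planar_chroma_stride(width, log2_chroma_w))
    print("semi-planar:", semiplanar_chroma_stride(width, log2_chroma_w))
```

For an even width the old formula and the planar stride agree, which matches the commit's note that the planar suites are unchanged; only the interleaved layout needs the doubled row size.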
Thilo Borgmann via ffmpeg-devel	1572784128	fate: add test for animated WebP Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>	2026-05-19 11:36:10 +02:00
James Almer	4444a75590	avformat/movenc: support writing more than one entry per tref tag Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-17 11:16:56 -03:00
James Almer	90dd8673ce	avformat/mov: handle all references in tref boxes tref types can have more than one value, as is the case of tmcd in fcp_export8-236.mov, where the single video track references all timecode tracks. Handle them in a generic and extensible way. Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-17 11:16:56 -03:00
Niklas Haas	0c1a1ee12e	swscale/ops_optimizer: don't push scale past truncating conversions In an op list like: [ u8 +XXX] SWS_OP_READ : 1 elem(s) planar >> 3 [ u8 .XXX] SWS_OP_FILTER_V : 256 -> 320 bilinear (2 taps) [f32 .XXX] SWS_OP_SCALE : * 65535 [f32 +XXX] SWS_OP_CONVERT : f32 -> u16 [u16 zXXX] SWS_OP_SWAP_BYTES [u16 zzzX] SWS_OP_SWIZZLE : 0003 [u16 zzz+] SWS_OP_CLEAR : {_ _ _ 65535} [u16 XXXX] SWS_OP_WRITE : 4 elem(s) packed >> 0 The current version of the code would happily push the SWS_OP_SCALE past the truncating conversion, leading to degenerate loss of information. (In this case, the result was quite extreme) Affects quality across a wide range of formats, e.g.: rgb24 16x16 -> rgb48be 16x32: [ u8 +++X] SWS_OP_READ : 3 elem(s) packed >> 0 min: {0 0 0 _}, max: {255 255 255 _} [ u8 ...X] SWS_OP_FILTER_V : 16 -> 32 bilinear (2 taps) min: {0 0 0 _}, max: {255 255 255 _} + [f32 ...X] SWS_OP_SCALE : * 257 + min: {0 0 0 _}, max: {65535 65535 65535 _} [f32 +++X] SWS_OP_CONVERT : f32 -> u16 - min: {0 0 0 _}, max: {255 255 255 _} - [u16 +++X] SWS_OP_SCALE : * 257 min: {0 0 0 _}, max: {65535 65535 65535 _} [u16 zzzX] SWS_OP_SWAP_BYTES min: {0 0 0 _}, max: {65535 65535 65535 _} [u16 XXXX] SWS_OP_WRITE : 3 elem(s) packed >> 0 (X = unused, z = byteswapped, + = exact, 0 = zero) Signed-off-by: Niklas Haas <git@haasn.dev>	2026-05-17 10:41:34 +00:00
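Why the order of SWS_OP_SCALE and the truncating SWS_OP_CONVERT matters can be seen in a small numeric sketch. It assumes the conversion truncates toward zero and uses the rgb24 to rgb48 factor of 257 from the example above; the function names are hypothetical and this is not the swscale implementation.

```python
SCALE = 257  # expands the 8-bit range 0..255 to the 16-bit range 0..65535


def scale_then_convert(value: float) -> int:
    """Scale in floating point first, then truncate to an integer."""
    return int(value * SCALE)


def convert_then_scale(value: float) -> int:
    """Truncate first, then scale: the fractional part is lost before it is amplified."""
    return int(value) * SCALE


if __name__ == "__main__":
    sample = 127.5  # e.g. a 2-tap bilinear blend of the 8-bit values 127 and 128
    a = scale_then_convert(sample)
    b = convert_then_scale(sample)
    print("scale then convert:", a)
    print("convert then scale:", b)
    print("difference        :", a - b)
```

Any fractional part produced by the filter is discarded before being multiplied when the scale is pushed past the conversion, so the error grows with the scale factor.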
Niklas Haas	812b5654ae	swscale/tests/sws_ops: use SWS_SCALE_BILINEAR for printing ops lists This actually changes the behavior vs SWS_SCALE_POINT, because point scaling is bit-exact and thus implies a different set of optimizations. Ideally, we would still try and somehow merge this with tests/swscale.c to allow testing a different set of scalers; but I still don't have a good idea for how to accomplish that here. As it stands, results in additional extra dithering steps in almost all filters involving scaling, e.g.: rgb24 16x16 -> rgb24 16x32: [ u8 +++X] SWS_OP_READ : 3 elem(s) packed >> 0 min: {0 0 0 _}, max: {255 255 255 _} - [ u8 +++X] SWS_OP_FILTER_V : 16 -> 32 point (1 taps) + [ u8 ...X] SWS_OP_FILTER_V : 16 -> 32 bilinear (2 taps) min: {0 0 0 _}, max: {255 255 255 _} + [f32 ...X] SWS_OP_DITHER : 16x16 matrix + {0 3 2 -1} + min: {1/512 1/512 1/512 _}, max: {255.998047 255.998047 255.998047 _} + [f32 ...X] SWS_OP_MIN : x <= {255 255 255 _} + min: {1/512 1/512 1/512 _}, max: {255 255 255 _} [f32 +++X] SWS_OP_CONVERT : f32 -> u8 min: {0 0 0 _}, max: {255 255 255 _} [ u8 XXXX] SWS_OP_WRITE : 3 elem(s) packed >> 0 (X = unused, z = byteswapped, + = exact, 0 = zero) Signed-off-by: Niklas Haas <git@haasn.dev>	2026-05-17 10:41:34 +00:00
Niklas Haas	2dfe055ddd	swscale/tests/sws_ops: print split sub-passes for lists with filters This allows us to inspect exactly the logic that is going on inside the CPU backends (which don't support bare filter passes). rgb24 16x16 -> rgb24 16x32: [ u8 +++X] SWS_OP_READ : 3 elem(s) packed >> 0 min: {0 0 0 _}, max: {255 255 255 _} [ u8 +++X] SWS_OP_FILTER_V : 16 -> 32 point (1 taps) min: {0 0 0 _}, max: {255 255 255 _} [f32 +++X] SWS_OP_CONVERT : f32 -> u8 min: {0 0 0 _}, max: {255 255 255 _} [ u8 XXXX] SWS_OP_WRITE : 3 elem(s) packed >> 0 (X = unused, z = byteswapped, + = exact, 0 = zero) + Retrying with split passes: + [ u8 +++X] SWS_OP_READ : 3 elem(s) packed >> 0 + min: {0 0 0 _}, max: {255 255 255 _} + [ u8 XXXX] SWS_OP_WRITE : 3 elem(s) planar >> 0 + (X = unused, z = byteswapped, + = exact, 0 = zero) + Sub-pass #1: + [ u8 +++X] SWS_OP_READ : 3 elem(s) planar >> 0 + 1 tap point filter (V) + min: {0 0 0 _}, max: {255 255 255 _} + [f32 +++X] SWS_OP_CONVERT : f32 -> u8 + min: {0 0 0 _}, max: {255 255 255 _} + [ u8 XXXX] SWS_OP_WRITE : 3 elem(s) packed >> 0 + (X = unused, z = byteswapped, + = exact, 0 = zero) rgb24 16x16 -> rgb24 32x16: [ u8 +++X] SWS_OP_READ : 3 elem(s) packed >> 0 min: {0 0 0 _}, max: {255 255 255 _} [ u8 +++X] SWS_OP_FILTER_H : 16 -> 32 point (1 taps) min: {0 0 0 _}, max: {255 255 255 _} [f32 +++X] SWS_OP_CONVERT : f32 -> u8 min: {0 0 0 _}, max: {255 255 255 _} [ u8 XXXX] SWS_OP_WRITE : 3 elem(s) packed >> 0 (X = unused, z = byteswapped, + = exact, 0 = zero) + Retrying with split passes: + [ u8 +++X] SWS_OP_READ : 3 elem(s) packed >> 0 + min: {0 0 0 _}, max: {255 255 255 _} + [ u8 XXXX] SWS_OP_WRITE : 3 elem(s) planar >> 0 + (X = unused, z = byteswapped, + = exact, 0 = zero) + Sub-pass #1: + [ u8 +++X] SWS_OP_READ : 3 elem(s) planar >> 0 + 1 tap point filter (H) + min: {0 0 0 _}, max: {255 255 255 _} + [f32 +++X] SWS_OP_CONVERT : f32 -> u8 + min: {0 0 0 _}, max: {255 255 255 _} + [ u8 XXXX] SWS_OP_WRITE : 3 elem(s) packed >> 0 + (X = unused, z = byteswapped, + = exact, 0 = zero) Signed-off-by: Niklas Haas <git@haasn.dev>	2026-05-17 10:41:34 +00:00
James Almer	412aa48868	fftools/ffmpeg_mux_init: propagate the muxer request for fixed frame size Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-16 13:55:23 -03:00
James Almer	8e162daf9a	fftools/ffmpeg_enc: honor the user request for fixed size frames And set it also for non-variable frame size encoders. FATE changes are the result of passing a frame_size to flac and wavenc encoders, instead of letting them choose one. Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-16 13:55:22 -03:00
James Almer	8567345514	tests/fate/lavf-audio: set frame_size on fate-lavf-ogg This both works around an issue that the following commit reveals (encoding with a frame_size of 4096 fails on aarch64 for unknown reasons) and tests setting frame_size now that it is allowed (ensuring the CLI doesn't overwrite it). Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-16 13:55:22 -03:00
James Almer	53d46a51fa	avcodec/encode: propagate skip samples side data if present Only for non-delay codecs. Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-16 13:55:22 -03:00
James Almer	7c5df8d34d	avformat/matroskaenc: use frame_size to write audio DefaultDuration Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-16 13:55:22 -03:00
Michael Niedermayer	37c176a2a2	tests/fate/voice: Add fate-g726le-encode Co-Authored-by: AI Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-05-16 15:09:52 +00:00
Andreas Rheinhardt	7971953d29	avfilter/x86/vf_pp7: Port ff_pp7_dctB_mmx to SSE2 Unfortunately a bit slower than the MMX version due to the impossibility to use memory operands in paddw. The situation would reverse if ff_dctB_mmx() would have to issue emms. dctB_c: 3.7 ( 1.00x) dctB_mmx: 3.3 ( 1.13x) dctB_sse2: 3.6 ( 1.03x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-05-15 20:29:29 +02:00
Andreas Rheinhardt	94a49068db	tests/checkasm: Add vf_pp7 checkasm test Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-05-15 20:29:29 +02:00
Niklas Haas	9021448857	swscale/ops_dispatch: merge ff_sws_ops_compile_backend() and compile() Passing backend == NULL now loops over the backends as before. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-05-15 18:53:05 +02:00
Niklas Haas	1d841635a4	swscale/ops: also include scaling ops in ff_sws_enum_op_lists() Using the configured scaler from the SwsContext implicitly. This does affect the output of libswscale/tests/sws_ops.c, which now prints about 4x as much data (taking roughly 4x as long, but still within a second on my machine). We can make this process a lot faster by forcing SWS_SCALE_POINT as the scaler, which skips calculating any actual filter weights in favor of generating a trivial 1-tap filter. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-05-15 18:53:05 +02:00
James Almer	b2dfc14276	avcodec/vvc_parser: properly split PUs when a Prefix SEI NUT is found Signed-off-by: James Almer <jamrial@gmail.com>	2026-05-09 11:44:39 -03:00
Vignesh Venkat	8518599cd1	avformat/matroskaenc: Write additional mappings for webm The elements written in mkv_write_blockadditionmapping (MaxBlockAdditionID, BlockAddIDType and BlockAddIDValue) are all allowed in WebM as well. Move them out of the "if (!IS_WEBM)" block. Matroska spec: https://www.matroska.org/technical/elements.html#MaxBlockAdditionID (See column with title "W" which shows WebM availability). WebM spec: https://www.webmproject.org/docs/container/#MaxBlockAdditionID Signed-off-by: Vignesh Venkat <vigneshv@google.com>	2026-05-08 13:33:31 -07:00
Romain Beauxis	0f2e693956	tests/fate/id3v2.mak: add new tests for comm, lyrics, txx and wma comments. Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>	2026-05-07 09:46:53 -05:00
Jun Zhao	ea3e09bfb1	fate/filter-video: use run $(FFMPEG) for scale-zero-dim test $(FFMPEG) expands to "ffmpeg.exe" on Windows/MSYS2, and the bare $(FFMPEG) call falls through to PATH lookup, picking up an externally installed ffmpeg instead of the freshly built binary in $target_path. That stale binary lacked the rejection added in `a45fe72c9d`, causing msys2-clang64/clangarm64/ucrt64 slots to silently produce 250x2 instead of failing at 500x0. Wrap with fate-run.sh's run() so $target_exec and $target_path are resolved correctly on all platforms, matching the convention used by e.g. fate-id3v2-invalid-tags, and avoiding the ffmpeg() helper's unrelated default flags. Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-05-01 23:45:30 +08:00
Nil Fons Miret	e294b390a0	avfilter/vf_unsharp: fix amount scaling in the high-bit-depth path The 16-bit kernel is dispatched for every non-8-bit pixel format (9/10/12/16-bit content, all stored in uint16_t). It's supposed to undo the Q16 scaling that set_filter_param() applies to `amount`: fp->amount = amount * 65536.0; but the shift written in the kernel is `>> (8+nbits)`, which for the nbits=16 instantiation of the macro comes out to `>> 24` instead of `>> 16`. Because of this, on any non-8-bit input, unsharp applies ~1/256 of the user's requested strength and is effectively a no-op. The 8-bit kernel (nbits=8) happens to be correct because 8+8 == 16. This commit also widens the intermediate product to int64 before the shift, to avoid a potential overflow. Take a 16-bit pixel at the edge of a sharp white/black region, with the user-facing `amount` set to its declared maximum of 5.0. srx = 65535 blur = 32768 diff = srx - blur = 32767 amount_q16 = 5.0 * 65536 = 327680 Then the kernel computes: product = diff * amount_q16 = 32767 * 327680 = 10,737,090,560 (~1.07e10) which overflows INT32_MAX. Widening to int64 keeps the multiplication in range; the subsequent `>> 16` brings it back to sample range and the final cast to int32 is then safe. The widening is a semantic no-op for 8/9/10/12-bit content where the product always fits in int32 (worst case at 12-bit: 4095 * 327680 ~ 1.34e9). Introduced by `ee792ebe08` (2019-11-08, "avfilter/vf_unsharp: add 10bit support"). The fate-filter-unsharp-yuv420p10 reference added in the same series was generated from the broken kernel and is regenerated here. fate-filter-unsharp (8-bit) is unaffected. Repro: python3 -c "import numpy as np; y=np.tile(np.where(np.arange(128)//8 & 1, 512, 256).astype('<u2'), (128,1)); c=np.full((64,64), 512, '<u2'); open('in.yuv','wb').write(y.tobytes()+c.tobytes()*2)" ffmpeg -f rawvideo -pix_fmt yuv420p10le -s 128x128 -i in.yuv \ -lavfi "split=2[a][b];[b]unsharp=la=1[bs];[a][bs]psnr" \ -f null - 2>&1 | grep PSNR Before: `PSNR y:66.50 ...` -- the filter is effectively a no-op, so the sharpened output matches the input almost exactly. After: `PSNR y:28.27 ...` -- the filter actually sharpens, so output and input differ as expected. Signed-off-by: Nil Fons Miret <nilf@netflix.com> Made-with: Cursor	2026-04-30 21:15:58 +00:00
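The arithmetic in the vf_unsharp commit above can be checked with a short sketch. It reuses the worked values from the commit message; the function names are hypothetical, and the two's-complement wrap is only a model of what a 32-bit product might hold (signed overflow is undefined behaviour in C, so the real result is not guaranteed).

```python
INT32_MAX = 2**31 - 1


def to_int32(value: int) -> int:
    """Model two's-complement wraparound of a 32-bit signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def q16_amount(amount: float) -> int:
    """Q16 scaling as described for set_filter_param(): amount * 65536."""
    return int(amount * 65536.0)


def sharpen_term(diff: int, amount_q16: int, shift: int) -> int:
    """Wide (Python int) product followed by the shift that undoes the scaling."""
    return (diff * amount_q16) >> shift


if __name__ == "__main__":
    diff = 65535 - 32768            # 32767, from the commit message
    amount_q16 = q16_amount(5.0)    # 327680
    product = diff * amount_q16
    print("product           :", product, "(exceeds INT32_MAX:", product > INT32_MAX, ")")
    print("correct, >> 16    :", sharpen_term(diff, amount_q16, 16))
    print("buggy,   >> 24    :", sharpen_term(diff, amount_q16, 24))
    print("wrapped 32-bit    :", to_int32(product))
```

The `>> 24` result is roughly 1/256 of the `>> 16` result, which is the weakened strength the commit describes for non-8-bit input.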
Stefan Breunig	9172ab1245	fate/filter-video: add frei0r_src test An installation of frei0r-plugins is required to run the tests, which is usually separate from the build headers. Some systems have it packaged (e.g. apt install frei0r-plugins). An upstream release extracted to FREI0R_PATH also works. Signed-off-by: Stefan Breunig <stefan-ffmpeg-devel@breunig.xyz>	2026-04-30 03:46:18 +00:00
marcos ashton	fa3d20072b	tests/fate/libavutil: add FATE test for timestamp Test av_ts_make_string with NOPTS, zero, positive, negative, and INT64 boundary values, av_ts2str macro, av_ts_make_time_string2 with various timebases, and av_ts_make_time_string pointer variant. Coverage for libavutil/timestamp.c: 0.00% -> 100.00%	2026-04-28 16:17:47 +00:00
marcos ashton	9b47495dee	tests/fate/libavutil: add FATE test for tdrdi Test av_tdrdi_alloc with 1 and 3 displays, and the inline av_tdrdi_get_display accessor. Verifies that the returned pointer matches entries_offset + idx * entry_size, tests write/read-back of display width exponent/mantissa and view ID fields, and OOM paths via av_max_alloc. Coverage for libavutil/tdrdi.c: 0.00% -> 100.00%	2026-04-28 16:17:47 +00:00
marcos ashton	215799e369	tests/fate/libavutil: add FATE test for hdr_dynamic_vivid_metadata Test av_dynamic_hdr_vivid_alloc and av_dynamic_hdr_vivid_create_side_data. Verifies zero defaults, write/read-back of system_start_code, num_windows, and color transform params (min/avg/var/max RGB), frame side data attachment, and OOM paths via av_max_alloc. Coverage for libavutil/hdr_dynamic_vivid_metadata.c: 0.00% -> 100.00%	2026-04-28 16:17:47 +00:00
marcos ashton	2d9c8a9382	tests/fate/libavutil: add FATE test for buffer Test av_buffer_alloc, av_buffer_allocz, av_buffer_create with custom free callback, AV_BUFFER_FLAG_READONLY, av_buffer_ref, av_buffer_is_writable, av_buffer_get_ref_count, av_buffer_make_writable, av_buffer_realloc (including from NULL), av_buffer_replace (including with NULL), av_buffer_pool init/get/uninit cycle, av_buffer_pool_init2 with custom alloc and pool_free callbacks, av_buffer_pool_buffer_get_opaque, and OOM paths via av_max_alloc. Coverage for libavutil/buffer.c: 0.00% -> 90.19% Remaining uncovered lines are mutex init failures and secondary allocation failure paths.	2026-04-28 16:17:47 +00:00
Jun Zhao	d247110148	fate/filter-video: add regression test for scale zero-dim rejection Add a regression test covering issue #22817: cascaded scale=...:-2 filters on extreme aspect ratios previously produced zero output dimensions silently. The test expects ffmpeg to fail fast. Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-04-28 06:14:38 +00:00
Andreas Rheinhardt	f5ed254528	swscale/x86/yuv2yuvX: Port ff_yuv2yuvX_mmxext to SSE2 The mmx function performs two registers in parallel; given the larger register size of SSE2, the same amount of data can be processed in one register with some speedups. (Given that this function is used for tail-processing, not processing more data is important.) Switching to SSE2 also fixes a bug introduced in `554c2bc708`: Since said commit, only half the dither values were used. This seems not to matter in practice, as the functions here use dither only in the following form: ((filtersize-1)*8+dither)>>4. The dither values used here come from ff_dither_8x8_128 which has the property that ff_dither_8x8_128[i][j] and ff_dither_8x8_128[i][j+4] always lead to the same result in the above formula. Old benchmarks: yuv2yuvX_8_2_0_512_approximate_c: 2309.9 ( 1.00x) yuv2yuvX_8_2_0_512_approximate_mmxext: 250.2 ( 9.23x) yuv2yuvX_8_2_0_512_approximate_sse3: 98.8 (23.39x) yuv2yuvX_8_2_0_512_approximate_avx2: 52.9 (43.63x) yuv2yuvX_8_2_16_512_approximate_c: 2263.0 ( 1.00x) yuv2yuvX_8_2_16_512_approximate_mmxext: 245.3 ( 9.22x) yuv2yuvX_8_2_16_512_approximate_sse3: 114.3 (19.80x) yuv2yuvX_8_2_16_512_approximate_avx2: 85.6 (26.45x) yuv2yuvX_8_2_32_512_approximate_c: 2155.8 ( 1.00x) yuv2yuvX_8_2_32_512_approximate_mmxext: 235.6 ( 9.15x) yuv2yuvX_8_2_32_512_approximate_sse3: 93.6 (23.04x) yuv2yuvX_8_2_32_512_approximate_avx2: 78.1 (27.60x) yuv2yuvX_8_2_48_512_approximate_c: 2084.8 ( 1.00x) yuv2yuvX_8_2_48_512_approximate_mmxext: 230.2 ( 9.05x) yuv2yuvX_8_2_48_512_approximate_sse3: 105.0 (19.85x) yuv2yuvX_8_2_48_512_approximate_avx2: 71.9 (29.00x) yuv2yuvX_8_4_0_512_approximate_c: 3496.3 ( 1.00x) yuv2yuvX_8_4_0_512_approximate_mmxext: 455.0 ( 7.68x) yuv2yuvX_8_4_0_512_approximate_sse3: 157.5 (22.20x) yuv2yuvX_8_4_0_512_approximate_avx2: 88.4 (39.53x) yuv2yuvX_8_4_16_512_approximate_c: 3380.9 ( 1.00x) yuv2yuvX_8_4_16_512_approximate_mmxext: 440.0 ( 7.68x) yuv2yuvX_8_4_16_512_approximate_sse3: 175.0 (19.32x) yuv2yuvX_8_4_16_512_approximate_avx2: 134.1 (25.22x) yuv2yuvX_8_4_32_512_approximate_c: 3277.6 ( 1.00x) yuv2yuvX_8_4_32_512_approximate_mmxext: 427.2 ( 7.67x) yuv2yuvX_8_4_32_512_approximate_sse3: 149.7 (21.89x) yuv2yuvX_8_4_32_512_approximate_avx2: 115.5 (28.37x) yuv2yuvX_8_4_48_512_approximate_c: 3167.8 ( 1.00x) yuv2yuvX_8_4_48_512_approximate_mmxext: 414.9 ( 7.63x) yuv2yuvX_8_4_48_512_approximate_sse3: 164.1 (19.31x) yuv2yuvX_8_4_48_512_approximate_avx2: 101.2 (31.30x) yuv2yuvX_8_8_0_512_approximate_c: 5987.5 ( 1.00x) yuv2yuvX_8_8_0_512_approximate_mmxext: 854.1 ( 7.01x) yuv2yuvX_8_8_0_512_approximate_sse3: 294.6 (20.32x) yuv2yuvX_8_8_0_512_approximate_avx2: 144.1 (41.56x) yuv2yuvX_8_8_16_512_approximate_c: 5848.9 ( 1.00x) yuv2yuvX_8_8_16_512_approximate_mmxext: 834.4 ( 7.01x) yuv2yuvX_8_8_16_512_approximate_sse3: 312.1 (18.74x) yuv2yuvX_8_8_16_512_approximate_avx2: 214.9 (27.22x) yuv2yuvX_8_8_32_512_approximate_c: 5610.1 ( 1.00x) yuv2yuvX_8_8_32_512_approximate_mmxext: 811.6 ( 6.91x) yuv2yuvX_8_8_32_512_approximate_sse3: 277.5 (20.21x) yuv2yuvX_8_8_32_512_approximate_avx2: 189.8 (29.55x) yuv2yuvX_8_8_48_512_approximate_c: 5415.8 ( 1.00x) yuv2yuvX_8_8_48_512_approximate_mmxext: 782.3 ( 6.92x) yuv2yuvX_8_8_48_512_approximate_sse3: 289.4 (18.72x) yuv2yuvX_8_8_48_512_approximate_avx2: 165.3 (32.76x) yuv2yuvX_8_16_0_512_approximate_c: 11100.7 ( 1.00x) yuv2yuvX_8_16_0_512_approximate_mmxext: 1682.1 ( 6.60x) yuv2yuvX_8_16_0_512_approximate_sse3: 558.8 (19.86x) yuv2yuvX_8_16_0_512_approximate_avx2: 280.1 (39.63x) yuv2yuvX_8_16_16_512_approximate_c: 10772.1 ( 1.00x) yuv2yuvX_8_16_16_512_approximate_mmxext: 1611.0 ( 6.69x) yuv2yuvX_8_16_16_512_approximate_sse3: 578.1 (18.63x) yuv2yuvX_8_16_16_512_approximate_avx2: 418.8 (25.72x) yuv2yuvX_8_16_32_512_approximate_c: 10381.5 ( 1.00x) yuv2yuvX_8_16_32_512_approximate_mmxext: 1560.4 ( 6.65x) yuv2yuvX_8_16_32_512_approximate_sse3: 525.8 (19.74x) yuv2yuvX_8_16_32_512_approximate_avx2: 370.7 (28.01x) yuv2yuvX_8_16_48_512_approximate_c: 10046.1 ( 1.00x) yuv2yuvX_8_16_48_512_approximate_mmxext: 1512.4 ( 6.64x) yuv2yuvX_8_16_48_512_approximate_sse3: 546.0 (18.40x) yuv2yuvX_8_16_48_512_approximate_avx2: 315.0 (31.89x) New benchmarks: yuv2yuvX_8_2_0_512_approximate_c: 2302.5 ( 1.00x) yuv2yuvX_8_2_0_512_approximate_sse2: 184.4 (12.49x) yuv2yuvX_8_2_0_512_approximate_sse3: 100.1 (23.01x) yuv2yuvX_8_2_0_512_approximate_avx2: 54.9 (41.98x) yuv2yuvX_8_2_16_512_approximate_c: 2224.6 ( 1.00x) yuv2yuvX_8_2_16_512_approximate_sse2: 180.0 (12.36x) yuv2yuvX_8_2_16_512_approximate_sse3: 109.5 (20.31x) yuv2yuvX_8_2_16_512_approximate_avx2: 81.3 (27.35x) yuv2yuvX_8_2_32_512_approximate_c: 2165.3 ( 1.00x) yuv2yuvX_8_2_32_512_approximate_sse2: 176.6 (12.26x) yuv2yuvX_8_2_32_512_approximate_sse3: 93.7 (23.11x) yuv2yuvX_8_2_32_512_approximate_avx2: 73.1 (29.61x) yuv2yuvX_8_2_48_512_approximate_c: 2088.0 ( 1.00x) yuv2yuvX_8_2_48_512_approximate_sse2: 170.7 (12.23x) yuv2yuvX_8_2_48_512_approximate_sse3: 103.4 (20.20x) yuv2yuvX_8_2_48_512_approximate_avx2: 69.4 (30.10x) yuv2yuvX_8_4_0_512_approximate_c: 3496.8 ( 1.00x) yuv2yuvX_8_4_0_512_approximate_sse2: 320.3 (10.92x) yuv2yuvX_8_4_0_512_approximate_sse3: 158.8 (22.02x) yuv2yuvX_8_4_0_512_approximate_avx2: 86.4 (40.49x) yuv2yuvX_8_4_16_512_approximate_c: 3443.5 ( 1.00x) yuv2yuvX_8_4_16_512_approximate_sse2: 325.3 (10.59x) yuv2yuvX_8_4_16_512_approximate_sse3: 171.9 (20.03x) yuv2yuvX_8_4_16_512_approximate_avx2: 123.6 (27.85x) yuv2yuvX_8_4_32_512_approximate_c: 3272.2 ( 1.00x) yuv2yuvX_8_4_32_512_approximate_sse2: 302.7 (10.81x) yuv2yuvX_8_4_32_512_approximate_sse3: 148.9 (21.98x) yuv2yuvX_8_4_32_512_approximate_avx2: 110.6 (29.58x) yuv2yuvX_8_4_48_512_approximate_c: 3166.3 ( 1.00x) yuv2yuvX_8_4_48_512_approximate_sse2: 291.0 (10.88x) yuv2yuvX_8_4_48_512_approximate_sse3: 162.9 (19.44x) yuv2yuvX_8_4_48_512_approximate_avx2: 102.3 (30.95x) yuv2yuvX_8_8_0_512_approximate_c: 5967.6 ( 1.00x) yuv2yuvX_8_8_0_512_approximate_sse2: 691.2 ( 8.63x) yuv2yuvX_8_8_0_512_approximate_sse3: 294.2 (20.28x) yuv2yuvX_8_8_0_512_approximate_avx2: 154.9 (38.52x) yuv2yuvX_8_8_16_512_approximate_c: 5780.2 ( 1.00x) yuv2yuvX_8_8_16_512_approximate_sse2: 606.2 ( 9.53x) yuv2yuvX_8_8_16_512_approximate_sse3: 309.3 (18.69x) yuv2yuvX_8_8_16_512_approximate_avx2: 208.7 (27.69x) yuv2yuvX_8_8_32_512_approximate_c: 5604.3 ( 1.00x) yuv2yuvX_8_8_32_512_approximate_sse2: 592.3 ( 9.46x) yuv2yuvX_8_8_32_512_approximate_sse3: 281.1 (19.94x) yuv2yuvX_8_8_32_512_approximate_avx2: 185.4 (30.23x) yuv2yuvX_8_8_48_512_approximate_c: 5413.7 ( 1.00x) yuv2yuvX_8_8_48_512_approximate_sse2: 570.4 ( 9.49x) yuv2yuvX_8_8_48_512_approximate_sse3: 294.9 (18.36x) yuv2yuvX_8_8_48_512_approximate_avx2: 166.5 (32.51x) yuv2yuvX_8_16_0_512_approximate_c: 11099.4 ( 1.00x) yuv2yuvX_8_16_0_512_approximate_sse2: 1213.6 ( 9.15x) yuv2yuvX_8_16_0_512_approximate_sse3: 563.0 (19.72x) yuv2yuvX_8_16_0_512_approximate_avx2: 294.8 (37.65x) yuv2yuvX_8_16_16_512_approximate_c: 10718.1 ( 1.00x) yuv2yuvX_8_16_16_512_approximate_sse2: 1121.2 ( 9.56x) yuv2yuvX_8_16_16_512_approximate_sse3: 563.7 (19.01x) yuv2yuvX_8_16_16_512_approximate_avx2: 389.5 (27.51x) yuv2yuvX_8_16_32_512_approximate_c: 10373.3 ( 1.00x) yuv2yuvX_8_16_32_512_approximate_sse2: 1096.2 ( 9.46x) yuv2yuvX_8_16_32_512_approximate_sse3: 526.7 (19.70x) yuv2yuvX_8_16_32_512_approximate_avx2: 354.7 (29.24x) yuv2yuvX_8_16_48_512_approximate_c: 10066.9 ( 1.00x) yuv2yuvX_8_16_48_512_approximate_sse2: 1055.8 ( 9.53x) yuv2yuvX_8_16_48_512_approximate_sse3: 527.9 (19.07x) yuv2yuvX_8_16_48_512_approximate_avx2: 313.7 (32.09x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-26 23:48:21 +02:00
James Almer	45fe315cf0	tests/fate/mpegts: add tests for LCEVC samples Both single track (Payloads inside SEI messages) and dual track. Signed-off-by: James Almer <jamrial@gmail.com>	2026-04-24 16:04:48 -03:00
jade	5242bdae82	avformat/id3v2: add image/jxl for JPEG XL image attachments This allows JPEG XL images to be recognized as valid attachments. Since JPEG is already widely used for cover art, JXL's support for lossless JPEG transcodes can decrease the total size of music collections. This fixes JXL cover art rendering in applications like mpv which rely on FFmpeg for demuxing. Signed-off-by: jade <heartstopp1ng@proton.me>	2026-04-22 13:28:17 +00:00
Jun Zhao	188757d43d	tests/checkasm: add hevc_pred ref_filter_3tap and ref_filter_strong tests Test 3-tap for 8x8/16x16/32x32 (both filtered_left and filtered_top outputs). Test strong smoothing for filtered_top and in-place left modification. Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-04-21 07:50:49 +00:00
Romain Beauxis	82d7e375f1	libavdevice/alsa.c: fix NULL pointer dereference	2026-04-19 15:00:08 +00:00
Andreas Rheinhardt	415b466d41	avcodec/x86/vp3dsp: Port ff_vp3_idct_dc_add_mmxext to SSE2 This change should improve performance on Skylake and later Intel CPUs (which have only half the ports for saturated adds/subs for mmx register compared to xmm register): llvm-mca predicts a 25% performance improvement on Skylake. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-19 08:21:17 +02:00
Andreas Rheinhardt	88879f2eff	tests/checkasm/vp3dsp: Add test for idct_add, idct_put, idct_dc_add Due to a discrepancy between SSE2 and the C version coefficients for idct_put and idct_add are restricted to a range not causing overflows. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-19 08:21:08 +02:00
Andreas Rheinhardt	84b9de0633	avcodec/x86/vp3dsp: Port ff_put_vp_no_rnd_pixels8_l2_mmx to SSE2 This allows to use pavgb to reduce the amount of instructions used to calculate the average; processing two rows via movhps allows to reduce the amount of pxor and pavgb even further and turned out to be beneficial. This patch also avoids a load as the constant used here can be easily generated at runtime. Old benchmarks: put_no_rnd_pixels_l2_c: 13.3 ( 1.00x) put_no_rnd_pixels_l2_mmx: 11.6 ( 1.15x) New benchmarks: put_no_rnd_pixels_l2_c: 13.4 ( 1.00x) put_no_rnd_pixels_l2_sse2: 7.5 ( 1.77x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-19 08:15:54 +02:00
Andreas Rheinhardt	37bc3a237b	tests/checkasm/vp3dsp: Add test for put_no_rnd_pixels_l2 Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-19 08:14:50 +02:00
Niklas Haas	cf2d40f65d	swscale/ops: add explicit clear mask to SwsClearOp Instead of implicitly testing for NaN values. This is mostly a straightforward translation, but we need some slight extra boilerplate to ensure the mask is correctly updated when e.g. commuting past a swizzle. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-16 23:23:36 +02:00
Kacper Michajłow	03967fcff4	tests/checkasm/sw_ops: fix too large shift for int Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2026-04-16 18:56:22 +00:00
Andreas Rheinhardt	39f34ee019	tests/checkasm/h264chroma: Use more realistic block sizes Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-16 07:36:01 +02:00
Niklas Haas	dcfd8ebe86	tests/checkasm/sw_ops: remove random value clears These can randomly trigger the alpha/zero fast paths, resulting in spurious tests or randomly diverging performance if the backend happens to implement that particular fast path. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Niklas Haas	80b86f0807	tests/checkasm/sw_ops: fix check_scale() This was not actually testing integer path. Additionally, for integer scales, there is a special fast path for expansion from bits to full range, which we should separate from the random value test.	2026-04-15 14:51:16 +00:00
Niklas Haas	026a6a3101	tests/checkasm/sw_ops: remove redundant filter tests Most of these filters don't test anything meaningfully different relative to each other; the only filters that really have special significance are POINT (for now) and maybe BILINEAR down the line. Apart from that, SINC, combined with the src size loop, already tests both extreme cases (large and small filters), with large, oscillating unwindowed weights. The other filters are not adding anything of substance to this, while massively slowing down the runtime of this test. We can, of course, change this if the backends ever get more nuanced handling. checkasm: all 855 tests passed (down from 1575) Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Niklas Haas	91582f7287	tests/checkasm/sw_ops: explicitly test all backends The current code was a bit clumsy in that it always picked the first available backend when choosing the new function. This meant that some x86 paths were not being tested at all, whenever the memcpy backend (which has higher priority) could serve the request. This change makes it so that each backend is explicitly tested against only implementations provided by that same backend. checkasm: all 1575 tests passed (up from 1305) As an aside, it also lets us benchmark the memcpy backend directly against the C reference backend. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Niklas Haas	d5089a1c62	tests/checkasm/sw_ops: don't shadow 'report' Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Niklas Haas	3c1781f931	tests/checkasm/sw_ops: separate op compilation from testing This commit is purely moving around code; there is no functional change. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Niklas Haas	e83de76f08	tests/checkasm/sw_ops: check all planes in CHECK_COMMON() This can help e.g. properly test that the masked/excluded components are left unmodified. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Niklas Haas	eac90ce6ce	tests/checkasm/sw_ops: set correct plane index order All four components were accidentally being read/written to/from the same plane. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Niklas Haas	590eb4b70d	tests/checkasm/sw_ops: remove some unnecessary checks These don't actually exist at runtime, and will soon be removed from the backends as well. This commit is intentionally a bit incomplete; as I will rewrite this based on the auto-generated macros in the upcoming ops_micro series. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-15 14:51:16 +00:00
Andreas Rheinhardt	338dc25642	avcodec/x86/snowdsp_init: Remove MMXEXT, SSE2 inner_add_yblock versions They have been superseded by SSSE3; the SSE2 version was even disabled (and segfaults if enabled). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:53:17 +02:00

6929 commits