ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-02-13 19:05:37 +00:00

Author	SHA1	Message	Date
Lynne	72e83b42d1	vulkan_decode: clean up decoder initialization Now that we don't reset on every seek, we can simplify it.	2025-12-13 19:12:24 +01:00
Lynne	018ba6b612	vulkan_decode: do not reset the decoder when flushing The issue is that .flush gets called asynchronously, and modifies the video session state while it's being used for decoding. This did not result in issues since all known vendors do not keep important state there, but it's not compliant with the specs. It's not necessary to flush the decoder at all when seeking, so simply don't. Fixes #20487	2025-12-13 19:12:20 +01:00
Andreas Rheinhardt	3da2a21710	avcodec/hq_hqadata: Avoid relocations for HQProfiles Reviewed-by: Marton Balint <cus@passwd.hu> Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-13 05:57:47 +01:00
Andreas Rheinhardt	2718874724	avcodec/hq_hqadata: Remove padding from tables Each table needs only tab_wtab_h2 entries. Reviewed-by: Marton Balint <cus@passwd.hu> Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-13 05:55:44 +01:00
Andreas Rheinhardt	0cf187471f	avcodec/hq_hqa: Don't rederive value perm gets incremented in the loop in such a manner that it already has the value it is set to here except for the first loop iteration. Reviewed-by: Marton Balint <cus@passwd.hu> Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-13 05:55:20 +01:00
Ruikai Peng	c48b8ebbbb	avcodec/vulkan: fix DPX unpack offset The DPX Vulkan unpack shader computes a word offset as uint off = (line_off + pix_off >> 5); Due to GLSL operator precedence this is evaluated as line_off + (pix_off >> 5) rather than (line_off + pix_off) >> 5. Since line_off is in bits while off is a 32-bit word index, scanlines beyond y=0 use an inflated offset and the shader reads past the end of the DPX slice buffer. Parenthesize the expression so that the sum is shifted as intended: uint off = (line_off + pix_off) >> 5; This corrects the unpacked data and removes the CRC mismatch observed between the software and Vulkan DPX decoders for mispacked 12-bit DPX samples. The GPU OOB read itself is only observable indirectly via this corruption since it occurs inside the shader. Repro on x86_64 with Vulkan/llvmpipe (`531ce713a0`): ./configure --cc=clang --disable-optimizations --disable-stripping \ --enable-debug=3 --disable-doc --disable-ffplay \ --enable-vulkan --enable-libshaderc \ --enable-hwaccel=dpx_vulkan \ --extra-cflags='-fsanitize=address -fno-omit-frame-pointer' \ --extra-ldflags='-fsanitize=address' && make VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.json PoC: packed 12-bit DPX with the packing flag cleared so the unpack shader runs (4x64 gbrp12le), e.g. poc12_packed0.dpx. Software decode: ./ffmpeg -v error -i poc12_packed0.dpx -f framecrc - -> 0, ..., 1536, 0x26cf81c2 Vulkan hwaccel decode: VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.json \ ./ffmpeg -v error -init_hw_device vulkan \ -hwaccel vulkan -hwaccel_output_format vulkan \ -i poc12_packed0.dpx \ -vf hwdownload,format=gbrp12le -f framecrc - -> 0, ..., 1536, 0x71e10a51 The only difference between the two runs is the Vulkan unpack shader, and the stable CRC mismatch indicates that it is reading past the intended DPX slice region. Regression since: `531ce713a0` Found-by: Pwno	2025-12-12 20:13:16 +00:00
James Almer	9c14527f1a	avcodec/vvc/refs: export in-band LCEVC side data in frames Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-12 15:21:49 -03:00
James Almer	94c491287c	avcodec/vvc/sei: parse Registered and Unregistered SEI messages Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-12 15:21:48 -03:00
James Almer	6dad70507f	avcodec/cbs_sei: store a pointer to the start of Registered and Unregistered SEI messages Required for the following commit, where a parsing function expects the buffer to include the country code bytes. Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-12 15:21:48 -03:00
James Almer	b6655e9594	avcodec/dpx: make the lack of break in a switch case explicit Should fix CID 1676036 Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-12 18:18:46 +00:00
Cameron Gutman	0637a28dc0	lavc/vulkan_video: fix leak on CreateVideoSessionKHR failure Signed-off-by: Cameron Gutman <aicommander@gmail.com>	2025-12-12 12:43:00 +00:00
Cameron Gutman	4e4677bf58	lavc/vulkan_video: fix double-free if ff_vk_decode_init() fails ff_vk_video_common_init() calls ff_vk_video_common_uninit() on failure which leaves dangling object handles. Those get freed again when the destructor of FFVulkanDecodeShared calls ff_vk_video_common_uninit() again. Signed-off-by: Cameron Gutman <aicommander@gmail.com>	2025-12-12 12:43:00 +00:00
Andreas Rheinhardt	a72e01b4ec	avcodec/ppc/vc1dsp_altivec: Don't read too much data vc1_inv_trans_8x4_altivec() is supposed to process a block of 8x4 words, yet it read and processed eight lines. This led to ASAN failures (see [1]) that this commit intends to fix. It should also lead to performance improvements, but I don't have real hardware to bench it. [1]: https://fate.ffmpeg.org/report.cgi?time=20251207214004&slot=ppc64-linux-gcc-14.3-asan Reviewed-by: Sean McGovern <gseanmcg@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-12 09:44:01 +01:00
James Almer	04df80f973	avcodec/cavs_parser: check return value of init_get_bits8 Fixes Coverity issue CID 1676035 Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-11 20:01:01 -03:00
Rémi Denis-Courmont	a4cb6c724b	lavc/llvidencdsp: R-V V sub_left_predict SpacemiT X60: sub_left_predict_c: 51836.0 ( 1.00x) sub_left_predict_rvv_i32: 5843.1 ( 8.87x)	2025-12-11 17:24:38 +02:00
Leo Izen	37858dc6bd	avcodec/libjxlenc: add EXIF box to output We already parse the EXIF side data to extract the orientation, so we should add it to the output file as an EXIF box. Signed-off-by: Leo Izen <leo.izen@gmail.com>	2025-12-11 05:38:36 -05:00
Leo Izen	e349118b4c	avcodec/libjxlenc: avoid calling functions inside if statements It leads to messier, less readable code, and can also lead to bugs. I prefer this code style. Signed-off-by: Leo Izen <leo.izen@gmail.com>	2025-12-11 05:38:35 -05:00
Leo Izen	6ec4b3a9cb	avcodec/libjxlenc: give display matrix sidedata priority Before this commit, we ignore the display matrix side data if any EXIF side data is present, even if that side data contains no orientation tag. This allows us to calculate the orientation from the display matrix sidedata first, if present. Ideally the decoder will have removed the orientation tag upon decoding and attached the data as display matrix side data instead, so this makes our orientation code respect this behavior. Signed-off-by: Leo Izen <leo.izen@gmail.com>	2025-12-11 05:38:33 -05:00
Hyunjun Ko	6726359326	vulkan_vp9: fix subsampling source and show_frame flag	2025-12-10 18:41:20 +00:00
Kacper Michajłow	04a46a2ae4	avcodec/d3d12va_encode_av1: don't ignore return value Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-08 21:31:13 +00:00
Kacper Michajłow	f4fc14fb38	avcodec/d3d12va_encode_av1: fix size_t format specifier	2025-12-08 21:31:13 +00:00
Kacper Michajłow	5b2bd6f88d	avcodec/d3d12va_encode_av1: remove unused variables Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-08 21:31:13 +00:00
Kacper Michajłow	1f7182a991	avcodec/libx265: add explicit enum cast to suppress compiler warnings Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-08 21:31:13 +00:00
Kacper Michajłow	eaa2b3d4be	avcodec/libsvtav1: add explicit enum cast to suppress compiler warnings Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-08 21:31:13 +00:00
Kacper Michajłow	490af2d4cf	avcodec/libaomdec: add explicit enum cast to suppress compiler warnings Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-08 21:31:13 +00:00
Andreas Rheinhardt	dc843cdd9a	avcodec/x86/vp9mc: Reindent after the previous commit Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-08 19:35:07 +01:00
Andreas Rheinhardt	65e71b0837	avcodec/x86/vp9mc: Deduplicate coefficient tables Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-08 19:35:01 +01:00
Andreas Rheinhardt	38e2174ce4	avcodec/x86/vp9mc: Avoid MMX regs in width 4 hor 8tap funcs Using wider registers (and pshufb) allows to halve the number of pmaddubsw used. It is also ABI compliant (no more missing emms). Old benchmarks: vp9_avg_8tap_smooth_4h_8bpp_c: 97.6 ( 1.00x) vp9_avg_8tap_smooth_4h_8bpp_ssse3: 15.0 ( 6.52x) vp9_avg_8tap_smooth_4hv_8bpp_c: 342.9 ( 1.00x) vp9_avg_8tap_smooth_4hv_8bpp_ssse3: 54.0 ( 6.35x) vp9_put_8tap_smooth_4h_8bpp_c: 94.9 ( 1.00x) vp9_put_8tap_smooth_4h_8bpp_ssse3: 14.2 ( 6.67x) vp9_put_8tap_smooth_4hv_8bpp_c: 325.9 ( 1.00x) vp9_put_8tap_smooth_4hv_8bpp_ssse3: 52.5 ( 6.20x) New benchmarks: vp9_avg_8tap_smooth_4h_8bpp_c: 97.6 ( 1.00x) vp9_avg_8tap_smooth_4h_8bpp_ssse3: 10.8 ( 9.08x) vp9_avg_8tap_smooth_4hv_8bpp_c: 342.4 ( 1.00x) vp9_avg_8tap_smooth_4hv_8bpp_ssse3: 38.8 ( 8.82x) vp9_put_8tap_smooth_4h_8bpp_c: 94.7 ( 1.00x) vp9_put_8tap_smooth_4h_8bpp_ssse3: 9.7 ( 9.75x) vp9_put_8tap_smooth_4hv_8bpp_c: 321.7 ( 1.00x) vp9_put_8tap_smooth_4hv_8bpp_ssse3: 37.0 ( 8.69x) Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-08 19:34:35 +01:00
Andreas Rheinhardt	dd5dc254ff	avcodec/x86/vp9mc: Avoid reloads, MMX regs in width 4 vert 8tap func Four rows of four bytes fit into one xmm register; therefore one can arrange the rows as follows (A,B,C: first, second, third etc. row) xmm0: ABABABAB BCBCBCBC xmm1: CDCDCDCD DEDEDEDE xmm2: EFEFEFEF FGFGFGFG xmm3: GHGHGHGH HIHIHIHI and use four pmaddubsw to calculate two rows in parallel. The history fits into four registers, making this possible even on 32bit systems. Old benchmarks (Unix 64): vp9_avg_8tap_smooth_4v_8bpp_c: 105.5 ( 1.00x) vp9_avg_8tap_smooth_4v_8bpp_ssse3: 16.4 ( 6.44x) vp9_put_8tap_smooth_4v_8bpp_c: 99.3 ( 1.00x) vp9_put_8tap_smooth_4v_8bpp_ssse3: 15.4 ( 6.44x) New benchmarks (Unix 64): vp9_avg_8tap_smooth_4v_8bpp_c: 105.0 ( 1.00x) vp9_avg_8tap_smooth_4v_8bpp_ssse3: 11.8 ( 8.90x) vp9_put_8tap_smooth_4v_8bpp_c: 99.7 ( 1.00x) vp9_put_8tap_smooth_4v_8bpp_ssse3: 10.7 ( 9.30x) Old benchmarks (x86-32): vp9_avg_8tap_smooth_4v_8bpp_c: 138.2 ( 1.00x) vp9_avg_8tap_smooth_4v_8bpp_ssse3: 28.0 ( 4.93x) vp9_put_8tap_smooth_4v_8bpp_c: 123.6 ( 1.00x) vp9_put_8tap_smooth_4v_8bpp_ssse3: 28.0 ( 4.41x) New benchmarks (x86-32): vp9_avg_8tap_smooth_4v_8bpp_c: 139.0 ( 1.00x) vp9_avg_8tap_smooth_4v_8bpp_ssse3: 20.1 ( 6.92x) vp9_put_8tap_smooth_4v_8bpp_c: 124.5 ( 1.00x) vp9_put_8tap_smooth_4v_8bpp_ssse3: 19.9 ( 6.26x) Loading the constants into registers did not turn out to be advantageous here (not to mention Win64, where this would necessitate saving and restoring ever more registers); probably because there are only two loop iterations. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-08 19:31:59 +01:00
Andreas Rheinhardt	36204fbc3c	avcodec/vp9itxfm{,_16bpp}: Remove MMXEXT functions overridden by SSSE3 SSSE3 is already quite old (introduced 2006 for Intel, 2011 for AMD), so that the overwhelming majority of our users (particularly those that actually update their FFmpeg) will be using the SSSE3 versions. This commit therefore removes the MMXEXT functions overridden by them (which don't abide by the ABI) to get closer to a removal of emms_c. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-08 19:27:51 +01:00
Andreas Rheinhardt	ea37f49aed	avcodec/vp9intrapred: Remove MMXEXT functions overridden by SSSE3 SSSE3 is already quite old (introduced 2006 for Intel, 2011 for AMD), so that the overwhelming majority of our users (particularly those that actually update their FFmpeg) will be using the SSSE3 versions. This commit therefore removes the MMXEXT functions overridden by them (which don't abide by the ABI) to get closer to a removal of emms_c. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-08 19:27:44 +01:00
Andreas Rheinhardt	6e418af810	avcodec/vp9mc: Remove MMXEXT functions overridden by SSSE3 SSSE3 is already quite old (introduced 2006 for Intel, 2011 for AMD), so that the overwhelming majority of our users (particularly those that actually update their FFmpeg) will be using the SSSE3 versions. This commit therefore removes the MMXEXT functions overridden by them (which don't abide by the ABI) to get closer to a removal of emms_c. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-08 19:27:05 +01:00
Kacper Michajłow	5b5d51cbc1	avcodec/x86/h264_idct: fix version check for NASM 3 and newer Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-08 17:43:29 +00:00
Oliver Chang	9849a274df	avcodec/dpx: Fix heap-buffer-overflow in 16-bit decoding Fixes a heap-buffer-overflow in `libavcodec/dpx.c` triggered by a stale `unpadded_10bit` flag in the `DPXDecContext`. This flag, set for 10-bit unpadded frames, persisted across `decode_frame` calls. If a subsequent frame was 16-bit, the stale flag caused incorrect buffer size validation, allowing truncated buffers to pass checks designed for smaller 10-bit packed data. This led to an out-of-bounds read in `av_image_copy_plane` during 16-bit decoding. The fix explicitly resets `dpx->unpadded_10bit = 0` at the start of `decode_frame` to ensure correct validation for each frame. Fixes: https://issues.oss-fuzz.com/issues/464471792 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> Fixes: out of array read Fixes: 464471792/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DPX_DEC_fuzzer-5275522210004992	2025-12-07 19:41:02 +00:00
Rémi Denis-Courmont	10ea5f8b99	lavc/h264idct: R-V V 9-bit h264_luma_dc_dequant_idct Note that, like the C reference, the same function can be used for larger bit depths.	2025-12-07 20:27:35 +02:00
Rémi Denis-Courmont	d69a36a8d1	lavc/h264idct: R-V V 8-bit h264_luma_dc_dequant_idct This does not improve performance with current hardware due to the poor performance of segmented accesses. Performance should be slightly better with expensive or near-future hardware that I don't have, however it is still limited by two other factors: - There are only 4 elements. - The final stores are necessarily indexed and hit multiple cache lines, thus as slow as scalar.	2025-12-07 20:27:35 +02:00
Rémi Denis-Courmont	f222eb2b08	lavc/mpv_unquantize: R-V V H.263 DCT unquantize SpacemiT X60: dct_unquantize_h263_inter_c: 417.8 ( 1.00x) dct_unquantize_h263_inter_rvv_i32: 66.0 ( 6.33x) dct_unquantize_h263_intra_c: 140.2 ( 1.00x) dct_unquantize_h263_intra_rvv_i32: 67.7 ( 2.07x) Note that the C benchmarks are not stable, depending heavily on the number of coefficients picked by the RNG. The R-V V benchmarks are however very stable and generally better than C's.	2025-12-07 20:20:38 +02:00
averne	c384b1e803	vulkan/prores: use vkCmdClearColorImage The VK spec forbids using clear commands on YUV images, so we need to allocate separate per-plane images. This removes the need for a separate reset shader.	2025-12-07 18:17:36 +00:00
James Almer	00caeba050	avcodec: rename avcodec_receive_frame2 to avcodec_receive_frame_flags It's a name that communicates its functionality in a better way. Since the function was introduced very recently, we can safely rename it. Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-07 12:47:46 -03:00
Michael Niedermayer	88f26718a0	avcodec/decode: Fix build due to ff_thread_receive_frame() Regression since: `5e56937b74` Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-12-07 11:58:01 +01:00
Niklas Haas	5e56937b74	avcodec: allow bypassing frame threading with an optional flag Normally, this function tries to make sure all threads are saturated with work to do before returning any frames; and will continue requesting packets until that is the case. However, this significantly increases initial decoding latency when only requesting a single frame (to e.g. configure the filter graph), and also wastes a lot of unnecessary memory in the event that the user does not intend to decode more frames until later. By introducing a new `flags` parameter and a new flag `AV_CODEC_RECEIVE_FRAME_FLAG_SYNCHRONOUS` to go along with it, we can allow users to temporarily bypass this logic.	2025-12-05 19:42:41 +01:00
Araz Iusubov	077864dfd6	avcodec/amf: fix hw_device_ctx handling	2025-12-05 15:53:19 +00:00
Zhao Zhili	d3953237d1	avcodec/h264_slice: don't force ff_get_format unconditionally after flush h->context_initialized is zero after flush, which triggers a call to ff_get_format unconditionally. ff_get_format can be heavy with ff_hwaccel_uninit and hwaccel_init. For example, it takes 20 ms on macOS with videotoolbox. ff_get_format should not be called if nothing changed. ff_get_format is guaranteed to be called the first time and whenever the video information changes, via (must_reinit || needs_reinit). Fix #20760.	2025-12-05 13:54:08 +00:00
Andreas Rheinhardt	1d47ae65bf	avcodec/tableprint_vlc: Unbreak hardcoded tables Forgotten in `d8ffec5bf9`. Fixes issue #21102. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-05 11:31:23 +01:00
Lynne	f80addbb07	ffv1enc_vulkan: fix encoding with large contexts When RGB_LINECACHE == 2, then top2 is not the current line.	2025-12-04 16:53:58 +01:00
Andreas Rheinhardt	4b6e40a298	avcodec/vp8dsp: Don't compile unused functions The width 16 epel functions never use four taps in any direction, so don't build said functions. Saves 4352B of .text and 89B of .text.unlikely here. : mx and my in vp8_mc_luma() are always even. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	9cff236e2f	avcodec/riscv/vp8dsp_rvv: Remove unused functions Only the sixtap functions are used for size 16. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	050c80a526	avcodec/x86/vp8dsp: Don't use saturated addition when unnecessary For the epel functions, there can be no overflow as long as the sum contains only one of the two large central coefficients; for bilinear functions, there can be no overflow whatsoever. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	575e9e9c08	avcodec/x86/vp8dsp: Reduce number of coefficient tables By changing the permutations used in the epel8_h{4,6} case we can simply reuse the coefficient tables from the vertical epel filters. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	99fb257f58	avcodec/x86/vp8dsp: Don't use MMX registers in ff_put_vp8_epel4_h6_ssse3 Doubling the register width allowed to avoid a pshufb and a pmaddubsw. Old benchmarks: vp8_put_epel4_h6_c: 115.9 ( 1.00x) vp8_put_epel4_h6_ssse3: 20.2 ( 5.74x) vp8_put_epel4_h6v4_c: 276.3 ( 1.00x) vp8_put_epel4_h6v4_ssse3: 58.6 ( 4.71x) vp8_put_epel4_h6v6_c: 363.6 ( 1.00x) vp8_put_epel4_h6v6_ssse3: 62.5 ( 5.82x) New benchmarks: vp8_put_epel4_h6_c: 116.4 ( 1.00x) vp8_put_epel4_h6_ssse3: 16.0 ( 7.29x) vp8_put_epel4_h6v4_c: 280.9 ( 1.00x) vp8_put_epel4_h6v4_ssse3: 44.3 ( 6.33x) vp8_put_epel4_h6v6_c: 365.6 ( 1.00x) vp8_put_epel4_h6v6_ssse3: 53.1 ( 6.89x) Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
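
The stale-flag bug described in `9849a274df` (avcodec/dpx) can be illustrated with a small, self-contained sketch. This is not the code from `libavcodec/dpx.c`: the struct `SketchDecContext`, the functions `sketch_required_size()` and `sketch_decode_frame()`, the `reset_flag` switch, and the simplified size model are all hypothetical and exist only to show why a per-frame property kept in a long-lived context has to be reset at the start of each frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a decoder context.
 * It is NOT FFmpeg's real DPXDecContext. */
typedef struct SketchDecContext {
    int unpadded_10bit; /* per-frame property stored in long-lived state */
} SketchDecContext;

/* Assumed size model: unpadded 10-bit samples are tightly bit-packed,
 * everything else uses whole bytes per sample. */
static size_t sketch_required_size(const SketchDecContext *c, int bits,
                                   size_t samples)
{
    assert(samples <= SIZE_MAX / 10); /* keep the arithmetic overflow-free */
    if (c->unpadded_10bit)
        return (samples * 10 + 7) / 8;
    return samples * (size_t)((bits + 7) / 8);
}

/* Returns 0 if the buffer passes validation, -1 if it is too small.
 * reset_flag selects between the fixed behaviour (1) and the buggy one (0). */
static int sketch_decode_frame(SketchDecContext *c, int bits, int unpadded,
                               size_t samples, size_t buf_size, int reset_flag)
{
    if (reset_flag)
        c->unpadded_10bit = 0; /* the fix: start every frame from a clean state */
    if (bits == 10 && unpadded)
        c->unpadded_10bit = 1;
    return buf_size >= sketch_required_size(c, bits, samples) ? 0 : -1;
}
```

In this sketch, decoding an unpadded 10-bit frame and then a 16-bit frame with `reset_flag` set to 0 lets a truncated 16-bit buffer pass the check meant for packed 10-bit data; with `reset_flag` set to 1 the same buffer is rejected.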

53232 commits