ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-06-15 12:00:33 +00:00

Author	SHA1	Message	Date
Andreas Rheinhardt	e4e6377afc	avcodec/arm/mpegvideo_arm: Use static_assert to check offsets Also move AV_CHECK_OFFSET to its only user, namely lavc/arm/mpegvideo_arm.c and rename it to CHECK_OFFSET. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:43 +01:00
Andreas Rheinhardt	790f793844	avutil/common: Don't auto-include mem.h There are lots of files that don't need it: The number of object files that actually need it went down from 2011 to 884 here. Keep it for external users in order to not cause breakages. Also improve the other headers a bit while just at it. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:43 +01:00
Andreas Rheinhardt	b616be1649	lib*/version: Use static_assert for static asserts Also update the checks that guard against inserting a new enum entry in the middle of a range. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:42 +01:00
Andreas Rheinhardt	c8549d480f	avcodec/msmpeg4: Don't include x86-specific header unconditionally Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:42 +01:00
Andreas Rheinhardt	a265e8ca92	avcodec, avfilter: Don't use "" for system headers Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:42 +01:00
Andreas Rheinhardt	347a70f101	avcodec/pcm-bluray/dvd: Use correct pointer types on BE Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:42 +01:00
Andreas Rheinhardt	cd63dab55c	avcodec/mips/ac3dsp_mips: Add missing includes Likely broken in `d7a75d2163`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-31 00:08:42 +01:00
Andreas Rheinhardt	dc7a60529c	avcodec/ratecontrol: Use forward declaration for AVExpr Avoids including eval.h everywhere where mpegvideo.h is included. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-30 05:06:28 +01:00
Andreas Rheinhardt	348461e550	avcodec/h264_refs: Use smaller scope, don't use av_uninit In particular, declare iterators with loop scope. Also remove av_uninit while at it, because they are now unnecessary due to the changes of the preceding commit. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-30 05:06:28 +01:00
Andreas Rheinhardt	ac14d68277	avcodec/h264_refs: Rewrite code to make control flow clearer While this change IMO makes the control flow clearer for the human reader, it is especially important for GCC: It erroneously believes that it is possible to enter the SHORT2(UNUSED|LONG) cases without having entered the preceding block that initializes pic, frame_num, structure and j; it would emit -Wmaybe-uninitialized warnings for these variables if they were not pseudo-initialized with av_uninit(). This patch allows to remove the latter. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-30 05:06:28 +01:00
Timo Rothenpieler	e99c273fec	avcodec/nvdec: reset bitstream_len/nb_slices when resetting bitstream pointer	2024-03-30 00:12:23 +01:00
James Almer	547c920193	avcodec/hevc_ps: don't use a fixed sized buffer for parameter set raw data Allocate it instead, and use it to compare sets instead of the parsed struct. Signed-off-by: James Almer <jamrial@gmail.com>	2024-03-29 15:34:21 -03:00
Tong Wu	6bf17136a2	avcodec/hevc_ps: fix the problem of memcmp losing effectiveness HEVCHdrParams* receives a pointer which points to a dynamically allocated memory block. It causes the memcmp always returning 1. Add a function to do the comparison. A condition is also added to avoid malloc(0). Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Tong Wu <tong1.wu@intel.com> Signed-off-by: James Almer <jamrial@gmail.com>	2024-03-29 12:35:54 -03:00
Anton Khirnov	c240ff98b3	lavc/packet: schedule AV_PKT_DATA_QUALITY_FACTOR for removal It is unused internally and has been marked as deprecated a long time ago.	2024-03-29 09:01:54 +01:00
Anton Khirnov	1d843ae6c7	lavc: rename avpacket.c to packet.c For consistency with its API header packet.h.	2024-03-29 09:01:54 +01:00
Andreas Rheinhardt	8d1093a784	avcodec/libvpxenc: Remove obsolete av_unused Forgotten in `753074721b`. Reviewed-by: James Zern <jzern@google.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-29 00:58:05 +01:00
Andreas Rheinhardt	1093b40218	avcodec/libvpxenc: Only search for side data when intending to use it Also rewrite the code so that a variable that is only used depending upon CONFIG_LIBVPX_VP9_ENCODER is not declared outside of the #if block. (The variable was declared with av_uninit, but it should have been av_unused, as the former does not work for all compilers.) Reviewed-by: James Zern <jzern@google.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-29 00:45:17 +01:00
Andreas Rheinhardt	e465cebfee	avcodec/Makefile: Remove redundant dependencies on hevc_data.o hevc_data.c only provides ff_hevc_diag_scan tables and neither the QSV HEVC encoder nor the HEVC parser use these directly and the indirect dependency is already accounted for in the dependencies of the hevcparse subsystem since `b0c61209cd`, so remove these spurious dependencies. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-29 00:39:11 +01:00
Anton Khirnov	e0de84ad2e	lavc/encode: map AVCodecContext.decoded_side_data to coded_side_data This way it can be automagically propagated through the encoder to muxing.	2024-03-28 08:40:11 +01:00
Anton Khirnov	a3f4670943	lavc/decode: move sd_global_map to avcodec It will be shared with encoding code.	2024-03-28 08:40:01 +01:00
Anton Khirnov	e1f384adbf	lavc/frame_thread_encoder: avoid assigning a whole AVCodecContext It is highly unsafe, as AVCodecContext contains many allocated fields. Almost everything needed by worker threads should be covered by routing through AVCodecParameters and av_opt_copy(), except for a few fields that are copied manually. avcodec_free_context() can now be used for per-thread contexts.	2024-03-28 08:40:01 +01:00
Anton Khirnov	198a7788e7	lavc: avoid leaking AVCodecContext.chroma_intra_matrix	2024-03-28 08:40:01 +01:00
Andreas Rheinhardt	686d33a6b0	avcodec/profiles: Don't include avcodec.h Forgotten in `8238bc0b5e`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-28 03:08:01 +01:00
Andreas Rheinhardt	33b1c7ebbf	avcodec/magicyuvenc: Don't call functions twice due to macro Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-28 03:08:01 +01:00
Andreas Rheinhardt	8013574e9b	avcodec/mjpegenc: Inline chroma subsampling Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-28 03:08:00 +01:00
Andreas Rheinhardt	0b212f3595	avcodec/bfi: Remove unused AVCodecContext* from context Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-28 03:06:13 +01:00
Andreas Rheinhardt	6edd83c0e2	avcodec/ratecontrol: Avoid function pointer casts It is undefined behaviour to call a function with a different signature for the call than the actual function signature; there are no exceptions for void* and RateControlEntry*. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-28 03:06:13 +01:00
Andreas Rheinhardt	641850f67f	avcodec/wmaprodec: Explicitly return 0 on success Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-03-28 03:06:13 +01:00
Zhao Zhili	89e9486bc3	avcodec/h264_mp4toannexb: Fix heap buffer overflow Fixes: out of array write Fixes: 64407/clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_MP4TOANNEXB_fuzzer-4966763443650560 mp4toannexb_filter counts the number of bytes needed in the first pass and allocate the memory, then do memcpy in the second pass. Update sps/pps size in the loop makes the count invalid in the case of SPS/PPS occur after IDR slice. This patch process in-band SPS/PPS before the two pass loops. Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2024-03-27 20:04:40 +08:00
Michael Niedermayer	6b213175c9	Bump after 7.0 branch point Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-27 01:04:54 +01:00
Michael Niedermayer	872980ace6	Bump prior release/7.0 branch Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-27 01:04:53 +01:00
Michael Niedermayer	1eb8cbd09c	avcodec/wavarc: avoid signed integer overflow in AC code Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-659847401740697 Fixes: signed integer overflow: 65312 * 34078 cannot be represented in type 'int' Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-26 23:19:49 +01:00
Michael Niedermayer	6009dd07bd	avcodec/wavarc: Avoid signed integer overflow in sample Fixes: signed integer overflow: -2147483648 + -25122315 cannot be represented in type 'int' Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-6199806972198912 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-26 23:19:49 +01:00
Michael Niedermayer	ebdcf98499	avcodec/truemotion1: Height not being a multiple of 4 is unsupported mb_change_bits is given space based on height >> 2, while more data is read Fixes: out of array access Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TRUEMOTION1_fuzzer-5201925062590464.fuzz Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-26 23:19:49 +01:00
Michael Niedermayer	d188a86730	avcodec/rtv1: fix undefined FFALIGN Fixes: signed integer overflow: 2147483647 + 4 cannot be represented in type 'int' Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_RTV1_fuzzer-6324303861514240 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-26 23:19:49 +01:00
Michael Niedermayer	7eabe56436	avcodec/qoadec: Fix undefined overflow in lms_predict Fixes: signed integer overflow: -1575944192 + -602931200 cannot be represented in type 'int' Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_QOA_fuzzer-6470469339185152 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-26 23:19:49 +01:00
Michael Niedermayer	48eeb198a5	avcodec/hcadec: do not allow code to continue after failed init Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HCA_fuzzer-6247136417087488 Fixes: out of array write Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-26 23:19:49 +01:00
Michael Niedermayer	addb85ea39	avcodec/hcadec: do not set hfr_group_count to invalid values Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HCA_fuzzer-6247136417087488 Fixes: out of array write Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-03-26 23:19:49 +01:00
Dai, Jianhui J	61afe4d98c	avcodec/cbs_vp8: Improve the bitstream position check The VP8 compressed header may not be byte-aligned due to boolean coding. Round up byte count for accurate data positioning. Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2024-03-26 09:05:04 -04:00
Dai, Jianhui J	63dea3c1e1	avcodec/cbs_vp8: Use little endian in fixed() This commit adds value range checks to cbs_vp8_read_unsigned_le, migrates fixed() to use it, and enforces little-endian consistency for all read methods. Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2024-03-26 09:04:44 -04:00
Martin Storsjö	f872b19714	aarch64: hevc: Produce plain neon versions of qpel_bi_hv As the plain neon qpel_h functions process two rows at a time, we need to allocate storage for h+8 rows instead of h+7. By allocating storage for h+8 rows, incrementing the stack pointer won't end up at the right spot in the end. Store the intended final stack pointer value in a register x14 which we store on the stack. AWS Graviton 3: put_hevc_qpel_bi_hv4_8_c: 385.7 put_hevc_qpel_bi_hv4_8_neon: 131.0 put_hevc_qpel_bi_hv4_8_i8mm: 92.2 put_hevc_qpel_bi_hv6_8_c: 701.0 put_hevc_qpel_bi_hv6_8_neon: 239.5 put_hevc_qpel_bi_hv6_8_i8mm: 191.0 put_hevc_qpel_bi_hv8_8_c: 1162.0 put_hevc_qpel_bi_hv8_8_neon: 228.0 put_hevc_qpel_bi_hv8_8_i8mm: 225.2 put_hevc_qpel_bi_hv12_8_c: 2305.0 put_hevc_qpel_bi_hv12_8_neon: 558.0 put_hevc_qpel_bi_hv12_8_i8mm: 483.2 put_hevc_qpel_bi_hv16_8_c: 3965.2 put_hevc_qpel_bi_hv16_8_neon: 732.7 put_hevc_qpel_bi_hv16_8_i8mm: 656.5 put_hevc_qpel_bi_hv24_8_c: 8709.7 put_hevc_qpel_bi_hv24_8_neon: 1555.2 put_hevc_qpel_bi_hv24_8_i8mm: 1448.7 put_hevc_qpel_bi_hv32_8_c: 14818.0 put_hevc_qpel_bi_hv32_8_neon: 2763.7 put_hevc_qpel_bi_hv32_8_i8mm: 2468.0 put_hevc_qpel_bi_hv48_8_c: 32855.5 put_hevc_qpel_bi_hv48_8_neon: 6107.2 put_hevc_qpel_bi_hv48_8_i8mm: 5452.7 put_hevc_qpel_bi_hv64_8_c: 57591.5 put_hevc_qpel_bi_hv64_8_neon: 10660.2 put_hevc_qpel_bi_hv64_8_i8mm: 9580.0 Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:05:55 +02:00
Martin Storsjö	d21b9a0411	aarch64: hevc: Produce plain neon versions of qpel_uni_w_hv As the plain neon qpel_h functions process two rows at a time, we need to allocate storage for h+8 rows instead of h+7. AWS Graviton 3: put_hevc_qpel_uni_w_hv4_8_c: 422.2 put_hevc_qpel_uni_w_hv4_8_neon: 140.7 put_hevc_qpel_uni_w_hv4_8_i8mm: 100.7 put_hevc_qpel_uni_w_hv8_8_c: 1208.0 put_hevc_qpel_uni_w_hv8_8_neon: 268.2 put_hevc_qpel_uni_w_hv8_8_i8mm: 261.5 put_hevc_qpel_uni_w_hv16_8_c: 4297.2 put_hevc_qpel_uni_w_hv16_8_neon: 802.2 put_hevc_qpel_uni_w_hv16_8_i8mm: 731.2 put_hevc_qpel_uni_w_hv32_8_c: 15518.5 put_hevc_qpel_uni_w_hv32_8_neon: 3085.2 put_hevc_qpel_uni_w_hv32_8_i8mm: 2783.2 put_hevc_qpel_uni_w_hv64_8_c: 57254.5 put_hevc_qpel_uni_w_hv64_8_neon: 11787.5 put_hevc_qpel_uni_w_hv64_8_i8mm: 10659.0 Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:05:55 +02:00
Martin Storsjö	5ab138673b	aarch64: hevc: Produce plain neon versions of qpel_uni_hv As the plain neon qpel_h functions process two rows at a time, we need to allocate storage for h+8 rows instead of h+7. By allocating storage for h+8 rows, incrementing the stack pointer won't end up at the right spot in the end. Store the intended final stack pointer value in a register x14 which we store on the stack. AWS Graviton 3: put_hevc_qpel_uni_hv4_8_c: 384.2 put_hevc_qpel_uni_hv4_8_neon: 127.5 put_hevc_qpel_uni_hv4_8_i8mm: 85.5 put_hevc_qpel_uni_hv6_8_c: 705.5 put_hevc_qpel_uni_hv6_8_neon: 224.5 put_hevc_qpel_uni_hv6_8_i8mm: 176.2 put_hevc_qpel_uni_hv8_8_c: 1136.5 put_hevc_qpel_uni_hv8_8_neon: 216.5 put_hevc_qpel_uni_hv8_8_i8mm: 214.0 put_hevc_qpel_uni_hv12_8_c: 2259.5 put_hevc_qpel_uni_hv12_8_neon: 498.5 put_hevc_qpel_uni_hv12_8_i8mm: 410.7 put_hevc_qpel_uni_hv16_8_c: 3824.7 put_hevc_qpel_uni_hv16_8_neon: 670.0 put_hevc_qpel_uni_hv16_8_i8mm: 603.7 put_hevc_qpel_uni_hv24_8_c: 8113.5 put_hevc_qpel_uni_hv24_8_neon: 1474.7 put_hevc_qpel_uni_hv24_8_i8mm: 1351.5 put_hevc_qpel_uni_hv32_8_c: 14744.5 put_hevc_qpel_uni_hv32_8_neon: 2599.7 put_hevc_qpel_uni_hv32_8_i8mm: 2266.0 put_hevc_qpel_uni_hv48_8_c: 32800.0 put_hevc_qpel_uni_hv48_8_neon: 5650.0 put_hevc_qpel_uni_hv48_8_i8mm: 5011.7 put_hevc_qpel_uni_hv64_8_c: 57856.2 put_hevc_qpel_uni_hv64_8_neon: 9863.5 put_hevc_qpel_uni_hv64_8_i8mm: 8767.7 Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:05:55 +02:00
Martin Storsjö	5cbeefc79e	aarch64: hevc: Produce plain neon versions of qpel_hv As the plain neon qpel_h functions process two rows at a time, we need to allocate storage for h+8 rows instead of h+7. By allocating storage for h+8 rows, incrementing the stack pointer won't end up at the right spot in the end. Store the intended final stack pointer value in a register x14 which we store on the stack. AWS Graviton 3: put_hevc_qpel_hv4_8_c: 386.0 put_hevc_qpel_hv4_8_neon: 125.7 put_hevc_qpel_hv4_8_i8mm: 83.2 put_hevc_qpel_hv6_8_c: 749.0 put_hevc_qpel_hv6_8_neon: 207.0 put_hevc_qpel_hv6_8_i8mm: 166.0 put_hevc_qpel_hv8_8_c: 1305.2 put_hevc_qpel_hv8_8_neon: 216.5 put_hevc_qpel_hv8_8_i8mm: 213.0 put_hevc_qpel_hv12_8_c: 2570.5 put_hevc_qpel_hv12_8_neon: 480.0 put_hevc_qpel_hv12_8_i8mm: 398.2 put_hevc_qpel_hv16_8_c: 4158.7 put_hevc_qpel_hv16_8_neon: 659.7 put_hevc_qpel_hv16_8_i8mm: 593.5 put_hevc_qpel_hv24_8_c: 8626.7 put_hevc_qpel_hv24_8_neon: 1653.5 put_hevc_qpel_hv24_8_i8mm: 1398.7 put_hevc_qpel_hv32_8_c: 14646.0 put_hevc_qpel_hv32_8_neon: 2566.2 put_hevc_qpel_hv32_8_i8mm: 2287.5 put_hevc_qpel_hv48_8_c: 31072.5 put_hevc_qpel_hv48_8_neon: 6228.5 put_hevc_qpel_hv48_8_i8mm: 5291.0 put_hevc_qpel_hv64_8_c: 53847.2 put_hevc_qpel_hv64_8_neon: 9856.7 put_hevc_qpel_hv64_8_i8mm: 8831.0 Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:05:55 +02:00
Martin Storsjö	20c38f4b8d	aarch64: hevc: Reorder qpel_hv functions to prepare for templating This is a pure reordering of code without changing anything in the individual functions. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:05:50 +02:00
Martin Storsjö	4f71e4ebf2	aarch64: hevc: Deduplicate the hevc_put_hevc_qpel_uni_w_hv*_8_end_neon functions The hv32 and hv64 functions were identical - both loop and process 16 pixels at a time. The hv16 function was near identical, except for the outer loop (and using sp instead of a separate register). Given the size of these functions, the extra cost of the outer loop is negligible, so use the same function for hv16 as well. This removes over 200 lines of duplicated assembly, and over 4 KB of binary size. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:05:40 +02:00
Martin Storsjö	4063e50eec	aarch64: hevc: Split the qpel_*_hv functions into two parts The first horizontal filter can use either i8mm or plain neon versions, while the second part is a pure neon implementation. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:05:29 +02:00
Martin Storsjö	ad01d06f91	aarch64: hevc: Implement a neon version of hevc_qpel_uni_w_h*_8 AWS Graviton 3: put_hevc_qpel_uni_w_h4_8_c: 159.0 put_hevc_qpel_uni_w_h4_8_neon: 64.2 put_hevc_qpel_uni_w_h4_8_i8mm: 40.0 put_hevc_qpel_uni_w_h6_8_c: 344.7 put_hevc_qpel_uni_w_h6_8_neon: 114.5 put_hevc_qpel_uni_w_h6_8_i8mm: 82.0 put_hevc_qpel_uni_w_h8_8_c: 596.2 put_hevc_qpel_uni_w_h8_8_neon: 132.2 put_hevc_qpel_uni_w_h8_8_i8mm: 106.0 put_hevc_qpel_uni_w_h12_8_c: 1325.0 put_hevc_qpel_uni_w_h12_8_neon: 299.0 put_hevc_qpel_uni_w_h12_8_i8mm: 211.5 put_hevc_qpel_uni_w_h16_8_c: 2300.0 put_hevc_qpel_uni_w_h16_8_neon: 422.0 put_hevc_qpel_uni_w_h16_8_i8mm: 286.2 put_hevc_qpel_uni_w_h24_8_c: 5059.0 put_hevc_qpel_uni_w_h24_8_neon: 912.2 put_hevc_qpel_uni_w_h24_8_i8mm: 664.2 put_hevc_qpel_uni_w_h32_8_c: 9198.2 put_hevc_qpel_uni_w_h32_8_neon: 1638.2 put_hevc_qpel_uni_w_h32_8_i8mm: 1033.7 put_hevc_qpel_uni_w_h48_8_c: 20754.7 put_hevc_qpel_uni_w_h48_8_neon: 3633.7 put_hevc_qpel_uni_w_h48_8_i8mm: 2300.7 put_hevc_qpel_uni_w_h64_8_c: 36854.7 put_hevc_qpel_uni_w_h64_8_neon: 6435.7 put_hevc_qpel_uni_w_h64_8_i8mm: 4039.2 Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:03:18 +02:00
Martin Storsjö	de23b384fd	aarch64: hevc: Produce epel_bi_hv functions for both neon and i8mm In addition to just templating, this contains one change to ff_hevc_put_hevc_epel_bi_hv32_8, by setting the w6 register which ff_hevc_put_hevc_epel_h32_8_neon requires. AWS Graviton 3: put_hevc_epel_bi_hv4_8_c: 176.5 put_hevc_epel_bi_hv4_8_neon: 62.0 put_hevc_epel_bi_hv4_8_i8mm: 58.0 put_hevc_epel_bi_hv6_8_c: 343.7 put_hevc_epel_bi_hv6_8_neon: 109.7 put_hevc_epel_bi_hv6_8_i8mm: 105.7 put_hevc_epel_bi_hv8_8_c: 536.0 put_hevc_epel_bi_hv8_8_neon: 112.7 put_hevc_epel_bi_hv8_8_i8mm: 111.7 put_hevc_epel_bi_hv12_8_c: 1107.7 put_hevc_epel_bi_hv12_8_neon: 254.7 put_hevc_epel_bi_hv12_8_i8mm: 239.0 put_hevc_epel_bi_hv16_8_c: 1927.7 put_hevc_epel_bi_hv16_8_neon: 356.2 put_hevc_epel_bi_hv16_8_i8mm: 334.2 put_hevc_epel_bi_hv24_8_c: 4195.2 put_hevc_epel_bi_hv24_8_neon: 736.7 put_hevc_epel_bi_hv24_8_i8mm: 715.5 put_hevc_epel_bi_hv32_8_c: 7280.5 put_hevc_epel_bi_hv32_8_neon: 1287.7 put_hevc_epel_bi_hv32_8_i8mm: 1162.2 put_hevc_epel_bi_hv48_8_c: 16857.7 put_hevc_epel_bi_hv48_8_neon: 2836.2 put_hevc_epel_bi_hv48_8_i8mm: 2908.5 put_hevc_epel_bi_hv64_8_c: 29248.2 put_hevc_epel_bi_hv64_8_neon: 5051.7 put_hevc_epel_bi_hv64_8_i8mm: 4491.5 Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 09:03:16 +02:00
Martin Storsjö	96e5adda9f	aarch64: hevc: Produce epel_uni_w_hv functions for both neon and i8mm AWS Graviton 3: put_hevc_epel_uni_w_hv4_8_c: 191.2 put_hevc_epel_uni_w_hv4_8_neon: 87.7 put_hevc_epel_uni_w_hv4_8_i8mm: 83.2 put_hevc_epel_uni_w_hv6_8_c: 349.5 put_hevc_epel_uni_w_hv6_8_neon: 153.0 put_hevc_epel_uni_w_hv6_8_i8mm: 148.5 put_hevc_epel_uni_w_hv8_8_c: 581.2 put_hevc_epel_uni_w_hv8_8_neon: 166.7 put_hevc_epel_uni_w_hv8_8_i8mm: 163.5 put_hevc_epel_uni_w_hv12_8_c: 1230.0 put_hevc_epel_uni_w_hv12_8_neon: 387.7 put_hevc_epel_uni_w_hv12_8_i8mm: 370.2 put_hevc_epel_uni_w_hv16_8_c: 2003.2 put_hevc_epel_uni_w_hv16_8_neon: 501.5 put_hevc_epel_uni_w_hv16_8_i8mm: 490.2 put_hevc_epel_uni_w_hv24_8_c: 4448.7 put_hevc_epel_uni_w_hv24_8_neon: 1092.2 put_hevc_epel_uni_w_hv24_8_i8mm: 1069.7 put_hevc_epel_uni_w_hv32_8_c: 7817.2 put_hevc_epel_uni_w_hv32_8_neon: 1916.2 put_hevc_epel_uni_w_hv32_8_i8mm: 1829.5 put_hevc_epel_uni_w_hv48_8_c: 16728.2 put_hevc_epel_uni_w_hv48_8_neon: 4263.7 put_hevc_epel_uni_w_hv48_8_i8mm: 4342.7 put_hevc_epel_uni_w_hv64_8_c: 29563.2 put_hevc_epel_uni_w_hv64_8_neon: 7474.2 put_hevc_epel_uni_w_hv64_8_i8mm: 7128.5 Signed-off-by: Martin Storsjö <martin@martin.st>	2024-03-26 08:59:58 +02:00
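
Several of the fuzzing fixes listed above (wavarc, rtv1, qoadec) address signed integer overflow, which is undefined behaviour in C. The following is a minimal sketch of two common ways to avoid it: widening to 64 bits, or doing the arithmetic in unsigned types where wraparound is well defined. The helper names mul_wide and add_wrap are hypothetical and are not taken from the FFmpeg sources; the actual patches may use a different approach. The operands in the comments are the ones quoted in the commit messages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiply two ints in 64-bit precision so the
 * product (e.g. 65312 * 34078) cannot overflow the intermediate type. */
static int64_t mul_wide(int a, int b)
{
    return (int64_t)a * b;
}

/* Hypothetical helper: add with wraparound by converting to unsigned
 * first; unsigned overflow is defined as arithmetic modulo 2^32, so
 * -2147483648 + -25122315 no longer triggers undefined behaviour. */
static uint32_t add_wrap(int32_t a, int32_t b)
{
    return (uint32_t)a + (uint32_t)b;
}
```

Which of the two fits depends on the codec: widening keeps the mathematically exact value, while the unsigned form reproduces the wrapped result that a decoder may need for bit-exact output.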


49678 commits
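
The two hevc_ps commits in the log (6bf17136a2 and 547c920193) deal with comparing parameter sets once a struct holds a pointer to dynamically allocated memory: memcmp over the struct then compares pointer values, not the data they point to. The sketch below illustrates the general problem and a field-wise comparison under stated assumptions. The ParamSet type and param_set_equal function are hypothetical stand-ins, not the actual FFmpeg structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parameter set: 'data' points to a separately
 * allocated buffer of 'size' bytes. */
typedef struct ParamSet {
    int      id;
    uint8_t *data;
    size_t   size;
} ParamSet;

/* Returns 1 if both sets hold the same content, 0 otherwise.
 * Compares the pointed-to bytes instead of the pointer values. */
static int param_set_equal(const ParamSet *a, const ParamSet *b)
{
    if (a->id != b->id || a->size != b->size)
        return 0;
    /* Skip memcmp for empty buffers, where data may be NULL. */
    return a->size == 0 || !memcmp(a->data, b->data, a->size);
}
```

Two sets with identical contents in different buffers compare equal with param_set_equal, whereas memcmp over the whole struct reports a difference because the data pointers differ.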