ffmpeg/libavutil/x86
Andreas Rheinhardt e1782fb016 avutil/x86/pixelutils: Don't use mmx in 8x8 SAD
This function is exported, so it has to abide by the ABI
and has therefore issued emms since commit
5b85ca5317. Yet emms is
expensive, and using SSE2 instead improves performance.
Also avoid the initial zeroing and the last pointer
increment while at it.
This removes the last usage of mmx from libavutil*.

Old benchmarks:
sad_8x8_0_c:                                            13.2 ( 1.00x)
sad_8x8_0_mmxext:                                       27.8 ( 0.48x)
sad_8x8_1_c:                                            13.2 ( 1.00x)
sad_8x8_1_mmxext:                                       27.6 ( 0.48x)
sad_8x8_2_c:                                            13.3 ( 1.00x)
sad_8x8_2_mmxext:                                       27.6 ( 0.48x)

New benchmarks:
sad_8x8_0_c:                                            13.3 ( 1.00x)
sad_8x8_0_sse2:                                         11.7 ( 1.13x)
sad_8x8_1_c:                                            13.8 ( 1.00x)
sad_8x8_1_sse2:                                         11.6 ( 1.20x)
sad_8x8_2_c:                                            13.2 ( 1.00x)
sad_8x8_2_sse2:                                         11.8 ( 1.12x)

Note: Using two psadbw, or one psadbw and movhps, made no difference
in the benchmarks, so I chose the latter for its smaller code size.
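
The "one psadbw and movhps" variant can be sketched in C with SSE2 intrinsics. This is only an illustration of the idea, not the assembly from pixelutils.asm: the function names below are hypothetical, the accumulator is zeroed up front (which the actual commit says it avoids), and a scalar fallback is used when SSE2 is not available at compile time.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SAD_SKETCH_HAVE_SSE2 1
#else
#define SAD_SKETCH_HAVE_SSE2 0
#endif

/* Scalar reference: sum of absolute differences over an 8x8 block. */
static int sad_8x8_ref(const uint8_t *src1, ptrdiff_t stride1,
                       const uint8_t *src2, ptrdiff_t stride2)
{
    int sum = 0;
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}

#if SAD_SKETCH_HAVE_SSE2
/* Pack two 8-byte rows into one XMM register: an 8-byte load (movq)
 * fills the low half, movhps (_mm_loadh_pi) fills the high half. */
static __m128i load_two_rows(const uint8_t *p, ptrdiff_t stride)
{
    __m128i lo   = _mm_loadl_epi64((const __m128i *)p);
    __m128  both = _mm_loadh_pi(_mm_castsi128_ps(lo),
                                (const __m64 *)(p + stride));
    return _mm_castps_si128(both);
}

/* Hypothetical sketch: one psadbw (_mm_sad_epu8) per pair of rows. */
static int sad_8x8_sse2_sketch(const uint8_t *src1, ptrdiff_t stride1,
                               const uint8_t *src2, ptrdiff_t stride2)
{
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 8; y += 2) {
        __m128i a = load_two_rows(src1, stride1);
        __m128i b = load_two_rows(src2, stride2);
        /* psadbw leaves one partial sum in each 64-bit lane. */
        acc = _mm_add_epi32(acc, _mm_sad_epu8(a, b));
        src1 += 2 * stride1;
        src2 += 2 * stride2;
    }
    /* Fold the high lane's partial sum into the low lane. */
    acc = _mm_add_epi32(acc, _mm_unpackhi_epi64(acc, acc));
    return _mm_cvtsi128_si32(acc);
}
#endif

/* Dispatch at compile time; falls back to the scalar reference. */
static int sad_8x8(const uint8_t *src1, ptrdiff_t stride1,
                   const uint8_t *src2, ptrdiff_t stride2)
{
#if SAD_SKETCH_HAVE_SSE2
    return sad_8x8_sse2_sketch(src1, stride1, src2, stride2);
#else
    return sad_8x8_ref(src1, stride1, src2, stride2);
#endif
}
```

Since only XMM registers are touched, no emms is needed before returning, which is the point the commit message makes about the ABI.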

*: except if lavu provides avpriv_emms for other libraries

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-18 21:21:11 +02:00
aes.asm avutil/x86/aes: Only assemble iff HAVE_AESNI_EXTERNAL 2026-03-28 23:25:54 +01:00
aes_init.c avutil/x86/aes: remove a few branches 2025-04-10 12:02:34 -03:00
asm.h avutil/x86/asm: Remove wrong comment, rename FF_REG_sp 2025-11-18 20:41:13 +01:00
bswap.h lavu/x86: remove GCC 4.4- stuff 2024-06-13 21:16:16 +03:00
cpu.c avutil/cpu: add x86 CPU feature flag for clmul 2026-01-04 15:49:30 +01:00
cpu.h avutil/cpu: add x86 CPU feature flag for clmul 2026-01-04 15:49:30 +01:00
cpuid.asm libavutil: include assembly with full path from source root 2022-02-08 10:42:26 +01:00
crc.asm avutil/crc: Use x86 clmul for CRC when available 2026-01-04 15:49:30 +01:00
crc.h avutil/crc: refactor helper functions to separate header file 2026-03-11 14:03:36 +00:00
emms.asm libavutil: include assembly with full path from source root 2022-02-08 10:42:26 +01:00
fixed_dsp.asm libavutil: include assembly with full path from source root 2022-02-08 10:42:26 +01:00
fixed_dsp_init.c configure: Remove av_restrict 2024-03-15 12:51:15 +01:00
float_dsp.asm x86/float_dsp: add SSE2 and AVX versions of scalarproduct_double 2024-06-03 22:14:55 -03:00
float_dsp_init.c x86/float_dsp: add SSE2 and AVX versions of scalarproduct_double 2024-06-03 22:14:55 -03:00
imgutils.asm Merge commit '6be7944ee2' 2017-03-23 18:05:27 -03:00
imgutils_init.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
intmath.h avutil/intmath: use AV_HAS_BUILTIN to detect builtin availability 2025-06-12 14:17:37 +03:00
intreadwrite.h x86/intreadwrite: add SSE2 optimized AV_COPY128U 2024-07-29 23:17:52 -03:00
lls.asm x86: replace explicit REP_RETs with RETs 2023-02-01 04:23:55 +01:00
lls_init.c Include attributes.h directly 2021-04-19 14:34:10 +02:00
Makefile avutil/x86/aes: Only assemble iff HAVE_AESNI_EXTERNAL 2026-03-28 23:25:54 +01:00
pixelutils.asm avutil/x86/pixelutils: Don't use mmx in 8x8 SAD 2026-04-18 21:21:11 +02:00
pixelutils.h avutil/x86/pixelutils: Don't use mmx in 8x8 SAD 2026-04-18 21:21:11 +02:00
timer.h Merge commit 'd1a6cb195f' 2015-07-09 12:28:09 +02:00
tx_float.asm Revert "avutil/tx_template: extend to 2M" 2025-12-13 15:14:38 +00:00
tx_float_init.c Revert "avutil/tx_template: extend to 2M" 2025-12-13 15:14:38 +00:00
w64xmmtest.h all: Add missing header guards 2016-01-28 19:49:48 -08:00
x86inc.asm avutil/x86/x86inc: Use parentheses in has_epilogue 2025-11-30 00:15:43 +01:00
x86util.asm avutil/x86/x86util: tone down NASM workaround and use info section 2026-03-30 19:46:53 +02:00