ffmpeg/libavutil/x86
James Almer a039726c2a avutil/x86/aes: remove a few branches
The rounds value is constant for a given key and can only be one of three
hardcoded values, so instead of checking it on every loop iteration, split
the function into three separate implementations, one per value.
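
The idea can be sketched in C as follows. This is only an illustration of the
specialization pattern, not the actual assembly from aes.asm: toy_round,
toy_crypt_generic, TOY_CRYPT and toy_select are hypothetical names, and the
round function is a stand-in rather than real AES. The values 10, 12 and 14
are the standard AES round counts for 128-, 192- and 256-bit keys.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one cipher round; this is NOT real AES. */
static inline uint32_t toy_round(uint32_t state, uint32_t key)
{
    return ((state << 5) | (state >> 27)) ^ key;
}

/* Generic form: the round count is re-examined for every block. */
static void toy_crypt_generic(uint32_t *dst, const uint32_t *src, int count,
                              const uint32_t *keys, int rounds)
{
    for (int i = 0; i < count; i++) {
        uint32_t s = src[i];
        int r = 0;
        if (rounds > 12) {          /* extra rounds for the largest key */
            s = toy_round(s, keys[r++]);
            s = toy_round(s, keys[r++]);
        }
        if (rounds > 10) {          /* extra rounds for the two larger keys */
            s = toy_round(s, keys[r++]);
            s = toy_round(s, keys[r++]);
        }
        for (int j = 0; j < 10; j++)
            s = toy_round(s, keys[r++]);
        dst[i] = s;
    }
}

/* Specialized form: one function per round count, so the count is a
 * compile-time constant and the per-block checks disappear. */
#define TOY_CRYPT(ROUNDS)                                                  \
static void toy_crypt_##ROUNDS(uint32_t *dst, const uint32_t *src,         \
                               int count, const uint32_t *keys)            \
{                                                                          \
    for (int i = 0; i < count; i++) {                                      \
        uint32_t s = src[i];                                               \
        for (int r = 0; r < ROUNDS; r++)                                   \
            s = toy_round(s, keys[r]);                                     \
        dst[i] = s;                                                        \
    }                                                                      \
}

TOY_CRYPT(10)
TOY_CRYPT(12)
TOY_CRYPT(14)

typedef void (*toy_crypt_fn)(uint32_t *dst, const uint32_t *src,
                             int count, const uint32_t *keys);

/* Pick the implementation once, at init time, instead of once per block. */
static toy_crypt_fn toy_select(int rounds)
{
    switch (rounds) {
    case 10: return toy_crypt_10;
    case 12: return toy_crypt_12;
    case 14: return toy_crypt_14;
    default: return NULL;   /* unsupported round count */
    }
}
```

In this sketch the choice moves to a single switch executed once, which
mirrors the approach described above of splitting one function into a
variant per round count.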

Before:
aes_decrypt_128_aesni:                                  93.8 (47.58x)
aes_decrypt_192_aesni:                                 106.9 (49.30x)
aes_decrypt_256_aesni:                                 109.8 (56.50x)
aes_encrypt_128_aesni:                                  93.2 (47.70x)
aes_encrypt_192_aesni:                                 111.1 (48.36x)
aes_encrypt_256_aesni:                                 113.6 (56.27x)

After:
aes_decrypt_128_aesni:                                  71.5 (63.31x)
aes_decrypt_192_aesni:                                  96.8 (55.64x)
aes_decrypt_256_aesni:                                 106.1 (58.51x)
aes_encrypt_128_aesni:                                  81.3 (55.92x)
aes_encrypt_192_aesni:                                  91.2 (59.78x)
aes_encrypt_256_aesni:                                 109.0 (58.26x)

Signed-off-by: James Almer <jamrial@gmail.com>
2025-04-10 12:02:34 -03:00
aes.asm avutil/x86/aes: remove a few branches 2025-04-10 12:02:34 -03:00
aes_init.c avutil/x86/aes: remove a few branches 2025-04-10 12:02:34 -03:00
asm.h asm: FF_-prefix internal macros used in inline assembly 2016-06-27 17:21:18 +02:00
bswap.h lavu/x86: remove GCC 4.4- stuff 2024-06-13 21:16:16 +03:00
cpu.c avutil/cpu: add AVX512 Icelake flag 2022-03-10 16:45:48 -03:00
cpu.h avutil/cpu: add AVX512 Icelake flag 2022-03-10 16:45:48 -03:00
cpuid.asm libavutil: include assembly with full path from source root 2022-02-08 10:42:26 +01:00
emms.asm libavutil: include assembly with full path from source root 2022-02-08 10:42:26 +01:00
fixed_dsp.asm libavutil: include assembly with full path from source root 2022-02-08 10:42:26 +01:00
fixed_dsp_init.c configure: Remove av_restrict 2024-03-15 12:51:15 +01:00
float_dsp.asm x86/float_dsp: add SSE2 and AVX versions of scalarproduct_double 2024-06-03 22:14:55 -03:00
float_dsp_init.c x86/float_dsp: add SSE2 and AVX versions of scalarproduct_double 2024-06-03 22:14:55 -03:00
imgutils.asm Merge commit '6be7944ee2' 2017-03-23 18:05:27 -03:00
imgutils_init.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
intmath.h avutil/common: assert that bit position in av_zero_extend is valid 2024-06-13 20:36:09 -03:00
intreadwrite.h x86/intreadwrite: add SSE2 optimized AV_COPY128U 2024-07-29 23:17:52 -03:00
lls.asm x86: replace explicit REP_RETs with RETs 2023-02-01 04:23:55 +01:00
lls_init.c Include attributes.h directly 2021-04-19 14:34:10 +02:00
Makefile lavu/aes: add x86 AESNI optimizations 2025-04-05 20:46:40 -03:00
pixelutils.asm avutil/x86/pixelutils: Empty MMX state in ff_pixelutils_sad_8x8_mmxext 2023-11-04 01:26:03 +01:00
pixelutils.h
pixelutils_init.c avutil/x86/pixelutils: Remove obsolete MMX(EXT) functions 2022-06-22 13:36:44 +02:00
timer.h
tx_float.asm x86/tx_float: remove HAVE_AVX2_EXTERNAL checks 2024-10-06 01:32:49 +02:00
tx_float_init.c x86/tx_float: remove HAVE_AVX2_EXTERNAL checks 2024-10-06 01:32:49 +02:00
w64xmmtest.h all: Add missing header guards 2016-01-28 19:49:48 -08:00
x86inc.asm x86: Update x86inc.asm 2024-03-24 14:53:57 +01:00
x86util.asm avutil/x86util: Fix broken pre-SSE4.1 PMINSD emulation 2024-03-17 13:52:27 +01:00