ffmpeg/libavfilter/x86
Jun Zhao 91ae6d10ab lavfi/nlmeans: add aarch64 neon for compute_weights_line
Implement NEON optimization for compute_weights_line.

Also update the function signature to use ptrdiff_t for stack arguments
(max_meaningful_diff, startx, endx). This is done to unify the stack
layout between Apple platforms (which pack 32-bit stack arguments tightly)
and the generic AAPCS64 ABI (which requires 8-byte stack slots for 32-bit
arguments). Using ptrdiff_t ensures 8-byte slots are used on all AArch64
platforms, avoiding ABI mismatches with the assembly implementation.
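
The stack-layout point above can be illustrated with a small C sketch. It is not the FFmpeg source: the `weights_line_fn` typedef and the `stack_arg_offset` helper are hypothetical names, and only `max_meaningful_diff`, `startx` and `endx` come from the commit message. It assumes eight pointer arguments occupy x0-x7, so the last three arguments are passed on the stack, and that `ptrdiff_t` is 8 bytes wide, as on LP64 AArch64 targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prototype shape: eight pointer arguments fill the eight
 * AArch64 integer argument registers (x0-x7), so the three trailing
 * arguments are passed on the stack. Pointer parameter names are
 * illustrative assumptions. */
typedef void (*weights_line_fn)(const uint32_t *iia, const uint32_t *iib,
                                const uint32_t *iid, const uint32_t *iie,
                                const uint8_t *src, float *total_weight,
                                float *sum, const float *weight_lut,
                                ptrdiff_t max_meaningful_diff,
                                ptrdiff_t startx, ptrdiff_t endx);

/* Byte offset of the idx-th stack argument when all stack arguments
 * have the same size arg_size.
 * packed != 0 models the Apple convention described above (arguments
 * keep their natural size);
 * packed == 0 models generic AAPCS64 (each argument is rounded up to
 * an 8-byte slot). This is a simplified model, not a full ABI
 * implementation. */
size_t stack_arg_offset(size_t idx, size_t arg_size, int packed)
{
    size_t slot = packed ? arg_size : (arg_size + 7) / 8 * 8;
    return idx * slot;
}
```

Under this model a third 4-byte stack argument sits at offset 8 with tight packing but at offset 16 with 8-byte slots, so one assembly routine cannot load it from a single fixed offset. With 8-byte arguments both conventions give offset 16, which is the unification the commit describes.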

The x86 AVX2 prototype is updated to match the new signature.

Performance benchmark (AArch64) on macOS (Apple M4):
./tests/checkasm/checkasm --test=vf_nlmeans --bench
compute_weights_line_c:     151.1 ( 1.00x)
compute_weights_line_neon:  62.6 ( 2.42x)

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-01-09 16:10:10 +00:00
af_afir.asm avfilter/x86/af_afir: add FMA3 SIMD 2023-09-17 11:11:24 +02:00
af_afir_init.c avfilter/x86/af_afir: add FMA3 SIMD 2023-09-17 11:11:24 +02:00
af_anlmdn.asm
af_anlmdn_init.c
af_volume.asm
af_volume_init.c
avf_showcqt.asm
avf_showcqt_init.c
colorspacedsp.asm
colorspacedsp_init.c All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if 2025-11-28 19:52:51 +01:00
f_ebur128.asm avfilter/x86/f_ebur128: replace AVX2 instruction with AVX equivalent 2025-06-22 09:31:44 -03:00
f_ebur128_init.c All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if 2025-11-28 19:52:51 +01:00
Makefile avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
scene_sad.asm avfilter/x86/scene_sad: add high bit depth AVX2/AVX512 version 2025-07-17 12:26:06 +02:00
scene_sad_init.c avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
vf_atadenoise.asm
vf_atadenoise_init.c All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if 2025-11-28 19:52:51 +01:00
vf_blackdetect.asm avfilter/x86/vf_blackdetect: add missing preprocessor check 2025-07-18 15:17:02 -03:00
vf_blackdetect_init.c avfilter/x86/vf_blackdetect_init: don't enable the ASM functions on targets where it's known they will be slower 2025-07-18 13:05:44 -03:00
vf_blend.asm
vf_blend_init.c avfilter/blend: put slice parameters to a single struct 2024-05-14 21:07:37 +02:00
vf_bwdif.asm avfilter/x86/vf_bwdif: use the correct preprocessor check 2025-08-03 19:26:18 -03:00
vf_bwdif_init.c All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if 2025-11-28 19:52:51 +01:00
vf_colordetect.asm avfilter/vf_colordetect: detect fully opaque alpha planes 2025-08-18 18:50:00 +00:00
vf_colordetect_init.c avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
vf_convolution.asm
vf_convolution_init.c
vf_eq.asm
vf_eq_init.c avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
vf_framerate.asm
vf_framerate_init.c
vf_fspp.asm avfilter/x86/vf_fspp: Avoid stack on x64 2025-11-17 12:18:12 +01:00
vf_fspp_init.c avfilter/vf_fsppdsp: Constify 2025-11-17 12:18:12 +01:00
vf_gblur.asm
vf_gblur_init.c avutil/common: Don't auto-include mem.h 2024-03-31 00:08:43 +01:00
vf_gradfun.asm avfilter/x86/vf_gradfun: Remove MMXEXT func overridden by SSSE3 2025-09-26 06:21:35 +02:00
vf_gradfun_init.c avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
vf_hflip.asm
vf_hflip_init.c
vf_hqdn3d.asm all: fix typos found by codespell 2025-08-03 13:48:47 +02:00
vf_hqdn3d_init.c avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
vf_idetdsp.asm avfilter/x86/vf_idetdsp: add AVX2 and AVX512 implementations 2025-09-21 11:02:41 +00:00
vf_idetdsp_init.c avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
vf_interlace.asm
vf_limiter.asm
vf_limiter_init.c
vf_lut3d.asm
vf_lut3d_init.c
vf_maskedclamp.asm
vf_maskedclamp_init.c
vf_maskedmerge.asm
vf_maskedmerge_init.c
vf_nlmeans.asm
vf_nlmeans_init.c lavfi/nlmeans: add aarch64 neon for compute_weights_line 2026-01-09 16:10:10 +00:00
vf_noise.c avfilter/x86/vf_noise: Use unaligned access 2025-12-12 19:25:21 +00:00
vf_overlay.asm x86: Avoid using 'd' as an argument name 2024-03-24 14:53:57 +01:00
vf_overlay_init.c avfilter/x86/vf_overlay: simplify function signature 2025-09-02 17:06:25 +02:00
vf_pp7.asm
vf_pp7_init.c
vf_psnr.asm
vf_psnr_init.c
vf_pullup.asm avfilter/x86/vf_pullup: Port pullup functions to SSE2, SSSE3 2025-10-15 19:43:37 +02:00
vf_pullup_init.c avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
vf_removegrain.asm
vf_removegrain_init.c
vf_spp.c avfilter/x86/vf_spp: Fix comment 2025-11-17 12:18:12 +01:00
vf_ssim.asm avfilter/vf_ssim: Fix x86 assembly code for SSIM calculation 2023-08-21 17:04:51 +02:00
vf_ssim_init.c All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if 2025-11-28 19:52:51 +01:00
vf_stereo3d.asm
vf_stereo3d_init.c
vf_threshold.asm
vf_threshold_init.c
vf_tinterlace_init.c
vf_transpose.asm
vf_transpose_init.c
vf_v360.asm
vf_v360_init.c
vf_w3fdif.asm
vf_w3fdif_init.c All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if 2025-11-28 19:52:51 +01:00
vf_yadif.asm
vf_yadif_init.c
yadif-10.asm
yadif-16.asm