ffmpeg/libavcodec/aarch64
Georgii Zagoruiko 1c385023aa aarch64/vvc: Optimisations of put_chroma_v() functions for 10/12-bit

Apple M4:
put_chroma_v_10_2x2_c:                                   5.8 ( 1.00x)
put_chroma_v_10_4x4_c:                                   9.0 ( 1.00x)
put_chroma_v_10_4x4_neon:                                1.7 ( 5.29x)
put_chroma_v_10_8x8_c:                                  22.1 ( 1.00x)
put_chroma_v_10_8x8_neon:                                5.8 ( 3.79x)
put_chroma_v_10_16x16_c:                                56.3 ( 1.00x)
put_chroma_v_10_16x16_neon:                             21.2 ( 2.66x)
put_chroma_v_10_32x32_c:                               181.6 ( 1.00x)
put_chroma_v_10_32x32_neon:                             86.9 ( 2.09x)
put_chroma_v_10_64x64_c:                               680.3 ( 1.00x)
put_chroma_v_10_64x64_neon:                            337.4 ( 2.02x)
put_chroma_v_10_128x128_c:                            2567.3 ( 1.00x)
put_chroma_v_10_128x128_neon:                         1374.8 ( 1.87x)
put_chroma_v_12_2x2_c:                                   6.4 ( 1.00x)
put_chroma_v_12_4x4_c:                                   8.2 ( 1.00x)
put_chroma_v_12_4x4_neon:                                1.5 ( 5.56x)
put_chroma_v_12_8x8_c:                                  18.9 ( 1.00x)
put_chroma_v_12_8x8_neon:                                5.7 ( 3.29x)
put_chroma_v_12_16x16_c:                                52.6 ( 1.00x)
put_chroma_v_12_16x16_neon:                             19.9 ( 2.65x)
put_chroma_v_12_32x32_c:                               185.7 ( 1.00x)
put_chroma_v_12_32x32_neon:                             81.9 ( 2.27x)
put_chroma_v_12_64x64_c:                               661.8 ( 1.00x)
put_chroma_v_12_64x64_neon:                            342.1 ( 1.93x)
put_chroma_v_12_128x128_c:                            2547.8 ( 1.00x)
put_chroma_v_12_128x128_neon:                         1368.0 ( 1.86x)

RPi4:
put_chroma_v_10_2x2_c:                                  64.8 ( 1.00x)
put_chroma_v_10_4x4_c:                                 157.2 ( 1.00x)
put_chroma_v_10_4x4_neon:                               39.7 ( 3.96x)
put_chroma_v_10_8x8_c:                                 562.1 ( 1.00x)
put_chroma_v_10_8x8_neon:                               98.8 ( 5.69x)
put_chroma_v_10_16x16_c:                              1170.7 ( 1.00x)
put_chroma_v_10_16x16_neon:                            380.7 ( 3.07x)
put_chroma_v_10_32x32_c:                              3696.6 ( 1.00x)
put_chroma_v_10_32x32_neon:                           1723.8 ( 2.14x)
put_chroma_v_10_64x64_c:                             13170.9 ( 1.00x)
put_chroma_v_10_64x64_neon:                           7284.1 ( 1.81x)
put_chroma_v_10_128x128_c:                           46068.3 ( 1.00x)
put_chroma_v_10_128x128_neon:                        27219.5 ( 1.69x)
put_chroma_v_12_2x2_c:                                  63.8 ( 1.00x)
put_chroma_v_12_4x4_c:                                 156.5 ( 1.00x)
put_chroma_v_12_4x4_neon:                               39.3 ( 3.98x)
put_chroma_v_12_8x8_c:                                 560.9 ( 1.00x)
put_chroma_v_12_8x8_neon:                               98.7 ( 5.68x)
put_chroma_v_12_16x16_c:                              1169.9 ( 1.00x)
put_chroma_v_12_16x16_neon:                            380.8 ( 3.07x)
put_chroma_v_12_32x32_c:                              3693.9 ( 1.00x)
put_chroma_v_12_32x32_neon:                           1728.4 ( 2.14x)
put_chroma_v_12_64x64_c:                             13170.9 ( 1.00x)
put_chroma_v_12_64x64_neon:                           7284.9 ( 1.81x)
put_chroma_v_12_128x128_c:                           46068.0 ( 1.00x)
put_chroma_v_12_128x128_neon:                        27224.6 ( 1.69x)
2026-03-27 13:42:50 +00:00
h26x aarch64: Add PAC sign/validation of the link register 2026-03-20 13:16:06 +02:00
vvc aarch64/vvc: Optimisations of put_chroma_v() functions for 10/12-bit 2026-03-27 13:42:50 +00:00
aacencdsp_init.c avcodec/aarch64/aacencdsp: NEON implementation 2025-01-28 10:44:40 +02:00
aacencdsp_neon.S avcodec/aarch64/aacencdsp: NEON implementation 2025-01-28 10:44:40 +02:00
aacpsdsp_init_aarch64.c
aacpsdsp_neon.S
ac3dsp_init_aarch64.c
ac3dsp_neon.S avcodec/aarch64/ac3dsp_neon.S: Optimize ac3_sum_square_butterfly_int32_neon 2025-03-02 01:17:53 +02:00
cabac.h
fdct.h
fdctdsp_init_aarch64.c
fdctdsp_neon.S
fmtconvert_init.c
fmtconvert_neon.S
h264chroma_init_aarch64.c
h264cmc_neon.S
h264dsp_init_aarch64.c avcodec/h264dsp: Remove redundant h264 from H264DSPCtx member names 2026-01-25 22:53:25 +01:00
h264dsp_neon.S
h264idct_neon.S
h264pred_init.c aarch64/h264pred: disable inefficient functions 2026-02-04 09:06:37 +00:00
h264pred_neon.S lavc/aarch64: Fix addp overflow in ff_pred16x16_plane_neon_10 2025-10-24 15:32:35 +00:00
h264qpel_init_aarch64.c
h264qpel_neon.S
hevcdsp_deblock_neon.S aarch64: hevcdsp: Make returns match the call site 2026-03-17 20:37:53 +00:00
hevcdsp_dequant_neon.S lavc/hevc: add aarch64 neon for 12-bit dequant 2026-01-25 06:55:26 +00:00
hevcdsp_idct_neon.S aarch64/hevcdsp_idct_neon: Add implementation for idct dc 12 2025-03-04 17:01:58 +08:00
hevcdsp_init_aarch64.c lavc/hevc: reorder aarch64 NEON pel function assignments 2026-03-13 21:43:37 +00:00
hpeldsp_init_aarch64.c
hpeldsp_neon.S aarch64/hpeldsp_neon: fix out-of-bounds read 2026-01-04 03:22:55 +00:00
huffyuvdsp_init_aarch64.c libavcodec/huffyuvdsp: Add NEON optimization for the add_int16 function 2026-03-04 22:31:19 +00:00
huffyuvdsp_neon.S libavcodec/huffyuvdsp: Add NEON optimization for the add_int16 function 2026-03-04 22:31:19 +00:00
idct.h
idctdsp_init_aarch64.c
idctdsp_neon.S
Makefile libavcodec/huffyuvdsp: Add NEON optimization for the add_int16 function 2026-03-04 22:31:19 +00:00
me_cmp_init_aarch64.c avcodec/mpegvideoenc: Add MPVEncContext 2025-03-26 04:08:33 +01:00
me_cmp_neon.S aarch64: Add PAC sign/validation of the link register 2026-03-20 13:16:06 +02:00
mpegaudiodsp_init.c
mpegaudiodsp_neon.S
mpegvideoencdsp_init.c avcodec/mpegvideoencdsp: convert stride parameters from int to ptrdiff_t 2024-09-01 13:42:30 +02:00
mpegvideoencdsp_neon.S avcodec/mpegvideoencdsp: convert stride parameters from int to ptrdiff_t 2024-09-01 13:42:30 +02:00
neon.S
neontest.c
opusdsp_init.c lavc/opus*: move to opus/ subdir 2024-09-02 11:56:53 +02:00
opusdsp_neon.S avcodec/aarch64/opusdsp_neon: Simplify opus_postfilter_neon 2025-02-10 14:55:16 +02:00
pixblockdsp_init_aarch64.c avcodec/pixblockdsp: be consistent about restrict use in ff_{get,diff}_pixels 2025-10-25 01:01:15 +02:00
pixblockdsp_neon.S
pngdsp_init.c avcodec/aarch64: add pngdsp 2026-02-04 12:05:35 +08:00
pngdsp_neon.S avcodec/aarch64: add pngdsp 2026-02-04 12:05:35 +08:00
rv40dsp_init_aarch64.c
sbrdsp_init_aarch64.c
sbrdsp_neon.S
simple_idct_neon.S
synth_filter_init.c
synth_filter_neon.S
vc1dsp_init_aarch64.c
vc1dsp_neon.S
videodsp.S
videodsp_init.c
vorbisdsp_init.c
vorbisdsp_neon.S
vp8dsp.h
vp8dsp_init_aarch64.c
vp8dsp_neon.S
vp9dsp_init.h
vp9dsp_init_10bpp_aarch64.c
vp9dsp_init_12bpp_aarch64.c
vp9dsp_init_16bpp_aarch64_template.c
vp9dsp_init_aarch64.c
vp9itxfm_16bpp_neon.S
vp9itxfm_neon.S
vp9lpf_16bpp_neon.S
vp9lpf_neon.S
vp9mc_16bpp_neon.S
vp9mc_aarch64.S
vp9mc_neon.S aarch64: vp9mc: Load only 12 pixels in the 4 pixel wide horizontal filter 2025-01-03 17:53:46 -05:00