ffmpeg/libswscale
Swinney, Jonathan 0d7caa5b09 swscale/aarch64: add vscale specializations
This commit adds new code paths for vscale when filterSize is 2, 4, or
8. By using specialized code with unrolling to match the filterSize we
can improve performance.

On AWS c7g (Graviton 3, Neoverse V1) instances:
                                 before   after
yuv2yuvX_2_0_512_accurate_neon:  558.8    268.9
yuv2yuvX_4_0_512_accurate_neon:  637.5    434.9
yuv2yuvX_8_0_512_accurate_neon:  1144.8   806.2
yuv2yuvX_16_0_512_accurate_neon: 2080.5   1853.7

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-08-16 13:40:42 +03:00
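
The commit message above describes specializing vscale by filterSize so the inner loop can be unrolled. The following is a minimal plain-C sketch of that idea only; the actual commit implements it in AArch64 NEON assembly, which is not reproduced here. The function names (`vscale_generic`, `vscale_2`, `vscale_dispatch`) and the helper `clip_u8` are hypothetical, and the fixed-point layout (15-bit intermediate samples, 12-bit coefficients, 8-entry dither table, final shift by 19) is an assumption loosely modeled on the 8-bit C vertical scaler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp an int to the 0..255 range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Generic path: the inner loop runs filter_size times per output pixel. */
static void vscale_generic(const int16_t *filter, int filter_size,
                           const int16_t **src, uint8_t *dest, int dst_w,
                           const uint8_t *dither, int offset)
{
    for (int i = 0; i < dst_w; i++) {
        int val = dither[(i + offset) & 7] << 12;
        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];
        dest[i] = clip_u8(val >> 19);
    }
}

/* Specialized path for filter_size == 2: the inner loop is fully unrolled,
 * and the row pointers and coefficients are loaded once, outside the loop. */
static void vscale_2(const int16_t *filter, const int16_t **src,
                     uint8_t *dest, int dst_w,
                     const uint8_t *dither, int offset)
{
    const int16_t *s0 = src[0], *s1 = src[1];
    const int f0 = filter[0], f1 = filter[1];

    for (int i = 0; i < dst_w; i++) {
        int val = (dither[(i + offset) & 7] << 12) + s0[i] * f0 + s1[i] * f1;
        dest[i] = clip_u8(val >> 19);
    }
}

/* Hypothetical dispatcher: pick the specialized path when one exists. */
static void vscale_dispatch(const int16_t *filter, int filter_size,
                            const int16_t **src, uint8_t *dest, int dst_w,
                            const uint8_t *dither, int offset)
{
    assert(filter_size > 0);
    if (filter_size == 2)
        vscale_2(filter, src, dest, dst_w, dither, offset);
    else
        vscale_generic(filter, filter_size, src, dest, dst_w, dither, offset);
}
```

Both paths compute the same result; the specialized one simply removes the per-pixel loop over filter taps, which is the kind of saving the benchmark table reports for the 2, 4, and 8 tap cases.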
aarch64 swscale/aarch64: add vscale specializations 2022-08-16 13:40:42 +03:00
arm sws: rename SwsContext.swscale to convert_unscaled 2021-07-03 15:57:53 +02:00
ppc sws: rename SwsContext.swscale to convert_unscaled 2021-07-03 15:57:53 +02:00
tests swscale: introduce isSwappedChroma 2022-01-04 19:39:22 -06:00
x86 checkasm: updated tests for sw_scale 2022-08-16 13:40:42 +03:00
alphablend.c swscale/alphablend: Fix slice handling 2021-10-03 20:38:29 +02:00
bayer_template.c swscale: do not drop half of bits from 16bit bayer formats 2020-08-08 12:03:42 +02:00
gamma.c swscale: re-enable gamma 2015-09-04 19:00:20 -03:00
hscale.c avutil: Rename FF_CEIL_COMPAT to AV_CEIL_COMPAT 2016-01-27 16:36:46 +00:00
hscale_fast_bilinear.c sws: Move fast bilinear C code into seperate file 2014-07-19 05:36:26 +02:00
input.c swscale/input: add VUYA input support 2022-08-05 09:39:21 -03:00
libswscale.v build: Change structure of the linker version script templates 2016-05-29 16:43:11 +02:00
log2_tab.c lsws: duplicate ff_log2_tab 2014-08-12 20:52:21 +02:00
Makefile configure: always enable gnu_windres if available 2022-08-13 14:42:36 +02:00
options.c Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
output.c swscale/output: fix reading chroma values when generating vuya output 2022-08-08 09:39:33 -03:00
rgb2rgb.c swscale/rgb2rgb: Don't cast const away 2022-07-31 01:09:52 +02:00
rgb2rgb.h Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
rgb2rgb_template.c swscale/rgb2rgb_template: use shuffle macro on big-endian arches 2020-12-12 23:07:22 -05:00
slice.c Replace all occurences of av_mallocz_array() by av_calloc() 2021-09-20 01:03:52 +02:00
swscale.c all: Replace if (ARCH_FOO) checks by #if ARCH_FOO 2022-06-15 04:56:37 +02:00
swscale.h Keep including the full version.h when headers are included externally 2022-03-19 00:01:57 +02:00
swscale_internal.h Remove unnecessary libavutil/(avutil|common|internal).h inclusions 2022-02-24 12:56:49 +01:00
swscale_unscaled.c all: Replace if (ARCH_FOO) checks by #if ARCH_FOO 2022-06-15 04:56:37 +02:00
swscaleres.rc Add Windows resource file support for shared libraries 2013-12-05 23:42:07 +01:00
utils.c swscale/output: add VUYA output support 2022-08-07 09:33:16 -03:00
version.c lib*/version: Move library version functions into files of their own 2022-05-10 06:49:32 +02:00
version.h swscale/input: add VUYA input support 2022-08-05 09:39:21 -03:00
version_major.h libswscale: Split version.h 2022-03-16 14:05:26 +02:00
vscale.c Replace all occurences of av_mallocz_array() by av_calloc() 2021-09-20 01:03:52 +02:00
yuv2rgb.c all: Replace if (ARCH_FOO) checks by #if ARCH_FOO 2022-06-15 04:56:37 +02:00