ffmpeg/libswscale
Krzysztof Pyrkosz d765e5f043 swscale/aarch64: dotprod implementation of rgba32_to_Y
The idea is to split each 16-bit coefficient into its lower and upper
bytes, invoke udot for the lower bytes, shift the accumulator right by 8,
and follow with an accumulating udot for the upper bytes.
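
The arithmetic behind this split can be modeled in scalar C. This is only a sketch of the idea as described in the commit message, not the actual assembly: `udot4`, `dot16_ref`, and `dot16_split` are hypothetical helper names, the model covers a single 32-bit lane (one pixel), and any rounding bias or final shift applied by the real routine is left out.

```c
#include <stdint.h>

/* Scalar model of one UDOT lane: acc += sum of four u8 * u8 products.
 * (Hypothetical helper; the real instruction does this for four lanes.) */
static uint32_t udot4(uint32_t acc, const uint8_t a[4], const uint8_t b[4])
{
    for (int i = 0; i < 4; i++)
        acc += (uint32_t)a[i] * b[i];
    return acc;
}

/* Reference: multiply by the full 16-bit coefficients, then drop 8 bits. */
static uint32_t dot16_ref(const uint8_t px[4], const uint16_t coef[4])
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum += (uint32_t)px[i] * coef[i];
    return sum >> 8;
}

/* Split form: coef = hi * 256 + lo, so
 *   sum(px * coef) = sum(px * lo) + 256 * sum(px * hi).
 * Shifting the low partial sum right by 8 before adding the high partial
 * sum gives the same value as (sum(px * coef) >> 8), because the high
 * term is an exact multiple of 256. */
static uint32_t dot16_split(const uint8_t px[4], const uint16_t coef[4])
{
    uint8_t lo[4], hi[4];
    for (int i = 0; i < 4; i++) {
        lo[i] = (uint8_t)(coef[i] & 0xff);
        hi[i] = (uint8_t)(coef[i] >> 8);
    }
    uint32_t acc = udot4(0, px, lo); /* udot with the lower bytes    */
    acc >>= 8;                       /* shift by 8                   */
    return udot4(acc, px, hi);       /* udot with the upper bytes    */
}
```

For any four 8-bit samples and four 16-bit coefficients, `dot16_split` returns the same value as `dot16_ref`, which is why two 8-bit dot products can stand in for a 16-bit multiply-accumulate here.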

Benchmark on A78:
bgra_to_y_128_c:                                       682.0 ( 1.00x)
bgra_to_y_128_neon:                                    181.2 ( 3.76x)
bgra_to_y_128_dotprod:                                 117.8 ( 5.79x)
bgra_to_y_1080_c:                                     5742.5 ( 1.00x)
bgra_to_y_1080_neon:                                  1472.5 ( 3.90x)
bgra_to_y_1080_dotprod:                                906.5 ( 6.33x)
bgra_to_y_1920_c:                                    10194.0 ( 1.00x)
bgra_to_y_1920_neon:                                  2589.8 ( 3.94x)
bgra_to_y_1920_dotprod:                               1573.8 ( 6.48x)

Signed-off-by: Martin Storsjö <martin@martin.st>
2025-03-04 10:16:44 +02:00
aarch64 swscale/aarch64: dotprod implementation of rgba32_to_Y 2025-03-04 10:16:44 +02:00
arm libswscale/arm/swscale_unscaled: Fix function prototype 2025-03-02 01:10:38 +02:00
loongarch loongarch: fixes fate-checkasm-sw_rgb failure 2025-01-15 01:27:36 +01:00
ppc swscale/ppc: disable YUV2RGB AltiVec acceleration 2024-12-02 02:51:39 +01:00
riscv swscale/range_convert: saturate output instead of limiting input 2024-12-05 21:10:29 +01:00
tests tests/swscale: allow nonzero positive return codes from sws_scale_frame() 2024-12-18 17:30:48 +01:00
x86 swscale/x86/rgb2rgb: add AVX512ICL version of uyvytoyuv422 2025-02-18 12:43:57 -03:00
alphablend.c swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00
bayer_template.c swscale/internal: constify SwsFunc 2024-10-07 19:51:34 +02:00
cms.c swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
cms.h swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
csputils.c swscale/csputils: add internal colorspace math helpers 2024-12-23 12:33:43 +01:00
csputils.h swscale/csputils: add internal colorspace math helpers 2024-12-23 12:33:43 +01:00
gamma.c swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
graph.c swscale/graph: copy scaler_params to the legacy subpass context 2025-02-07 13:17:37 -03:00
graph.h swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
half2float.c swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
hscale.c swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats 2024-12-05 21:10:29 +01:00
hscale_fast_bilinear.c swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
input.c swscale: 16bit planar float input support 2025-01-21 21:06:14 +01:00
libswscale.v build: Change structure of the linker version script templates 2016-05-29 16:43:11 +02:00
log2_tab.c lsws: duplicate ff_log2_tab 2014-08-12 20:52:21 +02:00
lut3d.c swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
lut3d.h swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions 2025-01-12 15:41:39 +01:00
Makefile swscale/lut3d: add 3DLUT dispatch system 2024-12-23 12:33:43 +01:00
options.c swscale/options: add -sws_dither none alias 2024-12-23 12:47:10 +01:00
output.c swscale/output: Fix undefined overflow in yuv2rgba64_full_X_c_template() 2025-01-08 23:23:24 +01:00
rgb2rgb.c swscale/swscale_unscaled: add unscaled x2rgb10le to packed RGB 2024-11-06 17:34:32 -03:00
rgb2rgb.h swscale/swscale_unscaled: add unscaled x2rgb10le to packed RGB 2024-11-06 17:34:32 -03:00
rgb2rgb_template.c swscale/swscale_unscaled: add unscaled conversion for AYUV/VUYA/UYVA 2024-11-02 15:01:31 -03:00
slice.c swscale/slice: fix init of 32 bpc planes 2024-12-16 12:21:55 +01:00
swscale.c swscale/swscale: don't reject scaling when color parameters are not supported but conversion is not required 2025-01-22 12:15:18 -03:00
swscale.h swscale: add ICC intent enum and option 2024-12-23 12:33:43 +01:00
swscale_internal.h swscale: use 16-bit intermediate precision for RGB/XYZ conversion 2024-12-26 20:31:36 +01:00
swscale_unscaled.c swscale: 16bit planar float input support 2025-01-21 21:06:14 +01:00
swscaleres.rc Add Windows resource file support for shared libraries 2013-12-05 23:42:07 +01:00
utils.c swscale: 16bit planar float input support 2025-01-21 21:06:14 +01:00
utils.h swscale/swscale: don't reject scaling when color parameters are not supported but conversion is not required 2025-01-22 12:15:18 -03:00
version.c lib*/version: Use static_assert for static asserts 2024-03-31 00:08:42 +01:00
version.h swscale: add ICC intent enum and option 2024-12-23 12:33:43 +01:00
version_major.h libs: bump major version for all libraries 2024-03-07 11:29:43 -03:00
vscale.c swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00
yuv2rgb.c swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00