ffmpeg/libswscale/x86
Shreesh Adiga e18f87ed9f swscale/x86/rgb2rgb: add AVX512ICL version of uyvytoyuv422
The scalar loop is replaced with masked AVX512 instructions.
For extracting the Y from UYVY, vperm2b is used instead of
various AND and packuswb.

Instead of loading the vectors with interleaved lanes as done
in AVX2 version, normal load is used. At the end of packuswb,
for U and V, an extra permute operation is done to get the
required layout.
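
For readers unfamiliar with the layout, the conversion being vectorized can be sketched in scalar C. This is a minimal illustration and not FFmpeg's implementation: the function name `uyvy_row_to_planar` and its parameters are hypothetical, and it handles a single row with no stride handling. It assumes the usual UYVY byte order, where each pair of pixels is stored as U0 Y0 V0 Y1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (not FFmpeg's code): split one row of packed
 * UYVY into planar Y, U and V. "pairs" is the number of 2-pixel groups, so
 * the row holds 4 * pairs source bytes and produces 2 * pairs luma samples
 * plus pairs samples each of U and V. */
void uyvy_row_to_planar(uint8_t *y, uint8_t *u, uint8_t *v,
                        const uint8_t *src, size_t pairs)
{
    for (size_t i = 0; i < pairs; i++) {
        u[i]         = src[4 * i + 0]; /* even byte, shared by two pixels */
        y[2 * i + 0] = src[4 * i + 1]; /* odd byte, first pixel's luma    */
        v[i]         = src[4 * i + 2]; /* even byte, shared by two pixels */
        y[2 * i + 1] = src[4 * i + 3]; /* odd byte, second pixel's luma   */
    }
}
```

In this picture, Y is simply every odd source byte, which is why a single byte permute can gather it, while U and V are the even bytes taken alternately.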

AMD 7950x Zen 4 benchmark data:
uyvytoyuv422_c:                                      29105.0 ( 1.00x)
uyvytoyuv422_sse2:                                    3888.0 ( 7.49x)
uyvytoyuv422_avx:                                     3374.2 ( 8.63x)
uyvytoyuv422_avx2:                                    2649.8 (10.98x)
uyvytoyuv422_avx512icl:                               1615.0 (18.02x)

Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-02-18 12:43:57 -03:00
hscale_fast_bilinear_simd.c swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
input.asm swscale/x86/rgb2rgb: fix deinterleaveBytes for unaligned dst pointers 2024-09-06 23:05:01 +02:00
Makefile swscale/x86: add sse2 and avx2 {lum,chr}ConvertRange 2024-06-16 00:35:51 +02:00
output.asm swscale: add ICC intent enum and option 2024-12-23 12:33:43 +01:00
range_convert.asm swscale/x86: add sse4 and avx2 {lum,chr}ConvertRange16 2024-12-05 21:10:29 +01:00
rgb2rgb.c swscale/x86/rgb2rgb: add AVX512ICL version of uyvytoyuv422 2025-02-18 12:43:57 -03:00
rgb_2_rgb.asm swscale/x86/rgb2rgb: add AVX512ICL version of uyvytoyuv422 2025-02-18 12:43:57 -03:00
scale.asm swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
scale_avx2.asm swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
swscale.c swscale/x86/swscale: Make M24 variables static 2025-02-02 17:03:13 +01:00
swscale_template.c swscale/x86/swscale: Make M24 variables static 2025-02-02 17:03:13 +01:00
w64xmmtest.c swscale: rename SwsContext to SwsInternal 2024-10-24 22:50:00 +02:00
yuv2rgb.c swscale/internal: group user-facing options together 2024-11-21 12:49:56 +01:00
yuv2yuvX.asm x86: replace explicit REP_RETs with RETs 2023-02-01 04:23:55 +01:00
yuv_2_rgb.asm swscale/x86/yuv2rgb: add ssse3 yuv42{0,2}p -> gbrp unscaled colorspace converters 2024-08-18 22:26:14 +02:00