ffmpeg/libavcodec/hevc
Jun Zhao cfa3ceac7a lavc/hevc: add aarch64 NEON for angular modes 10 and 26
Add NEON-optimized implementations for HEVC angular intra prediction
modes 10 (pure horizontal) and 26 (pure vertical) at 8-bit depth.

Mode 10 (Horizontal):
- Broadcasts left[y] to fill each row using ld2r/ld4r for efficiency
- Applies edge smoothing for luma blocks smaller than 32x32

Mode 26 (Vertical):
- Copies top reference row to all output rows
- Applies edge smoothing for luma blocks smaller than 32x32
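
For reference, the scalar behavior described above can be sketched as
follows. This is a minimal sketch, not the code from this commit: the
function names, the is_luma flag and the clip_u8 helper are hypothetical,
and the layout (top[-1] and left[-1] being the top-left corner sample) is
an assumption modeled on the C template.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Mode 10 (pure horizontal), 8-bit. top and left point at index 0 of the
 * reference arrays; index -1 is assumed to be the top-left corner sample.
 * size is the block width and height (4, 8, 16 or 32).
 * Note: >> on a negative int is assumed to be an arithmetic shift.
 */
static void sketch_pred_mode10(uint8_t *dst, ptrdiff_t stride,
                               const uint8_t *top, const uint8_t *left,
                               int size, int is_luma)
{
    for (int y = 0; y < size; y++)
        memset(dst + y * stride, left[y], size);   /* broadcast left[y] */

    if (is_luma && size < 32)                      /* smooth the top row */
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(left[0] + ((top[x] - top[-1]) >> 1));
}

/* Mode 26 (pure vertical), 8-bit: same conventions as above. */
static void sketch_pred_mode26(uint8_t *dst, ptrdiff_t stride,
                               const uint8_t *top, const uint8_t *left,
                               int size, int is_luma)
{
    for (int y = 0; y < size; y++)
        memcpy(dst + y * stride, top, size);       /* copy the top row */

    if (is_luma && size < 32)                      /* smooth the left column */
        for (int y = 0; y < size; y++)
            dst[y * stride] = clip_u8(top[0] + ((left[y] - left[-1]) >> 1));
}
```

The smoothing only touches the first row (mode 10) or first column
(mode 26), so the bulk of each block is a plain broadcast or copy.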

Edge smoothing uses uhsub+usqadd to compute the filtered result
directly in 8-bit, avoiding widening to 16-bit intermediates.
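
Why the 8-bit sequence is exact can be checked with a scalar model. The
sketch below is illustrative only: model_uhsub and model_usqadd are
hypothetical helpers that mimic the architectural per-lane behavior of
the two instructions, and the operand order is an assumption rather than
a description of the actual assembly.

```c
#include <stdint.h>

/* Reference form: widen to int, add the halved difference, clamp.
 * Assumes >> on a negative int is an arithmetic shift. */
static uint8_t filter_wide(uint8_t base, uint8_t a, uint8_t b)
{
    int v = base + ((a - b) >> 1);
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Model of UHSUB on one byte lane: the exact difference is halved and
 * the low 8 bits are kept. The true result lies in [-128, 127], so the
 * byte is its two's-complement encoding. */
static uint8_t model_uhsub(uint8_t a, uint8_t b)
{
    return (uint8_t)((((unsigned)a - b) & 0x1FFu) >> 1);
}

/* Model of USQADD on one byte lane: add a signed byte to an unsigned
 * accumulator and saturate to [0, 255]. */
static uint8_t model_usqadd(uint8_t acc, uint8_t addend)
{
    int s = addend >= 128 ? (int)addend - 256 : (int)addend;
    int v = acc + s;
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* The filter expressed with the two modeled instructions. */
static uint8_t filter_narrow(uint8_t base, uint8_t a, uint8_t b)
{
    return model_usqadd(base, model_uhsub(a, b));
}

/* Exhaustively compare both forms over all 2^24 input triples. */
static long count_filter_mismatches(void)
{
    long bad = 0;
    for (int base = 0; base < 256; base++)
        for (int a = 0; a < 256; a++)
            for (int b = 0; b < 256; b++)
                bad += filter_wide((uint8_t)base, (uint8_t)a, (uint8_t)b) !=
                       filter_narrow((uint8_t)base, (uint8_t)a, (uint8_t)b);
    return bad;
}
```

Calling count_filter_mismatches() from a small main() is enough to run
the check. The halved difference always fits in a signed byte, which is
why no 16-bit intermediate is needed.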

The C pred_angular wrappers are made non-static with ff_ prefix to
allow the NEON dispatch to fall back to C for modes not yet optimized.
This will be reverted once all angular modes are implemented.

Note: since pred_angular[] holds one function pointer per block size
(not per mode), checkasm benchmarks will show '_neon' for all 33
angular modes even though only modes 10 and 26 are accelerated. The
unoptimized modes show ~1.0x speedup because they pass through the
NEON wrapper to the C fallback with negligible overhead.
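
The dispatch pattern described above can be sketched roughly as below.
All names here (the stub_* functions, sketch_pred_angular_4x4_neon, the
path enum) are hypothetical stand-ins, and the function pointer signature
is only loosely modeled on the C code; the stubs record which path was
taken instead of predicting pixels.

```c
#include <stddef.h>
#include <stdint.h>

/* One pointer per block size; the mode is a run-time argument. */
typedef void (*pred_angular_fn)(uint8_t *dst, const uint8_t *top,
                                const uint8_t *left, ptrdiff_t stride,
                                int c_idx, int mode);

enum sketch_path { PATH_NONE, PATH_C, PATH_NEON_H, PATH_NEON_V };
static enum sketch_path last_path = PATH_NONE;

/* Stand-in for the exported C fallback (hypothetical name). */
static void stub_c_angular_4x4(uint8_t *dst, const uint8_t *top,
                               const uint8_t *left, ptrdiff_t stride,
                               int c_idx, int mode)
{
    (void)dst; (void)top; (void)left; (void)stride; (void)c_idx; (void)mode;
    last_path = PATH_C;
}

/* Stand-ins for the two accelerated kernels (hypothetical names). */
static void stub_neon_mode10_4x4(uint8_t *dst, const uint8_t *top,
                                 const uint8_t *left, ptrdiff_t stride,
                                 int c_idx)
{
    (void)dst; (void)top; (void)left; (void)stride; (void)c_idx;
    last_path = PATH_NEON_H;
}

static void stub_neon_mode26_4x4(uint8_t *dst, const uint8_t *top,
                                 const uint8_t *left, ptrdiff_t stride,
                                 int c_idx)
{
    (void)dst; (void)top; (void)left; (void)stride; (void)c_idx;
    last_path = PATH_NEON_V;
}

/* Wrapper installed in the per-size slot: accelerated modes go to the
 * kernels, every other mode falls through to the C implementation. */
static void sketch_pred_angular_4x4_neon(uint8_t *dst, const uint8_t *top,
                                         const uint8_t *left,
                                         ptrdiff_t stride,
                                         int c_idx, int mode)
{
    switch (mode) {
    case 10: stub_neon_mode10_4x4(dst, top, left, stride, c_idx); break;
    case 26: stub_neon_mode26_4x4(dst, top, left, stride, c_idx); break;
    default: stub_c_angular_4x4(dst, top, left, stride, c_idx, mode); break;
    }
}

/* Count how many of the 33 angular modes (2..34) take the C path. */
static int count_c_fallback_modes(pred_angular_fn fn)
{
    int n = 0;
    for (int mode = 2; mode <= 34; mode++) {
        fn(NULL, NULL, NULL, 0, 0, mode);
        n += last_path == PATH_C;
    }
    return n;
}
```

Because the benchmark harness sees only the per-size pointer, every mode
is reported under the wrapper's name, and the extra cost for the
fallback modes is one branch plus a call.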

Speedup over C on Apple M4 (checkasm --bench, 15-run average):

  Mode 10 (Horizontal):
    4x4: 4.66x    8x8: 5.80x    16x16: 16.86x    32x32: 24.89x

  Mode 26 (Vertical):
    4x4: 1.16x    8x8: 1.83x    16x16: 2.45x    32x32: 4.50x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-06-07 23:29:33 +00:00
cabac.c
data.c
data.h
dsp.c avcodec/x86/Makefile: Only compile ASM init files when X86ASM is enabled 2025-11-30 22:20:13 +01:00
dsp.h avcodec/hevc/dsp: Add alignment for dequant 2026-01-29 12:25:33 +01:00
dsp_template.c avcodec/hevc/dsp_template: Add restrict to add_residual functions 2026-04-06 11:28:49 +02:00
filter.c
hevc.h
hevcdec.c avcodec/hevc: look for the DOVI RPU in all NALs, not just the last one 2026-06-05 01:08:08 +00:00
hevcdec.h
Makefile
mvs.c
parse.c
parse.h
parser.c
pred.c lavc/hevc: add aarch64 NEON for angular modes 10 and 26 2026-06-07 23:29:33 +00:00
pred.h lavc/hevc: add aarch64 NEON for angular modes 10 and 26 2026-06-07 23:29:33 +00:00
pred_template.c lavc/hevc: add aarch64 NEON for angular modes 10 and 26 2026-06-07 23:29:33 +00:00
ps.c avcodec/hevc/ps: validate rep_format dimensions in multi-layer SPS 2026-05-03 13:26:06 +00:00
ps.h
ps_enc.c
refs.c avcodec/h2645_sei: use the ITU-T T35 parsing helpers 2026-06-02 19:50:39 -03:00
sei.c avcodec/hevc/sei: Use get_bits64() in decode_nal_sei_3d_reference_displays_info() 2026-02-05 20:20:08 +00:00
sei.h avcodec/hevc/sei: Use get_bits64() in decode_nal_sei_3d_reference_displays_info() 2026-02-05 20:20:08 +00:00