ffmpeg/libavcodec/hevc
Jun Zhao 75838b9c89 lavc/hevc: add aarch64 NEON for reference sample filtering
3-tap [1,2,1]>>2: shared implementation body across size-specialized
entry points (8x8/16x16/32x32) to reduce code size. Fold the 3-tap
kernel into uhadd + urhadd: uhadd gives floor((prev+next)/2), then
urhadd rounds with curr to produce (prev + 2*curr + next + 2) >> 2
on 16 bytes in-place (no widen/narrow needed). Overlap-last technique
for tail avoids partial stores. Caller pads input arrays by 16 bytes
to guarantee safe over-read.
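
The uhadd + urhadd folding described above can be checked with a scalar model. The sketch below is not the NEON code from this commit; the helper names (`uhadd_u8`, `urhadd_u8`, `filter3_ref`, `filter3_folded`, `count_mismatches`) are hypothetical and only model the per-lane arithmetic of the two instructions on 8-bit samples.

```c
#include <stdint.h>

/* Scalar model of one lane of NEON UHADD: (a + b) >> 1, truncating. */
static uint8_t uhadd_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) >> 1);
}

/* Scalar model of one lane of NEON URHADD: (a + b + 1) >> 1, rounding. */
static uint8_t urhadd_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Reference [1,2,1]>>2 filter computed in wider (int) arithmetic. */
static uint8_t filter3_ref(uint8_t prev, uint8_t curr, uint8_t next)
{
    return (uint8_t)((prev + 2 * curr + next + 2) >> 2);
}

/* Same filter folded into two byte-wide halving adds (no widening). */
static uint8_t filter3_folded(uint8_t prev, uint8_t curr, uint8_t next)
{
    return urhadd_u8(uhadd_u8(prev, next), curr);
}

/* Compare both forms over every 8-bit input triple; returns mismatch count. */
static long count_mismatches(void)
{
    long bad = 0;
    for (int p = 0; p < 256; p++)
        for (int c = 0; c < 256; c++)
            for (int n = 0; n < 256; n++)
                if (filter3_ref((uint8_t)p, (uint8_t)c, (uint8_t)n) !=
                    filter3_folded((uint8_t)p, (uint8_t)c, (uint8_t)n))
                    bad++;
    return bad;
}
```

The two forms agree because floor((floor(s/2) + c + 1) / 2) equals floor((s + 2c + 2) / 4) for integer s and c, so truncating in the first halving add loses nothing that the final shift would have kept.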

Strong smoothing (32x32): preloaded weight tables, interleaved
umull/umlal pairs (two 16-byte blocks at a time) to hide
rshrn-to-store latency, with paired st1 for 32-byte writes.
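
For orientation, here is a scalar sketch of what the strong smoothing computes on one 64-sample edge. It assumes the filter is the bilinear interpolation between the corner sample and the far end of the edge defined for 32x32 blocks in the HEVC specification. The function name `strong_filter_edge` and its argument layout are hypothetical and do not mirror the actual FFmpeg function pointers.

```c
#include <stdint.h>

/*
 * Strong smoothing of one edge for 8-bit samples (scalar sketch).
 * corner      : the top-left corner sample
 * edge[0..63] : the unfiltered edge; edge[63] is the far end
 * out[0..62]  : interpolated between corner and edge[63]
 * out[63]     : copied unchanged from edge[63]
 *
 * The two weights (63 - i) and (i + 1) always sum to 64, so the
 * accumulated value is at most 64 * 255 + 32 and fits in 16 bits.
 */
static void strong_filter_edge(uint8_t out[64], uint8_t corner,
                               const uint8_t edge[64])
{
    const uint8_t last = edge[63];

    for (int i = 0; i < 63; i++)
        out[i] = (uint8_t)(((63 - i) * corner + (i + 1) * last + 32) >> 6);
    out[63] = last;
}
```

In a vector version the two weight sequences depend only on the position i, which is presumably why they can be preloaded as tables and applied with a widening multiply followed by a widening multiply-accumulate, then narrowed with a rounding shift by 6.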

checkasm --bench --runs=15 (Apple M4, average of 3 trials):
  ref_filter_3tap_8x8_8_neon:    4.1x
  ref_filter_3tap_16x16_8_neon:  3.3x
  ref_filter_3tap_32x32_8_neon:  2.5x
  ref_filter_strong_8_neon:      1.9x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-21 07:50:49 +00:00
Name | Last commit | Date
cabac.c | all: fix whitespace/new-line issues | 2025-08-03 13:48:47 +02:00
data.c | |
data.h | |
dsp.c | avcodec/x86/Makefile: Only compile ASM init files when X86ASM is enabled | 2025-11-30 22:20:13 +01:00
dsp.h | avcodec/hevc/dsp: Add alignment for dequant | 2026-01-29 12:25:33 +01:00
dsp_template.c | avcodec/hevc/dsp_template: Add restrict to add_residual functions | 2026-04-06 11:28:49 +02:00
filter.c | lavc/hevcdec: make a HEVCFrame hold a reference to its PPS | 2024-09-06 13:59:29 +02:00
hevc.h | avcodec/hevc/ps: Add basic HEVC_SCALABILITY_AUXILIARY support | 2025-02-17 15:08:42 +08:00
hevcdec.c | avcodec/hevc/hevcdec: take into account YUV400 in block length | 2026-02-14 16:23:16 +00:00
hevcdec.h | avcodec/h274: Make H274FilmGrainDatabase a shared object | 2025-09-22 04:54:22 +02:00
Makefile | avcodec/hevc/Makefile: Move rules for lavc/* files to lavc/Makefile | 2024-06-09 10:59:33 +02:00
mvs.c | lavc/hevcdec: make a HEVCFrame hold a reference to its PPS | 2024-09-06 13:59:29 +02:00
parse.c | lavc/hevcdec/parse: process NALUs with nuh_layer_id>0 | 2024-09-23 17:11:40 +02:00
parse.h | |
parser.c | avcodec/parser_internal: Remove prefix from parser_{init,parse,close} | 2025-11-01 16:57:03 +01:00
pred.c | lavc/hevc: extract reference sample filter into function pointers | 2026-04-21 07:50:49 +00:00
pred.h | lavc/hevc: extract reference sample filter into function pointers | 2026-04-21 07:50:49 +00:00
pred_template.c | lavc/hevc: add aarch64 NEON for reference sample filtering | 2026-04-21 07:50:49 +00:00
ps.c | avcodec/hevc: workaround hevc-alpha videos generated by VideoToolbox | 2026-04-01 22:54:36 +08:00
ps.h | avcodec/hevc: add ff_hevc_compute_poc2 which don't depend on HEVCSPS directly | 2025-11-05 15:13:54 +00:00
ps_enc.c | |
refs.c | avcodec/hevc: remove an always true condition | 2025-11-10 12:22:05 +08:00
sei.c | avcodec/hevc/sei: Use get_bits64() in decode_nal_sei_3d_reference_displays_info() | 2026-02-05 20:20:08 +00:00
sei.h | avcodec/hevc/sei: Use get_bits64() in decode_nal_sei_3d_reference_displays_info() | 2026-02-05 20:20:08 +00:00