ffmpeg/libavcodec/riscv
Rémi Denis-Courmont 378d1b06c3 riscv: probe for Zbb extension at load time
Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.

Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load time and check it as needed. This
results in an overhead of three instructions on isolated use, e.g.:
1:  AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
    LBU   rd, %pcrel_lo(1b)(rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

The C compiler will typically load the flag ahead of time to reduce
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1:  AUIPC rd, GOT_OFFSET_HI
    LD    rd, GOT_OFFSET_LO(rd)
    LBU   rd, (rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.
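
For illustration, the load-time flag pattern described above could look
roughly like the following in C. This is a minimal sketch, not the code
from this patch: the names example_zbb_supported, example_probe_zbb and
example_popcount are hypothetical, the probe is a compile-time stand-in
for real run-time detection, and __builtin_popcount merely stands in for
a Zbb-backed implementation.

```c
#include <stdint.h>

/* Hidden visibility lets the compiler read the flag PC-relative,
 * without the GOT look-up that symbol interposition would require.
 * (Hypothetical name, not the symbol used by the patch.) */
__attribute__((visibility("hidden"))) uint8_t example_zbb_supported;

/* Hypothetical probe. A real one would query the kernel or CPU at run
 * time; this stand-in only reports Zbb when the whole build targets it. */
static int example_probe_zbb(void)
{
#ifdef __riscv_zbb
    return 1;
#else
    return 0;
#endif
}

/* Runs once when the library (or program) is loaded. */
__attribute__((constructor)) static void example_init_zbb(void)
{
    example_zbb_supported = (uint8_t)example_probe_zbb();
}

/* Portable bit-weight fallback for CPUs without Zbb. */
static int popcount_fallback(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (int)((x * 0x01010101u) >> 24);
}

/* Check the flag, then take either the Zbb path or the fallback. */
static int example_popcount(uint32_t x)
{
    if (example_zbb_supported)
        return __builtin_popcount(x); /* stand-in for the Zbb code path */
    return popcount_fallback(x);
}
```

When example_popcount is called several times in one function, the
compiler is free to load the flag once and reuse it, which is the
behaviour the message above relies on.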
2024-06-11 20:12:37 +03:00
aacencdsp_init.c lavc/aacencdsp: R-V V quant_bands 2024-06-03 22:43:37 +03:00
aacencdsp_rvv.S lavc/aacencdsp: fix rounding in R-V V quantize_bands 2024-06-08 18:30:43 +03:00
aacpsdsp_init.c
aacpsdsp_rvv.S lavc/riscv: explicitly require Zbb for MIN 2024-05-10 18:59:06 +03:00
ac3dsp_init.c lavc/ac3dsp: add R-V Zvbb extract_exponents 2024-05-11 11:38:49 +03:00
ac3dsp_rvb.S lavc/ac3dsp: R-V Zbb ac3_exponent_min 2024-05-06 22:10:16 +03:00
ac3dsp_rvv.S lavc/ac3dsp: R-V V min_exponents 2024-05-04 10:17:11 +03:00
ac3dsp_rvvb.S lavc/ac3dsp: add R-V Zvbb extract_exponents 2024-05-11 11:38:49 +03:00
alacdsp_init.c
alacdsp_rvv.S
audiodsp_init.c
audiodsp_rvf.S
audiodsp_rvv.S
blockdsp_init.c lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
blockdsp_rvv.S lavc/blockdsp: R-V V fill_block 2024-05-03 17:49:23 +03:00
bswapdsp_init.c
bswapdsp_rvb.S
bswapdsp_rvv.S
cpu_common.c riscv: probe for Zbb extension at load time 2024-06-11 20:12:37 +03:00
exrdsp_init.c
exrdsp_rvv.S
flacdsp_init.c lavc/flacdsp: R-V Zvl256b lpc33 2024-05-27 22:07:29 +03:00
flacdsp_rvv.S lavc/flacdsp: fix sign extension in R-V V wasted33 2024-06-07 17:53:05 +03:00
fmtconvert_init.c
fmtconvert_rvv.S
g722dsp_init.c lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
g722dsp_rvv.S
h263dsp_init.c lavc/h263dsp: R-V V {h,v}_loop_filter 2024-05-22 19:15:39 +03:00
h263dsp_rvv.S lavc/h263dsp: R-V V {h,v}_loop_filter 2024-05-22 19:15:39 +03:00
h264_chroma_init_riscv.c lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
h264_mc_chroma.S
h264dsp_init.c lavc/startcode: add R-V V startcode_find_candidate 2024-05-19 10:03:49 +03:00
huffyuvdsp_init.c lavc/huffyuvdsp: optimise RVV vtype for add_hfyu_left_pred_bgr32 2024-05-19 18:37:33 +03:00
huffyuvdsp_rvv.S lavc/huffyuvdsp: optimise RVV vtype for add_hfyu_left_pred_bgr32 2024-05-19 18:37:33 +03:00
idctdsp_init.c lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
idctdsp_rvv.S
jpeg2000dsp_init.c
jpeg2000dsp_rvv.S
llauddsp_init.c
llauddsp_rvv.S
llviddsp_init.c
llviddsp_rvv.S
llvidencdsp_init.c
llvidencdsp_rvv.S
lpc_init.c lavc/lpc: optimise RVV vector type for compute_autocorr 2024-05-29 16:57:02 +03:00
lpc_rvv.S riscv: allow passing addend to vtype_vli macro 2024-05-30 18:30:52 +03:00
Makefile riscv: probe for Zbb extension at load time 2024-06-11 20:12:37 +03:00
me_cmp_init.c lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
me_cmp_rvv.S lavc/me_cmp: R-V V nsse 2024-02-27 20:31:30 +02:00
opusdsp_init.c
opusdsp_rvv.S lavc/riscv: explicitly require Zbb for MIN 2024-05-10 18:59:06 +03:00
pixblockdsp_init.c lavc/pixblockdsp: add scalar get_pixels_unaligned 2024-05-24 17:53:43 +03:00
pixblockdsp_rvi.S
pixblockdsp_rvv.S
rv34dsp_init.c lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
rv34dsp_rvv.S lavc/rv34dsp: remove stray load immediate 2024-05-26 19:20:45 +03:00
rv40dsp_init.c lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
rv40dsp_rvv.S lavc/rv40dsp: R-V V chroma_mc 2024-05-03 18:00:53 +03:00
sbrdsp_init.c lavc/sbrdsp: add support for 256-bit vectors 2024-05-31 22:22:43 +03:00
sbrdsp_rvv.S lavc/sbrdsp: fold immediate offset into relocation 2024-05-28 19:44:11 +03:00
startcode_rvb.S lavc/startcode: add R-V Zbb startcode_find_candidate 2024-05-19 10:03:49 +03:00
startcode_rvv.S lavc/startcode: fix RVV return value on no match 2024-05-28 19:43:40 +03:00
svqenc_init.c lavc/svq1enc: R-V V ssd_int8_vs_int16 2024-01-17 17:49:54 +02:00
svqenc_rvv.S lavc/svq1enc: R-V V ssd_int8_vs_int16 2024-01-17 17:49:54 +02:00
takdsp_init.c lavc/takdsp: R-V V decorrelate_sf 2024-01-15 19:00:25 +02:00
takdsp_rvv.S lavc/takdsp: R-V V decorrelate_sf 2024-01-15 19:00:25 +02:00
utvideodsp_init.c
utvideodsp_rvv.S
vc1dsp_init.c lavc/vc1dsp: R-V V vc1_inv_trans_4x4 2024-06-07 17:53:05 +03:00
vc1dsp_rvi.S lavc/vc1dsp: R-V V mspel_pixels 2024-05-16 17:08:18 +03:00
vc1dsp_rvv.S lavc/vc1dsp: match C block layout in inv_trans_4x8_rvv 2024-06-11 17:15:09 +03:00
vorbisdsp_init.c
vorbisdsp_rvv.S
vp7dsp_init.c lavc/vp7dsp: add R-V V vp7_idct_dc_add4uv 2024-06-04 17:42:07 +03:00
vp7dsp_rvv.S lavc/vp7dsp: add R-V V vp7_idct_dc_add4uv 2024-06-04 17:42:07 +03:00
vp8dsp.h lavc/vp8dsp: R-V put_vp8_pixels 2024-05-10 18:41:13 +03:00
vp8dsp_init.c lavc/vp8dsp: R-V V vp8_idct_add 2024-06-08 18:30:43 +03:00
vp8dsp_rvi.S lavc/vp8dsp: R-V put_vp8_pixels 2024-05-10 18:41:13 +03:00
vp8dsp_rvv.S lavc/vp8dsp: R-V V vp8_idct_add 2024-06-08 18:30:43 +03:00
vp9_intra_rvi.S lavc/vp9dsp: R-V ipred vert 2024-05-15 19:52:25 +03:00
vp9_intra_rvv.S lavc/vp9_intra: fix another .irp use with LLVM as 2024-05-19 10:22:46 +03:00
vp9_mc_rvi.S lavc/vp9dsp: R-V mc copy 2024-05-15 19:52:28 +03:00
vp9_mc_rvv.S lavc/vp9dsp: R-V V rename ff_avg to ff_vp9_avg 2024-05-30 18:30:52 +03:00
vp9dsp.h lavc/vp9dsp: R-V V rename ff_avg to ff_vp9_avg 2024-05-30 18:30:52 +03:00
vp9dsp_init.c lavc/vp9dsp: R-V V rename ff_avg to ff_vp9_avg 2024-05-30 18:30:52 +03:00