Timo Rothenpieler
262d41c804
all: fix typos found by codespell
2025-08-03 13:48:47 +02:00
Rémi Denis-Courmont
f2b945147d
lavc/vp8dsp: fix compilation for RV32IMA
2024-11-25 19:29:21 +02:00
Rémi Denis-Courmont
616fdeaea3
lavc/riscv: depend on RVB and simplify accordingly
...
There is no known (real) hardware with V and without the complete B
extension. B was indeed required in the RISC-V application profile from
2022, earlier than V. There should not be any relevant hardware in the
future either.
In practice, different R-V Vector optimisations in FFmpeg already depend on
every constituent of the B extension anyhow, so it would not work well.
2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
658439934b
lavc/vp8dsp: R-V V vp8_idct_add
...
T-Head C908 (cycles):
vp8_idct_add_c: 312.2
vp8_idct_add_rvv_i32: 117.0
2024-06-08 18:30:43 +03:00
Rémi Denis-Courmont
121fb846b9
lavc/vp7dsp: add R-V V vp7_idct_dc_add4uv
...
This is almost the same story as vp7_idct_add4y. We just have to use
strided loads of 2 64-bit elements to account for the different data
layout in memory.
T-Head C908:
vp7_idct_dc_add4uv_c: 7.5
vp7_idct_dc_add4uv_rvv_i64: 2.0
vp8_idct_dc_add4uv_c: 6.2
vp8_idct_dc_add4uv_rvv_i32: 2.2 (before)
vp8_idct_dc_add4uv_rvv_i64: 2.0
SpacemiT X60:
vp7_idct_dc_add4uv_c: 6.7
vp7_idct_dc_add4uv_rvv_i64: 2.2
vp8_idct_dc_add4uv_c: 5.7
vp8_idct_dc_add4uv_rvv_i32: 2.5 (before)
vp8_idct_dc_add4uv_rvv_i64: 2.0
2024-06-04 17:42:07 +03:00
Rémi Denis-Courmont
91b5ea7bb9
lavc/vp8dsp: R-V V vp8_luma_dc_wht
...
This is not great as transposition is poorly supported, but it works:
vp8_luma_dc_wht_c: 2.5
vp8_luma_dc_wht_rvv_i32: 1.7
2024-05-29 16:57:02 +03:00
Rémi Denis-Courmont
4e56455d36
lavc/vp8dsp: avoid one multiplication on RISC-V
...
Use shifts rather than multiply, and save one instruction.
2024-05-28 19:44:11 +03:00
Rémi Denis-Courmont
5ebb071d79
lavc/vp8dsp: disable EPEL HV on RV128
...
RV128 is mostly scifi at this point, so we can just disable it here
(the EPEL HV prologue/epilogue do not save 128-bit registers).
2024-05-27 22:07:29 +03:00
sunyuechi
63697d3350
lavc/vp8dsp: R-V V put_epel hv
...
C908:
vp8_put_epel4_h4v4_c: 20.0
vp8_put_epel4_h4v4_rvv_i32: 11.0
vp8_put_epel4_h4v6_c: 25.2
vp8_put_epel4_h4v6_rvv_i32: 13.5
vp8_put_epel4_h6v4_c: 22.2
vp8_put_epel4_h6v4_rvv_i32: 14.5
vp8_put_epel4_h6v6_c: 29.0
vp8_put_epel4_h6v6_rvv_i32: 15.7
vp8_put_epel8_h4v4_c: 73.0
vp8_put_epel8_h4v4_rvv_i32: 22.2
vp8_put_epel8_h4v6_c: 90.5
vp8_put_epel8_h4v6_rvv_i32: 26.7
vp8_put_epel8_h6v4_c: 85.0
vp8_put_epel8_h6v4_rvv_i32: 27.2
vp8_put_epel8_h6v6_c: 104.7
vp8_put_epel8_h6v6_rvv_i32: 29.5
vp8_put_epel16_h4v4_c: 145.5
vp8_put_epel16_h4v4_rvv_i32: 26.5
vp8_put_epel16_h4v6_c: 190.7
vp8_put_epel16_h4v6_rvv_i32: 47.5
vp8_put_epel16_h6v4_c: 173.7
vp8_put_epel16_h6v4_rvv_i32: 33.2
vp8_put_epel16_h6v6_c: 222.2
vp8_put_epel16_h6v6_rvv_i32: 35.5
Amended to disable unsupported RV128.
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-26 15:15:28 +03:00
Rémi Denis-Courmont
9d3f561721
lavc/vp8dsp: restrict RVI optimisations
...
They are actually awfully slow if the CPU does not support misaligned
accesses natively, so only use them if misaligned accesses are fast.
2024-05-14 19:50:00 +03:00
Rémi Denis-Courmont
cdcb4b98b7
lavc/riscv: use ff_rv_vlen_least()
2024-05-13 18:36:07 +03:00
sunyuechi
6e77af1c22
lavc/vp8dsp: R-V V put_epel v
...
C908:
vp8_put_epel4_v4_c: 11.0
vp8_put_epel4_v4_rvv_i32: 5.0
vp8_put_epel4_v6_c: 16.5
vp8_put_epel4_v6_rvv_i32: 6.2
vp8_put_epel8_v4_c: 43.7
vp8_put_epel8_v4_rvv_i32: 11.2
vp8_put_epel8_v6_c: 68.7
vp8_put_epel8_v6_rvv_i32: 13.2
vp8_put_epel16_v4_c: 92.5
vp8_put_epel16_v4_rvv_i32: 13.7
vp8_put_epel16_v6_c: 135.7
vp8_put_epel16_v6_rvv_i32: 16.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-10 18:41:13 +03:00
sunyuechi
109daea619
lavc/vp8dsp: R-V V put_epel h
...
C908:
vp8_put_epel4_h4_c: 10.7
vp8_put_epel4_h4_rvv_i32: 5.0
vp8_put_epel4_h6_c: 15.0
vp8_put_epel4_h6_rvv_i32: 6.2
vp8_put_epel8_h4_c: 43.2
vp8_put_epel8_h4_rvv_i32: 11.2
vp8_put_epel8_h6_c: 57.5
vp8_put_epel8_h6_rvv_i32: 13.5
vp8_put_epel16_h4_c: 92.5
vp8_put_epel16_h4_rvv_i32: 13.7
vp8_put_epel16_h6_c: 139.0
vp8_put_epel16_h6_rvv_i32: 16.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-10 18:41:13 +03:00
sunyuechi
538f217bbb
lavc/vp8dsp: R-V V put_bilin_hv
...
C908:
vp8_put_bilin4_hv_c: 561.0
vp8_put_bilin4_hv_rvv_i32: 232.7
vp8_put_bilin8_hv_c: 2162.7
vp8_put_bilin8_hv_rvv_i32: 506.7
vp8_put_bilin16_hv_c: 4769.7
vp8_put_bilin16_hv_rvv_i32: 556.7
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-10 18:41:13 +03:00
sunyuechi
bb5039b3cb
lavc/vp8dsp: R-V V put_bilin_h v
...
C908:
vp8_put_bilin4_h_c: 367.0
vp8_put_bilin4_h_rvv_i32: 137.7
vp8_put_bilin4_v_c: 377.0
vp8_put_bilin4_v_rvv_i32: 137.7
vp8_put_bilin8_h_c: 1431.0
vp8_put_bilin8_h_rvv_i32: 297.5
vp8_put_bilin8_v_c: 1449.0
vp8_put_bilin8_v_rvv_i32: 297.5
vp8_put_bilin16_h_c: 2839.0
vp8_put_bilin16_h_rvv_i32: 344.7
vp8_put_bilin16_v_c: 2857.0
vp8_put_bilin16_v_rvv_i32: 344.7
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-10 18:41:13 +03:00
sunyuechi
0b8e5e5a00
lavc/vp8dsp: R-V put_vp8_pixels
...
C908:
vp8_put_pixels4_c: 78.0
vp8_put_pixels4_rvi: 33.7
vp8_put_pixels8_c: 278.0
vp8_put_pixels8_rvi: 55.0
vp8_put_pixels16_c: 999.0
vp8_put_pixels16_rvi: 86.7
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-10 18:41:13 +03:00
sunyuechi
d897bbb48d
lavc/vp8dsp: R-V V vp8_idct_dc_add4uv
...
c908:
vp8_idct_dc_add4uv_c: 387.7
vp8_idct_dc_add4uv_rvv_i32: 134.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-02-17 14:45:49 +02:00
sunyuechi
e74e18cae4
lavc/vp8dsp: R-V V vp8_idct_dc_add4y
...
c908:
vp8_idct_dc_add4y_c: 368.5
vp8_idct_dc_add4y_rvv_i32: 134.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-02-17 14:45:49 +02:00
sunyuechi
c12053cefc
lavc/vp8dsp: R-V V vp8_idct_dc_add
...
c908:
vp8_idct_dc_add_c: 102.2
vp8_idct_dc_add_rvv_i32: 42.0
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-02-17 14:45:49 +02:00