sunyuechi
|
afb967b81e
|
af_afir: RISC-V V fcmul_add
Segmented loads are slow, so here we use unit-strided load and narrowing shifts.
c910:
fcmul_add_c: 2179
fcmul_add_rvv_f64: 1652
c908:
fcmul_add_c: 4891.2
fcmul_add_rvv_f64: 2399.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
|
2023-11-16 20:53:18 +02:00 |
|