Commit graph

13 commits

Author SHA1 Message Date
Andreas Rheinhardt
20ddada2a3 avcodec/pixblockdsp: Improve 8 vs 16 bit check
Before this commit, the input in get_pixels and get_pixels_unaligned
has been treated inconsistenly:
- The generic code treated 9, 10, 12 and 14 bits as 16bit input
(these bits correspond to what FFmpeg's dsputils supported),
everything with <= 8 bits as 8 bit and everything else as 8 bit
when used via AVDCT (which exposes these functions and purports
to support up to 14 bits).
- AARCH64, ARM, PPC and RISC-V, x86 ignore this AVDCT special case.
- RISC-V also ignored the restriction to 9, 10, 12 and 14 for its
16bit check and treated everything > 8 bits as 16bit.
- The mmi MIPS code treats everything as 8 bit when used via
AVDCT (this is certainly broken); otherwise it checks for <= 8 bits.
The msa MIPS code behaves like the generic code.

This commit changes this to treat 9..16 bits as 16 bit input,
everything else as 8 bit (the former because it makes sense,
the latter to preserve the behaviour for external users*).

*: The only internal user of AVDCT (the spp filter) always
uses 8, 9 or 10 bits.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-05-31 01:25:27 +02:00
Rémi Denis-Courmont
d3acffae7a lavc/pixblockdsp: fix compilation for RV32IMA 2024-11-25 19:29:21 +02:00
Rémi Denis-Courmont
ba38d0e328 lavc/pixblockdsp: add scalar get_pixels_unaligned
The code is already there, we just need to use it.

get_pixels_unaligned_c: 2.2
get_pixels_unaligned_misaligned: 1.7
2024-05-24 17:53:43 +03:00
Rémi Denis-Courmont
cdcb4b98b7 lavc/riscv: use ff_rv_vlen_least() 2024-05-13 18:36:07 +03:00
Rémi Denis-Courmont
b3825bbe45 riscv: test for assembler support
This should fix the build on LLVM 16 and earlier, at the cost of turning
all non-RVV optimisations off.
2023-12-08 17:21:09 +02:00
Rémi Denis-Courmont
02594c8c01 lavc/pixblockdsp: rework R-V V get_pixels_unaligned
As in the aligned case, we can use VLSE64.V, though the way of doing so
gets more convoluted, so the performance gains are more modest:

get_pixels_unaligned_c:       126.7
get_pixels_unaligned_rvv_i32: 145.5 (before)
get_pixels_unaligned_rvv_i64:  62.2 (after)

For the reference, those are the aligned benchmarks (unchanged) on the
same T-Head C908 hardware:

get_pixels_c:                 126.7
get_pixels_rvi:                85.7
get_pixels_rvv_i64:            33.2
2023-11-06 19:42:49 +02:00
Rémi Denis-Courmont
92bcc6703a lavc/pixblockdsp: remove R-V V get_pixels_16
In the aligned case, the existing RVI assembler is actually much
faster. In the unaligned case, there is nothing much to gain over C.
2023-11-01 19:27:22 +02:00
Rémi Denis-Courmont
300ee8b02d lavc/pixblockdsp: aligned R-V V 8-bit functions
If the scan lines are aligned, we can load each row as a 64-bit value,
thus avoiding segmentation. And then we can factor the conversion or
subtraction.

In principle, the same optimisation should be possible for high depth,
but would require 128-bit elements, for which no FFmpeg CPU flag
exists.
2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont
722765687b lavc/pixblockdsp: rename unaligned R-V V functions 2023-10-30 18:14:16 +02:00
Rémi Denis-Courmont
d31013166a lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned 2022-09-28 11:46:11 +02:00
Rémi Denis-Courmont
ebee25855a lavc/pixblockdsp: RISC-V V 16-bit get_pixels & get_pixels_unaligned 2022-09-28 11:46:11 +02:00
Rémi Denis-Courmont
676b08cb70 lavc/pixblockdsp: RISC-V V 8-bit get_pixels & get_pixels_unaligned 2022-09-28 11:46:11 +02:00
Rémi Denis-Courmont
1edac8eb46 lavc/pixblockdsp: RISC-V I get_pixels
Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
get_pixels_c: 180.0
get_pixels_rvi: 136.7
2022-09-27 13:19:52 +02:00