Commit graph

50123 commits

Author SHA1 Message Date
sunyuechi
0c1304ae11 lavc/vp9dsp: R-V V mc avg
C908:
vp9_avg4_8bpp_c: 1.2
vp9_avg4_8bpp_rvv_i64: 1.0
vp9_avg8_8bpp_c: 3.7
vp9_avg8_8bpp_rvv_i64: 1.5
vp9_avg16_8bpp_c: 14.7
vp9_avg16_8bpp_rvv_i64: 3.5
vp9_avg32_8bpp_c: 57.7
vp9_avg32_8bpp_rvv_i64: 10.0
vp9_avg64_8bpp_c: 229.0
vp9_avg64_8bpp_rvv_i64: 31.7

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-21 21:28:14 +03:00
Rémi Denis-Courmont
7591eb4055 Revert "lavc/sbrdsp: R-V V neg_odd_64"
While this function can easily be written with vectors, it just fails to
get any performance improvement.

For reference, this is a simpler loop-free implementation that does get
better performance than the current one depending on hardware, but still
more or less the same metrics as the C code:

 func ff_sbr_neg_odd_64_rvv, zve64x
         li      a1, 32
         addi    a0, a0, 7
         li      t0, 8
         vsetvli zero, a1, e8, m2, ta, ma
         li      t1, 0x80
         vlse8.v v8, (a0), t0
         vxor.vx v8, v8, t1
         vsse8.v v8, (a0), t0
         ret
 endfunc

This reverts commit d06fd18f8f.
2024-05-21 21:26:39 +03:00
Rémi Denis-Courmont
d452db8410 lavc/vc1dsp: R-V V vc1_unescape_buffer
Notes:
- The loop is biased toward no unescaped bytes as that should be most common.
- The input byte array is slid rather than the (8 times smaller) bit-mask,
  as RISC-V V does not provide a bit-mask (or bit-wise) slide instruction.
- There are two comparisons with 0 per iteration, for the same reason.
- In case of match, bytes are copied until the first match, and the loop is
  restarted after the escape byte. Vector compression (vcompress.vm) could
  discard all escape bytes but that is slower if escape bytes are rare.

Further optimisations should be possible, e.g.:
- processing 2 bytes fewer per iteration to get rid of a 2 slides,
- taking a short cut if the input vector contains less than 2 zeroes.
But this is a good starting point:

T-Head C908:
vc1dsp.vc1_unescape_buffer_c:      12749.5
vc1dsp.vc1_unescape_buffer_rvv_i32: 6009.0

SpacemiT X60:
vc1dsp.vc1_unescape_buffer_c:      11038.0
vc1dsp.vc1_unescape_buffer_rvv_i32: 2061.0
2024-05-21 21:16:30 +03:00
Nuo Mi
1b33c9a50a avcodec/vvcdec: support Reference Picture Resampling
passed clips:
    RPR_A_Alibaba_4.bit
    RPR_B_Alibaba_3.bit
    RPR_C_Alibaba_3.bit
    RPR_D_Qualcomm_1.bit
    VVC_HDR_UHDTV1_OpenGOP_Max3840x2160_50fps_HLG10_res_change_with_RPR.ts
2024-05-21 20:20:25 +08:00
Nuo Mi
cae0b01282 avcodec/vvcdec: increase edge_emu_buffer for RPR 2024-05-21 20:20:25 +08:00
Nuo Mi
7904ec2d34 avcodec/vvcdec: refact, remove hf_idx and vf_idx from mc_xxx's param list 2024-05-21 20:20:25 +08:00
Nuo Mi
77d971c348 avcodec/vvcdec: refact out luma_prof from luma_prof_bi 2024-05-21 20:20:25 +08:00
Nuo Mi
ac4575594f avcodec/vvcdec: fix dmvr, bdof, cb_prof for RPR 2024-05-21 20:20:25 +08:00
Nuo Mi
77acd0a0dd avcodec/vvcdec: inter, wait reference with a different resolution
For RPR, the current frame may reference a frame with a different resolution.
Therefore, we need to consider frame scaling when we wait for reference pixels.
2024-05-21 20:20:25 +08:00
Nuo Mi
deda59a996 avcodec/vvcdec: add RPR dsp 2024-05-21 20:20:25 +08:00
Nuo Mi
e70225e0a8 avcodec/vvcdec: emulated_edge, use reference frame's sps and pps
a preparation for Reference Picture Resampling
2024-05-21 20:20:25 +08:00
Nuo Mi
aa8d5c6e7e avcodec/vvcdec: add vvc inter filters for RPR 2024-05-21 20:20:25 +08:00
Nuo Mi
08ad51ece6 avcodec/vvcdec: refact, pred_get_refs return VVCRefPic instead of VVCFrame 2024-05-21 20:20:25 +08:00
Nuo Mi
66c6bee061 avcodec/vvcdec: refact out VVCRefPic from RefPicList 2024-05-21 20:20:25 +08:00
Nuo Mi
44bbafb69f avcodec/vvcdec: refact, unify pred_regular_{luma, chroma} to pred_regular 2024-05-21 20:20:25 +08:00
Nuo Mi
875fa9692c avcodec/vvcdec: misc, remove unused EMULATED_EDGE_{LUMA, CHROMA}, EMULATED_EDGE_DMVR_{LUAM, CHROMA} 2024-05-21 20:20:25 +08:00
Nuo Mi
84a93d91d1 avcodec/vvcdec: refact, unify {luma, chroma}_mc_bi to mc_bi 2024-05-21 20:20:25 +08:00
Nuo Mi
6769fe1614 avcodec/vvcdec: refact, unify {luma, chroma}_mc_uni to mc_uni 2024-05-21 20:20:25 +08:00
Nuo Mi
bc099afc8d avcodec/vvcdec: refact, unify {luma, chroma}_mc to mc 2024-05-21 20:20:25 +08:00
Nuo Mi
1289da9244 avcodec/vvcdec: misc, inter, use is_chroma instead of is_luma 2024-05-21 20:20:25 +08:00
David Rosca
f7a1453f27 lavc/vaapi_decode: Reject decoding of frames with no slices
Matches other hwaccels.
2024-05-21 16:57:46 +08:00
James Almer
b113050d96 avcodec/cbs_h266: read vps_ptl_max_tid before using it
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-05-20 10:29:30 -03:00
Andreas Rheinhardt
2c94b1bbf1 avcodec/tiff: Fix leak on error
Fixes Coverity issue #1516957.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 14:15:48 +02:00
Andreas Rheinhardt
59b1838e09 avcodec/ac3enc: Move transient PutBitContext to stack
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 14:11:25 +02:00
Andreas Rheinhardt
e863cbceae avcodec/ac3enc_template: Avoid always-true check
This might also help Coverity with issue #1596532.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 14:11:03 +02:00
Andreas Rheinhardt
482afe8f3f avcodec/lib*, avformat/tee: Simplify iterating over AVDictionary
Reviewed-by: epirat07@gmail.com
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 13:51:55 +02:00
Andreas Rheinhardt
a2874c5721 avcodec/aac_ac3_parser: Use ff_adts_header_parse_buf()
instead of avpriv_adts_header_parse(). Using the former avoids
an indirection.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 12:06:50 +02:00
Andreas Rheinhardt
12ded9cd85 avcodec/adts_header: Add ff_adts_header_parse_buf()
Most users of ff_adts_header_parse() don't already have
an opened GetBitContext for the header, so add a convenience
function for them.
Also use a forward declaration of GetBitContext in adts_header.h
as this avoids (implicit) inclusion of get_bits.h in some of
the users that now no longer use a GetBitContext of their own.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 12:06:31 +02:00
Andreas Rheinhardt
ae937c4902 avcodec/aac_ac3_parser: Untangle AAC and AC3 parsing error codes
Also remove the (unused) AAC_AC3_PARSE_ERROR_CHANNEL_CFG while at it;
furthermore, fix the documentation of ff_ac3_parse_header()
and (ff|avpriv)_adts_header_parse().

Reviewed-by: Andrew Sayers <ffmpeg-devel@pileofstuff.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 11:58:07 +02:00
Andreas Rheinhardt
6c812a80dd avcodec/adts_parser: Don't presume buffer to be padded
The documentation of av_adts_header_parse() does not require
the buffer to be padded at all.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-20 11:48:03 +02:00
Haihao Xiang
a00cfc6c24 lavc/qsvdec: require a dynamic frame pool if possible
This allows a downstream element stores more frames from qsv decoders
and fixes error in get_buffer().

$ ffmpeg -hwaccel qsv -hwaccel_output_format qsv -i input.mp4 -vf
reverse -f null -

[vist#0:0/h264 @ 0x562248f12c50] Decoding error: Cannot allocate memory
[h264_qsv @ 0x562248f66b10] get_buffer() failed

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-20 09:30:49 +08:00
Haihao Xiang
75015f9b0e lavc/qsvenc: use the right info for encoding
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-20 09:30:49 +08:00
Haihao Xiang
cda721e01d lavc/qsv: fix the mfx allocator to support dynamic frame pool
When the external allocator is used for dynamic frame allocation, only
video memory is supported, the SDK doesn't lock/unlock the memory block
via Lock()/Unlock() calls.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-20 09:30:49 +08:00
Michael Niedermayer
e35fe3d8b9
avcodec/mscc & mwsc: Check loop counts before use
This could cause timeouts

Fixes: CID1439568 Untrusted loop bound

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:14:39 +02:00
Michael Niedermayer
b6b2b01025
avcodec/mpegvideo_enc: Fix potential overflow in RD
Fixes: CID1500285 Unintentional integer overflow

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:14:38 +02:00
Michael Niedermayer
8fc649b931
avcodec/mpeg4videodec: assert impossible wrap points
Helps: CID1473517 Uninitialized scalar variable
Helps: CID1473497 Uninitialized scalar variable

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:14:37 +02:00
Michael Niedermayer
4c725df059
avcodec/mpeg12dec: Use 64bit in bit computation
I dont think this can actually overflow but 64bit seems reasonable to use

Fixes: CID1521983 Unintentional integer overflow

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:14:37 +02:00
Michael Niedermayer
6a9302739f
avcodec/vqcdec: Check init_get_bits8() for failure
Fixes: CID1516090 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:12:55 +02:00
Michael Niedermayer
4a8506c794
avcodec/vvc/dec: Check init_get_bits8() for failure
Fixes: CID1560042 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:12:55 +02:00
Michael Niedermayer
dd5379db5d
avcodec/vble: Check av_image_get_buffer_size() for failure
Fixes: CID1461482 Improper use of negative value

Sponsored-by: Sovereign Tech Fund
Reviewed-.by: "Xiang, Haihao" <haihao.xiang@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:12:54 +02:00
Michael Niedermayer
1b991e77b9
avcodec/vp3: Replace check by assert
Fixes: CID1452425 Logically dead code

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:12:54 +02:00
Michael Niedermayer
63feed1519
avcodec/vp8: Forward return of ff_vpx_init_range_decoder()
Fixes: CID1507483 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-05-19 22:12:54 +02:00
Rémi Denis-Courmont
463c573e6b lavc/huffyuvdsp: optimise RVV vtype for add_hfyu_left_pred_bgr32
T-Head C908:
add_hfyu_left_pred_bgr32_c:       237.5
add_hfyu_left_pred_bgr32_rvv_i32: 173.5 (before)
add_hfyu_left_pred_bgr32_rvv_i32: 110.0 (after)
2024-05-19 18:37:33 +03:00
Rémi Denis-Courmont
233066e85a lavc/flacdsp: optimise RVV vector type for lpc32
This is pretty much the same as for lpc16, though it only improves half
as large prediction orders. With 128-bit vectors, this gives:

   C      V old  V new
1   69.2  181.5   95.5
2  107.7  180.7   95.2
3  145.5  180.0  103.5
4  183.0  179.2  102.7
5  220.7  178.5  128.0
6  257.7  194.0  127.5
7  294.5  193.7  126.7
8  331.0  193.0  126.5

Larger prediction orders see no significant changes at that size.
2024-05-19 18:37:33 +03:00
Rémi Denis-Courmont
6ab4b92e82 lavc/flacdsp: optimise RVV vector type for lpc16
This calculates the optimal vector type value at run-time based on the
hardware vector length and the FLAC LPC prediction order. In this
particular case, the additional computation is easily amortised over
the loop iterations:

T-Head C908:
    C     V before  V after
 1   48.0    214.7     95.2
 2   64.7    214.2     94.7
 3   79.7    213.5     94.5
 4   96.2    196.5     94.2 #
 5  111.0    195.7    118.5
 6  127.0    211.2    102.0
 7  143.7    194.2    101.5
 8  175.7    193.2    101.2 #
 9  176.2    224.2    126.0
10  191.5    192.0    125.5
11  224.5    191.2    124.7
12  223.0    190.2    124.2
13  239.2    189.5    123.7
14  253.7    188.7    139.5
15  286.2    188.0    122.7
16  284.0    187.0    122.5 #
17  300.2    186.5    186.5
18  314.0    185.5    185.7
19  329.7    184.7    185.0
20  343.0    184.2    184.2
21  358.7    199.2    183.7
22  371.7    182.7    182.7
23  387.5    181.7    182.0
24  400.7    181.0    181.2
25  431.5    180.2    196.5
26  443.7    195.5    196.0
27  459.0    178.7    196.2
28  470.7    177.7    194.2
29  470.0    177.0    193.5
30  481.2    176.2    176.5
31  496.2    175.5    175.7
32  507.2    174.7    191.0 #

 # Power of two boundary.

With 128-bit vectors, improvements are expected for the first two
test cases only. For the other two, there is overhead but below noise.
Improvements should be better observable with prediction order of 8
and less, or on hardware with larger vector sizes.
2024-05-19 18:37:33 +03:00
Andreas Rheinhardt
a7e506fcd8 avcodec/vc1_parser: Check init_get_bits8()
Addresses Coverity issue #1441935.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-19 11:49:33 +02:00
Andreas Rheinhardt
95937d6316 avcodec/codec_desc: Mark AVRN, TGQ, PhotoCD, VBN as intra-only
Also remove the then redundant assignments of AV_FRAME_FLAG_KEY,
AV_PICTURE_TYPE_I.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-19 11:40:55 +02:00
Andreas Rheinhardt
2a00d68c09 avcodec/yop: Add missing AV_CODEC_CAP_DR1
This decoder does not do anything fancy any more since
c6303f8d70 (before that,
it overwrote the frame's linesize) so that it supports
direct rendering. This effectively reverts
d3de3a16d1.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-19 11:40:43 +02:00
Andreas Rheinhardt
df3cdf4c75 avcodec: Remove redundant setting of AV_FRAME_FLAG_KEY, AV_PICTURE_TYPE_I
This is done generically now.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-19 11:40:07 +02:00
Andreas Rheinhardt
eee88ba0dc avcodec/decode: Set KEY flag+pict_type generically for intra-only codecs
This commit is the analog of 3f11eac757
for decoding: It sets the AV_FRAME_FLAG_KEY and (for video decoders)
also pict_type to AV_PICTURE_TYPE_I. It furthermore stops setting
audio frames as always being key frames -- it is wrong for e.g.
TrueHD/MLP. The latter also affects TAK and DFPWM.

The change already improves output for several decoders where
it has been forgotten to set e.g. pict_type like speedhq, wnv1
or tiff. The latter is the reason for the change to the exif-image-tiff
FATE test reference file.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-19 11:39:45 +02:00