Commit graph

50644 commits

Author SHA1 Message Date
Michael Niedermayer
8f74c313f1 avcodec/vvc/ctu: Simplify code at the end of pred_mode_decode()
This simplification assumes that the code is correct

Fixes: CID1560036 Logically dead code

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-15 01:59:37 +02:00
Rémi Denis-Courmont
c654e37254 lavc/h264dsp: R-V V high-depth h264_idct8_add
Unlike the 8-bit version, we need two iterations to process this within
128-bit vectors. This adds some extra complexity for pointer arithmetic
and counting down which is unnecessary in the 8-bit variant.

Accordingly, the gains relative to C are just slightly better than half as
good with 128-bit vectors as with 256-bit ones.

T-Head C908 (2 iterations):
h264_idct8_add_9bpp_c:       17.5
h264_idct8_add_9bpp_rvv_i32: 10.0
h264_idct8_add_10bpp_c:      17.5
h264_idct8_add_10bpp_rvv_i32: 9.7
h264_idct8_add_12bpp_c:      17.7
h264_idct8_add_12bpp_rvv_i32: 9.7
h264_idct8_add_14bpp_c:      17.7
h264_idct8_add_14bpp_rvv_i32: 9.7

SpacemiT X60 (single iteration):
h264_idct8_add_9bpp_c:       15.2
h264_idct8_add_9bpp_rvv_i32:  5.0
h264_idct8_add_10bpp_c:      15.2
h264_idct8_add_10bpp_rvv_i32: 5.0
h264_idct8_add_12bpp_c:      14.7
h264_idct8_add_12bpp_rvv_i32: 5.0
h264_idct8_add_14bpp_c:      14.7
h264_idct8_add_14bpp_rvv_i32: 4.7
2024-07-14 21:06:50 +03:00
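The commit above states that the gain over C with 128-bit vectors is just slightly better than half of the gain with 256-bit vectors. That can be checked from the listed 9bpp timings. The following is a minimal sketch: the helper names are illustrative and not part of FFmpeg, and the numbers are copied from the commit message.

```c
/* Illustrative helper, not FFmpeg code: ratio of the C reference time
 * to the vectorised time, as reported in the commit message. */
double speedup(double c_time, double simd_time)
{
    return c_time / simd_time;
}

/* 9bpp figures copied from the commit message above. */
double c908_speedup_9bpp(void)
{
    return speedup(17.5, 10.0); /* T-Head C908, 2 iterations */
}

double x60_speedup_9bpp(void)
{
    return speedup(15.2, 5.0);  /* SpacemiT X60, single iteration */
}
```

With these inputs the C908 speed-up is 1.75 and the X60 speed-up is about 3.04, so the ratio between them is roughly 0.58.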
Rémi Denis-Courmont
8b3d997bed lavc/h264dsp: remove MMI 8-bit 4:2:2 chroma DC dequant
The function is exactly identical to the C reference, only with the
constant propagated and the loop unrolled manually.
2024-07-14 11:39:35 +03:00
Rémi Denis-Courmont
a194131cb6 lavc/h264dsp: remove MMI 8-bit chroma DC dequant
The function is exactly identical to the C reference, only with the
constant propagated manually. It does not optimise anything.
2024-07-14 11:39:35 +03:00
Rémi Denis-Courmont
4e0e872881 lavc/h264dsp: R-V V high-depth h264_idct_add
T-Head C908 (cycles):
h264_idct4_add_9bpp_c:        248.2
h264_idct4_add_9bpp_rvv_i32:  128.7
h264_idct4_add_10bpp_c:       256.7
h264_idct4_add_10bpp_rvv_i32: 128.7
h264_idct4_add_12bpp_c:       252.5
h264_idct4_add_12bpp_rvv_i32: 129.7
h264_idct4_add_14bpp_c:       258.0
h264_idct4_add_14bpp_rvv_i32: 129.7
2024-07-14 11:39:35 +03:00
James Almer
d059ea5663 avcodec/bsf/showinfo: print packet data checksum
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-13 23:48:34 -03:00
Michael Niedermayer
9af348bd1a avcodec/flac_parser: Assert that we do not overrun the link_penalty array
Helps: CID1454676 Out-of-bounds read

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:49:33 +02:00
Michael Niedermayer
ed34b0c54e avcodec/osq: avoid signed overflow in downsample path
Fixes: signed integer overflow: 865309950 * 256 cannot be represented in type 'int'
Fixes: 69191/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6310214413385728

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:45:58 +02:00
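The overflow reported in the commit above (865309950 * 256 does not fit in `int`) is a common pattern. The sketch below shows two generic ways to avoid undefined behaviour in such a multiplication. It is not the actual change made in osq.c, and the function names are hypothetical.

```c
#include <stdint.h>

/* Option 1: widen before multiplying, so the product is represented
 * exactly in 64 bits. */
int64_t scale256_wide(int32_t x)
{
    return (int64_t)x * 256;
}

/* Option 2: multiply as unsigned, where wrap-around modulo 2^32 is
 * well defined by the C standard. */
uint32_t scale256_wrap(int32_t x)
{
    return (uint32_t)x * 256u;
}
```

Which option is appropriate depends on whether the caller needs the exact product or only its low 32 bits.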
Michael Niedermayer
0474614e6c avcodec/pixlet: Simplify pfx computation
Found by reviewing code related to CID1604365 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:46 +02:00
Michael Niedermayer
f18b442370 avcodec/motion_est: Fix score squaring overflow
Fixes: CID1604552 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:44 +02:00
Michael Niedermayer
06f01d9fa0 avcodec/mlpenc: Use 64 for ml, mr
Fixes: CID1604429 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:42 +02:00
Michael Niedermayer
371265f0ec avcodec/me_cmp: Fix type check
Fixes: CID1604375 Out-of-bounds read

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:40 +02:00
Michael Niedermayer
d553276843 avcodec/loco: Check loco_get_rice() for failure
Fixes: CID1604495 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:38 +02:00
Michael Niedermayer
b989986641 avcodec/loco: check get_ur_golomb_jpegls() for failure
Fixes: CID1604400 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:35 +02:00
Michael Niedermayer
0e3e7e8aeb avcodec/leaddec: Check init_get_bits8() for failure
Fixes: CID1604416 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:33 +02:00
Michael Niedermayer
6e4c037833 avcodec/imm4: check cbphi for error
Fixes: CID1604356 Overflowed constant
Fixes: CID1604573 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:31 +02:00
Michael Niedermayer
cfe66dfebb avcodec/iff: Use signed count
This is more a style fix than a bugfix (CID1604392 Overflowed constant)

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:28 +02:00
Michael Niedermayer
1e888fb006 avcodec/hw_base_encode: Simplify EOF check
Found while reviewing CID1608712 Explicit null dereferenced

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:25 +02:00
Michael Niedermayer
b2aaeb81f6 avcodec/golomb: Assert that k is in the supported range for get_ur/sr_golomb()
Found by code review related to CID1604563 Overflowed return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:23 +02:00
Michael Niedermayer
7cf5b83f6f avcodec/golomb: Document return for get_ur_golomb_jpegls() and get_sr_golomb_flac()
Found while reviewing code related to CID1604409 Overflowed return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:21 +02:00
Michael Niedermayer
e5af1c6e91 avcodec/dxv: Fix type in get_opcodes()
Found by code review related to CID1604386 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:20 +02:00
Michael Niedermayer
69dcd123f1 avcodec/cri: Check length
Fixes: CID1604394 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:18 +02:00
Michael Niedermayer
96fd9417e2 avcodec/xsubdec: Check parse_timecode()
Fixes: CID1604490 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 22:42:14 +02:00
Michael Niedermayer
93e0265e27 avcodec/proresenc_kostya: use unsigned alpha for rotation
Fixes: left shift of negative value -208
Fixes: 69073/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PRORES_KS_fuzzer-4745020002336768

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-12 16:40:51 +02:00
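Left-shifting a negative signed value, as in the report above (-208), is undefined behaviour in C. A minimal sketch of the usual remedy, performing the shift on an unsigned type, follows. The function name is hypothetical and this is not the code from proresenc_kostya.c.

```c
#include <stdint.h>

/* Shift the bit pattern of v left by n positions using unsigned
 * arithmetic. n is assumed to be less than 32. */
uint32_t shift_left_unsigned(int32_t v, unsigned n)
{
    return (uint32_t)v << n;
}
```

The result carries the same low 32 bits a two's-complement shift would produce, without invoking undefined behaviour.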
Fei Wang
71f802cdc9 lavc/hevcdec: Update slice index before hwaccel decode slice
Otherwise, the slice index is never updated for hwaccel decoding, and the
slice RPL, which is constructed using the slice index, always overlaps the first one.

Fixes hwaccel decoding after 47d34ba7fb

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-07-12 16:27:34 +08:00
Fei Wang
e741cf665d lavc/hevcdec: Put slice address checking after hwaccel decode slice
The slice address table is only updated when slice data is decoded in software.

Fixes hwaccel decoding after d725c737fe.

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-07-12 16:27:34 +08:00
Rémi Denis-Courmont
d28a7e8eb7 lavc/h264dsp: avoid \+ expansion
This seems to be unsupported by LLVM-as.
2024-07-11 21:07:17 +03:00
Zhao Zhili
0e5f8ddc1d avcodec/vvc: Use static const for function table
2024-07-11 20:26:47 +08:00
Michael Niedermayer
eb552ecd54 avcodec/vvc/refs: Use unsigned mask
Not a bugfix, but might fix CID1604361 Overflowed constant

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-10 18:10:08 +02:00
Rémi Denis-Courmont
f1ed351d3b lavc/h264dsp: R-V V 8-bit h264_biweight_pixels
T-Head C908:
h264_biweight2_8_c:        58.0
h264_biweight2_8_rvv_i32:  11.2
h264_biweight4_8_c:       106.0
h264_biweight4_8_rvv_i32:  22.7
h264_biweight8_8_c:       205.7
h264_biweight8_8_rvv_i32:  50.0
h264_biweight16_8_c:      403.5
h264_biweight16_8_rvv_i32: 83.2

SpacemiT X60:
h264_weight2_8_c:          48.2
h264_weight2_8_rvv_i32:     8.2
h264_weight4_8_c:          90.5
h264_weight4_8_rvv_i32:    16.5
h264_weight8_8_c:         175.2
h264_weight8_8_rvv_i32:    38.0
h264_weight16_8_c:        342.2
h264_weight16_8_rvv_i32:   66.0
2024-07-09 18:03:30 +03:00
Rémi Denis-Courmont
3606e592ea lavc/h264dsp: R-V V 8-bit h264_weight_pixels
There are two implementations here:
- a generic scalable one processing two columns at a time,
- a specialised one processing one (fixed-size) row at a time.

Unsurprisingly, the generic one works out better with smaller widths.
With larger widths, the gains from filling vectors are outweighed by
the extra cost of strided loads and stores. In other words, memory
accesses become the bottleneck.

T-Head C908:
h264_weight2_8_c:        54.5
h264_weight2_8_rvv_i32:  13.7
h264_weight4_8_c:       101.7
h264_weight4_8_rvv_i32:  27.5
h264_weight8_8_c:       197.0
h264_weight8_8_rvv_i32:  75.5
h264_weight16_8_c:      385.0
h264_weight16_8_rvv_i32: 74.2

SpacemiT X60:
h264_weight2_8_c:        48.5
h264_weight2_8_rvv_i32:   8.2
h264_weight4_8_c:        90.7
h264_weight4_8_rvv_i32:  16.5
h264_weight8_8_c:       175.0
h264_weight8_8_rvv_i32:  37.7
h264_weight16_8_c:      342.2
h264_weight16_8_rvv_i32: 66.0
2024-07-09 18:03:29 +03:00
James Almer
1b58f3af30 avcodec/packet: add a decoded frame cropping side data type
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-08 13:23:33 -03:00
Hao Guan
cd2f8a22e9 avcodec/videotoolboxenc: fix vtctx reset condition
In vtenc_populate_extradata, the cleanup function vtenc_reset should not
be used when no error occurs, otherwise some color information is lost
(#11036).

This patch checks the status code and conducts the correct cleanup.

Signed-off-by: Hao Guan <hguandl@gmail.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-07 18:46:02 +08:00
Rémi Denis-Courmont
f9d1230224 lavc/h264dsp: R-V V 8-bit h264_idct8_add
T-Head C908 (cycles):
h264_idct8_add_8bpp_c:      1072.0
h264_idct8_add_8bpp_rvv_i32: 318.5
2024-07-07 09:34:32 +03:00
Rémi Denis-Courmont
f447189b0c lavc/h264dsp: R-V V 8-bit h264_idct_add
T-Head C908 (cycles):
h264_idct4_add_8bpp_c:      271.5
h264_idct4_add_8bpp_rvv_i32: 91.5
2024-07-05 20:06:22 +03:00
Rémi Denis-Courmont
e0eff64ed1 lavc/h264dsp: R-V V 8-bit h264_idct8_add4
2024-07-05 18:56:03 +03:00
Rémi Denis-Courmont
d1f0c1fbf8 lavc/h264dsp: R-V V 8-bit h264_idct_add16intra
2024-07-05 18:56:03 +03:00
Rémi Denis-Courmont
30475c95ba lavc/h264dsp: R-V V 8-bit h264_idct_add16
While this *tends* to be faster than plain C, the performance numbers
are all over the place, presumably due to the conditional character of
the main loop.

Some additional micro-optimisations should be feasible after the
underlying h264_idct_add and h264_idct_dc_add functions are also
implemented. Then it will no longer be necessary to strictly abide by
the C ABI.
2024-07-05 18:56:02 +03:00
Jun Zhao
25a7dcf069 lavc/libx264: minor format fix
Remove redundant semicolons

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2024-07-05 18:05:10 +08:00
Andreas Rheinhardt
b13291f37c avcodec/hw_base_encode: Add missing include
Fixes checkheaders.

Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Tong Wu <wutong1208@outlook.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-07-04 19:45:51 +02:00
Rémi Denis-Courmont
e2af5904f0 lavc/h264dsp: R-V V 8-bit MBAFF loop filter
Performance is (unfortunately) the same as with non-MBAFF, since the
hardware under test does not short-circuit vector tail calculations.
(IMO, a generic solution or work-around should be agreed on, rather
than bespoke approaches all over the place.)
2024-07-04 19:57:42 +03:00
Rémi Denis-Courmont
5a6e333fc7 lavc/h264dsp: R-V V 8-bit luma loop filter
T-Head C908 (cycles):
h264_h_loop_filter_luma_8bpp_c:       297.5
h264_h_loop_filter_luma_8bpp_rvv_i32: 369.2
h264_v_loop_filter_luma_8bpp_c:       862.7
h264_v_loop_filter_luma_8bpp_rvv_i32: 199.7

Performance in the horizontal scenario seems worse than scalar. x86
SSE2 and AVX optimisations are similarly affected. This is presumably
caused by unlucky inputs from checkasm, such that the C code
short-circuits almost all filter calculations.
2024-07-04 19:57:42 +03:00
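The explanation in the commit above, that scalar C can win when it short-circuits almost all filter calculations, can be illustrated with a deliberately simplified sketch. This is not the H.264 loop filter: the function name, the threshold test and the averaging step are all assumptions made for the illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified edge smoother. For each pair of
 * samples across an edge, skip the work entirely when the difference
 * reaches the threshold alpha. Returns how many pairs were filtered. */
int smooth_edge_scalar(uint8_t *p, uint8_t *q, int n, int alpha)
{
    int filtered = 0;

    for (int i = 0; i < n; i++) {
        if (abs(p[i] - q[i]) >= alpha)
            continue; /* short-circuit: no filter arithmetic at all */

        uint8_t avg = (uint8_t)((p[i] + q[i] + 1) >> 1);
        p[i] = avg;
        q[i] = avg;
        filtered++;
    }
    return filtered;
}
```

When the inputs make the early exit fire for nearly every sample, the scalar loop does very little work, while a vector implementation typically computes the filter for every lane and masks the result afterwards.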
Rémi Denis-Courmont
4a2de380b7 lavc/vc1dsp: fuse multiply-adds in R-V V inv_trans_8
T-Head C908 (cycles)             before   after
vc1dsp.vc1_inv_trans_4x8_rvv_i32: 240.0   228.0
vc1dsp.vc1_inv_trans_8x4_rvv_i32: 235.2   224.2
vc1dsp.vc1_inv_trans_8x8_rvv_i32: 340.7   327.2
2024-07-03 18:16:36 +03:00
Rémi Denis-Courmont
78e1565f84 lavc/vc1dsp: fuse multiply-adds in R-V V inv_trans_4
T-Head C908 (cycles):            before   after
vc1dsp.vc1_inv_trans_4x4_rvv_i32: 128.0   120.0
vc1dsp.vc1_inv_trans_4x8_rvv_i32: 244.0   240.0
vc1dsp.vc1_inv_trans_8x4_rvv_i32: 239.2   235.2
2024-07-03 18:16:36 +03:00
Leo Izen
d69e522523 avcodec/pngenc: fix mDCv typo
When mDCv support was added, there was a typo in both variable names
and also the MKTAG itself, incorrectly listing it as mDVc. The tag name
stands for Mastering Display Color Volume so mDCv is correct.

Typo originally introduced in 7894904141.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Ramiro Polla <ramiro.polla@gmail.com>
2024-07-03 10:21:17 -04:00
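A swapped pair of letters in a four-character tag, as described in the commit above, produces a different 32-bit value, so a chunk written or matched with the wrong tag is simply not recognised. The sketch below shows this with a local stand-in macro. `TAG4` is written for this illustration and packs the bytes in little-endian order, which is an assumption here rather than a statement about the PNG code.

```c
#include <stdint.h>

/* Local stand-in for a four-character tag macro: the first character
 * ends up in the least significant byte. */
#define TAG4(a, b, c, d)                                   \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) |                \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Correct spelling: Mastering Display Color Volume. */
uint32_t tag_mdcv_correct(void)
{
    return TAG4('m', 'D', 'C', 'v');
}

/* The misspelling described in the commit message. */
uint32_t tag_mdcv_typo(void)
{
    return TAG4('m', 'D', 'V', 'c');
}
```

The two functions return different constants, which is why the typo had to be fixed in both the tag and the variable names.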
Leo Izen
c1af34c25b avcodec/pngdec: fix mDCv typo
When mDCv support was added, there was a typo in both variable names
and also the MKTAG itself, incorrectly listing it as mDVc. The tag name
stands for Mastering Display Color Volume so mDCv is correct. See other
files such as av1dec.c, which use mdcv.

Typo originally introduced in c7a57b0f70.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Ramiro Polla <ramiro.polla@gmail.com>
2024-07-03 10:21:06 -04:00
Yotam Ofek
a9c05eb657 avcodec/aaccoder_twoloop: remove unread max scaler
2024-07-03 02:51:37 +02:00
Yotam Ofek
b5ad997e72 avcodec/aaccoder_twoloop: remove unused macro
It seems the `sclip` macro was never used.
2024-07-03 02:51:09 +02:00
Marvin Scholz
ac60ad1872 avcodec/aacdec_usac: Fix array size
The array from ff_aac_usac_mdst_filt_cur that is passed to the function has
7 elements, not 6, and the function accesses the array at index 6, which
would be out of bounds if the size were actually 6.

Fixes: CID1603196
2024-07-03 02:48:27 +02:00
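The point of the commit above is that a declared parameter size must cover the highest index the function reads. A minimal sketch follows, with hypothetical names; it is not the code from aacdec_usac.

```c
#include <stddef.h>

#define FILT_TAPS 7 /* valid indices are 0..6 */

/* Hypothetical function reading every tap, including index 6. Declaring
 * the parameter with 6 elements would misdocument the contract, since
 * the last read would then be out of bounds for a conforming caller. */
float sum_taps(const float filt[FILT_TAPS])
{
    float acc = 0.0f;

    for (size_t i = 0; i < FILT_TAPS; i++)
        acc += filt[i];
    return acc;
}
```

In C the array size in a parameter declaration is not enforced, but static analysers use it to flag accesses such as the one reported here.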
Michael Niedermayer
86cd7c68bc avcodec/mfenc: check IMFSample_ConvertToContiguousBuffer() for failure
Fixes: CID1591911 Logically dead code

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-02 21:57:22 +02:00