ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2025-12-08 06:09:50 +00:00

Author	SHA1	Message	Date
Nuo Mi	ca3550948c	lavc/vvcdec: ensure slices contain nonzero CTUs fixes https://github.com/ffvvc/tests/tree/main/fuzz/passed/000323.bit Co-authored-by: Frank Plowman <post@frankplowman.com>	2025-01-29 18:22:41 +08:00
Nuo Mi	974d4a8f0a	lavc/vvcdec: remove unneeded VVCContext->pix_fmt AVCodecContext->sw_pix_fmt is used to hold the software pixel format. Co-authored-by: Frank Plowman <post@frankplowman.com>	2025-01-29 18:22:41 +08:00
Nuo Mi	61ff0fac35	lavc/vvcdec: remove unneeded set_output_format Downstream can determine the format from the output frame format Co-authored-by: Frank Plowman <post@frankplowman.com>	2025-01-29 18:22:41 +08:00
Zhao Zhili	ea381285e7	avcodec/vvc: Add support for output_corrupt/showall flags	2025-01-19 13:30:13 +08:00
Nuo Mi	8eb1d76e14	lavc/vvc/refs: export keyframe and picture type in output frames fixes https://trac.ffmpeg.org/ticket/11406 Co-authored-by: Ruben Gonzalez <rgonzalez@fluendo.com> Signed-off-by: James Almer <jamrial@gmail.com>	2025-01-13 18:05:06 -03:00
Frank Plowman	8bd66a8c95	lavc/vvc: Check slice structure The criteria for slice structure validity are similar to those for subpicture structure validity that we saw not too long ago [1]. The relationship between tiles and slices must satisfy the following properties: * Exhaustivity. All tiles in a picture must belong to a slice. The tiles cover the picture, so this implies the slices must cover the picture. * Mutual exclusivity. No tile may belong to more than one slice, i.e. slices may not overlap. In most cases these properties are guaranteed by the syntax. There is one notable exception, however: when pps_tile_idx_delta_present_flag is equal to one, each slice is associated with a syntax element pps_tile_idx_delta_val[i] which "specifies the difference between the tile index of the tile containing the first CTU in the ( i + 1 )-th rectangular slice and the tile index of the tile containing the first CTU in the i-th rectangular slice" [2]. When these syntax elements are present, the i-th slice can begin anywhere and the usual guarantees provided by the syntax are lost. The patch detects slice structures which violate either of the two properties above, and are therefore invalid, while building the slice map. Should the slice map be determined to be invalid, AVERROR_INVALIDDATA is returned. This prevents issues, including segmentation faults, when trying to decode invalid bitstreams. [1]: https://ffmpeg.org//pipermail/ffmpeg-devel/2024-October/334470.html [2]: H.266 (V3) Section 7.4.3.5, Picture parameter set RBSP semantics Signed-off-by: Frank Plowman <post@frankplowman.com>	2025-01-12 13:15:06 +08:00
James Almer	d7180a3f92	avcodec/vvc/dec: print thread debug logs only if DEBUG is defined Makes the output of a normal decoding process with loglevel debug a lot less verbose. Signed-off-by: James Almer <jamrial@gmail.com>	2025-01-10 10:23:57 -03:00
Frank Plowman	539cea3183	lavc/vvc: Fix race condition for MVs cropped to subpic When the current subpicture has sps_subpic_treated_as_pic_flag equal to 1, motion vectors are cropped such that they cannot point to other subpictures. This was accounted for in the prediction logic, but not in pred_get_y, which is used by the scheduling logic to determine which parts of the reference pictures must have been reconstructed before inter prediction of a subsequent frame may begin. Consequently, where a motion vector pointed to a location significantly above the current subpicture, there was the possibility of a race condition. Patch fixes this by cropping the motion vector to the current subpicture in pred_get_y. Signed-off-by: Frank Plowman <post@frankplowman.com>	2025-01-05 20:25:29 +08:00
Chris Warrington	f80af3657f	avcodec/vvc decode: ALF filtering without CC-ALF When a stream has ALF filtering enabled but not CC-ALF, the CC-ALF set indexes alf->ctb_cc_idc are being read uninitialized during ALF filtering. This change initializes alf->ctb_cc_idc whenever ALF is enabled. Ref. https://trac.ffmpeg.org/ticket/11325	2025-01-05 18:00:18 +08:00
Anton Khirnov	56ba57b672	lavc/refstruct: move to lavu and make public It is highly versatile and generally useful.	2024-12-15 14:03:47 +01:00
Frank Plowman	8629306627	lavc/vvc: Fix scaling matrix DC coef derivation In 7.4.3.20 of H.266 (V3), there are two similarly-named variables: scalingMatrixDcPred and ScalingMatrixDcRec. The old code set ScalingMatrixDcRec, rather than scalingMatrixDcPred, in the first two branches of the conditions on scaling_list_copy_mode_flag[id] and aps->scaling_list_pred_mode_flag[id]. This could lead to decode mismatches in sequences with explicitly-signalled scaling lists. Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-12-10 20:26:12 +08:00
Frank Plowman	34c6ad0a07	lavc/vvc: Use a bitfield to store MIP information Reduces memory consumption by ~4MB for 1080p video with a maximum delay of 16 frames by packing various information related to MIP: * intra_mip_flag, 1 bit * intra_mip_transposed_flag, 1 bit * intra_mip_mode, 4 bits Into a single byte. Co-authored-by: Nuo Mi <nuomi2021@gmail.com> Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-12-07 17:37:45 +08:00
Frank Plowman	56419fd096	lavc/vvc: Fix overflow in MVD derivation H.266 (V3) section 7.4.12.8: "The value of lMvd[ compIdx ] shall be in the range of −2^{17} to 2^{17} − 1, inclusive." Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-12-03 10:22:55 +08:00
Frank Plowman	499896ca2f	lavc/vvc: Fix derivation of LmcsMaxBinIdx Per H.266 (V3) section 7.4.3.19, LmcsMaxBinIdx is set equal to 15 - lmcs_delta_max_bin_idx. The previous code instead had it equal to 15 - lmcs_min_bin_idx. This could cause decoder mismatches. Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-12-03 10:22:55 +08:00
Frank Plowman	699322519c	lavc/vvc: Store MIP information over entire CU area Previously, the code only stored the MIP mode and transpose flag in the relevant tables at the top-left corner of the CU. This information ends up being retrieved in ff_vvc_intra_pred_* not based on the CU position but instead the transform unit position (specifically, using the x0 and y0 from get_luma_predict_unit). There might be multiple transform units in a CU, hence the top-left corner of the transform unit might not coincide with the top-left corner of the CU. Consequently, we need to store the MIP information at all positions in the CU, not only its top-left corner, as we already do for the MIP flag. Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-12-03 10:20:51 +08:00
Frank Plowman	7399d9f374	lavc/vvc: Don't check motion estimation region for IBC The final parameter of check_available determines whether the motion estimation region constraints imposed in section 8.5.2.3 of H.266 (V3) on MVP candidates apply to the current candidate or not. In the case of IBC spatial merge candidates they do not, as their availability is dependent only on the criteria described in sections 8.6.2.3 and 6.4.4, which do not include this constraint on the motion estimation region. Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-12-03 10:20:51 +08:00
Nuo Mi	4de67e8746	avcodec/vvcdec: return error if CTU size > 128 The v3 spec reserves CTU size 256. Currently, we use an uint8_t* table to hold cb_width and cb_height. If a CTU size of 256 is not split, cb_width and cb_height will overflow to 0. To avoid switching to uint16_t, rejecting CTU size 256 provides a simple solution.	2024-11-30 09:58:59 +08:00
Nuo Mi	eb67e60cb0	avcodec/vvcdec: schedule next stage only if the current stage reports no error If the current stage reports an error, some variables may not be correctly initialized. Scheduling the next stage could lead to the use of uninitialized variables.	2024-11-30 09:58:59 +08:00
Nuo Mi	4ec767abcc	avcodec/vvcdec: misc, reformat inter_data()	2024-11-30 09:58:59 +08:00
Nuo Mi	ba89c5b989	avcodec/vvcdec: inter_data, check the return value from hls_merge_data Reported-by: Frank Plowman <post@frankplowman.com>	2024-11-30 09:58:59 +08:00
Nuo Mi	5c5a08ecb5	avcodec/vvcdec: ensure every CTU belongs to a slice According to section 6.3.3 "Spatial or component-wise partitionings," CTUs should fully cover slices with no overlaps, gaps, or additions. No overlaps are ensured by task_init_parse. No gaps and no additions are ensured by this patch. Co-authored-by: Frank Plowman <post@frankplowman.com>	2024-11-30 09:58:59 +08:00
Frank Plowman	1e5f24d1a6	lavc/vvc: Remove floating point logic This was the only floating point logic in the native VVC decoder. Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-11-11 19:31:00 +08:00
Fei Wang	e726fdeb05	lavc/vaapi_dec: Add VVC decoder Signed-off-by: Fei Wang <fei.w.wang@intel.com>	2024-11-01 12:13:07 +08:00
Fei Wang	4dc18c78cd	lavc/vvc_dec: Add hardware decode API Signed-off-by: Fei Wang <fei.w.wang@intel.com>	2024-11-01 12:13:07 +08:00
Fei Wang	a94aa2d61e	lavc/vvc_ps: Add alf raw syntax into VVCALF Signed-off-by: Fei Wang <fei.w.wang@intel.com>	2024-11-01 12:13:07 +08:00
Fei Wang	15a75e8e04	lavc/vvc_refs: Define VVC_FRAME_FLAG* to h header So that hardware decoder can use the flags too. Signed-off-by: Fei Wang <fei.w.wang@intel.com>	2024-11-01 12:13:07 +08:00
Nuo Mi	b611410569	avcodec/vvc/thread: Check frame to be non NULL Fixes: NULL pointer dereference Fixes: 71303/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-4875859050168320 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Reported-by: Michael Niedermayer <michael@niedermayer.cc>	2024-10-20 20:36:15 +08:00
Nuo Mi	a144e7b92e	avcodec/vvcdec: remove unused tb_pos_x0 and tb_pos_y0 This change will save approximately 531 MB for an 8K clip when processed with 16 threads. The calculation is as follows: 7680 * 4320 * sizeof(int) * 2 * 2 * 16 / (4 * 4).	2024-10-16 20:28:09 +08:00
Nuo Mi	2e936f2c11	avcodec/vvdec: refact, ff_vvc_deblock_bs use CodingUnit/TransformUnit instead of fc->tabs perf result for: "perf record -F 99 ./ffmpeg_g -i Tango2_3840x2160_60_10_420_27_LD.266 -f null -" before: 5.24% 1.87% ffmpeg_g [.] vvc_deblock_bs_chroma 1.72% ffmpeg_g [.] ff_vvc_deblock_bs 1.65% ffmpeg_g [.] vvc_deblock_bs_luma after: 3.48% 1.84% ffmpeg_g [.] vvc_deblock_bs_chroma 1.64% ffmpeg_g [.] ff_vvc_deblock_bs + vvc_deblock_bs_luma(inlined)	2024-10-16 20:28:09 +08:00
Nuo Mi	d78b43ecf8	avcodec/vvcdec: misc, move pcmf from min_tu_tl_init to min_cb_nz_tl_init pcmf are cu level flags	2024-10-16 20:28:09 +08:00
Nuo Mi	634780f3cf	avcodec/vvcdec: refact out deblock boundary strength stage The deblock boundary strength stage utilizes ~5% of CPU resources for 8K clips. It's worth considering it as a standalone stage. This stage has been relocated to follow the parser process, allowing us to reuse CUs and TUs before releasing them.	2024-10-16 20:28:09 +08:00
Nuo Mi	846fbc395b	avcodec/vvc: simplify priority logical to improve performance for 4K/8K For 4K/8K video processing, it's possible to have over 1,000 tasks pending on the executor. In such cases, O(n) and O(log(n)) insertion times are too costly. Reducing this to O(1) will significantly decrease the time spent in critical sections clip \| before \| after \| delta ------------------------------------------------------------\|--------\|--------\|------- VVC_HDR_UHDTV2_OpenGOP_7680x4320_50fps_HLG10.bit \| 24 \| 27 \| 12.5% VVC_HDR_UHDTV2_OpenGOP_7680x4320_50fps_HLG10_HighBitrate.bit\| 12 \| 17 \| 41.7% tears_of_steel_4k_8M_8bit_2000.vvc \| 34 \| 102 \| 200.0% VVC_UHDTV1_OpenGOP_3840x2160_60fps_HLG10.bit \| 126 \| 128 \| 1.6% RitualDance_1920x1080_60_10_420_37_RA.266 \| 350 \| 378 \| 8.0% NovosobornayaSquare_1920x1080.bin \| 341 \| 369 \| 8.2% Tango2_3840x2160_60_10_420_27_LD.266 \| 69 \| 70 \| 1.4% RitualDance_1920x1080_60_10_420_32_LD.266 \| 243 \| 259 \| 6.6% Chimera_8bit_1080P_1000_frames.vvc \| 420 \| 392 \| -6.7% BQTerrace_1920x1080_60_10_420_22_RA.vvc \| 148 \| 144 \| -2.7%	2024-10-04 21:58:42 +08:00
Nuo Mi	40a14ef970	avcodec/executor: remove unused ready callback Due to the nature of multithreading, using a "ready check" mechanism may introduce a deadlock. For example: Suppose all tasks have been submitted to the executor, and the last thread checks the entire list and finds no ready tasks. It then goes to sleep, waiting for a new task. However, for some multithreading-related reason, a task becomes ready after the check. Since no other thread is aware of this and no new tasks are being added to the executor, a deadlock occurs. In VVC, this function is unnecessary because we use a scoreboard. All tasks submitted to the executor are ready tasks.	2024-10-04 21:58:42 +08:00
Nuo Mi	8446e27bf3	avcodec: make a local copy of executor We still need several refactors to improve the current VVC decoder's performance, which will frequently break the API/ABI. To mitigate this, we've copied the executor from avutil to avcodec. Once the API/ABI is stable, we will move this class back to avutil	2024-10-04 21:58:42 +08:00
Zhao Zhili	240c16bbc6	avcodec/vvc: Don't use large array on stack tmp_array in dmvr_hv takes 33024 bytes on stack, which can be dangerous.	2024-10-01 11:30:22 +08:00
sunyuechi	ba7d0d5fc3	lavc/vvc_mc: R-V V avg w_avg C908 X60 avg_8_2x2_c : 1.2 1.0 avg_8_2x2_rvv_i32 : 0.7 0.7 avg_8_2x4_c : 2.0 2.2 avg_8_2x4_rvv_i32 : 1.2 1.2 avg_8_2x8_c : 3.7 4.0 avg_8_2x8_rvv_i32 : 1.7 1.5 avg_8_2x16_c : 7.2 7.7 avg_8_2x16_rvv_i32 : 3.0 2.7 avg_8_2x32_c : 14.2 15.2 avg_8_2x32_rvv_i32 : 5.5 5.0 avg_8_2x64_c : 51.0 43.7 avg_8_2x64_rvv_i32 : 39.2 29.7 avg_8_2x128_c : 100.5 79.2 avg_8_2x128_rvv_i32 : 79.7 68.2 avg_8_4x2_c : 1.7 2.0 avg_8_4x2_rvv_i32 : 1.0 0.7 avg_8_4x4_c : 3.5 3.7 avg_8_4x4_rvv_i32 : 1.2 1.2 avg_8_4x8_c : 6.7 7.0 avg_8_4x8_rvv_i32 : 1.7 1.5 avg_8_4x16_c : 13.5 14.0 avg_8_4x16_rvv_i32 : 3.0 2.7 avg_8_4x32_c : 26.2 27.7 avg_8_4x32_rvv_i32 : 5.5 4.7 avg_8_4x64_c : 73.0 73.7 avg_8_4x64_rvv_i32 : 39.0 32.5 avg_8_4x128_c : 143.0 137.2 avg_8_4x128_rvv_i32 : 72.7 68.0 avg_8_8x2_c : 3.5 3.5 avg_8_8x2_rvv_i32 : 1.0 0.7 avg_8_8x4_c : 6.2 6.5 avg_8_8x4_rvv_i32 : 1.5 1.0 avg_8_8x8_c : 12.7 13.2 avg_8_8x8_rvv_i32 : 2.0 1.5 avg_8_8x16_c : 25.0 26.5 avg_8_8x16_rvv_i32 : 3.2 2.7 avg_8_8x32_c : 50.0 52.7 avg_8_8x32_rvv_i32 : 6.2 5.0 avg_8_8x64_c : 118.7 122.5 avg_8_8x64_rvv_i32 : 40.2 31.5 avg_8_8x128_c : 236.7 220.2 avg_8_8x128_rvv_i32 : 85.2 67.7 avg_8_16x2_c : 6.2 6.7 avg_8_16x2_rvv_i32 : 1.2 0.7 avg_8_16x4_c : 12.5 13.0 avg_8_16x4_rvv_i32 : 1.7 1.0 avg_8_16x8_c : 24.5 26.0 avg_8_16x8_rvv_i32 : 3.0 1.7 avg_8_16x16_c : 49.0 51.5 avg_8_16x16_rvv_i32 : 5.5 3.0 avg_8_16x32_c : 97.5 102.5 avg_8_16x32_rvv_i32 : 10.5 5.5 avg_8_16x64_c : 213.7 222.0 avg_8_16x64_rvv_i32 : 48.5 34.2 avg_8_16x128_c : 434.7 420.0 avg_8_16x128_rvv_i32 : 97.7 74.0 avg_8_32x2_c : 12.2 12.7 avg_8_32x2_rvv_i32 : 1.5 1.0 avg_8_32x4_c : 24.5 25.5 avg_8_32x4_rvv_i32 : 3.0 1.7 avg_8_32x8_c : 48.5 50.7 avg_8_32x8_rvv_i32 : 5.2 2.7 avg_8_32x16_c : 96.7 101.2 avg_8_32x16_rvv_i32 : 10.2 5.0 avg_8_32x32_c : 192.7 202.2 avg_8_32x32_rvv_i32 : 19.7 9.5 avg_8_32x64_c : 427.5 426.5 avg_8_32x64_rvv_i32 : 64.2 18.2 avg_8_32x128_c : 816.5 821.0 avg_8_32x128_rvv_i32 : 135.2 75.5 avg_8_64x2_c : 24.0 
25.2 avg_8_64x2_rvv_i32 : 2.7 1.5 avg_8_64x4_c : 48.2 50.5 avg_8_64x4_rvv_i32 : 5.0 2.7 avg_8_64x8_c : 96.0 100.7 avg_8_64x8_rvv_i32 : 9.7 4.5 avg_8_64x16_c : 207.7 201.2 avg_8_64x16_rvv_i32 : 19.0 9.0 avg_8_64x32_c : 383.2 402.0 avg_8_64x32_rvv_i32 : 37.5 17.5 avg_8_64x64_c : 837.2 828.7 avg_8_64x64_rvv_i32 : 84.7 35.5 avg_8_64x128_c : 1640.7 1640.2 avg_8_64x128_rvv_i32 : 206.0 153.0 avg_8_128x2_c : 48.7 51.0 avg_8_128x2_rvv_i32 : 5.2 2.7 avg_8_128x4_c : 96.7 101.5 avg_8_128x4_rvv_i32 : 10.2 5.0 avg_8_128x8_c : 192.2 202.0 avg_8_128x8_rvv_i32 : 19.7 9.2 avg_8_128x16_c : 400.7 403.2 avg_8_128x16_rvv_i32 : 38.7 18.5 avg_8_128x32_c : 786.7 805.7 avg_8_128x32_rvv_i32 : 77.0 36.2 avg_8_128x64_c : 1615.5 1655.5 avg_8_128x64_rvv_i32 : 189.7 80.7 avg_8_128x128_c : 3182.0 3238.0 avg_8_128x128_rvv_i32 : 397.5 308.5 w_avg_8_2x2_c : 1.7 1.2 w_avg_8_2x2_rvv_i32 : 1.2 1.0 w_avg_8_2x4_c : 2.7 2.7 w_avg_8_2x4_rvv_i32 : 1.7 1.5 w_avg_8_2x8_c : 21.7 4.7 w_avg_8_2x8_rvv_i32 : 2.7 2.5 w_avg_8_2x16_c : 9.5 9.2 w_avg_8_2x16_rvv_i32 : 4.7 4.2 w_avg_8_2x32_c : 19.0 18.7 w_avg_8_2x32_rvv_i32 : 9.0 8.0 w_avg_8_2x64_c : 62.0 50.2 w_avg_8_2x64_rvv_i32 : 47.7 33.5 w_avg_8_2x128_c : 116.7 87.7 w_avg_8_2x128_rvv_i32 : 80.0 69.5 w_avg_8_4x2_c : 2.5 2.5 w_avg_8_4x2_rvv_i32 : 1.2 1.0 w_avg_8_4x4_c : 4.7 4.5 w_avg_8_4x4_rvv_i32 : 1.7 1.7 w_avg_8_4x8_c : 9.0 8.7 w_avg_8_4x8_rvv_i32 : 2.7 2.5 w_avg_8_4x16_c : 17.7 17.5 w_avg_8_4x16_rvv_i32 : 4.7 4.2 w_avg_8_4x32_c : 35.0 35.0 w_avg_8_4x32_rvv_i32 : 9.0 8.0 w_avg_8_4x64_c : 100.5 84.5 w_avg_8_4x64_rvv_i32 : 42.2 33.7 w_avg_8_4x128_c : 203.5 151.2 w_avg_8_4x128_rvv_i32 : 83.0 69.5 w_avg_8_8x2_c : 4.5 4.5 w_avg_8_8x2_rvv_i32 : 1.2 1.2 w_avg_8_8x4_c : 8.7 8.7 w_avg_8_8x4_rvv_i32 : 2.0 1.7 w_avg_8_8x8_c : 17.0 17.0 w_avg_8_8x8_rvv_i32 : 3.2 2.5 w_avg_8_8x16_c : 34.0 33.5 w_avg_8_8x16_rvv_i32 : 5.5 4.2 w_avg_8_8x32_c : 86.0 67.5 w_avg_8_8x32_rvv_i32 : 10.5 8.0 w_avg_8_8x64_c : 187.2 149.5 w_avg_8_8x64_rvv_i32 : 45.0 35.5 w_avg_8_8x128_c : 342.7 290.0 
w_avg_8_8x128_rvv_i32 : 108.7 70.2 w_avg_8_16x2_c : 8.5 8.2 w_avg_8_16x2_rvv_i32 : 2.0 1.2 w_avg_8_16x4_c : 16.7 16.7 w_avg_8_16x4_rvv_i32 : 3.0 1.7 w_avg_8_16x8_c : 33.2 33.5 w_avg_8_16x8_rvv_i32 : 5.5 3.0 w_avg_8_16x16_c : 66.2 66.7 w_avg_8_16x16_rvv_i32 : 10.5 5.0 w_avg_8_16x32_c : 132.5 131.0 w_avg_8_16x32_rvv_i32 : 20.0 9.7 w_avg_8_16x64_c : 340.0 283.5 w_avg_8_16x64_rvv_i32 : 60.5 37.2 w_avg_8_16x128_c : 641.2 597.5 w_avg_8_16x128_rvv_i32 : 118.7 77.7 w_avg_8_32x2_c : 16.5 16.7 w_avg_8_32x2_rvv_i32 : 3.2 1.7 w_avg_8_32x4_c : 33.2 33.2 w_avg_8_32x4_rvv_i32 : 5.5 2.7 w_avg_8_32x8_c : 66.0 62.5 w_avg_8_32x8_rvv_i32 : 10.5 5.0 w_avg_8_32x16_c : 131.5 132.0 w_avg_8_32x16_rvv_i32 : 20.2 9.5 w_avg_8_32x32_c : 261.7 272.0 w_avg_8_32x32_rvv_i32 : 39.7 18.0 w_avg_8_32x64_c : 575.2 545.5 w_avg_8_32x64_rvv_i32 : 105.5 58.7 w_avg_8_32x128_c : 1154.2 1088.0 w_avg_8_32x128_rvv_i32 : 207.0 98.0 w_avg_8_64x2_c : 33.0 33.0 w_avg_8_64x2_rvv_i32 : 6.2 2.7 w_avg_8_64x4_c : 65.5 66.0 w_avg_8_64x4_rvv_i32 : 11.5 5.0 w_avg_8_64x8_c : 131.2 132.5 w_avg_8_64x8_rvv_i32 : 22.5 9.5 w_avg_8_64x16_c : 268.2 262.5 w_avg_8_64x16_rvv_i32 : 44.2 18.0 w_avg_8_64x32_c : 561.5 528.7 w_avg_8_64x32_rvv_i32 : 88.0 35.2 w_avg_8_64x64_c : 1136.2 1124.0 w_avg_8_64x64_rvv_i32 : 222.0 82.2 w_avg_8_64x128_c : 2345.0 2312.7 w_avg_8_64x128_rvv_i32 : 423.0 190.5 w_avg_8_128x2_c : 65.7 66.5 w_avg_8_128x2_rvv_i32 : 11.2 5.5 w_avg_8_128x4_c : 131.2 132.2 w_avg_8_128x4_rvv_i32 : 22.0 10.2 w_avg_8_128x8_c : 263.5 312.0 w_avg_8_128x8_rvv_i32 : 43.2 19.7 w_avg_8_128x16_c : 528.7 526.2 w_avg_8_128x16_rvv_i32 : 85.5 39.5 w_avg_8_128x32_c : 1067.7 1062.7 w_avg_8_128x32_rvv_i32 : 171.7 78.2 w_avg_8_128x64_c : 2234.7 2168.7 w_avg_8_128x64_rvv_i32 : 400.0 159.0 w_avg_8_128x128_c : 4752.5 4295.0 w_avg_8_128x128_rvv_i32 : 757.7 365.5 Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	2024-09-24 20:04:51 +03:00
Zhao Zhili	5c66a3ab51	avcodec/vvc: Fix output and unref a frame which isn't decoding yet ff_vvc_output_frame is called before actually decoding. It's possible for ff_vvc_output_frame to select current frame to output. If current frame is nonref frame, it will be released by ff_vvc_unref_frame. Fix this by always marking the current frame with VVC_FRAME_FLAG_SHORT_REF, as is done by the HEVC decoder.	2024-09-15 16:42:14 +08:00
Frank Plowman	6df0c5f9f4	lavc/vvc: Remove experimental flag This reverts commit `110d8549d5`. I have been working through fixing bugs, particularly crashes I've found using a fuzzer, in the VVC decoder for the past few months. While I won't claim it is now bug-free, it is considerably more resilient than it was and I think in a position to have the experimental flag removed for release 7.1. Additionally, most of the Main 10 features of VVC which were missing from the version of the decoder released in 7.0 have now been implemented. This includes the most major missing features: IBC, subpictures and RPR. Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-09-06 22:14:52 +08:00
Nuo Mi	3d2fafa229	avcodec/vvcdec: fix potential deadlock in report_frame_progress Fixes: https://fate.ffmpeg.org/report.cgi?slot=x86_64-archlinux-gcc-tsan&time=20240823175808 Reproduction steps: ./configure --enable-memory-poisoning --toolchain=gcc-tsan --disable-stripping && make fate-vvc Root cause: We hold the current frame's lock while updating progress for other frames, which also requires acquiring other frame locks. This could potentially lead to a deadlock. However, I don't think this will happen in practice because progress updates are one-way, with no cyclic dependencies. But we need this patch to make FATE happy.	2024-09-03 21:32:27 +08:00
Frank Plowman	54291f4383	lavc/vvc: Fix assertion bound on qPy_{a,b} Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-09-03 20:57:52 +08:00
Frank Plowman	01701bdcd5	lavc/vvc: Prevent OOB access in subpic_tiles The previous logic relied on the subpicture boundaries coinciding with the tile boundaries. Per 6.3.1 of H.266 (V3), vertical subpicture boundaries are always tile boundaries; however, the same cannot be said for horizontal subpicture boundaries. Furthermore, it is possible to construct an illegal bitstream where vertical subpicture boundaries are not coincident with tile boundaries. In these cases, the condition of the while loop would never be satisfied, resulting in an OOB read on col_bd/row_bd. Patch fixes this issue by replacing != with <, thereby not requiring subpicture boundaries and tile boundaries to be coincident. Signed-off-by: Frank Plowman <post@frankplowman.com>	2024-08-31 15:05:23 +08:00
Nuo Mi	b2eabe0ff2	avcodec/vvcdec: format, fix indent for vvc_deblock_bs	2024-08-31 14:16:19 +08:00
Nuo Mi	7bd22342c3	avcodec/vvcdec: filter, fix uninitialized variables for YUV400 format fix ==135000== Conditional jump or move depends on uninitialised value(s) ==135000== at 0x169FF95: vvc_deblock_bs (filter.c:699) and ==135000== Conditional jump or move depends on uninitialised value(s) ==135000== at 0x16A2E72: ff_vvc_alf_filter (filter.c:1217) Reported-by: James Almer <jamrial@gmail.com>	2024-08-31 14:16:19 +08:00
Nuo Mi	f851abb4b3	avcodec/vvcdec: bdof, do not pad sources and gradients to simplify the code	2024-08-31 13:57:51 +08:00
Nuo Mi	8347def797	avcodec/vvcdec: misc, rename BDOF_BLOCK_SIZE to BDOF_MIN_BLOCK_SIZE	2024-08-31 13:57:51 +08:00
Wu Jianhua	ca5c9e810a	avcodec/vvc/dsp: prefix TxType and TxSize with VVC See https://patchwork.ffmpeg.org/project/ffmpeg/patch/TYSPR06MB64337C4A9ADF5312E6648543AA62A@TYSPR06MB6433.apcprd06.prod.outlook.com/#81892 Signed-off-by: Wu Jianhua <toqsxw@outlook.com>	2024-08-15 20:52:14 +08:00
Wu Jianhua	ae1a9cfd52	avcodec/vvc_parser: move avctx->has_b_frames initialization to dec From Jun Zhao <mypopydev@gmail.com>: > Should we relocate this to the decoder? Other codecs typically set this > parameter in the decoder. Signed-off-by: Wu Jianhua <toqsxw@outlook.com>	2024-08-15 20:50:24 +08:00
Nuo Mi	80af195804	avcodec/vvcdec: move frame tab memset from the main thread to worker threads Running memset on the tables in the main thread can become a bottleneck for the decoder. For example, if it takes 1% of the processing time for one core, the maximum achievable FPS will be 100. Moving the memset to worker threads fixes the issue.	2024-08-15 20:33:57 +08:00
Nuo Mi	daf6fcd816	avcodec/vvcdec: do not zero frame qp table For luma, qp can only change at the CU level, so the qp tab size is related to the CU. For chroma, considering the joint CbCr, the QP tab size is related to the TU.	2024-08-15 20:33:57 +08:00
Nuo Mi	ca2caeb21d	avcodec/vvcdec: do not zero frame msf mmi table	2024-08-15 20:33:57 +08:00
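
The slice-structure check described in commit 8bd66a8c95 (every tile belongs to a slice, and no tile belongs to more than one) can be illustrated with a small Python model. This is only a sketch, not the FFmpeg C code: `build_slice_map`, `InvalidData`, and the representation of each slice as a list of tile indices are hypothetical names and simplifications chosen for illustration.

```python
class InvalidData(Exception):
    """Stand-in for AVERROR_INVALIDDATA in this sketch (hypothetical name)."""


def build_slice_map(num_tiles, slices):
    """Return a list mapping each tile index to the slice that owns it.

    `slices` holds one list of tile indices per slice. InvalidData is
    raised when a tile index is out of range, when a tile is claimed by
    two slices (mutual exclusivity), or when a tile is left unclaimed
    (exhaustivity).
    """
    owner = [None] * num_tiles
    for slice_idx, tiles in enumerate(slices):
        for tile in tiles:
            if not 0 <= tile < num_tiles:
                raise InvalidData(f"tile {tile} is outside the picture")
            if owner[tile] is not None:
                raise InvalidData(
                    f"tile {tile} belongs to slices {owner[tile]} and {slice_idx}"
                )
            owner[tile] = slice_idx
    # Any tile still unowned means the slices do not cover the picture.
    if any(o is None for o in owner):
        raise InvalidData("some tiles do not belong to any slice")
    return owner
```

Performing both checks while the map is being built, as the commit message describes, means an invalid structure is rejected before any decoding stage can index into it.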


212 commits
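
The packing described in commit 34c6ad0a07 (intra_mip_flag in 1 bit, intra_mip_transposed_flag in 1 bit, and intra_mip_mode in 4 bits, stored together in a single byte) can be modelled as follows. This is a sketch in Python rather than the decoder's C bitfield; the bit positions and the helper names `pack_mip` and `unpack_mip` are assumptions made for illustration, since the commit message does not state the layout.

```python
def pack_mip(flag, transposed, mode):
    """Pack the three MIP fields into one integer that fits in a byte.

    Assumed layout (not taken from the FFmpeg source):
    bit 0 = flag, bit 1 = transposed, bits 2..5 = mode.
    """
    if flag not in (0, 1) or transposed not in (0, 1):
        raise ValueError("flag fields must be 0 or 1")
    if not 0 <= mode < 16:
        raise ValueError("mode must fit in 4 bits")
    return flag | (transposed << 1) | (mode << 2)


def unpack_mip(packed):
    """Recover (flag, transposed, mode) from a value made by pack_mip."""
    return packed & 1, (packed >> 1) & 1, (packed >> 2) & 0xF


# Round trip: the fields survive packing, and the result fits in 8 bits.
packed = pack_mip(1, 0, 5)
assert unpack_mip(packed) == (1, 0, 5)
assert packed < 256
```

The three fields need only 6 bits in total, which is why they fit in one byte per table entry.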