19 commits

Author SHA1 Message Date
Andreas Rheinhardt
1df63acdc4 avcodec: Add av_cold to flush,init,close functions missing it
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-13 20:37:03 +00:00
Wu Jianhua
26215b8c83 avcodec/vvc/ctu: add palette support
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2025-05-17 09:22:40 +08:00
Wu Jianhua
75e5fb6e37 avcodec/vvc: refact out ep_init and ep_init_wpp
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2025-05-17 09:22:40 +08:00
James Almer
d7180a3f92 avcodec/vvc/dec: print thread debug logs only if DEBUG is defined
Makes the output of a normal decoding process with loglevel debug a lot less
verbose.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-01-10 10:23:57 -03:00
Nuo Mi
eb67e60cb0 avcodec/vvcdec: schedule next stage only if the current stage reports no error
If the current stage reports an error, some variables may not be correctly initialized.
Scheduling the next stage could lead to the use of uninitialized variables.
2024-11-30 09:58:59 +08:00
Nuo Mi
5c5a08ecb5 avcodec/vvcdec: ensure every CTU belongs to a slice
According to section 6.3.3 "Spatial or component-wise partitionings,"
CTUs should fully cover slices with no overlaps, gaps, or additions.
No overlaps are ensured by task_init_parse.
No gaps and no additions are ensured by this patch.

Co-authored-by: Frank Plowman <post@frankplowman.com>
2024-11-30 09:58:59 +08:00
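The three properties named in 5c5a08ecb5 (no overlaps, no gaps, no additions) can be illustrated with a small bookkeeping sketch. The `CtuCoverage` type and the `coverage_*` helpers are hypothetical names made up for this note; they are not the decoder's actual data structures, and the real patch may detect these conditions differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-picture tracker; not the decoder's real structure. */
typedef struct CtuCoverage {
    uint8_t *claimed;    /* one flag per CTU, caller-owned, ctu_count bytes */
    size_t   ctu_count;  /* number of CTUs in the picture */
    size_t   nb_claimed; /* CTUs assigned to some slice so far */
} CtuCoverage;

static void coverage_reset(CtuCoverage *c, uint8_t *flags, size_t ctu_count)
{
    assert(flags && ctu_count > 0);
    memset(flags, 0, ctu_count);
    c->claimed    = flags;
    c->ctu_count  = ctu_count;
    c->nb_claimed = 0;
}

/* Returns false for an out-of-range address (addition) or a repeat (overlap). */
static bool coverage_claim(CtuCoverage *c, size_t ctu_addr)
{
    if (ctu_addr >= c->ctu_count || c->claimed[ctu_addr])
        return false;
    c->claimed[ctu_addr] = 1;
    c->nb_claimed++;
    return true;
}

/* True only when every CTU has been claimed exactly once (no gaps). */
static bool coverage_complete(const CtuCoverage *c)
{
    return c->nb_claimed == c->ctu_count;
}
```

A caller would reset the tracker per picture, claim each CTU as slices are parsed, and reject the picture if any claim fails or if `coverage_complete` is false once all slices have been seen.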
Nuo Mi
634780f3cf avcodec/vvcdec: refact out deblock boundary strength stage
The deblock boundary strength stage utilizes ~5% of CPU resources for 8K clips.
It's worth considering it as a standalone stage. This stage has been relocated
to follow the parser process, allowing us to reuse CUs and TUs before releasing them.
2024-10-16 20:28:09 +08:00
Nuo Mi
846fbc395b avcodec/vvc: simplify priority logical to improve performance for 4K/8K
For 4K/8K video processing, it's possible to have over 1,000 tasks pending on the executor.
In such cases, O(n) and O(log(n)) insertion times are too costly.
Reducing this to O(1) will significantly decrease the time spent in critical sections.

clip                                                        | before | after  | delta
------------------------------------------------------------|--------|--------|-------
VVC_HDR_UHDTV2_OpenGOP_7680x4320_50fps_HLG10.bit            |    24  |   27   |  12.5%
VVC_HDR_UHDTV2_OpenGOP_7680x4320_50fps_HLG10_HighBitrate.bit|    12  |   17   |  41.7%
tears_of_steel_4k_8M_8bit_2000.vvc                          |    34  |  102   | 200.0%
VVC_UHDTV1_OpenGOP_3840x2160_60fps_HLG10.bit                |   126  |  128   |   1.6%
RitualDance_1920x1080_60_10_420_37_RA.266                   |   350  |  378   |   8.0%
NovosobornayaSquare_1920x1080.bin                           |   341  |  369   |   8.2%
Tango2_3840x2160_60_10_420_27_LD.266                        |    69  |   70   |   1.4%
RitualDance_1920x1080_60_10_420_32_LD.266                   |   243  |  259   |   6.6%
Chimera_8bit_1080P_1000_frames.vvc                          |   420  |  392   |  -6.7%
BQTerrace_1920x1080_60_10_420_22_RA.vvc                     |   148  |  144   |  -2.7%
2024-10-04 21:58:42 +08:00
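One common way to get the O(1) insertion mentioned in 846fbc395b is to keep a separate list per priority level instead of one sorted list or heap. The sketch below assumes a small, fixed number of levels; `NB_PRIORITIES`, `Task`, `TaskQueue`, `queue_push` and `queue_pop` are illustrative names, and the commit's actual layout may differ.

```c
#include <assert.h>
#include <stddef.h>

#define NB_PRIORITIES 4 /* assumed: a small, fixed number of levels */

typedef struct Task {
    struct Task *next;
    int priority;       /* 0 is the most urgent */
} Task;

typedef struct TaskQueue {
    Task *head[NB_PRIORITIES]; /* one LIFO list per priority level */
} TaskQueue;

/* O(1): push onto the front of the list for this task's priority. */
static void queue_push(TaskQueue *q, Task *t)
{
    assert(t->priority >= 0 && t->priority < NB_PRIORITIES);
    t->next = q->head[t->priority];
    q->head[t->priority] = t;
}

/* O(NB_PRIORITIES): take a task from the first non-empty level. */
static Task *queue_pop(TaskQueue *q)
{
    for (int p = 0; p < NB_PRIORITIES; p++) {
        Task *t = q->head[p];
        if (t) {
            q->head[p] = t->next;
            t->next = NULL;
            return t;
        }
    }
    return NULL;
}
```

Both operations are independent of the number of pending tasks, so the time spent holding the executor lock no longer grows with queue length. The trade-off in this sketch is that ordering within one level is last-in, first-out rather than strictly sorted.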
Nuo Mi
40a14ef970 avcodec/executor: remove unused ready callback
Due to the nature of multithreading, using a "ready check" mechanism may introduce a deadlock. For example:

Suppose all tasks have been submitted to the executor, and the last thread checks the entire list and finds
no ready tasks. It then goes to sleep, waiting for a new task. However, for some multithreading-related reason,
a task becomes ready after the check. Since no other thread is aware of this and no new tasks are being added to
the executor, a deadlock occurs.

In VVC, this function is unnecessary because we use a scoreboard. All tasks submitted to the executor are ready tasks.
2024-10-04 21:58:42 +08:00
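The scoreboard idea referred to in 40a14ef970 can be sketched as a per-task counter of unmet dependencies: a task is handed to the executor only by whoever resolves its last dependency, so the executor never has to ask whether a queued task is ready. `SbTask`, `sb_init`, `sb_dependency_done` and `mark_submitted` are hypothetical names for this illustration, not the decoder's API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct SbTask {
    atomic_int pending;              /* dependencies not yet satisfied */
    void (*submit)(struct SbTask *); /* hands the task to the executor */
    int submitted;                   /* bookkeeping for this example only */
} SbTask;

static void sb_init(SbTask *t, int nb_deps, void (*submit)(SbTask *))
{
    assert(nb_deps > 0);
    atomic_init(&t->pending, nb_deps);
    t->submit    = submit;
    t->submitted = 0;
}

/* Called once per satisfied dependency, possibly from different threads. */
static void sb_dependency_done(SbTask *t)
{
    /* fetch_sub returns the previous value; 1 means this was the last one. */
    if (atomic_fetch_sub(&t->pending, 1) == 1)
        t->submit(t);
}

/* Stand-in for the executor: just records that the task was submitted. */
static void mark_submitted(SbTask *t)
{
    t->submitted++;
}
```

Because the decrement is atomic, exactly one caller observes the transition to zero, so the task is submitted once and only after it is runnable. That removes the window between "checked, nothing ready" and "went to sleep" described above.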
Nuo Mi
8446e27bf3 avcodec: make a local copy of executor
We still need several refactors to improve the current VVC decoder's performance,
which will frequently break the API/ABI. To mitigate this, we've copied the executor from
avutil to avcodec. Once the API/ABI is stable, we will move this class back to avutil.
2024-10-04 21:58:42 +08:00
Nuo Mi
3d2fafa229 avcodec/vvcdec: fix potential deadlock in report_frame_progress
Fixes:
https://fate.ffmpeg.org/report.cgi?slot=x86_64-archlinux-gcc-tsan&time=20240823175808

Reproduction steps:
./configure --enable-memory-poisoning --toolchain=gcc-tsan --disable-stripping && make fate-vvc

Root cause:
We hold the current frame's lock while updating progress for other frames,
which also requires acquiring other frame locks. This could potentially lead to a deadlock.
However, I don't think this will happen in practice because progress updates are one-way, with no cyclic dependencies.
But we need this patch to make FATE happy.
2024-09-03 21:32:27 +08:00
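A usual way to avoid the nested locking described in 3d2fafa229 is to detach the satisfied listeners while holding the frame's lock and to call them only after the lock has been released. The sketch below shows that pattern under assumed names (`FrameProgress`, `Listener`, `report_progress`, `mark_notified`); it is not a copy of the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct Listener {
    struct Listener *next;
    int wait_row;                      /* row this listener is waiting for */
    void (*notify)(struct Listener *); /* may lock a different frame */
    int notified;                      /* bookkeeping for this example only */
} Listener;

typedef struct FrameProgress {
    pthread_mutex_t lock;
    int row;             /* rows completed so far */
    Listener *listeners; /* listeners still waiting */
} FrameProgress;

static void report_progress(FrameProgress *fp, int row)
{
    Listener *ready = NULL;

    assert(fp);
    pthread_mutex_lock(&fp->lock);
    if (row > fp->row)
        fp->row = row;
    /* Unlink every satisfied listener while the lock is held... */
    for (Listener **pp = &fp->listeners; *pp; ) {
        Listener *l = *pp;
        if (l->wait_row <= fp->row) {
            *pp     = l->next;
            l->next = ready;
            ready   = l;
        } else {
            pp = &l->next;
        }
    }
    pthread_mutex_unlock(&fp->lock);

    /* ...and run the callbacks only after it has been released. */
    while (ready) {
        Listener *l = ready;
        ready = l->next;
        l->notify(l);
    }
}

/* Example callback: records the call. A real one might update another frame. */
static void mark_notified(Listener *l)
{
    l->notified = 1;
}
```

With this shape, no thread ever holds two frame locks at once, which is the property a lock-order checker such as TSan looks for.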
Nuo Mi
80af195804 avcodec/vvcdec: move frame tab memset from the main thread to worker threads
Running memset on the tables in the main thread can become a bottleneck for the decoder.
For example, if it takes 1% of the processing time for one core, the maximum achievable FPS will be 100.
Moving the memset to worker threads fixes the issue.
2024-08-15 20:33:57 +08:00
Nuo Mi
bdb79fe60a avcodec/vvcdec: thread, ensure the parse stage gets the highest priority
The parser stage is not parallelizable.
We need to schedule it as soon as possible to create later stages, which are more parallelizable.

clips                                       | before | after | delta
--------------------------------------------|--------|-------|------
RitualDance_1920x1080_60_10_420_37_RA.266   | 342.7  | 365.3 |  6.59%
NovosobornayaSquare_1920x1080.bin           | 321.7  | 400   | 24.34%
Tango2_3840x2160_60_10_420_27_LD.266        |  82.3  |  91.7 | 11.42%
RitualDance_1920x1080_60_10_420_32_LD.266   | 323.7  | 319.3 | -1.36%
Chimera_8bit_1080P_1000_frames.vvc          | 364    | 411.3 | 12.99%
BQTerrace_1920x1080_60_10_420_22_RA.vvc     | 162.7  | 185.7 | 14.14%
2024-08-15 20:19:45 +08:00
Zhao Zhili
0e5f8ddc1d avcodec/vvc: Use static const for function table
2024-07-11 20:26:47 +08:00
Frank Plowman
c917c423e0 lavc/vvc: Don't discard return codes
Signed-off-by: Frank Plowman <post@frankplowman.com>
2024-06-27 20:36:13 +08:00
Nuo Mi
77acd0a0dd avcodec/vvcdec: inter, wait reference with a different resolution
For RPR, the current frame may reference a frame with a different resolution.
Therefore, we need to consider frame scaling when we wait for reference pixels.
2024-05-21 20:20:25 +08:00
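To make the scaling consideration in 77acd0a0dd concrete: when the reference picture has a different height, a row in the current frame has to be mapped into the reference's coordinates before the decoder knows how far the reference must have progressed. The sketch below uses a 14-bit fixed-point ratio; the helper names, the `margin` parameter (standing in for interpolation filter taps) and the omission of the motion vector offset are all simplifying assumptions, not the decoder's actual code.

```c
#include <assert.h>

#define SCALE_BITS 14 /* fixed-point precision assumed for the ratio */

/* Ratio ref_h / cur_h in fixed point, rounded to nearest. */
static int rpr_scale(int ref_h, int cur_h)
{
    assert(ref_h > 0 && cur_h > 0);
    return ((ref_h << SCALE_BITS) + (cur_h >> 1)) / cur_h;
}

/*
 * Last reference row needed to predict rows up to y_cur of the current frame.
 * margin is a hypothetical allowance for interpolation filter taps.
 */
static int rpr_ref_row(int y_cur, int cur_h, int ref_h, int margin)
{
    const int scale = rpr_scale(ref_h, cur_h);
    int y_ref = (int)(((long long)y_cur * scale) >> SCALE_BITS) + margin;

    /* Never wait for a row beyond the bottom of the reference picture. */
    if (y_ref > ref_h - 1)
        y_ref = ref_h - 1;
    return y_ref;
}
```

For example, with a 1080-row current frame and a 2160-row reference, row 100 with a margin of 4 maps to reference row 204; with equal heights it maps to row 104. Waiting on the unscaled row would either stall longer than needed or read pixels that are not decoded yet.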
Nuo Mi
66c6bee061 avcodec/vvcdec: refact out VVCRefPic from RefPicList
2024-05-21 20:20:25 +08:00
Nuo Mi
a9586a00df avcodec/vvcdec: ff_vvc_frame_submit, avoid initializing task twice.
For some malformed bitstreams, a CTU belongs to two slices/entry points.
If the decoder initializes and submits the CTU task twice, it may crash the program
or cause it to enter an infinite loop.

Reported-by: Frank Plowman <post@frankplowman.com>
2024-05-06 20:22:42 +08:00
Andreas Rheinhardt
db063212c8 avcodec/vvc: Rename vvc_?foo->foo
A namespace is unnecessary here given that all these files
are already in the vvc subfolder.

Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-04-04 16:45:00 +02:00
Renamed from libavcodec/vvc/vvc_thread.c