Commit graph

49729 commits

Author SHA1 Message Date
Tong Wu
83e0fcbe03 avcodec/d3d12va_decode: delete an empty line and fix a fuction alignment
Signed-off-by: Tong Wu <tong1.wu@intel.com>
2024-01-05 11:06:57 +08:00
Tong Wu
d18ed2ddb5 avcodec/d3d12va_vp9: fix vp9 max_num_refs value
Previous max_num_refs was based on pp.frame_refs plus 1 and it could possibly
reaches the size limit. Actually it should be the size of pp.ref_frame_map
plus 1.

Signed-off-by: Tong Wu <tong1.wu@intel.com>
2024-01-05 11:06:57 +08:00
Zhao Zhili
33698ef891 avcodec/mpegutils: print axis in debug_info2
For example,

./ffmpeg -nostats -threads 1 -debug qp \
	-export_side_data +venc_params \
	-i reinit-small_420_9-to-small_420_8.h264 \
	-frames 2 \
	-f null -

New frame, type: B
    0       64      128     192
  0 313131313131313131313131313129
 16 292929292929292929292929292929
 32 323232323232323232323232323232
 48 323232323232323232323232323232
 64 323232323232323232323232323232
 80 323232323232323232323232323232
 96 323232323030303030303030303030
112 303030303030303030303030303030
128 303030303030303030303030303028
144 313131312929292929292929292929
160 292929292929292929292929292929
176 292929292929292929292929292931
192 312831312631313131312730283131

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-01-04 17:35:11 +08:00
Zhao Zhili
bd48c08f80 avcodec/mpegutils: make debug_info2 thread safe
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-01-04 17:33:56 +08:00
Zhao Zhili
7f900a737f avcodec/videotoolbox: specify color range for hw frame ctx
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-01-04 17:33:38 +08:00
Zhao Zhili
0f824d792d avcodec/hevc_parser: fix missing zero_byte at frame beginning
The start code is matched against 0x000001, zero_byte was treated
as last byte of last frame rather than the beginning of next frame.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-01-05 01:14:33 +08:00
Zhao Zhili
b7ac1f9856 avcodec/hevc_mp4toannexb_bsf: use HEVCNALUnitType instead of integer literal
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-01-05 01:14:33 +08:00
Nuo Mi
301ed950d1 vvcdec: add vvc decoder
vvc decoder plug-in to avcodec.
split frames into slices/tiles and send them to vvc_thread for further decoding
reorder and wait for the frame decoding to be done and output the frame

Features:
    + Support I, P, B frames
    + Support 8/10/12 bits, chroma 400, 420, 422, and 444 and range extension
    + Support VVC new tools like MIP, CCLM, AFFINE, GPM, DMVR, PROF, BDOF, LMCS, ALF
    + 295 conformace clips passed
    - Not support RPR, IBC, PALETTE, and other minor features yet

Performance:
    C code FPS on an i7-12700K (x86):
        BQTerrace_1920x1080_60_10_420_22_RA.vvc      93.0
        Chimera_8bit_1080P_1000_frames.vvc          184.3
        NovosobornayaSquare_1920x1080.bin           191.3
        RitualDance_1920x1080_60_10_420_32_LD.266   150.7
        RitualDance_1920x1080_60_10_420_37_RA.266   170.0
        Tango2_3840x2160_60_10_420_27_LD.266         33.7

    C code FPS on a M1 Mac Pro (ARM):
        BQTerrace_1920x1080_60_10_420_22_RA.vvc     58.7
        Chimera_8bit_1080P_1000_frames.vvc          153.3
        NovosobornayaSquare_1920x1080.bin           150.3
        RitualDance_1920x1080_60_10_420_32_LD.266   105.0
        RitualDance_1920x1080_60_10_420_37_RA.266   133.0
        Tango2_3840x2160_60_10_420_27_LD.266        21.7

    Asm optimizations still working in progress. please check
    https://github.com/ffvvc/FFmpeg/wiki#performance-data for the latest

Contributors (based on code merge order):
    Nuo Mi <nuomi2021@gmail.com>
    Xu Mu <toxumu@outlook.com>
    Frank Plowman <post@frankplowman.com>
    Shaun Loo <shaunloo10@gmail.com>
    Wu Jianhua <toqsxw@outlook.com>

Thank you for reporting issues and providing performance reports:
    Łukasz Czech <lukaszcz18@wp.pl>
    Xu Fulong <839789740@qq.com>

Thank you for providing review comments:
    Ronald S. Bultje <rsbultje@gmail.com>
    James Almer <jamrial@gmail.com>
    Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:12 +08:00
Nuo Mi
e7ef457d6b vvcdec: add CTU thread logical
This is the main entry point for the CTU (Coding Tree Unit) decoder.
The code will divide the CTU decoder into several stages.
It will check the stage dependencies and run the stage decoder.

Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:12 +08:00
Nuo Mi
07f75d5e02 vvcdec: add CTU parser
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:12 +08:00
Nuo Mi
b49575f4cf vvcdec: add dsp init and inv transform
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:12 +08:00
Nuo Mi
02c1455b44 vvcdec: add LMCS, Deblocking, SAO, and ALF filters
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:12 +08:00
Nuo Mi
c05ba94ce8 vvcdec: add intra prediction
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:12 +08:00
Nuo Mi
2592cc1f96 vvcdec: add inv transform 1d
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:11 +08:00
Nuo Mi
ea49c83bad vvcdec: add inter prediction
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:11 +08:00
Nuo Mi
603d0bd171 vvcdec: add motion vector decoder
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:11 +08:00
Nuo Mi
c1a3d17491 vvcdec: add reference management
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:11 +08:00
Nuo Mi
976d3b7d69 vvcdec: add cabac decoder
add Context-based Adaptive Binary Arithmetic Coding (CABAC) decoder

Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 23:15:06 +08:00
Nuo Mi
e97a5bbb13 vvcdec: add parameter parser for sps, pps, ph, sh
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 16:31:59 +08:00
Nuo Mi
49db9fc171 vvcdec: add vvc_data
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
2024-01-03 16:31:59 +08:00
James Almer
85b8d59ec7 avcodec/d3d12va_mpeg2: change the type for the ID3D12Resource_Map input data argument
Fixes -Wincompatible-pointer-types warnings.

Reviewed-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-01 09:46:41 -03:00
James Almer
e9722735fa avcodec/d3d12va_mpeg2: remove unused variables
Reviewed-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-01-01 09:46:06 -03:00
Michael Niedermayer
e063c1d079
avcodec/mpegvideo_enc: Use ptrdiff_t for stride
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-30 21:50:05 +01:00
Michael Niedermayer
a066b8a809
avcodec/mpegvideo_enc: Dont copy beyond the image
Fixes: out of array access
Fixes: tickets/10754/poc17ffmpeg

Discovered by Zeng Yunxiang.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-30 21:50:04 +01:00
Michael Niedermayer
bf1159774b
avcodec/vaapi_encode: Avoid double AVERRORS
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-29 21:36:02 +01:00
Michael Niedermayer
850ab8f6da
avcodec/jpegxl_parser: Check get_vlc2()
Fixes: shift exponent -1 is negative
Fixes: 63889/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-6009343056936960

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-29 19:21:26 +01:00
Michael Niedermayer
d909d8e5e0
avcodec/leaddec: Check remaining bits in decode_block()
Fixes: Timeout
Fixes: 64163/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_LEAD_fuzzer-6418925835124736

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-29 01:15:42 +01:00
Michael Niedermayer
5f88458bea
avcodec/jpegxl_parser: Add padding to cs_buffer
Fixes: out of array access
Fixes: 64081/clusterfuzz-testcase-minimized-ffmpeg_DEMUXER_fuzzer-6151006496620544

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-29 01:15:42 +01:00
Michael Niedermayer
c72a20f01a
avcodec/jpeglsdec: Check Jpeg-LS LSE
Fixes: signed integer overflow: 2147478526 + 33924 cannot be represented in type 'int'
Fixes: shift exponent 32 is too large for 32-bit type 'unsigned int'
Fixes: 64243/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEGLS_fuzzer-5195717848989696

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-29 01:00:48 +01:00
Michael Niedermayer
c75fccd1d4
avcodec/osq: Implement flush()
Fixes: out of array access
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6227491892887552
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6268561729126400
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6414805046788096
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6538151088488448
Fixes: 62164/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6608131540779008

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-29 00:45:20 +01:00
jinbo
545686e49e
avcodec/hevc: Add init for sao_edge_filter
Forgot to init c->sao_edge_filter[idx] when idx=0/1/2/3.
After this patch, the speedup of decoding H265 4K 30FPS
30Mbps on 3A6000 is about 7% (42fps==>45fps).

Change-Id: I521999b397fa72b931a23c165cf45f276440cdfb
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-12-29 00:45:20 +01:00
Devin Heitmueller
b2c82b23b9 avcodec/bitpacked_dec: optimize bitpacked_decode_yuv422p10
Rework the code a bit to speed up the 10-bit bitpacked decoding
routine.  This is probably about as fast as I can get it without
switching to assembly language.

Demonstratable with:

./ffmpeg -f lavfi -i "smptehdbars=size=3840x2160" -c bitpacked -f image2 -frames:v 1 source.yuv
./ffmpeg -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le out.yuv

On my development system, it went from 80ms for a 2160p frame
down to 20ms (i.e. a 4X speedup).  Good enough for now, I hope...

Comments from Marton:

Originally on my system better performance could be achieved by simply
switching to the cached bitstream reader, but for Devin it was slower than
his direct byte operations.

I changed the order of writing output from u/y/v/y to u/v/y/y, and that made
the code faster than the cached bitstream reader on my system as well.

TIMER measurement of the decode loop on Ryzen 5 3600 with command line:

./ffmpeg -stream_loop 256 -threads 1 -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le -f null none -loglevel error

Before: 823204127 decicycles in YUV,     256 runs,      0 skips
After:  315070524 decicycles in YUV,     256 runs,      0 skips

Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
2023-12-28 23:56:14 +01:00
Marton Balint
059ea1d6f6 avcodec/mjpegdec: avoid indirection when accessing avctx
Signed-off-by: Marton Balint <cus@passwd.hu>
2023-12-28 23:15:56 +01:00
Marton Balint
e6b9bfaac3 avcodec/mjpegdec: use memset to clear alpha
Signed-off-by: Marton Balint <cus@passwd.hu>
2023-12-28 23:15:56 +01:00
Leo Izen
fb54c89a0d
avcodec/jpegxl_parser: check ANS cluster alphabet size vs bundle size
The specification doesn't mention that clusters cannot have alphabet
sizes greater than 1 << bundle->log_alphabet_size, but the reference
implementation rejects these entropy streams as invalid, so we should
too. Refusing to do so can overflow a stack variable that should be
large enough otherwise.

Fixes #10738.

Found-by: Zeng Yunxiang and Li Zeyuan
Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-12-27 10:10:09 -05:00
James Almer
4fee63b241 x86/takdsp: add missing wrappers to AVX2 functions
Fixes compilation with old yasm.

Signed-off-by: James Almer <jamrial@gmail.com>
2023-12-25 22:31:15 -03:00
James Almer
591dc3b4b8 x86/takdsp: add avx2 versions of all functions
On an Intel Core i7 12700k:

decorrelate_ls_c: 814.3
decorrelate_ls_sse2: 165.8
decorrelate_ls_avx2: 101.3
decorrelate_sf_c: 1602.6
decorrelate_sf_sse4: 640.1
decorrelate_sf_avx2: 324.6
decorrelate_sm_c: 1564.8
decorrelate_sm_sse2: 379.3
decorrelate_sm_avx2: 203.3
decorrelate_sr_c: 785.3
decorrelate_sr_sse2: 176.3
decorrelate_sr_avx2: 99.8

Tested-by: Lynne <dev@lynne.ee>
Signed-off-by: James Almer <jamrial@gmail.com>
2023-12-23 08:39:22 -03:00
Andreas Rheinhardt
370ce305f4
avcodec/libjxlenc: Set AV_CODEC_CAP_DR1
This encoder uses ff_get_encode_buffer() to allocate the packet buffer.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-12-22 22:08:29 -05:00
Andreas Rheinhardt
577dd7b762
avcodec/libjxlenc: Don't refer to decoder in comments
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-12-22 22:08:29 -05:00
Leo Izen
5942cf46b6
avcodec/libjxldec: emit proper PTS to decoded AVFrame
If a sequence of JXL images is encapsulated in a container that has PTS
information, we should use the PTS information from the container. At
this time there is no container that does this, but if JPEG XL support
is ever added to NUT, AVTransport, or some other container, this commit
should allow the PTS information those containers provide to work as
expected.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-12-22 22:08:29 -05:00
Leo Izen
42f78925d7
avcodec/libjxlenc: accept rgbf32 and rgbaf32 frames
These pixel formats have always been supported by libjxl, but at the
time this plugin was written, they were not in FFmpeg yet. Now that
they are in FFmpeg, we should support them.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-12-22 22:08:29 -05:00
Leo Izen
f6ef6a853c
avcodec/libjxldec: produce rgbf32 and rgbaf32 frames
These pixel formats have always been supported by libjxl, but at the
time this plugin was written, they were not in FFmpeg yet. Now that
they are in FFmpeg, we should support them.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2023-12-22 22:08:29 -05:00
Leo Izen
4013b8d3f0
avcodec/pngdec: improve handling of bad cICP range tags
FFmpeg doesn't support tv-range RGB throughout most of its pipeline, so
we should keep the warning. However, in case something does support it
we should at least keep it tagged properly. Additionally, the encoder
writes this tag if the space is tagged as such so this makes a round
trip work as it should.

Also, PNG doesn't support nonzero matrices but we only warn and ignore
in that case, so we have no reason to error out for illegal cICP ranges
either (i.e. greater than 1).

Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Kacper Michajłow <kasper93@gmail.com>
2023-12-22 22:07:35 -05:00
sunyuechi
3d39b8d4e7 lavc/takdsp: R-V V decorrelate_sm
C908:
decorrelate_sm_c: 130.0
decorrelate_sm_rvv_i32: 43.2

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
(with minor changes)
2023-12-22 17:40:00 +02:00
Andreas Rheinhardt
0c6203c97a all: Don't set AVClass.item_name to its default value
Unnecessary since acf63d5350;
also avoids relocations.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-12-22 15:12:33 +01:00
James Almer
46775e64f8 avcodec/takdsp: fix const correctness
Signed-off-by: James Almer <jamrial@gmail.com>
2023-12-22 09:28:04 -03:00
Andreas Rheinhardt
45b4781e9a avcodec/v4l2_m2m: Remove redundant av_frame_unref()
This frame will be freed in the next line.

Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-12-21 23:29:02 +01:00
Rémi Denis-Courmont
0f05f9ed3e mlp: move pack_output pointer to decoder context
The current pack_output function pointer is a property of the decoder,
rather than a constant method provided by the DSP code. Indeed, except
for an unused initialisation, the field is never used in DSP code.
2023-12-21 22:42:34 +02:00
sunyuechi
c933ff2779 lavc/takdsp: R-V V decorrelate_sr
C908:
decorrelate_sr_c: 95.5
decorrelate_sr_rvv_i32: 28.2

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-21 22:42:34 +02:00
sunyuechi
864174dd00 lavc/takdsp: R-V V decorrelate_ls
C908:
decorrelate_ls_c: 69.7
decorrelate_ls_rvv_i32: 27.2

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-21 22:42:34 +02:00