Commit graph

50209 commits

Author SHA1 Message Date
Lynne
eee5fa0808
aacdec: add a decoder for AAC USAC (xHE-AAC)
This commit adds a decoder for the frequency-domain part of USAC.

What works:
 - Mono
 - Stereo (no prediction)
 - Stereo (mid/side coding)
 - Stereo (complex prediction)

What's left:
 - SBR
 - Speech coding

Known issues:
 - Desync with certain sequences
 - Preroll crossover missing (shouldn't matter, bitrate adaptation only)
2024-06-02 18:34:45 +02:00
Lynne
23b45d7e20
aactab: add new scalefactor offset tables for 96/768pt windows 2024-06-02 18:34:45 +02:00
Lynne
a300ec3569
aactab: add tables for the new USAC arithmetic coder 2024-06-02 18:34:45 +02:00
Lynne
7cd8a3f509
aactab: add deemphasis tables for USAC 2024-06-02 18:34:44 +02:00
Lynne
0513c5cd25
aacdec_dsp: implement 768-point transform and windowing
Required for USAC
2024-06-02 18:34:44 +02:00
Lynne
f8543f3763
aacdec: expose decode_tns
USAC has the same syntax, with one minor change we can check for.
2024-06-02 18:34:43 +02:00
Lynne
0f2303f629
aacdec: expose channel layout related functions 2024-06-02 18:34:43 +02:00
Lynne
39b8d84b53
aacdec: move from scalefactor ranged arrays to flat arrays
AAC uses an unconventional system to send scalefactors
(the volume+quantization value for each band).
Each window is split into either 1 or 8 blocks (long vs short),
and transformed separately from one another, with the coefficients
for each being also completely independent. The scalefactors
slightly increase from 64 (long) to 128 (short) to accommodate
better per-block-per-band volume for each window.

To reduce overhead, the codec signals scalefactor sizes in an obtuse way,
where each group's scalefactor types are sent via a variable length decoding,
with a range.
But our decoder was written in a way where those ranges were carried through
the entire decoder, and to actually read them you had to use the range.

Instead of having a dedicated array with a range for each scalefactor,
just let the decoder directly index each scalefactor.

This also switches the form of quantized scalefactors to the format
the spec uses, where for intensity stereo and regular, scalefactors
are stored in a scalefactor - 100 form, rather than as-is.

USAC gets rid of the complex scalefactor handling. This commit permits
code sharing between both.
2024-06-02 18:34:43 +02:00
Rémi Denis-Courmont
6c6bec04f3 lavc/vc1dsp: fix R-V V avg_mspel_pixels
The 8x8 pixel arrays are not necessarily aligned to 64 bits, so the
current code leads to a bus error on real hardware. This is reproducible
with FATE's vc1_ilaced_twomv test case.

The new "pessimist" code can trivially be shared for 16x16 pixel
arrays so we also do that. FWIW, this also nominally reduces the
hardware requirement from Zve64x to Zve32x.

T-Head C908:
vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_c:      14.7
vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_rvv_i32: 3.5
vc1dsp.avg_vc1_mspel_pixels_tab[1][0]_c:       3.7
vc1dsp.avg_vc1_mspel_pixels_tab[1][0]_rvv_i32: 1.5

SpacemiT X60:
vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_c:      13.0
vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_rvv_i32: 3.0
vc1dsp.avg_vc1_mspel_pixels_tab[1][0]_c:       3.2
vc1dsp.avg_vc1_mspel_pixels_tab[1][0]_rvv_i32: 1.2
2024-06-02 10:37:09 +03:00
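The alignment point in the message above can be illustrated in plain C. This is only a sketch of the "pessimistic" idea (never assume more alignment than the pixel type guarantees), not the RISC-V vector code from the commit. The function name is hypothetical, and the rounding average `(a + b + 1) >> 1` is an assumption about what the avg operation computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-wise 8x8 average: every access is a uint8_t access,
 * so no 64-bit alignment of dst or src is required. */
static void avg_pixels8x8_bytewise(uint8_t *dst, const uint8_t *src,
                                   ptrdiff_t stride)
{
    assert(dst != NULL && src != NULL);
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            dst[x] = (uint8_t)((dst[x] + src[x] + 1) >> 1); /* rounding average */
        dst += stride;
        src += stride;
    }
}
```

Casting the row pointers to `uint64_t *` instead would be faster on some targets but is only valid when the rows are known to be 8-byte aligned, which is the assumption the commit removes.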
Andreas Rheinhardt
2c38ca3d37 avcodec/hevc_ps: Fix UB 1 << 31
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-06-02 05:15:00 +02:00
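For readers unfamiliar with the issue in the subject above: with a 32-bit int, `1 << 31` shifts a one into the sign bit of a signed type, which is undefined behaviour in C. A minimal sketch of the usual remedy, shifting an unsigned constant instead; the helper name `bit_mask` is hypothetical and not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a mask with only bit n set, for n in 0..31.
 * The shifted constant is unsigned, so n == 31 is well defined, whereas
 * (1 << 31) overflows a 32-bit signed int. */
static uint32_t bit_mask(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << n;
}
```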
Michael Niedermayer
00d029d5c0
avcodec/sga: Make it clear that the return is intentionally not checked
Related: CID1473496 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-01 18:01:29 +02:00
James Almer
7e182a8d92 avcodec/videotoolbox: use the correct HEVCSPS field name
Fixes compilation that was broken in 6fed1841a1.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-01 10:56:19 -03:00
Rémi Denis-Courmont
06fc919aad lavc/sbrdsp: add support for 256-bit vectors
hf_apply_noise_0_c: 35.7
hf_apply_noise_0_rvv_f32: 9.5
hf_apply_noise_1_c: 38.5
hf_apply_noise_1_rvv_f32: 10.0
hf_apply_noise_2_c: 35.5
hf_apply_noise_2_rvv_f32: 9.7
hf_apply_noise_3_c: 38.5
hf_apply_noise_3_rvv_f32: 10.0

Maybe extending the noise table manually is not such a great idea, but I am
not quite sure how to deal with that otherwise. Allocating the table
dynamically is possible but would require an ELF destructor to clean up.
2024-05-31 22:22:43 +03:00
Anton Khirnov
63a96dbcce lavc/hevc_ps: compactify ShortTermRPS
Do not use larger fields than needed, use size-1 bitfields for flags.

Reduces sizeof(HEVCSPS) by 1280 bytes.
2024-05-31 19:26:06 +02:00
Anton Khirnov
9127819d51 lavc/hevc_ps: reduce the size of ShortTermRPS.used
It is currently an array of 32 uint8_t, each storing a single flag. A
single uint32_t is sufficient.

Reduces sizeof(HEVCSPS) by 1792 bytes.
2024-05-31 19:26:06 +02:00
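The packing described above (32 one-byte flags replaced by a single uint32_t) might look roughly like this. The accessor names are hypothetical and are not the ones used in hevc_ps; the sketch only shows the bit manipulation involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessors for 32 boolean flags stored in one uint32_t
 * (4 bytes) instead of uint8_t[32] (32 bytes). */
static void flag_set(uint32_t *flags, unsigned idx, int value)
{
    assert(idx < 32);
    if (value)
        *flags |= UINT32_C(1) << idx;    /* set bit idx */
    else
        *flags &= ~(UINT32_C(1) << idx); /* clear bit idx */
}

static int flag_get(uint32_t flags, unsigned idx)
{
    assert(idx < 32);
    return (flags >> idx) & 1;
}
```

Each instance saves 28 bytes this way; the total reduction quoted in the message depends on how many such structures HEVCSPS embeds.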
Anton Khirnov
d893667867 lavc/hevc_ps: do not store delta_poc_s[01] in ShortTermRPS
They are only used in vulkan_hevc and are not actually needed, as they
can be computed from delta_poc.

Reduces sizeof(HEVCSPS) by 16kB.

Also, fix a typo (s0->s1) in the code being touched.
2024-05-31 19:26:06 +02:00
Anton Khirnov
4264e4056c lavc/hevc_ps: fix variable signedness in ff_hevc_decode_short_term_rps()
It is actually supposed to go negative in the loop over num_negative
pics, but underflow does not break anything as the result is then
assigned to a signed int.
2024-05-31 19:26:06 +02:00
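The remark about underflow being harmless can be shown with a small sketch. It is an illustration only, not the decoder's loop: the function name and the input layout are hypothetical.

```c
#include <assert.h>

/* Hypothetical illustration: subtract deltas in an unsigned accumulator,
 * then hand the result to a signed int. */
static int sum_negative_deltas(const unsigned *deltas, int count)
{
    unsigned acc = 0; /* unsigned wrap-around is well defined */

    assert(count >= 0);
    for (int i = 0; i < count; i++)
        acc -= deltas[i];

    /* Converting an out-of-range unsigned value to int is
     * implementation-defined before C23; two's complement targets
     * yield the expected negative number. */
    return (int)acc;
}
```

Declaring the accumulator as a signed int in the first place, as the commit does, avoids relying on that conversion.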
Anton Khirnov
6fed1841a1 lavc/hevc_ps/HEVCSPS: change flags into uint8_t
Reduces sizeof(HEVCSPS) by 64 bytes.

Also improve flag names: drop redundant suffixes and prefixes, and
consistently use disabled/enabled.
2024-05-31 19:26:06 +02:00
Anton Khirnov
bd1a06dc43 lavc/hevc_ps: reduce the size of used_by_curr_pic_lt_sps_flag
It is currently an array of 32 uint8_t, each storing a single flag. A
single uint32_t is sufficient.
2024-05-31 19:26:06 +02:00
Anton Khirnov
72bdbce00d lavc/hevcdec: drop a useless execute() call with 1 job 2024-05-31 19:26:06 +02:00
Anton Khirnov
f0aece90d9 lavc/hevcdec: allocate local_ctx as array of structs rather than pointers
It is more efficient and easier to manage.
2024-05-31 19:26:06 +02:00
Anton Khirnov
25ce44efa5 lavc/hevcdec: track local context count separately from WPP thread count
The latter can be lowered while decoding, which would lead to memleaks.
2024-05-31 19:26:06 +02:00
Anton Khirnov
a1471ec8ad lavc/hevcdec: rename HEVCContext.HEVClcList to local_ctx
It is more consistent with our naming conventions.
2024-05-31 19:26:06 +02:00
James Almer
e0db1f51d6 avcodec/lpc: account for odd len values
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-05-31 13:36:02 -03:00
James Almer
8a1c491354 avcodec/packet: remove reference to old AV_SIDE_DATA_PARAM_CHANGE_ values
They were forgotten in 65ddc74988.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-05-31 11:24:37 -03:00
Andreas Rheinhardt
8cbf7e8408 avcodec/diracdec: Mark flush as av_cold
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
0f3090cbd1 avcodec/diracdec: Use FF_CODEC_CAP_INIT_CLEANUP
This was one of the few decoders incompatible with the flag.
Also only call free_sequence_buffers() instead of dirac_decode_flush()
in dirac_decode_end().

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
d9bd5baf9d avcodec/vc2enc: Use already available AVPixFmtDescriptor
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
d1d40a7c9b avcodec/vc2enc: Move transient PutBitContext from ctx to stack
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
c309285666 avcodec/vc2enc: Avoid relocations for short strings
These strings are so short that they can be put directly
into the containing structure, avoiding the pointer
and putting it into .rodata.
Also use chars for interlaced and level while at it, as
these are so small.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
02ecf8d7f3 avcodec/vc2enc: Fix slice length
args->bytes here already includes prefix_bytes (see
SSIZE_ROUND macro), so including it here again and
forgetting it when offsetting skip seems wrong.
This only works because prefix_bytes is currently
always zero in this encoder.
(This has been added in b88be742fa
without any reason.)

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
eac8dcb187 avcodec/vc2enc: Remove superfluous error message
ff_get_encode_buffer() already emits an error message of its own.
While just at it, also check for ret < 0 instead of just ret != 0.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
b1702afdfd avcodec/vc2enc: Constify slices->main context pointers
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Andreas Rheinhardt
6d86146fce avcodec/vc2enc: Avoid void* where possible
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-05-31 14:18:33 +02:00
Wu Jianhua
9950f14864 avcodec/x86/vvc/vvc_alf: use xq to match ptrdiff_t
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2024-05-31 19:57:31 +08:00
Wu Jianhua
09d3370c28 avcodec/x86/vvc/vvc_alf: fix integer overflow
Some tests fail with certain seeds

tests/checkasm/checkasm 2325607578 --test=vvc_alf
checkasm: using random seed 2325607578
AVX2:
    vvc_alf_filter_luma_120x20_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x24_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x28_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x32_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x36_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x40_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x44_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x48_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x52_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x56_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x60_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x64_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x68_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x72_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x76_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x80_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x84_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x88_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x92_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x96_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x100_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x104_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x108_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x112_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x116_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x120_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x124_12_avx2 (vvc_alf.c:104)
    vvc_alf_filter_luma_120x128_12_avx2 (vvc_alf.c:104)
  - vvc_alf.alf_filter   [FAILED]
  - vvc_alf.alf_classify [OK]
checkasm: 28 of 9216 tests have failed

Reported-by: James Almer <jamrial@gmail.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
2024-05-31 19:57:31 +08:00
Pierre-Anthony Lemieux
249c66bb22
avcodec/jpeg2000dec: fix HT block decoder
Addresses https://trac.ffmpeg.org/ticket/10905

Co-authored-by: Osamu Watanabe <owatanab@es.takushoku-u.ac.jp>
Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>
2024-05-30 21:30:52 -07:00
sunyuechi
544acfa2c0 lavc/vp9dsp: R-V V rename ff_avg to ff_vp9_avg
Avoid potential naming conflicts

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-05-30 18:30:52 +03:00
Rémi Denis-Courmont
4fe8f2cc43 riscv: allow passing addend to vtype_vli macro
A constant (-1) is added to the length value, so we can have an addend
for free, and optimise the addition away if the addend is exactly 1.
2024-05-30 18:30:52 +03:00
Rémi Denis-Courmont
fa3b153cb1 lavc/vp7dsp: R-V V vp7_idct_add
Most of the code is shared with DC, thanks to minor earlier changes.

vp7_idct_add_c:       5.2
vp7_idct_add_rvv_i32: 2.5
2024-05-29 16:57:02 +03:00
Rémi Denis-Courmont
4a0e629b6f lavc/vp7dsp: revector ff_vp7_dc_wht_rvv
This prepares for some code reuse.
2024-05-29 16:57:02 +03:00
Rémi Denis-Courmont
fd39997f72 lavc/vp7dsp: add R-V V vp7_luma_dc_wht
This works out a bit more favourably than VP8's due to:
- additional multiplications that can be vectored,
- hardware-supported fixed-point rounding mode.

vp7_luma_dc_wht_c:       3.2
vp7_luma_dc_wht_rvv_i64: 2.0
2024-05-29 16:57:02 +03:00
Rémi Denis-Courmont
91b5ea7bb9 lavc/vp8dsp: R-V V vp8_luma_dc_wht
This is not great as transposition is poorly supported, but it works:
vp8_luma_dc_wht_c:       2.5
vp8_luma_dc_wht_rvv_i32: 1.7
2024-05-29 16:57:02 +03:00
Rémi Denis-Courmont
c53d42380d lavc/lpc: optimise RVV vector type for compute_autocorr
On SpacemiT X60 (with len == 4000):
autocorr_10_c:       2303.7
autocorr_10_rvv_f64: 1411.5 (before)
autocorr_10_rvv_f64:  842.2 (after)
2024-05-29 16:57:02 +03:00
Stone Chen
55e9c758f0 libavcode/x86/vvc: change label to vvc_sad_16 to reflect block sizes
According to the VVC specification (section 8.5.1), the maximum width/height of a subblock passed for DMVR SAD is 16. Together with the earlier constraint requiring width * height >= 128, this means that 8x16, 16x8, and 16x16 are the only allowed sizes. This relabels vvc_sad_16_128 to vvc_sad_16 to reflect this and adds a comment about the block size constraints. There is no functional change.
2024-05-29 21:35:34 +08:00
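The size argument above can be checked by brute force. The sketch below assumes subblock dimensions are powers of two starting at 4, which is an assumption made for illustration; the function names are hypothetical.

```c
#include <assert.h>

/* Nonzero if a w x h subblock satisfies both constraints quoted in the
 * commit message: each dimension at most 16, and area at least 128. */
static int dmvr_sad_size_allowed(int w, int h)
{
    assert(w > 0 && h > 0);
    return w <= 16 && h <= 16 && w * h >= 128;
}

/* Counts allowed sizes among power-of-two dimensions 4, 8 and 16. */
static int count_allowed_sizes(void)
{
    int n = 0;
    for (int w = 4; w <= 16; w *= 2)
        for (int h = 4; h <= 16; h *= 2)
            n += dmvr_sad_size_allowed(w, h);
    return n;
}
```

Under that assumption only 8x16, 16x8 and 16x16 pass, so every block reaching the SAD routine has at least one dimension equal to 16.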
David Rosca
510494760c lavc/vaapi_h264: Fix merging fields in DPB with missing references
If there are missing references, h264 decode does error concealment
by copying previous refs which means there will be duplicated surfaces.
Check long_ref and frame_idx in addition to surface when looking for
the other field to avoid trying to merge with wrong picture.
Also allow merging with multiple pictures in case there are duplicates
of the other field.

Signed-off-by: David Rosca <nowrep@gmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-29 10:52:10 +08:00
David Rosca
d2d911eb9a lavc/vaapi_av1: Avoid sending the same slice buffer multiple times
When there are multiple tiles in one slice buffer, use multiple slice
params to avoid sending the same slice buffer multiple times and thus
increasing the bitstream size the driver will need to upload to hw.

Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Signed-off-by: David Rosca <nowrep@gmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-29 10:49:35 +08:00
David Rosca
fe9d889dcd lavc/vaapi_decode: Make it possible to send multiple slice params buffers
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Signed-off-by: David Rosca <nowrep@gmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-29 10:47:43 +08:00
Haihao Xiang
c872ba5899 lavc/qsvenc: respect user's setting for keyframes
For example:
./ffmpeg -hwaccel qsv -i input.mp4 -force_key_frames:v source -c:v
hevc_qsv -f null -

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-29 10:46:54 +08:00
Haihao Xiang
dbdd9ccded lavc/qsvdec: fix keyframes
MFX_FRAMETYPE_IDR is ORed to the frame type for AVC and HEVC keyframes,
and MFX_FRAMETYPE_I is taken as keyframe flag for other codecs when
getting the output surface from the SDK, hence we may mark the output
frame as keyframe accordingly.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-05-29 10:46:54 +08:00