Add a shader-based Apple ProRes decoder.
It supports all codec features for profiles up to
the 4444 XQ profile, ie.:
- 4:2:2 and 4:4:4 chroma subsampling
- 10- and 12-bit component depth
- Interlacing
- Alpha
The implementation consists in two shaders: the
VLD kernel does entropy decoding for color/alpha,
and the IDCT kernel performs the inverse transform
on color components.
Benchmarks for a 4k yuv422p10 sample:
- AMD Radeon 6700XT: 178 fps
- Intel i7 Tiger Lake: 37 fps
- NVidia Orin Nano: 70 fps
In preparation for the Vulkan hwaccel.
The existing hwaccel code was designed around
videotoolbox, which ingests the whole frame
bitstream including picture headers.
This adapts the code to accomodate lower-level,
slice-based hwaccels.
This was introduced in commit 9c43703, to support a codec "extension"
in the prores_aw encoder.
This removes the chroma fill loop, and instead performs the inverse
transform on null coefficients, which achieves the same result and
fixes an off-by-one in the chroma values produced.
Updated test to reflect this change.
This commit adds a reference to the buffer as an argument to
start_frame, and adapts all existing code.
This allows for asynchronous hardware accelerators to skip
copying packet data by referencing it.
avctx->bits_per_raw_sample is always 10 or 12 here;
the checks have been added in preparation for making
bits_per_raw_sample user-settable via an AVOption,
but this never happened.
While just at it, also set unpack_alpha earlier
(where bits_per_raw_sample is set).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Using LONG_BITSTREAM_READER means that every get_bits() call
uses an AV_RB64() to ensure that cache always contains 32 valid bits
(as opposed to the ordinary 25 guaranteed by reading 32 bits);
yet this is unnecessary when unpacking alpha. So only use these
64bit reads where necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: 70036/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PRORES_fuzzer-6298797647396864
Fixes: shift exponent 40 is too large for 32-bit type 'uint32_t' (aka 'unsigned int')
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Once upon a time, there used to be a LGPL and a GPL ProRes decoder
in FFmpeg; the current decoder evolved from the second of these.
But given that it is now the only ProRes decoder we have, it's file
should simply be named proresdec.c (which also brings it in line with
its header).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* qatar/master:
Add LATM demuxer
avplay: flush audio decoder with empty packets at EOF if the decoder has CODEC_CAP_DELAY set.
8svx/iff: fix decoding of compressed stereo 8svx files.
8svx: log an error message if output buffer is too small
8svx: check packet size before reading the initial sample value.
8svx: output 8-bit samples instead of 16-bit.
8svx: split delta decoding into a separate function.
mp4: Don't read an empty Decoder Config Descriptor
fate.sh: Ignore errors from rm command during cleanup.
fate.sh: Run git-pull in quiet mode to avoid console spam.
Apple ProRes decoder
rtmp: Make the input FLV parser handle data cut at any point
rv34: Check for invalid slices offsets
eval: test isnan(sqrt(-1)) instead of just sqrt(-1)
Conflicts:
Changelog
libavcodec/8svx.c
libavcodec/proresdec.c
libavcodec/version.h
libavformat/iff.c
libavformat/version.h
tests/ref/fate/eval
Merged-by: Michael Niedermayer <michaelni@gmx.at>
LOCAL_ALIGNED should work for all compilers/systems whereas
DECLARE_ALIGNED does not work on some (do not remember which though).
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>