47409 commits

Author SHA1 Message Date
Andreas Rheinhardt
7d23b350c2 avcodec/mpeg12dec: Remove always-true check
mpeg12dec.c is a decoder-only file.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 07:29:05 +02:00
Andreas Rheinhardt
6fe4e8fab4 avcodec/mpegvideo: Split ff_mpv_reconstruct_mb() into de/encoder part
This has the advantage of not having to check for whether
a given MpegEncContext is actually a decoder or an encoder
context at runtime.

To do so, mpv_reconstruct_mb_internal() is moved into a new
template file that is included by both mpegvideo_enc.c
and mpegvideo_dec.c; the decoder-only code (mainly lowres)
is also moved to mpegvideo_dec.c. The is_encoder checks are
changed to #if IS_ENCODER in order to avoid having to include
headers for decoder-only functions in mpegvideo_enc.c.

This approach also has the advantage that it is easy to adapt
mpv_reconstruct_mb_internal() to using different structures
for decoders and encoders (e.g. the check for whether
a macroblock should be processed for the encoder or not
uses MpegEncContext elements that make no sense for decoders
and should not be part of their context).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 07:29:03 +02:00
Andreas Rheinhardt
9ca312d8ab avcodec/mpegvideo: Inline is_encoder in mpv_reconstruct_mb_internal()
Up until now, we inlined lowres_flag as well as is_mpeg12
independently (unless CONFIG_SMALL was true); this commit
changes this to instead inline mpv_reconstruct_mb_internal()
(at most) four times, namely once for encoders, once for decoders
using lowres and once for non-lowres mpeg-1/2 decoders and once
for non-lowres non-mpeg-1/2 decoders (mpeg-1/2 is not inlined
in case of CONFIG_SMALL). This is neutral performance-wise,
but proved beneficial size-wise: It saved 1776B of .text
for GCC 11 or 1344B for Clang 14 (both -O3 x64).

Notice that inlining is_mpeg12 for is_encoder would not really
be beneficial, as the encoder codepath mostly does not depend
on is_mpeg12 at all.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 07:21:33 +02:00
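
The inlining scheme described in 9ca312d8ab can be sketched roughly as follows. This is only an illustration under stated assumptions: ALWAYS_INLINE stands in for av_always_inline, and reconstruct_internal() and the three wrappers are hypothetical names with placeholder arithmetic, not the actual mpegvideo code. The point is that each wrapper passes compile-time constants, so the compiler can drop the branches that cannot be taken in that copy.

```c
#include <assert.h>

/* Hypothetical stand-in for FFmpeg's av_always_inline (GCC/Clang spelling). */
#if defined(__GNUC__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline
#endif

/* Generic worker: both flags are ordinary parameters here.
 * The arithmetic is a placeholder for the real per-macroblock work. */
ALWAYS_INLINE int reconstruct_internal(int value, int is_encoder, int lowres)
{
    assert(value >= 0);
    if (is_encoder)
        return value + 1;   /* encoder-only path */
    if (lowres)
        return value / 2;   /* decoder path with lowres */
    return value;           /* plain decoder path */
}

/* Each wrapper passes constants, so every inlined copy of the worker
 * keeps only the branch that applies to it. */
int reconstruct_encoder(int value)
{
    return reconstruct_internal(value, 1, 0);
}

int reconstruct_decoder_lowres(int value)
{
    return reconstruct_internal(value, 0, 1);
}

int reconstruct_decoder(int value)
{
    return reconstruct_internal(value, 0, 0);
}
```

Callers pick the wrapper once (for example when setting up a function pointer), so no is_encoder or lowres test remains in the hot path.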
Andreas Rheinhardt
409c4723ec avcodec/mpegvideo: Make inlining is_mpeg12 more flexible
There are two types of checks for whether the current codec
is MPEG-1/2 in mpv_reconstruct_mb_internal(): Those that are
required for correctness and those that are not; an example
of the latter is "is_mpeg12 || (s->codec_id != AV_CODEC_ID_WMV2)".
The reason for the existence of such checks is that
mpv_reconstruct_mb_internal() has the av_always_inline attribute
and is_mpeg12 is usually inlined, so that in case we are dealing
with MPEG-1/2 the above check can be completely optimized away.

But is_mpeg12 is not always inlined: it is not inlined when
CONFIG_SMALL is true, in which case is_mpeg12 is always zero,
so that the checks required for correctness need to check
out_format explicitly. This is currently done via a macro
in mpv_reconstruct_mb_internal(), so that the fact that
it is CONFIG_SMALL that determines this is encoded at two places.

This commit changes this by making is_mpeg12 a three-state:
DEFINITELY_MPEG12, MAY_BE_MPEG12 and NOT_MPEG12. In the second
case, one has to resort to check out_format, in the other cases
is_mpeg12 can be taken at face value. This will allow making
inlining is_mpeg12 more flexible in a future commit.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 07:09:11 +02:00
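
A minimal sketch of the three-state idea from 409c4723ec, assuming nothing about the real code beyond the three state names given in the message. The enum values, the OutFormat enum and is_mpeg12_codec() are hypothetical; the sketch only shows that out_format has to be consulted in the MAY_BE_MPEG12 case alone.

```c
#include <assert.h>

/* State names are from the commit message; the numeric values are assumed. */
enum Mpeg12State { NOT_MPEG12 = 0, MAY_BE_MPEG12, DEFINITELY_MPEG12 };

/* Hypothetical stand-in for the context's out_format field. */
enum OutFormat { FMT_MPEG1, FMT_H263 };

/* Returns 1 if the codec is MPEG-1/2, 0 otherwise. */
static inline int is_mpeg12_codec(enum Mpeg12State is_mpeg12,
                                  enum OutFormat out_format)
{
    switch (is_mpeg12) {
    case DEFINITELY_MPEG12:
        return 1;                         /* known at compile time */
    case NOT_MPEG12:
        return 0;                         /* known at compile time */
    case MAY_BE_MPEG12:
        return out_format == FMT_MPEG1;   /* undecided: runtime check */
    }
    assert(!"invalid is_mpeg12 state");
    return 0;
}
```

When the state is passed as a constant to an inlined function, the two definite cases fold away entirely and only the undecided case keeps a runtime comparison.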
Andreas Rheinhardt
cab876f5f4 avcodec/mpegvideo: Ignore skip_idct for encoders
It is documented to be unused for encoders.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 07:09:11 +02:00
Andreas Rheinhardt
a5e59fec07 avcodec/ffv1: Move ffv1_template.c inclusion to dec/enc templates
Both the FFV1 decoder and encoder use a template of their own
to generate code multiple times. They also use a common template,
used by both decoder and encoder templates which is currently
instantiated in ffv1.h (and therefore also in ffv1.c, which
doesn't need it at all).

All these templates have the prerequisite that two macros
are defined, namely RENAME() and TYPE. The codec-specific
templates call the functions generated via the common template
via the RENAME() macro and therefore the macros used for
the common template must coincide with the macros used for
the codec-specific templates. But then it is better to not
instantiate the common template in ffv1.h, but in the codec
specific templates.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
f63c6c81d4 avcodec/mpegutils: Reindent after the previous commit
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
678f1b1cf4 avcodec/mpegutils: Return early in ff_draw_horiz_band()
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
85f02c300f avcodec/mpegvideo: Move VIDEO_FORMAT_* defines to mpeg12enc.c
Forgotten in f899e3b51b.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
9e32f2ebfd avcodec/h261: Use ptrdiff_t for stride
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
10dfbb0502 avcodec/mpegvideo: Reindent after the last commit
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
5ecf5b93dd avcodec/mpegvideo: Don't check for draw_horiz_band
Some parts of mpegvideo.c behave differently depending
upon whether AVCodecContext.draw_horiz_band is set or not.
This differing behaviour makes lots of FATE tests fail
and leads to garbage output, although setting this callback
is not supposed to change the output at all.

These checks have been added in commits
3994623df2 and
b68ab2609c. The commit messages
do not contain a real reason for adding the checks and it is
indeed a mystery to me. But removing these checks fixes
the FATE tests when one adds an (empty) draw_horiz_band
when using a codec that claims to support it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
5bcae5251f avcodec/vc1_block: Remove dead calls to ff_mpeg_draw_horiz_band()
The VC-1 decoders don't support draw_horiz_band at all.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
Andreas Rheinhardt
e0c01a62ad avcodec/(ffv1|h264|png|snow)dec: Remove commented-out DRAW_HORIZ_BAND cap
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-20 06:57:30 +02:00
James Almer
1f63225f2b avcodec/librav1e: support AV_CODEC_CAP_ENCODER_RECON_FRAME
This bumps the minimum required version to 0.5.0

Signed-off-by: James Almer <jamrial@gmail.com>
2022-10-19 11:15:38 -03:00
James Almer
f58f238936 avcodec/librav1e: export extradata on init()
librav1e provides a function to create extradata, so use it instead of
extracting the sequence header OBU from packets.

Signed-off-by: James Almer <jamrial@gmail.com>
2022-10-19 10:13:37 -03:00
James Almer
d569958d29 avcodec/librav1e: support setting sample aspect ratio
Signed-off-by: James Almer <jamrial@gmail.com>
2022-10-19 10:13:37 -03:00
Andreas Rheinhardt
30e1f5ec77 avcodec/startcode: Avoid unaligned accesses
Up until now, ff_startcode_find_candidate_c() simply casts
a uint8_t* to uint64_t*/uint32_t* to read 64/32 bits at a time
in case HAVE_FAST_UNALIGNED is true. Yet this ignores the
alignment requirement of these types as well as effective type
rules of the C standard. This commit therefore replaces these
direct accesses with AV_RN64/32; this also improves
readability.

UBSan reported these unaligned accesses which happened in 233
FATE-tests involving H.264 and VC-1 (this has also been reported
in tickets #8138 and #8485); these tests are fixed by this commit.

The output of GCC with -O3 is unchanged for aarch64, loongarch,
ppc and x64 (as well as for arches like alpha for which
HAVE_FAST_UNALIGNED is never true in the first place).
There was only a slight difference for mips and arm.
I don't know about the speed impact of them.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-19 13:48:31 +02:00
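
To illustrate the problem that 30e1f5ec77 addresses, here is a sketch of reading 64 bits from a byte pointer without casting it to uint64_t*. It is not FFmpeg's AV_RN64: read_u64_unaligned() and has_zero_byte8() are hypothetical helpers, and memcpy is used because it is the portable way to express such a read in standard C. The zero-byte test is the classic bit trick for scanning eight bytes at once; whether ff_startcode_find_candidate_c() uses exactly this expression is not stated in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes from any address.
 * memcpy has no alignment requirement and respects effective type
 * rules; compilers typically turn it into a single load where the
 * target allows unaligned accesses. */
static inline uint64_t read_u64_unaligned(const uint8_t *p)
{
    uint64_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Returns 1 if any of the 8 bytes starting at p is zero, else 0.
 * The result does not depend on byte order. */
static inline int has_zero_byte8(const uint8_t *p)
{
    uint64_t v = read_u64_unaligned(p);
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}
```

By contrast, *(const uint64_t *)p is undefined behaviour when p is not suitably aligned, which is what UBSan flags.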
Andreas Rheinhardt
81bc4ef142 avcodec/msmpeg4data: Mark tables as hidden
This e.g. allows compilers to bake the "+ 256" offset
used to access ff_v2_dc_(lum|chroma)_table into
the general offset; for certain arches this is also necessary
in order to avoid building suboptimal code.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-18 15:44:29 +02:00
Andreas Rheinhardt
c9d0ba9a60 avcodec/jpegtables: Mark jpegtables as hidden
These tables are not exported as avpriv symbols, but instead
included into every library using them. Therefore they
can be marked with hidden ELF visibility. For certain arches
this is necessary in order to avoid building suboptimal code;
for other arches it just allows the compiler to simplify accesses
like ff_mjpeg_bits_dc_luminance + 1 because the "+ 1" can be baked
into the offset.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-18 15:44:29 +02:00
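
A rough sketch of what marking a table as hidden looks like, as in c9d0ba9a60. The HIDDEN macro, demo_table and demo_lookup() are hypothetical and the table contents are placeholders; only the GCC/Clang visibility attribute itself is real. With hidden visibility the compiler knows the symbol resolves inside the same shared object, so an access such as demo_table + 1 can use a fixed offset instead of going through the GOT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro; the attribute is the GCC/Clang spelling. */
#if defined(__GNUC__)
#define HIDDEN __attribute__((visibility("hidden")))
#else
#define HIDDEN
#endif

/* Declaration as it might appear in a header. */
HIDDEN extern const uint8_t demo_table[17];

/* Definition with placeholder contents (not real JPEG table data). */
const uint8_t demo_table[17] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
};

/* Accesses the table with a constant "+ 1" offset, as in the message. */
int demo_lookup(int i)
{
    assert(i >= 0 && i < 16);
    return (demo_table + 1)[i];
}
```

The effect is only visible in position-independent code; the C semantics are the same with or without the attribute.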
Peter Ross
3141dbb7ad avcodec: ViewQuest VQC decoder
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Peter Ross <pross@xvid.org>
2022-10-18 13:20:37 +11:00
Haihao Xiang
e253bc4b17 lavc/qsvenc: fill the padding area
qsvenc makes a copy when the input in system memory is not padded as the
SDK requires; however, the padding area is not filled with the right data.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-10-18 09:59:58 +08:00
Marvin Scholz
0d34137925 avcodec/bsf: document missing arguments 2022-10-17 09:56:47 +02:00
Marvin Scholz
80c8b988fb avcodec/mediacodec: link to related documentation 2022-10-17 09:55:19 +02:00
Marvin Scholz
56bbfe1136 avcodec/mediacodec: use inline code for coderefs
This prevents Doxygen from interpreting these as internal references forced by the #
character, fixing the warnings:

  warning: explicit link request to 'nanoTime()' could not be resolved
  warning: explicit link request to 'releaseOutputBuffer(int,long)'
  could not be resolved
2022-10-17 09:55:19 +02:00
Marvin Scholz
67298d8ea1 avcodec/videotoolbox: Add proper doxy group
Same as done for other HW decoders, that way it will be
properly listed on the relevant module page.
2022-10-17 09:55:16 +02:00
Marvin Scholz
4be6d065d4 avcodec/codec_par: Add missing doxy group opening 2022-10-17 09:51:47 +02:00
Marvin Scholz
295d217117 avcodec/vdpau: Fix doxy comment typo
This is clearly supposed to be a doxy comment and needed to properly
close the group.
2022-10-17 09:51:47 +02:00
Marvin Scholz
2c59038208 avcodec/avcodec: Escape Doxygen reference
The # is interpreted as explicit reference request by Doxygen
which is not desired here, use markdown inline code to avoid
that.
2022-10-17 09:51:47 +02:00
Marvin Scholz
ea5884e2e3 avcodec: Fix Doxygen trailing brief comments
The //< comment is not any magic comment supported by Doxygen,
instead use ///< to mark them as doc for the members.
2022-10-17 09:51:47 +02:00
Rémi Denis-Courmont
4d66e8c12e lavc/audiodsp: fix RISC-V V scalar product (again)
The loop uses a 32-bit accumulator. The current code would only zero
the lower 16 bits thereof.
2022-10-17 06:39:00 +02:00
James Almer
bd5b59deea avcodec/libdav1d: add an option to set max frame delay
Signed-off-by: James Almer <jamrial@gmail.com>
2022-10-15 15:24:51 -03:00
Andreas Rheinhardt
d2fd0ea1d7 avcodec/motion_est: Remove unused elements
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-15 12:17:10 +02:00
Andreas Rheinhardt
a010193fcb avcodec/svq1enc: Move PutBitContext from context to stack
This is more natural, because said context is only used
for the duration of one call to svq1_encode_frame().

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-14 16:14:24 +02:00
Andreas Rheinhardt
d08b2900a9 avcodec/svq1: Set hidden visibility
The encoder uses ff_svq1_inter_mean_vlc + 256 and setting
hidden visibility allows baking this "+ 256" into the
general offset of ff_svq1_inter_mean_vlc and the code
accessing it.

For certain arches, this is also required for the compiler
to not produce overly pessimistic code that can't be fixed
up by the linker later on.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-14 16:14:24 +02:00
Andreas Rheinhardt
e84348a8ab avcodec/svq1enc: Add SVQ1EncDSPContext, make codec context private
Currently, SVQ1EncContext is defined in a header that is also
included by the arch-specific code that initializes the one
and only dsp function that this encoder uses directly.

But the arch-specific functions to set this dsp function
do not need anything from SVQ1EncContext. This commit therefore
adds a small SVQ1EncDSPContext whose only member is said
function pointer and renames svq1enc.h to svq1encdsp.h
to avoid exposing unnecessary internals to these init
functions (and the whole mpegvideo with it).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-14 16:14:24 +02:00
Andreas Rheinhardt
42bde73b20 avcodec/svq1enc: Inline constants
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-14 16:14:24 +02:00
Andreas Rheinhardt
25e1986e68 avcodec/vp8: Add const where appropriate
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-13 23:42:25 +02:00
Rémi Denis-Courmont
96a83ceea4 riscv: fix scalar product initialisation
'VSETVLI xd, x0, ...' has rather nonobvious semantics:
- If xd is x0, then it preserves the current vector length.
- If xd is not x0, it sets the vector length to the supported maximum.

Also somewhat confusingly, while VMV.X.S always does its thing
regardless of the selected vector length, VMV.S.X does _nothing_ if the
selected vector length is zero.

So the current code fails to initialise the accumulator if we
are unlucky to have a selected vector length of zero on entry. Fix it
by forcing the vector length to one.
2022-10-13 10:17:38 +02:00
Andreas Rheinhardt
28ac2279ad avcodec/snow: Move initializing MotionEstContext to snowenc.c
Only used by the encoder.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-12 10:58:15 +02:00
Anton Khirnov
adb927fa7a lavc/encode: combine setting no-delay pts for video/audio 2022-10-11 11:59:11 +02:00
Anton Khirnov
8789720d28 lavc/encode: generalize a check for setting dts=pts
DTS may be different from PTS only if both of these are true:
- the codec supports reordering
- the encoder has delay
2022-10-11 11:57:52 +02:00
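
The rule stated in 8789720d28 can be written down as a small predicate. This is a sketch only: DemoEncoder, its two fields and both functions are hypothetical and do not correspond to actual libavcodec structures or flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of an encoder; not a libavcodec type. */
struct DemoEncoder {
    int supports_reordering;   /* codec can output frames out of order */
    int has_delay;             /* encoder buffers frames before output */
};

/* DTS may differ from PTS only if both conditions hold. */
static inline int dts_may_differ_from_pts(const struct DemoEncoder *e)
{
    assert(e != NULL);
    return e->supports_reordering && e->has_delay;
}

/* Otherwise the generic code can simply set dts = pts. */
static inline int64_t pick_dts(const struct DemoEncoder *e,
                               int64_t pts, int64_t encoder_dts)
{
    return dts_may_differ_from_pts(e) ? encoder_dts : pts;
}
```

If either condition is false, packets leave the encoder in presentation order, so copying PTS into DTS is always correct.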
Reimar Döffinger
38cd829dce aarch64: Implement stack spilling in a consistent way.
Currently it is done in several different ways, which
might cause needless dependencies or, in the case of
tx_float_neon.S, is incorrect.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2022-10-11 09:12:02 +02:00
Andreas Rheinhardt
e10e27a2ea avcodec/opustab: Avoid indirection to access ff_celt_window
Currently, it is accessed via a pointer (ff_celt_window)
exported from opustab.h which points inside a static array
(ff_celt_window_padded) in opustab.c. Instead export
ff_celt_window_padded directly and make ff_celt_window
a static const pointer pointing inside ff_celt_window_padded.
Also mark all the declarations in opustab.h as hidden,
so that the compiler knows that ff_celt_window has a fixed
offset from the code even when compiling position-independent
code.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-10 14:10:49 +02:00
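
The change in e10e27a2ea can be pictured with a reduced example. All names, the array size, the contents and the offset of 2 are hypothetical; the sketch only shows the shape: the padded array is exported, and the window is a static const pointer into it, so its address is a compile-time constant offset rather than a value loaded from an exported pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro; the attribute is the GCC/Clang spelling. */
#if defined(__GNUC__)
#define HIDDEN __attribute__((visibility("hidden")))
#else
#define HIDDEN
#endif

/* Header part: export the padded array itself. */
HIDDEN extern const float demo_window_padded[8];

/* Header part: the window is a constant offset into the array, so
 * every user computes its address directly. */
static const float *const demo_window = &demo_window_padded[2];

/* Source part: the definition, with placeholder contents. */
const float demo_window_padded[8] = {
    0.0f, 0.0f, 0.25f, 0.5f, 0.75f, 1.0f, 0.0f, 0.0f
};

/* Reads the i-th window coefficient through the static pointer. */
static inline float demo_window_at(size_t i)
{
    assert(i < 6);
    return demo_window[i];
}
```

Before such a change, each access would first load the pointer from memory and only then load the coefficient.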
Haihao Xiang
f3b5277057 lavc/qsvenc_hevc: use open GOP by default
The HEVC spec has CRA frames, which allow random access with open GOP, so
open GOP can be used to achieve higher compression efficiency.

Removing the entry was suggested by Andreas

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-10-10 11:10:13 +08:00
Fei Wang
56a52af12b lavc/qsv: add support for decoding & encoding 12bit content
AV_PIX_FMT_P012, AV_PIX_FMT_Y212 and AV_PIX_FMT_XV36 are used in
FFmpeg and MFX_FOURCC_P016, MFX_FOURCC_Y216, and MFX_FOURCC_Y416 are used
in the SDK

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-10-10 09:31:34 +08:00
Haihao Xiang
1898dbddd5 lavc/qsv: add support for decoding & encoding 10bit 4:4:4 content
AV_PIX_FMT_XV30 is used in FFmpeg and MFX_FOURCC_Y410 is used in the
SDK.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-10-10 09:31:34 +08:00
Haihao Xiang
3f28116ea2 lavc/qsv: specify Shift for each format too
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-10-10 09:31:34 +08:00
Rémi Denis-Courmont
105921251a lavc/aacpsdsp: fix clobber on RISC-V LP64D/ILP32D
Although the DSP function only uses single precision from RISC-V F, the
caller may leave double precision values in the spilled registers if the
calling convention supports double precision hardware floats. Then, we
need to save and restore FS registers as double precision.

Conversely, we do not need to save anything at all if an integer calling
convention is in use. However we can assume that single precision floats
are supported, since the Zve32f extension implies the F extension.
So for the sake of simplicity, we always save at least single precision
values.

In theory, we should even save quadruple precision values if the LP64Q
ABI is in use. I have yet to see a compiler that supports it though.
2022-10-10 02:23:18 +02:00
Rémi Denis-Courmont
bfc69297c5 lavc/opusdsp: RISC-V V (512-bit) postfilter
This adds a variant of the postfilter for use with 512-bit vectors.
Half a vector is enough to perform the scalar product. Normally a whole
vector would be used anyhow. Indeed, fractional multipliers are no faster
than the unit multiplier.

But in this particular function, a full vector makes up 16 samples,
which would be loaded at each iteration of the outer loop. The minimum
guaranteed CELT postfilter period is only 15. Accounting for the edges,
we can only safely preload up to 13 samples.

The fractional multiplier is thus used to cap the selected vector length
to a safe value of 8 elements or 256 bits.

Likewise, we have the 1024-bit variant with the quarter multiplier. In
theory, a 2048-bit one would be possible with the eighth multiplier, but
that length is not even defined in the specifications as of yet, nor is
it supported by any emulator - forget actual hardware.
2022-10-10 02:23:17 +02:00
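
The numbers in bfc69297c5 follow from the RISC-V V rule that the maximum vector length in elements is VLEN * LMUL / SEW. The sketch below just redoes that arithmetic; vlmax() is a hypothetical helper, and the 32-bit element width is an assumption inferred from the message (a full 512-bit vector holding 16 samples).

```c
#include <assert.h>

/* Hypothetical helper: maximum number of elements per vector group.
 * vlen_bits: hardware vector register width in bits
 * sew_bits:  selected element width in bits
 * lmul_num / lmul_den: length multiplier as a fraction (1/2, 1/4, ...) */
static inline unsigned vlmax(unsigned vlen_bits, unsigned sew_bits,
                             unsigned lmul_num, unsigned lmul_den)
{
    assert(sew_bits > 0 && lmul_num > 0 && lmul_den > 0);
    return vlen_bits * lmul_num / (sew_bits * lmul_den);
}
```

With 32-bit samples, vlmax(512, 32, 1, 1) gives the 16 samples of a full vector, while the half multiplier at 512 bits and the quarter multiplier at 1024 bits both give 8 elements, which stays below the 13 samples that can safely be preloaded.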